Welcome to our talk, "2021: Our Journey Back to the Future of Windows Vulnerabilities and the Zero Days We Brought Back With Us." My name is Tomer Bar. I've been in the cyber security field for more than 15 years, currently leading SafeBreach Labs as Director of Security Research. My main focus is on vulnerability research and nation-state APT research.
My name is Eran Segal. [Remainder of the introduction unintelligible in the recording.]
In 2020, security researchers reported a record number of 1,000 Windows vulnerabilities. We were curious: what superpowers would we get from researching this huge number of vulnerabilities? Could we leverage our findings to discover zero days? We decided to go back in time to 2016 to search for patterns and automatically classify all the public vulnerabilities since then. We believe that only by connecting the dots into a bigger picture will we be able to come back to 2021 having achieved our goal.
We will start by describing the research goals, assumptions, motivations, and approach. Then we will explain the challenge and our chosen solution, process, and infrastructure. We will detail the process step by step and provide a detailed end-to-end example, from zero to zero day. We will present the six vulnerabilities and the post-exploitation techniques we discovered. We will end our talk with a newly proposed idea for discovering vulnerable Windows machines and open a Q&A session.
We defined three main goals. First, dive in and understand the root cause of each vulnerability and how Microsoft chose to address it. Second, automate and provide tools to ease the process from root cause to POC. Last but most important, discover new zero days based on a semi-automated process. Every research project starts with assumptions. Let's describe them.
The first assumption is that although Microsoft is a huge enterprise with thousands of developers, similar vulnerabilities will be mitigated by similar patching techniques. The second assumption is that Microsoft will try to fix each vulnerability with changes that are as minor as possible in order to avoid backward compatibility issues. This approach might be prone to patch bypasses. Last, it is a good idea to search for zero days near code that is already known to be vulnerable.
Probably the first thing you will do is check whether your assumptions are correct. We encountered recent public research on just one ntoskrnl function, which included five different vulnerabilities from various categories in a short block of code. We won't dive into the specific details of each vulnerability, but you can get the impression that our assumptions held. Vulnerable code is a good place to search for new vulnerabilities. The vulnerabilities belong to different categories, but all are due to wrong bounds checks or wrong handling of memory allocations.
Now that we were more confident in the research idea, it was time to think about our research approach. Until now, patch diffing was done manually, by comparing a vulnerable executable with its new code version after the fix. Furthermore, the main goal of this process was focused on understanding the root cause of a single vulnerability and building a one-day to exploit it. We aimed higher. We wanted to find zero days in the end, but is it possible to jump from root cause to zero days? We understood that in order to achieve it, we would need to build an automated process that gathers the insights from all the patches in a single searchable DB. Let's dive into our infrastructure implementation. Moving over to Eran.
We are required to take four steps in order to achieve our goal.
The first step is to understand each one of the patches Microsoft released and infer the vulnerability root cause. Then we want to enrich the context of each vulnerability by correlating publicly available information about the vulnerability with each patch. The third step is to find a way to trigger the vulnerability. In other words, which actions lead to the vulnerable code? Finally, after gaining this valuable knowledge, we had to find ways to harness it for hunting zero days.
Let's begin with step one. The automated process includes five sub-steps. First, downloading all Windows 8 security-only updates since 2016. We chose to download the security updates of Windows 8 and not Windows 10 because Windows 8 has security-only updates while Windows 10 doesn't. Using the security updates, it is possible to discover all the vulnerabilities Microsoft has fixed in Windows 8 since 2016. After we downloaded all Windows 8 security-only updates, the automated process extracts the KBs and compares the extracted files with the older versions of the same files using the BinDiff tool. Now the fun part begins. The changes are classified into groups and inserted into a database so we are able to search for patterns. Each group represents a specific type of change, and we will talk about these features in the next slides.
To get some perspective on the scale we are talking about, here are a few numbers. You can see we have more than 100,000 unique functions. Each one of them may be part of a patch, so it is clear that we need to understand which changes are interesting and which aren't. Inside the KB, some of the files are interesting while others aren't, because a PE can get updated indirectly, for example through a change in one of its libraries. This has a huge effect. For example, GDI32 is one of the most vulnerable PEs in Windows. It has more than 26 statically linked libraries. Some of these libraries are statically linked to other libraries, and so on.
In addition, every PE in Windows 8 is part of a package, and every time one of the PEs in the package has to be updated, the entire package is recompiled together and all the files are included in the KB. In order to detect which changes are interesting and which aren't, we first need to understand which changes the compiler can make. It is not simple to distinguish between changes made by the developer and changes made by the compiler, because of recompilation. If you compile the same code twice, the compiler sometimes aligns the code differently, reorders some instructions, and applies other optimizations. We had to find ways to reduce the compiler noise.
We chose to reduce the compiler noise using features. The features group similar changes together and allow us to look for vulnerability patterns. We developed 33 features. Most of them are at the function level, but others are at the executable level. Each feature is optimized differently. Some are optimized to have the lowest number of false positives, while others give us overview insights on the patch. There are two types of features: patch-related features and vulnerability-related features. Patch-related features group patches by the type of change made by Microsoft, such as adding a new function call, while vulnerability-related features group patches by vulnerability category, such as off-by-one, use-after-free, and so on. The beauty of the features lies in their simplicity. We'll provide some examples in the next slides.
2019, here we come. The first example of a patch-related feature is the number of xrefs. It is not directly linked to a specific vulnerability category. This feature compares the number of function calls in each patched function with the number in the unpatched version. In this example, ReadPropVariant added one xref to IStreamRead. Let's look at an overview of the ReadPropVariant function. It is very easy to spot the changed block.
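The xref-count feature described above can be sketched in a few lines of Python. This is only an illustration of the idea, not the tooling used in the research: the function name `xref_delta` and the call lists are hypothetical, and in practice the call lists would come from a disassembler export for the unpatched and patched versions of a function.

```python
from collections import Counter

def xref_delta(unpatched_calls, patched_calls):
    """Return {callee: change in call count} for callees whose count changed.

    Both arguments are lists of callee names seen in one function,
    one list per binary version (hypothetical input format).
    """
    before = Counter(unpatched_calls)
    after = Counter(patched_calls)
    delta = {}
    for callee in set(before) | set(after):
        diff = after[callee] - before[callee]
        if diff != 0:
            delta[callee] = diff
    return delta

# Made-up call lists for one function before and after a patch.
unpatched = ["IStreamRead", "SomeOtherCall"]
patched = ["IStreamRead", "IStreamRead", "SomeOtherCall"]

result = xref_delta(unpatched, patched)
print(result)
```

A positive value means the patch added calls to that function, which is the signal this feature looks for. The same counting approach would apply to the number-of-conditions feature if conditional branches are counted instead of calls.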
Let's zoom in to understand the root cause. As we can see, one call and three mov operations were added. The root cause of this vulnerability is type confusion, due to the fact that ReadPropVariant reads a decimal from the file without resetting vt to VT_DECIMAL. This means that we can control the type of the ReadPropVariant object. The patch added a call to IStreamRead and set vt to VT_DECIMAL to avoid type confusion.
Let's jump back to 2018. An additional patch-related feature is the number of conditions. It compares the patched and unpatched versions of the file and looks for a different number of conditions in those functions. In this example, the patch added five new conditions to the function NtfsFindFilesOwnedBySid, changing the number of conditions from 27 to 32. We can also see that we automatically correlate findings in our DB with public exploits. Zooming into the patch, we can see the five added code blocks in the patched file on the right. Their goal is to verify that the user has admin permissions by calling SeTokenIsAdmin, and to allow the listing only if the user has listing permissions. Let's move to vulnerability-specific features. Tomer?
Thank you, Eran. Until this point, we introduced patch-related features. Now we will present how these patch-related features can be used to understand the vulnerability root cause. We will go over a few examples. The professor told us not to, but we are bound to break rules, following our guideline of learning from the past. We look at a high-profile integer overflow vulnerability, SMBGhost. The patch uses the function RtlULongAdd in order to verify that the allocation size will not overflow before calling the allocation function. Is this a unique fix? Or should we expect Microsoft to use this fix method in other integer overflow vulnerabilities? Let's go back in time to the earliest patches we identified through binary diffing. We went back in time to 2016, Las Vegas, DEF CON 25.
It seems that there is a searchable pattern for the integer overflow vulnerability category. The fix added an xref to one of several ULONG arithmetic functions before calling the vulnerable function. In this example, it calls the ULONG multiply function in order to verify that the allocation size will not overflow before calling the allocation function. We constructed a list of ULONG functions and added an intsafe feature. Our integer overflow feature returned more than 200 results. We selected a vulnerability that, as far as we know, has not been publicly analyzed yet, to see whether the feature provides us with automated root cause analysis. We chose the ntdll patch from April 2020. The only function that was really changed was LdrpSearchResourceSection, and the change included adding both the RtlULongMult and RtlULongAdd functions. Indeed, the patch was implemented by verifying that the allocated mapping size will not overflow. This was done using the same ULONG functions we saw in the previous example. As far as we know, this patch pattern has been used since 2016. So now we have a working integer overflow feature.
Sometimes features are very simple but still very effective: searching for all added functions that include "integrity level" expands the category and provides several additional vulnerabilities. There are additional features that we researched and found interesting. One feature is race condition vulnerabilities, which are usually fixed by adding lock functions. Another feature is directory traversal, which is mitigated by detecting dot-dot-slash in strings. We'll speak about it later. And finally, symbolic link vulnerabilities, which are very common and are usually mitigated by adding a check for the final path.
Let's describe the second step. Patch Tuesday occurs on the second Tuesday of each month and includes a report for each patched vulnerability.
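As an aside on the integer overflow pattern discussed above, the following Python sketch models why an unchecked 32-bit size computation is dangerous and what a checked multiply and add buy you. It is a simplified model, not the Windows API: the real RtlULongMult and RtlULongAdd return a status code and write the result through a pointer, while the helpers here (`ulong_mult`, `ulong_add`, `unchecked_size`, `checked_size`) are hypothetical names that return a success flag and a value.

```python
ULONG_MAX = 0xFFFFFFFF  # largest value of a 32-bit unsigned integer

def ulong_mult(a, b):
    """Checked 32-bit multiply: (ok, result)."""
    product = a * b
    if product > ULONG_MAX:
        return False, ULONG_MAX
    return True, product

def ulong_add(a, b):
    """Checked 32-bit add: (ok, result)."""
    total = a + b
    if total > ULONG_MAX:
        return False, ULONG_MAX
    return True, total

def unchecked_size(count, elem_size, header):
    """What a 32-bit computation yields without checks: silent wrap-around."""
    return (count * elem_size + header) & ULONG_MAX

def checked_size(count, elem_size, header):
    """Allocation size, or None if any step would overflow 32 bits."""
    ok, total = ulong_mult(count, elem_size)
    if not ok:
        return None
    ok, total = ulong_add(total, header)
    if not ok:
        return None
    return total

# An attacker-chosen count makes the unchecked size wrap to a tiny value,
# so a small buffer would be allocated for a very large number of elements.
count, elem_size, header = 0x10000001, 16, 8
print(unchecked_size(count, elem_size, header))  # wraps around
print(checked_size(count, elem_size, header))    # overflow detected
```

With these inputs the unchecked computation wraps to 24 bytes while the checked version reports the overflow. From a patch-diffing point of view, the appearance of new calls to such checked helpers right before an allocation is the searchable signal.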
The report includes important data such as the name, CVE number, affected OS, vulnerability category, and so on. It used to have a description until mid-2020. The information can help us focus on the relevant features and the related executables. We have created an automated process that uses the API to download all published CVE data since mid-2016. Please note that Microsoft recently changed the API and released a new one.
The correlation process consists of four steps. First, take the previously extracted patch files. Then query all CVE data using Microsoft's API. Then extract the vulnerable component name, which we will refer to as the VCN, from the vulnerability's CVE name or description. And finally, correlate the CVE to the patch files based on four different correlation methods.
First, we check if the VCN extracted from Microsoft's API is a Windows service. The list of Windows service names is available in the registry. As we can see, Connected User Experiences and Telemetry is the name of the DiagTrack service. If the VCN is not a service name, we check if the VCN is included in the file description of any of the system files. As we can see, the VCN Compatibility Appraiser is the description of the appraiser executable. The third method of correlation was built manually. We classified those VCNs based on our Windows internals knowledge. The last correlation is based on gathered statistics. Past associations help us understand the correlation between VCNs and the vulnerable executables. In this example, Error Reporting was the VCN, which was found in three Patch Tuesdays during 2020. When we searched for files which were patched during these months and were not patched in the other 22 months, we found eight possible patched files. Print Spooler was the VCN in four monthly patches and was associated with 12 possible patched files. The next time we see this VCN as part of a future vulnerability report, we will be able to correlate it to those files.
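The statistical correlation method just described can be sketched as a small set comparison. This is a minimal illustration under a stated assumption: a file counts as a candidate only if it was patched in exactly the months in which the VCN appeared and in no other month. The function name `candidate_files`, the file names, and the month values are all made up for the example.

```python
def candidate_files(vcn_months, patched_months_by_file):
    """Return files whose patch months match the months the VCN was reported.

    vcn_months: iterable of month labels in which the VCN appeared.
    patched_months_by_file: {file name: iterable of months it was patched}.
    Assumption: an exact match of the two month sets is required.
    """
    wanted = set(vcn_months)
    return sorted(
        name
        for name, months in patched_months_by_file.items()
        if set(months) == wanted
    )

# Hypothetical data: the VCN appeared in three monthly reports.
vcn_months = ["2020-03", "2020-05", "2020-08"]
patched = {
    "file_a.dll": ["2020-03", "2020-05", "2020-08"],             # exact match
    "file_b.dll": ["2020-03", "2020-04", "2020-05", "2020-08"],  # extra month
    "file_c.dll": ["2020-05"],                                   # partial only
}

matches = candidate_files(vcn_months, patched)
print(matches)
```

A looser rule, such as requiring only that the file was never patched outside the VCN months, would return more candidates at the cost of more false positives.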
The four methods automatically produced a correlation for 90% of the CVEs. Now that we know more about the context of each patch, we can prioritize which changes are more interesting and which are less. Moving over to Eran.
So we found the patch, linked it to the corresponding CVE, and found which type of vulnerability it was. Now, how do you find which PE can trigger that code? Who is calling that PE? And which functions are called? Our goal in step three is finding out how to trigger the vulnerable function and generating the code that will trigger it. There are two sub-steps for this step. First, find which executables can trigger the vulnerable code. Second, generate the code with the correct input to trigger the vulnerability.
We added new features that extract all the function calls from all the PEs in Windows, for all the versions, in order to generate a call graph from or to any function in Windows. We are talking about a very large scale: more than 6 million function calls. Our graph is just like the graphs in IDA, but across binaries, and huge. With the data we collected, we can now create a call graph across the entire Windows system. In the picture you can see a visual representation of the call graph we created in order to find which functions call UnwrapXMLInvitation. Later on we will demonstrate how we use this function in a post-exploitation scenario. We support multiple ways to detect a call from one function to another: load-time linking, the common way to call an external function; run-time linking, such as GetProcAddress; and COM servers and clients. We generate these graphs for Windows 8 and Windows 10 because, after all, we want to find zero days in Windows 10. In addition to creating the graphs, we collected the information we need to choose which functions are the most interesting.
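The cross-binary call graph lookup described above amounts to a reverse breadth-first search from the vulnerable function. The sketch below shows that idea on a tiny, made-up edge list; the `binary!function` naming, the function `callers_by_distance`, and the depth limit are assumptions for illustration, not the format of the actual database.

```python
from collections import defaultdict, deque

def build_reverse_graph(calls):
    """Map each callee to the set of functions that call it."""
    callers = defaultdict(set)
    for caller, callee in calls:
        callers[callee].add(caller)
    return callers

def callers_by_distance(calls, target, max_depth=3):
    """Return {function: distance} for every function that can reach target
    within max_depth calls, found by breadth-first search on reversed edges."""
    callers = build_reverse_graph(calls)
    distance = {target: 0}
    queue = deque([target])
    while queue:
        node = queue.popleft()
        if distance[node] == max_depth:
            continue  # do not expand beyond the depth limit
        for caller in callers.get(node, ()):
            if caller not in distance:
                distance[caller] = distance[node] + 1
                queue.append(caller)
    del distance[target]
    return distance

# Hypothetical (caller, callee) edges spanning several binaries.
edges = [
    ("app.exe!main", "lib.dll!Exported"),
    ("lib.dll!Exported", "lib.dll!Internal"),
    ("other.exe!run", "lib.dll!Internal"),
]

reachable = callers_by_distance(edges, "lib.dll!Internal")
print(reachable)
```

Sorting the result by distance gives a simple first ranking of which callers to try when looking for a way to trigger the vulnerable function.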
We do this by looking for functions which are close, in terms of distance, to the vulnerable function, verifying whether or not the function is exported, and even scraping the Internet. We collected information and code examples about each function from multiple sources, such as MSDN documentation and projects scraped from GitHub. Using existing examples from the Internet is much quicker than writing them ourselves, especially when we are talking about undocumented functions. But we found that even that wasn't enough. We were still missing lots of functions, so we had to find additional ways to generate POCs.
So we want to have the option to trigger all RPC servers quickly. RPC is a common inter-process communication mechanism. It can run over the network or locally. RPC calls use IDL, the Interface Definition Language, to set up the communication between the clients and the servers. In order to communicate with an RPC server, we have to include the corresponding IDL file. Therefore, we would like to extract as many IDLs as possible from Windows. Since the original version of RpcView supports extraction of a single interface IDL, we use a modified RpcView tool to extract all IDLs with all the interfaces. The extracted IDL file includes dozens of structs and unions. Most of them raise compilation errors due to the order of definition, so we fixed that as well. RpcView works on running RPC servers, so we had to start all of them. After running a script to automatically start Windows services, we got 150 running services.
After we extracted the IDLs from RpcView, we used the automated process. For each interface, the automation created a new project, set the relevant parameters for the code, and compiled it using MSBuild. In the end, we successfully generated 127 working projects automatically. We'll provide a specific example of how easy it is to exploit the Task Scheduler ALPC vulnerability using our semi-automated process, calling some of the functions that the IDL exposes.
We generate and compile a template project. The project includes two files: the IDL and a short generic template for setting the parameters needed to compose an RPC binding. In this example, the provided UUID, protocol, and RPC endpoint name are those of the Task Scheduler RPC. This template allows us to call each RPC function that we would like to trigger. After the automation has created our project with all the dependencies comes the manual part, writing the exploit itself. All we need to do is call the correct functions with the correct input, and we get a working project. We get privilege escalation. In order to exploit this vulnerability, we created a hardlink and made two RPC calls. The first RPC call creates a folder in the Task Scheduler folder, and the second RPC call sets the permissions. This triggers the Task Scheduler to set the permissions of our executable, resulting in privilege escalation to NT AUTHORITY\SYSTEM. Moving on to Tomer.
Until this point we covered how we can understand the root cause and trigger a one-day at scale. But how do we find zero days? We will use the XXE vulnerability as an end-to-end example to demonstrate the four-step process from zero to zero day. We enriched our DB with the vulnerability category, named CWE, for each vulnerability. Then, when we queried our DB for the different CWE categories, we noticed category 611, Improper Restriction of XML External Entity Reference, with six relevant vulnerabilities. Let's query to see more details about them. We eventually found eight patched vulnerabilities, six of them with full details in our DB. So we were curious: what is this vulnerability category? As described by CWE, using XML with an external entity allows remote file read. Sounds very interesting. Let's dig deeper in order to understand how XXE happens. XXE vulnerabilities occur in XML parsers. When the XML contains a reference to an external entity, such as an HTTP server, this external entity can get the content of local files.
An XXE exploit contains multiple parts. The upper entity is the input to the XML parser. The XML parser reads the content of the file system.ini and then sends a request to the attacker's server to get the rest of the XML. The second entity, the lower one, is the response of the server. When the XML parser parses this response, it generates an additional HTTP GET request with the content of the file system.ini. This is according to the XML standard, and in order to prevent this behavior in the XML parser, the DTD, which stands for document type definition, must be disabled.
Let's analyze the patches and understand the root cause of these XXE vulnerabilities. The patch for CVE-2018-0878 added four conditions to a function called LoadRATicket in the MSRA executable. We know that the MSRA executable is used for Windows Remote Assistance, where a basic user can send an invitation ticket to an advanced user, who can then log in and assist remotely. This function probably loads the invitation ticket. LoadRATicket creates the XML DOM document COM object and uses it to load and parse the XML invitation file. In the patched version of this function, in order to disable the support for external entities, four properties were added: restricting the document function, XSL script, and externals, and prohibiting DTD. The rest stays the same, calling the put_async and load functions. We also followed the xrefs to the COM CLSIDs used. We will get to it later.
We wondered: is there a pattern for XXE patching? So we developed a feature to search for all added prohibit-DTD patches and found three past patches. Let's verify the first patch, in the UPnP host. We can see that HrLoadDocument now calls RestrictDOMDocument. Let's see who else calls this patch function. The ISAPI extension executable was patched by calling RestrictDOMDocument before a call to loadXML. This is a different function than the load function used by Remote Assistance.
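The effect of the prohibit-DTD style of fix described above can be illustrated outside MSXML. The Python sketch below uses the standard library's expat bindings to reject any document that carries a DOCTYPE declaration before entities are ever processed. It is an analogy to the mitigation, not the Microsoft patch itself; `DtdForbidden`, `parse_without_dtd`, and `is_rejected` are hypothetical helper names, and the sample document only declares a harmless internal entity.

```python
import xml.parsers.expat

class DtdForbidden(ValueError):
    """Raised when a document contains a DOCTYPE declaration."""

def parse_without_dtd(xml_text):
    """Parse xml_text, refusing any document that declares a DTD."""
    parser = xml.parsers.expat.ParserCreate()

    def reject_doctype(name, system_id, public_id, has_internal_subset):
        # Called by expat as soon as a DOCTYPE declaration starts.
        raise DtdForbidden("DTD is not allowed (doctype %r)" % name)

    parser.StartDoctypeDeclHandler = reject_doctype
    parser.Parse(xml_text, True)  # True marks this as the final chunk
    return True

def is_rejected(xml_text):
    """Return True if the document is refused because of a DTD."""
    try:
        parse_without_dtd(xml_text)
    except DtdForbidden:
        return True
    return False

plain = "<ticket><host>example</host></ticket>"
with_dtd = '<!DOCTYPE ticket [<!ENTITY e "x">]><ticket>&e;</ticket>'

print(is_rejected(plain))
print(is_rejected(with_dtd))
```

Because the document is refused at the DOCTYPE declaration, no entity, internal or external, is ever resolved, which is the property the added parser settings aim for.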
So there are probably other vulnerable functions used by the same COM object. Digging deeper, we found the third vulnerable function, SetXml. At this point we understood the root cause and defined the conditions for exploiting the vulnerable code.
Let's dig into the first condition. We developed code to trigger COM objects. First, we queried all CLSIDs from all the files in Windows 10. Then, for each CLSID, we enumerated all functions and interfaces. Then we generated source code which created each one of the COM objects. And finally, we called all these functions with an XXE XML file as input. We automatically discovered the vulnerable COM objects through the exfiltrated data, by adding the vulnerable CLSID to it. As we can see, we found 4 COM interfaces and 16 vulnerable CLSIDs.
We developed a new feature to detect all the locations that seem to be vulnerable to XXE. This feature is different from other features. It does not compare the unpatched and patched versions. Instead, it searches for the XXE conditions in all Windows 10 executables. Eventually, we found 52 candidates. 25 of them are marked as possibly vulnerable. They will be vulnerable if one last condition is met: that we are able to control the input of the parsed XML. For example, we can see the MSRA function LoadRATicket, which is patched, while at the same time the LoadAndSortRAInvitationHistory function might still be vulnerable. Let's see if we can control the input XML. At the bottom, we can see the patched XML DOM object that was used in the original vulnerability and was called from LoadRATicket. To our surprise, there is a second instance of the exact same vulnerable XML DOM object just a few instructions above, used by several functions including our possibly vulnerable function, LoadAndSortRAInvitationHistory. LoadAndSortRAInvitationHistory indeed uses the vulnerable load function. We can see that the file to be loaded is the parameter a2.
The invitation manager's load function calls our vulnerable function with the XML file RAContactHistory from AppData. This file is controlled by the user with no need for admin privileges. This is the history data read from a legitimate RAContactHistory XML file that we changed to contain the address 1.1.1.1 as the previous IP that requested remote assistance. Now we know how to trigger the vulnerability, so let's begin with the fun stuff. On the bottom left we see MSRA being run with offerRA as a parameter. It can also be reached by double-clicking MSRA. In Procmon you can see the read action of the RA contact history file and also its content, which includes the exploit code. On the right we can see the first DTD GET request for the XXE XML, which reads a file from the Windows 10 machine and exfiltrates it to the C2 server using a second GET request. Microsoft fixed this MSRA vulnerability we found and assigned it the ID CVE-2021-34507.
In total we discovered six vulnerabilities. All were reported to Microsoft. We will go over them one by one. The most critical vulnerability we found was the vulnerability in Windows Help. We sent a spear-phishing mail with a compressed attachment of an HTML file. We named the HTML file to include .chm twice. This forced the vulnerable process, the hh executable, to parse it although it is plain HTML and not a CHM file. We can choose almost any extension for this file and it will still be parsed. We tested the attack on a default, fully patched Windows 10 machine and got blocked by the Mark of the Web mechanism, but when we compressed the attachment, even without encryption, it bypassed the sandbox and exfiltration was successful. The third vulnerability we found is in MMC, which includes several snap-ins.
The Link to Web Address snap-in expects an HTML file, but if we provide an XXE XML exploit instead, it will be parsed by mshtml.dll. We can see on the top right that it will read the remote file and exfiltrate it to the C2 server. It will show an error message which includes the content of the exfiltrated file. To trigger the Windows Media Player remote file read vulnerability, you only need to open the options screen and select the Network tab. This triggers vulnerable parsing of an XML file which we can control. On the right you can see the stack trace. Please pay attention to the call to the vulnerable load function from msxml3.dll. Moving over to Eran.
In addition to finding vulnerabilities in native code, we wanted to discover zero days in managed code, so we added a feature that decompiles all .NET executables in Windows 10 using ILSpy. ILSpy decompiles each executable to a Visual Studio solution with C# files. The files look very similar to the source code. Even the parameter names and function names are kept. We search inside all the .NET files for known parsers and functions that are vulnerable or configured to use DTD. All the XML parsers with DTD processing enabled are vulnerable to XXE, such as XmlTextReader. We found two XXE vulnerabilities in official Windows SDK executables. We found one vulnerability in xsd, a utility to generate a schema from a given source, and another in xsltc. In order to exploit the vulnerability in xsd, it is required to execute it with a path to an XML file which contains the XXE. Unlike xsd, xsltc requires multiple parameters, so it is a little more complex to trigger.
In addition to the vulnerabilities we found, we found two XXE post-exploitation techniques. We can use them to exfiltrate text files using DLLs signed by Microsoft instead of using suspicious network APIs. This increases the odds of bypassing security controls. The first one is PeerGroupParseInvitation. It is a documented function found in MSDN, and the
root cause of the XXE is found in an internal function named UnwrapXMLInvitation. We traced back to find which functions it is called from using our call graph. We found multiple options, but we could not find any attack vector that can trigger these functions without calling one of them ourselves. The second post-exploitation technique is found in SetXml in pla.dll. In order to trigger that function we used one of the examples provided by MSDN, so we had to change only a few lines of code.
In June 2020, Microsoft released a fix for CVE-2020-1300. It is a directory traversal vulnerability. You can see that they declared it fixed on Windows 8, among other Windows versions. Based on our correlation capability between the CVE report and the relevant patch files, we understood that the patch provided only partial coverage, so we dug deeper. Using our directory traversal feature, we searched for any function with ../ or ..\ as an argument. We found that in June 2020 Microsoft indeed patched two files, localspl and win32spl, but the patch to the PrintBrmEngine was done only in August. We compared the patch to the patch for Windows 10 and found that on Windows 10 all three files were patched in June. Does it mean that Windows 8 was vulnerable to a one-day for two months? The PrintBrmEngine executable was patched only in August, with the same patch logic used in Windows 10 in June: adding two wcsstr calls for mitigating ../ attacks. The PrintBrmEngine was not patched in June 2020. We verified that in June, Windows 8 was vulnerable to a one-day attack using CVE-2020-1300. The PrintBrmEngine allowed importing a CAB file. If the CAB file contains a file name that contains ../ characters, it will extract the file to an attacker-controlled location such as System32\wbem, which is prone to DLL hijacking. We proved that it was possible to get RCE on a fully patched Windows 8 between June and August 2020 with a one-day. We also found an additional unpatched issue in 2020. This time the
vulnerability provided arbitrary delete capabilities. More research is needed in order to determine whether this is a new opportunity for finding low-hanging fruit or just a coincidence.
Microsoft's response: regarding the MSRA vulnerability, the bug was fixed as part of July's Patch Tuesday. Regarding the other vulnerabilities, no fix is currently planned, since they don't meet Microsoft's servicing bar. We encourage you all to use and expand our tools, which will be published today. We think that the vulnerabilities we found are just the tip of the iceberg. We would like to credit multiple researchers. We based our research on top of their previous research. Thanks for joining. Moving on to the Q&A session.