Alright, hello everyone, and thanks for coming to our talk today. We're pretty excited to be speaking in the ICS Village for the first time. We're going to talk about a 0-day that we found in an ICS system. This is somewhat of a redo of our talk in Track Two yesterday, but we're going to focus a lot more on the challenges we faced that are specific to ICS systems.

So first of all, who are the two crazy guys standing in front of you? My name is Doug McKee. I work for McAfee's Advanced Threat Research team. I've got a little over eight years of experience in vulnerability research and pen testing.

And I'm Mark. I'm also a researcher on McAfee's ATR team, and I have a little over eight months of experience in vulnerability research and pen testing.

Alright, so first, what are we here to talk about today? Obviously it's going to be an ICS system, and in this case it's the Delta Controls enteliBUS Manager, which is a building controller. Building controllers are installed in commercial and industrial buildings to manage things like HVAC, access control, and pressure rooms in hospitals. They do this largely over a networking protocol called BACnet, which stands for Building Automation and Control Networks. It is not proprietary to this Delta system; it's the same across the industry.

When we started looking at this system specifically, we ran into what we like to call the perfect storm, which is also why we started looking at it more heavily. First, it's network connected, and these devices are connected to the internet. We always like to look for vulnerabilities that are network accessible, because they tend to have a higher impact and they give us a really good place to start from an attack-surface perspective.
Also, when we started looking at the device, we found that it had an active UART header on the board that did not require any authentication. It dropped us right into a root shell, which allowed us to easily extract the firmware and start debugging. When we extracted the firmware, we also found that the developers had left symbols in the binaries, which again made it easier for us to move forward, and I'll reference this throughout the presentation. Last but definitely not least, like most ICS systems that run Linux, it had BusyBox installed, and in this instance of BusyBox, Netcat had been left on the box, which became extremely useful when we got to our post-exploitation phase.

So I'm going to give you a really short, high-level understanding of the vulnerability and the exploit, and then we're going to focus on some of the challenges. First, what is the vulnerability? Using BACnet broadcast packets, we were able to find a buffer overflow. We were able to turn that buffer overflow into a write-what-where condition and use it to gain control of execution. Once we gained control of execution, we were able to redirect it to shellcode, which we were able to put on the heap because of the buffer overflow.

A little bit more about the vulnerability: this is what it looks like in code using the IDA decompiler, and I've got it condensed here. The simplest way to look at it is that we have a standard buffer size mismatch. In the top diagram we've got a buffer that I've labeled source; it's being set to 1732 bytes, and it's being read directly from the wire. A little later in the code we have a second buffer being allocated at 1476 bytes. As you can see, those sizes are not the same. The two eventually meet in a memcpy, and therefore we have a buffer overflow. To use that to gain control of execution, we used the write-what-where condition to do something called a GOT overwrite.
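As a rough illustration of the size mismatch described above, the sketch below models the two buffers using the sizes quoted in the talk (1732 and 1476 bytes). The function names and the bounds check are assumptions made for illustration; they are not taken from the device's firmware, and Python is used only to show the arithmetic and the kind of check that was missing.

```python
SOURCE_SIZE = 1732  # bytes read directly from the wire, per the talk
DEST_SIZE = 1476    # bytes allocated for the second buffer, per the talk


def overflow_bytes(source_len: int, dest_len: int) -> int:
    """Return how many bytes an unchecked copy would write past the destination."""
    return max(0, source_len - dest_len)


def checked_copy(source: bytes, dest_len: int) -> bytearray:
    """Hypothetical safe copy: refuse input that is larger than the destination."""
    if len(source) > dest_len:
        raise ValueError(
            f"input of {len(source)} bytes exceeds {dest_len}-byte buffer"
        )
    dest = bytearray(dest_len)
    dest[: len(source)] = source
    return dest


if __name__ == "__main__":
    # A full-size packet would spill this many bytes past the smaller buffer.
    print(overflow_bytes(SOURCE_SIZE, DEST_SIZE))  # 256
```

With the sizes from the talk, an unchecked copy of a full-size packet runs 256 bytes past the end of the smaller buffer; a length check like the one in checked_copy is the conventional way to close that gap.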
If you're not familiar, in Linux the GOT stands for global offset table. This is a table which is populated at runtime, so it's generally writable, and it is basically a mapping between function name and address. In our case one of the functions in memory was scDecodeBACnetUDP, and this function was very close to where we found the vuln. By overwriting the address for that function, we were able to redirect that function pointer to shellcode stored in memory.

As for the shellcode, as I said, we controlled a very small amount of memory on the heap, which made writing shellcode slightly difficult, but we were able to do a little bit of hopscotch in memory, jumping around, and get our shellcode to work. We did a classic return-to-libc attack and leveraged the Netcat that was already installed on the system to gain a reverse shell. That reverse shell is the basis of where everything starts, and from there we're able to further infect and exploit the system.

As I said, one of the things we want to focus on here is the challenges we faced during our process that are specific to ICS systems. Like most people, we think static analysis is great, but we would really rather use dynamic analysis as much as possible in the exploitation process; it's simply easier. In order to do that, we compiled gdbserver for the appropriate architecture and dropped it on the system. As soon as we started using GDB, we were instantly met with a bunch of error messages, and then the system rebooted. The error messages have been redacted here at the request of the vendor, but the gist of them was that a watchdog timer had failed and kicked the system, and therefore the system rebooted.
For those not familiar, in ICS systems watchdog timers are typically used to ensure that if the system hangs for any period of time, an action is taken to get it going again, and in this case the action is rebooting. That's obviously not good for dynamic debugging, but because we had symbols and because we were able to use those error messages, it was pretty simple for us to find where in software this was taking place. Here you can see that there are three places where a counter is being decremented by five, and when the counter hits zero, the device reboots. From a software research perspective this is not a hard problem to solve. All we have to do is use a technique called binary patching, and we only need to patch one single byte in the entire binary. We changed that five to a zero, and with that change the code still executed but effectively didn't do anything. In fact, you can see up here on the right-hand side of the screen that IDA doesn't even recognize that the line exists anymore, because it's completely ineffective.

So we did this and started debugging again, and we were successful for a grand total of three minutes. Then the system rebooted again. The watchdog got its revenge, and Mark and I were trying to figure out: we just dealt with the watchdog, so why is the system rebooting? Well, again, something specific to ICS systems is that there wasn't just a software watchdog; there was also a hardware watchdog, which is very common in ICS systems, and we had forgotten to handle it. You can see here on the UART screen that the hardware watchdog timeout is being set to 480 seconds on boot and to 180 seconds, or three minutes, otherwise.
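The two watchdogs described above can be modelled in a few lines. This is a minimal sketch under stated assumptions: the starting counter value of 100 is invented for illustration, the class and function names are hypothetical, and only the decrement of five, the patched decrement of zero, and the 180-second timeout come from the talk.

```python
from typing import Optional


def ticks_until_reboot(counter: int, decrement: int,
                       max_ticks: int = 1000) -> Optional[int]:
    """Count ticks until the counter reaches zero.

    Returns None if the counter never reaches zero within max_ticks,
    which is what happens once the decrement is patched to 0.
    """
    for tick in range(1, max_ticks + 1):
        counter -= decrement
        if counter <= 0:
            return tick
    return None


class HardwareWatchdog:
    """Toy model of a hardware watchdog that must be kicked before it expires."""

    def __init__(self, timeout_s: float) -> None:
        self.timeout_s = timeout_s
        self.last_kick = 0.0

    def kick(self, now: float) -> None:
        """Record that the watchdog was serviced at time `now` (seconds)."""
        self.last_kick = now

    def expired(self, now: float) -> bool:
        """True once timeout_s seconds have passed since the last kick."""
        return now - self.last_kick >= self.timeout_s


if __name__ == "__main__":
    print(ticks_until_reboot(100, 5))  # original decrement: reboots after 20 ticks
    print(ticks_until_reboot(100, 0))  # patched decrement: None, never reboots

    wd = HardwareWatchdog(timeout_s=180)
    print(wd.expired(179))  # False
    print(wd.expired(180))  # True: the three-minute reboot seen in the talk
```

The model shows why the single-byte patch was not enough on its own: the software counter stops counting down, but the hardware timer still expires unless something keeps calling kick.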
The important thing about hardware watchdogs is that if we have to deal with them as security researchers, the developers also have to deal with them. That means there's already code on the device to handle this problem; you just have to find a way to get access to it. In our instance the developers had exported all of their functions from all of their shared objects, meaning that all we had to do was write a very small C program, import their libraries, call their functions, and have that program run on startup. Now we were able to kick the watchdog in the exact same manner that the developers did to prevent the device from rebooting. By employing both of these methods to handle the software and the hardware watchdog, we were able to debug the system dynamically, which was extremely helpful for the rest of our research. We faced a couple more challenges, and to explain these I'm going to hand over to my co-worker Mark.

Thanks, Doug. So yeah, the next challenge we encountered was that we wanted to get this exploit to work remotely. Up until this point we had been conducting the attack from the same network that the device lives on, using broadcast traffic, which was nice because we didn't need to know the IP address of the device. Ultimately, though, we wanted to expand the impact to be worldwide. So how would we go about doing this? As you may or may not know, the attack as it stands uses broadcast traffic, which normally does not travel over the internet.
Thankfully, a certain BACnet technology came to save the day, and that technology was the BBMD, or BACnet/IP Broadcast Management Device, which is sort of a mouthful. Suffice it to say it allows BACnet broadcast traffic, the exact kind we're using, to travel over the internet. In the diagram on the right you see two BACnet networks connected over the internet, and each of them has a BBMD. The way this works is that traffic intended for a foreign network is first sent to the source network's BBMD, which uses unicast traffic to send it to the destination network's BBMD, whose job is then to rebroadcast that traffic. That's all well and good, but how does this help us? Ultimately, we discovered through testing that using this technology gets our exploit to work entirely over the internet. I want to reiterate: at the time of this writing, any eBMGR device connected to the internet with its default network settings could be pwned 100 percent remotely using this exploit.

Now, this is not the end of our challenges, not by a long shot. Up until this point all we have is a root shell on the device and persistence. That is very nice, but from an impact perspective we're more interested in controlling the hardware on the other end of this device, whether that's HVAC systems in a data center, access control in a government building, or pressure rooms in a hospital. That's where the real impact lies; we need to get control of those. Our initial hypothesis was to use the programming already on the device and see if we could get that code to work for us. The first approach we tried was to look at the database files located on the device's file system, which contain some information about the state of the I/O hardware.
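The BBMD forwarding path described earlier in this passage (local broadcast, unicast between BBMDs, rebroadcast on the far side) can be sketched as a toy model. Everything here is an assumption for illustration: the class names, the site and device names, and the payload are hypothetical, and no real BACnet framing or networking is involved.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Network:
    """A local network whose broadcasts reach every attached device."""
    name: str
    devices: List[str] = field(default_factory=list)
    received: Dict[str, List[bytes]] = field(default_factory=dict)

    def broadcast(self, payload: bytes) -> None:
        for device in self.devices:
            self.received.setdefault(device, []).append(payload)


@dataclass
class Bbmd:
    """Toy broadcast management device attached to one network."""
    network: Network
    peers: List["Bbmd"] = field(default_factory=list)

    def forward_broadcast(self, payload: bytes) -> List[Tuple[str, str, str]]:
        """Relay a local broadcast to every peer BBMD as unicast."""
        hops = []
        for peer in self.peers:
            hops.append(("unicast", self.network.name, peer.network.name))
            hops.extend(peer.rebroadcast(payload))
        return hops

    def rebroadcast(self, payload: bytes) -> List[Tuple[str, str, str]]:
        """Re-emit a forwarded message as a broadcast on the local network."""
        self.network.broadcast(payload)
        return [("broadcast", self.network.name, d) for d in self.network.devices]


if __name__ == "__main__":
    site_a = Network("site-a", ["workstation"])
    site_b = Network("site-b", ["controller-1", "controller-2"])
    bbmd_a, bbmd_b = Bbmd(site_a), Bbmd(site_b)
    bbmd_a.peers.append(bbmd_b)

    for hop in bbmd_a.forward_broadcast(b"example-broadcast"):
        print(hop)
```

The point of the model is the hop order: the only traffic that crosses between sites is unicast, which is why a broadcast-based message can reach devices on a remote network.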
The first thing that jumped out at us when we looked at these database files is that I had no idea what the hell I was looking at. This was too complicated, so we decided to keep looking. The next thing we tried was controlling the I/O state directly from the touchscreen, as you would in normal operation. This generates loopback traffic, and we figured that if we could investigate how these packets are structured and maybe replay them, we could get that to work for us. Those paying close attention might notice that the packet structure is very similar to the database files, and that I still have no idea what I'm looking at. So we tried something else again, and what ended up working for us was hooking the functions that control I/O natively. You see one of these functions in front of you in GDB output. Once again, symbols being left in proved invaluable here, since names like canioWriteOutput were a huge indicator that this was a good place to start, given that we knew all the I/O modules are handled through the CAN bus.

Through some further reversing of these I/O control functions, we discovered there are different functions for handling the different categories of I/O, and there are six such categories for the eBMGR: inputs, outputs, and variables, and any of these can be either analog or binary. The idea would be to alter the parameters going to these functions in order to get the code to work for us, but the question is which parameters do we alter, and how do we do it? For that we did find one key commonality between all these functions, and that's the first parameter being passed. The first parameter, as we discovered, is a data structure that describes the hardware being altered. In particular, the first four bytes of this data structure are the ID, or fingerprint, of the device, and this is unique to each one.
At a 12-byte offset we also discovered that it holds the current state of the device, which is the most crucial component. Additionally, at a varying offset you can find a descriptive string for that device, which is how we know this one is responsible for monitoring room temperature. Using this information, it's pretty straightforward to alter the data structures coming into these functions and thus get control of I/O, at least in theory. We were able to craft malware that took advantage of this hooking method to control the I/O, but we had another challenge in front of us. With the watchdog-kicking code, we could just throw some C code onto the device and execute some functions that exist in memory. The handles to the I/O hardware, by contrast, were trapped inside the program space of the existing programming and could not be accessed normally from an external program. Somehow our malware had to be inserted into the memory space of the existing programming to do its job.

How would we do it? The solution ended up being something called LD_PRELOAD. For those that don't know, LD_PRELOAD is a Linux environment variable, and the shared objects it points to are loaded first by the dynamic linker when a binary is executed. This is not a new or novel concept and we did not come up with it; it's a pretty common strategy in this field for inserting code into an existing program's memory space, which is exactly what we wanted to do. In our case we used persistence on the device to set this variable to point to our malware, and thus we could insert it into the device's programming. The next challenge was: where do we put it? Here are our options. This is a rough overview of the programming's startup routine, and through some further reversing we decided that the best place to put it would be this function called canioInit, highlighted in yellow.
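The record layout described at the start of this passage (a four-byte ID at offset 0, a state field at offset 12, and a descriptive string at a varying offset) can be sketched with Python's struct module. Only those two offsets come from the talk. The little-endian byte order, the 32-bit float encoding of the state, the NUL-terminated ASCII string, and all names below are assumptions for illustration, and the sample record is synthetic.

```python
import struct

ID_OFFSET = 0      # per the talk: the first four bytes are the ID
STATE_OFFSET = 12  # per the talk: the current state sits at a 12-byte offset


def parse_record(raw: bytes, name_offset: int) -> dict:
    """Decode the fields of a hypothetical I/O descriptor record.

    name_offset is supplied by the caller because the talk says the
    descriptive string lives at a varying offset.
    """
    (device_id,) = struct.unpack_from("<I", raw, ID_OFFSET)  # assumed uint32, little-endian
    (state,) = struct.unpack_from("<f", raw, STATE_OFFSET)   # assumed 32-bit float
    end = raw.index(b"\x00", name_offset)                    # assumed NUL terminator
    name = raw[name_offset:end].decode("ascii")
    return {"id": device_id, "state": state, "name": name}


def build_sample() -> bytes:
    """Build a synthetic 64-byte record for demonstration only."""
    buf = bytearray(64)
    struct.pack_into("<I", buf, ID_OFFSET, 0x1234)
    struct.pack_into("<f", buf, STATE_OFFSET, 21.5)
    label = b"Room Temperature"
    buf[32:32 + len(label)] = label
    return bytes(buf)


if __name__ == "__main__":
    print(parse_record(build_sample(), name_offset=32))
```

Passing the string offset in as a parameter mirrors the talk's observation that the ID and state sit at fixed positions while the description does not.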
The reason is that, as Doug mentioned previously, this is a heavily multi-threaded application. It runs many threads in parallel, and in order to avoid race conditions or inconsistent behavior we wanted a function that was called early in initialization, called once, and called by a single thread. This function met all those criteria. So what does it look like once our baby malware is inserted here? Once this function is executed, our malware starts up and runs a thread in the background that listens for commands sent over the network by the attacker. It listens on a TCP port, and depending on the content of these commands, the malware then intercepts the corresponding calls to the I/O control functions and alters the parameters as I mentioned previously, thus granting us control of the hardware.

We were getting close to the end here, but we still had one challenge in front of us, possibly the most important, and that was that I couldn't afford my mortgage. Hackers have got to eat just like anyone else, so we wanted to see if we could make any money off this thing. Obviously, on stage I would not try to sell any kind of malware, but purely hypothetically, if I was going to, I might mention that it gives you automatic discovery of all the hardware. Sit back and relax; it does all the recon for you. Hypothetically, of course. Another thing I might mention is that you get remote control through an interactive TCP session. Quick aside: show of hands, how many of you here saw our original talk yesterday? Okay, that's not many, so I'm going to recycle my jokes. Good. Alright, so my therapist always told me that two-way communication is key to any healthy relationship, and so we decided to let you communicate with the malware through an interactive TCP session. Now, am I saying this malware is going to fix your marriage? Yes, that's exactly what I'm saying.
Okay, next, you can see the state of all the hardware in real time, so you can see exactly how much damage you're causing. And last but not least, send a single command and you revert the device to its original, unhacked state, wiping any traces that the malware was ever there. Best of all, if you call in the next 30 minutes, hypothetically, you can get all this and more for 30,000 easy payments of $19.95, shipping and handling not included. If you're still not convinced, be sure to walk 10 feet that way after the talk is over to check out our demo, where you can see this bad boy in action. But to show you a quick overview of how this looks in the real world, I'm going to hand it back to Doug.

Thanks, Mark. Just to be clear, we're having some fun joking around, but we are in no way releasing the exploit code or malware code, or selling anything. I don't want to get fired, and my boss is sitting right here. Okay, so now that we've gone through some of the challenges and a really high-level overview of how this works, what does this really look like? One of our missions on McAfee's ATR team is to demonstrate these findings in a way that makes it really easy for everyone to see what the impact is. Because of that, we took the time to build a fully working demo unit, which as Mark said is literally 10 feet over there if you want to see it in action afterwards, but I'll show you the demos as videos right now, since they won't let us wheel it over. What we built was a fully functional HVAC unit that's cooling a simulated data center. Why a data center? Because this is one of the uses this exact model is put to. We hired an installer who installs these on a regular basis to build this, and one of the main places they install them is in data centers. Again, we're going for as realistic as possible. So if we take a look at what this unit looks like, on the back side you see all the components of a normal working HVAC system. We've
got the valves, we've got fans, we've got pumps, and all this is doing is taking cold water from a cold-water bin and using that to send cold air into our simulated data center. You might say there's not usually cold water in a real unit, but my boss was too cheap to buy an actual condenser, so that's what we were working with. On the other side we have a glassed-in simulated data center. We've got all the HVAC equipment that you'd see in a data center, and we even have a raised floor with a server sitting on it generating heat. For a moment I'll bring your attention to one of the sensors, the independent temperature sensor. This is the only component not hooked up to the Delta system, because it gives us ground truth for what the temperature is regardless of how much we mess the system up. On the top we have some simple LEDs indicating different states of the equipment, again mostly so it's visually easy to see, but the most important one is the alarm. The alarm is an interesting one because in real life it could be done in a few ways. It could simply be an LED, or it could be a siren, but more frequently the alarm is used for something like email notifications, so that somebody knows the system is in distress. Anytime we talk about the alarm, that covers all of those, and the attacker would be able to manipulate that as well.

So I'm going to show you what this would actually look like if someone attacked it using our premium malware, which will help pay for Mark's mortgage. On the upper left-hand side of the screen you see the attacker running the exploit script. On the top right-hand side of the screen you see the Delta controller, picture in picture. As the attacker launches the exploit, it leverages the vulnerability, downloads our malware and a few other fun images, and then reboots the system to leverage that LD_PRELOAD method that Mark talked about. Here you can see we've replaced the images just to prove that
it's hacked. No one would ever actually do this in the real world, but we're going for a visual representation. Also, we're logging in to follow along, but no one actually needs to touch or log in to the system. So what are some of the things the attacker might do once they've got to this point? One thing they might do is control the outputs of the system. The system takes inputs and, based on those inputs, drives its outputs. Here what you're seeing is the attacker manually overriding several of the outputs on the screen, one at a time. In the bottom right-hand corner of the screen you will also see that the system is reacting accordingly: the pump shutting off, the alarm turning off, and so on. On the top right-hand part of the screen you'll notice that the Delta screen is not reacting; in fact, it's staying in its previous state. This is because our premium malware has the ability to update or not update the screen at the attacker's will, and when we start talking about alerting, or whether the user is aware, this is a very important feature.

What you just saw, turning stuff on and off, is binary outputs. These systems typically have both analog and binary outputs, and all analog means is that they're set to some floating-point value instead of just on or off. The attacker can manipulate those as well through the same interface. This time you will see the Delta screen update, because we're going to have the attacker choose to update the screen. They're going to change the fan speed and the valve, and you can again see the components reacting appropriately in the bottom right-hand corner of the screen. We've also provided a reset function in this malware, which sets everything back to its original programming without removing the malware's presence, so it gives the attacker a way to put things back and still have persistence on the device. So controlling outputs is fun, but it would be a lot of work for an
attacker to have to manipulate all the outputs in order to get the outcome they wanted. Remember, I mentioned that these systems are designed to take input and, based on that input, perform some action. So instead of controlling the outputs, if we manipulated the incoming data, the rest of the system would just function as normal. In the world of HVAC, if the system thought it was reading 30 or 40 degrees from the data center, what would it do? The same thing your house HVAC would do: it would turn itself off, because it has no need to cool. That's exactly what we have going on here, and you can see on our independent temperature sensor in the bottom right-hand corner that the temperature is rising. Alarms are not going to trigger, because the system doesn't think it's in distress. All the valves and dampers are going to close on their own without the attacker doing anything, because that's how the system is programmed to react when it thinks it's only 30 or 40 degrees inside the data center. So controlling the inputs is actually a more impactful scenario than controlling the outputs.

A data center, again, is just one example of how this would work in the real world. Think about a major corporation, insert the name of your favorite cloud provider: if their data center was to lose HVAC for a couple of days without their technicians being any the wiser, we could probably impact a pretty large portion of the internet. But we want to highlight that data centers are not the only target. This is a building controller and it's not specific to one industry; these are installed across a very wide range of industries, and I'm going to highlight a couple of examples so we understand the impact beyond melting down a data center. The first one I'm going to talk about is the healthcare industry. Healthcare is slightly unique in that building controllers control pressure in individual rooms throughout
a hospital. What do I mean by controlling pressure? By setting the pressure in different rooms to different levels, they prevent the spread of disease. With positive pressure in one room and negative pressure in another, diseases can't travel from one side of the hospital to the other. The other unique thing about controlling pressure is that it involves only a very small change in pressure. With temperature, you might walk into a room and say, hey, it's kind of hot in here. You're not going to walk into a room and say, hey, the pressure is kind of low in here. So a human is not going to notice that change. If an attacker was to get on one of these systems and make the pressure even throughout the hospital, and I'm talking the OR, the quarantine room, the rooms where they're housing critical diseases or samples, all of a sudden all that stuff would be able to travel airborne throughout the hospital, and that could actually affect human life. That's a pretty interesting scenario.

If we talk about the government sector for a moment, I only bring this one up because we did some open-source intelligence research online and found that these systems are being used in certain state government buildings for access control. If we were able to access one of these systems in a state government building, I don't think I have to go into too much detail at a hacker conference about what that would provide. We would be able to access sensitive data that we weren't supposed to get to, or social engineer our way in. And last but not least, I'll take a moment to highlight education, because I think it's one where people are probably asking themselves, why do I care if you can affect the HVAC system in my kid's preschool? What I would argue is that we often forget
that because this system is completely accessible over the internet and remotely exploitable, it's not only about the system itself. Unfortunately, we find that a lot of the time these are not air-gapped and there are other systems on the network. Having complete control of a system that's on the network would allow us to potentially view video feeds going over the network, or compromise other systems on the network, which all of a sudden you might care about a little more than just the temperature. But don't worry, no one would actually connect these systems to the network, so we really have nothing to worry about from a network perspective. Ah, I forgot: Shodan shows us a picture that's a little different from the optimal one. At the time that we originally did this, there were about 500 units of the exact model we exploited, running the vulnerable firmware, on the internet. The thing is, where we discovered this vulnerability is pretty low in the firmware stack, and because that code digests the BACnet protocol, it's used across multiple Delta Controls devices; it's not specific to this Delta Controls unit. If we look at those numbers, all of a sudden we get to about 1,700 devices connected to the internet that have the potential to be vulnerable, give or take 10% or so for honeypots that we found on the internet as well.

With that said, I think we've covered just about all of the impact and the details of the vulnerability. I want to take a moment to highlight the vendor. Per McAfee's responsible disclosure policy, we disclosed to the vendor in late 2018 that we had found this bug and everything we were able to do with it. A CVE number was generated, we worked very closely with the vendor, and in June of 2019 the vendor released a patch. ATR has tested the patch and is happy to
confirm that the patch fully mitigates this issue. I also want to take a moment to say that Delta was awesome to work with, and they were what I'd like to see as the gold standard going forward for security researchers disclosing bugs. I've been in the industry for almost 10 years, and there's sometimes a lot of bad blood between security researchers and vendors. These guys were extremely receptive to the information we provided, and they worked with us very closely to make sure the system was patched. This is really what we want to see from all the vendors in the industry. We're not trying to do anything negative; we're trying to help you and find these bugs before the bad guys do. This is a really good example of how effective that can be if you can build a good relationship with the vendor, because they're actively patching these systems right now, and we've already seen the patch being applied and fewer of these systems vulnerable on the internet.

So thank you for your time. Mark and I will be at our demo station, which is again literally 10 or 20 feet over there, and we're happy to answer any further questions you might have. I think we're going to get off stage here, so if you have any questions, please come to our station. Thank you.