Good morning, everybody. My name is Jim Harris from PFP. Today I'm here with Carlos, our CTO. We're going to talk a little bit about side-channel analysis for critical infrastructure protection. Since we're starting a little bit early, feel free to interrupt with questions as we go along, and hopefully this will be an informative session.

I hate standing behind a podium. Normally I like to walk around when I'm talking, so we're going to try this: he's going to walk, I'll talk a little bit, and then we'll switch. Oh, great. Okay, all right, this is a little bit easier for me, because I have a little ADHD.

So what we're talking about is essentially taking a technology that we developed for the US government and US military. Our first commercial targets were in the ICS space, and the reason is that it's a similar use case. You have absolutely critical infrastructure that absolutely has to be protected. You have things that cannot go down, you can't load software on them, and you can't look at the network traffic. You can't do any of those things, because there's this big divide between the focus of OT, which is the safety, security, and availability of the system, and the focus of IT, which is trying to prevent compromises and breaches.

I have a long, weird history. I won't go through the whole thing, but although I started as an engineer back in the 1990s, I took a detour through the FBI for eleven years as a special agent, working mostly in the cyber division. Then I became a consultant, mostly back to the government, and did a lot of critical infrastructure protection and cyber events in an effort to help people talk more intelligently about risk decisions. The IT and OT divide was a big part of my consulting business back in the day, because of this difference between how people view things. There are lots of different things you can go into about the
psychological differences: IT folks tend to think abstractly, in terms of things like IP addresses and MAC addresses, while OT guys tend to think concretely, in terms of things like mechanical processes and switches. That goes all the way through to the difference between a heavy focus on confidentiality and a focus almost entirely on availability and integrity.

The problem, of course, is that there's now a shared deployment. In the old days, OT and IT didn't meet. OT was on Modbus, serial ports, things like that, and IT was IP addresses and Ethernet, and they just didn't have a big connection. But now everything, including Modbus, has gone to Ethernet as well, and there are even wireless products that can be IP based. So we have this problem: we have to make decisions about systems where the folks who traditionally do IT don't necessarily understand what the system is doing, or how to judge the context in which it's operating.

So, yeah, let me get to the last points that Jim was talking about. There is a lot of emphasis on using machine learning for security, and one of the main problems we have with machine learning is the ground truth. When you're training your machine learning, you have to make sure that whatever you claim is good is actually good, and whatever you claim is malicious is actually malicious, and there is a very blurry line between the two. So coming up with that ground truth in the first place is very, very difficult.

The last one is what we call the endpoint paradox. Most endpoint protection relies on actually installing agents on the devices themselves. And of course you need that endpoint information for context. If you're just looking at the network and you see a packet going through, you don't know what it did. Did it deploy? Did something come back? You don't know. You need to have that endpoint context to really
understand what's going on with your network, with your whole system. But in order to get that context, you usually need to install an agent on the endpoint itself, which is a little bit like asking the fox to guard the hen house, because the moment somebody compromises that endpoint, they can make the agent lie to you. So that's the paradox, as we call it: you need to rely on the endpoint, but you cannot trust it.

All right. A lot of the things that people do today: trying to separate the systems from the internet, good. Patching, good, but difficult if you've separated the system from the internet. Using these kinds of traditional IT systems. They're all things that are necessary but not complete, right? They don't actually solve the essential problem. They also don't necessarily solve the fundamental insider problem that everybody kind of understands they now face. With this... sorry, let me hand it back to Carlos, because there was something about limited operations that he put on here that I wasn't sure about.

Right, yeah. So like Jim said, IT is very different. Most of the time when you talk about security, you bring the things that we learned in the IT world and try to jam them into the OT world, and they're very different worlds. One quick example: for Windows updates, the best time to do it is on a Sunday at 2 a.m.
in the morning, when it's not disrupting anybody. That would be the absolute worst time to do it in an OT system, because if something goes wrong, you want to have everybody looking at it so they can take action. So they're very, very different. And when you try to jam the security solutions from IT into OT, you leave some systems vulnerable, because not all of them can be deployed. In addition to that, a lot of the operational requirements are very strict, and they're very different in OT systems. You have embedded systems, you have legacy devices, you have a broad variety of platforms that have to interact, and reliability is king. So it's really difficult to take the things we have learned in the IT world and just apply them directly in OT. That's what we were trying to say here.

Yeah, and to that end, I sort of blanked there for a second, but I had two interesting consulting engagements. Of course I can't mention the companies involved, but both of them were utilities. One of the utilities proudly said, we solve all of our problems with air gaps, and then they went on to describe their assets. I said, okay, so it's truly air-gapped, then? They said, yeah, absolutely air-gapped, this is a power supply system. I said, okay, so the billing for the energy, how does that get to your billing department? They said, oh, it just goes through the firewall and the MPLS to the business system. But that's not an air gap. No, it's like an air gap, but it's not an air gap. It was just kind of funny that, as we were having this discussion, they genuinely thought they had air-gapped the system by putting in a firewall and running MPLS through the firewall.

I had another company that really, truly was, as far as I could tell, air-gapped. They had completely severed any IP connection into the system. Cool, how do you update the firmware? Oh, we go over to the internet machine, download the firmware, put it on a thumb
drive, walk over to the machine, and load it. Okay, technically an air gap, but obviously another vector. And they had no other software on the system to protect it, or any other device to protect it, because, hey, we're air-gapped, what could possibly go wrong? They were doing that, which post-Stuxnet everybody knows doesn't work.

So what we're looking at instead is something a little bit different. The challenge here, of course, is to put in something that doesn't require loading software or interrupting the network, and that can't become a point of failure for the entire system.

What PFP has been researching and doing for quite some time in the government space is side-channel analysis. Everybody at the conference has probably heard somebody talk about side-channel analysis in terms of reading RSA keys, or breaking or doing bad things to a system. We're kind of the other side of this: we want to use the same process. We want to look at tiny fluctuations in either the power or the EM emissions to determine what the state of the system is: whether it's in a known good state, a known bad state, or an unknown state, which in this type of application should be considered bad.

Think about the power plane inside an electrical device, like our badges and everything like that. Each time a processor, microcontroller, whatever, has to perform an operation, has to do something on a clock cycle, even if it's negligible, it has to reach into that power plane and pull some power out. Think of that as a very still, crystal-clear lake, like Lake Tahoe in the summer. It looks really clear, and you almost don't want to touch it, because as soon as you touch it, you know you're going to create ripples, and those ripples are going to go on, smaller and smaller, but indefinitely. If you think about a deterministic process, like an industrial control system, reaching in and dipping into that power plane over and over again, it creates very patterned... patternistic is not
a word, I'm sure, but deterministic patterns of waves on that plane. So we're usually using EM in these cases, and we'll talk about why, along with some signal processing and machine learning, to identify in time and frequency space which things are important for telling the different operating states apart. Then we output a statistical fit of what state the machine thinks you're in and how confident it is in that state. Is there anything you wanted to add?

No, that's really it. There's a lot of signal processing involved, so when we start talking to people about side channels and transforms, wavelet transforms and things like that, the traditional cyber people often have a hard time wrapping their heads around it. But in principle it's very, very straightforward. When you have a digital device, you're flipping bits from 1 to 0 and from 0 to 1, and the more bits you flip in every clock cycle, the more energy you need to flip those bits. So as you execute your logic, you're flipping more or fewer bits, and that gives you this very tiny but very unique pattern that depends on the hardware and the software. People doing side-channel attacks go after that to steal information. We're flipping it around and using it to make sure that nobody has modified the logic in your device.

Oh, yeah, this is the slide. The next slide, which we'll see in a minute, is one that for some reason takes a long time to load on this computer, so let me talk about it a little bit while it loads.

Oh, yeah, sure. So, the training. I probably should have talked about it more, but for the machine learning training we've got a couple of different paths. One is the original machine learning algorithms that Carlos developed some time back as part of his PhD work, and that has been developed into what we're currently using. I'm also doing some work
now in deep learning, with convolutional neural networks, to do the same thing. So less signal processing up front and more deep learning, which obviously takes more processing power but can get better separation in some odd, difficult cases. That's still kind of under development. But Carlos can talk more about his work.

Yeah. So my background is in wireless communications. I used to work with software-defined radios, and the origin of the technology was looking at how to help regulatory bodies certify software-defined radios and enforce those certifications. When the FCC tests a new radio and puts on a stamp of approval so that it can be sold, they certify specific hardware with specific software. It's a pair, and if you change either one of them, you have to get retested and recertified. But they never said how they were going to enforce that. So that was part of the work we were doing: figuring out how we can detect that either one of them has changed. Of course, we looked at side channels, they worked, and the application to cybersecurity was straightforward.

In terms of the training, like Jim was saying, we have a battery of different machine learning algorithms. They go from the traditional ones, support vector machines, random forests, Bayes classifiers, to deep learning, where we're doing a lot of work lately and which is giving really good results. All of them work in different cases, so we have a battery of them. We do a lot of feature extraction ahead of time, a lot of signal processing to synchronize the signals and clean them up, and then we pass them to the classifiers. A lot of our work that is part of a DARPA project on using AI to classify signals in different areas has fed this, so we're finding out what works best for different use cases. Different machine learning models have different accuracies depending upon the signal, you know, different
parts of the signal. In fact, we still don't fundamentally understand why some things work better, and part of what we're doing now is fundamental research: can we figure out why certain machine learning algorithms work better on certain types of signals and not on others? There are still a lot of fundamental questions to be answered about that.

So, it finally loaded the slide. One of the things that happens when we tell people that we look at side channels: PFP stands for power fingerprint, and we often look at the power consumption of the devices, so people often think, oh, you mean the level of power in my cell phone, like the battery indicator. That's not what we look at. We look at tiny, tiny patterns. This is what they actually look like; that's one of the traces from a PLC. And if you see the picture of the chip there, the emissions radiate directly from the silicon. This is part of the fundamental physics of the semiconductor: as you move electrons around, you generate those fields, and those are the ones we're picking up. People often ask us, if you're looking at power, what if I turn my screen on, or what if I turn a fan on? Is that going to mess you up? No. We're looking at the emissions directly from the processor that is executing your logic. So it's a different concept: no packets, no system calls.

Yeah, and one of the things about this probe right here: if you go see our demo in the next area, you'll see that we have a very tiny little loop antenna. A question commonly comes up: isn't that subject to a whole lot of noise? But the type of probe that you'll see there is mostly picking up the magnetic component of the EM emissions, right? So that drops off very rapidly with
distance, so it's much more accurate when we use the EM. We also have some demos installed on the wall where you'll see it using DC power, which is also pretty good, though not always in an ICS if it's in line, because then potentially we could become a point of failure for the system. But the EM is really, really good and works well in a noisy environment, because we're mostly measuring the magnetic field component of the EM field.

So we talked already about side-channel attacks and TEMPEST. You guys are probably familiar with it: TEMPEST was designed specifically for those side channels. If you're familiar with those, you know that they have been used for decades to extract information; we're using them in a slightly different way. And if you see the racks and stuff that go into a TEMPEST system, that's what you have to do to prevent the signal from leaking out. Go ahead.

That's right. So basically that would be the case of jamming, right? If somebody were jamming you with a magnet or with anything else, we would see it. You would see, oh, there's just this big jam, and we would flag it, and somebody would have to go look at it. But it would be very obvious in the signal that you're being jammed.

That's true, that's true, they could be just working with magnets. It is possible, but it's very unlikely. We actually were doing some tests in a substation. They have these big, massive transformers right next to it, and there are these huge electromagnetic fields around it. When you go there, they ask you: no metal, and you have to put on a suit to be able to get in there. And it worked just fine, because at that point you increase the noise a little bit, and those signals come in different spectral bands, so you can filter them fairly easily. And if somebody were going to be playing with a
magnet right next to your device and moving it at, you know, several kilohertz, well, that could probably impact us, but it's very unlikely.

Yeah, with a static magnet, the moment-to-moment change of the magnetic field isn't really going to register. It's the delta. So if you're moving the magnet rapidly in and out, if you do it ten thousand times a second, then it would definitely make a field. And a lot of the magnetic fields we would be around might be static.

Yeah, I mean, exactly, it is potentially there. That's part of the reason why, when you're doing the baseline, the machine learning, you should do it as much as possible in the environment in which it's going to be deployed, or as close to that environment as possible, so that the machine can already learn what the ambient noise it might pick up looks like.

Integrity assessments. Again, what we're essentially doing, once we have built a model, is looking through those side-channel patterns. We have our baseline, we measure the distance from the baseline to what we're seeing right now, and then we give you a confidence level: this is the state I'm in, and this is my confidence level. If the confidence level gets too far outside of what is acceptable, and it doesn't match any other state, then it's an anomaly. I don't know what it is, and I can't help you figure that out, but I can absolutely tell you that it's not operating exactly the same way. It could be because of an electrical failure, it could be because somebody's doing something with electromagnetic fields nearby. I don't know what it is. I just know I'm not in the operating condition that you expect me to be in.

Okay, this one. So there are two ways in which we normally do the training. The way that we prefer is supervised learning, which means that you grab the device that you're going to be monitoring, you bring it to your test and
evaluation room, and you make it go through all its paces, through all the different states. This is the exact same type of assessment that you would do for code coverage in your traditional functional tests. You want to exercise the different paths. It doesn't mean you have to exercise all the different inputs; you have to exercise the different execution paths. That way you can come up with a complete library of what the normal, real states are, and then if anything else were to come in, we will flag it. Of course, that requires you to have a test and evaluation room and to be able to monitor the execution of the different states.

The other one is unsupervised learning, where you simply observe the device live for a period of time, and whatever you observe in that time, you make part of your library. Anything you haven't seen, anything you can't match, you flag. But in that case we can have more false positives, because we haven't seen all the states.

People often ask us, well, how about complex platforms? Some PLCs are actually fairly complex. What we tell them is that we limit the scope. We either force them to execute a specific task and make sure that that task hasn't been compromised, or we go low level and make sure that the firmware and the initial execution, the BIOS, haven't been tampered with.

So let me go to the next one. One of the things people ask about is the performance, how well it does. This is some of the early work we did with DARPA, and it shows the ROC, the receiver operating characteristic curve, which basically tells you how well a detector works. The vertical axis is the probability of detection, which is when you have an anomaly and you detect it, versus a false positive, which is when you have a legitimate event that you've mistakenly flagged as an anomaly. And in this case you see
that for the blue line, for over 80% probability of detection, you have a 10 to the minus 15 false positive rate. The reason why we can do that, and why you see the three lines, is that PFP works differently. If you were to send a file to VirusTotal and get your assessment, it will tell you malicious, not malicious, whatever. If you send the same file a thousand times, you get the same answer a thousand times. With PFP, you observe one execution instance and it gives you an assessment, which is the black line. If you can observe another execution instance of the same code, you can put them together and start averaging the noise out, so you get a cleaner signal. The more you observe, the cleaner the signal gets. That's how we can come up with such a low probability of false positives. It works differently.

Doing integrity assessments using side channels is much harder than just asking the device, hey, are you okay? But there are a lot of advantages to doing it this way, and the main one is that we do no harm. Like we talked about before, there's that line between OT and IT, and on the safety-critical side you have to make sure you do no harm to those systems. With PFP, because we can be physically separated from them, physically air-gapped from them, you guarantee you do no harm. There's no latency or reliability impact on your network or on the device itself. You're really just putting a probe right next to it. Because we look at the emissions as a signal, we don't care about the box that generated them; we look at the devices as black boxes. You can support embedded devices, legacy devices, real-time systems, and we add no latency to them.

So there's no need for recertification. A lot of critical infrastructure plants have to go through a very rigorous certification process to make sure that they don't, you know, explode and kill people. So
every time you introduce a change in any of those systems, it can be very expensive to go through the whole recertification. With PFP, since you're using side channels and you're not changing any of the system, you avoid that whole complex process.

And very importantly, it does not introduce additional vulnerabilities. Something like 30 percent of attack vectors, I don't remember the actual number, come from the security solutions themselves. With PFP, again, you are separated from the system, so you introduce no additional vulnerabilities to it, and you can detect things very quickly.

The last one that we have here is that it's very robust against evasion. Technically it's possible for somebody to generate a sequence of code that perfectly matches what the original code was doing, but it's very, very difficult. It also covers accidental faults. If for some reason a gamma ray hits your system and it starts misbehaving, we will catch that as well. It's not a malicious attack, but it's something that you need to know about, because you're dealing with critical infrastructure.

It integrates with any other solutions, so you don't have to modify any of your systems, including the security solutions that you have in place. You can have your access controls, you can have firewalls, anything else you want. We just add an additional layer of protection. And because we're air-gapped from it, if you compromise the target device that we're monitoring, you cannot get from there to us, and if you compromise us, you cannot get to them.

Yeah, and again, to this point about adding a layer: we're obviously not sitting here saying that this is the only way you should monitor the device. There are lots of other things you should be doing. This is just an additional checkpoint. When you get to the idea of what we can detect, again, at the end of the day, if the function gets weird, which could be because it's
failing, or could be because somebody is attacking it, all we can tell you is that it's weird. We can't tell you that it's malicious. In some limited circumstances, with other types of devices, we have categorized things like Mirai on cameras and things like that, so we have known bad states that we can characterize. But there are too many of those, and there are enough people in that area doing that type of stuff. We focus more on making sure that the hardware is what you expect it to be. Is it time?

Okay. Well, this was a very interesting slide, but we spent too much time at the beginning, so let's just wrap it up. There's a DARPA project that is funding us to do this work, and there are basically two deployment options: you can deploy the technology at runtime, or you can deploy it for screening. So if you're critical infrastructure and you want to make sure that the hardware of your devices hasn't been compromised, you can tailor the tools to do that. Or you may want continuous monitoring, and when we say continuous, we really mean 24/7, every second, making sure that the execution of your system hasn't been compromised. So we have those two deployment options. We've done tests with PLCs, and you can come next door to see our demo. We've done tests with Cisco routers, network infrastructure, a number of platforms. So with that, I'll let Jim wrap it up.

Yeah, so again, we'll wrap up quickly. We don't want to go over time, and we want to be respectful of the next speakers that are coming up. But please come see it, and think about this and other applications of it. We're still kind of in that transition stage between government and DoD research and practical application, so we'd love to hear any ideas or thoughts you might have on it, and we look forward to talking to all of you. Thanks very much.
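
Two ideas from the talk lend themselves to a small illustration: measuring the distance from a baseline to decide which state a trace matches (or flagging an anomaly when none matches), and averaging several execution instances of the same code to clean up the signal. The Python sketch below is only a toy under stated assumptions, not PFP's implementation. The sine-wave "traces", the Gaussian noise, the Euclidean distance, the fixed threshold, and every function and variable name are hypothetical choices made for illustration; a real system would use the feature extraction and classifiers described in the talk and would report a confidence level rather than a hard cutoff.

```python
import math
import random

rng = random.Random(0)  # fixed seed so the sketch is repeatable
N = 64                  # samples per trace (arbitrary choice)

# Hypothetical baselines: one synthetic waveform per known-good state.
baselines = {
    "idle": [math.sin(2 * math.pi * k / 16) for k in range(N)],
    "run": [math.sin(2 * math.pi * k / 8) for k in range(N)],
}

def observe(clean, sigma=0.5):
    """Simulate one noisy measurement of a clean trace."""
    return [x + rng.gauss(0.0, sigma) for x in clean]

def average_traces(traces):
    """Average aligned traces sample by sample to suppress noise."""
    return [sum(samples) / len(traces) for samples in zip(*traces)]

def distance(trace, baseline):
    """Euclidean distance between a trace and a baseline (an assumed metric)."""
    return math.sqrt(sum((t - b) ** 2 for t, b in zip(trace, baseline)))

def classify(trace, threshold=2.0):
    """Return (closest known state or 'anomaly', distance to that state)."""
    name, dist = min(
        ((n, distance(trace, b)) for n, b in baselines.items()),
        key=lambda pair: pair[1],
    )
    return (name if dist <= threshold else "anomaly"), dist

# One execution instance versus 100 instances averaged together.
single_dist = distance(observe(baselines["run"]), baselines["run"])
averaged = average_traces([observe(baselines["run"]) for _ in range(100)])
avg_dist = distance(averaged, baselines["run"])

# A modified "run" trace with an extra component that no baseline contains.
tampered_clean = [
    x + 0.5 * math.sin(2 * math.pi * k / 4)
    for k, x in enumerate(baselines["run"])
]
tampered = average_traces([observe(tampered_clean) for _ in range(100)])

state, _ = classify(averaged)
tampered_state, _ = classify(tampered)
print(state, tampered_state, avg_dist < single_dist)
```

With these synthetic numbers, the averaged clean trace sits much closer to its baseline than a single noisy observation does, and the modified trace falls outside the threshold for every known state, which is the "unknown state, so treat it as bad" case from the talk. The threshold of 2.0 only makes sense for this toy data.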