Hi, my name is Tim Yardley. I'm here to talk about some work that I've done under the DARPA RADICS program. (Sorry about that, I was getting audio feedback from the live channel.) So I am here to talk about building a cyber-physical test bed and how we've used it to support black start restoration under cyber fire. This has been part of a DARPA program called RADICS, which stands for Rapid Attack Detection, Isolation and Characterization Systems, and I'll talk a little bit about what that is in a minute.

Just for coverage and as a statement: the test bed work that I'm going to talk about was funded under DARPA's RADICS program. I work for the University of Illinois, and I'm the principal investigator of the University of Illinois effort there. We also had government subject matter expert support from Idaho National Lab, and the program evaluator is a company called Provatech. All three of those organizations have been critical to the work I'm talking about, in terms of enabling the black start restoration efforts and the validation of the work itself.

So let me give you a little background on me.
My name is Tim Yardley. I'm a principal research scientist at the Information Trust Institute at the University of Illinois. I am also a father, a husband, and broadly a researcher. I've been doing work in industrial control systems for about 14 years now, and I've been doing work in security for, geez, probably almost 30 years now across the board. My background came from the think tank and IRC era, back in the early days of EFnet, from groups that, let's say, liked to explore and figure things out but also caused some havoc in the process. I was one of the original members of w00w00 and a variety of other security think tanks over the years, and I've been in the realm of computer security for quite some time. I've been working in academia for about the past 12 years, and prior to that I worked in industry supporting a variety of different efforts.

So, the DARPA RADICS effort itself. Let's talk a little bit about what it is. The program is designed to build technologies that fill a gap, and that gap is the notion that if we are attacked by nation states, we are ill prepared to determine exactly what happened, where it happened, and how it happened, and to get rid of it as fast as possible. So DARPA stood up this program to build technology that advances that state of, let's say, preparedness, and then to evaluate that technology at a realistic facility. Our role in the program was to build that realistic facility.

We'll talk in more detail about the objectives of the program itself. These are all public slides from DARPA. The key objective of the program is to enable black start recovery of the power grid amidst a cyber attack on the energy sector's critical infrastructure; in this particular case, the critical infrastructure is the electric power grid. In a prevention state you would say, okay, let's defend and detect what's going on. But this program starts from the premise that the adversary has been successful. You have a complete blackout, no power anywhere, and you have to figure out exactly what happened, how it happened, et cetera. You have to figure out which devices can be trusted, because you don't know what's compromised and what is safe or operating per norm. Much of the physical infrastructure is effectively controlled by intelligent devices. These are the ICS devices that you have to explore on, let's say, the cyber side of the equation, and they control the physical aspects of the grid. These assets are spread throughout the country. So how can you do this in a scalable way? How can you deeply and forensically poke at these embedded devices? How can you figure out exactly what is trustable and what is not, what was attacked and how it was attacked, and then get rid of it as fast as possible? The goal is to do so across the entire United States within seven days: to isolate, characterize, and restore any crank paths necessary to bring the grid for the United States back online.

So let's talk about the manifestation of this. In the first year to year and a half of the program, we built exercise environments that were run at the University of Illinois. Those were getting people's feet wet and getting the technology ready to explore, let's say, the beginning edges of the problem space. As we evolved both the technology and the program, we had to turn the corner. People will only believe what happens in a lab to a certain extent: "Oh, that was in a lab environment. That's fine if you're just doing science, but that's not the real world." So we took it to a federal island called Plum Island.
It's the home of the Animal Disease Center, controlled and owned by DHS S&T, and it's the former site of Fort Terry from the Spanish-American War and a bunch of other stuff. The size of the space that we're controlling there is roughly the size of Central Park, and we have built electrical infrastructure on that island. I'll talk about the most recent incarnation of it, which was completed in November of 2019. That was the sixth exercise of the program. We have one more exercise still to do. COVID has postponed that a little bit, but it is still pending and so far has not been fully canceled. We look to do that last hurrah of the program and validate the advancements from the last exercise until now.

In the island infrastructure, we have basically built three full utilities, and those utilities are represented by gear that we'll talk about in a moment. There is an emergency operations center for each utility, and then there's a regional coordinator that's trying to coordinate the actions among the different utilities. We have the National Guard involved, we have the RADICS performers involved, and lots of other entities that help establish communications in a variety of ways and really set up what is a fairly austere environment from the beginning to enable the capture and execution of these exercises.

Utility A has five low voltage substations (and I'll talk about what low voltage means to us) and one high voltage substation.
It also had one generator, which is, in essence, the crank path that's getting to the high value assets. Utility B had seven low voltage substations along with three high voltage substations, and that crank path had a critical national asset on it that needed to be up and maintained no matter what. Utility C was similar to Utility A: it had five low voltage substations and one high voltage substation, as well as one generator. Each of these utilities differed in the physical topology and layout of the substations. They also differed in the equipment that was deployed across each of those substations.

The program is broken into multiple technical areas. You could look at it as four or five depending on how you want to count. The first area is situational awareness. When the performers deploy on this infrastructure that we've built, they are trying to figure out what happened or what is currently happening on the infrastructure, to provide as much situational awareness as possible. The reason is that once devices have been attacked, you can't really trust them to be telling you the right things, and you are in a blackout scenario, so you may not have visibility into much of the grid environment.

Network isolation is technical area two. It is focused on taking communications that are no longer necessarily trustworthy and expanding those into a realm of trustable or not. By that I mean point A and point B may have talked to each other before, over what used to be a dedicated private link or whatever it may be, but now the traffic going across there is not reflecting what it did previously. Maybe somebody's man-in-the-middling it. Maybe somebody is manipulating it. Maybe it's getting black-holed in some other way. It's an unknown operation, right?
So how do you take the outputs from A and get them to B in a way that's trustworthy and secure, such that they cannot be modified in the middle, or such that it is evident when they are modified in the middle? That's the area of research for technical area two.

Technical area three is threat analysis. They are really intended to do the forensic response on the devices themselves. How do you diagnose and remove cyber threats from the embedded devices and the different pieces that are involved as you go through the investigation part of the environment?

The environment itself is technical area four, and that's conducted by us at the University of Illinois. That is the environment in which the exercise happens, which is on Plum Island, but also the environment at the University of Illinois campus that the performers remote into to build out their technology, extend their capabilities, and investigate the edge cases of how the grid operates when certain things happen to these devices. That's part of our central facility, and I'll talk a little bit about that. The island is basically a distributed manifestation of that central facility at Illinois.

The last technical area is technical area five, and effectively that is the evaluators of the program.
They build out how to run an exercise in this space and how to determine whether or not progress is being made on the technology: what the appropriate challenge levels are, the, let's say, pain points, how deep or how hard the red team pushes, et cetera. They technically grade technical areas one, two, three, and four as they go through that.

On to the next slide. RADICS exercise six was really a move from the strategic notion early on in the program, to an operational notion with the utilities involved, to a tactical deployment on top of that. If you look in the upper left-hand corner, the strategic notion is the concept we talked about: you have a large-scale blackout and this is what you need to accomplish. The operational notion is the building and establishment of these crank paths. And then the tactical is: let's execute on those crank paths and figure out exactly what happened and how it happened.

The picture on the bottom right at the moment is drone footage zoomed in on one of the substations. You can see that they're built in shipping containers, and those shipping containers have gear inside them that I'll show you in a moment. These containers are arranged and linked together to build the crank path, and we do that with above-ground number 2 SO cord that has connectors on the end that we plug into the individual gear inside the boxes. I'll show you what that looks like in a moment.

Here you can see a little bit of what is inside a container. This is standing inside a container looking out, with the door open. On the left-hand side you'll see what we call the relay box. On the right-hand side you'll see what we call the power box, which is a skid-mounted Hoffman enclosure that controls, in essence, the flow of the substation. So that's the power grid aspect, and the controls of those grid components are in the relay box on the left-hand side. There are also some sensors up on top, which are performer technology providing some of the situational awareness and attempting to determine ground truth as we go through the environment. I'll zoom in on a lot of this as we talk.

I'm only going to talk about the test bed itself. I'm not going to talk about any of the performer technology that was specifically built, but I'll dig quite deeply into the test bed, which is my area of responsibility.

The mission we set out on is to provide realistic environments that enable this cutting-edge R&D, which is not yet done by any commercially available product or any existing research off the shelf. Then we create this environment in a way that allows us to validate the effectiveness, and frankly the efficiency, of those tools.

The goal of the program itself is really to take a generational leap forward in the capabilities of test beds. We've been building and leveraging cyber-physical test beds for about 13 or 14 years now at the University of Illinois. By many accounts we're sort of the gold standard in terms of capabilities across the nation, and arguably the world. But even so, when we pitched our capabilities for this program, the going-in salvo of our proposal was effectively: we have assembled the right team to solve this problem, but the technology that exists today and where test beds are today, across all of them that you will encounter from anyone that bids, are woefully inadequate to go to the level of realism that will be necessary to validate these tools. And even with us, it is an extreme long shot as to whether this will be achievable in the time frame, advancing fast enough to support this post-attack analysis.

So let me riff on that for a second. Before this program started, cyber-physical test beds were primarily focused on either building an environment where we're looking purely at a physical phenomenon, or building an environment that's proving out a hypothesis, physical or cyber, and looking at it from that particular angle while ignoring all of the other details. But in this program everything is unknown coming in. You don't know what the attackers did to you. You don't even know what the attackers want to do to you in the environment. So you have to have every piece of it as real as possible. But you can't possibly go build three real crank paths and 24 to 27 real substations out there, because it costs way too much money and it's a dangerous environment to be in. So how can you minimize the environment, maximize the safety, and also maximize the realism, without running into problems with naysayers over simulation being involved, or emulation of devices, et cetera? They need to be able to touch it. They need to be able to feel it. They need to be able to see it. And they need to be able to trust that what it does and how it works is reflective of what happens in the real world.

As for the outcomes of the test bed work itself, obviously we've created a lot of tools. We've pioneered some new techniques and methodologies. We've combined existing solutions, both ones that we've had and ones that others have had, to build an environment together. And we combined not just the academic knowledge that we had at the University of Illinois, but did so in partnership with key vendors and also with asset owners and operators, to build an environment that reflected not only the real world but that took a whole leap forward in its ability to evaluate research.

So what is a test bed?
A test bed really starts with somebody who has a need. Let's call them the customer. That need is to evaluate something. So a test bed is assets, the thing maybe that they want to evaluate on. It's the people with the knowledge of how to build that environment in a way that represents the scenario they need to look at: what to capture, how to capture it, where to capture it, et cetera. It's the science of how to do so in a realistic way while still enabling the necessary data capture that these systems sometimes inherently don't support. And then it's the data itself: what is captured from the devices, either willingly or not; how the system was operating; packet captures, as an example, of the communications going across; and ground truth as to the physical telemetry of what was actually going across, not simply what the devices are reporting is happening. And then it's a manifestation or configuration of all of those things together, provisioned out into an environment that you then do the work on.

As for our capability, we can provision the assets locally in our central environment. We can provision portable environments like what we've built and deployed on Plum Island. And we can also provision into the cloud in a variety of different ways.

So why a test bed? What's the value of a test bed? You've obviously seen the ICS Village and capture-the-flag stuff, but this is a little bit different, right? The reason for a test bed, the reason for the work that we do, is that this is mission-critical technology. And why do I call this mission-critical technology?
The technology being built under RADICS is intended to, let's say, have its glass broken when we're in a blackout scenario, effectively after we've been hit. That's where the real value of this technology is. There are arguments, and I am one of the people who will argue this, that the technology needs to be used even before we're hit. But in the end, our grid being down is what this technology is built to solve. So if this technology is called into practice, it is absolutely essential that it works and that it resolves the issue, or can figure out what's going on, and that we know that before we need it. Because if we don't, and we're in a national blackout, a national disaster scenario, attacked by an enemy or otherwise, how do we come back if we break the glass on this technology and it has not been proven to actually work? You run it and it says, "I can't figure out what's going on. I see nothing wrong. There are no problems here whatsoever." But the devices still won't turn on, the grid is still down, and these devices aren't operating correctly. That's a bad scenario to be in. So this is truly mission-critical technology, and we have to prove that it's effective before we need it.

But we have to go beyond theoretical testing of it. We have to put it in all sorts of scenarios across all sorts of different platforms to verify that it works even in edge cases it wasn't expecting, in circumstances where it's missing data, and in circumstances where it's even being directly attacked or, let's say, misled about what is going on. Our solution for that is a realistic, recomposable, and well-instrumented test bed. That is essential to being able to prove this out, because even the real grid environment cannot be manipulated in the way that we can manipulate the test bed environment. I'll talk a little bit about some of the innovation in that space. And frankly, as I said with the opening salvo of the proposal, everything that existed, including the Illinois capabilities, wasn't good enough before this program started.

Our approach is to build real systems. We also build models, on the cyber side and the physical side, that adapt to the exercise needs and help us build out behaviors and changes in the flows and communications of systems, so that they operate like the real world or operate in a way that an adversary may be able to manipulate. Everything is built in a modular way. It's adaptable and, let's say, recomposable in a variety of different ways. We can take a substation and how it's physically wired and set up at the moment, press a couple of buttons, push a different configuration, and now it's a different substation: same devices, different configs, different network layout, et cetera. All of that adaptability and modularity allows us to recompose the system in any way necessary to present different challenges as we go through.

There's also instrumentation. The instrumentation is key in that many of these systems will, let's say, cooperatively give you a certain amount of data. But sometimes when you're doing forensic analysis you need to gather things that are deeper, or if you're trying to do experimental validation you need to look at things that the system inherently won't tell you, or you need to look at it with much more scrutiny than you typically would in the real world. So how do you turn on an appropriate level of data output and data capture in a way that doesn't actually affect the behavior of the systems? Because sadly, as you may know, some of these systems are underpowered, and if you turn on, let's say, full logging or other things, it can bog down the operation of the system, and then it no longer participates or acts as it would in the real world.

The last part is really knowledge. By that I mean we don't just say, "Look, as academics we're bright people, trust us, this works." We had to bring in real operators. We had to bring in the manufacturers, the vendors, across many different platforms, and talk through with them their best practices, the common misconfigurations that they see when integrators are building their platforms, the common ways that asset owners configure their substations in the real world, et cetera. We did that both to cause, let's say, human error to happen in the ways that people accidentally misconfigure things, and to mimic as closely as possible how people are actually configuring these in practice. That is to get the right level of protection, the right level of data output, and even notions like what a CIP-compliant substation looks like in terms of what it is logging versus one that's not. All of that comes together in the knowledge area.

On the innovation side, we had to innovate quite a bit. The, let's say, orchestration or automation that we had previously was good enough for research, but it made a lot of assumptions. By that I mean it would operate in a centralized environment, but when it tells something to be reconfigured or tries to control something, it expects that (a) the device is reachable, and (b) it has access to that device, that it has the credentials to get on it. It assumes, I guess (c), that it knows what the state of that device is. And lastly, everything it did before also assumed that the device was trustworthy and in a known condition. So we had to, let's say, break down all of those assumptions and operate in a way that didn't rely on any of those existing assumptions.

We applied a bunch of research as well. We had over a decade of work in the prevention space, the detection space, and the remediation space at the University of Illinois, and we had used that in the test bed in a variety of ways.
We had used it or proved it out in the test bed, and some of it is even in formal companies now, transitioned either to big vendors or as startups. We had to apply that in a different way as part of the test bed, not just to say, "Okay, look, here's the test bed environment." For instance, if we could reach deeper into a device, then we used some of that research to dig deeper into those devices and extract information that supports the validation of the technology without affecting the performance of the device itself.

Our team, in particular myself and a few others, has gone really deep on some of these platforms over the years. So we brought a wide variety of devices to the table that we already knew quite a bit about on the inside, sometimes even, one could argue, more than the vendors know about their own devices, in terms of the knowledge and the ways that we could poke around inside these platforms.

We also had to build for deployment in an austere environment, with an "if you don't have it, you'd better bring it" type of notion. So we had to build these boxes so that they were field serviceable. We had to be able to quickly replace components of the system if something were to break, or if the intent of the cyber attack against it was literally to brick it so that it was no longer functional. How do we restore or replace that as fast as possible, to move on and not stop the whole exercise if something breaks?

We had to advance our automated configuration and data extraction, and also the notion of the system and its observation when even the network links were no longer trustworthy or reliably up. Remember, we're in a black start scenario, so we're not even guaranteed that we'll have power at each of the substations to be able to communicate with them. When they're brought up, we need to maintain the state of everything we captured as they go up and down like a seesaw while they're being attacked, brought up, brought back down, et cetera.

We also needed the environment to be recomposable: to change the structure of the crank paths and the behavior of the substation itself without going through and recabling or rewiring everything in there by hand.

So what are these environments? Combined, I call them the substations in a box. It's two components.
This has been built on the extensive facilities we have at the University of Illinois that I've alluded to. We have roughly a hundred million dollars' worth of hardware and software at the University of Illinois that has been built up over the past decade plus, much of it by donation. That has enabled all sorts of research that we've done in the past: Trustworthy Cyber Infrastructure for Power, which was an NSF effort; a DOE and DHS effort called Trustworthy Cyber Infrastructure for the Power Grid (it had "grid" at the end); our most recent center, which is wrapping up in the next year or two, called the Cyber Resilient Energy Delivery Consortium, which is also DOE and DHS funded; our Critical Infrastructure Resilience Institute; and a variety of other things that have leveraged the test bed resources.

The substations in a box, as I mentioned, are designed to support this black start crank path analysis and to be deployed in the field, in real grid environments, et cetera. They're built in Pelican-style cases, so they're literally shippable and deployable anywhere we need to stand them up. They're generally IP55, mostly watertight, when they're shipped and moved around. When we physically deploy them, we take the case lids off and put them in enclosures. The reason for that is literally so the devices inside don't overheat, but also because you do need some physical access to the devices to control breaker operations and other aspects.

We built an environment on an island. The power infrastructure that is overhead and underground stays, but basically everything else gets torn down and built back up every six months in a different way. There are currently 26 variants of substations deployed across that infrastructure. Those substations have relays, RTUs, substation network switches, and routers, as well as an experimental fabric underneath that's controlled by SDN and allows us to do a lot of the capture and, let's say, dynamic changes of the substation itself. All sorts of protocols are deployed; you can see a list of them up on the screen. There are both serial and Ethernet communications. We have custom power connections on the power boxes that allow us to link these systems together in a safe way. And then we have a high voltage infrastructure that I'll talk about as well.

So what's a power box, and what's in a power box? Think of the power box as the physical infrastructure of the island, or of a real substation. That's the breakers, the bus bars, and the incoming and outgoing feeders on the system. We have a local load feeder as well. We have signalization lights that indicate the energization status and the status of breakers. We have a dead bus sync light that's provided. We have analog sync-check relays, contactors, auxiliary contacts, CTs, and PTs, and control circuitry behind all of that which allows us to operate breakers and various other things. We have different modes of operation, including a safe mode where we can walk away from the system and the system can't possibly change, which is a unique scenario that differs from the real world. And then we obviously have the ability to locally control breakers as well.

So what does that look like? There are effectively two types of power boxes that we've built. One is a 208 volt three-phase system.
One is a 480 volt three-phase system. They look basically identical from the front, except that the size is a little bit different. We also have the high voltage systems, which really are wall-mounted Hoffman enclosures that act like, let's say, semi-intelligent breakout boards for providing telemetry from the high voltage gear to the corresponding devices that are operating and controlling that high voltage gear. Think of it sort of like a mapping board. Each power box generally has an incoming circuit, a load circuit, and two outgoing circuits as it's built out. There are basic electrical diagrams in the middle, but nothing, let's say, shocking about that.

And then there's the other side of it. Each of these devices has the number 2 SO cord coming in to Hubbell connectors on the edge. But they also have umbilical cords with mil-spec Amphenol connectors that take all of the telemetry of what is happening inside the box and provide it to the devices that need to control it. That includes the analog and digital signals that need to be sent back and forth between the devices to control them, but also the CT and PT outputs from the system itself, so that all of the sensing is detectable by the relay boxes.

So what are the relay boxes? The relay boxes are really the brains of the substation. Here are a couple of examples showing some of the different technology in play. In the upper left, those are ABB relays along with an ABB RTU. This is a legacy RTU platform that ABB has deployed around the world called the RTU560. In the top middle you'll see some more ABB relays, and above them, instead of an RTU560, you see a Motorola device.
This is a Motorola ACE 3680. If you move to the next image, in the upper right-hand corner, that is an ABB COM600 rack mount, or COM600R, acting as the RTU over the ABB relays that are there. In the bottom left-hand corner you'll see some touchscreen SEL-751 relays; the RTU in that particular case is an SEL RTAC, a 3505. In the middle you'll see touchscreen relays, and in that one there's an SEL RTAC as well, but it's an SEL-3530 instead of a 3505. And on the far right you'll see more SEL relays. These are not touchscreen; they are another variant of the SEL-751, and they are being controlled, from an RTU perspective, by a NovaTech Orion LX.

This shows just some of the diversity of platforms that are there. There are obviously many more across 26 substations. Every single substation is unique in some way, shape, or form, so we have a lot of diversity across the environment in terms of platforms, technologies, and configurations. The diversity could be purely on the configuration side: for instance, different protocols being communicated, or different topologies being set up between Utility A, Utility B, Utility C, and lots of other variation on top of that.

So let's talk about some, let's say, lessons learned from the program, or challenges that we had to tackle, now that you understand some of the gear.
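
As a compact summary of the gear just described, the six relay box examples can be written down as a small inventory. This is only an illustrative sketch in Python: the `RelayBox` type, its field names, and the `rtu_vendor_counts` helper are hypothetical and are not part of the test bed tooling, and the model strings are recorded as spoken in the talk.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class RelayBox:
    """One relay box example from the slide (hypothetical record type)."""
    position: str  # where the photo sits on the slide
    relays: str    # protective relays in the box, as described in the talk
    rtu: str       # device acting as the RTU over those relays


# Entries follow the talk's description; model numbers are as transcribed.
RELAY_BOXES = [
    RelayBox("upper left", "ABB relays", "ABB RTU560"),
    RelayBox("upper middle", "ABB relays", "Motorola ACE 3680"),
    RelayBox("upper right", "ABB relays", "ABB COM600R"),
    RelayBox("lower left", "SEL-751 (touchscreen)", "SEL RTAC 3505"),
    RelayBox("lower middle", "touchscreen relays (model not stated)", "SEL RTAC 3530"),
    RelayBox("lower right", "SEL-751 (non-touchscreen)", "NovaTech Orion LX"),
]


def rtu_vendor_counts(boxes):
    """Count relay boxes per RTU vendor (first word of the RTU name)."""
    return Counter(box.rtu.split()[0] for box in boxes)


if __name__ == "__main__":
    for vendor, count in sorted(rtu_vendor_counts(RELAY_BOXES).items()):
        print(f"{vendor}: {count}")
```

Even in this small sample, every relay and RTU pairing is different, which is the point being made about diversity across the 26 substations.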
So first off is safety. When you can't trust anything in the system whatsoever, because it is compromised, and compromised in a way that you may or may not be able to determine, you don't know what is trustworthy and what is not. All of the systems that are there are effectively designed to protect you, but if you can't trust the digital systems to protect you anymore, then you need additional layers of protection. So we had to layer protection throughout the system in both physical and cyber form. That included analog protections in the system, like analog sync-check relays, time-overcurrent protection, and thermal protection. On the digital side, or the cyber side, the protective relays were configured with safe and sane settings for overvoltage and undervoltage conditions, etc. The boxes had arc flash analysis done on them to determine potential exposure and the safety measures, from a PPE perspective, that were needed. The cabinets were all either padlockable or direct key lockable.
On the high-voltage side we had IntelliRupters acting as fail-safes. If anything flowed through from the low-voltage side to the high-voltage side, such as an out-of-sync close, a surge somewhere, or a fault on the line, the IntelliRupters were there to protect the system on the high-voltage side. We also purposefully did not target the high-voltage control, so that we didn't run into, let's say, big issues. Low voltage was something we could cause a problem on and be okay with, but on the high-voltage side people could get hurt. All of the connectors we used were screw-in, locking-style connectors. We had all sorts of internal wiring protection, etc.
So the key is that the environment itself was designed to be safe no matter what, so that people who knew nothing about power systems could still safely operate in this environment and not run a risk of being electrocuted or whatever. We
always had power engineers and safety officers on site who were making sure that people operated safely and maintained the necessary perimeters, even for noise levels from the generators and various other aspects. But really the system protected itself in every way, shape, and form, even when the system wasn't trustworthy.
So let's talk about some operational lessons that we learned in executing exercises for black start restoration, but also in austere environments, under conditions of blackouts, etc. Let me talk a little bit about the mode of execution as we go through this. When I say that we're operating utility environments, we are actually operating the utility environments. By that I mean utility operators from real utilities come to the island and they run the infrastructure. They basically take control of it. We hand it over to them, and then they tell everyone else what to do, how to do it, and when to do it on the system. Now, obviously we have some exercise control over that, but the intent is to really make this as real as possible.
So let's talk about that realism. Many people don't believe what is possible until it really is, let's say, slapping them in the face. By that I mean a common view is: look, relays are embedded devices. You can't make them do things. You can't disable their protection. You can't change their mode of operation beyond their config. Until they saw us do that, they didn't really believe it. Even when it was happening right in front of them, they still didn't believe it, until they dug deeper and started to look at what was happening and how it was happening, and realized that, look, bad things really are possible. The mindset was that this isn't possible at all, that no one can do that, that you would have to physically modify the device in order to do that. And it's like, no, we can actually do that via cyber means.
So let's talk about the people, right?
One of the operational lessons is that academia is really good at thinking outside the box, but we needed to be extremely agile, think even further outside of the box, and push through and build stuff that, frankly, was, let's say, impossible to build at any given moment. We took on building one utility, then two, then three, all in six-month iterations, with completely different architectures and completely different devices, building the physical boxes and standing it all up in a new realistic configuration at each interval. That's pretty difficult to do. It's multiple weeks to build the environment each time. It's multiple weeks of testing. It's multiple weeks of evaluation to make sure that it is as real as possible, that it doesn't have inherent artifacts that people may view as compromises in the system, that it is pristine, blue sky, trustworthy, and built the way it should be built. That results in, let's say, extreme levels of stress at times. But as a team, not just the University of Illinois but broadly the entire program, we all pulled together and found success in every exercise that we had. There was no exercise that failed because the infrastructure or the people failed to deliver.
And so that brings us to pace. I mentioned every six months. DARPA programs move very fast, and the expectations are very high for the technology, the evaluation, the test bed, etc. There were many times that we were facing failure, but by pure blunt force, and by pure blunt force I mean number of hours and long days and leaning on each other throughout the program, we were able to pull through and pull off what seemed to be impossible when we started.
So those are just some operational aspects that we learned as we built the environment, especially out on an island. And so when I say, you know, what are some challenges, right?
There were times when, you know, we took a ferry every day from the mainland to the island, and there were times when nor'easters were blowing in and the waves were 12 to 14 feet high, and conditions were such that if you go to the island, you're going to be sleeping there, because you're probably not going to make it off. That's kind of, let's say, a bit much, but there were many days like that. There were days we were out there executing the environment, trying to restore after the cyber attack, and there were nor'easters blowing through with 40-plus mile an hour winds and downpours with inches of rain falling an hour, etc. So it was not just hard from a technical perspective; there were also harsh environments that we were out in as we were trying to restore these devices, and as the teams and the utility operators were operating these devices, faced with those types of environmental conditions.
So we learned a whole bunch of other lessons too. One that was interesting to us was that when the systems are intended to break, when you know that you can't trust anything anymore, you have to think differently about what you can build and how you build it, so that it works consistently and reliably even when everything is intended to be broken. That goes back to some of the stuff we did with safety, but it also applies on the cyber side. As an example, we had no guarantee of power at any point in time. So how do we guarantee we don't lose data as we're going through? Or how do we get, let's say, eventual consistency, where if part of the network is up and being controlled by our orchestration and another part of it is down, how do we get it to catch back up and get into the right configuration?
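The catch-up problem raised here can be illustrated with a minimal store-and-forward sketch. It is not the test bed's actual implementation, which the talk does not describe; `FieldNode`, `Orchestrator`, and their methods are hypothetical names, and SQLite is simply a convenient stand-in for a local durable buffer. The idea is that each node commits every reading locally, and on reconnect the orchestrator pulls whatever it has not seen and pushes the configuration version the node missed.

```python
from __future__ import annotations

import sqlite3


class FieldNode:
    """Hypothetical substation-side collector that buffers readings locally."""

    def __init__(self, name: str, db_path: str = ":memory:") -> None:
        self.name = name
        self.config_version = 0
        # A file path would survive a power loss; ":memory:" keeps the demo self-contained.
        self.db = sqlite3.connect(db_path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS readings ("
            "seq INTEGER PRIMARY KEY AUTOINCREMENT, payload TEXT, sent INTEGER DEFAULT 0)"
        )

    def record(self, payload: str) -> None:
        # Commit each reading on its own so an outage loses at most the in-flight one.
        with self.db:
            self.db.execute("INSERT INTO readings (payload) VALUES (?)", (payload,))

    def unsent(self) -> list[tuple[int, str]]:
        return self.db.execute(
            "SELECT seq, payload FROM readings WHERE sent = 0 ORDER BY seq"
        ).fetchall()

    def mark_sent(self, up_to_seq: int) -> None:
        with self.db:
            self.db.execute("UPDATE readings SET sent = 1 WHERE seq <= ?", (up_to_seq,))


class Orchestrator:
    """Hypothetical central side: collects data and holds the desired config."""

    def __init__(self) -> None:
        self.desired_config_version = 0
        self.received: dict[str, dict[int, str]] = {}

    def sync(self, node: FieldNode) -> None:
        store = self.received.setdefault(node.name, {})
        batch = node.unsent()
        for seq, payload in batch:
            store[seq] = payload  # keyed by sequence number, so a retry is harmless
        if batch:
            node.mark_sent(batch[-1][0])
        # Bring the node up to the configuration it missed while isolated.
        if node.config_version != self.desired_config_version:
            node.config_version = self.desired_config_version


node = FieldNode("substation-a")
central = Orchestrator()

node.record("breaker=closed")
central.sync(node)                   # link is up: reading delivered right away

node.record("v=479")                 # link is down: readings pile up locally
node.record("v=481")
central.desired_config_version = 2   # config changes while the node is unreachable

central.sync(node)                   # link returns: backlog drains, config catches up
central.sync(node)                   # repeating the sync changes nothing
print(central.received[node.name], node.config_version)
```

Keying the central store by sequence number is the design choice that makes retries safe: if a sync is interrupted before `mark_sent` runs, the next attempt resends the same records and overwrites them with identical values.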
And when it does come back up, how do we apply any change it needed, or collect all of the data that it had while it wasn't centrally reachable, when it was isolated and running, let's say, on backup power on its own? So that presented some interesting challenges that we had to tackle, which is basically fault-tolerant and distributed computing in a nutshell, right?
The other issue that was somewhat interesting, and that we had to innovate on, was that the gear in the field typically isn't hot swappable. You can't just go grab a relay that's in a real substation, pull it out, and pop another one in. In most substations you can't do that without some rewiring, without taking circuits out of commission, de-energizing them, lockout/tagout procedures, etc. But in a fast-paced exercise environment we have seven days, and if people get stuck and we need to reverse out basically what the bad people did, how do we reverse that out as fast as possible, without bringing the whole exercise to a halt for hours on end? So we had to create sort of quick connects and all sorts of other things that allowed us to swap gear out very, very quickly.
Further, when we're testing bleeding-edge technology, things fail, because sometimes that technology doesn't work or the research doesn't do what it's supposed to do. So your primary plan is: okay, look, it's going to try this, and if it works, great. Well, you have to have a backup plan, right? And in many cases we had to have another backup to the backup plan, because we couldn't fail and make everything stop working. So if anyone in the program failed, we had to have a backup plan for what would happen if they failed, and then a backup plan for if our backup plan failed.
And then the last one, which is a lesson learned, is that we thought we were prepared on many, many occasions. But, let's say, in between days of the exercise, when we would pause and go back to our hotels to sleep, etc., I would
often make runs to Home Depot or Lowe's or electrical supply stores or whatever, because no matter how many spare parts we had, no matter what tools we had on us or available to us, anything that can break will break at some point when you're running these environments. And it's almost always in a way that you never thought of or couldn't have anticipated. Something that has, you know, a 10,000-hour mean time between failures, or a 100,000-hour mean time between failures, fails in 10 hours instead. So there were lots of interesting challenges in just keeping things up and operational, and making sure that stuff was readily available if anything did break.
So let's talk a little bit about some personal takeaways, and I'll wrap these up really quickly here. Let's talk about this not as the University of Illinois but from my perspective. So, vendors: be it the ICS cyber solutions or the vendors themselves, they often claim capabilities that, let's say, are much more limited in real-world application than most people realize. A utility may say, I go buy this platform and it's got me covered; it can do all of these things, I can check that box, and I'm good. The reality is that no matter what vendors claim, there are often still very large gaps there. Even the commercial off-the-shelf ICS cyber solutions that are out there are missing huge amounts of surface area on what adversaries can do against these boxes. RADICS technology was designed to close some of those gaps, but not all of them, right? So even as great as the RADICS technology is, we still have a long way to go.
In building the environments that we built, and helping to construct the exercise evaluations and the test effective payloads and all that, the question in essence is: how do you cause a condition to happen on these devices by cyber means?
And in such a way that there are artifacts or implementations that people can then forensically find, right? In doing all of that, we found hundreds of issues on these devices, and it's not like we'd never looked at these devices before. By issues I don't necessarily mean security vulnerabilities, although in some cases they were security vulnerabilities, but also other items, like: look, host A and host B are supposed to interoperate, and they don't. There are nuances. There are differences between what the documentation says is going to be communicated and what is actually communicated on the wire.
Another personal takeaway is, you know, being the adversary is fun, right? It's nice to be on the red team and hack these devices and break them in a variety of ways. But in the program, the defense and recovery technology, including the RADICS technology and the commercial off-the-shelf stuff that I've personally seen, is still way behind what I was capable of, or what others on the red team were capable of doing to these systems. So if we're so far ahead that we could just obliterate any of that tech, then what fun is there in that? It's not even a fair fight at that point. So a lot of times we limited the activities that we were doing to sort of poking the bear rather than destroying it.
So we didn't go out for, you know, the throat kill. We kept pace with what the technology was capable of and designed something that was just a little bit ahead of that, pushing them each iteration to improve.
One sort of broad takeaway, and I say this not just of technology in North America but of the whole world: what exists in this space, in industrial control systems, for detecting cyber attacks on these devices, that detection, mitigation, and remediation technology for the electric power grid, and I'll stretch it a bit and say for all critical infrastructure, still really has a long way to go. There's still so much work that needs to be put into those platforms to really protect the systems from somebody who is truly focused and determined and understands these systems at a deep and inherent level.
So that's all of my technical content. I will just flash a slide real quick, which is the testbed at Illinois. Much of what we've built in the RADICS program has been enabled by lots of companies.
These are some of the companies that have donated gear, software, etc. to us and that have helped enable the things that we've done. Without them, and without the commercial support, the vendor support, and the utility support broadly across this program, we wouldn't have been successful, and the DARPA RADICS program wouldn't have been successful. So thank you to all of those companies for what they did.
I think we still have a couple of minutes for questions. I'll also leave a sort of bonus link down there at the bottom. There's a GitHub repo that I created a number of years ago at the S4 conference, and I've been maintaining it sort of ad hoc ever since. It's a bunch of ICS security tools in a variety of different forms that are aggregated and categorized in various ways, and mirrored when their original location is no longer available. So do check that out. It has a whole bunch of useful things in it. And with that I will stop talking, and I think we might have a little bit of time for question and answer. Let's see.
Hey, Tim. This is Bryson. How you doing?
Wonderful. Hi, Bryson. Thank you.
Thank you for the talk. I noticed that you are on our Discord, and I actually promoted you to a speaker during your talk, so you should now have that badge tied to you. What we recommend is that you post that GitHub link in the speaker Q&A section of Discord, okay, and then reach out for folks to engage you there for questions.
Okay, sounds good.
So really appreciate you jumping on and giving this talk. Having been and seen this for myself.
It is really impressive. And my favorite part for the sensors was, of course, the inflatable guys, like you see next to the huge car sales. I thought that was a really interesting way to show whether something's up or down at a physical distance.
Yeah, we affectionately call them the dancing men. And a sort of funny aside: in the nor'easters, when you have torrential downpour, those things get wet, and then they turn into sort of like whiplashes. So I was up there untangling them on many occasions and getting sort of smacked by the dancing men as I was trying to untangle them so that they could fly.
Well, that's how you know it works. Well, anyway, Tim, appreciate you joining us and supporting the village, and look forward to the commentary and the Q&A on Discord.
Awesome. Yeah, and everyone do check out what they've set up for, you know, the CTFs and other things in the village. They do an awesome job of creating environments that you can play with. Sadly, I can't offer my environment out to the world in such an easy and accessible way, but hopefully in the future I'll get more deeply engaged with the ICS Village and bring some of this tech and some of this capability to the village so you guys can all play with it too.
Yeah, we look forward to that. That's probably been the biggest innovation we've had because of the pandemic: the amount of effort we've had to spend on figuring out how to make these things virtually accessible, where the typical limitation is concurrent access. We kind of have solved it.
So we'd love to touch base with you afterward and talk about it.
Yeah, it's great to hear that you guys have had to tackle that. We're currently tackling that for the DARPA RADICS effort as well. The last exercise will be predominantly remote. We went from the prior exercise being, you know, network isolated, completely trusted, everyone in a specific zone, to part of this being deployed in the cloud, with everyone distributed around the country and all accessing this in a controlled and crazy way, including streaming body cams that we'll have and all sorts of stuff. So we're on that same roller coaster due to COVID at the moment.
Well, we look forward to collaborating. So again, Tim, thank you very much, and we'll see you on Discord.
Yeah, you guys have a great day.
All right. Take care.