Okay, let's get started. Hello everybody. My name is Jeremy Rosen and I'm an embedded technical expert at Smile, which is a big open-source company, and I wanted to talk about safety versus security: how the ways people think when they're thinking about safety and security contradict each other and tend to pull projects in opposite directions. Before going into the meat of the talk, a few warnings to set out what I'm going to talk about. I'm going to talk about philosophy and culture; I mean, I'm going to talk about how people react when I talk about security or safety in the embedded context. So it's not a technical talk, it's about how technical people react based on what they've learned about safety and security. My company does embedded work, mainly industrial embedded work. That's a very special subset of embedded development in general, very different from, say, consumer products. It has its own constraints, in particular with regard to updates, which are way more complicated; we'll get to it. So just keep in mind that this is not all embedded systems, it's the ones I see. I see about 20 to 50 embedded projects a year, so quite a few; I can start doing statistics about them. And all projects are different, so while I will tend to say "yeah, you have to do that on your embedded project", all embedded projects are different, each of them has its own problems, and you always have to think about your particular case. And since it's a philosophical talk, I have very simple definitions of safety and security, and I don't want to go into the exact details and meaning of those words: safety is anything related to reliability, and security is anything related to hostile takeover, in general.
Okay. We will try to talk a little bit about why embedded systems suck at security. I mean, everybody knows that embedded systems suck at security and that it's the next big end-of-the-world thing coming around, but there are very few people who actually try to understand why, and where it comes from, beyond the usual "there are no updates, it's more complicated". So just to start my talk, I would like to do a quick show of hands in this room. Who considers themselves a safety person? Oh, quite a few. And people who are more like security people? A bit more, I would say one third to one third. And who's more or less doing both? I end up doing both, so there are quite a few people like that. That's good, because it means we'll have people from both sides. I'm going to try to think with the other person's hat on. So, let's start with safety. Safety people are kind of brainwashed in the way they think; they're extremists. I'll try to explain how and why and where it comes from, for the security people who are not used to them. The first thing with safety is that you want your system to always work, with a very, very strict definition of "always". So having correct software is not enough.
You need to prove that your software is correct. I have no idea how you can prove that machine learning is correct; that's going to be very interesting in the coming years. You usually need to prove that your hardware is correct: on safety-critical systems that means you cannot use caches, because they might change timings and you don't really know how they work. And you need to prove that your tools are correct, compilers included. When you do safety-critical code, there are people who are paid full time to proofread the assembly against the source code and just compare, to make sure that the compiler generates correct code, because you cannot trust your compiler. And then, because you need to prove things correct, you need to simplify things. So you have those crazy general rules you need to follow when you're writing safety-critical code, like no dynamic memory allocation, or the softer version, which is no dynamic memory allocation after initialization. Anyway, you want to make sure that you do not have use-after-free: so no free, no double free, no memory leaks, no memory allocation. It simplifies things a lot, and you need that, because you need to prove it. A consequence is that when your product is out and you discover a bug in it, the first thing safety people will do is try to find out whether the bug has any consequence at all, because changing the code means certifying the code again, and that can be extremely expensive. So if you can prove that the bug has no consequence, that's way better. Also, when you're doing safety-critical work, any change is a safety change and needs to be evaluated. I heard an excellent anecdote recently from people working on hardware that was going into trains.
So they had these little rack-mounted units that they had to put in, and those units had connectors on the front side, with a little cap you could screw on, and the cap was linked to the unit with a little chain. At some point they had to change the supplier for those little chains, and they had to recertify, because they had changed the hardware. Don't ask questions, just recertify the whole thing. So safety people are completely paranoid, but that's why planes work: because they check everything. Security people, though, are brainwashed the other way. The big difference is that security is not just about things working correctly, or all the time; it's about making sure the system cannot be used for anything other than its original purpose. So everything is an attack vector. I do a little security, but I'm not a security person, and when you read how Spectre and Meltdown work, you go "that's impossible, nobody would think of that, and nobody could really exploit it, could they?" Oh, they do. The other thing is that any little hole is potentially a leapfrog to a bigger hole, and a bigger hole, and a bigger hole. So you need to do everything, check everything, fix everything. And then, a big difference: security is a race. You have to find the weakness before the people who are going to exploit it. You have to fix it as fast as possible, even if you temporarily have to break another part of the code; that's okay, because leaving a hole is dangerous. And then you have to deploy, and you have to deploy fast, because once you start deploying, you basically publish the bug, so people will start trying to exploit it. So speed is of the essence, and the whole world is out to get you: security people are completely paranoid too. But the thing is, attacks are a real thing, Spectre and Meltdown have been exploited in the wild, and there's the whole security culture, that way of doing things, the mantra of "upgrade, upgrade, upgrade, upgrade".
It works: it does reduce the security threats, so it makes sense. So now that we've explained all that, what happens when we have to put both side by side? You have the safety people, who will tell you your code must be proven and certified, and that's long; if you have to proofread every line of assembler, it takes years. But on the other hand the security people want to go fast, and they have very good reasons to want to go fast. Safety works because you know exactly what your software has to do, so you have very well-defined constraints and operating ranges, whereas security starts by looking outside of the specified ranges, because that's where the attacks come from. You must protect against hostile behavior, not just accidents, not just bugs; you must protect against people who are actually out to get you. So it's a very different way of thinking about bugs. One of the big differences, one of those with the most impact, is how bug counts go down. Safety people consider that the more time passes, the fewer bugs are left. So long testing works: long testing will reduce the number of bugs, and as long as you do not introduce new code, the number of bugs left will go down. That's not true with security, because threat models evolve, and something that was safe might not be safe anymore; code which was state of the art a couple of years ago might be dangerous nowadays, because we have a new way to exploit it. So again, this consideration of time makes a huge difference. And then you have the idea of known bugs. In safety, you will check that the bug has no consequence, and if your bug has no consequence, you will actively ignore it. I mean, there is a known bug on Airbus planes which means you have to shut down and restart your plane every 140 days. This bug will probably never be fixed, because they have a fix: you just turn the plane off and back on, and that's good enough. And they don't want
to change the software, because changing the software in any way will probably be way more dangerous than adding a line to the manual and getting your pilots to do it, because pilots are very good at following instructions. As a consequence, safety people will upgrade only as a last resort. Safety people do not care about new features; you never need new features, because your product never changes. And the idea of fixing bugs just because they're there is totally impossible: it's way too dangerous. On the other hand, this morning we had the keynote by Greg Kroah-Hartman, who told us: every bug you fix is a security fix, even if you don't know it. So you should take in every possible change and always run the newest version, and that makes sense too. That's the big problem: they're totally opposite, but both aspects make sense. Any change is a risk and needs to be justified; and any bug is a potential security weakness and needs to be fixed. How do we solve that? And then you have this cultural problem: safety trumps everything, and security trumps everything. When your safety engineer tells you no, your product won't go out; and when the security engineers tell you you need to fix that bug, you need to fix that bug today. And when you discuss with product managers who have products in the making, about to go out, they don't know how to deal with this. They're torn; they're literally torn apart between those two completely different considerations. And how do you solve that? In the embedded world, historically, safety tends to win, which is one of the big reasons why you don't get that many upgrades in industrial products. Again, consumer products are a bit different, but with industrial products it's: okay, so you want to do a security fix? I need six months of testing. Do we do it now, or do we wait for tomorrow, when you'll have three more fixes?
Yes, but that will reset our clock. And we cannot wait six months, because in six months there will be so many new bugs around. What do we do? So that was safety versus security, and now I need to do a side note about embedded systems and upgrading embedded systems. I kind of assume everybody here is in the embedded world, but still, it's important to point out that upgrading an embedded product is very different from upgrading any other product. The first thing is that we need upgrade systems that are extremely robust, way more so than in the data center. Why? Because if an upgrade fails, the product is a brick. Sometimes you have no access to the product: it might just be in your consumers' hands, so you don't know where it is. It might not have network access, and in particular, if an upgrade has failed, it might not have network access anymore. Some products you just cannot access: we once had a product that was literally poured into concrete. So no, you cannot access the hardware. You need to handle all the corner cases: when the life of your product is very long, you have to deal with stuff that only happens once in a decade. Bad blocks, bad blocks on disk, that's a problem: you cannot change the hard disk, you cannot just throw away the old disk and put in a new one, not with embedded products. Conflicting configuration files.
That's a big problem. If you have, let's call it a naive Debian-based system, when you upgrade packages you might have conflicting files at upgrade time. I had that once with an /etc/fstab on a product, and it broke the product. So even if you have per-package upgrades, you need a way to get back to a completely well-known state; it might not be usable, but it is safe, and it is able to upgrade the product back into a good state. When you upgrade, you want to keep the user configuration, which means your new system has to be able to read the old configuration, with all the problems of upgrading the configuration, hitting a problem, and having to downgrade the configuration to get back to the previous version, stuff like that. And last, you need to be able to upgrade everything. Maybe not the first-stage bootloader, but about everything else, which means you need to upgrade your kernel. Which means containers are not a complete solution, because containers do not upgrade the kernel: they share the kernel, and the kernel is always on the host. So even if you make your whole system container-based and just upgrade container by container, which is easier because a container is more or less a single file, you still need a way to upgrade your kernel. And kernel upgrades can fail, and when an upgrade fails, your system has to have a way to get back on its feet on its own. So the whole upgrade problem is complicated. Some systems can stop; fortunately, most embedded systems don't have that sort of constraint, but some systems have safety-critical or vital features that you just cannot stop during an upgrade. So they have to have, usually, a secondary microcontroller which will keep the breathing system working while you upgrade the main system. And if the main system's upgrade fails, the breathing system has to work on its own until someone goes out and replaces the whole product or
something. But that means that even rebooting an embedded system can be very hard. You can't phase out hardware; that's also a big problem. Once you sell a product, how long must you maintain it? As long as you have a consumer using it? How long is that? It can be five years for some products, 20 years for others. So if you upgrade to a new version of your software, it needs to be backward compatible with the first version of your hardware. If your product works for 20 years and we're in 2019, it means you would still have to support hardware from the year 2000. It's getting complicated, because that would be a 386, and 386 support in Linux is being phased out. How do you deal with that? Deployment time is hard. Deployment time can be controlled by the user: when will your user turn your product on? Who knows. So once you have the upgrade available on your servers, how long will it take for every instance of your product to be updated? I don't know. I had some customers I was discussing this with, and they told me: yeah, any upgrade is six months to certify and six months to deploy. That's kind of long. And very long-term support means you can't trust anybody. You can't trust your subcontractors: there are lots of companies that provide the base system for embedded products and will deal with the upgrades for you, but will they survive as long as your product? You can't trust your technologies: that's what I said about 386 on Linux; how long will they support your hardware? It's hard. And your whole team will change.
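Stepping back to the robust-upgrade requirement: the "fall back to a completely well-known state" behavior described above is commonly implemented as an A/B (dual-slot) update scheme, where the bootloader only promotes a new image once it has proven itself. The following is a minimal sketch in Python; the slot names, the boot-attempt counter, and the MAX_BOOT_ATTEMPTS threshold are illustrative assumptions, not any specific product's update mechanism.

```python
# Minimal sketch of an A/B (dual-slot) update state machine.
# All names here (slots, counters, thresholds) are illustrative.

MAX_BOOT_ATTEMPTS = 3  # give the new slot a few tries before rolling back

class ABUpdater:
    def __init__(self):
        self.active_slot = "A"            # slot we normally boot from
        self.slots = {"A": "v1", "B": None}
        self.boot_attempts = 0
        self.pending = None               # new slot awaiting confirmation

    def install(self, image):
        # Always write the *inactive* slot, so a failed or interrupted
        # write never touches the running system.
        target = "B" if self.active_slot == "A" else "A"
        self.slots[target] = image
        self.pending = target
        self.boot_attempts = 0

    def boot(self):
        # Called by the bootloader on every startup.
        if self.pending is not None:
            self.boot_attempts += 1
            if self.boot_attempts > MAX_BOOT_ATTEMPTS:
                # New image never confirmed: roll back on our own,
                # since nobody may be able to reach the device.
                self.pending = None
                return self.active_slot
            return self.pending
        return self.active_slot

    def mark_good(self):
        # Called by the new system once it is up and healthy.
        if self.pending is not None:
            self.active_slot = self.pending
            self.pending = None
```

The key property is that the rollback decision is taken by the device itself, with no network or operator involvement, which matters precisely because a failed upgrade may cut off remote access.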
I'm not putting it as "you can't trust your engineers to survive", but people retire, people change jobs, or people simply don't want to spend 20 years on the same project. So your whole team will change during the life of your product, and you need to deal with that, and it's hard. So that was about upgrades. Security in the embedded world is also pretty different from the security you find in the literature, which is mainly based on data center problems, because our products are out in all sorts of places, so physical access can't be restricted. It always depends on the product, but lots of products are just out in the street. So you need secure boot: you need to make sure that only signed code can be booted, because people will go around, open the hardware, steal the hardware, and look inside. That makes things very complicated from a product management point of view, because each product has to have a unique key: if someone hacks into one product, and every other product has the same keys, they have corrupted the whole line of products. So you can't do that. There are multiple ways of dealing with this, but you might need a unique image per product, which means rebuilding your image instead of simply flashing it, and that makes for very complicated factory processes that you have to deal with. Even more complicated is the problem of returning to a trusted state. You have your product in the wild, and for various reasons someone attacks your product, someone takes it over. How do you get it back? In a data center, you just stop the machine, reformat it, reinstall the software, and restart it. In the embedded world, what part of the software stack can still be trusted?
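The "unique key per product" constraint mentioned above is often handled by deriving per-device keys from a factory master secret, so the factory can recompute any unit's key while one compromised unit reveals nothing about the others. Here is a sketch in Python; the master key, the serial-number derivation, and the HMAC-based signing are all illustrative assumptions (real secure boot uses asymmetric signatures verified against a key burned into the SoC; HMAC is used here only to keep the example self-contained).

```python
# Sketch: per-device key derivation, so leaking one device's key does not
# expose the whole product line. The master key and derivation scheme are
# illustrative assumptions, not a real product's provisioning system.
import hmac, hashlib

MASTER_KEY = b"factory-master-key-kept-in-hsm"  # never leaves the factory

def device_key(serial: str) -> bytes:
    # Each unit gets a key derived from its serial number; the factory
    # can recompute it, but one device's key tells you nothing about
    # another device's key.
    return hmac.new(MASTER_KEY, serial.encode(), hashlib.sha256).digest()

def sign_image(serial: str, image: bytes) -> bytes:
    # Per-device image signature: the image must be re-signed per unit,
    # which is exactly the factory-process complication mentioned above.
    return hmac.new(device_key(serial), image, hashlib.sha256).digest()

def verify(serial: str, image: bytes, sig: bytes) -> bool:
    # Constant-time comparison to avoid leaking the signature byte by byte.
    return hmac.compare_digest(sign_image(serial, image), sig)
```

This is why "just flash the same image everywhere" stops working: the signing step has to happen once per unit, on the factory line.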
Getting back to a trusted state is hard. The first idea would be: well, I have a very rich bootloader, like U-Boot, which can do absolutely everything; U-Boot is able to reinstall a Linux from a known image somewhere, you could totally do that. Yes, but bootloader attacks are a thing, so you need to protect your U-Boot, and if someone takes over Linux, they can basically write wherever they want in memory, so they can attack your U-Boot. JTAG attacks are a thing too. So the other idea would be: let's plug in some hardware and have some way to upgrade via JTAG, for instance. But if you have a JTAG port exposed, people will use it. So on most embedded processors you can actually use hardware or software switches to deactivate JTAG entirely, once and for all. You can do that, but in that case you cannot get your product back by reflashing via JTAG. So the only way which is kind of safe is having your first stage of boot, and your known-good image, in a ROM, so you can reinstall from ROM; and you hope that nobody has an attack that corrupts the image again once it's reinstalled, or that you have a way to make it good again. But ROMs are expensive, and in the embedded world anything that ends up on the bill of materials will be scrutinized. It might be only a few more dollars on your board, but a few more dollars can be a huge difference. And you will also be working against the culture, because, again, there is no upgrade culture in the embedded world. That's mainly because most embedded products are products that have been around for a very long time, but the idea of having a computer inside is new.
I mean, take air conditioners: we've had air conditioners for years, and right now every air conditioner manufacturer is considering adding a small Raspberry Pi-type board inside, so you can control it with your phone and all that kind of stuff. That means that those people, who are mechanics, suddenly have to handle all the problems of dealing with a distribution, with software upgrades, with the lifetimes of the different components they have inside. When it's a building-sized air conditioner, it will stay in the building for 30 or 50 years, and they have no idea how to do it. It's not their culture; it's not their job. And on the other hand, we have all sorts of startups with great ideas for embedded products, but they don't think long term. And that makes sense: when you're a startup, you don't prepare for the next 30 years, because in a way that's a waste of money; you don't know if you'll survive that long, it's expensive, and it has no real point. Both sides mean that, from the start, it's hard to have something ready for upgrades, hard to have people thinking in terms of upgrades. So at some point, when you're developing your embedded product and thinking about upgrades and how to deal with them, you have this hard choice: bricked or pwned. Do I put security first, which means that at some point the product will be lost because of an attack? Or do I leave some sort of door open for a malevolent attacker, which means they will take over, but at least the product has maybe a small chance of still working, at least partially? And you know, when I go to my customers and ask them that question, there is a big blank, because nobody wants to think about that. And those are real problems in the embedded world: you cannot get your machine back once it's corrupted. So then you have security updates.
So first When you go to big industrial people like that and you ask them how often do you think you need to upgrade their products that they will do Oh, it's pretty easy. We monitor our cvs Let's not go back to this morning's discussion But we monitor cve and we look over every patch and when there is one which affects our products Then we backport and we do a new release. Okay. How do you think this will happen? Oh once a year maybe That's not so as needed as in when there is a vulnerability just does not send the as a test of real racism. So looking around at big embedded Makers, how often do they upgrade android monthly security update windows monthly security Linux it depends a lot on of your distribution But usually it's more or less a rolling release type of thing basically which upgrades on itself iOS It's more or less as needed, but it turns around a monthly upgrade. So Best practices when you look around seems to be you need to upgrade your product monthly and that's a Good thing except recertifying is long when it takes six months. How do I do it? Yes, but if it takes six months my vulnerability window is huge It's humongous. I mean it's it's free. It's a free party for people who want to attack and against once someone has corrupted the product You cannot get it back so when we look at things from both sides both sides have very Good and very strict processes that are justified by years of good practice Both sides have these very strong arguments that their way of doing things Works But they works only because they are very strictly followed. You can't do half security half safety. You will get none And they work they're effective at what they are meant to do So completely opposite. We have speak critical versus confidence critical proactive versus reactive and preventive versus proven Okay, so just they're completely pulling in different directions. So I Have no magic for you. 
I don't know how to do both, and I don't think it's possible to do both the way they currently work. So, to finish, we'll have a look at ways we can mitigate the problem: how to make safety faster, how to get security to work better with safety-critical software, that sort of thing. We can avoid the problem entirely: not all products are safety-critical, but, let's say, any connected product needs to care about security. So, is your product really safety-critical, or is that some sort of culture left over from previous products that makes no sense nowadays? That works both ways: if you're a security person, will my product really be attacked by an advanced persistent threat? Probably not. So maybe you don't need every security level you could put in; at some point there is a balance to strike, and you have to find out what level of security you want, because a high level of security is proportionally way more expensive in the embedded world, because of the way it works and the risks we have. But whatever your choice, you will need a robust upgrade system, because bricking is the big problem with upgrade systems. A good approach, though not an easy one, is to make recertification go faster. That's a cultural problem: the certification world works really well, it has some very well-tested processes that have been around for decades, and nobody wants to touch them because they work. So stuff like automated testing is minimal. There are probably ways to accelerate certification by automating more and by making sure that automated testing is acceptable to the certification authority. And that's not a luxury, and it's not about saving cost, because usually in safety-critical applications they factor in the cost of certification.
It's about being able to certify faster, for security reasons. You could also (and that's something you need to discuss with your safety officer, your safety engineer) have some sort of fast-pass recertification for security problems in network-facing areas. Not the whole product: some parts of the product will be safety first, other parts will be security first, and maybe you can have different upgrade paths, as far as recertification goes, for your security-critical parts. And minimize the safety-critical perimeter: that's something safety people tend to do anyway, because of cost, but again, it's a good thing; try to reduce how much is safety-critical. And then you have everything you can use to completely separate safety and security. The three most obvious ways of doing it are containers, hypervisors, and good old hardware separation, and we see all three of them in the industry. Containers are interesting, but they have the problem of the kernel, which is shared and thus safety-critical, so you can't upgrade your kernel. Still, it's pretty useful, because it means the network-facing parts can be in a container.
So your firewall, for instance, you could upgrade separately, but you still have some parts that will stay safety-critical. But it's cheap, and containers are now a well-known technology. Hypervisors get you one level lower: basically you have your safety-critical minimal OS on one side and a Linux which handles the network on the other side; that's mainly how it works. That's pretty good, because it means you can upgrade your whole Linux, kernel included, usually. But the hypervisor is now a safety-critical part. It's easier, because you will hopefully have few security problems in the hypervisor, so you've reduced the problem, but you might still have some. And you have to prove your hypervisor, which is hard. And then hardware separation, which basically means you have two cores, or a CPU with an MCU next to it. There are many ways of doing it, but basically you completely separate the hardware, and that's the best solution. It has another problem, though: it's expensive, and as I said, on mass-produced products every dollar counts. The reason people are considering hypervisors and containers in the embedded space is mainly cost: if you had infinite money, you would just throw more hardware at the problem and be done, but that's not how it works. And then the big part is to plan for security updates.
That's the cultural answer. The problem we have nowadays is not that we don't have security update mechanisms on embedded products; all embedded projects have some sort of security update mechanism. They're not all shipped yet, but that problem is solved: people want update systems. Most of my customers have update systems, but they don't have a team for maintenance. So you need to plan that beforehand: you have to have a schedule, a maintenance plan for the product. And because you need to control cost, you need a documented end of life for your product. At some point you will have to kill your product, and if you surprise your customers, you'll get a bad reputation for it. If you plan it from the start of the product, and you tell your customers about it from the start, you will just be following your plan, and it's a public plan, so it won't give you a bad reputation. And there we have it. That's basically what I see from the reactions of the people I talk to about security versus safety: how and why people are afraid of upgrading, and why, in the embedded world, we tend to have very, very old software. It's not just a problem of doing the upgrade or doing the maintenance; it's also a fear of new features, and that's something we need to understand in order to be able to talk about it. Thank you. Any questions?