Hi everyone. Today I'd like to welcome you to the Car Hacking Village at DEF CON 2020 Safe Mode. Today's presentation is going to be a throwback to the times before J1939. If you don't know what that is, it's a protocol that runs on top of CAN and is primarily used for component communications in heavy vehicles. We will be talking about how we came to create a decoder and parser for J1708 and J1587.
Here's a quick breakdown of how this presentation is going to pan out. First, we'll let you know who we are and how this project came about. Then we'll give a brief recap of the protocols that interest us, and you'll see how we came across them. Afterward, we'll talk about the decoder and parser and give you a quick demo. Last, we'll conclude with the info you'll need to pick up where we are leaving off.
Okay, so my name is Daniel Salloum. I'm a reverse engineer at Assured Information Security. We're based out of Rome, New York. I've been with them for a little over a year and a half at this point. I spend my days analyzing the security of things, digging into protocols, and developing anything needed to support these efforts. In the images, I'm on the right-hand side, and Thomas is on the left-hand side. So now you have proof that we actually work.
Yeah, hi. So my name is Thomas Hayes. I'm a hardware engineer at Bendix Commercial Vehicle Systems, based out of Elyria, Ohio. I spend most of my days designing hardware, so PCBs and digital and analog interfaces for brake control systems. I've been there for almost five years now.
Okay, so what are we doing here? Well, we discovered that there is more than just J1939 traffic crossing the networks in heavy trucks, and within this traffic there's some very interesting data. So we're here to show you how the decoder came about, since we needed a simple way to see that data.
And I can imagine you want to see how the decoder works as well.
So is there a backstory? We know trucks and have been playing with them for a while. This project started in mid-2019, when we began some exploratory testing on heavy vehicle systems. We noticed multiple networks and the interconnection between all of the modules. We started to explore the tools and decoders that are already available to people in industry. The legacy protocols J1708 and J1587 are still being sent across the bus, even though CAN with J1939 is now much more prevalent in newer vehicles. It looked very interesting, since it is an important bus on the vehicle and carries a lot of useful information that we wanted to look at.
Okay, so what do we need in order to get things rolling? First is an understanding of the protocols involved. In this case, that's J1708 and J1587, which work together. We also need a hassle-free way of analyzing the traffic that we see. At that point we can decide what may or may not be done with this data.
What are J1708 and J1587? They are SAE (Society of Automotive Engineers) standards for medium- and heavy-duty vehicles built after 1985. J1708 describes the physical layer and data link layer and defines a bidirectional serial vehicle network. It defines how the messages are structured and how they're distributed over the network. J1587 makes up the transport and application layer and describes message formats and parameters, so it's almost a big dictionary of all the possible messages that can be sent over this communication protocol. J1708 is almost always used in conjunction with the application layer protocol J1587. It's based on an RS-485 bus. We can have a max of 20 nodes on a vehicle, and it runs at 9,600 baud. A message contains a one-byte MID, a message identifier, followed by some data related to that MID, and a one-byte checksum.
The message can be up to 21 bytes long. However, if the vehicle is stopped and no movement is detected, messages can be longer if required.
So, the MIDs: 0 to 127 are defined by J1708, and SAE assigns these to important vehicle systems. Then 128 to 255 are defined by J1587, and these are used more for maintenance and information on the vehicle, more from a diagnostic side. The MIDs also serve as an arbitration method for the communication, which is very cool. The lower your MID number, say 0 to 7 for the engine, the higher the priority of the message, so it will be passed before, for example, a trip recorder message with an MID of 56 to 61.
The checksum is quite simple. It's the two's complement of the sum of the message bytes. The data content is defined by the application document given by the supplier. So you can send the messages defined by SAE, or possibly something defined by you as a manufacturer if you need to send something specific or proprietary across the bus.
J1587 defines the PID, the parameter identifier. That's a bit of a complement to the MID. If you have an MID for the engine, a PID can relate to your engine oil, a knock sensor, or something in your camshaft. So the PID gives the system a little more information to recognize that there's a fault and to point to where that fault is. A message consists of one MID, one or more PIDs, and a checksum. The data length of J1587 messages is mostly limited to 21 bytes according to J1708. If a message needs to be longer than 21 bytes, we can use the COTS, the TP4, which can be used to segment and reassemble large messages if needed. The PIDs range from 0 to 4 bytes, and the units, resolution, and range are defined in the spec or by whoever is sending them. One engine manufacturer could say the PID for oil pressure is one PSI per bit, and another could say it's five.
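As a rough illustration of the framing just described (one MID byte, data, one checksum byte, and a sender-defined resolution per bit), here is a minimal Python 3 sketch. The function names and the sample bytes are made up for illustration; they are not taken from the decoder or from the SAE documents.

```python
def checksum(body: bytes) -> int:
    """Two's complement of the 8-bit sum of the MID and data bytes."""
    return (-sum(body)) & 0xFF


def is_valid(frame: bytes) -> bool:
    """A frame checks out when all bytes, checksum included, sum to 0 mod 256."""
    return len(frame) >= 2 and (sum(frame) & 0xFF) == 0


def split_frame(frame: bytes):
    """Return (mid, data) after verifying the trailing checksum byte."""
    if not is_valid(frame):
        raise ValueError("bad checksum or frame too short")
    return frame[0], frame[1:-1]


def scale(raw: int, resolution: float) -> float:
    """Convert a raw PID value to engineering units.

    The resolution (units per bit) comes from the spec or the vendor;
    the value used below is an assumption for the example.
    """
    return raw * resolution


# Hypothetical message body: MID 0x80, one PID byte 0x54, one data byte 0x1E.
body = bytes([0x80, 0x54, 0x1E])
frame = body + bytes([checksum(body)])

mid, data = split_frame(frame)
print(hex(mid), data.hex(), scale(data[1], 0.5))
```

Because the checksum is the two's complement of the sum, a receiver does not need to recompute it separately: adding every byte of the frame, checksum included, and checking for zero in the low eight bits is enough.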
So it just depends on the manufacturer for that PID.
There are also SIDs. These are related to the MID, but they identify a section that is not necessarily covered by a PID but is covered under a diagnostic MID. The MIDs that use SIDs begin at 128, as defined by J1587. These are really used for something that can be troubleshot or replaced by a field technician if needed.
So what's in the data? A lot of our testing was on the communication between trucks and trailers. The truck has an ABS ECU and the trailer also has an ABS ECU, and those two talk to each other using this protocol. The trailer ABS can have all kinds of diagnostic messages. A chuff test is a test of the system: it modulates the air solenoids a little and dispenses a bit of air. It's like when you hear a truck brake and there's a large air release sound, and it's just making sure that your system is working properly. You can restart the ECU, check the status of the ECU, and most importantly demonstrate an ABS fault. Per the SAE standard, if the trailer has an ABS issue, the trailer ECU has to communicate it to the truck ECU and illuminate a light on the dash, as well as a light on the trailer itself, indicating that there is an issue with the ABS and it needs to be resolved.
Diving a little deeper, there are also configuration options on the ECU. There's wheel size, which relates to the speed of the vehicle, as there's a tone ring that counts the number of rotations. There's driver info, so where the truck is, where the trailer is going, and what truck it is connected to at the lot. And there's sensor data, so speed, mileage, the odometer stored on the ABS, trailer ABS location, temperature, and many proprietary vendor messages.
So who uses J1708 and J1587?
Any kind of heavy vehicle, as we all know, but also school buses, which are on that same platform, military vehicles, and we even found that yachts use the same communication platform.
For our use case, we were observing the exchange between a tractor and a trailer. As we said, the brake controllers communicate over J1708 and J1587, and these messages actually pass over a power line, which is very cool. Trailer manufacturers did not want to add the extra wire or two needed for ABS communication, so they superimpose a sinusoidal wave on the power line going from the trailer, which then gets filtered out on the trailer side and on the truck side, depending on whether it's using that communication. But Dan and I did not focus our time on this. We were working on the J1708 and J1587 protocols.
Okay. So the question comes up now: how do we look at these messages? There are a couple of options. We can use tools that were developed for J1939 and are backward compatible with the other two protocols we mentioned. Or we can get our hands on an RS-485 transceiver, some bubblegum, a paperclip, maybe a tinfoil hat, and see what we can hack up. But what happens when our bus is a power line? In that case, you need a specialty diagnostic tool. Some of the larger companies, like Nexiq, Haldex, and VG Tech, provide these. The problem with this approach is that it's going to cost you some moola. And flow between programs is not really there, as the software is tied to the hardware and it's all proprietary, which means minimal flexibility.
So our solution is pretty_j1587. This is a decoder that helps with at least part of the problem. By default it takes in hexadecimal bytes that are comma delimited, but really it can take in a stream of any bytes, and a user can define how to manipulate that data so it is parsed correctly by the decoder. It also supports reading from standard input, files, or sockets, UDP or TCP.
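To make the input handling concrete: whatever the source looks like, what a decoder ultimately needs is a list of byte values. The Python 3 sketch below shows two such conversions, one for comma-delimited hex and one for an undelimited hex string. Both function names are hypothetical and are not the ones used in the repository.

```python
def from_comma_hex(line: str) -> list:
    """Turn comma-delimited hex such as '80,54,1e,0e' into a list of ints."""
    return [int(token, 16) for token in line.strip().split(",") if token.strip()]


def from_packed_hex(line: str) -> list:
    """Turn an undelimited hex string such as '80541e0e' into a list of ints.

    bytes.fromhex raises ValueError on an odd number of hex digits,
    which is a reasonable signal that the capture is truncated.
    """
    return list(bytes.fromhex(line.strip()))


print(from_comma_hex("80,54,1e,0e"))
print(from_packed_hex("80541e0e"))
```

Keeping the conversion separate from the decoding is what lets the same decoder sit behind a file, a pipe, or a socket: only the small function in front of it changes.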
What it does is take these messages and break them down into a human-tolerable format so they can be easily analyzed by an end user. To start off, the parser reads the SAE documentation PDFs to set up a primary database, which is used for interpreting the incoming data. This is needed because there's no simple database holding the data we need in a convenient format. It's all in the specifications or proprietary to vendors. So there is an option for an end user to create custom messages with this tool, which is handy for working with proprietary data. The benefit of using this tool is that it can be piped from or to other programs, as it's just a command line utility.
There are some requirements. What do we need to get this off the ground? The first part is on you: you're going to need an interface to some bus to get the traffic. Next, it uses the program pdftotext, which is contained in the popular poppler-utils package, standard on most Linux distributions. We use this tool to convert the SAE PDF specifications into formatted text files. Then you need Python 2. We are hoping to port this to Python 3, but it started as a spaghetti-code project that we just needed up and running. Python 2 was the default here, and that's what we stuck with until now, when we realized it's actually useful. There are still some minor issues with it. The spec obviously was not written with parsing in mind, so there are a lot of regular expressions and other things that parse this information out of these files.
Okay, so at this point I'm going to give you a quick demo of how this thing works. It's a repository, which will be made public after this talk, and you can just clone it.
At that point, you set up the configuration file to point at the text files you've created by running pdftotext on the SAE J1587 and J1708 specifications. The first thing I'll mention is that within this repository we have a tool called fuzzy messages, which I'll be using for the demo. This program just creates random bytes, I believe 1 to 19 bytes in length. This works out as a nice test environment for building a robust program, so that it's not just failing on any sort of message. As you might see when running with different types of hardware, you'll get funky bytes that aren't actually part of the message at times. So we decided to make this program robust enough that it won't just error out on an incorrect message. It might print a warning to standard out, and it'll keep chugging.
Okay, so the first thing I'm going to do is create a file with, let's say, 100 messages, and write that to temp. Let me just make sure that worked. Yep, seems like I got messages in there.
All right, so now to run the tool, I just point it at the file I created with all the messages. I'm also going to give it a flag to print the delimiters between the messages. You can see that when there are errors, they get printed to standard error, so these could be piped to /dev/null if you're not interested. You can also see here the original message and the MID in hexadecimal, accompanied by its decimal form, and then the PIDs and the meanings of the PIDs along with the accompanying bytes. So that's pretty good. But if you want more information, you can give the program a bit more verbosity, like so, and you'll get details. This is the same message that we saw previously, but now it's giving details about the individual bytes related to a PID. Here we can see there's one byte, and it has to do with the extended range barometric pressure.
And here are the units, right here. I just want to mention one thing: we are using random bytes for this data. So this MID says idle adjust system, and in a real-world scenario this PID is probably not going to be matched with this MID. But we don't error out. We just want to parse the data as it comes.
There are a couple of other options I wanted to mention. Some people prefer JSON, so we added a flag for JSON. I will disable the output that you just saw, which can be easily grepped, and print these messages in JSON format. I'm also going to pipe standard out to /dev/null. Okay. So you can see in here the same messages, but output as JSON. This could be convenient if you use the jq tool and want to filter these messages however you wish. But there's another option for you: we have a whitelist. If we are interested in certain PIDs and only want to print the messages containing them, we use the whitelist option. So I will just grab one from here and let it roll. You can see I printed out one message with that PID, 48, and that is actually in decimal form.
Okay, so there are a couple of other things this can do. If you just give it -h for some help, one of the options is for a custom database. This means that if you want to define your own messages and overwrite those provided by the spec, or, let's say, you're working with some proprietary messages and you have a good idea what they mean, you can define that. We provide a sample file in the repo, and you can see here that this is basically the format. It's just JSON, and you just give it your definitions. You can use this file as a guide.
Okay, there are still some issues. There are a few PIDs that are a little trickier to work with, and those might not have their individual byte definitions covered, but it's a work in progress.
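The whitelist and custom database ideas from the demo can be sketched in a few lines of Python 3. This is only an illustration under stated assumptions: the message shape (a dict with "mid" and "pids" keys), the function names, and the placeholder definitions are all hypothetical and do not reflect the tool's actual data format or its JSON schema.

```python
# Placeholder definitions; real names would come from the parsed SAE text files.
SPEC_PIDS = {48: "spec definition for PID 48", 84: "spec definition for PID 84"}

# User-supplied entries take priority over the spec, e.g. for proprietary data.
CUSTOM_PIDS = {84: "vendor-specific meaning for PID 84"}


def pid_name(pid, spec=SPEC_PIDS, custom=CUSTOM_PIDS):
    """Look up a PID, letting custom definitions override the spec."""
    if pid in custom:
        return custom[pid]
    return spec.get(pid, "unknown PID %d" % pid)


def whitelist(messages, wanted):
    """Keep only messages that contain at least one of the wanted PIDs."""
    wanted = set(wanted)
    return [m for m in messages if wanted & set(m["pids"])]


messages = [
    {"mid": 128, "pids": [84, 96]},
    {"mid": 136, "pids": [48]},
    {"mid": 140, "pids": [100]},
]

for message in whitelist(messages, [48]):
    print(message["mid"], [pid_name(p) for p in message["pids"]])
```

Checking the custom table first means a proprietary definition never needs the spec-derived database to be edited, which matches the idea of overwriting entries rather than replacing the whole database.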
I wanted to mention that if you want to read from standard input, you just pipe comma-delimited hexadecimal bytes, as I mentioned, to this, and you use this usual format for reading from standard input. I'll use packet delimiters here, and you can see what the meaning is. But one of my colleagues came to me and said, well, we have an issue, because the output I'm getting from my program is not set up like that. It doesn't have the comma-delimited values. It's all just a string of hexadecimal bytes. So we added an option here. If you look in this file called canon functions, an end user can define a function, to be called from the command line, that will manipulate this format. The only requirement is that the output be a Python list of bytes, or rather a Python list of integers. So in this case, we can use the -j option and say canon no delims, I believe I called it, and you can see that it will just parse the message in that format. So it doesn't really matter. All you need to do is know what your input looks like and figure out how to get that into a Python list of integers. Let me see if there's anything else I wanted to mention. I think that's it for now. I'll let you play with the rest of those options as you see fit.
So getting back, what are the benefits of using this decoder? Well, there's no real software investment other than getting your hands on the SAE specs. I'm sorry, we can't just provide them for free. SAE would not be happy with us, and they take intellectual property very seriously. So I'm sorry about that. You can handle it, I'm sure. The other benefit is that you can work with anything that outputs J1708 and J1587 messages.
So where can you get it? If you want to get your hands on this tool, you'll go to this URL. After the talk, we will make this a public repo. We're rolling with an MIT license.
So don't forget where it came from, and feel free to make money on it. But remember that there are a few dependencies: you need the specs, the pdftotext tool, and Python 2 at the moment.
We wanted to give a special thank you to NMFTA for giving us the opportunity to do this work, and to our friendly motor freight carriers for allowing us to play with their trucks. Also to the Hilton executive lounge for hosting a nice hackathon atmosphere by letting this group meet and set up mobile test benches. And we only left a few solder burns on their nice wood grain tables. We just want to thank you for joining us, and enjoy the rest of your DEF CON talks.