Okay, so can anyone hear me? Okay, cool. My name is Krzysztof Opasiak. For non-Polish speakers, and I think there are some, Christopher or Chris would be good enough. I work for Samsung R&D Institute Poland, where I do USB support in the Tizen operating system. It's based on the Linux kernel, so we share the same kernel as most Linux distributions. Today I'd like to tell you a few words about debugging the most common problems with USB. It's not going to be a very advanced talk. It's rather for beginners: some tips and tricks, what you can do to adjust some common drivers to work with your specific hardware, where to start debugging those drivers, and what the most common mistakes are.

In the beginning I would like to give you a brief introduction to the USB protocol. Then we will talk a little about the plug and play functionality: how drivers are chosen and what we can do with this. We will do some tips and tricks with drivers: how we can modify the kernel policy, what we can do to force a driver to bind to our device even if it was not designed to bind to this device, and how we can sniff what the driver is really saying to our device. In the end, a summary and the Q&A session.

So to clarify, this presentation is generally about USB: about USB device management, about sniffing USB traffic, about modifying driver policy. It's not going to be a talk about KGDB or tracepoints or any other kernel debugging techniques. So if you came here to listen about those, now is the time to change rooms to some other presentation; they are also cool. Okay, so everyone here is here to talk about USB. That's great.

So, USB basics. To talk about USB, we have to learn a few things about it. First of all, what is USB really about? Some people say that USB is more a network than a bus, because it's about pretty much the same thing as the internet. It's about providing a service.
Of course, it's a slightly different level of service, because we have storage, printing, internet, camera, or other services, but they are all provided by USB devices. In the internet world, we have the well-known client-server architecture. In the USB world, we have our USB host, which is the master on the bus and is the machine being used by the end user. And we have our USB devices, which provide additional functionality to the USB host. So a device is a rough equivalent of the server from the internet world. A single USB device may provide multiple functionalities. A single device may be connected to only one host at a time, but a single USB host may have multiple devices connected.

Okay, so let's go to some basic entities from the USB world. Endpoints. In the TCP world, or generally in the internet world, we have the abstraction of ports to identify the application which we want to communicate with. In the USB world, we have a rough equivalent of those ports, called endpoints. A single USB device may have up to 31 endpoints, including endpoint zero, which is the only one which is mandatory. Each endpoint, apart from endpoint zero, may transfer data in only one direction. It means that an endpoint is either IN or OUT. If an endpoint is IN, it transfers data from the device to the host. If an endpoint is OUT, it transfers data from the host to the device. So we always name the direction from the host's perspective. Endpoint zero is different because it may transfer data in both directions.

In the internet world, we generally have to choose between TCP and UDP, depending on our requirements. In the USB world, we have four different types of endpoints. The first of them is control. This is the type which is reserved for endpoint zero, so each device is able to speak using control transfers. It is the only one which is bidirectional.
It is used to discover USB device capabilities. The next one, bulk, is the most common one, and it is used to transfer large amounts of non-time-sensitive data. It means that you hand over a package of data and say, okay, send it when the bus has time to do this. The next two transfer types are called periodic because they reserve bandwidth on the bus. Interrupt is used to transfer small amounts of time-sensitive data. In this one, as in the previous two, delivery is guaranteed. In isochronous, which is used to transfer large amounts of time-sensitive data, there is no guarantee of delivery. So it's more or less the equivalent of UDP, and it is used to stream audio or video over USB.

How do endpoints fit into a device? Endpoints are grouped into interfaces. An interface is a group of endpoints which are used to implement some well-defined functionality. Usually we place, let's say, two bulk endpoints in it to have a bidirectional communication channel. Interfaces are grouped into configurations. A single USB configuration may have multiple interfaces, and all of them are available at the same time. A single USB device may have multiple configurations, but only one of them may be active at a time. The interfaces from a configuration may be used only if that configuration is active. Endpoint zero is the only one which is not grouped into any interface. It's always available.

How do we discover the capabilities of our device? We do this using data structures called USB descriptors. What are the most important descriptors? First of all, the device descriptor. The most important fields for us, from the perspective of choosing a driver, are the vendor information and the class information. idVendor is a number which you can get from usb.org. Of course, you have to pay for it. It's not free. When you get your idVendor, you get the whole pool of idProduct values, so you may release your products with a particular identification.
You may use this to choose a driver which is specific to your device model. But of course, this is not the exact model of the device. It identifies the model from the USB point of view. For example, if you connect your Samsung phone to the computer, usually you will be told that this is, let's say, a Galaxy S2. Because all Galaxies from S2 to S7 are backward compatible from the USB point of view, there is no reason to update the product ID, because your operating system is used to that ID and knows how to handle it.

Class information. The same class information, this triple, is repeated in the interface descriptor, because a single USB device may provide multiple functionalities. Those functionalities may be totally unrelated. So that's why in the device descriptor we can say, okay, for each interface use the information from the interface descriptor. What kinds of classes do we have? Generally, all the well-known stuff: mass storage, so a pen drive; audio class; application specific and vendor specific, which are the ones used when you do Arduino or other stuff where you develop your own USB functions. Human interface, of course: the mouse, keyboard, et cetera.

Okay. Then we have three strings: the strings which identify the manufacturer, the product, and the serial number. Of course, none of these strings is mandatory. You can place whatever you want there, and watch out for the serial numbers, because especially on cheap devices they tend to all be the same, like one, two, three, four, five. And of course there is the information about the number of available configurations.

In configurations, we have information about how much power the device needs to draw from the host to use this configuration, and the number of interfaces. Then we have our interfaces, with the class information and a string describing the interface functionality, and of course the endpoint descriptor, which defines the endpoint address and the type of this endpoint.
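As an illustration of what the endpoint descriptor carries, here is a minimal sketch of decoding the endpoint address and type. It assumes the bit layout of the standard endpoint descriptor from the USB 2.0 specification (field names bEndpointAddress and bmAttributes); the function name and return shape are made up for this example.

```python
# Bit layout assumed from the USB 2.0 standard endpoint descriptor.
ENDPOINT_NUMBER_MASK = 0x0F  # bits 0..3 of bEndpointAddress: endpoint number
ENDPOINT_DIR_IN = 0x80       # bit 7 of bEndpointAddress: 1 = IN (device to host)
TRANSFER_TYPE_MASK = 0x03    # bits 0..1 of bmAttributes: transfer type
TRANSFER_TYPES = ("control", "isochronous", "bulk", "interrupt")


def decode_endpoint(b_endpoint_address, bm_attributes):
    """Return (number, direction, transfer type) for one endpoint descriptor.

    decode_endpoint is a hypothetical helper, not a kernel or libusb API.
    """
    number = b_endpoint_address & ENDPOINT_NUMBER_MASK
    # Direction is always named from the host's perspective.
    direction = "in" if b_endpoint_address & ENDPOINT_DIR_IN else "out"
    transfer_type = TRANSFER_TYPES[bm_attributes & TRANSFER_TYPE_MASK]
    return number, direction, transfer_type


if __name__ == "__main__":
    # Example values as lsusb -v might show them for a bulk IN endpoint.
    print(decode_endpoint(0x81, 0x02))
```

The same two bytes are what lsusb -v prints for each endpoint, so the sketch can be checked by hand against real devices.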
Is the endpoint bulk, interrupt, or isochronous? And of course, there is the maximum packet size which can be sent to this endpoint.

Plug and play. This is one of the functionalities that users like the most in USB, because you may simply connect your device and it works out of the box. So how does it happen? First of all, we need to plug in the device. That's the mandatory step. Without this, it's not going to work. Then your host needs to detect the connection and set the address, because each device on a USB bus has a unique address assigned by the host at the beginning of the communication. Then you need to get information about the device. It means getting the descriptors. Then choose a configuration, and then choose drivers for interfaces. Yes: generally, Linux chooses drivers for interfaces, not for the device as a whole. Each interface may provide different functionality, so we need a different driver to provide each functionality to user space.

Valid addresses start at one and end at 127. Address zero is reserved for a new USB device which has just arrived in the system: before we set its address, address zero is used to communicate with it. Usually the assigned address is simply the next available one and nothing more. Device details means the descriptors; that information will later be available to you via sysfs. And which configuration should we choose? If the device has one configuration, there is no problem. If the device has multiple configurations, we choose the first one whose first interface class is different from vendor specific. The comment in the Linux kernel says that Linux is not the most popular operating system yet, so we are more likely to have a driver for some well-known standardized communication protocol than for some vendor-specific protocol. That's why by default we choose a configuration which provides well-known standardized functionality, belonging to some class like human interface or mass storage or that kind of stuff.
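The selection rule just described can be sketched in a few lines. This is a simplified illustration, not the kernel's actual code: the real logic also considers other things such as the power budget, and the fallback to the first configuration when every candidate is vendor specific is an assumption made here. The dictionary layout is hypothetical.

```python
USB_CLASS_VENDOR_SPEC = 0xFF  # bInterfaceClass value for vendor-specific interfaces


def choose_configuration(configurations):
    """Pick a bConfigurationValue using the simplified rule from the talk.

    configurations is a list of dicts with hypothetical keys:
      "value"             - the bConfigurationValue
      "interface_classes" - bInterfaceClass of each interface, in order
    Returns None if the device reports no configurations.
    """
    if not configurations:
        return None
    for config in configurations:
        classes = config["interface_classes"]
        # First configuration whose first interface is not vendor specific.
        if classes and classes[0] != USB_CLASS_VENDOR_SPEC:
            return config["value"]
    # Assumption: if everything is vendor specific, fall back to the first one.
    return configurations[0]["value"]


if __name__ == "__main__":
    # A device similar to the one in the demo: a standard network function
    # in configuration 1, a vendor-specific serial function in configuration 2.
    demo = [
        {"value": 1, "interface_classes": [0x02, 0x0A]},
        {"value": 2, "interface_classes": [0xFF]},
    ]
    print(choose_configuration(demo))
```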
When the configuration is chosen, all interfaces from that configuration become available to our system. So now we have to choose a driver for each interface. But what is this driver, really? First of all, it's a piece of kernel code described using struct usb_driver. It usually provides something to user space, because the user connected a device in order to use its functionality. We connect our pen drive to use the flash memory which is on it. So that's why we have to provide something to user space. Something means a block device, something means a network interface, all that kind of stuff which can be used in a standard way by our operating system. The driver itself is really an implementation of some communication protocol, because what a driver really does is just pack the generic calls from user space, like "send this packet to that network address", and send them over the USB bus to the USB device. Nothing more.

How do we choose our driver? In the standard Linux way. It means that the kernel maintains a list of registered drivers, and each driver declares a list of compatible device identities. Then the kernel goes through the list, and if it finds a match, it calls the probe function. What does it mean that a driver declares a list of known device IDs? Well, we have struct usb_device_id, and here are all the fields which can be used for matching against the device identity. Those values are similar to those from the USB descriptors, from the USB device descriptor for example: idVendor and idProduct. And there is one field here, match_flags, which is used to define which fields of the structure contain valid data and should be matched against. There is also an additional field which is used to declare what kind of quirks should be used for this particular device model. Generally we have generic drivers, but some models of devices require some special handling.
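The matching step can be sketched as follows. This is only an illustration of how match_flags selects which fields take part in the comparison; it covers three of the fields, the flag values are assumed to mirror the kernel's USB_DEVICE_ID_MATCH_* constants, and the vendor and product IDs in the example are made up.

```python
from dataclasses import dataclass

# Assumed to mirror USB_DEVICE_ID_MATCH_VENDOR / _PRODUCT / _INT_CLASS.
MATCH_VENDOR = 0x0001
MATCH_PRODUCT = 0x0002
MATCH_INT_CLASS = 0x0080


@dataclass
class DeviceId:
    """A cut-down stand-in for one entry of a driver's device ID table."""
    match_flags: int
    id_vendor: int = 0
    id_product: int = 0
    interface_class: int = 0
    driver_info: int = 0  # quirks for this particular device model


def id_matches(entry, id_vendor, id_product, interface_class):
    """Compare only the fields that match_flags marks as valid."""
    if entry.match_flags & MATCH_VENDOR and entry.id_vendor != id_vendor:
        return False
    if entry.match_flags & MATCH_PRODUCT and entry.id_product != id_product:
        return False
    if entry.match_flags & MATCH_INT_CLASS and entry.interface_class != interface_class:
        return False
    return True


def find_match(id_table, id_vendor, id_product, interface_class):
    """Return the first matching table entry, or None if the driver does not bind."""
    for entry in id_table:
        if id_matches(entry, id_vendor, id_product, interface_class):
            return entry
    return None


if __name__ == "__main__":
    # Hypothetical table: one exact device with a quirk, plus any mass storage interface.
    table = [
        DeviceId(MATCH_VENDOR | MATCH_PRODUCT, id_vendor=0x1234, id_product=0x5678, driver_info=1),
        DeviceId(MATCH_INT_CLASS, interface_class=0x08),
    ]
    print(find_match(table, 0x1234, 0x5678, 0xFF))
```

A driver whose table has no matching entry simply never gets its probe function called for that interface, which is the situation the rest of the talk works around.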
So drivers place information about the required special handling in this quirks field, and that's why the same driver may be used for, let's say, multiple devices which are quite similar but not really the same.

How does it look in the big picture? We have our USB device, the host controller, the driver for this host controller, the structure which represents the USB device as a whole, and the generic driver which simply creates our interfaces. We have interfaces, USB drivers bound to those interfaces, and something provided to user space which can be used by our generic user space entities, like X11, or our web browser if it's a network interface.

So everything happens automatically. Generally it's good, because it's exactly what users like. But what if we need to modify some parts of this policy? First of all, USB is now a security threat. There are a lot of attacks which do, for example, host fingerprinting: they check which drivers are available and then find vulnerabilities and try to exploit them. There is the BadUSB attack, in which your pen drive simply introduces itself as a keyboard and tries to execute some code. So generally, in secure systems we would like to limit the set of allowed USB devices, or maybe limit it to a list of functionalities, so that only particular interfaces are used. What can we do if the wrong configuration has been chosen? Our kernel chooses the configuration in which the class of the first interface is not vendor specific; what if you would like to change that? What can we do if a driver has not been found, or the wrong driver has been bound to our USB device?

First of all, all information about connected USB devices is available via sysfs. We go to /sys/bus/usb/devices, and what can we find there? We have three types of directories. The first is called usb followed by some number, usbX. That's the directory which corresponds to a host controller which is built into our machine.
The second type is the USB device. How are those names created? X is the number of the host controller to which this device is attached. Then we have the physical path to the port to which the USB device is connected. For example, this printer is connected to the third port of the root hub, the second port of the next hub, and the third port of the last hub. It means that its directory name would be X-3.2.3. Simple. Then we've got directories which correspond to the interfaces. They start with the USB device name, and then we've got the configuration number and the index of the interface inside this configuration.

How can we limit the set of allowed USB devices? We can use the kernel feature called USB device authorization. What does it mean? Each USB device has an authorized attribute. If this attribute is set to zero, then no configuration is chosen for this device. It means that the device is more or less unusable until we authorize it. Each host controller has an authorized_default attribute, which is the default value of the authorized attribute for each new device which arrives in the system. So this gives us time to execute, for example, lsusb and check what we have really connected, and based on that information we can decide whether we really want to use this device or not. All that stuff can be automated using the USBGuard project, which is a security project providing filtering for USB devices.

What if you would like to, let's say, refine this and allow only particular types of communication? For example, we want to allow only pen drives, but for many different USB devices. What if our device provides both pen drive and keyboard functionality, but we would like to use only the pen drive functionality? Then we can use interface authorization. It's pretty much the same as for USB devices, but there are some differences.
First of all, when you use interface authorization, a configuration for your USB device is chosen. It means that in the device hierarchy you will also see the USB interfaces, but those interfaces are not available to drivers, and not available to user space applications communicating via libusb. They are simply filtered: the kernel checks whether the interface is authorized or not. If you decide to authorize an interface, there is no automatic driver probing like there was in the device authorization case. It means that you have to manually trigger driver probing for the interfaces.

What can we do if the configuration chosen by the kernel is not the most suitable for us? Well, we can simply go to sysfs and change it. No philosophy; it's a single write to this attribute.

So now let's go to some stuff related to drivers. We have some set of drivers, for example in some vendor kernel on our board, and then some new device appears on the market. It is backward compatible with some old driver, but this old driver doesn't have a suitable entry in its device ID table. It means that it didn't declare that it is compatible with this device, because the driver is older than the device. Generally, we can simply add the entry to the code and recompile the kernel, and that's the way we should go. But sometimes we may not have the source code or the config file, or we have other problems of that kind, or we simply want to make it work now without recompiling the kernel and uploading it. So we may add a vendor ID and product ID pair to the driver using the sysfs infrastructure. What can we really do? We can add a simple vendor ID and product ID pair. We can add vendor ID, product ID, and interface class. Or we can add vendor ID, product ID, interface class, and the device information. You remember the field from the device ID table I told you about, which is used to store quirks for devices.
This is exactly the data format which can be used to reference the quirks of another device. Those two fields are used to look through the device table in the driver and copy the value of the quirks to this entry. Remember that all numbers here are interpreted as hex. It means that when you execute lsusb you will get the interface class, which for vendor specific is, for example, 255. But if you try to write the interface class as 255, you will simply get an error. You should use the hex format and simply write ff. We add a new device ID in the driver directory: we simply write to the new_id file. To show the list, we may simply cat this file, and we remove a previous entry using the remove_id file.

If we would like to check which driver is bound to our device, we may just go to the device directory and check the driver symbolic link. If you would like to unbind the driver, we simply write the interface directory name to the unbind file in the driver directory. Or to bind a driver, we simply write... There is an error here; it should say bind, not unbind, of course. We simply write the interface directory name to the bind file.

Okay, so let's try to do this. What I have here is an Odroid board. It's, let's say, a rough equivalent of the Galaxy S3. It has a USB chip from Synopsys. I'm running Tizen there, but I don't... serial console, okay. So this is the serial console from my Odroid board. I will now use ConfigFS, libusbg and the gadget tool to construct a USB device on it. Okay, I have just created the USB device, and now I will connect it to my system. First of all, what we should do is check dmesg for the details about our device. We have the information that a new device has been found, the vendor ID, product ID, strings, and here we have the information that some driver has been bound to interfaces from that device. What we can do now is use lsusb to check what we have really connected, and this is the device.
Okay, what have we got here? First is the data from the device descriptor together with the strings. Then we have information about configurations. We see that this device has two configurations. The first of them contains the Ethernet function, and the second one contains a vendor-specific interface, but I know that this interface is simply a generic serial function, so it provides a serial port. For now, what is available in my system is a network interface. If I do ifconfig -a, I've got the usb0 interface. We saw in dmesg that this USB device is connected to port 3-2. So now let's go to sysfs. I am in /sys/bus/usb/devices and I go to 3-2. Okay, this is my device.

What I would like to do now is show you, for example, interface authorization. I will write zero to usb3/interface_authorized_default. Oops, of course zero. It means that each new interface which appears on this bus will be unauthorized by default. So if I now go to the device directory and echo 2 to bConfigurationValue, the network interface disappears, because I changed the configuration to the second one, in which the serial function is available. And if I change back to the first configuration, in which the network interface should be available, it still won't be available, because the interfaces, for example this one, are not authorized. To authorize this interface I simply have to write one to authorized. Because this function uses two interfaces, I have to do this for the second one as well, and then I have to trigger driver probing for that interface. It means an echo of the directory name, 3-2:1.0, with -n of course. Anyone see the error? echo. Now it's fine, and if I do ifconfig, I have the USB interface once again. We can also modify the driver probing policy.
So, I told you that I have the serial function, and for now this serial function is not usable, because the generic serial driver doesn't bind to my vendor ID and product ID pair. So I may go to the serial driver, the generic driver's directory, and echo, with ff, which is the interface class. This is the vendor ID and product ID of my USB device, as you saw in the lsusb output, and then I simply write this to new_id. And now if I go back here and do echo... of course, let's first turn off the interface authorization, so that each new interface will be authorized by default. We now have to switch the configuration to the second one, because the serial function is available in the second configuration. Let's check dmesg: we've got a new serial port, a new TTY available in our system, which means that the generic serial driver has now bound to this interface.

Okay, so we know how to force a driver to bind to our USB device, but what if something goes wrong? What if our driver starts behaving unexpectedly? Probably we would like to debug it, but it's pretty hard to debug code if we don't know what happened, what caused our problem. So that's why it's really useful to know what kind of data the device is sending to our driver that gets it into trouble.

First of all, let's start with a little bit of theory. USB is a host-controlled bus. It means that nothing on this bus can happen without the host first initiating it. So if our driver does not ask for data, it will not get any. It's the same with control transfers and all other kinds of transfer: if the host doesn't ask for data, the device is unable to send it. At a very low level, a USB transfer is divided into transactions. A transaction means delivery of a single packet, whose size is up to the max packet size declared in the endpoint descriptor. A transfer is a set of consecutive transactions, all of which are of maximum size except the last one, which may be a short one, shorter than the maximum packet size.
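The relation between a transfer and its transactions can be sketched as simple chunking. This is a minimal illustration under stated assumptions: it only splits the payload by the endpoint's maximum packet size and ignores zero-length packets, handshakes, and retries; the function name is made up.

```python
def split_into_transactions(data, max_packet_size):
    """Split one transfer's payload into per-transaction packets.

    Every packet is max_packet_size bytes long except possibly the last,
    which may be shorter. Zero-length packets are deliberately not modelled.
    """
    if max_packet_size <= 0:
        raise ValueError("max_packet_size must be positive")
    return [
        data[offset:offset + max_packet_size]
        for offset in range(0, len(data), max_packet_size)
    ]


if __name__ == "__main__":
    # A 1300-byte transfer on an endpoint with a 512-byte max packet size.
    packets = split_into_transactions(bytes(1300), 512)
    print([len(packet) for packet in packets])
```

For the 1300-byte example this gives two full 512-byte packets followed by one short 276-byte packet.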
And the level on which our drivers operate is the transfer. It means a bunch of data which is sent from the host to the device. The transaction level is the level on which the USB host controller driver operates, or the USB device controller driver. So it's a much lower level.

The base abstraction in the Linux kernel for a USB transfer is the USB request block. The kernel provides the definition of a structure which is a kind of envelope for the data sent by the driver over the USB bus. First of all, we have a pointer to the device. We have the pipe field, which encodes the endpoint number, so the destination of our data. The status field, which describes whether this bunch of data has been successfully delivered or there were some problems and the transfer did not finish. Transfer flags, which can be used to modify some transfer behaviors. The buffer, so the data which we would like to send over the USB bus, the length of the data, and the actual length, which means how much data had been sent before, for example, an error occurred. If we are using a control transfer, we additionally have to provide a setup packet. It's a predefined structure which simply contains a couple of integers which are used to identify the request type. I refer you to the USB specs to find out what those numbers exactly mean. And we have the context, which is a pointer to be used by the driver as the context of this particular data delivery, and of course the complete routine, which will be called when this bunch of data has been delivered, or an error has occurred, or for example it has been canceled. To allocate a USB request block you should use this function, and free it with the matching function, because your host controller may do some memory mapping, et cetera. So that's why you should not allocate this structure on your own in the driver.

So what does a typical USB driver look like? Generally we have three USB-related functions.
There's the probe function, which is used to check the device and allocate the resources. Then we've got the disconnect function, which is used to release the resources. And the complete routine, which should check the USB request block status (and many drivers don't check it), get the data from the USB request block, and resubmit the work. Why resubmit? Well, generally drivers don't like to deallocate and allocate USB request blocks all the time. That's why a driver usually allocates a bunch of USB requests and then simply reuses them, only placing the data in the suitable position. All other functions will be related to the other subsystem which is used to provide the USB device functionality to user space. For example, for a pen drive they will be related to the block subsystem.

Typical bugs in USB drivers. Missing descriptors. Well, first of all, USB devices can do really unbelievable things. That's how the bugs are found, for example using a fuzzer, the umap tool or something. It means that a device can be very different from what you expected. For example, a device with your vendor ID and product ID may have zero endpoints, or may have endpoints of very different types than you expected. So each time the probe function is called, you should check whether the device really provides what you expect, and of course you should check for errors. Often there are no error paths for missing entities. So for example, if an endpoint is missing, the driver tries to get this endpoint, gets NULL, and then we simply have a null pointer dereference later in the code. Drivers have problems with correct error handling in the complete function. It means that when the device disappears in the middle of a USB transfer, there is a null pointer dereference, or an access to data which has not been sent by the device. This was, for example, the case with the DisplayLink driver, the open source driver for DisplayLink.
And of course, USB drivers are implementations of some communication protocol, so they are vulnerable to malformed packets. So remember that if you are communicating with a device, you should always do some sanity checking of the sizes declared by the device. For example, when we ask for the configuration descriptor, we first ask for the total size in order to allocate the buffer. Then we allocate the buffer, and then we ask once again with a different size, and we assume that the device will send us the same amount of data. It may not.

So how can we learn what the device sends to us? First of all, there are some hardware solutions. They are pretty good. The one from Ellisys is the one I have on my desk. This one is also really good. And what is the benefit of having them? They will not only show you the data at the transfer level, they will also show you the low-level data from the transaction level. It means that they will show you how many NAKs there were from the device, what the delay between single transactions is, et cetera. So they are pretty good but, as you see, pretty expensive. There is a cheaper solution. It's called OpenVizsla. If someone is using it, I will be happy to have a conversation about it. I have heard about this project; I haven't used it personally. I'm talking about it here because I want to find some community around this board, because the project is, let's say, a mystery to me. It looks dead, but some people claim that they are using this board. It is an open source communication analyzer on an FPGA board. So those are the hardware solutions.

Fortunately, we have a cost-free solution, which is called the USB monitor, usbmon. It's a simple logger for events like URB submit, URB complete, and submit error. So you can get all the data at the transfer level, and only at the transfer level. You are not going to get data about single transactions from here, because it's a simple logger. It's nothing more.
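Each logged event can also be read as a line of text. As a rough sketch of what such a line contains, the snippet below decodes only the four leading fields, assuming the layout described in the kernel's usbmon documentation (URB tag, timestamp in microseconds, event type, and an address word of the form type-and-direction:bus:device:endpoint). The sample line is illustrative, and the parser name is made up.

```python
# Assumed single-letter codes from the usbmon text format.
EVENT_TYPES = {"S": "submit", "C": "complete", "E": "submit error"}
URB_TYPES = {"C": "control", "Z": "isochronous", "I": "interrupt", "B": "bulk"}


def parse_usbmon_line(line):
    """Decode the leading fields of one usbmon text line into a dict.

    Only the tag, timestamp, event type, and address word are handled;
    the remaining fields (status, length, data words) are ignored here.
    """
    tag, timestamp, event, address = line.split()[:4]
    type_and_direction, bus, device, endpoint = address.split(":")
    return {
        "tag": tag,
        "timestamp_us": int(timestamp),
        "event": EVENT_TYPES[event],
        "transfer_type": URB_TYPES[type_and_direction[0]],
        "direction": "in" if type_and_direction[1] == "i" else "out",
        "bus": int(bus),
        "device": int(device),
        "endpoint": int(endpoint),
    }


if __name__ == "__main__":
    # Illustrative line: a control IN submission to device 1 on bus 1.
    sample = "d5ea89a0 3575914555 S Ci:1:001:0 s a3 00 0000 0003 0004 4 <"
    print(parse_usbmon_line(sample))
```

Decoding everything after the address word depends on the transfer type, which is one reason a ready-made dissector is more practical than a hand-written parser.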
We have a text interface and a binary interface, and there will be one instance of the binary and text interface for each USB bus available in your system. Having a text interface is great. Having a binary interface is also great, but writing your own program to interpret it is really not a thing you want to do to debug your USB traffic. That's why we should go and use Wireshark. But remember that there are two kinds of events. We have submit, when the driver submits the USB request, which means it asks for a transfer, and then complete, which says that the transfer has been completed. And the data is not always valid. If you have an IN endpoint, which is used to send data from the device to the host, the data is valid only in the complete event. If you have an OUT endpoint, which is used to send data from the host to the device, the data buffer is valid only in the submit event. So don't check the data if it is not valid.

Okay, so now let's try to catch something in Wireshark. You remember that we have a serial device connected to my computer. So now let's open screen. Okay, so that's the serial output. Now let's go to the... this is the device console. It would be better to run Wireshark now. First of all, to use the USB monitor, you have to modprobe the usbmon module. And when you run Wireshark, you will see a couple of usbmon devices. The number of the device is the bus number. We use the third bus, so I will start listening on usbmon3. And because that's not the only device connected to the bus, I get a lot of mess, and I don't want to play with it. So I will simply filter it using the device address, and the device address for my device, from lsusb... the device address is seven. Equals seven. Go. Now I will simply send a test message from the device. That's the serial console of the Odroid. And as you see, it appeared in screen. So my host received the message from the USB device.
And if I go to Wireshark, I will see that the IN data has been sent. This one is a complete, because the driver submitted the IN URBs when I opened the serial TTY. So now I have the complete event. That's exactly the message which I sent from the device to the host. And then I have another event, which is URB submit, which means that the driver is waiting for more data from the device.

Okay, so that's pretty much all. Just to sum up: to identify your USB device, you generally use the values from the USB descriptors. You can get them using lsusb. It's a really friendly and useful tool. Remember to check it. Each driver binds to devices based on its declared list of compatible USB devices. But of course you can modify this, and you can sniff what your driver is sending to the device, for free, using usbmon. So thank you. I think we have some time for, let's say, two questions, if there are some. Yeah? Are they effective?

Well, okay, so the question is that I used Wireshark, and the question is about the commercial USB analyzers, whether they are effective. Well, yes, generally they are effective. They can show not only the USB transfers. The transfer level is the level on which USB drivers operate. But if you'd like to write a driver for a host controller or a device controller, you'd probably like to know not only the level of the USB protocol but also the lower level, the level of sending single transactions, receiving NAKs, et cetera. Yeah, yeah, exactly. That's the stuff for which you should use the hardware analyzers. Of course they also have dissectors for the protocols on top, like mass storage, human interface, et cetera. But Wireshark usually has them too. Of course the hardware analyzers are much better, the performance is better, et cetera, but they cost a lot. Okay. Well, I don't know your USB. The question is whether we can use logic analyzers for USB.
So, okay, just to sum up: a logic analyzer can be used if it is fast enough and you have suitable software to dissect the physical layer. Okay, any other questions? Okay, so thank you very much.