Hello, hello. Good morning, everyone. We're going to start this slot. The EuroBSDCon Foundation is very happy to have Patrick Wildt today, who's going to talk to us about Wi-Fi fidelity. Welcome, Patrick, and thank you very much.

Yeah, welcome, and let's talk about bwfm, or first, about me. Who am I? I'm just an OpenBSD developer. Together with kettenis and jsg I take care of the ARM64 subtree, so I implemented many drivers and the support for many boards. I also used to update the LLVM subtree, which we unfortunately cannot do anymore because of the license change, but, well, that makes my plate a bit lighter.

Since I am one of the maintainers of the ARM64 subtree, I also have plenty of boards, plenty of SBCs, plenty of hardware that's just lying around and essentially doing nothing. I gathered a collection of devices, like the CuBox-i from SolidRun, which is a tiny little cube, a MacBook, which isn't an SBC, a Raspberry Pi 3, and some Intel-based mini-PC. All of those machines have in common that they have a Wi-Fi chip, the Broadcom wireless FullMAC. And I was wondering: how hard would it be to make it work?

If you look at the milestones, you can see that I started sometime in 2016, and it took quite a while. I don't remember exactly what happened until September 2017, but at some point in 2017 I got the Wi-Fi scan running. So there was a lot of spare-time work involved in making it work. After I got the first Wi-Fi scan, where I could see in ifconfig all the nodes that were around me, it was basically enough to get it cleaned up and committed. And as you can see, in the following months I implemented hostap mode, so we can be an access point ourselves, and added a PCI Express and an SDIO backend.

So generally, how do you attempt to write a driver for a piece of hardware? First of all, try to find documentation. 
Just grab an operating system, search for data sheets, try to find something that already has information you can use. If you don't have any of that, you can basically quit now because it doesn't make any sense. There are some hardcore people who like to reverse engineer blobs, if your country's law allows that. But that's so much work, and essentially you don't do anything else but reverse engineer blobs. But if you find code, if you find data sheets, try to understand how the device actually works and how you are supposed to interact with it. That will help you implement a skeleton and all the layers that you need. And in the end, realize it's going to be a long project because, as you have seen in the milestones, it takes quite a while to get it running, especially if you just do it in your spare time.

First of all, let's talk about Wi-Fi, or why this chip is special compared to the other chips that we have. This chip is a so-called FullMAC. As you can see in that simplified layer diagram, the MAC layer is now part of the device. Essentially, there's firmware running on the device like an operating system, but it's probably some small real-time environment that takes care of all the Wi-Fi handling. That makes it easier for us because we don't have to do all of that. The firmware does it, which increases the complexity in the firmware. But still, the interface to the firmware, to the hardware, is so much easier compared to doing the MAC layer yourself.

Now, if you look at our OpenBSD stack, we see that there is essentially one part of our system that is responsible for Wi-Fi, and it's the net80211 stack. This one was written with SoftMACs in mind because at that time the chips weren't powerful enough to do the MAC layer themselves, which means it is harder for this chip to be supported in our stack because we don't have that kind of abstraction. 
So I looked around and I found a Linux driver, and thankfully it was actually ISC licensed, which means I could just grab code and read code however I wanted to. There are actually two drivers, both committed by Broadcom and written in 2011 or something like that. One of them is for the FullMACs. The FullMACs are the newer generation chips, which are more powerful. The other one is for the SoftMACs. The SoftMAC driver has the MAC layer in code, and you can see that in the lines of code. It's a big difference, especially since there's one file which has 28,000 lines of magic. Look at it. The first 10K lines are essentially magic numbers that can do something or nothing. Porting that driver would be a horrible task. On the bright side, this is a driver that is only used on older MacBooks. So when people come to me and say, hey, is this chip supported, and I see it's too old, I can say no, it would just take too long to write that driver.

But still, you have those 28K lines of magic. What did Broadcom probably do? Well, I guess they put that file into the firmware. So all the complexity and all the issues that they have are put into the FullMAC firmware. But still, it makes our job easier because we don't have to do anything about beacons or frequencies or whatever. We don't care about that. All we have to do is basically initiate a scan, configure an SSID, say join that SSID, and handle the events and the data that arrive on the data pipes.

So let's talk first about how the chip connects to the bus layers and how it's structured. Essentially, the chip has three different buses that it can talk over. There's the USB bus, which is used on some USB dongles. There's the SDIO bus, which is essentially like an SD card, and it's being used on plenty of SBCs like those Allwinner boards. I even have Intel hardware that has it connected like that. 
And then there's the PCI Express backend, which is being used on MacBooks, and there are even a few M.2 cards that you can put into your APU using a mini-PCI Express adapter. The USB and SDIO parts have a middle layer in common, the BCDC, which is a very simple layer. PCI Express is a bit more complex because on the PCI Express bus the device can actually DMA from your memory. So to make it faster, it's a really different system, while for USB and SDIO you actually send packets and frames to the other side.

What I did is I started with SDIO because that was the first hardware I had, the Intel hardware, and realized that testing would be too slow, essentially because I had my, I don't know, 40-inch TV, and that was the only monitor I had, and the Intel machine didn't have a serial port. So every time I was sitting on my couch, looking at the dmesg on my big TV, and, well, nah. So I bought a USB device, the so-called official Raspberry Pi USB dongle, because at the time when they produced the dongle, the Raspberry Pi didn't have built-in Wi-Fi. And since the Raspberry Pi uses a Broadcom chipset, and it's kind of from Broadcom, they also use the Broadcom Wi-Fi. I started building the lower layers, like how you communicate with the USB device, then added the top layers to complete the full chain of how you talk to the device, how you do Wi-Fi, how you set the configuration, and then added the PCI Express and SDIO backends later.

How does USB work? Well, no, I'm not there yet. First of all, what do you do when you write a driver? You write a skeleton driver, something that simply attaches. It says, hey, I've attached. And then you build on top. You try to find out how you talk over the bus. If it's memory mapped, you can just bus_space_map it and go ahead. If it's USB, you try to find the pipes so that you can send messages over them. And once you have done that, try to find out if the device is alive. 
Like, read its chip ID, read its version, read a MAC address and see if it responds. If you got that going, then you know, OK, I can talk to the device, and then you can start building all the layers on top.

To USB. Interesting. Looks like those are the old slides. Anyway, first of all, you have some kind of control pipe that you need for sending messages so that you can set the SSID and set the keys. It's actually very simple. Then you look for the data and event pipes, an RX pipe and a TX pipe, where you send packets over. Once you have those set up, you can basically start setting the configuration. And when you have the configuration, you can set the SSID, start a scan, and then essentially wait for events to happen. Because at that time, when you have the pipes open, you will get transfer completions and get information.

How do you set the configuration? On the top layer, it's really simple. It's essentially key-value, with the key being an ASCII string and the value being some binary data. So you fill a struct with the information that you need. For a scan, you can say which frequencies you want to scan, or whether it's an active or passive scan, and then you basically say, do an escan with those parameters. If you imagine writing a driver like that, it's very simple because you have all those commands saying set this configuration key to that, set that configuration key to this. It's easy to read, and it's also easy to write.

On the wire, it essentially looks like this. You concatenate the key and the parameters, and the delimiter is basically the null terminator of the string. You prepend a header, which specifies whether it's a get or a set and the whole length of what you're setting or getting. There's not much more magic than that. Once you have done that, you can say do a scan. And once you do a scan, you will get all the events. 
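To make the key-value encoding just described a bit more concrete, here is a minimal Python sketch. It only illustrates the idea from the talk (ASCII key, NUL delimiter, binary value, and a header carrying get/set plus the total length). The command numbers, the 8-byte header layout, and the contents of the scan parameter struct are all assumptions made up for this example, not the real firmware format.

```python
import struct

# Hypothetical command identifiers; the talk gives no numeric values.
CMD_GET_VAR = 1
CMD_SET_VAR = 2


def encode_var(name: str, value: bytes = b"", is_set: bool = True) -> bytes:
    """Build one control message: header + ASCII key + NUL + binary value."""
    payload = name.encode("ascii") + b"\x00" + value
    cmd = CMD_SET_VAR if is_set else CMD_GET_VAR
    # Assumed header: little-endian 32-bit command, 32-bit payload length.
    header = struct.pack("<II", cmd, len(payload))
    return header + payload


def decode_var(msg: bytes):
    """Split a message back into (is_set, key, value)."""
    cmd, length = struct.unpack_from("<II", msg, 0)
    payload = msg[8:8 + length]
    # The first NUL byte ends the key; everything after it is the value.
    name, _, value = payload.partition(b"\x00")
    return cmd == CMD_SET_VAR, name.decode("ascii"), value


# Made-up parameter struct: one byte scan type, one pad byte, 16-bit channel.
scan_params = struct.pack("<BxH", 1, 6)
msg = encode_var("escan", scan_params)
```

A driver built this way ends up with one small helper for "set variable" and one for "get variable", and everything else is just filling in structs, which matches the "easy to read, easy to write" point above.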
The events and the data are essentially Ethernet packets. You can see it on the right side. They're Ethernet packets with a prepended header, the so-called BCDC header, and in between there are firmware signals. The most important part of the BCDC header is the data offset, because this one tells you where the actual Ethernet data starts. Events are also just Ethernet frames, but what are events, first of all? Events are: I have seen a node. I have associated. I have authenticated. So all that stuff that happens on a Wi-Fi network is transferred to you as a driver as Ethernet packets. You can see that they use a specific Ethertype for the events.

Now, if you look at SDIO, SDIO is a whole different bus, and it works completely differently, but it has something in common. You get the event packets on the incoming line, and you send the configuration and data on the outgoing line. You have a bit more than that, because you can actually access more registers, but essentially you have a FIFO that you write your bytes into. So when you read a packet, you first read a 32-bit or 16-bit integer, see that the packet has a specific length, and then read the whole length of the packet from that FIFO.

There's something special on SBCs with the SDIO bus. Usually you would imagine it's just an SDIO card that you can put in, but it doesn't completely work that way, because on all the SDIO SBCs the chip is soldered onto the board itself. And there's a bug that happens on some host controllers. When you do SDIO with a one-bit channel, you have one pin where the data goes over, but obviously you need some more speed. So you do four-bit data communication, but there's an interrupt pin, and the interrupt pin is shared with one of the data pins. 
So the specification says that during a specific interrupt period, this IRQ pin is supposed to be sampled. And some host controllers have trouble with that. How they essentially work around it is that they route a separate pin from the chip to some GPIO, and that will be used as the interrupt pin, and then you don't have to sample the shared interrupt pin itself. Well, if you look at the device tree, it looks like a big hack, but I guess it works, and it makes it faster, so why not?

Then let's have a look at PCI Express. I will go into less detail on PCI Express, because it's much more complex with all the DMA going on. Essentially, you have multiple ring buffers that you stuff data into and receive data from. You have, first of all, the TX control ring, which you use to send commands to the chip itself, for instance to send the configuration data or to create flow rings. Then you have an RX post ring, where you put empty buffers. Essentially, it's a command telling the firmware, hey, here's an empty mbuf, you can fill this one with input data that you receive from the outside world. You give the device an mbuf, and once the device sends an RX complete, it will tell you that packet ID, whatever, number 120, has completed. Then you need to look up what packet ID 120 is, get the mbuf for that, and put the mbuf into the network stack. Also, obviously, you have a control complete ring, which lets you know once the commands that you have sent have completed.

You also have so-called flow rings. Those flow rings are dynamically allocated, and they are essentially the actual TX rings that you use to transmit data to a client or to an access point. What you can do with them, apparently, is have multiple queues per node, for instance, if you're an access point. So I guess the chip itself can do some better queuing with that. 
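The packet ID bookkeeping for the RX post and RX complete rings described above can be sketched as a small lookup table. This is only an illustration in Python: the class name, the fixed table size, and the use of a `bytearray` as a stand-in for an mbuf are assumptions for the example, not how the actual driver is written.

```python
class PacketIdTable:
    """Maps small integer packet IDs to buffers that were handed to the device."""

    def __init__(self, size: int):
        self._slots = [None] * size

    def store(self, buf):
        """Remember a buffer and return its packet ID, or None if the table is full."""
        for pktid, slot in enumerate(self._slots):
            if slot is None:
                self._slots[pktid] = buf
                return pktid
        return None

    def take(self, pktid: int):
        """Return the buffer for a completed packet ID and free the slot."""
        buf = self._slots[pktid]
        self._slots[pktid] = None
        return buf


# RX post: give the device an empty buffer and tell it the packet ID.
table = PacketIdTable(size=4)
empty = bytearray(2048)          # stand-in for an empty mbuf
posted_id = table.store(empty)

# RX complete: the device reports the same ID; look the buffer up again
# and hand it to the network stack.
received = table.take(posted_id)
```

Because the device only ever echoes back the ID, the driver never has to trust a pointer coming from the firmware; it only has to check that the ID refers to a slot it actually filled.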
And you have to open a flow ring and tear it down when a node attaches or detaches.

The interesting bit about PCI Express and SDIO is also that you can basically read from the chip itself. The chip itself looks a bit like this. You have a bus with a PCI Express and an SDIO backend, and over those you can talk to the OTP or to the RAM, which you need in order to upload the firmware for the device. So essentially, you turn the ARM core off, then you upload the firmware, and then you kickstart the ARM core again so that it starts the firmware. You also write some configuration, the NVRAM.

What you can also do is read some kind of dmesg. Essentially, the standard firmware will write ASCII characters, like printf, into some buffer. You can use that information to debug your own issues when you debug the driver. For instance, I had an issue where I sent too many packets to the device, and then the device started complaining in its dmesg, which was really, really helpful to figure out what was going on. But still, this tells you that the firmware is actually quite huge. It creates Ethernet packets, it parses some configuration stuff, it prints information. It kind of looks like Linux, but I guess it isn't.

And if you look at the firmware itself, like just run strings on the firmware, you will see at the end that they have some feature flags. When they create a firmware, they essentially enable and disable what the chip and the firmware can do. For instance, this is a PCI Express firmware. It can do peer-to-peer, which Apple uses for the AirDrop feature. SR stands for suspend and resume, which means that the firmware helps you in suspending and resuming the chip so that it can save its state. There's the MCHAN feature, and it also has the MBSS feature, which is not on that slide, but it's a firmware feature that you can read. This is used if you want to create two access points, for instance, or if you want to operate on multiple channels. 
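As a rough illustration of acting on such feature flags, the following Python sketch splits a tag string into a set and checks for individual features. The dash-separated format and the example string are assumptions invented for this sketch; only the feature names p2p, sr, mchan and mbss come from the talk.

```python
def parse_features(tag_string: str) -> set:
    """Split a feature tag string into a set of lower-case feature names."""
    # Assumption: tags are separated by '-'; empty pieces are ignored.
    return {tag for tag in tag_string.lower().split("-") if tag}


# Made-up example string, not copied from a real firmware image.
features = parse_features("pcie-p2p-sr-mchan")

can_suspend = "sr" in features        # firmware-assisted suspend/resume
multi_ap = "mbss" in features         # multiple access points
```

A driver could use checks like these to decide at attach time whether to offer, say, suspend support, instead of hard-coding it per chip.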
And, yeah, you can also basically run a get-features command. It's the same system as setting a variable. You can read a variable, and there's a variable that tells you what features the firmware and the chip have, and then you can act on that.

The tricky bits are a bit of flow control. For instance, on SDIO, I sent too many packets, and then it essentially just gave up. It was kind of hard to figure out how I am supposed to know if it's too much. There are actually multiple ways of doing that. In every packet that you send, you set a sequence number, and every packet you receive carries a maximum sequence number. So you can basically calculate the difference between the maximum that you're allowed to send and the one that you have sent so far.

You also have an issue with asynchronous control messages. Just imagine you attach to an access point, and then you set the WPA keys, and only after that are you supposed to send network packets, because otherwise they are not encrypted. If you don't do that synchronously, well, you will send the packets before the keys are set. That means you have to make sure that it works in order. But if you look at PCI Express, there you send commands, and at some point in time you will receive a packet back which tells you, oh, it finished. That means that until then you have to sleep and wait until it arrives, which you can only do in process context in OpenBSD, since you're not allowed to sleep in interrupts.

Also, the whole network stack integration is tough, essentially because the stack was written for SoftMACs. So what we have to do in some parts is this: a beacon says, hey, I saw an access point, or, here I am, an access point, and we don't really get that kind of information. But our stack expects that beacon. So we have to essentially fake it. 
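Coming back to the sequence-number flow control mentioned a moment ago, the "difference between the maximum you're allowed to send and what you have sent so far" can be sketched like this. It is a simplified Python illustration: the 8-bit wrap-around and the class and method names are assumptions for the example, and a real driver may apply additional sanity checks.

```python
SEQ_MOD = 256  # assumption: sequence counters are 8 bits wide and wrap around


def credits(tx_seq: int, max_seq: int) -> int:
    """Frames we may still send before reaching the advertised maximum."""
    return (max_seq - tx_seq) % SEQ_MOD


class TxWindow:
    def __init__(self):
        self.tx_seq = 0    # sequence number for the next frame we send
        self.max_seq = 0   # latest maximum advertised by the device

    def on_receive(self, max_seq: int) -> None:
        """Every received frame carries the device's current maximum."""
        self.max_seq = max_seq % SEQ_MOD

    def try_send(self):
        """Return the sequence number to use, or None if we must wait."""
        if credits(self.tx_seq, self.max_seq) == 0:
            return None
        seq = self.tx_seq
        self.tx_seq = (self.tx_seq + 1) % SEQ_MOD
        return seq
```

When `try_send` returns `None`, the frame stays queued until the next received frame raises the maximum, which is exactly the situation where sending anyway made the device give up.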
We have to create a Wi-Fi frame internally in the driver, which sets all the bits that the Wi-Fi stack will expect. So we fill that in and put it into the stack. It's the same with association and authentication. When we get the event that we are authenticated, we still have to craft a Wi-Fi frame and send it into the stack so that the stack knows that it actually happened. That's not nice, but I guess for now that's a good start, and then we can start on abstracting it and making it, I don't know, like Linux with another layer. But that also means that people would have to work on it, and I guess that's a lot of time, so it was easier to do it this way and maybe also keep it this way.

We also have issues. Like I told you, it has a firmware, and the firmware, well, it's big, it has many features, and it also has bugs. I also told you about the events that are Ethernet packets. Well, someone asked himself, what happens if I send those event packets from the outside? Apparently, I think, they were just passed through to the driver, which means that you can send malicious packets from the outside to your driver, which it then tries to parse. The CVE was in 2016 and they fixed it, I think, more than a year later. There was also KRACK, which they also fixed some time later. This information is based on the linux-firmware git. So it's possible that if you are a board manufacturer and you get the chip from Broadcom or from some other supplier in the chain, they might have a new firmware for you. But for end users like us, who often get binary blobs of firmware from the linux-firmware git, it will take a while until you actually get the fixes. So the information is released, but the firmware isn't updated, and that's not really nice. For KRACK, what we did is essentially turn the feature off. 
It has a built-in supplicant, so you can tell the chip to use this Wi-Fi key, whatever it is, and then it will do the WPA handshake for you. But we turned it off and let our stack handle it. We had to fix ours, and it works.

Another issue is the so-called NVRAM. This is essentially a configuration of the chip for the board. So imagine you're a board manufacturer, you buy the chip, and then you go through CE testing, and the CE testing will complain that, you know, you draw too much power, you do too much of this and that, and then you have to adjust that. You can't change the firmware, so what you change is the configuration. This is sometimes needed on PCI Express. The MacBooks work out of the box, you don't need to supply the NVRAM. But there are machines like the GPD Pocket, which is a very tiny Intel-based computer, where you have to supply it. When you use SDIO, you always have to supply it, there is no way around it. And for USB, you don't need it. It's very simple, the device has it itself, but there are also not many USB devices around.

Where do you get that file? Well, if you produce the hardware, you create the file. What those board manufacturers usually also produce is a Linux image or some mechanism to build images, and that one will have the files. So if you're OpenBSD and trying to support the board, where are you going to get the files from? You would have to look for it for each board. The GPD Pocket, which has the PCI Express backend, actually has the configuration stored in an EFI variable, which means that, well, you can start Linux and then try to grab the file. OpenBSD so far has no way of reading the EFI variable once it has booted and is in the kernel. That would be cool to have, but as long as we don't have it, there's no way to do that. But what linux-firmware actually recently did is gather a collection of those NVRAM configuration files. 
So it's getting better, but it's still only for a handful of devices. Still not nice. So the biggest issue with running OpenBSD on all those devices is: when you don't have the NVRAM, how do you get it? So far, we have no solution for that, other than sending me a mail, and I will reply with what you need.

Well, current status. This picture is pretty good. jcs took it after PCI Express worked. He essentially booted his MacBook with OpenBSD and ran speed tests. So it's like 200 megabits down, 100 up. That's actually pretty decent, especially compared with the other drivers that we have in the tree at the moment. So it works as a client. It's really fast because you don't have to do all of that Wi-Fi stuff. The chip does it for you. It does all the queuing. You just have to supply the packets.

The chips are in recent MacBooks. The chip is also on Raspberry Pis, so a very popular SBC has it. It's also available as a USB dongle, but that one is only 802.11n and only 2.4 GHz. I'm not sure if you can still buy it because those were produced when the Raspberry Pi didn't have built-in Wi-Fi yet, and that's why it was the official Raspberry Pi Wi-Fi.

Also, often enough, it works as an access point. Sometimes it doesn't. My co-worker's wife always complains when the Wi-Fi doesn't work, and then he tells her to reboot the machine. She's desperately waiting for me to find the bug and fix it. So I guess that is something that I will have to do at some point in the future.

We also have the possibility to create multiple access points, or to be an access point and a client at the same time. You can create virtual interfaces on that firmware and then send packets based on the interface index. That would be really cool to have, but it also means that you would have to think about how to create such virtual interfaces in the operating system. 
OpenBSD doesn't have a way to do that yet. I have an idea on how to do it, but I haven't started on it.

Also, suspend-resume. That is a feature that everyone should have when they have a laptop. They want to suspend and resume. I didn't have a device which I could use to test suspend-resume so far, so I wasn't able to implement it, and that's also why jcs isn't using it anymore.

Then there are the so-called firmware signals. Those are being used in the BCDC packet that we saw previously. There's some kind of in-band signaling which I haven't looked at, and there is no code to parse it. I'm not sure yet what it is for, but I will have a look.

Basically, the last thing is supporting more devices. The biggest part of that is not actually improving the bwfm driver. It's more about adding support for all those SBCs, because the PCI Express backend is written, USB is written, the SDIO backend is written, but what is not written, the drivers that are still missing, is the connection to the system. There are plenty of SBCs around from Allwinner or whoever else which have an SD card controller that we don't have a driver for yet. For example, the Raspberry Pi. We run on the Raspberry Pi, but we have no storage yet because they have two SD card controllers, two different ones, one for the Wi-Fi and one for the SD card, and we are still missing the driver for that.

Now I'm at the end. I have to thank Stefan Sperling. He's been an immense help in getting that driver into the tree, in adding all the features, in integrating it into the stack, and in how to actually write a driver and how to structure it. So, well, many thanks to our Wi-Fi god.

Thank you, Patrick. We have time for questions. Come on up, please.

So did you port the Linux driver, the ISC-licensed Linux driver, or did you just look at it from time to time?

Both, actually. I didn't just port it because it was, well, it was like 35K lines of code.

Yeah, as in... 
I just, the driver, it would be too much. So essentially what I did was, I wrote a skeleton driver and I started adding the features bit by bit. Some bits I could just copy and adjust, and other bits I completely rewrote, but based on the information that I saw in the Linux driver.

Okay, and so for the firmware, did you take everything that is in linux-firmware and create an OpenBSD package or whatever?

So we have an OpenBSD port, a package, which is based on the linux-firmware git.

Yep. Okay, thank you.

So actually, if you plug in a chip and you boot up, the firmware update program will pick it up, see that a bwfm attached, and then it will install the firmware for you.

Are you aware of any work that's being done to create, I guess, an nl80211 interface to our net80211 interface, to make porting Linux drivers generally easier? There are a few in the tree right now that have an acceptable license, that aren't GPL contaminated. Those would be nice to bring over, but since they're coded to a different network interface than net80211, it's rather difficult to do. Are you aware of any efforts in that area, either on OpenBSD or other BSDs?

I know that we don't have plans to do that, and I think we won't actually do it, because there's so much work in restructuring, and, well, the gain is copying files. But on the other hand, those files won't just automatically compile because there's so much infrastructure that is completely different. It's not only the Wi-Fi stack but also other parts of the stack, so you cannot just drop a driver in. What we do is essentially implement it driver by driver. So we are improving the iwm driver for newer chips and for newer firmware. We would like to have the ath10k driver implemented, but we also won't copy that driver. 
We have a new implementation, basically, which is based on that. And then there are plenty of other drivers which have, well, a license that doesn't work for us. I would say that most of the devices that you want to have are essentially the Intel chips, because those are available in basically every computer or every laptop, and then you have the Atheros chips.

My laptop doesn't have an Intel chip in it, it has a RayLink chip in it, RTN wireless. The FreeBSD kernel has a number of Linux API calls that you can use from drivers, so one of the big missing pieces is how you interface to the network stack. Given the number of different drivers we could choose from, it would be less work to provide that than, for each new chipset that comes out, to rewrite the driver to our old Sam Leffler era networking code. So I was just curious if people were thinking about that, or still doing it one at a time as the need came up.

To get it on the record, because Stefan is now speaking, who is the Wi-Fi god of OpenBSD: he's saying that writing the interface to drop in the Linux drivers is more work than just maintaining and writing the drivers ourselves. We will get Stefan on the mic.

So, I can't look at you now. The thing is, the drivers already take care of a lot of hardware abstractions, and they do this regardless of the operating system, and this is already quite a lot of code. Now, this is the hardware side. The operating system side is a different piece. The driver basically sits in between these two layers. And so what you're proposing is, okay, take all the hardware stuff from these drivers and make your kernel work with them, but also make your kernel compatible with the upper interface. So you're duplicating effort to make your kernel work, because now you have to maintain not just the hardware interfacing side of things but also the kernel interfacing side of things. So you have a lot more code to take care of, and we are like two
and a half people in the project taking care of Wi-Fi, and we can't even keep up with the current number of hardware devices that are out there. And also, Linux is moving so fast, and they have so many developers that are doing track-off stuff. For example, the graphics stack, as you can see, is already difficult, and doing the same in Wi-Fi, I think, just won't scale. If you look at the commits to the Broadcom drivers in Linux and the Intel drivers in Linux, you will see that only the developers who are employed at those companies actually commit meaningful things that are related to hardware changes. All the other people in the community just do cleanup work and refactorings, but they don't actually do anything that matters as far as the drivers are concerned. So they have, like, an army of employed people, and we can't keep up with that. It's hard enough already to do this for the hardware itself, but also doing it for the rest of the kernel would just be impossible.

Okay, so it's basically interface velocity in Linux, which has been a known problem, and the complexity of maintaining effectively two 802.11 stacks in the tree. That is what makes the idea of having a Linux compatibility layer difficult, first of all to implement, because it's so big, and second of all to maintain, because it's moving so fast, and it's easier just to pull the hardware bits in.

Exactly, and also Linux actually has not just one MAC layer implementation but several, because some of the drivers don't use Intel's mac80211. They use others, for older hardware generations in particular. So the Broadcom driver is large because it doesn't use that framework, for example, or it uses parts of it, but not the way Intel does.

So I've looked at exactly one driver, and it's like, oh, this uses this interface, why don't we have this interface? I thought it would be simple, but as you just explained, it makes sense. So thank you for adding some color to what seemed like a simple problem but actually is
complicated.

No worries, thank you.

Hello. So would you describe Broadcom as a bit more open source friendly entity?

No, I wouldn't. Try to get data sheets, essentially there are none. Well, data sheets, those exist, but real documentation which talks about the registers and how to interact with them doesn't exist, and you can see that with plenty of other devices and plenty of other chips. So it's pretty good to have that driver under an ISC license, but I feel like that's the only thing that they actually produced. The previous driver, the one that they're still shipping to customers, is some internally developed driver which you are not allowed to show to anyone on the outside. So the one that is in Linux is basically a re-implementation of the driver that they ship to their customers. Otherwise, that's essentially the only thing I have seen from Broadcom, but I guess there are different divisions, because Broadcom is a big company. They also do some other network controllers and whatever else, so maybe some departments are a bit different about that.

Am I right in saying that Cypress now owns Broadcom?

Yes, there are Cypress chips that are essentially Broadcom chips, but I don't know if that will change anything. I guess not.

Alright, thank you, Patrick.