My name is Vinayak, and today I will be talking about the Bluetooth Low Energy controller in Zephyr. My presentation has two parts. The first half is an introduction to Bluetooth Low Energy: what it is and how it works as a technology, nothing to do with the code. The second half covers the design, or architecture, that I am currently working on, which improves on what is already upstream. I am not going to take time explaining the old architecture; the new one reuses most of what is upstream, and it is what is going to come in Zephyr.

Before I start, may I ask how many of you here are using Bluetooth Low Energy, and how many of you actually develop Bluetooth Low Energy, that is, the host, controller, application or profiles? Since I see that most of you are users, I guess the first half will be interesting, to know what is happening underneath when I talk about Bluetooth Low Energy technology. (I was supposed to have a clicker here.)

So, as I said, introduction. I have been developing Bluetooth, both BR/EDR and Low Energy, since I graduated from university. In the beginning I was also doing a bit of Windows applications, ASP.NET and things like that. Today I am primarily the active contributing maintainer of the controller in Zephyr. Prior to that, and also as a consultant to Nordic, I developed the in-house Bluetooth stack at Nordic over the last 10 years. I have moved back to my hometown, Bangalore, and I continue to be a consultant for Nordic. I am already five minutes into my presentation, so I will go a bit fast, and I prefer to show demonstrations in the time I have.

So Bluetooth, what is it?
It is short-range, low-power personal area networking: communication between your watch, your headset and your phone, or between your TV and its remote, and so on. It is short-range communication, maybe up to 100 meters if you have 20 dBm TX power. It operates in the 2.4 GHz ISM band, which is free to use, and the Special Interest Group was formed in 1998. That is over 20 years of Bluetooth around you. If you have a scanner or a phone and you scan right now, you might find 200 to 500 devices advertising around you as you sit here, and probably all your car kits and headsets receiving calls are using Bluetooth BR/EDR, that is, Basic Rate and Enhanced Data Rate. There are billions of products shipped, and there are over 33,000 SIG members contributing to and using the specification.

What is Bluetooth Low Energy? It is ultra low power, even lower power than Basic Rate. It is optimized for short-burst data transmission, which means very small packets on air and very short RX and TX windows. The main idea is race to idle: you turn on the radio as seldom as possible and turn it off as soon as possible. With this, a device can advertise, be scanned by a second device, be connected, transmit some minimal data and tear down the connection, all within about 6 milliseconds.

When I started, before the specification was adopted, it was called something like a sensor profile, and it only dealt with a parameter and a value: this is the parameter, this is the value you want to transmit. By the time it was adopted, we had layers of modules or components in the stack.
It is still a simple, stateless operation: the information is transmitted as a parameter and a value, but there is a lot of other infrastructure around it. It was supposed to have a low memory footprint; I think the thesis that introduced Bluetooth Low Energy (Wibree) mentioned that the whole implementation could be done in 4 kilobytes of ROM. Yeah, but that was then. And you should achieve a year or more of battery life on a coin cell such as a CR2032.

Moving on to the technology underneath, the BLE stack looks as shown here. At the bottom you have the PHY, which is the 2.4 GHz radio. Then there is the link layer, which implements the roles and states, and the standard Host Controller Interface (HCI) defined in the specification, which hosts such as the Zephyr host or the BlueZ host use to establish a connection with a peer device and so on. In the host you have L2CAP, the Logical Link Control and Adaptation Protocol, over which runs the Attribute Protocol; as I mentioned, the parameter and value are exchanged over the Attribute Protocol. Then there is the Security Manager Protocol, and GAP, the Generic Access Profile, which is used to discover a device, establish a connection and so forth.

At the top are the profiles, which are basically user use cases. HID over GATT is used by your keyboard and mouse, and there is proximity for your proximity tags. Of course the keyboard and mouse have batteries, so there is a battery profile to exchange the battery level, and so on: temperature, heart rate, blood pressure, and now there are tons of them. On top of this you have your unique application, which is how you differentiate from your competitors.
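
The coin-cell target mentioned above can be turned into an average-current budget. The following is a minimal sketch, not anything from the talk's slides: the 225 mAh capacity, the 8 mA active current and the 2 µA sleep current are assumed, illustrative figures, and only the 6 ms burst length comes from the talk.

```python
# Rough battery budget for "a year on a coin cell".
# ASSUMPTIONS (not from the talk): CR2032 capacity of about 225 mAh,
# 8 mA while the radio is active, 2 uA while asleep.
CR2032_CAPACITY_MAH = 225.0
HOURS_PER_YEAR = 365 * 24


def average_current_budget_ua(capacity_mah, lifetime_years):
    """Average current in microamps that drains the cell in the given time."""
    hours = lifetime_years * HOURS_PER_YEAR
    return capacity_mah / hours * 1000.0


def duty_cycled_current_ua(active_ma, active_ms, period_ms, sleep_ua):
    """Average current of a device that is active for active_ms every period_ms."""
    active_fraction = active_ms / period_ms
    return active_ma * 1000.0 * active_fraction + sleep_ua * (1.0 - active_fraction)


budget_ua = average_current_budget_ua(CR2032_CAPACITY_MAH, 1.0)
# One 6 ms burst (the figure quoted in the talk) every 4 seconds.
usage_ua = duty_cycled_current_ua(active_ma=8.0, active_ms=6.0,
                                  period_ms=4000.0, sleep_ua=2.0)

print(f"budget: {budget_ua:.1f} uA, usage: {usage_ua:.1f} uA")
print("fits in a year" if usage_ua <= budget_ua else "over budget")
```

Under these assumed numbers the budget is roughly 25.7 µA on average, which is why the radio has to spend almost all of its time off.
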
On the BLE PHY: as I mentioned, it shares the 2.4 GHz band with BR/EDR. In BLE you have a symbol rate of 1 megasymbol per second, and you also have a 2 megasymbol per second signaling rate. The modulation is still GFSK, and you can have TX powers up to 20 dBm. Compared to BR/EDR, which uses 79 channels, BLE uses only 40 channels, 3 of which are used for advertising. A device that wants to be connected advertises on those 3 channels, and the device that wants to connect to it scans on them; those are the channels on which you establish a connection. Once the connection is established, it runs on the other 37 data channels. As you see, I have mapped the advertising channels, and they are spread out so that they do not interfere much with the Wi-Fi channels.

On the link layer: its responsibility is to maintain the advertising state and the scanning state, and to be in the slave or master role. The legacy advertising payload was up to 31 bytes. Extended advertising now supports up to 255 bytes, and you can chain packets to achieve longer advertising data. Advertising extensions use the data channels: there is a very small packet on the advertising channel, and the advertising data then continues on the data channels. On the data channels the legacy PDU payload size was 27 bytes; with 4.2 you can have PDU payloads of up to 251 bytes. BLE also has built-in AES-128 with CCM, so the data you exchange can be encrypted.

This is just a small animation of what happens with advertising. Say we have a device advertising every 20 milliseconds on the 3 channels, and a scanner with a window of, I think, 25 milliseconds. It scans on channel 37, and it has an interval of 50 milliseconds, so it then scans
on 38. In the first scan window, the scanner on channel 37 captures two packets; in the second window, on channel 38, it sees one of the advertisements; and likewise in the third. This is what you would see on air: if you turn on your phone and scan for devices, it receives packets whenever it sees an advertisement on those channels.

The topology: you have a scanner and an advertiser. The advertiser advertises and the scanner listens to the advertisements. If the scanner is in the initiator role, it sends a connect request packet to the advertiser, and the connection is established. So a scanner becomes an initiator and then a master, and an advertiser, if it is doing connectable advertising, becomes a peripheral. Similarly, a master can connect to many slaves. If the master is also a BR/EDR device, it can have a BR/EDR connection as well. You can also have a scatternet scenario, where a slave is also an advertiser and gets connected by another device.

What does the connection look like? The connect request contains an offset, measured from the time the connect request was sent, at which the master will transmit using an access address that was also given in the connect request. The slave listens at that time. Once it receives a packet with a correct CRC, it takes a short time to turn around and become a transmitter. That time is fixed by the specification at 150 microseconds and is called the inter-frame space, so between RX and TX, or TX and RX, you always have a 150 microsecond gap. If there is nothing more to be transmitted between the slave and the master in that interval, the radio quickly goes idle. A 27-byte packet here takes about 328 microseconds on air, then there is the 150 microsecond gap, and another 27-byte packet received takes another 328 microseconds, so around 806
microseconds is the radio usage there. In the next interval the same thing repeats: the master starts the transmission, the slave receives it and transmits its packet back. If there are more packets, they keep going with the 150 microsecond gap, up to the next connection interval, for as long as there is data.

How do I show this in Zephyr? I have my boards here: two Nordic Semiconductor development kits. I have the latest nRF52840 chip, and I also have the nRF52832, which I have connected to the Power Profiler Kit. Hopefully I will be able to show natively on Zephyr what I showed you in these graphics, because I have the GPIO debug enabled. If you enable it upstream, you can use an oscilloscope or a logic analyzer to see what I showed in these graphics. Let me see if I can capture it. Before that, it is important to show what I am doing. (All right, I can't see what I'm doing. Is there a way to make it bigger? What was the key combination? OK.)

Typically, make menuconfig has a section for Bluetooth, and you just go down. In the Bluetooth link layer section, of course, I have the source base I am working on now, with the new architecture. The old one is available as the software-based BLE link layer, and in the new one the architecture is split into two layers in the implementation; I will come to why it is that way. What I would usually do is go down to something called Bluetooth controller debug pins. There is also a profile radio ISR option, which prints how much CPU time is used by the controller in its radio ISR, but usually it is sufficient to enable the GPIO toggling and use a logic analyzer to measure how much CPU time is
utilized by the controller implementation. It is already built. I don't know which one is the peripheral, but I already have a connection, so let me start capturing, and I will reset one board and then the other. I have a master and I have a slave, a peripheral. When I reset, one board is advertising and the second board is scanning for it; as soon as it finds the heart rate profile in the advertiser, it establishes a connection. That should have happened already, so let's stop this.

If you see this break here, that is where I pressed the reset button, and this is where the scanning is happening. Since I already had a connection, the connection needs to reach its supervision timeout before it starts advertising. This was a previous connection where I reset one of the boards and the other board was not reset, so it carried on until the supervision timeout. This is where the slave... wait a second. This is where the chip actually starts up, and there is advertising going on. So there is advertising and there is scanning, and no advertisement coincided with the scanning here. It appears this is where the advertiser was discovered by the scanner. (From this angle I can't quite make it out.) This is the initiator here: this is where the connect request was sent by the initiator, this is the first connection event, and from then on you have the connection events where RX and TX are exchanged. These are the radio interrupts, and if I zoom in you can measure the CPU time used by the controller for the radio ISRs.

Going to the controller: it was contributed in 2016, and since then I have been running it through the conformance tester. The conformance tests, which are defined by the Bluetooth SIG, themselves undergo enhancements and fixes, so I had to catch up with the conformance test
and run through all the conformance tests for the Bluetooth 5 specification. We got a design listing for the development kit in October 2017, so what you have now in upstream Zephyr is a qualified design for the controller component.

It is fully BLE 5.0 compliant. It has unlimited roles and connection count, which means that, RAM permitting, you can go as high as you like; I think we have set a limit of 64 as the count in Kconfig, but you could go beyond that if you want, and it is restricted only by RAM. It already supports concurrent multiprotocol, and it has intelligent scheduling of roles. If you have a device that, as a master, wants to connect to 8 or 16 peripherals, the scheduling arranges them so that the connection events do not overlap: it establishes the connections equally spaced in time. If you do a connection parameter update, it finds a non-overlapping offset and maintains the connection there. Also, if a single device is a slave to two or more masters, the connections would drift into each other, since each slave connection follows its master's clock; the current implementation on upstream master detects that and does an automatic connection update to move them so that they no longer overlap.

The design is, I would say, portable; there is only the Nordic chip there so far, but it is open for other silicon vendors to implement the hardware abstraction layers.

How does the architecture of the controller look? At the bottom you have the SoC. There is a bare-bones hardware abstraction layer; back in 2016 most of the hardware in the SoC was accessed through it, but the goal is to remove the hardware abstraction layer and use native Zephyr drivers from the controller code. Above the hardware abstraction layer we have something called the ticker, which is responsible for the periodic scheduling of the
advertiser, scanner and connection roles. Utility is a bare-bones implementation of queues and FIFOs and a very lightweight memory management. It also has a small module that I have developed called mayfly; the idea behind it is, again, race to idle, and it is a way of executing functions across interrupt priorities. The link layer is the core implementation of packet transmission, control packets and so on, above which is the HCI, and then you have the Zephyr host and the application.

The goal for the new architecture is multi-vendor support. Vendors would like to add their BLE support at various layers in the controller, so the split is such that if vendors want to have their own packet management, which could be in their silicon or in a binary blob, the new architecture permits them to reuse the mature implementation they probably already have. In the previous presentation, for example, they said they have a coprocessor with software on it which they have been using for many years, and they will probably continue to use it because it is mature. So the new architecture permits vendors to replace modules at various layers in the controller and add in their own.

Here, at the top is the application, then the host, both of which are Zephyr threads. Then there is an upper link layer, which is going to be open source and has all the control procedures and the scheduling of the radio events and so forth. Then there is a split, and you have the lower link layer; in the case of Nordic it will be open source. The lower link layer is nothing but the radio access: how TX and RX happen and how packets are exchanged between the two devices. Then there is the hardware, which is in the SoC. Vendors could choose to reuse the lower link layer, in which case they just have to abstract the hardware interface, or they
could just use the API between the upper link layer and the lower link layer, which I can show in the next slides.

Mayfly. If I go back, there is a mention of tasklet there; the goal was that we would probably have something equivalent in Zephyr that could be used in the controller to execute such functions. I haven't ported it to a native implementation in Zephyr yet, so until then there is a small implementation in the controller, which is something like a work item. If you have a thread in Zephyr, you can have a work item that is executed by a work queue. So what is a mayfly? A mayfly is similar to a work item in a thread, but it is scheduled by an ISR and executed in an ISR. I have tried to represent that in the diagram. You have callers and callees, and all of these are either threads or ISRs. A caller requests that a function, which is just a normal C function, be scheduled in an ISR at the priority at which it wants it executed. Each ISR has a queue of functions; the caller enqueues into it and then triggers the IRQ to process those functions.

This is something I drew by hand way back in 2012, showing how I envisioned the controller behaving with respect to handling the radio. At the bottom there is a thread; the T represents the thread. The thread makes some API calls, and there is a level of execution, a job, that does the scheduling, and then a higher-priority context that accesses the hardware, sets up the radio and so on. I don't want to go into the detail. The topmost is the counter compare registers firing as a consequence of the setup done by the job, and then there are radio interrupts happening as you receive and transmit the packets.

So what is happening with the scheduling in the new architecture?
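
Before the scheduling walkthrough, the mayfly idea described above (a per-priority queue of plain functions, filled by a caller and drained when that priority's interrupt is triggered) can be sketched roughly as follows. This is a toy simulation in Python, not the controller's implementation: the class and method names are hypothetical, there are no real interrupts, and it assumes that a lower number means a higher priority.

```python
from collections import deque


class MayflySim:
    """Toy model of per-priority function queues drained in priority order.

    Hypothetical names; a pending set stands in for pended IRQs.
    ASSUMPTION: a lower priority number means a more urgent context.
    """

    def __init__(self, priorities):
        self.queues = {prio: deque() for prio in priorities}
        self.pending = set()

    def enqueue(self, callee_priority, func, *args):
        """Queue a plain function for a context and mark its 'IRQ' pending."""
        self.queues[callee_priority].append((func, args))
        self.pending.add(callee_priority)

    def run_pending(self):
        """Run queued functions one at a time, most urgent context first.

        Re-checking the pending set after every call lets work queued for a
        more urgent context run before the rest of a less urgent queue,
        which loosely imitates preemption.
        """
        while self.pending:
            prio = min(self.pending)
            queue = self.queues[prio]
            if not queue:
                self.pending.discard(prio)
                continue
            func, args = queue.popleft()
            func(*args)


log = []
sim = MayflySim(priorities=[0, 1])


def radio_setup(channel):
    log.append(("radio_setup", channel))


def scheduling_job():
    log.append(("job",))
    # The job asks for radio setup to run in the more urgent context.
    sim.enqueue(0, radio_setup, 37)


def housekeeping():
    log.append(("housekeeping",))


sim.enqueue(1, scheduling_job)
sim.enqueue(1, housekeeping)
sim.run_pending()
print(log)
```

The point of the sketch is the ordering: the radio setup queued by the job runs before the remaining lower-priority work, and each context goes back to idle as soon as its queue is empty.
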
You can see the bigger blocks are the execution contexts: the host, in the center the upper link layer, and on the rightmost side the lower link layer. When you want to start advertising, the API call from the application translates into HCI (this is the HCI here). The HCI then calls the link layer interface to prepare a scheduling message that is placed into a queue here. The upper link layer then sets up the counter compare registers. When the compare triggers, the requests are placed into a pipeline, because you could have overlapping schedules, and from the pipeline the preparation for the radio event is scheduled in the lower link layer. Once the radio event, which is the packet transmission and reception, is finished, a message is sent back to the upper link layer saying that that particular event is done. The upper link layer then looks into the pipeline to see whether any other overlapping events are waiting, back to back or very close, and if so it runs the lower link layer again.

On the TX path, again there is an HCI command, which goes into a fixed-length FIFO. That is because in HCI you have Read Buffer Size: if you say, for example, 10 TX buffers, then those 10 are shared across your connections, so all that is needed for TX is a fixed-length FIFO. Once a packet is received by the upper link layer, it is demultiplexed by connection. If you have 10 connections and 10 TX packets, one for each connection, they are demultiplexed into those connection contexts, and when a connection event occurs only that connection's packet is used by the event and transmitted. It then comes back in a FIFO here, because the number of completed packets would at most be 10, since that is what was configured in Zephyr, that you would like 10 TX buffers. So there is a FIFO here and a FIFO here, so the only difference
is that there is a FIFO going downstream, which is then demultiplexed into per-connection context queues, and a FIFO coming back with the number of completed packets to the host thread. That is the TX path.

On the RX path, the host thread always fills up a free FIFO with free buffers, which are picked up by the radio events. A received packet is then sent to the upper link layer, which parses it to see whether it is a data packet or a control packet. A control packet gets special processing; a data packet is sent to the HCI layer and then to the host.

I have the Power Profiler Kit, so I can show you the power profile, but before that, I already have some screenshots of how it looks upstream with respect to the architecture. The top one is a zoomed-out power profile of continuous scanning, and you can clearly see that there is a lot of CPU idle time between channels; these are 37, 38, 39, 37, 38, 39. Then there is multi-role, where you are also advertising while scanning. As you see here, the advertising takes up a whole scan window: it has replaced a scan window to do the advertising, and scanning only resumes in the next scan interval. With the new work I am doing, the scanning takes very little time to switch channels, something like 70 to 300 microseconds. Scanning and advertising at the same time are pipelined, with a resume, so as soon as the advertising is finished the scanning continues back to back after it. This means that the radio utilization is higher in the new architecture compared to the old one. These are just the points: continuous events, for continuous scanning and directed advertising, are truly continuous.

Excuse me, I will take another 2 or 3 minutes to show this. I have this tool from Nordic
nRF Connect, which has the power profiler, which apparently has opened up in this window again. Let's see. It is not generating yet. I'll give it one last try. I must have some wrong firmware there. It is just going to display what I showed over there in terms of the power profile.

[Audience] We believe you.

Yeah, because I have taken the screenshots. So I guess if you have some quick questions I can answer them. I am always available on IRC, you can send me an email, and I usually look at anything related to Bluetooth on the Zephyr mailing list.

[Audience] The new link layer that you presented, the new architecture, lives in a private branch right now, because of certain other items that we are not going to discuss here. When do you expect at least the first bits of it to be merged upstream, so we can actually see it?

I am still porting all the control procedures; that is the remaining part. I promised myself it would be about three weeks, but I also handle bugs and requests on the mailing list, and that is what is consuming my time. But I would say sometime in three weeks. This private branch is also used to develop future Bluetooth features. SIG members will be given access to all the new work being done, and whatever is adopted will always be merged back upstream. So the Bluetooth 5 features, once I am done porting all the control procedures, should be available upstream along with the new architecture, while the new features are developed in the private repository. It is not private as such: it is open to all Bluetooth SIG members, and the restriction is there because of Bluetooth confidentiality. So, thank you.