Good afternoon, everyone. I'm Neel Pandeya. I'm Nate Temple. We're with Ettus Research, and we'll be talking about RFNoC: accelerating the spectrum with the FPGA. Just a quick background on Ettus Research: it was founded in 2004 by Matt Ettus, acquired by National Instruments in 2010, and is located in Santa Clara, California. Ettus Research makes software-defined radios, divided into four families: the B-series, the N-series, the X-series, and the E-series. Generally speaking, the B-series radios connect to the computer over USB, the N-series over Ethernet, the X-series over 10 Gigabit Ethernet, and the E-series are embedded radios. Most software-defined radios connect to a computer directly, and all of the processing for the radio occurs on the CPU. That's shown in the diagram at the bottom, where we have a flow graph running with a waterfall display, and all the calculations for that waterfall are being done on the CPU inside the laptop. People have tried to accelerate this data flow, to increase the throughput of applications like waterfalls and increase the data rates between the radio and the CPU. One approach has been to use the general-purpose processor with SIMD instructions, through libraries like VOLK in GNU Radio, the Vector-Optimized Library of Kernels, which provides assembly-language primitives that optimize DSP processing on the CPU. That effort has been very fruitful, but there's still more to be desired. The GPU has also been used; there's a program called fosphor, which we'll talk about in a minute, that uses the GPU. But the GPU has its own set of limitations: it doesn't always map well to a lot of block-based SDR signal processing, and it carries a high latency penalty. So the GPU has had limited applicability to general-purpose SDR processing.
So the other thought was to use the FPGA that's inside most of these radios. On the Ettus Research radios, every radio has an FPGA inside it, so the thought was: why don't we use that hardware to accelerate our processing? The FPGA on the radio performs all the high-rate processing, like up-conversion and down-conversion, and it's really the brains of the radio: it controls the ADCs and DACs and manages communication with the host computer. And the code for the FPGA is open source. At Ettus, the driver for the radio is open source, as is the FPGA code, so everything is open source, hosted on GitHub, and written entirely in Verilog. These are the FPGAs used on some of the different USRPs. The older Gen 1 and Gen 2 devices are some of the original USRPs from way back in the day, and their FPGAs were much smaller. The Gen 3 FPGAs, on the E310 and X310, are a lot larger. The X310 has the largest FPGA we ship, with a lot of space available for custom signal processing. We use GNU Radio with the USRP, and we've integrated a way to do FPGA processing from within GNU Radio, which we'll talk about. Let's go through an example to show the motivation for using the FPGA, looking at Welch's algorithm for power spectrum estimation. Here we have a GNU Radio flow graph, taken from GRC, with a USRP source block in the corner. Samples are coming in at 200 megasamples per second and being processed in a chain with an FFT, a complex-to-magnitude block, and then a moving average. All of that processing is being done on the host. So the host computer has to ingest samples at 200 megasamples per second, that is, 800 megabytes per second, a very high data rate, and then process those samples and do all of this math on every single block of samples that comes through. Much of the math in a lot of these algorithms can be parallelized.
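To make that flow graph concrete, here is a rough host-side NumPy sketch of the chain just described (FFT, magnitude-squared, then a moving average over periodograms), in the spirit of Welch's method. The FFT size, window choice, and averaging depth below are arbitrary demo values, not numbers from the talk.

```python
import numpy as np

def welch_psd(x, nfft=512, navg=16):
    """Host-side sketch of the FFT -> |.|^2 -> moving-average chain."""
    # Split the stream into non-overlapping segments of nfft samples.
    nseg = len(x) // nfft
    segs = x[:nseg * nfft].reshape(nseg, nfft)
    # Window each segment and compute its periodogram.
    win = np.hanning(nfft)
    spectra = np.abs(np.fft.fft(segs * win, axis=1)) ** 2
    # Average the most recent navg periodograms (the moving-average block).
    return spectra[-navg:].mean(axis=0)

# A complex tone sitting exactly on bin 100 should dominate the estimate.
n = np.arange(512 * 32)
tone = np.exp(2j * np.pi * 100 / 512 * n)
psd = welch_psd(tone)
```

Running each of those steps per block of samples at 200 Msps is exactly the CPU load the talk is describing; the point of RFNoC is to move stages like the FFT out of this loop.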
In the case of the FFT, that's a natural candidate to move to the FPGA, so we may want to do that. The transport between the radio and the host is already pretty saturated, and the host is already spending a lot of CPU cycles ingesting those samples, so we can alleviate some of that burden by moving some of the processing to the FPGA. In the case of a full-rate 200-megasample-per-second stream, that's 6.4 gigabits per second. Reducing the processing load is one reason to use the FPGA, but there's another important reason, and that's latency. A lot of algorithms cannot be implemented, or at least cannot easily be implemented, using only host-side processing. In 802.11 Wi-Fi, you have the SIFS timings; in Bluetooth, you have other very challenging response times. You can't meet those deadlines when the samples have to go from the radio to the host, traverse the whole kernel network stack, get processed by the application, and then come back out again. So for a lot of applications you have to use the FPGA; it's not just throughput, it's also latency. These are the different interfaces the host computer can use to connect to the radio, and each has its own limitation in terms of throughput. Even the fastest link, 10 Gigabit Ethernet, is limited to about 250 megasamples per second, and if you saturate that link, the host will struggle to keep up with all the samples coming in. So even a really wide link like 10 Gigabit Ethernet may necessitate the use of the FPGA. Since the radios have big FPGAs (the one pictured here is the X310, which has the largest FPGA we use on the radios), why don't we use it? Well, FPGA programming is difficult. It's fundamentally very different from C, C++, Python, or host-side processing. Has anyone programmed in Verilog or VHDL before? Are there any FPGA developers out there? Okay.
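The throughput figures quoted above can be sanity-checked with a little arithmetic, assuming the common 16-bit-I plus 16-bit-Q wire format (4 bytes per complex sample):

```python
# Back-of-the-envelope check of the transport numbers above, assuming the
# usual 16-bit I + 16-bit Q wire format (4 bytes per complex sample).
sample_rate = 200e6           # complex samples per second
bytes_per_sample = 4          # int16 I + int16 Q
byte_rate = sample_rate * bytes_per_sample   # bytes per second
bit_rate = byte_rate * 8                     # bits per second

print(byte_rate / 1e6, "MB/s")   # 800.0 MB/s, matching the slide
print(bit_rate / 1e9, "Gb/s")    # 6.4 Gb/s
```

So a single full-rate channel already consumes most of a 10 Gigabit Ethernet link before the host has done any math at all.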
So I'm sure you'd agree that FPGA programming is fundamentally different from host programming. Not everyone is familiar with it, and there's a learning curve to get up to speed. A lot of times, a design team spans three domains. You have software experts, who are experts at implementing something in Python or C++ and so on. You have algorithm experts, who might live in the MATLAB domain and focus on the math, the algorithm itself, and its performance. And then you have FPGA experts, who are experts at fitting a design into an FPGA, meeting timing, and building hardware to do the task. A lot of times those domain skills don't overlap, so FPGA development can be hard and time-consuming. The goal with RFNoC is to make FPGA acceleration more accessible. It aims to provide a way for you to write an RFNoC block that implements your logic and functionality, insert it into the framework, and let the framework take care of all the glue logic, the data handling, and the data flow, and integrate it with the host, so that you can use the radio's API to control and operate your block. In the past, the FPGA designs were more monolithic: you'd have to look at the entire design and figure out where to put your code. With RFNoC, the goal is that the framework handles a lot of that plumbing for you, so you can really just focus on your own application. RFNoC is GPL-licensed, specifically LGPL, so for the modules that you write, you're under no obligation to release their source code. It's fully integrated with GNU Radio, but you don't have to use GNU Radio; you can also use RFNoC from C++. A lot of stacks don't use GNU Radio at all, for example the cellular stacks OpenBTS and srsLTE, and if you're not using it, that's fine.
RFNoC can be used from C++ as well. This is what the architecture looks like. At the bottom you have the FPGA domain, and the dashed red line is the boundary between the host and the FPGA. On the FPGA, you have an Ethernet MAC, which is the block that interfaces to the 1 or 10 Gigabit Ethernet link to your host computer. Then there's a crossbar. The crossbar is a packet-switched crossbar, basically a network switch, which is where RFNoC gets its name: RF Network-on-Chip. Packets of IQ samples are routed throughout the chip via the crossbar, and all of the blocks are connected to it. Whenever a block wants to talk to another block, pass samples to or from another block, or send samples to or receive samples from the host computer, it goes through the crossbar. There's a radio core block, which represents the interface to the transmit and receive chains and takes care of all the plumbing to the radio hardware. And then there are the computation engines, more commonly known simply as RFNoC blocks. This is where you implement your logic, whatever that might be: a turbo decoder or some other function. As I said, the FPGA connects to the RF front end and controls the functions of the radio, and the host interface isn't limited to Ethernet; other interfaces are supported as well, like PCI Express. An RFNoC block is something you can write yourself or obtain from other sources, which I'll review in a minute, and the crossbar interconnects all of the devices on the FPGA. On the host side, the UHD driver, using the RFNoC framework, allows you to configure and use your block, and provides an API to control and access it from your C++ program or your GNU Radio flow graph. You can do that in C++, in Python, or in GNU Radio, which under the hood is C++ and Python. So let's look at an example of plotting spectrum.
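The packet-switched crossbar idea can be pictured with a tiny toy model: every packet carries a destination address, and the crossbar simply forwards it to whichever port is registered under that address. This is purely illustrative; the real RFNoC transport (its CHDR packet headers and routing tables) is considerably more involved.

```python
# Toy model of the packet-switched crossbar described above. Addresses and
# block names here are made up for the demo.
class Crossbar:
    def __init__(self):
        self.ports = {}                 # address -> block callback

    def connect(self, addr, block):
        self.ports[addr] = block        # register a block on a port

    def route(self, dst_addr, payload):
        self.ports[dst_addr](payload)   # forward packet to its destination

received = []
xbar = Crossbar()
xbar.connect(0x02, lambda pkt: received.append(("fft", pkt)))
xbar.connect(0x00, lambda pkt: received.append(("host", pkt)))

xbar.route(0x02, [1 + 2j, 3 + 4j])   # radio core -> FFT block
xbar.route(0x00, [0.5, 0.25])        # FFT block -> host computer
```

The key property the talk relies on is visible even in the toy: any block can reach any other block (or the host) without a dedicated wire between them.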
The radio core is now represented by that RFNoC radio block in the top corner. Notice this is still GNU Radio, but the block names have changed, because you're now using blocks from the RFNoC library. That library, provided by Ettus Research when you install it, exposes all these different blocks that live on the FPGA. It's a little hard to see, but the lines between the blocks have become green, which indicates that the data flow between those blocks is on the FPGA, not on the host. Say the radio core is sending samples up to the host: we have the RFNoC radio block on the left, and the arrow between the radio block and the stream-to-vector block is dashed (it's a little hard to see), indicating that the data flow is crossing a domain boundary, going from the FPGA to the host. The rest of the processing, for now, is on the host. Now say we want to move the FFT to the FPGA to accelerate it. We replace the FFT block that was running on the host with an RFNoC FFT block, and notice the arrow between the RFNoC radio block and the FFT block is now green, again indicating that the data flow is on the FPGA. With the FFT block in place, the data flow looks like this: the radio core receives samples; the samples go to the crossbar; they're routed to the desired block, in this case the FFT block; the FFT block performs its FFT and sends the samples back out to the crossbar; and they're routed to their destination, which in this case is the host computer. The rest of the processing in the flow graph occurs on the host. If you want to move additional functions in this flow graph to the FPGA, like the logarithm, you can do that by adding additional RFNoC blocks. This is what the blocks look like when you zoom in a little closer. All of the communication between blocks is packetized.
So there's a packetizer and a depacketizer in every block, and all of the modules shown in green are provided by the RFNoC framework; you don't have to write those. The packetizer and depacketizer packetize the IQ sample data for transport across the crossbar. There's a FIFO for flow control, and the FIFO serves another purpose as well, for clock-domain crossings, which I'll talk about in just a second. Then there's a TX interface and an RX interface that handle communication with the rest of the radio. In the example from the previous slide, samples come into the radio core block and are received; they go to the crossbar and then come into the FFT block, where they're depacketized. Again, there's a FIFO for flow control, and then they go to the FFT itself, which is in the pink box. That IP can come from any source: you could write it yourself, it could come from Xilinx, you could get it from opencores.org, or wherever. The only requirement on your logic is that it uses AXI-Stream interfaces. There's a slide coming up where I'll go into a little more detail about AXI, but it's an industry standard for a point-to-point link between modules, and most third-party IP out there, like IP from Xilinx, supports AXI. So as long as your IP speaks AXI, if you will, you can insert it into the RFNoC block, in the pink box here, and connect it to the rest of the framework. So where do you get this IP? Before we get to that, let me go through a different example: a cognitive radio example. This is a hypothetical case where we want to do some cognitive radio and control a lot of it from the FPGA, not just from the host. The radio core, say, is receiving samples, and those go to the FFT block, which might come from, say, Xilinx.
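The essence of the AXI-Stream requirement is the valid/ready handshake: a word moves only on a cycle where the producer asserts valid and the consumer asserts ready. Here is a minimal cycle-by-cycle simulation of just that handshake in Python; it's illustrative only, since real AXI-Stream also carries signals like tlast and tkeep.

```python
# Minimal simulation of the AXI-Stream valid/ready handshake: data transfers
# only on cycles where both the producer's valid and the consumer's ready
# are high. Illustrative sketch, not the full AXI-Stream protocol.
def axi_stream_transfer(data, ready_pattern):
    """Return (cycle, word) for each completed handshake."""
    it = iter(data)
    word, valid = next(it), True
    transfers = []
    for cycle, ready in enumerate(ready_pattern):
        if valid and ready:
            transfers.append((cycle, word))     # handshake completes
            try:
                word = next(it)                 # producer offers next word
            except StopIteration:
                valid = False                   # nothing left to send
    return transfers

# Consumer stalls (ready=0) on alternating cycles: words still arrive in
# order, just spread over more cycles; nothing is lost or duplicated.
done = axi_stream_transfer([10, 20, 30], [1, 0, 1, 0, 1, 0])
```

That back-pressure behavior is exactly why the framework's FIFOs compose cleanly with any IP that speaks AXI-Stream.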
An FFT is calculated, and that spectrum is sent to the spectrum-policy block, which could be some kind of soft processor or other logic that looks for energy in a bin, or applies some other criterion, to decide that something we're interested in is happening in the spectrum. When that happens, a trigger is sent to the TX modulator block; when that block sees the trigger, it requests a payload of samples to transmit from the host computer, and goes and transmits them. So this is an example application of RFNoC. Notice that the RFNoC blocks, shown in orange, don't all have to come from the same source. In this example, one comes from Xilinx; another might come from Vivado HLS, which I'll talk about in a moment; or they could be something you wrote yourself or took from opencores.org. The blocks can come from all these different sources. Like I said, you can use the built-in Ettus Research library: when you install RFNoC, a number of blocks come with it, I think about 14 or so right now, supporting common DSP functions like FFTs and window functions; there's a signal-generator block, and a bunch of other blocks that come with the framework. You can of course write your own in Verilog or VHDL; most of the tools are dual-language these days, so you can use either. You can use opencores.org, an open-source repository for hardware designs, and obtain a block from there. You can use third-party IP from Xilinx. You can use Vivado HLS, a tool that will generate Verilog or VHDL from a C++ module, so if you're not interested in writing Verilog or VHDL from scratch, you can use that tool to generate it from C++. And there was a challenge we ran last year with Xilinx: three teams were selected, and their code is available on GitHub.
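The spectrum-policy idea above can be sketched in a few lines: trigger whenever any FFT bin's power crosses a threshold. The bin index and threshold below are invented for the demo; a real policy block would of course implement whatever criterion your application needs, in FPGA logic.

```python
import numpy as np

# Sketch of the "look for energy in a bin" policy described above.
# The threshold value is a made-up demo number.
def energy_trigger(spectrum_db, threshold_db=-60.0):
    """Return the indices of bins whose power exceeds the threshold."""
    return np.flatnonzero(spectrum_db > threshold_db)

noise_floor = np.full(64, -90.0)   # 64-bin spectrum at a flat noise floor
noise_floor[37] = -40.0            # a strong signal appears in bin 37
hot_bins = energy_trigger(noise_floor)
```

In the hypothetical flow graph, a non-empty result like this is what would raise the trigger line to the TX modulator block.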
So if you're thinking about using Vivado HLS, you may want to look at their code; it's all posted on GitHub. I don't remember all three of the projects, I think one was an ATSC decoder, but they implemented three different systems using Vivado HLS, so there are C++ modules there that generate RFNoC blocks. There's also HDL Coder from MathWorks, where you can take M-files or Simulink models and generate Verilog or VHDL from them. When you build the FPGA image for use with RFNoC, there are a couple of toolchains. The older radios use ISE, and those radios don't support RFNoC. RFNoC is only supported on the E-series, the X-series (the X310), and the N300 and N310, and those use Vivado, which is the current toolchain that Xilinx supports. If you're using the E310 or E312, you can actually use the free WebPACK edition, so you don't need a paid license; you still need a license file, but it doesn't cost anything. The larger FPGAs in the X-series and the N300-series do require a paid license. Once you have the Xilinx tools installed, you build the FPGA image from the command line. We provide makefiles in the UHD repository when you install RFNoC, and you invoke the build process from the command line. There's also an optional GUI, as you see at the bottom of the slide, which is basically a wrapper around the command line. You can use the GUI to click through and select what target you'd like and what blocks you want to add to your design, and then go and build the design. This is what the GUI tool looks like: you select a target on the left, you pick from the available Ettus blocks in the middle, you bring them over to the right side, which shows all the blocks in your design, and then you generate the bit file, the FPGA image. It will show you the command line it's going to use to generate it.
You can take that command line and copy-paste it into a script or invoke it manually on the command line if you prefer, or the GUI tool can just invoke it directly. And with that, we'll talk next about fosphor, and I'll turn it over to my colleague, Nate. So, many of you are probably familiar with an out-of-tree module for the GNU Radio toolkit called gr-fosphor. It's an emulation of an RTSA-style spectrum visualization. Originally, gr-fosphor was designed to run on a GPU, using OpenCL and OpenGL for acceleration. It's great for fast signals; in the demo we have running here, we're looking at Wi-Fi and Bluetooth, so really short intervals. There's also an RFNoC variant, and that's what's running now. It was created by Sylvain Munaut; you can find him on Twitter as @tnt, and on the Osmocom wiki pages, along with an additional video demo of fosphor. So this is the GPU fosphor, which you're probably familiar with, running with a B200, so we're looking at 50 megasamples per second, 50 megahertz of bandwidth, at a given time. On the left here is a bunch of push-to-talk traffic: police, fire, EMS, business-radio type stuff. Then we see various LTE channels, along with a pair of WCDMA channels there. Now, this is all being processed on the host: the CPU has to bring in the samples and shuffle them off to the GPU, where the FFT and processing are done. That can be challenging. It works at 50 megasamples per second, and a newer NVIDIA 980 will do 400 megasamples or so. But one point to make: the data transport rate from the radio to the host is 1.6 gigabits per second at 50 megasamples per second. This next view is also 50 megahertz at the same center frequency; however, now it's the RFNoC version.
Instead of running at 1.6 gigabits per second, the transport rate from the radio to the computer is around 4 megabits per second, an incredible reduction in the bandwidth required. One interesting note about this screenshot, a neat little historical thing: if you look in the middle, there are several small peaks between the WCDMA channels and the LTE channels, and here they're gone. The radio and the antenna are sitting in the same spot, but they're gone. Where did they go? Does anybody have any idea what those missing signals are? If you guessed GSM, you'd be right: the first screenshot is from before January 1st, 2017, and this one is from after AT&T shut off their GSM network. The RFNoC version of fosphor offers various shadings, and here are a few views of those. Now, this is the same center frequency, but instead of looking at 50 megahertz of bandwidth, we're looking at 200 megahertz. If we were streaming that 200 megahertz over to the host, it would be 6.4 gigabits per second; doing it with RFNoC, it's still only about 4 megabits per second, so it's even possible to stream this over the Internet. Here's an example of a Wi-Fi channel, and you can see some Bluetooth and, I think, a mouse or something, little nRF24 modules. So this is the RFNoC flow graph. Starting with the radio block on the left, you'll note, as Neel mentioned, the green lines between the blocks, indicating that the data is being transported on the FPGA. First, in this example, we're bringing in 200 megahertz off the ADC, and then we go through a DDC, the digital down-converter. This is a decimation stage within the FPGA that brings the stream from a 200-megahertz rate down to a 50-megahertz rate.
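A DDC stage like the one just mentioned can be sketched on the host in a few lines: mix the band of interest down to baseband, low-pass filter, then keep every Nth sample (here N=4, i.e. 200 Msps down to 50 Msps). This is only a conceptual NumPy model of what the FPGA block does; the filter length and design below are arbitrary demo choices.

```python
import numpy as np

# Conceptual sketch of a digital down-converter: mix, low-pass, decimate.
def ddc(x, freq_offset, decim, ntaps=64):
    n = np.arange(len(x))
    shifted = x * np.exp(-2j * np.pi * freq_offset * n)   # mix to baseband
    # Windowed-sinc low-pass with cutoff at the new Nyquist rate.
    t = np.arange(ntaps) - (ntaps - 1) / 2
    taps = np.sinc(t / decim) * np.hamming(ntaps)
    taps /= taps.sum()                                    # unity DC gain
    filtered = np.convolve(shifted, taps, mode="same")
    return filtered[::decim]                              # keep every Nth

# A tone sitting at the tuned offset lands at DC after down-conversion.
x = np.exp(2j * np.pi * 0.1 * np.arange(4096))
y = ddc(x, freq_offset=0.1, decim=4)
```

Doing this on the FPGA is what lets the rest of the chain, and the transport, run at the reduced rate.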
Then we run an FFT windowing function; in this one, I think we're running a Blackman-Harris window. Next we go through an FFT, which computes the Fourier transform. Then it goes into the RFNoC fosphor block, which takes that vector of bins and generates the visualization. That's sent through a couple of FIFOs for buffering, and also through a copy block once it's on the host for additional buffering, and then on to the actual fosphor display block, which renders what you actually see and is one of the only things running on the host. This could run on a Raspberry Pi. By comparison, to take in a 200-megahertz sample stream you basically need a high-end i7 CPU just to keep up with it; this can run on a Raspberry Pi. We're currently running the demo here in the Wireless Village to provide a spectrum monitoring service for the Wireless Capture the Flag contest. There's a physical knob on the desk, and Neel is going to pick it up and hold it. You can come up and tune it around, and if you press it, there are predefined frequencies, so you can hop around to some interesting-looking signals. Right now we're looking at 2.4 gigahertz, and if you click it a few times, you'll hop around and find the 700-megahertz LTE and various other signals. Come up and play with it, twist it; if you press and hold it, it will adjust the gain, so you can turn the gain up and down right there. So, in summary: with RFNoC, we're trying to make FPGA acceleration more accessible. That comes with a caveat, though. With Vivado HLS, you're able to convert C and C++ into VHDL or Verilog code, but if you don't know FPGA development, Verilog, or VHDL, there's still a slight barrier to entry. It's not a fully automated framework; you do have to understand a little bit about FPGA development. It's tightly integrated with GNU Radio, but you do not have to use GNU Radio; you can use it with C++ and pure UHD.
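Going back to why the fosphor transport stays so small: rather than shipping raw spectra, many FFT frames are folded into one compact intensity image (power level versus frequency bin), and only that image goes to the display. Here is a toy NumPy sketch of that idea; all of the sizes and dB ranges below are illustrative, not fosphor's actual parameters.

```python
import numpy as np

# Toy phosphor-style histogram: fold many FFT frames into one small
# (power level x frequency bin) intensity image. Sizes are made up.
def phosphor_histogram(frames_db, nlevels=64, db_min=-100.0, db_max=0.0):
    nframes, nbins = frames_db.shape
    hist = np.zeros((nlevels, nbins), dtype=np.uint32)
    # Quantize each dB value to a display row, then bump that pixel's count.
    rows = (frames_db - db_min) / (db_max - db_min) * (nlevels - 1)
    rows = np.clip(rows, 0, nlevels - 1).astype(int)
    cols = np.arange(nbins)
    for frame_rows in rows:
        hist[frame_rows, cols] += 1
    return hist

frames = np.full((1000, 256), -90.0)   # 1000 spectra of flat noise floor
frames[:, 128] = -30.0                 # a persistent carrier in bin 128
img = phosphor_histogram(frames)       # one small 64x256 image
```

A thousand spectra collapse into one small fixed-size image, which is the intuition behind streaming the display at megabits per second instead of gigabits.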
There's a built-in library of blocks: OFDM, FFT, FIR filters, a signal generator, the fosphor blocks. Those blocks are all portable between users, so if you develop something on an X300, with some slight modifications you can port it to any other supported system; generally, if it works on one, it ought to work on all of them. It's completely open source, as Neel mentioned, and licensed under the LGPL, which means you do not have to release your IP if you choose not to. We have a knowledge base at kb.ettus.com with a great getting-started guide, and there's also a companion video that walks you through building your first RFNoC block. Just a little plug: there's a GNU Radio Conference coming up, in Las Vegas, or rather Henderson, this year, and we'll be running three four-hour workshops on RFNoC, where you'll go all the way through building your first FPGA image and running it. We'd like to give a special thank-you, first and foremost, to Zero_Chaos and Rick Mellendick and the entire Wireless Village crew. The service these folks provide is outstanding; it's pretty rare to find a group of people so dedicated and so thoroughly interested in teaching the world about wireless security. As soon as you take your data off the wire and put it in the air, it's free for everybody to take; there's no security, and a lot of times RF security is overlooked. I'd also like to thank the core RFNoC developers and architects, Jonathon Pendlum and Dr. Martin Braun, who architected and built this wonderful framework that makes it much easier for you to put your custom DSP into the FPGA. Thank you for coming; I appreciate everybody's attention, and that's it. Would you like any last words, Neel?