 Now that we've implemented the software and tested the counter in software, what we're going to do is implement it in hardware. And specifically, we are going to use this board. This is a Digilent Cmod A7 board. The link is down below. Here is the webpage. There are actually two versions. There is a 15T and a 35T. You will actually want to get the 35T simply because it has more gates in it. So there is more room for stuff to put in. It's got two buttons. It's got two single color LEDs and it's got a three color LED. You can program it through USB. So this is quite a nice board. It's also relatively cheap compared to all the other FPGA evaluation boards. So that's kind of nice. The Digilent page has links to the datasheet and to the Schematics, which is quite nice. There is also a page on how to install the IDE that we are going to use. This is actually Xilinx's IDE called Vivado. So there is getting started with Vivado, installing Vivado, and especially installing Vivado board files for Digilent boards because there is some configuration that you do for the board itself to let Vivado know what kind of FPGA is on it. There is also a programming guide which contains some other useful information, especially some configuration information. This is really important when you want to program the board and essentially get the program to stick so that you can remove USB from the board and then just pop it into your breadboard, power it on, and hey, your hardware configuration is downloaded to the FPGA automatically and it just runs. So this is a very important page to know about. I have also created a reference card to the board that we are going to use. You can see that we have the pins down here, along with which FPGA pin those correspond to, and then some extra information about the pins. There is this connector up here called PMOD or modules, or pluggable modules, or plug for modules, or whatever. Digilent sells a whole bunch of modules that you can stick in here. 
They are not cascadable like you would normally expect with, say, Arduino boards, but nevertheless, if you want some sort of hardware capability like GPS or Wi-Fi, this is kind of convenient. We are not actually going to do any of that. We are going to connect it to an Arduino to stimulate the hardware. There is also some information about the buttons and LEDs. The other thing is that the Digilent board has 512K by 8 SRAM on board. It is 10 nanosecond SRAM. Here is some information about the address lines, the data lines, and so on. There is a link to the datasheet. Also, the reference card contains all the documentation, which is important. There is a little legend down below and a little warning about how to hook up external power. The basic idea is that you either power the board from USB, or you power it from this external pin, not both at the same time. Because basically there is a diode connecting the two, and if you connect power on both ends, current will flow through the diode, unimpeded by any resistance, which is not so good. There is also a recommended voltage for powering the board through the external voltage input: anywhere from 3.5 to 5.5 volts. The other thing that I should mention is that the board is strictly 3.3 volts. Now you can configure the pins for other voltage standards less than 3.3 volts, but we are just going to be using 3.3 volts. Do not connect any 5 volt equipment to this board because you will fry it. That is really bad. That is another important consideration since we are going to be stimulating this using an Arduino. You will need to get a 3.3 volt version of Arduino, such as the Arduino Due, which I highly recommend. Here is the Digi-Key link to it. You can see that it is $89. Again, not that expensive compared to a lot of other FPGA development boards. The version of Vivado that you will download is the WebPACK Edition. It is free. The reason that it is free is that it supports only a limited number of chips.
There is the Artix-7, the Kintex-7, and a couple of other families. Generally, the Kintex parts are larger than the Artix, I think, and I hope that the Artix-7 that we are going to use, the 7A35T, is large enough to implement our CPU. That would be awesome. If not, well, we are going to have to go back to the drawing board. But in any case, the WebPACK Edition is free. The installation instructions, like I said earlier, are in this page from Digilent, which is really nice. Again, all the links are down below. Let's get started with a project. I am going to start up Vivado. Here it comes. Great, we are going to create a new project. I am going to put it in my Plug1 directory. Now, you might notice that the directory is F colon. Yes, I am using Windows. You can run Vivado on Linux. I haven't tried it, but in any case, my main machine is Windows, so I am just going to run it under Windows. The project name, I am just going to call it Plug1A. This is basically the first revision. Plug1A. I am going to create a project sub-directory because Vivado creates a whole bunch of directories in there, and I want to keep things clean, so the sub-directory is going to be called Plug1A, and then all the other sub-directories that Vivado creates are going under there. Next. This is an RTL project. RTL stands for Register Transfer Level. This is basically the level that we are programming at when we use SystemVerilog: the register transfer level. We do not specify the sources at this time because we will put them into the project a little later. Now, here we go with selecting the parts. Remember earlier, I said that Vivado had some board files, which are important. If you go to boards and you go to vendors, we have digilentinc.com. Normally, if you download Vivado and you did not download the board files, you would only get Avnet and Xilinx boards, so that is important to do. Digilent Inc. The display name is going to be the Cmod A7-35T. That is great. There is only one choice.
We pick that, and we go to the next, and we go to finish, and it creates our project. Let us just open this up. I am a complete newbie at FPGA programming. There are probably many of you who know how to do this a lot better than I do. I may stumble through this a little bit, but we will see how far we can get. From what I have been able to tell, you go through a flow. The first thing is you add sources. Let us go to add sources, and we want to add design sources. We are going to add files. When we add files, we are going to ... Where are my files? They are on F colon. Plug1. There we go. I have copied this from the Linux machine. This is exactly the same as we had in the earlier video. We are going to add both of these files. We are going to hit OK. We are not going to copy the sources into the project because why would we have an extra copy lying around? We might edit one copy and then wonder why our edits did not stick. That is it. Finish. Great. If you look under design sources, you will see syntax error files. These are files that contain syntax errors. Why do they contain syntax errors? Well, let us click on one, and you can see that the type says Verilog. The type is not Verilog. The type is SystemVerilog. That may have something to do with the file extension. I do not know. Once we change all these files to SystemVerilog, you will see that there are no more errors. Great. There are the design sources. The next thing that we do is ... There is an IP integrator. We do not need ... There is an IP catalog. We do not need any of that. IP stands for intellectual property. It is a term that rubs me the wrong way. It basically means a library of hardware. For example, an SD card interface or a memory interface or something like that. Those are all called IPs or IP blocks. I do not like it at all. You see my library over there? That is all my copyrights. Those are my copyright blocks. It is kind of weird.
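Going back to the project setup and the file-type fix: the same steps can be sketched for the Vivado Tcl console roughly as follows. This is only a sketch under assumptions. The two source file names are placeholders (the real names are not given here), the `.v` extension is a guess based on Vivado treating the files as plain Verilog, the part number assumes the XC7A35T in the CPG236 package that the Cmod A7-35T carries, and the board lookup assumes the Digilent board files are installed.

```tcl
# Sketch only: counter.v and defs.v are placeholder file names.

# Create the project Plug1A in its own sub-directory under F:/Plug1.
create_project Plug1A F:/Plug1/Plug1A -part xc7a35tcpg236-1

# Select the Cmod A7-35T board. The exact board_part string depends on the
# installed board-file version, so look it up instead of hard-coding it.
set_property board_part [lindex [get_board_parts *cmod_a7-35t*] 0] [current_project]

# Add the design sources by reference; add_files does not copy them
# into the project, so there is only one copy to edit.
add_files -norecurse {F:/Plug1/counter.v F:/Plug1/defs.v}

# Mark the sources as SystemVerilog so the syntax errors go away.
set_property file_type SystemVerilog [get_files *.v]
```

If the board lookup returns nothing, the board files are probably not installed, which matches seeing only the Avnet and Xilinx vendors in the GUI.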
We are not going to do any simulation because we have already done so. We have already run our tests using our Verilator test bench. The next thing that we have to do is ... I really do not know why this is not done automatically for you when you select a board. You need to go to Tools and ... It is not here. Great. We need to open the elaborated design. Elaboration is another word for compiling your SystemVerilog files. We are going to open the elaborated design first. Great. This is a footprint of the device. It tells you where all the pins are. We are going to use that in a moment, but not quite yet. Now, if you go to Tools, you can see Edit Device Properties. This is the really important part. You go to Edit Device Properties. Under General, this is all mentioned in the Digilent Cmod Programming Guide. Let us see. We go to Enable Bitstream Compression. You set that to True because apparently bitstreams are big. The bitstream is the thing that actually programs the FPGA. In terms of configuration, change the configuration rate to 33 MHz. For configuration modes, this tells Vivado what hardware is being used to program the FPGA. The bitstream is typically stored in maybe a flash memory somewhere. Vivado needs to be able to configure the FPGA to tell it how to load itself, basically. We select Master SPI x4. I will just close. I will select that. Hit OK. That was one pretty important thing. Let's take a look at DRC, or actually schematic. This is what Vivado has determined our hardware description describes. You can see that it's fairly straightforward. We've got some inputs and we've got some outputs. We've got a bunch of flip-flops. This is our address register. You can see that we have an adder here. This is a thing that adds one. Then we've got two multiplexers, which are driven from our command. One multiplexer is for the clock enable on these flip-flops, and the other multiplexer is for the data.
You can see that the data is multiplexed between adding one, the load address, or zero, or nothing. You can see that the clock enable is multiplexed between ones and zeros, depending on the command. This is basically a schematic representation of what we wrote in SystemVerilog. Great. If you look down below, and if you don't see this, up here there is a layout selection. You want to select I/O planning. I/O planning is what you do when you say which inputs and outputs go to which pins. The first thing that we are going to do is, let's see, I'm pretty sure that there was another step that I'm actually missing. Let me go to Tools, Edit Device Properties again. Let me just check something. Startup. No. Yes. Under Configuration Voltage, you also have to specify what is connected to the FPGA's configuration bank. Configuration bank voltage selection should be set to VCCO. Again, I really don't know what these mean, and I'm pretty sure that there is something in the thousands of pages of data sheets that Xilinx has that explains this. And Configuration Voltage is 3.3 volts. Now we've done it. If we go down here to the I/O ports, we can see that there is an I/O standard that you can select. The default is apparently low voltage CMOS 1.8 volts. We want to change that. We're going to change everything to low voltage TTL. What's the difference between low voltage CMOS 3.3 volts and low voltage TTL? They're both 3.3 volts, but CMOS is really for driving very light loads, basically up to about 100 microamps. Low voltage TTL is for when you want to drive loads that are maybe up to about 20 times higher than that, like maybe 2 milliamps or so. So we're going to select low voltage TTL for pretty much everything. Low voltage TTL, low voltage TTL. And then we've got two scalar ports. The scalar ports are 1-bit vectors essentially, and of course a 1-bit vector is like a scalar. So we're going to select also low voltage TTL on these things. Great.
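For reference, the device properties and I/O standard picked so far correspond to constraint-file (XDC) lines along these lines. This is a sketch, not a dump of the actual file: the property names are the standard Vivado ones, but the port names (`address`, `load_address`, `command`, `clock`, `reset`) are assumptions, since the real SystemVerilog port names are not shown here.

```tcl
# Configuration settings from Tools > Edit Device Properties.
set_property BITSTREAM.GENERAL.COMPRESS TRUE [current_design]
set_property BITSTREAM.CONFIG.CONFIGRATE 33 [current_design]
set_property CONFIG_MODE SPIx4 [current_design]
set_property CFGBVS VCCO [current_design]
set_property CONFIG_VOLTAGE 3.3 [current_design]

# I/O standard for every port (port names are hypothetical).
set_property IOSTANDARD LVTTL [get_ports {address[*]}]
set_property IOSTANDARD LVTTL [get_ports {load_address[*]}]
set_property IOSTANDARD LVTTL [get_ports {command[*]}]
set_property IOSTANDARD LVTTL [get_ports clock]
set_property IOSTANDARD LVTTL [get_ports reset]
```

Keeping these in the constraints file means the settings travel with the project instead of having to be clicked through again.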
Now we have to decide what pins we want to use. So if we open up address, we're going from 10 down to 0, and we have to select what package pin. For whatever reason, board part pin isn't coming up. I have no idea why. So let's take a look at our handy reference card. Here we have a whole bunch of pins that we can use. It would be nice to put the address lines together. So let's just start from the top, and we'll declare this pin as A10. The next pin down is A9 and so on. So we have M3, L3, A16. Oops, I have just somehow changed the direction. There we go. So again, let me make this smaller, move this over, and now we just start typing it in. M3, L3, and so on. And the last pin is J1. Okay, that's pin 11 right here. So we've defined the address lines. Great, I'll close that. We have the load address. It would be kind of nice to have the load address on the exact opposite side. So we'll start with pin 48 here as bit 10 and go down to bit 0. So we start with V8 and so on. V5 and U4, great. We also have the command lines, command 1 and command 0. I'm going to, let's see, I'm going to put those on the next two lines. No, actually, I'm going to put them all the way down here, T3 and R3. And I'll explain why in a moment. So T3 and R3. Okay, now we have some scalar ports, we have reset and clock. Now, I'm going to define the clock first, and here's why. If you look over here in the clock column, you can see MR and SR. And down here in the legend, it says multi-region clock capable and single-region clock capable. We want, and we actually need, a multi-region clock capable pin for the clock. The reason is that FPGAs, at least for Xilinx, are divided into regions. And clocks, or clock pins, are preferably bound to one region, but there are some clock lines that can go to multiple regions. Now, because we really have no idea how the tool is going to place all of our modules, we're going to pick a multi-region clock. Now, again, this is probably not the right way to do it.
From what I can read, clock specification is a whole science, which I don't have. So anyway, I'm going to specify this pin over here, W5, as the clock, W5. And that pretty much explains why I didn't want to put the two command lines on V4 and W5, because then my multi-region clock would not be accessible. Yeah, I could have used these ones down here, I suppose, but anyway. I just want the inputs all on one side. And the reset line we're going to put on V3. V3. "Command 1 and Command 2 are already placed and will be unplaced first." No? Did I screw that up? Okay, so this thing I've seen several times already. And I think it must be a bug. There is absolutely no reason why it should say that some other line is already placed and will be unplaced. So first I'm going to unplace these. Okay. So if I say T3, it gives me this buggy error. If I go to name and I change the order of the names, then I select this and say T3. It works. Buggy software. And believe it or not, Xilinx charges thousands of dollars for the fully unlocked version of this. So I have no idea why they actually charge for the software at all, considering that they don't make their money in software. They're a hardware company. They should be charging for the chips, which they are. So anyway, the reset line. Okay, where does the reset line go? It's just below the clock. So at V3. V3. Excellent. Okay, all of our lines should be now allocated. Yes, they are all allocated. So we save the constraints file. If we go to sources, you can see that we now have a constraints file, which basically constrains the design to placing certain ports at certain pins. Great. So let's go to report DRC and see what happens. So DRC is the design rule check. So we want to make sure that we didn't do strange things like connect inputs to inputs or outputs to outputs. So we're just going to run this and no violations found. That's awesome. So once we've done that, we are going to run synthesis.
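The pin assignments saved to the constraints file would look something like the fragment below. Treat it as a sketch: the port names are assumptions, only the package pins actually named in the walkthrough are filled in, and which of T3 and R3 carries command bit 1 is assumed from the order they were mentioned. The pins skipped with "and so on" are left as comments to be filled in from the reference card.

```tcl
# Port names are hypothetical; only pins named in the walkthrough are shown.

# Address outputs, starting at the top of one side of the board.
set_property PACKAGE_PIN M3  [get_ports {address[10]}]
set_property PACKAGE_PIN L3  [get_ports {address[9]}]
set_property PACKAGE_PIN A16 [get_ports {address[8]}]
# ... address[7] down to address[1]: fill in from the reference card ...
set_property PACKAGE_PIN J1  [get_ports {address[0]}]

# Load address inputs, starting at pin 48 on the opposite side.
set_property PACKAGE_PIN V8  [get_ports {load_address[10]}]
# ... load_address[9] down to load_address[2]: fill in from the reference card ...
set_property PACKAGE_PIN V5  [get_ports {load_address[1]}]
set_property PACKAGE_PIN U4  [get_ports {load_address[0]}]

# Command inputs (bit order on T3/R3 is an assumption).
set_property PACKAGE_PIN T3  [get_ports {command[1]}]
set_property PACKAGE_PIN R3  [get_ports {command[0]}]

# Clock on a multi-region clock-capable pin, reset just below it.
set_property PACKAGE_PIN W5  [get_ports clock]
set_property PACKAGE_PIN V3  [get_ports reset]
```

Typing the lines directly into the constraints file is also one way around the "already placed and will be unplaced" dialog in the I/O ports table.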
And in the upper right here, we will see that we're running the synthesize design program. And it just keeps going. Great. So we can open the synthesized design. Now what synthesis means is it takes the register transfer design and converts it into a gate-level design. So things like the multiplexers and the clocks get expanded. So if we open this, we can see, maybe if we go to where is synthesis schematic, you can see that this is a lot more complicated. So basically what it's done is it has broken down every single multiplexer and it's even placed buffers in there. It's broken up every flip-flop. So yeah, it's a lot more complicated. It's put in the pin buffers, the output buffers, and so on. So this is the gate-level version of our circuit. There's another tab here called clock resources, and you can see that some of the pins are labeled SRCC and MRCC. Again, those are single-region clock capable pins and multi-region clock capable pins. So you can see that we were quite clever in assigning our clock to a multi-region clock capable pin. If we didn't do that, then from my experience, a whole bunch of warnings are created. And in fact, what happens is to get from one end of the chip to the other, you will have lots and lots and lots of delays in your clock. Maybe as much as five or even 10 nanoseconds to get from one end of the chip to the other, as the signal has to pass through all these different multiplexers that go between the regions. So anyway, that's quite nice. Now again, we can report any design rule problems. So if we click on design rule and we click OK, no violations, that's awesome. So the next thing is to run the implementation. Now before we do so, there is one thing that we should do, and this is run the constraints wizard in the synthesized design. The reason that I say this is that when I tried this before, if you don't do this, implementation just sort of hangs.
Implementation is essentially taking all the gates and placing them on the chip: deciding where to put them, optimizing the routes, and so on. And basically it hung during optimization. I suspect that it's because it knew absolutely nothing about timing constraints. A timing constraint basically says, I think, that I want this signal from here to there to take no more than X nanoseconds. A lot of this, I have no idea what it does. I'm just going to go with what actually worked. So the first thing it says is that the clock has an undefined frequency. And I believe it really needs to know the frequency or the maximum frequency of the clock in order to be able to place the clock lines properly and ensure that getting from one end of the chip to the other can happen within the clock period. So we're just going to say 100 megahertz. Now we're not actually going to run our design at 100 megahertz. I think we're probably not going to run our design at any more than about 10 megahertz because any higher than that and you start running into high frequency design problems on boards and especially on breadboards. Now the reason for that is that as your clock frequency gets higher, your edge has to get sharper and sharper. And the sharper your edge, the more high frequency harmonics that edge requires in order to maintain the sharpness of that edge, sometimes up into the gigahertz range for a fairly sharp edge. And there's no way that you're going to run one gigahertz signals on a breadboard. So this is why on breadboards you typically want to limit your frequency to very low frequencies and as a rule of thumb that's maybe 10 megahertz, maybe even below that. But anyway, we're going to specify that we want ideally our circuit to be able to run at 100 megahertz. So if we go through the next things, there are no recommended constraints, there are input constraints here. I have not been able to figure this out.
I don't really know why the data signal is referenced to the positive edge because these are inputs. And why would an input happen after the input clock? I did notice that if you change this synchronization to source, then it looks a little more reasonable to me in that if you have input data and an edge, there's usually a setup time and a hold time. And I guess that's what this means. But I have discovered that you can click that off and just continue. And these are output delays. And again, for the same reason I don't really understand this diagram. Click that off. Just go to next, go to next, next, next, next. And those last few pages had to do with multiple clocks, which we don't have. We have a single clock. So finish. And what that did is it added some lines to our constraints file about the clocks. So now I'm just going to save the design, control S. Now I'm going to run the implementation. Synthesis is out of date. Okay, that's good. Probably because we added a constraint, maybe. So okay to launch synthesis first. Yes. Implementation will automatically start when synthesis completes. That's great. So we're going to re-synthesize. So there it goes, synthesizing the design. So that's great. And that's just going to keep going. Okay, synthesis complete. And now it is going to start the implementation. Okay, this is optimizing the design. So it's basically placing things and routing them. And this is the place where it hung before when you didn't actually specify the clock speed. Okay, now it's placing and routing. Okay, I guess optimization wasn't place and route. But this is definitely place and route. And presumably what it does during place and route is it tries to maintain the clock constraints that you set up. Because obviously you can come up with a really inefficient design that takes signals forever to get from one end to the other. Okay, that's it. We could open the implemented design and take a look at what it did. 
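The clock line that the constraints wizard adds for a 100 megahertz clock typically looks like the first command below. The two delay commands after it are only an illustration of the input and output constraints that were clicked off: the 2 nanosecond values are made up, and the clock and port names are assumptions.

```tcl
# 100 MHz target: 10 ns period, rising edge at 0 ns, falling edge at 5 ns.
# The port and clock name "clock" is hypothetical.
create_clock -period 10.000 -name clock -waveform {0.000 5.000} [get_ports clock]

# Illustration only, with made-up numbers.
# Input delay: the external device launches its data on the clock edge, so
# the data reaches the FPGA pin some time AFTER that edge (the device's
# clock-to-output time plus board delay).
set_input_delay -clock [get_clocks clock] -max 2.000 [get_ports {load_address[*]}]

# Output delay: how much of the clock period the external device needs
# after the FPGA's output arrives (its setup time plus board delay).
set_output_delay -clock [get_clocks clock] -max 2.000 [get_ports {address[*]}]
```

Seen this way, the wizard's diagram showing input data referenced to the positive clock edge is describing when the outside world's data shows up relative to the clock, not when the FPGA samples it.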
So you can see that this is the FPGA device. There are six regions. One of the region, one of these regions I think is unbonded, which means that there aren't any actual pins that go to it. But there are these other five regions. And if we take a closer look, here are some pins. Take an even closer look, see what it did. Okay, what's that? Is that a pin? That is a pin. That's another pin. So these are the pins and these are the routes. We can even zoom in even further and look at all sorts of crazy stuff. So here you can see that there is a logic block here having to do with the pin and there is a wire and it goes through this logic block and then it goes through this sort of crossbar thing and then it goes off and so on. So, and you can see that we're using, we're going across this region and to this region down into this region. And I think some of these lines here might be the clock lines, typically the multi-region clock lines run down the middle of the chip. And there are some outputs over here. So that's pretty cool. It doesn't really tell me anything. So this is interesting down here, the design timing summary. So you can see the worst negative slack is 7.0 nanoseconds and the worst hold slack is 0.2 nanoseconds. The worst pulse width slack is 4.5 nanoseconds. I'm not sure what slack is. I'm going to click on 7.0 nanoseconds and you can see that there is a path that goes from an address register C, that's clock, to an address register that's data. I don't really know what that means. Why would address register a zero clock? Oh yeah, of course, all the clocks are connected together. So this is from clock to data. I guess that would be the delay, total delay 2.9, logic delay 0.9 nanoseconds, net delay 2 nanoseconds. And the requirement is 10 nanoseconds. Ah, okay, so slack means how much of your clock period do you have left after all the path delays? 
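As a rough check on that reading of slack, using the numbers from the timing report (a requirement of 10 nanoseconds and a total path delay of about 2.9 nanoseconds), and deliberately ignoring the flip-flop setup time, clock skew, and clock uncertainty that the tool also accounts for:

```latex
\[
\mathrm{slack} \;\approx\; T_{\mathrm{required}} - t_{\mathrm{path}}
\;=\; 10\,\mathrm{ns} - 2.9\,\mathrm{ns}
\;\approx\; 7.1\,\mathrm{ns}
\]
```

That lands close to the reported 7.0 nanoseconds; the small difference presumably comes from the terms left out of this simplified version.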
So you can see here that the total delay is about 2.95, or about 50 picoseconds shy of 3 nanoseconds, out of the 10 nanoseconds that we've told Xilinx we require for the chip to go from end to end. So that's pretty cool. There's a warning here and basically it says, oh, we didn't specify input and output delays. Those are those clock constraints that I know nothing about. So presumably, I guess, if we put in some clock delays, we can get some better timing here. This says setup and hold. I don't know why there was a clock setup and hold. But yeah, this is a little bit of a mystery to me, but that's okay because we're just going to test it and continue. So now that we have run the implementation, the last thing that we need to do is generate the bitstream. In the Arduino world, this would be the flash image or the executable that you're going to download. The first thing we need to do is go to the bitstream settings. And under bitstream settings, there's -bin_file. You need to check that. There are two formats of file. There's a bit file and a bin file. The bit file is used to just load the FPGA from USB directly. That means that you can load the FPGA and run it. But when you power it off and power it back on again, the FPGA is blank essentially. The bin file is what's used to go into the flash. So the bin file gets programmed over USB into the flash memory of the board and then the board resets and the flash memory will load the FPGA. So this way you can disconnect it from USB, power it off, power it back on again, and the FPGA will immediately load itself from the flash and your hardware will go. So we want to make sure that that is checked. So we hit OK. Then we generate bitstream. So you can see that we're going through a flow where we're adding sources, we're elaborating, we are synthesizing, we are implementing, and we are generating the bitstream. So here we're doing write bitstream.
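The back half of that flow, from synthesis through write bitstream with the bin file enabled, can be sketched for the Vivado Tcl console as follows. It assumes the default run names `synth_1` and `impl_1` that Vivado creates for a new project; if the runs have been renamed, the names need adjusting.

```tcl
# Ask the write_bitstream step to also produce a .bin file for the flash
# (the same thing as ticking -bin_file in the bitstream settings).
set_property STEPS.WRITE_BITSTREAM.ARGS.BIN_FILE true [get_runs impl_1]

# Synthesize and wait for it to finish.
launch_runs synth_1
wait_on_run synth_1

# Implement (optimize, place, route) and continue through bitstream generation.
launch_runs impl_1 -to_step write_bitstream
wait_on_run impl_1
```

If a run has already completed, Vivado may want it reset before it will launch again, so this is best read as a summary of the steps and not a script to paste blindly.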
Now if you have any comments about what I might be doing wrong, and I'm sure I'm doing plenty wrong, or if you have any comments about what these clock delays actually mean, please log on to the EEVblog.com forum and go to Projects and find my project, which is building a CPU on an FPGA that can play Zork. Join that thread and comment away. I'm not enabling YouTube comments because we all know about YouTube comments. So all right, we have generated our bitstream. So we can view reports or we can do whatever. What we're going to do is we're going to open the hardware manager. Okay, so it says no hardware target is open. That is correct. That's because I have not hooked up my board to USB and I'm going to do that now. So I'm plugging in one end of the USB cable and I'm plugging in the other end of the USB cable. And the first time you do this, there will be a driver that gets installed and it takes a little while, but it will get installed. I've already done that, so I can just hit open target and auto connect. So this will automatically detect the board and you can see that indeed we have detected our XC7A35T FPGA. Now to program it, I can go down here to program device and we select the FPGA and we need to select the correct bitstream file. This is the .bin file, so we find it here and hit OK and hit program. Okay, that was fast. Ah, okay, sorry, my bad. What that did was it programmed the FPGA directly over USB. That's not what we wanted to do. We wanted to program the flash. So what you have to do, instead of program device, is add configuration memory device. And we're going to click on that. So again, this is in the programming guide. Yes, now this is in the programming guide for the Cmod board. Programming the Cmod A7 using JTAG is what we did when we programmed over USB, not directly into flash. This is basically directly into the FPGA. What you want is programming through quad SPI. The quad SPI is the flash chip.
So you can see that we're going to add configuration device and you can see that we're going to select Micron. And you want to select the N25Q. So we're just going to search for N25Q. And there's a bunch of them. We specifically want the N25Q32, 3.3 volts. Hit OK. Do you want to program the device now? Yes. Now we can specify the configuration file. It's the .bin file. Where is this? It is in Plug1A. Plug1A.runs, impl_1, and then the .bin file. So we hit OK. And all of this we can leave alone. Basically this is going to erase the configuration first. It's going to program the configuration. Then it's going to verify it. That's pretty good. So we'll just hit OK. And this will take longer because we're writing a lot of data into flash, which is always kind of a slow thing. Great. Programming was successful. Now at this point you would want to reset your board. So all I'm going to do is unplug the USB. I don't think I can actually reset the board here. I can refresh the target. What does that do? Well, it certainly doesn't reset anything. Refresh? Yeah. Okay. I don't think there's any reset here. So what I'm going to do is I'm going to unplug the USB. I suggest that you don't unplug it from the micro USB port on the board because that just adds more fatigue to that connector. You want to disconnect it at the computer. So I did that. And of course we get a whole bunch of errors. We don't really care. And then I'm going to plug it back in just to power it up. So when I power up my board, the red, green, and blue LEDs are lit because those LEDs are active-low LEDs, if you look at the reference guide. So I think by default they would just be on, which is good because it means that the FPGA is alive and doing something. Or is it? Well, in the next video what we're going to do is actually hook this up to an Arduino and write a little Arduino sketch that can send commands to the board and read back the address that it's in. So until then, take care.