Greetings, nMigen friends. Today, for part two, we're going to be talking about nMigen and creating a small piece of hardware with it so that we can get used to the concepts. And we're actually going to install it onto a physical board, namely the Lattice evaluation board that I showed last time. So let's get started.
The first thing I suggest you do is look in the description and take a look at the nMigen tutorial that I wrote. Again, it's more a series of comprehensive notes to myself, but it will teach you the basic concepts and get you oriented around what nMigen is and the various concepts that it uses.
So this code right here is what I consider to be the bare minimum of a Python file that you're going to use with nMigen. The basic concept is that of the Elaboratable, which you can consider to be roughly equivalent to a module. They usually have a one-to-one correspondence, but that's not always true. Remember that in HDL terminology, elaborate means to take your code, replace all of the definitions, and generate everything that needs to be generated to come out with something that can actually be compiled into hardware. And that's all Elaboratable means right here: it means that it has an elaborate method.
Now, the platform you don't have to worry about at this point. Platform means the hardware platform, the evaluation board or regular board and chip, that you're going to be using. Platforms contain resources, mainly pins and clocks and things like that. So if you are interested in connecting your nMigen hardware to the actual pins of the FPGA, that's where platform becomes important. But we're not going to do that right now. We're just going to look into simulating and formally verifying our module. So elaborate outputs a Module. This is a Python type annotation.
I strongly recommend that you use type annotations, because otherwise the only information about what type an argument is or what a function returns is in comments, and those are just free form, right? Python has a type hinting system, and a type checker can use it to check the types that you pass in and that are returned. So elaborate typically starts with creating a Module, which is an nMigen class, and ends with returning that module.
Now, this is something that I like to put into Elaboratables: a ports method, which returns a list of what I consider to be the public signals. If you look down here in main, you can see that I'm passing ports to main_runner, which means that these are the signals that are going to be visible in traces. They're also going to be visible to formal verification. So that's why I like to put that in.
Okay. In your main, you start up a parser and parse the arguments. The reason we do that instead of using the ordinary Python argument parser directly is that nMigen sort of takes over that level. You can still do things like add your own arguments, but we're not going to do that here.
The next thing you do is define a top-level module and add your submodules to it. Here, the module that I want to instantiate is called Adder, so I create an instance of Adder, call it adder, and add it to the submodules. This is actually a dictionary. I think it's a dictionary. You can just say m.submodules += adder, and that will give you a submodule with no name, but it's usually very convenient to give it a name, so you can say m.submodules dot the name of the thing that you're instantiating. So in any case, then you run main_runner, which takes a parser, the arguments.
So that's how you get your arguments into the parser, the top-level module, and then the ports that you're interested in outputting to a trace if you're doing simulation, or making available to Yosys if you're doing formal verification. Okay, so that's the bare skeleton.
The next thing that I've done is define our public-facing signals, and you can see that I have an x, a y, and an out. Notice that I don't specify the direction of the signals, because in nMigen the direction is determined by how you use those signals. You don't have to say explicitly what they are. So x is just an 8-bit signal. It's just eight wires. You can also say whether they're signed or unsigned by writing something like unsigned(8) or signed(8). The default is unsigned. So we have two inputs that I'll call x and y, and an output that I'll call out. As you can guess from the name of this module, it's just going to add the two together.
So let's do that by defining the logic in elaborate. And that's all there is to it. First of all, let's get this out of the way: I've added my signals to the ports, so now they're available for simulation or verification. This is the line that adds x and y and outputs the result to out. Now, it doesn't look like ordinary Python, where you would just say self.out = self.x + self.y. Remember that nMigen generates the hardware description for you. It doesn't translate Python to hardware. So what you're doing here is saying, okay, in the combinatorial domain (and again, check out the tutorial to understand what domains are), I have a statement, and that statement is that self.out is equal to x plus y.
Okay, so let's see how we can simulate this. The first thing that I did was add these two imports, Simulator and Delay.
And in main, first of all, I've commented out main_runner, because that's really only useful for outputting code that you're going to run through Yosys. This is not going to be run through Yosys. It's going to be run through nMigen's native simulator.
The simulator does have a bug in that it will not output traces for input signals, so this is just a workaround. I've declared two local signals, x and y, and added some combinatorial logic to copy x into adder.x and y into adder.y. That's what these two lines do. Then I create the simulator based on m, which is our module. It's very important to put the workaround lines before the simulator construction, because constructing the simulator causes it to look at m and all of the signals that it has. So now it has an x and a y signal.
The next thing we can do is define a process. A process is just something that the simulator will iterate over. It's a generator, so it has yields. The simulator pulls one statement at a time out of the process and executes it. In this case, the first statement is x equals 0. Then it goes on to the next one, and the next one, and so on. This is useful for when you don't have a clock. We'll get to how to simulate with a clock later. But without a clock, we have to specify a delay, and the delay is in seconds. So what this says is: set x to 0, set y to 0, and then delay for one microsecond, so that we can look at the output. Then set x to ff and y to ff and delay another microsecond. Then set x back to 0 and delay another microsecond.
So we add that process to the simulator. There's another function called add_sync_process, which is for adding processes that use clocks. But again, we're not using a clock here, so we just use add_process. And then we call the simulator's write_vcd function.
VCD is the file format read by GTKWave, which is a very useful waveform viewer. There is also a file called test.gtkw. That's sort of a configuration file for GTKWave that tells it which signals in which file you're interested in, so that you can run your simulation, load up the GTKW file, and it'll immediately pop up with all the traces you were interested in, as opposed to a blank screen where you then have to select the traces.
Unfortunately, under the Windows Subsystem for Linux, because the simulation is running as a Linux program, the GTKW file that write_vcd generates refers to the VCD file by a Linux path. And if you're running gtkwave.exe, which is a Windows program, well, Windows doesn't understand that path, so it's just going to say, I don't know what that file is. So this is of less use if you're on WSL. You can of course run GTKWave inside WSL, but then you have to run an X server on Windows, and that just gets to be a pain.
So anyway, then you simply run the simulator and it will output the files for you. Let's take a look at that. I'm going to run python3 on adder.py, and now it's done. I mean, there was very little to do anyway. Now I'm going to run GTKWave on test.vcd. When we open it, we see that first of all we need to select the top module and then add x and y to the traces. Inside the top module is adder. Remember I said that adding the module to the submodules dictionary under a specific name is very useful? This is why: the name pops up here. So we're going to add out.
All right, now this is in picoseconds, and of course we're looking at things on the order of microseconds, so we need to zoom out. Okay, and there we go. 0 plus 0 is 0, ff plus ff is fe, and 0 plus ff is ff. So that is how we simulate it.
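As a cross-check on the three values read off the waveform, here is a minimal plain-Python model of an 8-bit adder. It is not nMigen code and not the file shown in the video; the name add8 is made up for this sketch, and it only assumes that assigning a sum to an 8-bit signal drops the carry.

```python
# Plain-Python model of the 8-bit adder's behavior (illustration only).
WIDTH = 8
MASK = (1 << WIDTH) - 1  # 0xFF


def add8(x: int, y: int) -> int:
    """Return x + y as an 8-bit `out` signal would hold it (carry dropped)."""
    return (x + y) & MASK


# The three input pairs applied by the simulation process described above.
for x, y in [(0x00, 0x00), (0xFF, 0xFF), (0x00, 0xFF)]:
    print(f"x={x:02x} y={y:02x} out={add8(x, y):02x}")
```

The middle case is the interesting one: ff plus ff is really 1fe, and only the low eight bits, fe, fit in out.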
Now, if you want, you can add some assert statements in here to make sure that the results are as you expected, which is great for doing a few tests, but really you want to do formal verification. So maybe we should do that.
As I think I've said in a previous video, formal verification is sort of like unit testing squared. In unit testing, you come up with some test cases, run your code through them, and make sure that the output is what you expected. What formal verification does is say, okay, for your code the following properties have to be true all the time, or they may be true in certain cases, and then the formal verification engine tries to falsify that assertion. So let's take a look at how we can do a little bit of formal verification on our code.
What I've done here is add one more import, Assert, from nmigen.asserts. I've also put back our main_runner, commented out all the simulation stuff, and added a single assertion. Now, in this case, addition is fairly simple. What this proves is that I can copy code. That's really all that it will formally assert. However, this is a lot more useful when you implement addition without using the plus operator and relying on nMigen and Yosys to do the right thing: maybe you're constructing a ripple adder, or an adder out of smaller adders, and you want to make sure that you're doing carries properly and that sort of thing. In this particular case, all I'm doing is using plus in the module and plus right down here in the assertion.
Let's also take a look at the SBY file, which is the SymbiYosys file. You can see that I have two tasks called cover and bmc. BMC is bounded model checking, so you give it a depth, which is the number of cycles that the formal verification engine will run before saying, okay, everything seems pretty good.
There's another mode called prove, which uses induction: if the assertions holding for some number of consecutive steps always means they hold for the next step too, then they hold for all time. We're not going to do that right now. All we're doing is bounded model checking. Cover is for when you want to make sure that a particular case can actually happen. We'll look at that in a moment.
Depth we set to 2. We could set it to anything, but 2 is just fine because we don't have any clocks. Multiclock I turned off because, well, we don't have any clocks and we certainly don't have more than one. The engine that I'm selecting is Boolector, which I've found to be a pretty good overall proof engine. The script will read ILANG, which is the intermediate language used by Yosys, and nMigen can output ILANG. nMigen always generates a module called top, so that's basically how it works.
So let's see what happens when we try to do some formal verification. The first thing we do is run the Python program in generation mode. So instead of just python3 adder.py, we give it arguments that main_runner looks at. It looks at the first argument to see whether it's generate. Then it looks at the -t option, which is the type of output to produce: il is for ILANG and v is for Verilog. When you're working with Yosys, you may as well go straight to the intermediate language because, I mean, why would you want Verilog code? Unless, I suppose, you are giving the Verilog to some other, non-open-source tool. So anyway, we're going to output this to a file called toplevel.il. There we go, no errors. And we can look at the IL. It's understandable, but basically it's a bunch of cells that get connected to each other.
Now, in order to run formal verification, I run the sby tool on adder.sby. If I run it, okay, that's great. I see here done with pass, and this is for the cover mode, I believe.
Where does it say cover? I'm sure it says it somewhere over here, but I'm just not seeing it. The next mode that it runs is BMC, and we can see down below that it failed, which is really weird. Why would it fail? We've got a counterexample trace over here, which we can look at using GTKWave. So let's do that.
Okay, so we want to look at the signals. There's x, there's y, and there's out. We can zoom out to the very end, and we can see, well, okay, 7e plus fe is definitely 7c. So what went wrong? Well, in actual fact, 7e plus fe is 17c. This is one of the traps that nMigen can lay for you: when nMigen sees an 8-bit signal added to an 8-bit signal, it generates a 9-bit result. And if you then compare the 8-bit signal to the 9-bit result, well, of course they're not the same whenever the 9th bit is set. So the fix is to truncate to 8 bits.
This tends to come up only when you're doing formal verification, because usually you're assigning the result of an operation on signals to another signal, and that by definition truncates the result to the width of the destination. Here, however, we're testing equality, so both sides have to be equal, and a 9-bit value is just not equal to an 8-bit value if the 9th bit is set.
So let's go ahead and recompile and rerun. And there we see that we have passed formal verification. So I can be sure that no matter what inputs are given to x and y, the output will be correct. Now, again, this isn't much of a test, because I'm testing whether I can type plus in my module and plus in my formal verification. But if you were implementing addition using some mechanism other than plus, this verification would be extremely useful.
All right, we've talked about assertions. Let's talk about cover. Let's cover cover.
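As an aside on the width mismatch behind that BMC failure, the effect can be reproduced with ordinary integers. This is a plain-Python sketch, not nMigen code: it assumes only what is described above, namely that out holds the low 8 bits of the sum while the bare expression x + y carries a 9th bit. The function names are made up for the illustration.

```python
# Compare an 8-bit `out` against the full-width sum, with and without truncation.
WIDTH = 8
MASK = (1 << WIDTH) - 1  # 0xFF


def out_signal(x: int, y: int) -> int:
    """What the 8-bit `out` holds: assignment drops the 9th bit."""
    return (x + y) & MASK


def count_failures(truncate: bool) -> int:
    """Count input pairs for which `out == x + y` would be false."""
    failures = 0
    for x in range(1 << WIDTH):
        for y in range(1 << WIDTH):
            expected = x + y              # up to 9 bits wide
            if truncate:
                expected &= MASK          # the fix: truncate to 8 bits
            if out_signal(x, y) != expected:
                failures += 1
    return failures


# The counterexample from the trace: 7e + fe is 17c, but out holds 7c.
print(hex(0x7E + 0xFE), hex(out_signal(0x7E, 0xFE)))
print("failing pairs without truncation:", count_failures(truncate=False))
print("failing pairs with truncation:   ", count_failures(truncate=True))
```

Every pair whose sum reaches 256 fails the untruncated comparison, which is why the solver had no trouble finding a counterexample.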
So what cover does for you is let you ask Yosys to come up with a sequence of inputs that produces the condition you're trying to cover. For example, I can add a cover statement that says, give me a set of inputs such that the output is equal to ff. And that's really all I need to do. Here Yosys is free to choose anything it wants for x and y. It'll probably choose zero for one and ff for the other, because that's probably the easiest way it can figure out to get to that number. So that's not that difficult.
I'm going to add another cover statement, and it will search for where adder.out is ff and, I think I need to surround these with parens. I can't use the Python and, which is the Boolean and, because in hardware an equality results in a single bit, which is either one or zero. That's why you have to use the bitwise and, and then you have to surround your terms with parentheses, because the bitwise and takes higher precedence than the equality. Otherwise it might parse as adder.out equals, and then in parentheses, ff and whatever, which is not what you want. So what I'm going to do is add another condition where adder.x is equal to fe. Of course it should then find adder.y being one.
This is not that interesting for purely combinatorial circuits, but for things like clocked circuits, where you really need a sequence of input signals, it's much more interesting and much more useful. So these are the two statements that I've added.
Let's go to the SBY file. We're not going to change anything, because we already have our cover task set up and we don't have any clock, so a depth of two is just fine. So now let's go ahead and compile. Okay, I forgot to add an import for Cover. Let's do that right now. It's in nmigen.asserts: Cover. Try again. Great.
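The precedence point made above is ordinary Python operator precedence, so it can be seen with plain integers. This sketch uses integer stand-ins for adder.out and adder.x rather than nMigen signals, which is an assumption made purely so the snippet runs on its own.

```python
# Integer stand-ins for adder.out and adder.x (not nMigen signals).
out, x = 0xFF, 0xFE

# Without parentheses, & binds tighter than ==, so Python reads this as the
# chained comparison  out == (0xFF & x) == 0xFE.
unparenthesized = out == 0xFF & x == 0xFE

# With parentheses, each equality is evaluated first and the two one-bit
# results are then combined with the bitwise and.
parenthesized = (out == 0xFF) & (x == 0xFE)

print(unparenthesized, parenthesized)  # prints: False True
```

Both conditions are individually true here, yet the unparenthesized form comes out false, which is the "not what you want" case.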
And now we'll run SBY. Okay, the bottom part is for bounded model checking, which we know passes. The top part is for cover. What we're looking for is a log message that says reached cover statement at adder.py:35 and reached cover statement at adder.py:36. This tells you that it found a way to make those statements true.
Now, the interesting thing is, well, let me go ahead and do this. I'm going to change this so that adder.out is equal to zero and adder.x is fe, because I want to emphasize that cover statements don't have to interact with each other. I can write different cover statements that may actually contradict each other, and Yosys will find a case for each one. So let me go ahead and recompile and run SBY again. And we can see that we reached, again, one cover statement and the other cover statement. It also says that there are some traces that I can look at.
So let's go look at those traces. We'll look at the first trace, which corresponds to line 35, and line 35 is just checking that the output is ff. So I run GTKWave on, I think it's cover something, something cover. Unfortunately, it doesn't tell me the full path the way BMC does, which is kind of a pain. So I think it's test something. No, what is it? Adder, cover, engine zero, trace zero. Let's see what it came up with. If I open up x and y and out, you can see that x is ff and y is 00. So Yosys says these are the inputs that will give you what you want.
Let's take a look at the other trace, which is trace one. This is where we set the output to zero and one of the inputs to fe. If we look at that, we can see that one input is fe, the other input is 02, and the output is zero. So again, Yosys has figured out exactly what the inputs need to be in order to get the result that you want.
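For a circuit this small, what a cover search does can be imitated by brute force. The sketch below is plain Python, not Yosys or nMigen: find_cover is a hypothetical helper that tries every input pair against a model of the 8-bit adder and returns the first one that satisfies the condition, or None if nothing does. A real solver does not enumerate like this and may pick a different, equally valid, witness.

```python
from typing import Callable, Optional, Tuple

MASK = 0xFF  # `out` is 8 bits wide


def find_cover(cond: Callable[[int, int, int], bool]) -> Optional[Tuple[int, int]]:
    """Return the first (x, y) whose 8-bit sum satisfies cond, or None."""
    for x in range(256):
        for y in range(256):
            if cond(x, y, (x + y) & MASK):
                return x, y
    return None


# Cover 1: some inputs that make the output ff.
print(find_cover(lambda x, y, out: out == 0xFF))

# Cover 2: the output is zero while x is fe; the only witness is y == 02.
print(find_cover(lambda x, y, out: (out == 0x00) & (x == 0xFE)))
```

For the first cover the brute-force search returns x = 00, y = ff, whereas the trace in the video shows x = ff, y = 00. Both satisfy the condition, which is all a cover asks for.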
Now, if you put in a cover statement that absolutely could not be met, for example by adding and adder.y equals zero, then Yosys will spit out an error saying that it could not reach the cover statement within the specified number of cycles, which in this case would be two. But of course we know that it's never going to happen. So that's cover.
Okay, so I've added an assumption here, and the assumption says that x has to be equal to y shifted left by one. In other words, x is twice y, modulo 256 of course. So this assumption restricts our inputs to those particular values. You have to be a little careful when you combine assumptions with BMC, which is the asserts, because assumptions restrict your inputs. When you then do your asserts, you're saying that the inputs can't be just anything: they have to satisfy these assumptions. That's useful if you know that your inputs have certain illegal states. They don't even have to be inputs. They can be internal flip-flops, for example. So you can say, well, these are the illegal states, and I want to assume that you're not in one of them.
So what we're going to do is put together a slightly more complex cover statement right over here. We're going to say that the output has to be greater than zero and less than hex 40. So the question is, can Yosys find an input x and an input y such that y is twice x? Let's go ahead and compile and run. And we can see that Yosys has indeed found something. Let's take a look at the trace and see what it found. All right. It found six and three, which is nine. That makes perfect sense, because I specified that out had to be between zero and hex 40, and Yosys found six and three, which satisfies x is twice y. I think I misspoke before.
But yes, x has to be twice y, and the sum has to be between zero and hex 40. So there you go. Anyway, that is a quick look at how you use asserts, assumes, and covers.
Now, you can add other interesting things. For example, suppose you wanted to cover something but only in a certain circumstance. We could do this: with m.If, let's say adder.x is equal to adder.y shifted left by one, then cover this statement. In other words, this cover is only active when this condition is true. This is another nMigen construct, an If that looks like a Python context manager. It will probably end up being equivalent to the assumption. However, the interesting thing is that because I didn't use an Assume, when I run BMC it will run over the full input set to make sure that the assert is always true. Here I'm asking Yosys to cover only if this condition is true. I could also stick the assert down here under the If, and then it would only be checked when the condition is true, which has the same effect as an assume.
So let's run this and see what Yosys comes up with. Okay, it has passed, so let's look at the trace. And the trace shows, interesting, this time it shows be and 5f, which sum to 1d. Effectively, when you ask Yosys to do covers, it will sort of try to get the simplest result, but in some cases it's just going to be random: well, you asked for this, and this satisfies those conditions. In general, when you're dealing with clocked logic, Yosys will try to find the lowest number of clock transitions that lead to your cover condition. So that's pretty much that.
Speaking of clocks, let's look at clock domains. Every module in nMigen gets a clock domain called sync. I have this module that I've called Clocky because it's a clock. You can see that I have a seven-bit signal just called x, a load signal that's one bit, and a value signal that's seven bits.
And the idea is that x is going to count up on every positive edge of sync. You can also set load high and put a value in value, and that will load the value into x on the next clock edge instead of incrementing by one.
So here's the logic. We can see with m.If: if self.load is true, then on the next positive edge of the clock, self.x equals, now, Mux is like the ternary question-mark-colon operator that you see in C and Java, that sort of thing. Mux uses a one-bit signal as a condition: if that bit is true, the value of the Mux is the first value, otherwise it's the second value. So what this says is, on the next positive edge of the clock, set x equal to the value that you want to load if self.value is less than or equal to 100, otherwise just 100. So here I'm limiting the value that you can load to 100, just for fun.
m.Elif is the equivalent of Python's elif. So else, if self.x is 100, then on the next clock edge put zero into x. Otherwise, increment x, so set x equal to x plus one. That's really all this thing does. It's a counter. It counts up to 100 and then it resets, and you can also load it with some value. If you try to load a value that's greater than 100, it will limit it to 100.
Here I've added x, load, and value to the ports that I want to give to any external process like Yosys. And here is the main. As before, we have our standard workaround, where input signals are not output to the trace file by nMigen unless they are in the top-level module, so here is where I put them in the top-level module. Here is our simulator instantiation that we've seen before. What's different is that we now add a clock, and you can specify the clock period in seconds. So this is one microsecond.
So the idea is that the clock has a one-microsecond period, or one megahertz. Okay, let's skip process for the moment. At the bottom, we can see sim.add_sync_process instead of just add_process. This is what I mentioned before: when you have a clocked process, you add it using add_sync_process and not add_process, which is for combinatorial processes.
Okay, so if we look at process, it looks a little different. yield just by itself causes one clock cycle to go by. yield with a statement in it causes that statement to be executed right away. So what this says is: wait for one clock period, then set load to one and value to 95, wait one clock period, then set load to zero, wait one clock period, and then wait a bunch more clock periods. That's really all it does. Everything else is the same.
So let's go ahead and run this. Okay, it has simulated, so let's open up GTKWave to look. At the top, we have load and value, and inside clocky, we have the clock and the reset. Now, of course, it doesn't look like it's doing anything, but that's because we're still in the picosecond range. So let me adjust the zoom. Okay, that's probably fine. All right, so here's x. Let me go back to top and get load and value.
We can see that there is actually one clock edge before the process starts. Signals by default reset to zero, so x is zero, and on that positive edge we get one. The next thing that happens is that I ask for load to be set to one and value to be set to 5f, which is 95. The unfortunate thing is that just by looking at this, you can't tell whether that happened just before the clock edge or just after it. In actual fact, it happens just after the clock edge. So this is the yield. Then we set load to one and put 5f into value.
Then another clock period goes by, and during that clock period load is one and value is 5f, which means that 5f gets loaded into our counter. Then we do a bunch of yields just to show that we're incrementing. In fact, we increment up to 64, which is 100, then go back to zero, and then continue with one, two, three, and so on. That's really all there is to it. Again, it's a little confusing when these signals change, because if you zoom all the way in, they change basically right on the clock edge. So you have to remember that these input signals actually change just after the clock edge.
Okay, let's look at some formal verification using a clock domain. Again, it may seem pretty simple and straightforward. The logic isn't really that complicated, so there's not a whole lot to verify, but let's go through the exercise anyway. I've commented out the simulator stuff and added back our main_runner, and I have an assert. In this assert, I want to make sure that if the value of the counter is greater than zero, then it is equal to its previous value plus one. In other words, the counter incremented. I don't want to handle the case where x is currently zero, because then I would have to say, oh, well, then the previous value should be 100, and in fact I could add that in later. So this is the property that I want to assert whenever x is greater than zero: that clocky.x is equal to the past value of clocky.x, that is, the value of clocky.x one clock cycle earlier, plus one. And then, of course, I use the usual trick of truncating the values so that we're not comparing seven-bit values to eight-bit values, that sort of thing. So let's run this and see what we come up with.
Okay, so I generated the code, and now I am just going to run clocky.sby, which looks exactly the same as adder.sby with the exception that I've set the depth to 40 because I just want to make sure that the thing works for 40 cycles. Okay, let's run it. Oh, and look, we've got a failure. So let's see why. Well, if I pull up the trace for BMC and I look at the clock and I look at reset, okay, the clock is going, reset is low, x is zero, but then it went to 40 and then 41. That's pretty weird. What happened? Well, it looks like what happened is that formal verification found a case where the value of x was 40, but it was not equal to the past value of x plus one. And the reason is that it said, oh, I can just load a value into x and I have just violated your assertion. And again, it will try to find the shortest path to that falsification of the assertion. So we can see that it's also set value to some random other values. That doesn't matter. The point is that we can look at this trace and we can see, oh, those are the inputs that falsify my assertion. So now I know that I have to be careful about the load signal. So let's go ahead and include something about load. So I want load to be zero, but not only that, I also want the past value of load to be zero as well, because if the past value of load is one, then we're just loading x and then we can't have our assertion. So what I want to do is I want to say, and I guess I could just say past of clocky.load is equal to zero. I guess the present value of clocky.load doesn't really matter because the present value of load will only affect the future value of x. So let's run this and see what happens. And we can see that it ran through to the very end. Great. Okay, I've added one more assertion that I want to check. I want to make sure that the clock rolls over properly at 100. So basically if x is zero, then the past of x must have been 100. 
Now, this is not necessarily going to be true, because at the beginning of time it's probably going to start at zero, but let's see what happens. Okay, and we did get a failure. So let's see what verification came up with. Okay, there's our clock. There's our reset. Looks like we're resetting something. Now load: we're not loading anything. And we've got x. The present value of x is zero over here, but the previous value of x, the past of x, is also zero. Why? Because Yosys decided that, well, we can always reset the counter. If the counter was zero and we reset it and it's now zero, well, it didn't roll over. Obviously we need to take that into account.
So what I've done here is add another clause saying that the past of reset has to be zero. The tilde is negation, so I'm saying that we want the negation of past reset to be true. In other words, reset needs to have been zero in the past. Oh, okay, name reset is not defined. So I need to say what reset is, and the way you get the reset signal for a clock domain is by saying ResetSignal. You can name the clock domain here, but by default it's sync. You can do a similar thing for the clock by saying ClockSignal, but we don't need that. So that's my reset signal.
Okay, let's run. Okay, we've got another failure. Let's see what that is. Clock, reset. Okay, reset is zero. x is zero, and then it goes to zero, and then it goes to one. What happened? Load was set. Okay, that pesky little load. So you can see that we have to be really careful about the conditions that we want to check. Let's also say that we want not past clocky.load, which is the same as past clocky.load equals zero. And now we've passed everything. Okay, so what we can do now is probably just test load.
So with m.If: if in the past clocky.load was true, then in the present we want to assert that clocky.x is equal to, also from the past, the value that we wanted to load in. Okay, now what landmines can we think of? Well, what happens if both reset and load are set? It actually turns out that reset takes precedence over basically everything. So what's going to happen is we can have reset and load go off at the same time and we won't actually load anything, right? Let's just prove that by running this as is. And as expected, we have a failure, and if we pull up the trace we can definitely see that reset is high and load is high. So of course that's not going to work. So let's go ahead and fix that by also stating that reset must not be high. Okay, we've got a failure. Why do we have a failure? This is probably going to be interesting. So reset is low. We've got our load. Ah, okay. All right, so we've got some interesting things happening here. Notice that when the past of load was true and the past of value was 71 hex (113 decimal), the current value of the clock is 64 hex (100 decimal). Why? Because we cut the value off at 100. So we're going to have to take that into account. So let's see. What should we do? I guess what we can do is we can put in an assumption. Why not? So we can say m.d.comb, assume that clocky.value... Well, we don't really want to assume, because I guess at some point we might want to test what happens when clocky.value is greater than 100. So why not do this? Why not put in a Mux: if the past of clocky.value is less than or equal to 100, then past of clocky.value, else 100. And again, we're getting dangerously close to the territory where all we're proving is that I can type the same thing again, because this is exactly the same logic. So you might want to break this up into two statements. Basically the idea is that if you write code in one way, then you want to prove that the code works in a different way. But let's just continue. Okay, we've got pass.
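The load check and its two landmines (reset taking precedence, and the cut-off at 100) can be sketched in plain Python as well. This is an illustrative model under the same assumptions as before, not the nMigen code from the video: `step` is the assumed counter behavior, `value` is assumed to be 7 bits wide, and `check_load` is a hypothetical helper that looks for a cycle where load was set but x does not match the expected value.

```python
from itertools import product

LIMIT = 100  # cut-off point described in the video


def step(x, load, value, reset):
    """Next counter value after one clock edge (assumed behavior)."""
    if reset:
        return 0
    if load:
        return min(value, LIMIT)
    return 0 if x == LIMIT else x + 1


def check_load(clamp_expected, guard_reset):
    """Look for a cycle where past load was set but x != expected.
    Returns (past_x, value, reset, x) or None if the property holds."""
    for past_x, value, reset in product(range(LIMIT + 1), range(128), (0, 1)):
        if guard_reset and reset:
            continue
        x = step(past_x, 1, value, reset)
        # Same shape as the Mux in the video: value if value <= 100 else 100.
        expected = (value if value <= LIMIT else LIMIT) if clamp_expected else value
        if x != expected:
            return past_x, value, reset, x
    return None


print(check_load(clamp_expected=False, guard_reset=False))  # reset and load together
print(check_load(clamp_expected=False, guard_reset=True))   # a value above 100
print(check_load(clamp_expected=True, guard_reset=True))    # None
```

As the video points out, the expected-value expression here repeats the logic of the design itself, so a check like this mostly guards against typos rather than against a wrong idea.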
Okay, so I think that's really all I want to do about this. Let's briefly talk about cover. So let's just say m.d... would it be comb or sync? Let's try sync. Plus-equals Cover. Now I want to cover the case where clocky.x is equal to 3. All right, so I want to see what sequence of inputs causes 3 to be output. And typically it's going to be the shortest possible trace. So let's just run this. Okay, so let me just go up, and we do have a pass for our cover statement. So we can look at trace 0 and see what Yosys has determined was the way to get to an output of 3. So there's our clock, there's our reset. Here's our load, here's x, and here's our value. So what Yosys has decided is: I'm going to load 3. Okay, that's perfectly valid. What happens if we want to say, but we don't want you to load? So what I could do is I could just say, with m.If not past clocky.load, then cover that case. So in other words, don't use load to load 3 in. And what do we have? Clock, reset, load, x, value, right? It just loaded 2 in and then clocked it once, and then we've got 3. So obviously we can continue this exercise of restricting load, and then Yosys will be forced to start from 0, clock to 1, clock to 2, and clock to 3. Now the interesting thing is, what happens if you try to cover the case where the output is 99? Well, the problem is that you will not be able to find a sequence that does that, because you have restricted your depth to 40 in the SBY file. So you would have to increase your depth to something like 100 in order to actually cover that case. So you need enough runway, basically, to find your signals. OK. So that's formal verification with a clock domain. So we've got one last subject to cover, I promise, and then this video will be over. And then we'll get to part 3, which is actually working on the CPU. But the subject is: how do we get the stuff that we wrote in nMigen into actual hardware? OK.
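What the cover runs above are doing, finding the shortest input sequence that reaches a state within a limited depth, can be imitated with a breadth-first search over a plain-Python model of the counter. This is a rough sketch, not how Yosys works internally: `step` is the assumed counter behavior with reset left out, `value` is assumed to be 7 bits wide, the counter is assumed to start at 0, and `shortest_trace` with its `allow_load`, `allow_load_last`, and `max_depth` parameters is invented here for illustration.

```python
from collections import deque

LIMIT = 100  # rollover point described in the video


def step(x, load, value):
    """Next counter value after one clock edge (reset left out)."""
    if load:
        return min(value, LIMIT)
    return 0 if x == LIMIT else x + 1


def shortest_trace(target, allow_load=True, allow_load_last=True, max_depth=40):
    """Breadth-first search from x = 0 for the fewest cycles that give
    x == target. Returns a list of (load, value) inputs, or None if the
    target is not reachable within max_depth cycles."""
    inputs = [(0, 0)]
    if allow_load:
        inputs += [(1, v) for v in range(128)]
    queue = deque([(0, [])])
    seen = {0}
    while queue:
        x, path = queue.popleft()
        if len(path) == max_depth:      # out of runway on this branch
            continue
        for load, value in inputs:
            nxt = step(x, load, value)
            new_path = path + [(load, value)]
            if nxt == target and (allow_load_last or not load):
                return new_path
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, new_path))
    return None


print(shortest_trace(3))                         # [(1, 3)]: just load 3
print(shortest_trace(3, allow_load_last=False))  # [(1, 2), (0, 0)]: load 2, clock once
print(len(shortest_trace(3, allow_load=False)))  # 3: clock up from 0
print(shortest_trace(99, allow_load=False))      # None: depth 40 is too short
```

Raising `max_depth` to 100 in the last call makes the search succeed with a 99-cycle trace, which is the "enough runway" point made above.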
So here is the Lattice evaluation board, the iCE40-HX8K breakout board. It's got one populated connector and three unpopulated connectors. It's got a bunch of LEDs on it. This is the Lattice chip itself, and down here is the FTDI USB interface. It also comes with a cable. When you just plug it in, it comes pre-loaded with just a blinky. You can see that it is blinking. It's got a power LED on the bottom and 8 LEDs on the top. There's also another LED which I'm not quite sure what it does. This is the D1, and the ones on the top are labeled D2 through D9. So this is what happens when you first power it up. That doesn't mean you can program it or anything. In order to program it we are going to need to load some software, which is unfortunate, but that's what you have to do. So first things first: this is the FPGAwars repository. I don't know why it's a war, but okay. The first thing is that we have to load the FTDI driver. This is a more open FTDI driver than the ordinary FTDI driver, and it allows other software to communicate with FTDI chips. So there's this findall program which you can just download, unzip, and open, and here we see findall.exe. This is a Windows program, so you can put it in any Windows location. It should all be accessible through WSL. Now if I just run it (where did I put it? On my F drive), it will say number of FTDI devices found: zero, because I have no FTDI devices plugged into USB. Now if I plug in the Lattice board, okay, so it's blinking, and then I run this again, it's going to say that it found one FTDI device but it can't really communicate with it, and that's because we don't have the correct open driver installed.
So that's what we're going to do. Let's see, the driver that you want to install is called libusbK. First we want to make sure that we can actually see the device, so I'm going to plug it in again. Now for Linux, I think this is as easy as going into the USB or device configuration and setting it up properly; it's just one file. But for Windows it's a little more involved, because unfortunately the Windows Subsystem for Linux does not yet support USB, or if it does, it only supports certain devices, definitely not this one. So the first thing that you want to do is pull open the Device Manager and look under USB, and you're going to see USB Serial Converter A and USB Serial Converter B. If I unplug the thing, those two devices go away. Plug it back in, okay, those two devices reappear. There's also this USB Composite Device which also appears. The thing is that the serial converter does nothing for us. It is not capable of communicating, or rather the program that we're going to use is not capable of communicating with that sort of thing. The other thing that the instructions show is that in Devices and Printers you should see the new device, Lattice FTUSB Interface Cable, and that just says that your Lattice board is connected as a new device. We can pull open Devices and Printers, or rather Printers and Scanners; this instruction manual was written for Windows 7, so it's different on Windows 10. You go to Bluetooth and other devices, and you can see down here the Lattice FTUSB Interface Cable is there, but there are no options except to remove the device, so that's actually no good. What they're talking about is this thing which has these large friendly icons. So where do we find that? We find that under Devices and Printers: go to All Control Panel Items, then Devices and Printers, and there it is, right over here, Lattice FTUSB Interface Cable. Now if you double click that and click on Hardware, you can see USB Composite Device, serial converter A, serial
converter B, and it happened to load it on COM6 as basically an emulated serial port. So that's what we need to change. You can see here they have the same thing. Okay, so Zadig is a Windows application that will install the nice USB driver. You can download the executable from this link. That link goes to Zadig 2.2. Let's just see if there is anything more recent than 2.2. There is, it's Zadig 2.4. Sure, let's download that and, I guess, run it. Okay, let's go back to the instructions. Execute Zadig. Initially no devices are listed. Select the Options menu and click on List All Devices. So Options, List All Devices. Okay, done that. In this example 11 devices have been found. Well, there is an awful lot of devices here. Now click on the drop-down bar and select the Lattice interface cable for Interface 0. Let's see, okay. And basically we can see that the current driver is just FTDIBUS, which is the unfriendly version, and we are going to change it to libusbK. Excellent. And it shows here the USB ID. This is basically the equivalent of the USB device configuration on Linux, where Windows will look at the USB device that gets connected and look at the USB ID, the vendor ID and the product ID. In this case it's 0403 and 6010. I think the 00 is the interface number. In any case, once that comes up, Windows or even Linux will look into its internal database of device drivers and say, oh, okay, I need to load this device driver for that device. So what we are doing here is saying: when you find a USB device that has 0403 as a vendor and 6010 as a product, that is the Lattice evaluation board, so load the driver libusbK. Okay, so we are going to click on Replace Driver, which for some reason takes a long time. I don't know why. Oh, okay: installation can take some time. Thanks for letting me know. That is actually nice UI practice over there. Okay, the driver was installed successfully. Close. Now you can see that Zadig is reporting that the current driver for this device is libusbK, and
that's pretty good. So I believe we can now close this. I am just going to unplug the board and plug it back in. Great. And now let us look at the Device Manager under USB, and we can see that USB Serial Converter A has disappeared, and that is correct. And if we look, there is a libusbK device which is the Lattice FTUSB Interface Cable, which is correct. Okay, and it says if findall does not work after installing to Interface 0, try to install to Interface 1 also. Okay. Please note this could break things. Wonderful. So the next thing that we are going to do is run findall again, and we can see that the device is recognized. That is because findall is able to communicate with libusbK, and libusbK is able to communicate with the device. So that is really how you install the driver specifically for the board. Now why did we do this? It is because we want to run a program called iceprog, which programs iCE40 devices like this one, and the only way that it can do that programming is through the correct device driver. So if we go to this other FPGAwars repository, called toolchain-icestorm, and look at the releases, we can see that there are a bunch of releases that include statically linked binaries for Yosys, IceStorm, and arachne-pnr. The thing that we are really interested in is IceStorm, because we have already installed Yosys and arachne-pnr, or actually nextpnr, on Windows Subsystem for Linux. So the only thing we are interested in is IceStorm, which is the actual programming software. So what I am going to do is select Windows x86 1.11.1. After downloading, we should get a bunch of files, actually one file, in here. It is a tar, an archive, inside a zipped file, and if you go inside it you see a bin directory, and inside the bin directory you see a whole bunch of utility programs. The program that we are interested in is iceprog. You can go ahead and extract that to any Windows drive. I have it installed under F, so /mnt/f, iceprog, and we can
run it. Why don't we do this on Linux? Well, again, the Windows Subsystem for Linux does not recognize USB devices, so we have to go through the Windows program. So it is basically saying missing argument, which is great. If we go to the help, we see that there are a bunch of things that we can do. The thing that we are really interested in is -b, and what -b does is bulk erase the entire flash before writing. Typically you want to do that. And then you can give it an input file, and that should actually load the program. So let's actually put together a program, generate a binary bitstream out of it, and load it on the board. What I am going to do is create a module called Blinker, and in it, first, I am going to access the platform's default clock frequency. You can always get this, because platforms have a default clock frequency. Then I am going to set up a timer, and, all cards on the table, I stole this mainly from the nMigen examples, so that is why it looks like this. We are going to define a timer signal. The maximum value for the timer signal is just going to be whatever that default clock frequency is, divided by 2. Now, the default clock frequency for the board that I am using is 12 megahertz, so obviously if we want the timer to go off at 1 hertz intervals, we need to toggle at 2 hertz, basically, which means that I need to count 6 million cycles of the clock and then do a toggle. So 6 million is just the clock frequency divided by 2. By using max in a Signal, I can say what the width of that signal is. So for 12 megahertz, or rather for a maximum count of 6 million, that is, I think, 23 bits. nMigen will basically look at max and say, okay, how many bits do I need to have a maximum of this number, and the answer would be 23 bits. reset here tells me what the value of the counter should be set to when the board powers up or whenever you toggle the reset pin, and in this case it is going to be a count
down counter, so we are going to start it from whatever maximum we want, minus 1. The next line here is basically the fun part. This is where you get to have signals associated with actual pins on the FPGA. You can request a named resource, and this particular resource is called LED; we will get to that in a moment. The LED resource is an output pin, so I use .o to specify that I want the pin's output signal. So LED is now a signal. You can use .i for an input. If it is a tri-state pin, you can say whether the output is enabled or not by using .oe. If it is a bidirectional pin, where you have both .o and .i available, then .oe tells you whether you are doing inputs or outputs: if .oe is 1 you are doing an output, and if it is 0 you are doing a read. All this information is in the tutorial. The next thing is, if our timer reaches 0, then we want to set the timer back to its max value and toggle the LED. Otherwise we just want to count down on the timer. And that is basically it for the blinky program. Now, how do we get this mysterious platform? The first thing you have to decide is what base platform you are going to derive a board from. There are a variety of base platforms, and I enumerate them in the tutorial, but if you look at the GitHub repository under nmigen/vendor, you will see the different devices or base platforms that you can use. So there is an Intel FPGA, there is the Lattice ECP5, the Lattice iCE40, and so on. Some of these are compatible with Yosys and some are not, and then you have to use the vendor's tools. The Lattice iCE40 I particularly like because it is supported by Yosys, which is nice, and it is pretty fast. So because I am using an iCE40 device, if we look inside here we can see that the name of the platform that I need to derive from is LatticeICE40Platform, and there is a lot of information in here about the required tools, how it works, what the
output files are. If you use IceStorm, which is the open source toolchain, that's what you have to do; if you use iCEcube2, which I think is Lattice's thing, either LSE or Synplify, I guess this is what you have to do. We are actually going to be using IceStorm. So anyway, that is the base platform. Now, if you don't want to bother with deriving all the board characteristics from a base platform, you can always go to the nmigen-boards GitHub repository, which has a list of predefined boards. For example, I could have chosen the iCE40-HX8K-B-EVN evaluation board (I guess the B is breakout), and as you can see it derives from LatticeICE40Platform, and it defines all the things that you kind of want to define. So you can use it straight, or you can use it as a starting point. I used it as a starting point because I actually don't need a lot of what is defined, just for my simple example. So here is the simple example. I am just going to call it Board. Now, specifically for the Lattice iCE40 platform (and different platforms are different, so be sure to look at the GitHub repository for the platform you are using), you need to specify the particular device that you are using. Again, if you look at the class in the GitHub repository, we can see a list of devices that you can specify. Of course I am going to be specifying the iCE40HX8K, because that is what I am using, and you can see that it translates to various toolchain options. Same thing with package: the particular package that comes on the breakout board is CT256, which just means that it is a BGA with 256 pins on it. The really important part is the resources. These are named resources, and they basically tell you which pins do what. So, for example, on the board there is a clock that is connected to pin J3. The reason that it is connected to pin J3 is that it is a global buffer, and here you have to look at the chip's data sheet. Global buffer basically means that if you input
a signal into this pin, it is available all over the chip, or maybe over most of the chip, or a part of the chip, or whatever. The point is that it is a signal that can get a very high fan-out, so you can feed it to thousands and thousands of lookup tables and flip-flops and things. So obviously when you have a clock, or any signal that is going to go to a whole bunch of flip-flops and lookup tables, you want it to be global. In this particular case I am defining a resource called clock 1. The 0, I think, lets you specify if you have multiple pins. I am not quite sure what this number does, but I am just going to set it to 0 and leave it at that. Pins tells you one or more pins that are associated with the resource. In this particular case I don't have more than one pin associated with the resource. In the example for the board, we can see that there is a resource here describing a whole bunch of pins for LEDs, and I don't really want to do that because I only have one LED. But anyway, this allows you to specify which pin is associated with that resource and what the configuration for that pin should be. The configuration for a pin is either going to be input, which is i; output, which is o; bidirectional, which is io; or tri-state, which is oe. This is gone over in the tutorial. Then for clocks you want to say what the frequency of the clock is, and for this particular board the clock that is connected to the chip is 12 megahertz. Then you want to specify the chip-specific attributes. These attributes, first of all, have to be supported by the platform in nMigen, so you need to know what they are, and second of all, they have to be supported by the toolchain. So GLOBAL equals True, again, tells you whether it's a globally buffered pin, and IO_STANDARD tells you the voltage standard. When you're using the vendor's tools you can specify things like low voltage CMOS 3.3 or low voltage CMOS 2.5, and that basically
configures that pin for the particular voltage standard that you're trying to apply to the pin. Not many modern FPGAs have a TTL standard, so they're all basically low voltage CMOS, 3.3 volts and below. So the fact that this just says SB_LVCMOS, I don't exactly know what that means. I do know that it translates to 3.3 volt low voltage CMOS. I don't know if there are any other I/O standards supported for this chip. I've looked in the source code and I don't really see anything else, so if you know the answer to that, leave a comment, because that would be kind of great to document. All right, so that's clock 1. We also need a reset for the clock, and again it's a good thing to put it on a globally buffered pin. In this particular case it's pin R9. The direction is an input, and of course it's a global pin, and the I/O standard is low voltage CMOS. And finally I have the LED that I want to blink. The particular one that I want to blink is connected to pin C3. In order to determine the pins you need to look at the documentation for the board. This is the one known as D2. It is an output pin, it is not a global pin, and its I/O standard is low voltage CMOS. Okay, so those are the three named resources that I really need for my hardware. If you want to define more resources you can do that, and you can name them things like buses. For example, let's suppose you had a 16-bit address bus. Well, you could say Resource address, and then there should be some way of specifying 16 pins. I know that the string here is space-separated, so you can specify 16 pins one after another. Again, I'm not quite sure what this number means. Maybe it's because you can have LED 0, LED 1, LED 2, and so on, and I guess the default when you request one of those resources is resource number 0. Okay, all boards have to have a default clock, so this is what the sync domain is going to be connected to. You also need a default reset, which is again what the sync domain's reset is going to be connected to, and then you
need to specify connectors. I have not figured out exactly what connectors do. I just know that you need a list of connectors. Again, if you look at the example board that is given by nmigen-boards, all of the connectors are laid out, and it seems to be the jack number and then the pin layout. I don't know what purpose that serves, but I do know that if you leave connectors out you get an error, so let's just leave it at that. Finally there's this snippet, which I stole completely from the example board. This allows you to simply run the entire toolchain when you run the Python program, which is convenient: just plug in your board, run your Python program, and it programs itself. Unfortunately I'm using the Windows Subsystem for Linux, and WSL does not support many USB devices, and it certainly doesn't support these FTDI USB devices, which means that I can't automatically run iceprog. In other words, I can't run the Linux version of iceprog, because it's going to look for a Linux USB device driver; I have to run iceprog.exe, which is the Windows program. In fact I could probably modify this to look for iceprog.exe; this thing here I could probably just replace with iceprog.exe. Interestingly, it calls it with -S and not -b. I don't know why. But in any case, the final piece of the puzzle here is what you call in main. The first thing you do is construct your board, and then you call its build function. The build function takes a module, so in this particular case we're going to construct the Blinker module and pass it to build. And then the keyword argument do_program tells it whether to actually program the board automatically or not. In this case I'm setting it to False because I can't run the Linux version of iceprog. If you are on a native Linux platform, or if you're doing this from Windows, I assume that you could just say do_program equals True and it'll just work. And that's pretty much it. So now let's
run blinky.py, and that's it. The build files are put in a directory called build, and if we look at it there's a whole bunch of files. They're interesting to look at. The file that's really the important one is top.bin. That is the bitstream that gets sent to the board directly. But you can look at the other files. One of the other important files is top.rpt. This is a report file, and it tells you exactly what the Yosys and iceprog toolchain, or rather IceStorm toolchain, is doing in order to synthesize your file. So for example you can see here that the input file name is top.il, and we know what .il files are. And really the interesting thing is way down at the bottom. This is a report of how big your implementation turned out to be. We can see that we used 84 lookup tables and 149 cells. SB_CARRY is the number of carry units, and this generally has to do with when you have adders. So of course, because we have a countdown timer, we implemented it basically as an adder where you add negative one each time, so it used 27 of these units. Can you do better?
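As an aside before answering that question, the timer arithmetic from the Blinker walkthrough (12 MHz clock, count 6 million cycles per toggle, 23 bits) can be checked in plain Python. This is a hedged behavioral sketch and not the nMigen module itself: `timer_width` and `led_trace` are names made up for illustration, and the model simply assumes the timer starts at its maximum, counts down to 0, then reloads and toggles the LED.

```python
def timer_width(clk_freq_hz):
    """Bits needed for a timer that counts from clk_freq/2 - 1 down to 0."""
    max_count = clk_freq_hz // 2 - 1
    return max(1, max_count.bit_length())


def led_trace(clk_freq_hz, cycles):
    """Simulate the countdown timer and return the LED value on each cycle."""
    half = clk_freq_hz // 2
    timer = half - 1          # reset value: the maximum count
    led = 0
    trace = []
    for _ in range(cycles):
        trace.append(led)
        if timer == 0:        # timer expired: reload and toggle the LED
            timer = half - 1
            led ^= 1
        else:                 # otherwise keep counting down
            timer -= 1
    return trace


print(timer_width(12_000_000))   # 23 bits for the 12 MHz board clock
print(led_trace(8, 12))          # toy 8 Hz clock: LED toggles every 4 cycles
```

With the toy 8 Hz clock the LED stays low for four cycles and high for four cycles, which is one full blink per eight cycles, or 1 Hz; the same ratio holds at 12 MHz.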
Oh, absolutely, because of course when you do a countdown timer, all you really care about is setting it to the initial value and then waiting until it counts down to 0. You could use something like a linear feedback shift register, and then you wouldn't have any carries. It would certainly be a lot faster. You wouldn't have any addition logic, because you're not really doing addition; all you're doing is counting in a sort of pseudo-random sequence. But the problem with linear feedback shift registers is that you really have to step through the shift register one by one by one in order to determine what the actual output is. So it doesn't go 0, 1, 2, 3, 4. It may go 0, 15... well, it doesn't go to 0. It may go 1, 15, 4, 17, 95. Again, it's pseudo-random, and you can determine what the sequence is, but of course if you want to implement something like a countdown timer, then you need to know what to set your shift register to in order to get the right value when you clock it however many times you're going to clock it. Okay, that's enough of that. The point is that you can run your program and look at the report, and then you can modify your program, run this thing again, and look at the report. There is a random component in how synthesis works, so it's a somewhat deterministic random process, which basically means that if you make a tiny change in your logic it may actually change the way that place and route works. Your blocks may be placed in different locations, and things will turn out differently. So when you make a change and you see the number of lookup tables go down, your logic may be more optimal if the number of lookup tables goes down and stays down. But often you'll just see it go up and down and up and down, and sometimes you'll make a change where it will go down and settle at that level, and then it'll maybe go up and down, up and down at a lower level. So just be aware that when you make a change and you see your number of lookup
tables go down, don't celebrate. Obviously that is the true number of lookup tables that is used in the FPGA. It's just that if you make another tiny change that you think shouldn't have any effect on the number of lookup tables, and then all of a sudden you see your number of lookup tables go up, it's not your fault. That's just the pseudo-random deterministic nature of place and route. Let's go ahead and attempt to, well, it was actually built already, but I'll just do it anyway. Oh wow, it actually recognized it. Great. So right now it's doing an erase, and looking at the board, you can't really see it. Let me see if I can dim the lights. There we go. So these LEDs at the top are just sort of lit. They're not blinking; they're dim. Okay, now it's reading, and it's programming. What do we see? Well, we see that one of the LEDs is blinking, which is exactly the program that we wrote, and it is blinking at a rate of 1 hertz, which is correct. Excellent. So we know that we can program this board properly, which was the whole point of this particular exercise. Now the other interesting thing is that we can see that one of the LEDs is blinking, but if I bring it in a little closer you can see that the other LEDs are actually lit dimly. I think that this is probably because we didn't actually set the level for those LEDs, so they're just sort of, I don't know, floating there or something. Presumably we could change our program to actually zero out those LEDs, and hopefully that should make the LEDs go off. Okay, now I've got the board unplugged, just to show that we have actually put the program in flash, so we don't have to reprogram it again, which would be kind of a pain. Let's go ahead and plug it in, and we can see that our LED flashes again. So basically what happened is the program on FPGAs is stored in flash, and every time the FPGA starts up the program gets loaded from flash into all of the FPGA lookup tables and flip-flops, and then it begins. So some of the
older FPGAs have external flash; the newer ones, like this one, have internal flash. And that is why they're called field programmable, because they're, I guess, programmable out in the field, whatever that means. The field usually refers to some place where you don't have a development environment, so I guess if you can flash the memory you can reprogram the FPGA in the field. Now the interesting thing is that there's this other LED which is lit, up here. Sorry, I'm looking at the video and I'm trying to move my hands in the right direction, but everything is upside down or backwards. But anyway, you can see that one of the LEDs is also strongly lit. Again, we did not specify what the output of that LED should be, so effectively it's random. So if you have outputs, make sure that you set them to some known value, otherwise they'll just be random. Well, that's it for this video. It's been quite a long slog. This has been about an hour and a half, which is quite intense, but hey, we've learned about nMigen, and that should be enough to allow us to start on our 6800 CPU implementation, which we will get to in part 3. So stay tuned.