everything going in. Hope everyone's having a good day, good evening, good night, or good morning, wherever you are. Just seven episodes, just seven weeks in, and I'm working on an nMigen version of a RISC-V processor that's not on an FPGA. I don't know if it's ever actually been done or completed before, but this is the task that I've set myself. So thanks for watching. Oh yeah, that's my humidifier; nothing's burning.

For this stream, what I'm going to do is take a small bit of the nMigen code and turn it into an nMigen version of some actual chips, which will start creating the hardware for me. If you want to know more about nMigen, look down below in the description. I've put together a tutorial and some nice exercises, starting from zero, to get you familiar with nMigen and with what formal verification is all about. So why don't we get started and take a look at some code.

All right, so this is the part of the processor that decodes an instruction into its component parts. By the way, there seems to be about a 30-second delay between me speaking and the chat. So if you're in chat asking a question and you don't hear from me for half a minute to a minute, that's probably why. "So the goal is to put the magic smoke into the hardware?" Yes, sure. "Rust embedded?" No, sorry, I'm not into Rust.

So this is the section of the code that breaks apart an instruction into its component parts. Some of the component parts are not actually used for some opcodes. For example, for a JAL instruction you don't have a destination register, or maybe you don't have an RS2 register, or something like that. Funct12 is only used for system privileged instructions.
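As a rough illustration of the field split described here, the following is a minimal plain-Python sketch, not the nMigen code shown on stream. The function name decode_fields is hypothetical; the bit positions are the fixed field positions of the RISC-V base instruction formats.

```python
def decode_fields(instr: int) -> dict:
    """Split a 32-bit RISC-V instruction word into its fixed bit fields."""
    instr &= 0xFFFFFFFF
    return {
        "opcode": instr & 0x7F,            # bits 6:0
        "rd": (instr >> 7) & 0x1F,         # bits 11:7
        "funct3": (instr >> 12) & 0x7,     # bits 14:12
        "rs1": (instr >> 15) & 0x1F,       # bits 19:15
        "rs2": (instr >> 20) & 0x1F,       # bits 24:20
        "funct7": (instr >> 25) & 0x7F,    # bits 31:25
        "funct12": (instr >> 20) & 0xFFF,  # bits 31:20
    }


# Example word: addi x5, x5, 10
fields = decode_fields(0x00A28293)
```

Every field is extracted for every instruction, whether or not that opcode uses it, which matches the "just wiring" view of this part of the decoder.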
So you've got the opcode, which is seven bits, always the lower seven bits of the instruction. The lower two bits of the instruction are always one-one, and why is that? I think that is because that is what the non-compressed instructions are. So anyway, RS1 is source register 1, RS2 is source register 2, and RD is the destination register. Funct3 is for the ALU operations, along with funct7, so that's why the ALU func is there. And funct12, like I said, is for some of the privileged instructions. So this is just straightforward wiring.

The interesting thing that I want to tackle today is decoding the immediate value out of the instruction. If we take a look at that code, we can see that it actually depends on the format of the instruction, which comes from the opcode. So there is the immediate format, I, which covers things like opcode immediate, where you add an immediate value to a register. The immediate value encoded in the instruction is a signed 12-bit value, so you take 12 bits out of the instruction starting from bit 20, you turn it into a 32-bit signed value, and that's the immediate. S is for store. R is for register instructions, so of course there is no immediate, and I just set it to zero. What was U? I think U was just unsigned. Yeah, unsigned, so you take a bunch of bits out of it, I think 21 bits, and you just stick them in the immediate value, so effectively it's zero extended.

Opcode format B is for the branch instructions. And if you're wondering where all these opcode formats come from, you can look at the RISC-V specification. Again, there's a link down below. That specification shows the formats of the different instructions, and it lists out the format for I, S, R, and so on.
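The I-format rule described above (take 12 bits starting at bit 20, then sign extend to 32 bits) can be sketched in plain Python as follows. This is only an illustration under stated assumptions, not the stream's nMigen code: the helper names sign_extend, imm_i, and imm_r are hypothetical, and the 32-bit result is returned as an unsigned integer.

```python
def sign_extend(value: int, bits: int) -> int:
    """Treat the low `bits` bits of value as two's complement and
    return the result widened to 32 bits (as an unsigned integer)."""
    value &= (1 << bits) - 1
    if value & (1 << (bits - 1)):  # sign bit set: the value is negative
        value -= 1 << bits
    return value & 0xFFFFFFFF


def imm_i(instr: int) -> int:
    """I format: 12 bits starting at instruction bit 20, sign extended."""
    return sign_extend(instr >> 20, 12)


def imm_r(instr: int) -> int:
    """R format: there is no immediate, so the value is simply zero."""
    return 0
```

For example, an instruction whose top 12 bits are all ones carries an immediate of -1, which comes out as 32 one bits after sign extension.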
So branch is a lot more complicated, because the bits are sort of shuffled around in the instruction and you have to unshuffle them and put them together. That's why it looks a little more complicated, but the bits are all there. There are 13 of them, and it gets sign extended. So that's a branch. With the jump there are actually 21 bits. Again, the bits are all over the instruction, and you have to put them together and then sign extend to 32. And then finally we have some system opcodes, and these are the CSR instructions, so things like reading and writing, setting bits, and clearing bits in CSRs. Those are control and status registers. There are only five bits, so there's not a whole lot there, and it's also zero extended.

All right. "What should you learn: nMigen, Verilog, embedded Linux, Rust, RTOS, not to mention hardware?" Well, I don't know anything about Rust, so you can learn it or you don't have to. Embedded Linux: I mean, I can use Raspberry Pis, but I don't program the operating system. RTOS: again, I don't program the operating system. Verilog I've used, and I didn't like it; that's why I went to nMigen. So what should you use, what should you learn? nMigen and Yosys. That's pretty much what I exclusively use. I no longer use Verilog or VHDL or anything like that; it's all nMigen. WebAssembly: I don't know anything about WebAssembly other than that it's nice, and there is actually a WebAssembly version of Yosys that was put together, but I don't use it.

So anyway, this is the function that I would like to convert into hardware. Let's take a look at KiCad. Here we go. So here's some KiCad for what I think the circuit is going to look like. Maybe it's a little small; let's see if I can zoom in on that. Okay, so the idea is that the function, let me just pull up the code again, depends essentially on the opcode format, right? And every opcode decides what
the opcode format is.

"Will this stream be available for offline viewing later on?" Yes, it will. It will be a permanent part of the record, unfortunately.

"You've never used nMigen, and what is the difference between that and VHDL and Verilog?" Okay, so VHDL, Verilog, and nMigen are all high-level hardware description languages. Verilog is, I think, based on Ada, and VHDL is based on C, and they are their own special languages, so you have to know all the ins and outs of VHDL and Verilog. Typically, when you get some software from an FPGA vendor, it will accept one or both of those two. nMigen is sort of like, well, if you're familiar with Python, you can use all the Python goodies to generate some HDL code. The HDL that nMigen spits out is called RTLIL, which is register transfer language intermediate language, I think, and that can be ingested by Yosys to program FPGAs or whatever. It can also be translated into Verilog, which you can then use with your own FPGA vendor's software. The reason that I like to use nMigen is that it's Python, and I like Python. If I need to generate code, I know how to do it because I know how Python works, and it's fairly straightforward. It's a general-purpose language, unlike something like Verilog and VHDL. They try to be general-purpose languages, but they pretty much fail miserably at that. That's my opinion, and that's where we're going to leave it.

So anyway, what I was saying is that the decode of the immediate value pretty much depends exclusively on the opcode format, which is determined exclusively from the opcode. The rest is just shuffling the bits of the instruction around and either sign extending or zero extending, and that's how you get the 32-bit immediate value. So the circuit that I want to create takes in the instruction and the opcode format, or just the opcode, and it spits out the immediate value. So here is
the circuit again. We can see here that the input is the instruction. This is the 32-bit instruction, the raw instruction that comes from memory. What's this? Yeah, that's my humidifier. I have a feeling that I'm going to explain that quite often, because it does look like smoke.

Okay, so you've got the instruction, and also down here you've got the opcode coming in. Now yes, the opcode actually does come from the instruction itself, so I could just tear those bits out of the instruction and stick them right in here, but I felt that for readability it would be nice to break out the opcode into its own signal. There are a bunch of different opcode formats. So here's the circuitry that'll take care of the I format, this is the S format, this is something I'm calling upper, which I guess is U, and there's also B, and J, and system.

What I'd like you to think of this section as is a multiplexer. The idea is that we are decoding the immediate value for all of the instructions, but we're only letting through the immediate value that is appropriate for the opcode. That's why I have this, let's just call it a selector, and it selects one or zero of these six values, and then they all get tied together, because it's a multiplexer.

Now what does one of these look like? Well, it's basically a bunch of 32-bit buffers. These are 74FCT16244s, which are 16-bit buffers. They are arranged in groups of four that can be controlled independently, with an output enable signal for each one, and I've got two of them because I need 32 bits. So this is basically a 32-bit buffer. Now the interesting thing is that if I turn off the output for this buffer, then the output goes high impedance, which means that I could tie a whole bunch of buffers together, turn one of them on, and then turn it off and turn another one on, and that's basically a multiplexer.

Okay, do I think Logisim is suitable for
teaching a digital logic introduction? I have never used Logisim, so I can't answer that question, sorry.

Okay, so the buffer is pretty interesting. Let me see if I can pull up the datasheet on that. So here is the datasheet. The worry about buffers, when you've got two buffers with their outputs connected together, is that when you turn one of them off and the other one on, there is a danger that they'll both be on for some short period of time. And the thing about buffers is that they can source a lot of current, so if two buffers are on at the same time, that's really bad, because they're probably going to burn. The thing that I like about the 16244 is that if we go down to the timing, we can see the output enable time and the output disable time, and the nice thing is that the output disable time is always faster than the output enable time. That means that if I turn one off and the other on pretty much simultaneously, I can pretty much be guaranteed that the output of one will be disabled before the output of the other one is enabled, which means that they won't conflict with each other. So that's why I think I can use these effectively as 32-bit multiplexers by simply tying them together.

If we go back up one level, we can see that we've got these 32-bit buffers, all with their outputs tied together, and only one of these is going to be selected, or possibly zero, in which case the output is going to be high impedance. So I should probably have some pull-down resistors of some relatively high value, just to make sure that the output is zero if none of these are selected. The thing about modern CMOS, and even TTL, is that if you have an input, you need to make sure that the input is not high impedance, that it has a specific logic value. Otherwise the input internal to the chip is going to float to some halfway value, and it's going to burn a lot of current. So if you ever
run into a situation where it's possible for a bus line not to be driven, and that bus line goes to some inputs, make sure that you put pull-up or pull-down resistors on it.

All right. "What about the NOT gate delay?" The NOT gate delay, yeah. I think what you're talking about is the NOT gate internal to the buffer. If that's the case, well, the timing in the datasheet is from that input to the output, so that includes the NOT gate delay, if that's there.

"Targets for the clock speed?" Yeah, so if you've looked at the previous videos, you'll know that I have a system clock, a machine cycle, and an instruction cycle. The system clock is the fastest clock. The machine cycle is six system clocks, so I basically have six phases, and that forms one machine cycle. And then there's the instruction cycle, and instructions can take anywhere between one and three machine cycles. So converted into system clocks, an instruction can take anywhere between 6 and 18 system clocks. I'm hoping that I can get the system clock up to 10 megahertz; that would be nice. It's never going to be fast enough to run Linux or anything like that, but I don't care. I'd like to program like it's 1985 again, so there you go.

Okay, so anyway, we can see that shuffling the instruction bits around is just a matter of wiring, right? Because all of these are buffers, you just route the instruction bits to whichever bit is appropriate in the buffers, and then you select whichever buffer is appropriate. And if we take a look down here, this is the selection logic. Here's the opcode, it's seven bits, and that goes into a generic array logic chip, a GAL16V8. There's also the ATF16V8, which is still being sold. These chips have been in use since the early 80s, if not prior to that.

"Can you have one selector active and simply ignore the immediate bus? At least it won't be high-Z." No, because in the circuit the immediate bus does go
somewhere. It may go to another multiplexer where the immediate value is simply not selected, in which case you would say, well, then why does the immediate bus even matter? Again, it goes to the input of a chip somewhere, and chips respond very poorly if their inputs are high impedance, which is why you always want to pull up or pull down. So that's pretty much why. Oh, "can't you have one selector active?" Yes, okay, I could. That's definitely a thought. Yeah, I misunderstood that comment. That's an interesting thought: instead of, for example, the system immediate value being just for the system opcodes, I could make it for every opcode except the ones that I've already decoded. That's an interesting concept. I'll have to think about that and see if it actually makes sense. It probably does make sense, because the immediate value would not be used by the rest of the circuit. Good idea, thank you very much.

Okay, so the GAL. This is generic array logic. I'm not using its register capacity; there are registers on the output that you can bypass. Really it's just a sum of products, so it's basically a bunch of ANDs tied together with ORs, and this is what the logic is going to look like. These are the opcodes that we have, so let's just take a look at what those opcodes actually are. I is load, miscellaneous memory (which I actually don't use), and opcode immediate. S is for store. U stands for upper, and that's the AUIPC and LUI instructions. B is for branch. J is for JAL and JALR. And then of course system is for system.

So what I want to do is get started. Let's go back to the code, and we can see in the sequencer card that what I want to do is replace decode immediate with its chip equivalent. We know that we're going to need some 7416244s and a GAL, and of course the GAL is just going to be some logic.
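The selection logic listed above can be modeled in plain Python as a small lookup, sketched here under stated assumptions: the opcode values are the 7-bit major opcodes from the RISC-V base opcode map, the grouping into formats follows the list given on stream (including JALR under J, although the RISC-V specification encodes JALR's immediate in the I format), and the names FORMAT_OPCODES and select_n_oe are hypothetical. This is not the GAL equations themselves, only a behavioral model of them.

```python
# 7-bit major opcodes from the RISC-V base ISA opcode map.
LOAD, MISC_MEM, OP_IMM = 0b0000011, 0b0001111, 0b0010011
AUIPC, LUI = 0b0010111, 0b0110111
STORE, BRANCH = 0b0100011, 0b1100011
JALR, JAL, SYSTEM = 0b1100111, 0b1101111, 0b1110011

# Grouping as listed on stream; JALR under "j" mirrors that list only.
FORMAT_OPCODES = {
    "i": {LOAD, MISC_MEM, OP_IMM},
    "s": {STORE},
    "u": {AUIPC, LUI},
    "b": {BRANCH},
    "j": {JAL, JALR},
    "sys": {SYSTEM},
}


def select_n_oe(opcode: int) -> dict:
    """Return one active-low output enable per format.

    0 means that format's buffers drive the bus; 1 means they are off.
    Opcodes without an immediate leave every enable at 1.
    """
    return {fmt: 0 if opcode in ops else 1
            for fmt, ops in FORMAT_OPCODES.items()}
```

Because each opcode appears in at most one group, at most one enable is ever low, which is the property the tied-together buffer outputs depend on.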
It's not going to be a "standard" chip, because it does have to be programmed. I could replace it with a bunch of gates, but then it would be messy and stupid and I'd hate it and my life would suck. So let's go ahead and write some code to simulate a 7416244. The idea is that when I replace the insides of the decode immediate function with an equivalent that uses the chips' logic as modules, I should then just be able to run formal verification, and it should just work. So let's try that.

What I'm going to do is create a new file, and I'm going to pull up my standard header and all the nMigen stuff. I have a skeleton somewhere, but I'll just sort of steal it from here. I'm going to call this IC 7416244: "Contains logic for a 7416244 16-bit buffer." Oops. Save as IC 7416244. Okay, now I need an elaborate function. Where's elaborate? "Implements the logic of the..." Okay, so that's the skeleton of an nMigen module. Again, if you're interested in exactly how to get started with nMigen, there is a series of exercises; look down below in the description for a link. Once you've done those exercises, you can look at the tutorial, where there's a lot more information about all the different things you can do with nMigen.

"The 16244 buffer is not in the official KiCad library." Yes, I have thought about adding it. However, right now I believe we're in the transition between KiCad 5 and KiCad 6, and they've changed the format of the libraries. So the problem is that if I go ahead and create the symbol and then check out the code from the official sources, I believe they want me to have a pull request against the current nightly version, which of course is KiCad 6, and I'm using KiCad 5 because I'm a little too afraid to use the nightlies. KiCad 6 is not yet released, so basically we're stuck in a kind of holding pattern until they
release KiCad 6 and the new format of the library comes out.

All right, let's see. What I want to do first is define my signals. If we look at the datasheet again... actually, why am I looking at the datasheet? I've got it in KiCad. Let's go into one of these buffers. There it is. So the idea is I have four groups of four, and the inputs are called A. So for my signals, I'm going to have a0 equals... an Array, that's an nMigen Array. Let me just make sure that I have that. It's just a four-bit signal, and there are four of them. I'm not going to bother making this an actual array; I'm just going to enumerate them. Now we have negative output enable zero, and negative output enable one, two, and three, and these of course should all be self, self, self. And then we've got the outputs, which are called Y, and those are also four-bit signals. Caps Lock is on. Okay, so those are the inputs, the output enables for each section, and the outputs.

Now, I suppose I could make a submodule. In fact, why don't we do that. So this is a four-bit buffer, and I'm just going to say that the input is self.a, because these sections are just independent, then the negative output enable, and Y is Signal(4). So instead of having all this, which we don't need anymore, let's just define the four-bit section. Let me just check the chat. How are things going in chat? Everybody doing good in chat? Looks good.

Okay, so what we're going to do is... now again, I've explained this before, but I will explain it again. "You can paste in column select mode, Alt+Shift." Really? That's pretty cool, thank you for letting me know.

"Are chip timings relevant?" Okay, so the unfortunate thing about a design language like nMigen is that typically it's targeted towards FPGAs, and you don't really worry about the timing, except that you use good practices. I suppose I could simulate things like delays by having a system clock that was
half a nanosecond, and then I could actually write these chips so that there's a delay between the input and the output based on these clocks. I could do that, but honestly, what's the point? As long as I keep the timing requirements in the back of my head, I think it'll be fine.

So what I've said before is that nMigen can't do high-impedance signals; signals are either one or zero. The typical way that you put together two high-impedance signals is that you output zeros on the signals, and the bus value is just the OR of all of the output signals. That way, if none of them are on, the output is zero, and of course you would assume that in real hardware you would have pull-downs. But if one of the outputs goes high and all the other outputs are zero, when you OR them together, the result on the bus is just that one output. So that's what we're going to do here.

The first thing we're going to do is, combinatorially, Y is equal to zero, so by default we just output zero. Now, if self.noe equals zero, because this is a negative output enable, Y equals A. That's a four-bit buffer.

Okay, so now, because the 16244... oh, "the test bench?" Yeah, I do not believe in test benches. What you call test benches is what the software world calls unit tests. Unit tests are great if you want to test one or two scenarios. However, with hardware I typically use formal verification, which is a lot better than unit tests, because formal verification tests all possible scenarios, not just the one or two that you can think of. Now yes, you could design a test bench with delays in it, but again, as long as you keep the timing requirements in the back of your mind, I think it'll be just fine.

All right, so this is our four-bit buffer. Now let's create a 16-bit buffer out of that. What I can do is say self.a, I guess I'll call it a0, equals Signal of... should I make... yeah, Signal(4), sure. self.a1 is Signal(4), and so on through a3. Okay, now we've
got the negative output enables. Why did I get rid of those? I needed those. Two and three. And yeah, self is kind of like the this keyword in Java, for sure.

"Is self.y equals zero high-Z, or is the output driven low?" The output is driven low. And again, like I said, this is appropriate when you want to output high impedance but you're outputting it onto a bus. Zero ORed with anything is that thing, just like an output combined with high impedance is that output. So logically it works out.

Okay, so we've got submodules. So m.submodules plus-equals... let's see, I'll just call it s for section, so s0 equals... I think I can just say that. Okay, so these are the submodules.

"What toolchains are you using for formal verification?" I'm using Yosys. Again, if you check down in the description below for the nMigen exercises, you can follow along and learn how to use nMigen, and then learn how to use the formal verification engines for Yosys to do formal verification.

"Just joined. Are we simulating the design?" Well, no, we're not simulating it; we're actually building one. Again, this is a RISC-V processor that is not being built in an FPGA. It's just regular logic components that you can buy off the shelf, except for FPGAs, because I don't want to do that. We've already written all the logic; if you check the past seven episodes, that's what that was all about. We formally verified it to ensure that it works, and now we are replacing bits and pieces of the higher-level code with lower-level gates or chips. Then we're going to run the formal verification again and make sure that it works, and if it does, we can be pretty sure that when we build the circuit it should work, modulo the timing requirements, as I keep being reminded.

Okay, so we've got these submodules. Now we just need to hook up the signals, in the combinatorial domain, because there's nothing synchronous going on here. So s0.a equals self.a0. So we're basically just wiring our inputs and enables.
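The trick of modeling high impedance as zero and the shared bus as an OR can be illustrated with a small plain-Python model. This is a sketch under stated assumptions, not the nMigen module from the stream: the function names buffer4, buffer16, and bus are hypothetical, and real tri-state behavior (and the pull-down resistors) is only approximated by the "drive zero when disabled" rule.

```python
def buffer4(a: int, n_oe: int) -> int:
    """One 4-bit section: pass `a` through when the active-low enable
    is 0, otherwise drive all zeros (standing in for high impedance)."""
    return (a & 0xF) if n_oe == 0 else 0


def buffer16(a, n_oe):
    """Four independent 4-bit sections, like one 16244 package.
    `a` and `n_oe` are sequences of four values each."""
    return [buffer4(a[i], n_oe[i]) for i in range(4)]


def bus(*outputs: int) -> int:
    """Model outputs tied together: the bus value is the OR of them all."""
    result = 0
    for out in outputs:
        result |= out
    return result
```

With at most one section enabled, the bus carries exactly that section's input; with none enabled, it reads zero, which is what pull-down resistors would give in hardware.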
We're just wiring our inputs to each of the signals in the sub-circuit right here, so four of them: zero, one, two, three. We're also going to want to do the same thing for the negative output enable signals, so I can do this. Great, those are our input signals, and now those are our output enable signals. And now we have to wire the outputs to the outputs of this module, so that would be self.y0 equals s0.y, and the same for one, two, three. Does that look right? I think it looks right. Let's write some formal verification code to make sure that it actually works.

All right, so writing some formal verification code. Let's go down to a class method. There's a space there to appease the linter. And let me also make sure I can run this thing from the command line. Okay, so for formal verification, I start off with the module. With formal verification you're creating a module, just like you're creating a module for your circuit, and I return two things: the module and the signals that we're going to manipulate in order to test. The first thing I need to do is create an instance of the thing that I'm going to test, so that's this right over here, and then I need to add it to the list of submodules for this module. So now I can refer to s, and I can say that the inputs that formal verification is allowed to manipulate are s.a0, s.a1, s.a2, and s.a3, and also the output enables, n_oe zero, one, two, three.

And now we need to write what we want to always be true. So here are a few things we want to be true. If negative output enable is equal to one, assert that y0 is equal to zero. I guess we're going to just copy that for one, two, and three. Actually, I don't need the "equals one", because it really is just one bit. And there's an else part: else, s.y0 is equal to s.a0. Now again, you might think that this is kind of ridiculous, because you can just run through all 16, well, 32 combinations and you're done. Oops, I
forgot the colon. And that's true for this really simple circuit, but this is kind of a different way of describing exactly all the 32 different input combinations. And yeah, I know, so much boilerplate; welcome to programming. I could check all 32 combinations, but what if you had a 32-bit input? You're not going to check all 4 billion combinations; that would be ridiculous. So that's why you write it like this. And again, for something simple like this, it doesn't really look all that different from writing a unit test to check all the combinations, but when you get to more complicated things, you just can't write unit tests.

Okay, "main is missing." That's because main is in util. Looking pretty good. All right, so the first thing I'm going to do is try to compile this, so let's take a look at the command line. So python3, IC 7416244, that thing, gen. It works. Okay, it compiles, that's great. Now let's create the SymbiYosys file. Let me go grab a simple one. This is fairly simple. Copy it to... I'll just copy it to IC; it's probably going to be used for all the ICs. Oh wait, I already have one. I called it ICs. Okay, well, let's use that then. All right: sby -f, the ICs .sby file, and we're going to run it in BMC mode. And it passed. Okay, well, that was anticlimactic. It works. So we know that all of these properties are now true; they hold. So it does what it's supposed to. Let me just double check that I got these negative output enables correct, because inverting signals is really bad.

"That last line is not visible on the stream." Thank you. So let me go ahead and pull up the code again. Okay, so we know that all of these properties hold true, and that's that. So now let us go ahead and try to use it. Can we use it in decode immediate right now? No, we are not going to use this right now. We're going to create that GAL. For the GAL, this is going to be just another module, so I'm just going to copy this and call it GAL
immediate format. Okay, so I need this, and this is just going to implement the logic that I'm going to program into that GAL. So this is IC GAL immediate format decoder. The input is self.opcode, and that's a signal with how many bits? Seven. Six... seven, that's it. The outputs are going to be the different negative output enables, so there's one for I, negative OE. Let me just check my circuit. There we go. So I need S, U, B, J, and sys. So those are our outputs. So the logic is going to be... in fact, I wrote the logic here. Let's see. Well, let's deal with sys first, so 111... I can just copy this. So there's the logic. So self.i_n_oe is equal to NOT of self-dot... actually, you know what, I can make that a switch statement. Switch on... oh, sorry, sorry, very sorry. Okay, so what I did was I copied the logic out of KiCad that I had in there. So here's KiCad; I copied that into the code as comments so that now I can write the logic. We have a seven-bit input signal, which is the opcode, and we have the outputs, which are the negative output enables to the buffers.

Okay, so for these cases... A GAL is not exactly a lookup table; it's really a sum of products. So you have the input signals, and then you AND some of them together. You can also AND the inverses of those signals together, and then you can OR some of those ANDs together, and there's usually a limited set of OR terms that you can have per output. GALs are typically registered, so they have registers on the output that you may use if you want to, or you don't have to. There is internal feedback, in that you can include the outputs as input terms as well. So they're very useful little glue logic chips. Unfortunately, they've kind of been superseded by FPGAs, because everyone wants to stuff everything into an FPGA, not just glue logic. But they're still being manufactured, so there you go. All right, let me get
some defaults here. So m.d.comb plus-equals: self.i_n_oe equals zero, and I'm just going to repeat that for all the others: S, U, B, J, sys. Okay, that should be one. So by default they're all off. Okay, for these cases, self.i_n_oe equals zero, and I just do the same thing for all the others. This is pretty much where, if I were in a real video, I would edit this out, but we're doing this live. A bunch of these here are U, and sys. So the thought was that I could actually make this a default. Why don't we just do that now? Why not? So let's see... B, this is J, these two. That's really it.

Do I really want to formally verify this? No. After all that philosophizing about how great formal verification is, I'm not really interested in formally verifying this chip. We'll formally verify it when we replace it in the sequencer.

All right, let's go ahead and replace this. First of all, what I'm going to do is create, in the sequencer card, chips, as a Boolean with a default. So the idea is that here I can say self.chips equals chips, and then when I create a sequencer where chips is true, it will generate the code using chips.

"What other tools have I used for development? Is nMigen the best you've used?" Okay, well, I haven't used all that much. I've done a little bit of VHDL and a little bit of Verilog, and I got really annoyed at it because, well, it's a new language and I hated it. I didn't like Verilog because it was very Ada-like and I never studied Ada, and I didn't like VHDL because it was just enough like C to be annoying. So I decided to look at alternative HDLs, so-called new HDLs, and nMigen was the one that I liked, because it was Python based.

Okay, where was I? Right, so I now have this Boolean that I can set when I create the sequencer, called chips. When chips is true, I can just substitute out all this stuff. So here's what I'll do: if chips, call self's decode immediate with chips. Because remember, nMigen is a
generator of code; it's not the code itself. If self.chips. Okay, so now we need to define decode immediate with chips. Right, so decode immediate with chips. What we're going to want to do is create... I don't know if this is going to work, but what I want to do is create two instances of the IC, so there is IC 7416244... actually 12 of them. Yeah, 12 of them. So buffs equals... so now I have an array, a Python array, not an nMigen Array, of these things. And also I want a GAL: gal equals IC GAL imm format decoder. I need to import those, so from IC 74-this import IC-that, and from IC GAL... I'll probably end up just putting all the GALs in a single file so that I can just do "from IC GAL", but for now I can't be bothered. All right, where is m? And m.submodules plus-equals gal. Okay, so those are the submodules.

So now what I want to do is code this according to the circuit. Oh boy, oh boy, oh boy, how do we do this? Well, the first thing I can do is hook up the GAL signals; I know how to do that, right? So I have self.opcode, that's just one of the signals that exists in this module. So what I'm going to do is say, combinatorially, gal.opcode equals self.opcode. Done. And then I want to route the outputs. Let me make some intermediate signals first. So let's do i_n_oe equals Signal, and we have I, S, U, B, J, and sys. Did I not make opcode publicly available? I'll fix that later. It's underscore, pseudo-private.

Okay, so m.d.comb: buffs[0].n_oe0 equals i_n_oe, and the same for one, two, and three, and the same for buffs[1], which is the other one in the pair. Now I need to copy this for the other 32-bit buffers, and there is a way to do that, because we're using Python. For i in... well, let's do this: a list equals I, S, U, and so on, and then for i in range(6). Now instead of this, we're going to index into that list, and instead of zero it's going to be i times two, and this is i times two plus one. What I'm doing here, as I'm writing the logic, is just writing some Python to create
the logic in a regular, programmatic way. Okay, and I suppose that if I had these as arrays I could make it even simpler, but whatever. Okay, so that hooks up these signals. Now I have to route the GAL's signal outputs. gal dot... right, then... oh, middle-mouse dragging is even better. S, U... okay, so that's the GAL hooked up. So I've got these intermediate signals, those signals get routed to the eight output enables, the opcode goes to the input of the GAL, and the output of the GAL goes to these intermediate signals. All right, what's next? Well, shuffling around the instruction bits. So I am going to have to go through each of these buffers. My keyboard is nice and clicky. My keyboard is a Das Keyboard. It's got very nice clicky keys, and believe it or not it's not the clickiest that it could be. Okay, so what I'm going to do is... let's show KiCad again. Okay, so here is I, so this is how we split up the instruction for I. So now I'm going to have to think a little carefully about this. What I want to do is create some more intermediate signals. So i32 equals Signal(32), so this is going to be the input for the I set of buffers. Okay, because I know that I'm going to screw this up. So S, U, B... So what I want to do is, just like I routed these buffers, I want to route these input signals. So what I'm going to do is something like m.d.comb += buffs[0].a0 is equal to i32[0:4]. So that's the Pythonic way of selecting the first four elements out of i32, so this is just bits 0 through 3 inclusive, right? And then I have a1, a2, a3: 4 to 8, 8 to 12, 12 to 16. And of course I'm going to do the same thing for the other 16-bit buffer in the pair, except now this is going to be 16 to 20, 20 to 24, 24 to 28, and 28 to the end. Okay, and now I'm going to do the same thing, so input32 equals i32, s32, u32, b32, j32, sys32, for i in range(6), just generate those things. So input32[i]. Okay, so that's those intermediate signals. Okay, so now that I've routed the inputs into the buffers, so
they've sort of spread out now throughout the buffers, now what I can do is put together the inputs. So I can say something like, for format I, what have we got? So i32: m.d.comb += i32, so the first 12 get instruction 20 to 31, and then the others, from 12 to the end, equal whatever self.instruction[31] is. So what I'm doing here is I'm taking bits 20 through 30 inclusive and I'm putting them in bits 0 through 11 inclusive. So I'm basically taking this set of 11 bits and sticking them in the least significant 11 bits of the 32-bit input, and then what I need to do is replicate, because I'm doing sign extension. So I want to replicate this 31st bit into the rest of the bits, and the way to do that in nMigen is using the replicate function, Repl, which replicates this for, let's see, 32 minus 11 is... fencepost error, right? Right, 32. Okay, yeah, sure, fine. Well, I mean, if I have too many bits, nMigen is going to truncate to the correct number of bits, so I could even make this 32 if I wanted to. Let's do that. Okay, is Repl imported? I believe it... oh well. Okay, so first of all, this is state, that's where I put it. Is Repl in here? Okay. Is there an error that showed up somewhere else? Why is this showing up as an error? Why does it care? Is that a warning? "Unable to import"... oh, that's because for some reason I put it in this file. You know what, I'm going to rename this file. Just rename... IC_GAL. IC_GAL. Okay, that'll make things easier on me. "The range does not add up for the first assignment." Let's take a look. Do you mean this? You are correct, that's what it should be. Okay, and now it's complaining about i32, because I called it i_32 for some no doubt ridiculous reason. Okay, so that's format I, so this is the sign extension. Okay, get it out of your system: sign extension has a funny abbreviation. So the next format is S, so let's go back to KiCad, and we can see that this is how the
thing is broken up. So whenever you see one bit going into multiple bits, that's actually a sign extension, and if you look at this zero over here, that's because we're stuffing all zeros into this part of it. So, Python likes underscores. I like underscores too. I also like dashes, dashes are fun. So anyway, now we want to take care of S. m.d.comb +=, so now S is s32. The module and the plus-equals operator on it... it seems like it is, actually. So just a bit of explanation of what this actually is. So m.d.comb, which is the combinatorial domain, is basically just an array of nMigen statements. And eq: when you take a signal and you slice it, the result is a signal, and when you do .eq on it, that becomes a statement. So it's basically just an array of statements. Okay, so s32, let's see. So from 0 to 5 we're going to stuff in... "variable"? Note that this is not an intermediate signal, this is a Python local variable. So this is instruction 7. I could just do this, and again nMigen is just going to truncate it properly, so sure, why not, just do that. Yes, the video will be saved to the channel, no worries there. Okay, so from 5 to 11 is the instruction starting from bit 25, and then we're going to have a replicate, so s32 from 11 on is equal to Repl of instruction 31, replicated how many times? Do I really care? No. So that way it'll replicate it out to 32 bits and then truncate it into whatever it's being assigned to. Okay, so that's format S. Let's handle format U. This is upper. So for this one I want to put zeros into the lower 12 bits, so 0 through 12 equals 0. I don't have to replicate this, because 0 is actually an integer. And u32 of 12 and on is just equal to instruction of 12 and on. What's format B? Fairly complicated. So the 32-bit input to the B buffers: bit 0 is 0, then 1 through 4... it's really error prone to translate between inclusive and exclusive bit boundaries, so no
doubt I'm going to get something wrong, but anyway, let's see. So this starts from instruction 8. b32, so from 5 to 11... there is actually an alternate way of specifying a beginning and a number of bits. There is an nMigen function that you can call on a signal which will give you back a slice, and you just give it the starting bit number and the number of bits. I think it's called bit_select. So instead of what I just did you could do bit_select(5, 6), and that will give you 6 bits starting with bit 5, which may be less error prone, but I sort of got used to this. Okay, so 5 to 11 starts with instruction bit 25, bit 11 is just equal to instruction 7, and b32 12 to 31 is just sign extension, so it's just going to be equal to instruction 31, sign extended. So instruction 31, that's the most significant bit, which is essentially the sign bit of a signed signal, so that's why I'm replicating it, so that it's sign extended. Okay, format J. Okay, j32. Let's see, 0 is again equal to 0. s32... so from 1 to 10 inclusive starts from instruction bit 21, j32 11 is equal to instruction 20, j32 12 to 19 inclusive starts from 12, and from there on, replicate the sign bit. 21... okay. And then finally we have format sys. So sys32 of 0 through 5 equals the instruction starting from 15, and the rest is... okay, those are the inputs. Okay, so let's go back to KiCad, and you can see that what we've been doing is putting together these signals, so these are the inputs to these buffers. So this was all the code that we wrote: we're getting it from the instruction, we shuffle the bits around, and we construct these individual things. And previously we wrote code to take these individual signals, these nets, and put them into the buffers. So now all we have to do... "Error in line 692." I'll get to that in a moment. So now all we have to do is hook all of these up together. Thank you, thank you very much. Hopefully I would have caught that during formal verification and it would
show me something. So all right, now we have to hook all of the outputs together, and the way to do that is... okay, so we need to stick them all on self._imm. self._imm equals just the OR of all the outputs. Well, I don't have intermediate signal outputs. It would be nice if I did. So let's just say output, i32... "There is another error on line 692." Another one? I don't see it. This was 692, right? These J's? Well, hopefully you can tell me what it is, because I am not really seeing it. Okay, let's go ahead and grab... copy it, output32, and these are the O's. "20, not 21." Let's see, yeah, that's right, that should be 20. Double-check the circuit: format J should be 20 to 31. Yep, thank you very much. Okay, again, formal verification would have caught that, but thanks for watching out for me, because that saves me some time. Okay, these are the output signals. All right, so I can just take this, stick that here, and then I just need to shuffle these things around. So output32[i], 0 to 4, and so on up to 12 to 16, on outputs 0, 1, 2, 3. "You can drag selected text." You can, you can! Does this look right? Yeah, so this is the first set of four buffers, this is the second set of four buffers, and I've got my output signals. Okay, so the immediate value is simply equal to all of these OR'd together. Is that it? I think so. So let's just make sure we've got this. We have six 32-bit buffers, which translates into twelve 16-bit buffers, and then we have a GAL. So we add the buffers and the GAL to the submodules. We have some intermediate signals: we've got 32-bit input signals and 32-bit output signals for each of the buffers, and we've got the output enable signals for each of the 32-bit buffers. We route the output enable signals to the individual output enable signals of all the buffers, same thing with the inputs and outputs. We route opcode into the GAL's inputs, and we route the GAL's outputs to the output enable signals. And then finally we construct the immediate values themselves and
multiplex them all together. All right, so in the sequencer card I have this chips Boolean that's false by default. So in the formal CPU, when I create the sequencer card, I'm going to change this to true, and now I need to run some formal verification. So let's open the command line. First of all, will it even compile? I have a makefile and I can select something, so let me select one of the instructions. What I've done is I've taken the instruction set and divided it up into, you know, easy, medium and difficult. Difficult would be the three-cycle instructions that do loads and stores, and the easy ones are the one-cycle instructions like AUIPC or LUI. I split them up in formal verification so that I can verify just one of the instructions, and I went into the philosophy of why this is done in episode 7, the previous video. The other nice thing about splitting things up is that then I can run them all in parallel by simply doing something like make -j8, which will run eight formal verifications, each one on a different set of instructions, at the same time. So I'm just going to do this and hope that it compiles. It compiled, it's running... oh, got a bug. Let's see what the bug is. So it failed in 452, so let's go to VS Code and take a look at what 452 is. Sometimes that will give a clue to what the bug is, but if not... okay, 452. Uh-huh. So according to this, the result was unexpected. It looked at the destination register and it saw that the result was not what it expected. So the question is, where did I screw up? Let's see if we can try to figure that out. There are a lot of signals, though. Let's see if we can list them all. So let me start up GTKWave: gtkwave -f. So let's go to the sequencer and pick out a couple of the signals. First, what I usually like to do is just display... okay, so these are the two phases of the clock. All right, so some of our intermediate signals are listed, which is nice, and we've
done LUI, which would be a U mode instruction. So first of all, let's take a look at the immediate... let's take a look at the instruction. Where is it? Right here. Okay, so there is the instruction, it's F200D0B7. So what's the immediate supposed to be? Do I have the result here? I do. Okay, so the result is supposed to be F200D000, so that's what we expect. And what we can do is also look at the register that it was going to stick the value into, so there is an rd in here somewhere. Okay, so it was going to stick it into register one. So now I can look at regs_after and look for 01... looks like it put zeros in there. Okay, so obviously I got my wiring screwed up. Let's see if we can figure out which wiring got screwed up. So let's go back into the sequencer. All zeros is suspicious. That kind of means that nothing came out of the buffers. So the first thing is, did I actually enable the buffers? So the negative output enable should be... U, u_n_oe. Nope, it's one. Okay, so that's definitely a bug. All right, so where is u_n_oe? That's this u_n_oe. So that means that the GAL is not GALing. So there's the opcode, let's just take a look at _opcode. 37. Okay, let's take a quick look at the GAL code. Copy of... I didn't name it, so I don't know which one of these submodules it is. That's okay. So by default it's this, and here's my input, and I expect 37. Which one is 37? Where is U? I don't see 37 in here. Did I screw up the table? So this is I, let me just comment on this, I. Okay, did I get my table right? Let's take a look at... oh yeah, I'm sorry, you're not actually seeing VS Code, are you? There we go. So this is the GAL code, and I'm looking at U. The opcode is 37 and that's not any of these... oh wait, this is 37, yeah. Oh, you know what I did? I forgot to specify what domain the signal changes in. That's m.d.comb. Yeah, opcode is seven bits. The bottom two bits are
always one, in the spec. So all right, try it again. Hey, I like this game. Okay, that worked. Let's try another short instruction, AUIPC. This is taking longer for some reason, but it worked. Okay, that's two down, a bunch to go. So what I would do now is just run all of the instructions. That's going to take 13 or so minutes, so let's get it started. Let's start it with eight processors. Some of them are going to complete quicker than others, and my 13-minute estimate is from when I didn't have the chips in. So anyway, I'll just get things started. Let's see what else we can do. So these are all the shards that I basically sharded things into. All the loads and the stores are separate verifications. There's fatal, which is for fatal conditions like misaligned addresses, there's interrupt requests, the various jumps, CSR, ECALL and EBREAK (beer break and coffee break), OP and OP-IMM, and the upper immediates. So are things doing... is it happy? It's pretty happy so far. So, time? Yes, okay, sure: time make -j. That's right, I always forget to add time. Yep, okay, excellent. I didn't waste too much time, I wasted maybe a minute, that's okay. Okay, let's see what else we can do. Well, I'm not going to mess with the code. Let's take a look at KiCad again. So this is what we built in software, and it was a little painful, again because we had to map abstract signals onto real chips. So that was kind of painful, but if formal verification works, it works, and that means that I can be assured that if I build this using the real circuit, using real chips... "Time passes." Yep. Wait, time passes, well played. If I build this using the real chips, I can be pretty assured that at least this section of code is going to function correctly, again modulo any timing issues. So that's pretty cool. Let me show you the level above this. Wow, this is pretty cool, and a little small, but hopefully our
thing's doing... let me just take a quick peek. Things are pretty happy so far in formal verification. So this is sort of like the top level of the sequencer. You can't really see it, but up at the top these are some CSR registers. There are seven of them that I had to stuff into the sequencer, and this was just so that I could handle interrupts and exceptions, so I stuck those up there. They have some multiplexers, so they get multiplexed and then multiplexed again, because their outputs can go on to over here, so that's the multiplexer. So when I looked at the code, I saw that the X bus could get data from any one of three places: it could get data from the PC, it could get data from stored memory, or it could get output from one of the CSR registers. So yeah, rather than making a seven-input multiplexer, I just broke it up into two four-input multiplexers, because that's what I had lying around. Maybe I'll change that, I don't know, but that's the CSR. So basically what I did was I went through all of the code and I looked at what got multiplexed onto each bus, and these are the multiplexers that I came up with. So for the most part they're all four-input multiplexers, if you ignore the whole CSR multiplexer thing, so that's kind of nice. This one down here, this one right here is... oh, the doorbell rang. I'll bet that's a package. It can wait. We don't really get porch pirates around here, he said, hoping that there were actually no porch pirates. Well, anyway. So rd, this is the multiplexer for the destination register number. So it can either be the destination register or source register one or two from the instruction, or it could be zero, in the case where I don't want to write to a destination register. So that's what these multiplexers do. And then over here you can see some of the internal registers
that I've started laying out. So this is the PC, the program counter, this is the memory address, this is the data if we write something to memory, and this is just a temporary register which I found... "Get your package, it might be cake." I'm pretty sure it's not cake. You want me to get my package? Fine, I'll get my package, and I'll leave you with this. Okay, let's turn this into an impromptu unboxing session. Maybe you can see those, but those are spray tops and glass bottles. Why, you might ask, am I getting spray bottles? Because I have a Madagascar hissing cockroach as a pet, and they require high humidity. So this is a spray bottle that I can fill with water and then just spray his container and increase the humidity. Yeah, I've got this smoking humidifier, and it doesn't humidify the entire room, so I need to mist his cage every so often. That's why I got this. Unboxing over. Oh, and I didn't get the plastic ones because I felt like splurging, and I felt that maybe the plastic would degrade in the sunlight, so I got some glass bottles. I think I paid like 10 bucks for them. Anyway, anyway, anyway. Okay, so these were the internal registers, and the internal registers could get their data sourced from different places, so again we've got some more multiplexers. This is kind of a theme of the entire circuit. I wanted to build the whole thing out of as many multiplexers and buffers and registers as I could, because that's really easy. "Please, not the green face." Yeah, Jim Carrey, man. I like him and I hate him. I like him because he's funny, but sometimes he's a little bit over the top. Many times he's over the top. I mean, he's just, you know, out there. Anyway, enough of that. So this is what I've laid out so far for the sequencer. Things like the register card and the shifter card, those are pretty straightforward, but the sequencer is like... what is the name of
your cockroach? His name is Hamster. Why? So that I can ask people if they want to see my hamster. Okay. Oh, Dumb and Dumber, don't even... I did not watch that movie, because to me watching people be dumb is not funny. I just don't find it funny, I find it pathetic, actually. Anyway, don't get me distracted. Okay, so the sequencer is the most irregular of the circuits that I'm going to build. So things like the register card, the shifter card, what else is there... I'm missing one of the cards... the ALU card, right. So those are more regular, in that for the most part they're just a bunch of buffers and then the actual chip functionality. "So just go outside"... no politics in the chat, guys, no politics in chat. So I did kind of want to get started with the sequencer card first, just to see what things would look like, and to get started on a little more interesting circuitry. The other thing is that you can see that we've got these basic modules here, so we've got like a four-input 32-bit multiplexer. So if I can build a little module that is a four-input 32-bit module, maybe I could just build a bunch of them and maybe stick them on cards or something like that. Yeah, the shifter is definitely a bunch of buffers, it's all buffers, basically. So yeah, memory card, yeah, I haven't even designed that yet, but I have the memory interface, so as long as it conforms to the interface, it'll be fine. Oh, and that's another thing: I'm planning on using static RAMs, because they're cheap. It's a 32-bit machine, which means that at most I will have four gigabytes of memory, unless I do things like have processes, but I don't, because I don't have a supervisor mode and I don't have a user mode, I just have a machine mode. So I'm doing bare-metal programming, so the most RAM I can ever have is four gigabytes, and SRAM of that size is
well, maybe not easy to find, but I don't have to go the route of dynamic RAM, which I absolutely hate, because you need refresh circuitry and it's slower. Static RAM is faster. So, eh, whatever, I probably won't need four gigabytes of static RAM. Maybe a few megabytes, good enough. So anyway, there are these little modules: there are these multiplexer modules and there are these register modules, and maybe I could build one circuit for each of these and then just duplicate them, and then somehow, I don't know, be filled with winning or something. Yeah, with all the buffers there is propagation delay. That's why I decided to use the FCT series, so we're talking about a propagation delay of something like five nanoseconds. So if you multiply that out by, I don't know, 10 levels, that's 50 nanoseconds, which is still within my 10 megahertz target, which is 100 nanoseconds. So yeah. The KiCad schematic is not generated from the code, this is totally manual. So I created these symbols, and then I'm just wiring these up. This isn't even a valid schematic, this is just sort of like sketching with pen and paper. So yeah, once I am able to formally verify everything, then I will actually design some actual circuits. Let's go to the command line and see what's going on. It's still going. It hasn't listed out the results yet. It just looks like it's hung, because it's actually... where is it actually waiting? I think this is the last thing it's doing: waiting for solver, for fatals. Fatals always took the longest to verify, because they actually test every instruction. They don't test the results of the instructions, which the other shards do. This just tests that if an instruction goes fatal, fatal exception handling actually happens. And unfortunately it takes the longest amount of time, and it's
only been five minutes or so... sorry, 12 minutes, so a little more. So let's go back to KiCad. Yeah, okay, so this is what the sequencer card looks like. Right over here is the immediate decode hierarchical sheet, so if I click into that, that's what that is, and then of course each one of these is one of these 32-bit buffers. So already I'm beginning to break this up into chips, so there's the FCT series chips. "Why are the three second-tier multiplexers on the left duplicated?" Yeah, they have identical inputs but not identical outputs. So if I go up to the top, this is the register number for the X bus, the register number for the Y bus and the register number for the Z bus. The way it works is that when there is a register number present on one of these three buses, the register card puts the contents of that register onto the X bus or the Y bus, and if there's a Z register present, it loads that register with the contents of the Z bus. So that's why there are three multiplexers, because I need to specify three registers: two source registers and a destination register. And these come from the instruction: rd, rs2 and rs1. Sometimes they need to be swapped around. Sometimes you write to the destination register and then on the next machine cycle you read from the destination register, or vice versa. So that's why they all get pretty much the same input, with the exception of this other one, because sometimes I don't want to read from a register, or I want to load the bus with zero, which is what register zero is. So that's that. Okay, so what was I going to do? Oh yeah, so here's the instruction decode. So let's take a look at the code. Sequencer card, there's the instruction... okay, so here's the instruction decode. Really all it is is rewiring the instruction, right? So what that looks like in the circuit itself... I mean, there aren't any chips,
it's just a strict wiring job. So all I'm doing is basically renaming the bits of the instruction signal into the bits of these other signals, so that's all that is. And again, this isn't a valid schematic. These net names are probably all wrong, and they would probably completely fail design rule checking and electrical rule checking. Again, this is just to map things out so that when I'm ready to do the real thing, I know where all the signals are supposed to go. Yeah, your bag of GAL20V8s, you should definitely do something with them. Probably the most canonical project to do with a GAL is to make a 4-bit binary to seven-segment decoder. That way you can display the digits 0 through 9 and A through F with a GAL. The thing about GALs is that there are some packages on GitHub, one is written in C, one is written in Rust, to actually generate the bitstreams. You still need a programmer. Supposedly the TL866II can do it. I haven't actually tried it yet. The last time I tried it, it failed miserably, but that was two years ago, and from what I understand they updated their algorithms. So yeah. Cold coffee. How are you all doing in the chat? Doing good, I hope. So let's take a look at the command line again. Oh, we're still waiting for the solver. We're still waiting for the solver, it's been 17 minutes. It may take a while, but at least everything else passed, which probably means that fatal is going to pass, because the only thing that I changed was the immediate values. So yeah, I think we'll just let this finish. Let's see what else we can do. VS Code lightboard... what do you think it's going to be? You know, I think we know what it's going to be. This is the cat circuit: it takes X and knocks it off the table. There's not really much else that I wanted to do today. I really want the formal verification to finish up, and basically this is what I wanted to
accomplish today. I did want to take a small part of the circuit, a relatively simple part of the abstract logic, and convert it into real chips, and then rerun formal verification to prove that my mapping from abstract logic into real chips does actually work. And so far it looks like it did, so that was a pretty neat accomplishment. So yeah, basically I'm able to go from the chip version to the abstract logic version, back and forth, if I need to, simply by setting this Boolean. Okay, I think we're done. We are done, we have PASS on everything. So this column here is the number of seconds that it took, so we can see that fatal took 1200 seconds, which is something like 17 minutes or so... in fact it says 20 minutes. Okay, 20 minutes. So LUI, being such a simple instruction, only took 13 seconds, AUIPC took 34 seconds, so most of these things took something like under five minutes to do. Unfortunately fatal takes way too long. "Would there be a way we can donate CPU power to help verify?" Well, no, I mean, there's a long pole here, and the long pole is fatal. I could, I suppose, break up fatal into checking different sections of the instruction space. I could divide it into four, maybe. I could do that and just call it fatal one, two, three and four, and then it should pass in a quarter of the time, so maybe I'll end up doing that. So you can see that in real time it took 20 minutes total. So if I actually eliminate fatal... whoops... if I actually eliminate fatal as the long pole, the next longest poles, it looks like, are IRQs and OP, and then it'll take something like seven minutes or so, which is fine. So yeah, I guess that's all I wanted to do. All right, I'm not a professional streamer, so I don't have a stream close video. I only have
a stream open video that I made this morning. So I guess that's about it. I hope you enjoyed listening to me waffle on about electronics and coding and formal verification and unboxing glass bottles for hissing cockroaches. And I will probably be doing more streaming, because it is the end of the year and I am pretty much off for the last two weeks of the month, so this is a vacation, so I can stream and do other stuff. And hopefully you can put me in the background, because listening to every single word I say, you know, you can either do that or you don't have to. I often like to listen to other people just do work in the background, so it's sort of like I'm around other people doing work, and I'm okay if you do that too. So I guess that's about it. Thanks for watching, hope to see you on the next stream, and remember, you're hardcore. See ya.
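
For anyone following along afterwards, the immediate decode built on this stream can be sketched in plain Python. This is only a rough model under stated assumptions, not the nMigen code from the stream: the helper names (bits, sign_fill, imm_inputs, decode_imm) and the opcode table are made up for illustration, the table covers only a handful of standard RISC-V opcodes, and treating every SYSTEM opcode as the sys format is a guess. It mirrors the structure described above: one 32-bit value per format, an active-low output enable per buffer, and the immediate as the OR of whatever the enabled buffer drives.

```python
MASK32 = 0xFFFFFFFF


def bits(value, lo, hi):
    """Return value[lo:hi] with hi exclusive, like a Python or nMigen slice."""
    return (value >> lo) & ((1 << (hi - lo)) - 1)


def sign_fill(instr, start):
    """Replicate instruction bit 31 into bits start..31 (sign extension)."""
    return ((MASK32 << start) & MASK32) if bits(instr, 31, 32) else 0


def imm_inputs(instr):
    """Build the 32-bit input of each format's buffer from the instruction."""
    return {
        "I": bits(instr, 20, 32) | sign_fill(instr, 12),
        "S": bits(instr, 7, 12) | (bits(instr, 25, 31) << 5) | sign_fill(instr, 11),
        "U": instr & 0xFFFFF000,  # lower 12 bits are zero
        "B": (bits(instr, 8, 12) << 1) | (bits(instr, 25, 31) << 5)
             | (bits(instr, 7, 8) << 11) | sign_fill(instr, 12),
        "J": (bits(instr, 21, 31) << 1) | (bits(instr, 20, 21) << 11)
             | (bits(instr, 12, 20) << 12) | sign_fill(instr, 20),
        "sys": bits(instr, 15, 20),  # 5-bit value, the rest is zero
    }


# Hypothetical stand-in for the GAL: opcode -> format whose buffer is enabled.
# Opcodes that are not listed (for example OP, 0x33) enable no buffer at all.
FORMAT_BY_OPCODE = {
    0x03: "I", 0x13: "I", 0x67: "I",  # LOAD, OP-IMM, JALR
    0x23: "S",                        # STORE
    0x17: "U", 0x37: "U",             # AUIPC, LUI
    0x63: "B",                        # BRANCH
    0x6F: "J",                        # JAL
    0x73: "sys",                      # SYSTEM (assumed to always use sys)
}


def decode_imm(instr):
    """OR together the outputs of the buffers whose output enable is low."""
    selected = FORMAT_BY_OPCODE.get(bits(instr, 0, 7))
    imm = 0
    for fmt, value in imm_inputs(instr).items():
        n_oe = 0 if fmt == selected else 1  # active-low output enable
        if n_oe == 0:
            imm |= value
    return imm


if __name__ == "__main__":
    # The LUI instruction from the failing trace on the stream.
    print(hex(decode_imm(0xF200D0B7)))  # prints 0xf200d000
```

The LUI instruction from the waveform, F200D0B7, comes out as F200D000 in this model, which is the result the formal check expected to land in register 1. With no buffer enabled the model returns zero, which matches the all-zeros symptom seen while the U output enable was stuck at one.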