Good afternoon, everybody. How are you all doing today? Good. For the people who are listening to this recording and did not turn up in class today: I will be using the blackboard much, much more, so most of it might not end up in the recording. Try to attend tomorrow.

Today we will start looking at the microarchitecture. So far we have seen how to design sequential circuits and combinational logic circuits, and last week we saw the architecture. If I ask you what an architecture is, what do you answer? It is basically an instruction set and a bunch of registers which hold the architectural state, right? We also saw some instructions, like R-type, I-type, and J-type, and hopefully the assembly programming part was completed last week, so we saw how a sequence of instructions is going to be executed. What we are going to see today is how each of these instructions is actually implemented in hardware. We'll see an overview of the MIPS microprocessor and its microarchitecture: how each instruction that we saw eventually gets fetched, decoded, and executed, how you get back the results, how the subsequent instruction is executed, and how all of this is done in hardware. So basically what we are going to see is the microarchitecture part. We have been working our way up.
Last week we jumped to the architecture, which, as I said, is the set of instructions available to the programmer, along with a bunch of registers which hold what we call the architectural state of the system. Today we will cover the microarchitecture, that is, how each of these instructions is implemented in hardware, and we will see the flow from the machine code that you saw, instructions getting converted to machine code, to how that is decoded and implemented in hardware. This is a very fundamental step for understanding the remainder of the course. We will start today with the single-cycle architecture and continue tomorrow, and in the next weeks you will see what a multi-cycle architecture is and how it is implemented, and what pipelined architectures are, as Professor Onur Mutlu mentioned last time. Most modern commercial processors are pipelined, but in order to understand pipelining you have to start somewhere, and that is single-cycle. There are not many commercial high-performance single-cycle architectures today, but this is fundamental, or at least it makes it easier for you to understand the remainder of the course.

So what is a single-cycle architecture? As the name suggests, each of the instructions you saw (add, subtract, load word, store word) is executed in one single clock cycle. Let me quickly show you in an illustrative fashion what this means. When there is an instruction, you first have to fetch it, so you do an instruction fetch. After you read this instruction, what do you have to do?
You have a bunch of machine code, hexadecimal values. Yes, you have to decode it. So you decode the instruction, and when you decode it you identify the registers you need, or possibly some memory locations, which are the operands. So you have some kind of register fetch, a register read. And after you have all the required values, what do you do? Execute, right. So you need an execution unit: depending on the operation you have decoded, you need to perform the correct operation. Then you either store the result back into a register or into memory, or you do no store operation at all and just go to the next instruction.

In a single-cycle architecture, all of these steps happen in one clock cycle. Every instruction executes in one cycle; the next cycle comes in and you have the second instruction. The bottleneck is that the slowest instruction to execute in the processor defines your clock speed. In a multi-cycle architecture, as I have shown, one instruction is split up into multiple steps.
Each of these steps is potentially executed in one cycle. So for the same instruction you have the instruction fetch in one cycle, decode in one cycle, execution in one cycle, and so on. With this kind of scheme it will take probably two or three cycles to execute a single instruction.

As we go on to pipelining, what happens is this: what I have drawn can be considered one single pipeline, and in a pipelined architecture you will have multiple of them in flight, multiple fetch, decode, and execute sequences all operating in parallel. So you are now going to process multiple instructions at the same time. What this means is that the architecture gets much more complicated, because you have to resolve dependencies. For example, if you do an add followed by a multiplication, and the second operation needs the result of the first, you need to somehow check for these dependencies and do some reordering and things like that. You will see all of this when we get to the multi-cycle and pipelined architectures; there is much more logic to implement as we go to multi-cycle and pipelining. This week we start simple, with the single-cycle architecture. Okay.

Any questions so far? No. So again (I have to keep raising this board) we go back to this scheme of operations. Where do we start? What does our microprocessor do?
You fetch instructions first, right, but you have to fetch them from some location. So this whole bunch of instructions is stored in a memory; that's the first precondition that we always assume. We start from here: we read one instruction, and we decode what the instruction has to do. Exactly as I mentioned, we find the operands, whether they are in registers or in memory (I forgot to put memory here), so you find where to get the data from. Then you perform whatever operation is needed, add, subtract, load, or whatever it is, and you write the result back, again if necessary. Then you go on to the next instruction. These are the steps our microprocessor has to do in order to execute one single instruction, and a set of instructions becomes a program.

So let's start with what we want. I'll use this part of the board for the hardware that we need. We start with instruction fetch: we need to fetch instructions from memory, which means that we need what we'll call, just for simplicity, an instruction memory. Then what do we need? You are going to decode the instruction; we'll come to that. Then you need to find, or fetch, the operands, so you need a register file. We already saw the register file when we looked at the architecture. How many registers does MIPS have? 32, perfect. And what is the data word size, how many bits is each register's content? 32 bits. So you have 32 registers, each holding a 32-bit value. So you need a register file; we'll come back to this. What else do we need? For execution we need an arithmetic logic unit, something to perform our add, subtract, and other operations. I will just draw a block here and call it the arithmetic logic unit. What else do we need? Is that all?
So we have a register file, we have an instruction memory, and we have something that does the operations for us. Yes, good: a control unit, which generates the appropriate signals. It's needed, but we'll come to it. Something is still missing here. We talked about the register file; what about when we want to fetch data from somewhere else? No, still missing something. Yes, good: we need a data memory. Depending on the architecture, instructions and data can both be in one single memory unit, or they can be separate; for easier understanding, let's simply say we have an instruction memory and a data memory that are separate. Okay.

So these are pretty much the main pieces of MIPS: an instruction memory to store the program, a register file, an ALU to perform the operations, and a data memory to store more data. We will see the control unit as we go forward. We're still missing a few minor pieces. First, you need a program counter. Do you remember the program counter? It's basically a 32-bit register. Let me try to draw it somewhere here; it's easier.
It's basically a 32-bit register which holds the address from which the next instruction has to be fetched. This keeps increasing; typically it increases by four, and if the instruction is a jump, then it will hold the address to which we have to jump to fetch the next instruction. Of course we also need some logic to decode the instructions, and we need a way to manipulate the program counter if we encounter a branch. As I mentioned last week, you normally step through the instruction memory by four, because the memory is byte-addressed and each instruction is four bytes wide. But sometimes you cannot simply step by four, because you encounter a branch and have to calculate a completely different address. So you need some logic for manipulating the program counter for branches.

So these are the main components, the basic components, of MIPS. What you are going to see today is how all these things come together and how instructions are executed, and you will also see all of this in the lab. You'll do it yourself, starting from the arithmetic and logic unit next week, up to executing your own assembly-level program on all the components that we are seeing. So you will have practical experience as well of how these things come together. Good.

Before we go further: each of these different components is going to implement some kind of state machine. Remember state machines? We saw them a couple of weeks ago.
Just as a quick revisit: remember that a state machine has a current state, held in a register; then you use some combinational logic to compute the next state; you have control signals; and you have a clock, and at every clock edge you move from one state to the next depending on the control signals and the data. Some of the components are going to implement exactly this kind of design, so it's good to revisit it and understand how it works.

So let's get started. Let's start with a sequence of instructions that is stored in memory. Consider this to be the instruction memory; as I mentioned last week, the memory is typically drawn bottom-up. Whenever you have a reset, the program counter starts at a fixed memory address. This differs between architectures; for this particular MIPS architecture we fix it at this value. For our educational purposes, let's keep the instruction memory read-only. In modern architectures it is both readable and writable, because you load new programs as you execute, but for this architecture we keep it read-only. And again, memory addresses are 32 bits wide, so you can typically address 4 gigabytes of data.

Now we start with the most fundamental block for executing any instruction in the microarchitecture: the program counter. Let's see what we need in a program counter. If I draw here, is everybody able to see it? The last rows, are you fine? Okay. So what do we need in a program counter? What does it do? Let's start with the output. What should be the output of the program counter? Yes, it's the instruction memory address. Let me call this the PC_next address, and it is going to be a 32-bit value. What's the input?
Or rather, what's the current data? It should be the other way around; I'll come to it. So what should be the input? Yes, we have a reset, sure. Let's go with that: a reset. What else do we need? Yes, a clock, brilliant. So we need a clock. And what's the data input? It's going to be another address, right? Let's call it the PC address; which one you call current and which one next depends on how you look at it. This is also going to be a 32-bit value. So this is the block diagram, or module diagram, of your program counter. How do we implement it?

Exactly. In Verilog, this is pretty much how you implement it; you will see a strong similarity to FSMs. You start by declaring two 32-bit values, for the present-state and the next-state addresses, as we saw here. Then we need combinational logic to compute the next address, and it's simply an addition by four. So you have simple combinational logic. Actually, this is wrong. Guess the problem here; there is something I have completely overlooked. Yes, that's okay, that one is a minor issue, I think, but here we have a bigger issue, if I remember correctly. Yes, exactly: should it be a blocking... no, the other way around. It should be a plain equals sign (=); it should not have the arrow mark (<=). Anyway, then we go into the always block.
Remember the good old always block: at every rising edge of the clock you need to update your PC address, so your present state takes the value of the next state. At every positive edge of the clock you load your program counter value, and whenever there is a reset you assign your default value, which you saw, the hex value 0x00400000. And here you have an asynchronous reset.

What's an asynchronous... yes, sure, good question. It's more or less historical, because of some of the technologies we used in the past. But an easier answer is that it is quite easy to get glitches in the circuitry, short spikes of voltage, which would mean your system gets reset unnecessarily. It's more stable when you look for a transition from a high level to a low level, because that kind of glitch is very rare. That's one answer, but there are also other answers to do with how the transistors were implemented in the technology and things like that. You can say it is just to prevent glitching of the circuit, because it's an asynchronous reset. If it were a synchronous reset it would be fine, because the glitch would have to happen exactly along with the rising edge of the clock for the system to reset. But when you're implementing an asynchronous reset, a glitch at any point in time is going to reset your system. So it's for stability purposes, among other reasons.

Where was I? Yes, we just talked about the asynchronous reset, and with that we have pretty much implemented the program counter. So we have one part of the architecture done already. Let's go further, to where you actually start implementing each of these instructions and how they are executed. So, a quick recap again.
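Before the recap, the program counter just described can be summarised as a small behavioural model. This is only a sketch in Python, not the Verilog shown in the lecture: the class and method names are made up for illustration, and the reset address 0x00400000 is taken to be the default value the lecture refers to.

```python
RESET_PC = 0x00400000  # assumed to be the reset address named in the lecture


class ProgramCounter:
    """Behavioural model: a 32-bit register plus 'next = current + 4' logic."""

    def __init__(self):
        self.pc = RESET_PC

    def next_address(self):
        # Combinational logic: next-state address, wrapped to 32 bits.
        return (self.pc + 4) & 0xFFFFFFFF

    def reset(self):
        # Asynchronous reset: takes effect immediately, no clock edge needed.
        self.pc = RESET_PC

    def clock_edge(self):
        # Rising clock edge: the present state takes the next-state value.
        self.pc = self.next_address()


pc = ProgramCounter()
for _ in range(3):
    pc.clock_edge()
print(hex(pc.pc))
```

The model has the same shape as the state machine recalled earlier: one state register, one piece of next-state logic, and a clock that moves the state forward. Branches and jumps are deliberately left out here, as in the lecture at this point.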
You have three different types of instructions: R-type, I-type, and J-type. You have a question? Sure, we will come to that. This is exactly why I said this is for now, for the time being. I don't know how fast I will be able to go today, but by the end of this week this "for the time being" will change completely. Any other questions? We start simple; I don't want to put in all the logic straight away.

Okay, so we have three types of instructions, and it's always good to start very simple. So we start with a subset of MIPS instructions like add, subtract, and some logic operations; let's consider R-type instructions only. Then we will slowly add memory instructions and see how we have to modify our MIPS datapath circuit to enable it to execute memory instructions. After that we will further modify the hardware to enable execution of branch instructions. For example, as your colleague said here: why do we simply increment the PC, what happens on a branch? We will see exactly that point, how to modify the program counter to account for branch instructions as well, among other modifications we have to make. Then, in addition, we will handle immediate instructions, for example add immediate, and jump. So we will go step by step. I'll try to go as slowly as needed; if I'm too slow, let me know, and if I'm too fast, also.

Anyway, let's start with a quick summary of the R-type, just to get you back into the same frame of mind. What's an R-type instruction? Add, subtract. They have three operands. Let me switch on the other board; it's easier. So you have, let's say, an add instruction, and what are the operands? Let's say $s0, $s1, $s2. What does it do?
$s0 will equal the value in $s1 plus the value in $s2. These two registers are called the source registers, and this is the destination register. So let's start with this example. If you want to convert the instruction into machine code, it has the following fields (are you able to see the values on the board here?): you have an opcode, then the two source registers, rs and rt, then the destination register, rd, then the shift amount, which is useful for shift instructions, and then the function field. For an R-type instruction your opcode is always going to be zero, and the operation, whether it's an add or a subtract, is defined by the funct field.

Let's move forward. We have add $s0, $s1, $s2, and when you convert it into machine code this is exactly what it is: opcode zero, then the source and destination register names converted to register numbers (we saw this last week), and you get this as your machine code. This is what is stored in the instruction memory. Okay.

Now let's look again at what we need and what we don't need. What do we need for implementing this instruction? We saw the program counter; let's say it's almost done. What do we need for the add instruction? You need to perform the operation, right? So we need an ALU. What else? Exactly: you need the instruction memory and you need a register file. We will now see how these are going to be implemented.

Let's start with the register file. Let me try to go further up; I don't want to erase anything. I'm going to replace this control unit with the register file, sorry about that. So this is going to be your register file for now. What are the inputs? What are the outputs? Let's do the same exercise again. What do we need?
A clock? Yes or no? No clock? Why? We need a clock. What else do we need? What did I say the register file is? It's 32 registers, each register having 32 bits, right? So what do we need? Yes, some data in and some data out. Let's get specific. You have $s1 plus $s2 and you're going to write the result into $s0. So you read the value out of $s1, you read the value out of $s2, and you write a value into $s0. So what we need is a special type of memory: we need two data-out wires, right? You need two read ports and one write port. So let me go back to this place and say data_out_rs, because you're reading your source register; this will be 32 bits. And you need data_out_rt, the other source register, also 32 bits. What's going to be the input, what data are you going to be writing? Yes, but we haven't performed the ALU operation yet. You're correct, but in terms of the register file, what is it? The destination, exactly. So you have data_in_rd, right.

So we have the data now. What more do we need? Yes, what do we call it, a register selector? Fantastic. They're addresses, to specify the registers, right? So you need addresses. Is one address enough? You need three, great. So you need the address of rd, the address of rs, and the address of rt. Are we done?
Yes, you write into the register file. Let me start there. With add $s0, $s1, $s2 you're going to read the values in $s1 and $s2, which means you're reading the source registers, rs and rt. You read them out of the register file, so you need rs and rt as outputs; and when you write back the result you feed it into the register file, so you need that as an input. Understood? Are we done, more or less?

So pretty much we covered it. We didn't mention the 32 registers, but of course that's implicit: we need 32 registers. Every R-type instruction uses two registers for reading and one for writing, so we need a special memory with two read ports, which is why we saw two outputs, and one write port, which is the input part. And what is the Verilog going to look like? (You know, these two blackboards cannot be operated simultaneously; it's a problem. No? Oh really? Oh, super. You learn something every day.)

So you have the register file, and what do we need? We need 5-bit addresses: two to the power of five gives 32 different addresses. So you have 5-bit inputs for the addresses of rs, rt, and rd, and you have another 32-bit input for the write-back data. You have an input for write enable (we will see why it's required in a couple of slides), and you need outputs for the source register values. Then, to implement the registers themselves, you define an array of registers, and this is how you declare one. How you read this is: you have a register that is 32 bits wide, and 32 of them. So you declare 32 registers, each 32 bits wide. Then it's straightforward: for reading the source register rs out, you simply use the address, which is an input, to index directly into that array.
It's very standard, very similar to C programming. And you do the same for rt. For the write you always have a clock. This is pretty much the convention: you call this memory asynchronous read, synchronous write. There are of course memories with synchronous read and write; typically you don't have asynchronous read and write. So there are different types of memories, but the one we are going to implement has a synchronous write, because the write enable is checked at the rising edge of the clock and the data input is then stored into the register file. So when you do an add operation, your result is stored at the address of the destination register. Simple. So we now have a register file implemented.

Let's check once more. Are we missing something here? 32 registers, fine. Are we missing a register? Everybody loved that register last week; you had so many questions on it. Yes, perfect: we need to add the trick for the zero register. Remember, you had $0 in the instructions, and this is pretty much what you need to add. You say: if the address of a source register is zero, you simply output the value zero, otherwise you pick the value from the array. It's a ternary operator: if the rs address is nonzero, read rs from the array, or else just output zero. By this you are simply implementing $0.
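As a way to check the register file description against the add $s0, $s1, $s2 example, here is a small behavioural sketch in Python. It is not the Verilog from the slides: the class, method, and field names are illustrative, and the register numbers ($s0 = 16, $s1 = 17, $s2 = 18) and the funct value 32 for add are the standard MIPS conventions, assumed to match what was shown last week.

```python
class RegisterFile:
    """32 registers x 32 bits: two read ports, one clocked write port."""

    def __init__(self):
        self.regs = [0] * 32

    def read(self, addr):
        # Asynchronous read; register 0 always reads as zero (the $0 trick).
        return self.regs[addr] if addr != 0 else 0

    def clock_edge(self, write_enable, addr_rd, data_in_rd):
        # Synchronous write: happens only on a clock edge with write enable high.
        if write_enable:
            self.regs[addr_rd] = data_in_rd & 0xFFFFFFFF


def decode_rtype(instr):
    """Split a 32-bit R-type word into its six fields."""
    return {
        "opcode": (instr >> 26) & 0x3F,
        "rs": (instr >> 21) & 0x1F,
        "rt": (instr >> 16) & 0x1F,
        "rd": (instr >> 11) & 0x1F,
        "shamt": (instr >> 6) & 0x1F,
        "funct": instr & 0x3F,
    }


# add $s0, $s1, $s2: opcode 0, rs = 17, rt = 18, rd = 16, shamt = 0, funct = 32
instr = (0 << 26) | (17 << 21) | (18 << 16) | (16 << 11) | (0 << 6) | 32
fields = decode_rtype(instr)

rf = RegisterFile()
rf.clock_edge(True, 17, 5)    # preload $s1 with an example value
rf.clock_edge(True, 18, 7)    # preload $s2 with an example value
result = rf.read(fields["rs"]) + rf.read(fields["rt"])   # the ALU's job
rf.clock_edge(True, fields["rd"], result)                # write back into $s0
rf.clock_edge(True, 0, 99)    # a write to $0 is stored but never visible
print(hex(instr), rf.read(16), rf.read(0))
```

The zero check sits on the read side, mirroring the ternary operator described above: the array may hold anything at index 0, but the read ports never expose it.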
That's it. Got it? It's up to how the synthesizer handles it; you could pretty much say the register is just zero, but this is what it will result in anyway. When the Verilog is converted into hardware, this is pretty much the architecture you will get, so it's just a matter of how you describe it. You will actually be given some of these files as part of your exercises, and you might have to implement just the ALU. Then you will plug in the register file that we saw and the memories, which we will see later; you will plug all these things together in Verilog during your lab exercises, and you will see the whole MIPS actually working. So these are all going to be part of your lab exercises as well.

So we have the register file now. We need something that actually does the operations for you, which is the arithmetic and logic unit. It's the core of the MIPS processor. The ALU basically does a whole bunch of operations: adding, sometimes multiplication as well. You have shift operations.
You have logic operations and more of these. What it looks like is: it typically takes two inputs, you have one output, and you have an F, which is the function input that tells the ALU what operation to perform. The ALU has adders, and it has the logical operations AND and OR, all implemented inside. If you are interested in how these adders are implemented, you can look up the arithmetic circuits chapter of the book. We are not going to go into the details of how it's implemented in the processor core, but if you're interested you can see different ways of implementing adders, how multipliers are implemented, and how shifting is performed. There is an entire chapter in the book on this.

So you have two inputs, one output, and the function input to specify the operation. The function is a 3-bit input, and depending on its value the ALU performs the following operations. That's it. And don't get worried: this is what your ALU design will look like. You take some bits of F and feed them into the final multiplexer; you implement your OR gate and AND gate, and you have an adder. So this is the ALU, which has all the operations that you need. As I said, you will be implementing the ALU next week in the lab. The lab manual explains in detail why you have this architecture, and you will implement the entire ALU in Verilog next week. So don't worry if you're not understanding this; you will have a much more detailed, much more practical explanation in the lab next week. Yes, question? Okay, cool.

So the ALU does, in some sense, the real work in a processor. So far, so good. We have most of the components, but we still do not have a data memory. We can implement most of the R-type
instructions; assuming that you have the adders and the shifter implemented in the ALU, you can implement all R-type instructions, because you have the instruction memory and you have the register file implemented. We don't have the shifter, but if you want to add shifting you know what to do: you have to implement it in your ALU. So we have most of the things we need to implement all R-type instructions. We do not have a data memory. When we come back from the break we will go further and see how to add the data memory and how to implement branches, and hopefully tomorrow we will finish off with an example. Let's take the break and meet in 15 minutes.

So let's continue. Until now we saw some parts of the MIPS architecture. What we don't have yet is any way to read from or write to memory. We have a way to read and write the register file, but nothing for memory. We don't have operations on immediate values yet, and we also have no conditional branches and no jump instructions implemented.

Let's start with the memory. Again, let's go through the same exercise, where we walk through what is required for the data memory. So let's start with a data memory block. You have to focus on this part of the board. What's the input? What's the output? It's pretty much like the register file, right? So you have a simple data out, which is 32 bits. Whenever I put a mark across a line, it's a bus, and I write the width on top of it; this is a 32-bit bus, in case you're wondering. Then you have a data input, and you need an address. The data input is also going to be 32 bits. How wide is the
address here? For the register file we had 32 registers, and therefore the addresses were 5 bits. In MIPS we saw that you can address 4 gigabytes, good, which means 32 bits. And you of course have a clock. Is there something that I'm missing? How do we write into this data memory? What do we need? It's exactly like what we needed in the register file: you need a write enable signal as well. So this is pretty much what we need to implement a data memory.

In Verilog you would implement something like this. You have 65,536 values. You read this exactly like the register file: you have 32-bit values, and 65,536 of them, so it's an array. Your data out is a simple access to the corresponding address. Here the address is just 16 bits, but we will come to why later. Then, as we saw in the register file, you have a synchronous write: whenever write enable is high, you simply write the data at the input into the memory array. Straightforward and simple.

In principle you have up to 2 to the power of 32 bytes that can be addressed; in general you have smaller memories implemented, but that's not important, so don't worry about that sentence.

Now, when you want to read from the memory, you need to implement the load word instruction. Until now we saw how to implement R-type instructions, and now we are going to see how we can incorporate an I-type instruction, because load word is an I-type instruction. We also need to calculate an address, so let me write down what a load word instruction looks like. You have load word into a register, right?
It's a destination register, or, yeah, no, it's the other way around: it's RT, because you have only one. I'll come to that. So you have a destination register, then an immediate value, remember, the offset value, followed by a source register. Do you remember the syntax? What it does is calculate the address of the memory location from which you have to load the data into the destination register. So the effective address is, as I said on the previous slide, calculated from an immediate value and a register for a load word instruction. And to do this address calculation we are going to use the ALU, because you already have an adder implemented there. What other I-type instructions do you know? Add immediate, for example. So let's take add immediate. What's the source? What's the destination? You have $rt, $rs and then an immediate value. So in some sense an I-type instruction has three operands: two of them are registers and one is a 16-bit immediate value. And if you remember, when you translate an I-type instruction into the machine code fields, you have a six-bit opcode. Unlike the R-type instruction, where the opcode is always zero, here the opcode actually specifies the operation to be performed. Then you have the two register operands, and then you have the immediate value.
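The I-type fields and the address calculation just described can be sketched in Python as follows. The function names are hypothetical, and the sign extension of the 16-bit immediate is standard MIPS behaviour that is assumed here rather than taken from this part of the lecture.

```python
def sign_extend16(value):
    """Interpret the low 16 bits of value as a signed two's-complement number."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def decode_i_type(instruction):
    """Split a 32-bit I-type word into opcode (6), rs (5), rt (5), immediate (16)."""
    return {
        "opcode": (instruction >> 26) & 0x3F,
        "rs": (instruction >> 21) & 0x1F,
        "rt": (instruction >> 16) & 0x1F,
        "imm": instruction & 0xFFFF,
    }


def effective_address(rs_value, imm):
    """Base register value plus sign-extended offset, wrapped to 32 bits."""
    return (rs_value + sign_extend16(imm)) & 0xFFFFFFFF
```

For example, the word `0x8E080004` decodes to opcode 35 (load word), rs 16, rt 8 and immediate 4, which is how `lw $t0, 4($s0)` is encoded.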
So we are now going to see what changes one has to make to the register file and to the ALU to incorporate the I-type instruction, because you have registers and you need an immediate value. And remember that the address for a load word, for example, is going to be computed using the ALU, from the immediate value and the register operand value. So we need to make some modifications to the ALU and the register file. So to do that, to make things faster: unfortunately this part is in the dark, but that's okay. So what are the changes that we need to make in the register file? This is the register file that we talked about before the break: you have the destination register input, which is 32 bits, the three addresses, clock, write enable and the two read outputs. In order to implement an I-type instruction, the first thing is that your RS has to be added to the immediate value, because this addition generates the effective address. RS is always a read, so this is always going to be an output, no matter what happens. And if one register is written, it is RT and not RD. What does that even mean? It's here. If you have only one register write in an I-type instruction, you're not going to use RD at all. So this RD is not going to be useful, and we need to incorporate some kind of switching mechanism. And for the write data, the input that is coming into the register file, you're going to choose it either from the memory or from the ALU. Why? Why memory?
Because you're going to read from the memory, for example, in a load word instruction. So your data has to come either from the memory or from the ALU. Remember, I told you that the address here is the immediate offset plus the base address. So you can have either the actual data coming in from this memory location, or the value coming in from the ALU, because you're going to store the output of the ALU. For example, in add immediate, this value has to be stored into this register, so the data in is going to come from the ALU for an add immediate instruction. So you need to have this option. And not all I-type instructions write to the register file. Do you know any I-type instruction that does not write to the register file? Yeah, store word. Store word doesn't write to the register file, but it writes to the memory. So we need to incorporate all these changes into the register file implementation, and we also need to make some changes to the ALU, because you're no longer going to use the ALU just to compute on the source operands. You're going to use it to calculate a memory address as well, for example in the case of the load word instruction. And the result is also used as the address. So you're going to use the address that is effectively computed here as the address for the data memory in some cases. And also the second input is no longer always RT: if you have only one register input, then the other input comes from the immediate value. So you need some changes here to one of the inputs. In the typical case you have RS and RT going in here, right?
So you typically have two source operands for an R-type instruction. You have add rd, rs, rt, and this is going to be your RD; this is going to come from one of the source registers and this from the other source register. But for an I-type instruction one of the operands could be an immediate value, so we need to incorporate this change into the ALU as well. So these are the changes that you need to make to also implement the I-type instructions. And if you want to do a store word operation, it's also an I-type instruction. There's not much change in the way you calculate the address, because it's exactly the offset plus the base address. The only difference is that in a load word instruction the resulting data is written back into the register file, but in a store word you write it to the memory location. And in order to write to the memory location, we need to assert the write enable of the data memory. So these are some extra changes that we have to make to be able to use I-type instructions. Any questions? So in some sense what we need is the following. The ALU inputs have to be modified: the second input has to be either the source register value, which is RT, or the immediate value. The register file has to be modified: the write address is going to come either from RD or RT, so it's no longer coming from just one of them. It's going to come either from here or from here. And this data input is no longer coming from just one source: it's going to come either from the ALU or from the data memory output. So you need to have another option here. And again you have to have the write enable of the register file and the write enable of the memory. So these are the kinds of changes that need to be implemented. And all of them are muxes, right?
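The three choices listed above (ALU second input, write address, write data) can be sketched as two-input multiplexers. This is only an illustrative Python model; the select names `use_immediate`, `write_to_rd` and `from_memory` are placeholders invented here, not names from the lecture.

```python
def mux2(select, when_zero, when_one):
    """Two-input multiplexer: pass when_one if select is 1, otherwise when_zero."""
    return when_one if select else when_zero


def select_datapath_inputs(use_immediate, write_to_rd, from_memory,
                           rt, rd, rt_value, immediate,
                           alu_result, memory_data):
    # ALU second operand: the RT register value or the immediate value.
    alu_second_input = mux2(use_immediate, rt_value, immediate)
    # Register file write address: RT (I-type) or RD (R-type).
    write_register = mux2(write_to_rd, rt, rd)
    # Register file write data: the ALU result or the data memory output.
    write_data = mux2(from_memory, alu_result, memory_data)
    return alu_second_input, write_register, write_data
```

With the selects set for a load word (immediate operand, write to RT, data from memory), the function returns the immediate, the RT register number and the memory word.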
They're all simple multiplexers. You can have a control signal that chooses between a register value and an immediate value, or, here, whether you want to feed in the data coming from the ALU result or directly from the memory. So these are all simple control signals that need to be implemented. What we can do now is go step by step. These are the control signals; whatever we talked about in a very abstract way is now much more specific. And these are the control signals that we require for now to enable I-type instructions. So let's start with register write. What's register write? It's a simple one-bit signal that is the write enable for the register file: you have to write into the register file. So this one becomes RegWrite; this is directly connected to the control signal RegWrite. Now RegDst determines which destination register to write: whether you want to write your data into register RD or RT. You also need a control signal here to determine the source of your ALU operand, whether it's coming from a register or from the immediate value. This is ALUSrc, the control signal that shows which one to use. And memory write: if you are going to write a value into the memory, as in store word, you need write enable, and this is directly connected to the MemWrite control signal. And ALU operation is what operation you want the ALU to perform: add, shift, any logical operation and so on. So let's walk through an example. Let's consider the R-type. If it's an R-type instruction, then the opcode is all zeros. Are you going to write into the register file in an R-type instruction? So what's the standard form of an R-type instruction?
You have an add. You have three register operands. Consider the add instruction: you have a destination register and two source registers, which means you're going to write the result back into the register file. So your RegWrite is one. Similarly, since you're going to write into RD, RegDst is one, because it's an R-type instruction. And your ALU source is not an immediate value, it's from the register, so we indicate that by a zero. There is no memory write, because it's an R-type again. And the data is not coming in from the memory; it's coming from the ALU. And then the function field directly determines what operation to perform. So this is what the control table will look like when you implement your control signals, which is part of your exercise next week. No, a week later. You will actually implement all these signals and how they are generated. Now consider an I-type instruction like load word. It's an I-type, so you have a specific opcode for load word. Are you going to write into the register file? Yes or no, for load word? Are you guys following me? So how is load word actually written? You have load word, then a destination register. So you are going to write the value into this register, which means you are going to do a write operation on the register file. So your RegWrite is going to be one. Then, you are going to write into RT and not RD, because it's a load word instruction and you don't have the third operand. So you don't have a destination like the RD register; instead you're going to use RT, so RegDst is going to be zero. And ALU source: what's the ALU source? Which one will you choose? You have to choose the immediate value, right?
So you have to have a one there. Then there's no memory write, because it's a load word: you're not going to write into memory. And you're going to load from the memory into the register file, so you're not going to write back an output of the ALU into the register file. You're going to write the data coming out of the memory into the register file, so you have a one there. And the ALU op is a simple add, because of the immediate plus base address. Is it clear? Simple enough. Store word is very similar. You're not going to do a write operation into the register file. Your source is still coming from the immediate value, and you're going to do a memory write operation. So you have all these three values set. And these will enable you to implement almost all the I-type instructions. This forms the control unit part of the whole architecture. Of course, we are almost there: we have all the R-type instructions, and we can read from and write to memory now with these control signals. But conditional branches are still missing, right? So we need some way to change the program counter not just by four but to an arbitrary address. Got it? So let's see how we could implement branch on equal, which is a conditional branch. The opcode for branch on equal is 000100. What branch on equal does, if you remember, is compare two register values, and if they are the same, you form a branch target address with which the PC has to be updated, right?
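The control settings walked through above for R-type, load word and store word can be collected into a small lookup table. This Python sketch is an illustration only: the key names are assumptions (in particular `mem_to_reg`, for the memory-or-ALU write-data select, is a name borrowed from common textbook usage and not from the lecture), and `None` marks a value that does not matter for that instruction.

```python
# Opcodes are the standard MIPS encodings: R-type 000000, lw 100011, sw 101011.
OPCODE_NAMES = {0b000000: "r_type", 0b100011: "lw", 0b101011: "sw"}

CONTROL_TABLE = {
    # R-type: write RD with the ALU result; the funct field picks the operation.
    "r_type": {"reg_write": 1, "reg_dst": 1, "alu_src": 0,
               "mem_write": 0, "mem_to_reg": 0, "alu_op": "funct"},
    # Load word: write RT with memory data; the ALU adds base and offset.
    "lw": {"reg_write": 1, "reg_dst": 0, "alu_src": 1,
           "mem_write": 0, "mem_to_reg": 1, "alu_op": "add"},
    # Store word: no register write; memory write enabled; the ALU adds base and offset.
    "sw": {"reg_write": 0, "reg_dst": None, "alu_src": 1,
           "mem_write": 1, "mem_to_reg": None, "alu_op": "add"},
}


def control_signals(opcode):
    """Return the control settings for a 6-bit opcode covered so far."""
    return CONTROL_TABLE[OPCODE_NAMES[opcode]]
```

A table like this is one way to think about the control unit: the opcode goes in, and the select and enable signals for the datapath come out.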
And for the branch target address you need an immediate value, which is added to the address of the next instruction. You will understand it better when I go further. Now, in order to do the comparison we need the ALU, because if you want to check whether RS equals RT you need to subtract RT from RS, and then we need a zero flag to say whether RS is equal to RT. So we need to introduce all these changes into the architecture in order to implement branch on equal. And we also need a second adder. Why? Because what is branch on equal? Branch on equal has two registers and a destination address, which is an immediate value. So you need to be able to add the immediate value to the program counter. And there is one catch here: the branch on equal address field is only 16 bits, which... no, I think I'm confusing you guys. So let's go back now to this PC plus four. Let's go back to the program counter, which we haven't touched at all. The only thing the PC can do so far is increment by four on every clock cycle. But now the next PC can be either PC plus four or the new branch target address. So you need some way to multiplex the PC addresses as well, for which we will add a new control signal called Branch, which indicates whether you are going to have some kind of branching or not. For example, when you want to implement branch on equal, you have an opcode. Are we going to do any kind of writing into the register file? No, so RegWrite is zero. We don't care about the destination register. Is the ALU source an immediate value? No, it's the register value here. Then Branch is enabled, you don't have any memory write, and you perform a subtraction, because you have to be able to compare RS with RT,
which means you have to do RS minus RT. So you do a subtraction, which will raise the zero flag, which you can read and then take the appropriate branch. So we're almost there. With the Branch control signal we now also have conditional branches that you can implement. But we still do not have absolute jumps, which are unconditional. The machine code looks something like this: you have the jump opcode and then a 26-bit address. So it's an extremely simple machine code. What I indicate here is the jump target address. You remember that 400000 in hex is the reset PC value, and assume that, for example, you have a jump target address of this particular value. But what you can have here is only a 26-bit value. So you split this 32-bit jump target address and pick only 26 bits: you don't care about the four most significant bits, and you don't care about the two least significant bits. These 26 bits are what go into the jump address field when you compile, so you will have this value in the instruction's machine code. So your machine code will not have this full address, but this particular value. Why? Because it makes things much simpler. You will not need any ALU to compute the effective address for a jump instruction, because you have only these 26 bits and you can simply use concatenation; you don't need any arithmetic. The first four bits come straight from the program counter, the next 26 bits come from the jump address field itself, and the last two bits are zero. So you can directly assemble the jump target address without doing any operation. So you save time there as well.
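The splitting and reassembling of the jump target address described above can be sketched in Python with shifts and masks only. The function names are hypothetical, and taking the upper four bits from PC plus four (rather than from the PC of the jump itself) is the standard MIPS convention assumed here.

```python
def jump_field(target_address):
    """Drop the four most significant and the two least significant bits (keeps 26 bits)."""
    return (target_address >> 2) & 0x03FFFFFF


def jump_target(pc_plus_4, address_field):
    """Concatenate: upper four PC bits, the 26-bit field, then two zero bits."""
    upper_four = pc_plus_4 & 0xF0000000
    middle = (address_field & 0x03FFFFFF) << 2
    return upper_four | middle
```

For a target of `0x004000A0`, `jump_field` yields `0x0100028`, and `jump_target(0x00400004, 0x0100028)` rebuilds `0x004000A0`, with no adder involved.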
And you also don't need any memory, register file, nothing. The only thing we have to do is make the following change to the program counter, because now it not only has to be able to increment by four, it also has to decide whether the next value is just the increment by four or a new jump target address, for which we add another control signal, which we call Jump. Whenever you have the jump instruction, you pretty much don't care about anything except the Jump value. And whenever this Jump value is on, you switch the program counter accordingly: either you take the address from the jump target address or you simply increment the program counter. That's it. So now you can do pretty much all the operations that are required on MIPS. We can implement R-, I- and J-type instructions, and we can do branching, so in general we should be able to run most programs. Of course, we didn't talk about quite a lot of things, like multiplication and shift operations, but you know how you can implement shift operations: you simply add this functionality to the ALU. And of the memory instructions we actually implemented only load word and store word. We haven't yet looked into how to implement jump and link and jump register, but these you can learn yourself; you can follow the same methodology and try to implement them. But for now, this is enough to understand the entire flow.
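Putting the program counter choices together, a minimal Python sketch of the next-PC selection might look like this. The function and parameter names are assumptions, and the shift of the branch offset by two bits is standard MIPS behaviour that the lecture only hints at as "one catch".

```python
def sign_extend16(value):
    """Interpret the low 16 bits of value as a signed two's-complement number."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def next_pc(pc, branch, zero, jump, imm16, address_field):
    """Choose between PC + 4, the branch target and the jump target."""
    pc_plus_4 = (pc + 4) & 0xFFFFFFFF
    # Second adder: next instruction address plus the word offset times four.
    branch_target = (pc_plus_4 + (sign_extend16(imm16) << 2)) & 0xFFFFFFFF
    # Pure concatenation, no adder needed.
    jump_target = (pc_plus_4 & 0xF0000000) | ((address_field & 0x03FFFFFF) << 2)
    if jump:
        return jump_target
    if branch and zero:  # branch on equal is taken only when the zero flag is set
        return branch_target
    return pc_plus_4
```

The `branch and zero` test is where the zero flag from the RS minus RT subtraction decides whether the branch target or PC plus four is loaded.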
I think this is pretty much what you need to implement most programs, and we are pretty much done here. Tomorrow I will use a set of slides that will walk you through these two hours of lecture in a much more graphical way. We'll use an example, put all these different blocks that I talked about individually onto one slide, and walk you through every step that we talked about, right from the instruction fetch, followed by decode, when the program counter is incremented by four, and when it is not incremented by four and a jump target is calculated instead. So you will see it graphically tomorrow, because I don't want to do it now and repeat it tomorrow. We will see how all these things come together, and tomorrow we will finish the single-cycle architecture. Thank you very much.