This is the second lecture on this topic, so let us pick up the thread where I left off in my last lecture. I concluded that lecture with the design summary, where I explained that a single-cycle design has a performance bottleneck. You can go for a variable-time clock design, but that makes the design very complicated. That is why we shall start with a simple design using a single clock cycle, and later on I shall discuss how you can make it a multi-cycle design and incorporate more complicated things like pipelining, instruction-level parallelism and various other techniques.
In your design you will require two distinct types of components. One is known as the data path; the other is known as the controller, sometimes called the control path. The data path specifies the different functional elements required for performing computation, and the controller controls those functional elements. So your processor requires a data path and a controller.
Let us see what data path is required for our MIPS processor. This is the basic abstract view of the data path. It shows the most common functions; it is not the complete data path. As we progress in this lecture we shall see the other data path components that are required. You start from the left side, with the content of the program counter. The processor has a special-purpose register known as the program counter, which always holds the address of the next instruction. Whenever you turn the power on, a proper starting value has to be loaded into the program counter.
Subsequently, whenever you do context switching or a subroutine call, you will see that the program counter has to be loaded with proper values. However, whenever you are executing instructions sequentially, the program counter has to be incremented by 4, because the instruction length is 4 bytes. So when instructions are executed one after the other, all that needs to be done is to increment the program counter by 4, and the program counter will then provide the address of the next instruction.
Here is the instruction memory, from where the instruction is fetched. As you have seen, the instruction is 32 bits wide, and we have discussed the different instruction formats. If the instruction performs an arithmetic or logical operation, the source register addresses are taken directly from the fetched instruction. This is the register bank, containing 32 registers of 32 bits each, and you apply the addresses of the source registers to it. The operands then become available on the two inputs of the arithmetic and logic unit. The ALU performs the required operation; the operation to be performed is also specified by the instruction. Later on, when I discuss the control unit, you will see that the instruction is decoded and the controller tells the ALU which operation to perform. The ALU performs the operation, and the result has to be stored in a register: for an R-type instruction the result goes back to the register bank, and you can see the data will be available at the write data input.
The destination address is again provided by the instruction, and the result is stored in that register. So for data manipulation instructions, that is arithmetic and logical operations, you require this register bank both to supply the operands and to store the result. You can see it is a three-port register file. Three-port means you can apply three register addresses at once: two are used to read out the values of two registers, and the third, the destination register address, is used to store the result in a particular register.
In the case of load and store instructions, however, you require the data memory. A load means you are loading a value from memory into a register. In that case the ALU computes the effective address, as you have seen. That address is used to get the data from the memory, the data comes back to the register bank, and the instruction specifies in which register the data will be stored. Similarly, for a store operation the address is again provided by the ALU, the data is available from one of the registers, and that data is stored in the memory location whose address is provided.
So, in a nutshell, this is how different types of instructions are executed with the help of this data path. This is the data path you require for performing various arithmetic and logical operations, loads, stores and so on.
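
As a minimal sketch of the three-port register bank just described, the following Python model reads two registers and optionally writes a third. The class name `RegisterFile`, its method names, and the choice to keep register 0 fixed at zero (the usual MIPS convention, not stated in this lecture) are assumptions made only for illustration.

```python
class RegisterFile:
    """Hypothetical model: 32 registers of 32 bits, two read ports, one write port."""

    def __init__(self):
        self.regs = [0] * 32

    def read(self, read_reg1, read_reg2):
        # Two read ports: both source operands come out together.
        return self.regs[read_reg1], self.regs[read_reg2]

    def write(self, write_reg, write_data, reg_write):
        # One write port, enabled by the controller's register write signal.
        # Assumption: register 0 stays 0, as is conventional in MIPS.
        if reg_write and write_reg != 0:
            self.regs[write_reg] = write_data & 0xFFFFFFFF


# An R-type style use: read two operands, write the sum to a third register.
rf = RegisterFile()
rf.write(9, 5, reg_write=True)
rf.write(10, 7, reg_write=True)
a, b = rf.read(9, 10)
rf.write(8, a + b, reg_write=True)
```

With the register write signal held at 0, the same `write` call leaves the register bank unchanged, which is how a store or a branch passes through without modifying any register.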
However, you have to extend it, so let us see how the different operations are performed. As I have already told you, you require another adder. Why? Because the next address has to be generated by adding 4 to the present value of the program counter. The program counter value is applied to one input of this adder, 4 is applied to the other input, and the output, PC plus 4, is applied back to the program counter. This is for instruction fetching: after each instruction is fetched, the new value PC plus 4 is loaded into the program counter from the output of the adder, and it provides the address of the next instruction. So this handles simple instruction fetching from sequential memory locations. Obviously it does not perform a branch or jump; for that you have to add some more data path, which I shall discuss.
Now the basic data path for R-type instructions. As I was telling you, the register addresses of the operands are available from the instruction. The data is read from the registers and applied to the arithmetic and logic unit, which generates the result. That result is applied to the write data input, and the write register address is again provided by the instruction. We have seen that an R-type instruction has three register addresses: two for the operands and the third for the destination of the result. All three are applied here. While this happens, you can see that some signals must be generated by the controller: the register write signal to the register file and the ALU operation signal to the ALU. Here you require three bits, assuming that the ALU can perform eight different operations such as addition, subtraction, multiplication and so on.
These signals will be generated by the controller, so later on we have to add the controller to make the design of the processor complete. What I have shown so far is only the data path.
Then, for load and store instructions, as I have already told you, the data memory is involved. In both cases the effective address is generated by reading the content of one of the registers and adding to it the 16-bit value that comes as part of the instruction, after sign extension. I have already explained what sign extension means. That address is applied to the data memory. In the case of a load, the data memory provides the data at its output, and that data is written into the register.
So this is load word: the value is loaded into register $t1, register $t2 provides the base address, and the offset comes from the instruction. The sign-extended offset is added to the content of $t2, and the value at that address is loaded into register $t1. Similarly, this is store word: the content of $t1, which is read from the register file, is written into the memory, and the effective address is again generated by adding the sign-extended offset to the content of $t2. Do not get confused by $t1 and $t2; these are simply registers taken from the same set of 32 registers of 32 bits each.
You can see the control signals to be generated by the controller for load and store instructions. Number one is register write, which is required whenever you perform a load instruction. Then there is the ALU operation, which will be addition, and memory write.
Memory write is required for store, and memory read is required for load. Both are shown together because the same data path serves both instructions; only one of the two signals is generated at a time. Similarly, register write takes place whenever you perform a load, because the data read from memory is loaded into the register; for a store, the register is only read, and its content is written into the memory. Here the offset value is a 16-bit signed immediate field, which must be sign-extended to perform the addition. I have already explained the need for sign extension, and you can see here how it is used to generate the 32-bit effective address for the memory.
Now we come to another type of instruction, branch if equal. Whenever you do branch if equal, two things are required. Number one, you must know whether the two register values are the same or not. If they are equal, the branch takes place to a particular location, and that address has to be generated. On the other hand, if they are not equal, the address is already known: it is PC plus 4, which is generated with the help of an adder. You can see that another adder is used here to generate the branch address. The offset, which comes as part of the instruction, is sign-extended and added to PC plus 4, because the branch address has to be the next address, PC plus 4, plus the offset. This branch address is loaded into the program counter if the branch is taken, that is, if the two registers are equal. And who decides whether the branch will be taken or not?
The ALU performs a subtraction of the contents of the two registers. The contents of the two registers are applied to the ALU; if they are equal the result is 0, and if they are not equal the result is not 0. This zero indication is applied to the branch control logic, that is, to the controller, which accordingly generates the control signals depending on whether the branch is taken or not taken. You can see the control signals shown in this figure: register write (which is not asserted for a branch, since no register is written) and the ALU operation, which in this particular case is subtraction. The zero output of the ALU goes to the branch control logic. So this is the additional data path for the branch if equal instruction. Branch if not equal can also use the same data path.
The instruction is beq $t1, $t2, offset. The offset is a signed 16-bit immediate field and thus must be sign-extended. In addition, it is shifted left by 2, which makes the two low-order bits 00, to address a word boundary. We have seen that a word address should be a multiple of 4; a multiple of 4 means the two least significant bits are 00, and that is what is being done here.
Then let us come to the complete data path, which performs all these different things. We see that we require two separate memories: one for reading instructions, known as the instruction memory, and the data memory. You require the register file in any case. Then, in addition to the arithmetic and logic unit, which performs the different computations, you require two separate adders. One computes PC plus 4; whenever the branch is unsuccessful, this value is loaded into the program counter.
On the other hand, when the branch is successful, the address is generated by the other adder. That adder generates the branch address, the two addresses are multiplexed, and the program counter is loaded with the branch address. So you see that, in addition to the ALU, you require two more adders for calculating the two possible next addresses: one for an unsuccessful branch and the other for a successful branch. Whether the branch is successful or unsuccessful is decided by the controller, which generates a select signal for this multiplexer, and accordingly either PC plus 4 or the branch address is loaded.
You can see the other control signals that are required. You require another multiplexer on the second input of the ALU, because that input comes either from a register or from the sign-extended value, which is used to generate the memory address of the operand. ALU source is the signal generated by the controller that selects either the register value or the sign-extended value to be applied to the ALU in different situations. Then there are the memory read and memory write signals, for load and store respectively. And for memory to register you again require a multiplexer, because the data to be written into the register can come from two different sources: from the memory in the case of a load, or from the ALU in the case of an R-type instruction. That is why you require a multiplexer here.
So you can see that, in addition to the two extra adders, several multiplexers have been added to the data path. Multiplexers are also data path components.
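
The next-address selection for branch if equal can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `sign_extend16` and `next_pc_beq` are hypothetical, and the instruction is assumed to be a beq so that the select signal depends only on the comparison.

```python
MASK32 = 0xFFFFFFFF

def sign_extend16(imm16):
    """Interpret a 16-bit field as a signed value."""
    imm16 &= 0xFFFF
    return imm16 - 0x10000 if imm16 & 0x8000 else imm16

def next_pc_beq(pc, reg_a, reg_b, offset16):
    """Return the next PC for a beq, mirroring the two adders and the multiplexer."""
    pc_plus_4 = (pc + 4) & MASK32                      # first adder
    # Second adder: sign-extended offset, shifted left by 2, added to PC + 4.
    branch_target = (pc_plus_4 + (sign_extend16(offset16) << 2)) & MASK32
    zero = ((reg_a - reg_b) & MASK32) == 0             # ALU subtracts and reports zero
    return branch_target if zero else pc_plus_4        # multiplexer in front of the PC
```

For example, with the PC at 0x1000 and an offset of 3, equal registers give 0x1010 and unequal registers give 0x1004.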
You can see one multiplexer here, one here and another here. So three multiplexers have been added as part of the data path to take care of the load, store and branch instructions. These are the various signals to be generated. For the R format, the ALU operation code is 0000 for AND, 0001 for OR, 0010 for add and 0110 for subtract. Here we have restricted ourselves to AND, OR, add and subtract. Since the code has four bits, you can have other ALU operations apart from these four, which are not shown for the sake of simplicity.
Then, for load word, the control signals to be generated are shown here. Register destination has to be 0, register write has to be 1 because you are loading a register, and ALU source has to be 1 so that the multiplexer selects the sign-extended offset. Memory read has to be 1, memory write has to be 0, and memory to register has to be 1.
Similarly, for store they are different. Register destination is immaterial, a don't-care. Register write has to be 0, ALU source has to be 1, memory read has to be 0 and memory write has to be 1. As you can see, memory read and memory write are complements of each other for load and store. Memory to register is again a don't-care in this case, and PC source has to be 0.
Then for branch if equal, you can see that register write, ALU source, memory read and memory write are all 0, register destination is a don't-care, and PC source will be either 0 or 1 depending on the comparison. In this case the ALU performs subtraction, because the branch-if-equal comparison is performed by subtracting one operand from the other and checking whether the result is 0 or not.
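
The control values listed above can be collected into a small lookup table. The following Python sketch is an assumption-laden illustration, not the lecture's circuit: the R-format row follows the standard textbook single-cycle design rather than anything read out in this lecture, `None` stands for a don't-care, and PC source is modelled as a `Branch` signal combined with the ALU zero flag.

```python
X = None  # don't-care

# Rows: instruction class; columns: control signals named as in the lecture.
MAIN_CONTROL = {
    # Assumption: R-format row taken from the standard single-cycle design.
    "R-format": {"RegDst": 1, "ALUSrc": 0, "MemtoReg": 0, "RegWrite": 1,
                 "MemRead": 0, "MemWrite": 0, "Branch": 0},
    "lw":       {"RegDst": 0, "ALUSrc": 1, "MemtoReg": 1, "RegWrite": 1,
                 "MemRead": 1, "MemWrite": 0, "Branch": 0},
    "sw":       {"RegDst": X, "ALUSrc": 1, "MemtoReg": X, "RegWrite": 0,
                 "MemRead": 0, "MemWrite": 1, "Branch": 0},
    "beq":      {"RegDst": X, "ALUSrc": 0, "MemtoReg": X, "RegWrite": 0,
                 "MemRead": 0, "MemWrite": 0, "Branch": 1},
}

def pc_source(instr, zero):
    """PC source is 1 only for a branch whose comparison produced zero."""
    return 1 if MAIN_CONTROL[instr]["Branch"] and zero else 0
```

Reading a row of this table gives exactly the setting of each multiplexer and write enable for that instruction class; `pc_source("beq", True)` selects the branch address, while any other instruction keeps PC plus 4.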
So these are the ALU operation codes to be generated for the R format. Next we add the control unit, which generates the write signal for each state element, the control signal for each multiplexer and the ALU control signal. The inputs to the control unit are the instruction opcode and the function code.
So far I have shown you the data path and the various control signals required for different types of instructions. Now we shall add the controller. On top of the data path you require a controller, and the two together implement the processor. The control unit is divided into two parts. The first is the main control unit. Its input is the 6-bit opcode, and its outputs are all the control signals for the multiplexers, register write, memory read, memory write and a 2-bit ALU opcode (ALUOp) signal. Depending on the opcode it generates these various signals, and I have already shown you what will be generated.
Then comes the ALU control unit. One input is the 2-bit ALUOp signal generated by the main control unit. The other input is the 6-bit function code of the instruction. So the 6-bit function code, along with the two signals coming from the main control unit, is applied to the ALU control unit, and accordingly it generates the 4-bit ALU control signal, which selects operations such as AND, OR, addition and subtraction. Let us see what the inputs and outputs are.
These are the inputs for the main control unit. As I have already told you, there are 6 bits coming from the opcode field, which are applied as input to the main control unit. These are the input patterns: for the R format all six bits are 0, and the patterns for load word, store word and branch if equal are shown here. The outputs generated by the main control unit are also shown: register destination, ALU source, memory to register, register write, memory read, memory write, branch, ALUOp1 and ALUOp0. The values to be generated for the different types of instructions, R format, load word, store word and beq, are shown in this table.
This can be realized by a simple combinational circuit like this one. The input is the 6 bits coming from the instruction opcode, and the circuit generates the various signals. I am not going into the design; you can use this table as a truth table and realize the circuit from it. You have many options. Here I have shown how the different signals can be generated with the help of gates. Another implementation technique is to use a PLA, a programmable logic array: the PLA receives the inputs and generates the control signals. Either way, with gates or with a PLA, you can realize the main control unit.
Now let us focus on the ALU control unit. For the ALU control unit we must describe hardware to compute the 4-bit ALU control input.
That means the ALU control unit receives input from two sources. As I have already shown you, it receives the 2-bit ALUOp from the main control unit, ALUOp1 and ALUOp0. These two bits are applied to the ALU control unit along with the 6-bit function code, and these inputs lead to the generation of the 4-bit ALU control signal. So the task is to compute the 4-bit ALU control input given the 2-bit ALUOp from the main control unit and the function code from the instruction, and we describe it using a truth table.
This is the truth table. For the different instruction opcodes, the ALUOp generated by the main control unit is shown here. It can be 00, 01 or 10: three values, for load and store, branch if equal, and R-type respectively. We have restricted ourselves to register-type instructions, load, store and branch if equal. Obviously this is not the complete instruction set; we have restricted ourselves to a subset to design the control unit. The instruction operations are shown here: load word, store word, branch if equal, and for the R type add, subtract, AND, OR and set on less than.
And this is the function field. You may recall that the R-type instruction has, in addition to the 6-bit opcode, three register fields, a shift amount field and a function field. The function field is used in this case to generate the ALU control signal.
The 4-bit ALU control signal that is generated is shown here: for add it is 0010, for subtract 0110, for AND 0000, for OR 0001, and for set on less than 0111. Load word and store word use add, and beq uses subtract. So only 5 different values are required for the ALU control, because the ALU performs either add, subtract, AND, OR or set on less than. Although the signal has 4 bits, giving 16 possibilities, only 5 are used. This is the ALUOp field and this is the function field; the same thing is represented here again.
Now I have put everything together, the data path and the control path. You have two controllers, the main control unit and the ALU control unit, and both are shown here along with the data path and the multiplexers I have already explained. The main control unit generates register destination; branch, which ultimately produces PC source; memory read, which goes to the data memory; memory to register; the 2-bit ALUOp, which goes to the ALU control unit; memory write, which goes to the data memory; and ALU source, which goes to the multiplexer that selects whether the second ALU input comes from a register or from the sign-extended immediate field.
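
The ALU control truth table described earlier can be sketched as a small Python function, together with a toy ALU that obeys the 4-bit code. The function names are hypothetical; the function-field encodings (100000 for add, 100010 for subtract, 100100 for AND, 100101 for OR, 101010 for set on less than) are the standard MIPS values and are an assumption here, since the lecture does not read them out, as is the 0111 code for set on less than.

```python
MASK32 = 0xFFFFFFFF

# Assumption: standard MIPS function-field encodings, not quoted in the lecture.
FUNCT_TO_CONTROL = {
    0b100000: 0b0010,  # add
    0b100010: 0b0110,  # subtract
    0b100100: 0b0000,  # AND
    0b100101: 0b0001,  # OR
    0b101010: 0b0111,  # set on less than
}

def alu_control(alu_op, funct=0):
    """Map the 2-bit ALUOp plus the 6-bit function field to the 4-bit ALU control."""
    if alu_op == 0b00:          # load word / store word: address addition
        return 0b0010
    if alu_op == 0b01:          # beq: subtraction for the comparison
        return 0b0110
    return FUNCT_TO_CONTROL[funct]  # R type: decided by the function field

def _signed(value):
    return value - (1 << 32) if value & 0x80000000 else value

def alu(control, a, b):
    """Toy 32-bit ALU; returns (result, zero_flag)."""
    if control == 0b0000:
        result = a & b
    elif control == 0b0001:
        result = a | b
    elif control == 0b0010:
        result = (a + b) & MASK32
    elif control == 0b0110:
        result = (a - b) & MASK32
    elif control == 0b0111:
        result = 1 if _signed(a) < _signed(b) else 0
    else:
        raise ValueError("unused ALU control code")
    return result, result == 0
```

For a beq, `alu(alu_control(0b01), x, y)` returns a zero flag of `True` exactly when `x` equals `y`, which is the signal the branch logic needs.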
The ALU control unit takes the 2 bits from the main control unit and also bits 0 to 5 of the instruction, the function field. The 4-bit control signal it generates is applied to the ALU. In the case of beq, one additional signal is required from the ALU: zero, the flag bit generated by the ALU. Otherwise the ALU performs add, subtract, AND or OR, and the result is written into the register. For load and store, however, the data memory is involved. So this is the complete picture: it implements the R-type instructions, load and store with the two separate memories, and the beq instruction, and it shows the complete data path and the control signals required to implement the subset of instructions I have discussed.
Now, the addition of the unconditional jump. Earlier we had only added beq. The unconditional jump is also an important instruction that has to be added, and how the data path and control unit are affected is shown here. We add one more opcode to our single-cycle design: opcode 2, the J type. In this format the opcode field, bits 26 to 31, is 2, and the remaining 26 low-order bits are the immediate target address. You can see that the pseudo-direct addressing mode is used here: the instruction consists of the 6-bit opcode field and a 26-bit field that provides the immediate target. This format is used only for the unconditional jump, and the full 32-bit target address is computed by concatenating the upper 4 bits of PC plus 4 with the 26-bit target and two low-order zero bits.
Here is how the effective address is generated. The program counter is 32 bits, bits 0 to 31, and the instruction format is a 6-bit opcode followed by the 26-bit immediate target value. You take these 26 bits from the instruction and the upper 4 bits of PC plus 4, and the two are concatenated, not added, to generate the effective address for the unconditional jump. That address is applied to the instruction memory to fetch the next instruction. Now 4 bits plus 26 bits is only 30 bits, so what about the remaining 2 bits? They have to be 00. That means the two low-order bits are 00, the next 26 bits come from the instruction and the upper 4 bits come from PC plus 4, and that is the effective address in this case. An additional control line has to be generated by the main controller to select the new address, and a 2-bit shifter is added to obtain the two low-order zeros; shifting the 26-bit field left by 2 inserts the two 0s.
Now let us look at the final design, including the jump instruction. What new things have been added? Number one, you can see another multiplexer. We already had one multiplexer feeding the program counter; now there is another, which provides the jump address. The jump address, bits 0 to 31, is generated by concatenation, multiplexed and loaded into the program counter. That is the main change; the rest we have already included.
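
The concatenation that forms the jump address can be written out as a short Python sketch. The function name `jump_target` is hypothetical and is used only to illustrate the bit manipulation described above.

```python
def jump_target(pc, target26):
    """Form the 32-bit jump address by concatenation, not addition."""
    pc_plus_4 = (pc + 4) & 0xFFFFFFFF
    upper4 = pc_plus_4 & 0xF0000000             # upper 4 bits of PC + 4
    shifted = (target26 & 0x03FFFFFF) << 2      # 26-bit field, two zero bits appended
    return upper4 | shifted
```

For instance, with the PC at 0x40000000 and a 26-bit field of 0x100, the jump address is 0x40000400: the upper 4 bits are kept from PC plus 4 and the field is shifted left by 2.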
Those things do not change. You require the instruction memory, the data memory and the main ALU, you require two more adders for generating the addresses, and you require a number of multiplexers: one here, one here, one here and so on. The only new component is this additional multiplexer, which is required for the jump instruction. So this is the final design, showing the data path and the two controllers required for realizing the processor.
Let me summarize what I have discussed. Over two lectures I have discussed the MIPS instruction set architecture, and of course only a subset of it. MIPS is a RISC processor with a single instruction size of 32 bits and three different formats: R type, I type and J type. The R type is used for conventional ALU operations such as addition, subtraction, AND and OR. The I type is used for instructions with an immediate field, such as load, store and branch, and the J type is used for jump. Accordingly, we have seen how the effective address is calculated in different situations. For load and store, the 16-bit offset provided as part of the instruction is sign-extended and added to a register; for beq, the sign-extended offset is shifted left by 2 and added to PC plus 4. For the jump instruction, we have seen that the address is formed by concatenating different fields: 4 bits come from the PC, 26 bits come from the instruction, and shifting by 2 bits gives the two low-order 0 bits. That is how the effective address is generated for the jump instruction.
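
To make the three formats summarized above concrete, here is a minimal Python sketch that pulls every field out of a 32-bit instruction word. The function name `decode` and the dictionary keys are assumptions for illustration; the field positions are the standard MIPS ones (opcode in bits 26 to 31, function field in bits 0 to 5), and the sample word 0x012A4020 is the standard encoding of add $t0, $t1, $t2.

```python
def decode(instr):
    """Extract all R-, I- and J-format fields; the opcode decides which ones matter."""
    return {
        "opcode":   (instr >> 26) & 0x3F,   # bits 26-31, all formats
        "rs":       (instr >> 21) & 0x1F,   # first source register
        "rt":       (instr >> 16) & 0x1F,   # second source / I-type destination
        "rd":       (instr >> 11) & 0x1F,   # R-type destination
        "shamt":    (instr >> 6) & 0x1F,    # shift amount
        "funct":    instr & 0x3F,           # bits 0-5, R type
        "imm16":    instr & 0xFFFF,         # I-type immediate / offset
        "target26": instr & 0x03FFFFFF,     # J-type target
    }

fields = decode(0x012A4020)  # add $t0, $t1, $t2
```

Here `fields` has opcode 0, rs 9, rt 10, rd 8 and function field 0b100000, so the main control unit sees an R-format opcode and the ALU control unit sees the function code for add.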
To implement this MIPS ISA, we have seen what the processor requires. It requires two different memories, one for instructions and the second for data. It requires one ALU and two adders. With these you can implement all the data paths, and of course you require a number of multiplexers; you can count them here: 1, 2, 3, 4, 5, so five multiplexers. And in addition to the 32 registers of 32 bits each, you require another special-purpose register, the program counter. That is the data path. We have implemented the control using two controllers, the main controller and the ALU controller, which together control the entire data path.
Now, before I conclude, let me discuss the multi-cycle implementation. In the single-cycle design, one cycle was used to perform four different things: instruction fetch, instruction decode, execution of the instruction and write back. All four were performed in one go using a single clock cycle. Instead of that, you can make it multi-cycle and use four clock cycles: in one cycle you perform instruction fetch, in another instruction decode, in another execution and in another write back. You can see this is the single-cycle timing and this is the multi-cycle timing. Whenever you go from single cycle to multi-cycle, obviously the clock frequency will be higher.
In this particular case it will be 4 times that of the single-cycle design, because we have assumed that instruction fetch, instruction decode, instruction execution and write back all require the same time. In reality they will not be the same, and the clock frequency will be decided by the slowest of the four operations. Here, for simplicity, we have assumed that all of them require the same time.
Now, what is the advantage of going multi-cycle? We have seen that the single-cycle design requires a lot of hardware: two different memories, one ALU, two adders and many multiplexers. When you go for multi-cycle, can this hardware be reduced? The answer is yes. Number one, you do not require two separate memories, because when you are performing instruction fetch you are not accessing data in the same cycle. So you can have a single memory. In fact, in the early years processors were designed with a single memory, which is known as the Von Neumann or Princeton architecture; only subsequently, to facilitate pipelining and other things, were two separate memories required. So if you go for multi-cycle, of course without pipelining, you can have a single memory.
The second thing is that you can reduce some of the other functional units. When you are doing instruction fetch you use this adder, and when you are performing execution you use the ALU. So the same ALU can possibly do the PC plus 4 addition if you go for multi-cycle. Similarly, the branch address calculation done by the other adder can also be performed by the ALU. That means some of the functional units can be eliminated whenever you go for a multi-cycle implementation.
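
Returning to the clock-rate comparison made at the start of this discussion, the arithmetic can be sketched as follows. The stage times are purely hypothetical numbers chosen for illustration, and the function name `clock_periods` is an assumption; the only rule taken from the lecture is that the single-cycle period must cover all four steps while the multi-cycle period is set by the slowest step.

```python
def clock_periods(stage_times):
    """Return (single_cycle_period, multi_cycle_period) for the given step times."""
    single_cycle = sum(stage_times)   # one long cycle covers every step
    multi_cycle = max(stage_times)    # each short cycle must fit the slowest step
    return single_cycle, multi_cycle

# Hypothetical times in nanoseconds: fetch, decode, execute, write back.
equal_steps = clock_periods([2, 2, 2, 2])     # clock can run 8 / 2 = 4 times faster
unequal_steps = clock_periods([2, 1, 2, 1])   # clock can run only 6 / 2 = 3 times faster
```

With equal step times the multi-cycle clock is 4 times faster, as assumed in the lecture; with unequal steps the gain in clock rate is smaller because the slowest step sets the period.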
So the number of adders is reduced. The question is, what is the disadvantage? In this world nothing is one-sided; where there is an advantage there is also some disadvantage. The disadvantage of going from single cycle to multi-cycle is that the controller becomes very complex. Not only that, you also have to add some additional components, such as multiplexers, to the data path. So the design of the data path and the controller becomes more complex whenever you go for a multi-cycle implementation.
With this let me conclude this lecture. I have discussed in detail the single-cycle implementation of the MIPS processor and given an overview of the multi-cycle implementation. In my next lecture I shall discuss how you can implement pipelining. Thank you.