Welcome to this lecture on advanced digital system design in the course Digital System Design with PLDs and FPGAs. In the last few lectures we looked at the structure of sequential circuits and how to design them. In the last lecture in particular we looked at timing analysis: the maximum frequency of operation, hold time violations, and how the setup and hold times change when there is skew in the data path or the clock path. Before moving to today's part, we will quickly go through the previous slides for continuity. We were talking about the synchronous counter. Designing a synchronous counter means clocking all the flip-flops with the same clock and decoding the next state from the present state with combinational logic. We form a truth table, write the next state as a function of the present state, and that is the design; any counter can be designed this way. When the clock edge comes, the present state, that is the count, changes after a delay of tco. Since there are 3 outputs in this case, we take the worst case, the maximum delay, as the tco. I also mentioned that when you draw the picture in an abstract form there are details you should not forget. It is not just 3 paths from Q2, Q1 and Q0; each output can have a path to each of D2, D1 and D0. In this case D0 has only one path, D1 has 2 paths and D2 has 3 paths from the outputs, but in the general case with n bits there could be n squared paths. We also looked at an up/down control. Essentially we modify the logic: we take the control as an input, form the truth table and come out with the equations for the D inputs, and in that case the next state is a function of the present state and the inputs.
The next question we looked at was whether we can get rid of the flip-flops and make the circuit asynchronous. It is possible, but then there can be races at the outputs, which are difficult to control. An asynchronous circuit is very fast because there is no clock blocking and opening the path; it can move at the maximum speed. But it is difficult to control because of the unbalanced path delays. That is the essential thing we looked at in the last class. For a counter there are two timing issues to look at. One is the maximum frequency. We analyse from one clock edge to the next. After the first edge the flip-flop output changes after tco, the D input changes after tco plus tcomb, and we know the data has to be stable a setup time before the next clock edge. So the minimum clock period is tco plus tcomb plus tsetup, to which we add a margin called slack, and the maximum frequency is the inverse of the minimum clock period. One surprising thing is that the hold time does not figure in the expression for the clock period. But we have to make sure that the hold time is not violated, which means the data at the D input should remain there for some time after the clock edge. If you think about it, it is enough to ask how long it takes for the next state to change once the clock edge comes: it takes tco plus tcomb. So if the minimum tco plus tcomb is greater than the hold time, there is no violation. That is what is mentioned here: tco min plus tcomb min is greater than thold max.
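The two checks just described can be written down as a small calculation. This is a minimal sketch: the function names and the numbers in the example are assumptions made for illustration, and all times are taken to be in nanoseconds.

```python
def min_clock_period(tco_max, tcomb_max, tsetup, slack=0.0):
    """Minimum clock period of a register-to-register path (times in ns)."""
    return tco_max + tcomb_max + tsetup + slack


def max_frequency_mhz(period_ns):
    """Maximum clock frequency in MHz for a clock period given in ns."""
    return 1000.0 / period_ns


def hold_ok(tco_min, tcomb_min, thold_max):
    """True when the earliest change at the D input comes after the hold window."""
    return tco_min + tcomb_min > thold_max


# Hypothetical numbers, chosen only to illustrate the arithmetic.
period = min_clock_period(tco_max=3.0, tcomb_max=5.0, tsetup=2.0)
print(period)                                               # 10.0
print(max_frequency_mhz(period))                            # 100.0
print(hold_ok(tco_min=1.0, tcomb_min=0.5, thold_max=1.0))   # True
```

Note that the period check uses the maximum delays while the hold check uses the minimum delays, and that the clock period does not appear in the hold check at all.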
For the clock period we have to consider the maximum delays, but for the hold time it is the minimum delays. Suppose you have fixed the clock period in a design and the worst-case delay path violates it; then you have to reduce the clock frequency, that is, increase the clock period. But if there is a hold time violation, where tco min plus tcomb min is less than the hold time, changing the clock does not help. In the earlier analysis we were considering the timing from one clock edge to the next, but a hold time violation is about the same clock edge. So the only way to solve a hold time violation is to increase the combinational delay, for example by adding dummy logic. A hold time violation can happen when the combinational delay is minimal or when there is none at all, as in a shift register. Normally flip-flops have their tco greater than the hold time, so normally there is no question of a violation. But here we are assuming that the clock edges reach all the flip-flops at the same time, which is not a realistic assumption. There could be skew between the clock arrival times at the various flip-flops, and that skew can create a hold time violation. We will analyse these two timing issues in the presence of skew a little later, probably when we take up the FPGA lectures; for now we stay with the simple analysis. As I said in the introductory lectures, the basic timing parameter of a combinational circuit is the propagation delay, and for a flip-flop it is the setup time, the hold time and the tco.
Now, when it comes to a sequential circuit, with tsetup, thold, tco and tcomb we go one level up: we come out with an expression for the minimum clock period and a condition for avoiding a hold time violation. These two are the important timing details of a sequential circuit, or of any register-to-register data path, so they have to be understood. The next thing we looked at was that this is applicable to any register-to-register path. After all, the picture shows data moving from one register through a combinational circuit to another register. That was with respect to a sequential circuit, but in what we call a data path, where there is computation, the structure is the same: registers or flip-flops, a combinational circuit, and registers again. The same analysis holds good here. There is a minimum clock period, which consists of tco max, tcomb max and tsetup max, and the hold time condition at the destination register is that tco min plus tcomb min should be greater than the hold time. So that is generalized for a data path, a register-to-register path or a sequential circuit. We also looked at the setup and hold times with skew. When there is a delay in the data path and we refer the timing to the external point, the setup time increases, because you have to set up the data that much earlier at this point. With respect to D dash here the setup time was 2 nanoseconds; with the 2 nanosecond delay it has increased to 4 nanoseconds. But the hold time has become negative. At the flip-flop input the hold time spec was 1 nanosecond, which means the data must remain there for 1 nanosecond after the clock edge. Since the delay holds the value for 2 nanoseconds, in principle you can remove the data at the external point 1 nanosecond before the clock edge and the timing at the flip-flop input will still be correct.
As I said, the setup time is measured as the time before the clock edge, to this side, and the hold time is measured as the time after the clock edge, to that side. So when the hold time falls to the left of the clock edge it becomes negative. A negative hold time means that at the point of reference the data can be removed before the clock edge; the delay will hold it, and at the input of the flip-flop the timing will be correct. So you do not need to worry about a negative hold time. The opposite is the case when there is a delay, or skew, in the clock path. Then we have a clock dash at this point which comes earlier than the clock at the flip-flop. The setup time was 2 nanoseconds and the hold time was 1 nanosecond with respect to the original clock at the flip-flop. Since clock dash comes earlier, we can set up the data much later with respect to it, because the clock at the flip-flop arrives only 3 nanoseconds afterwards. So the setup time becomes 2 minus 3, which is minus 1 nanosecond. It has become negative, which means we can set up the data at this point after the clock dash edge, and by the time the clock reaches the flip-flop the timing will be correct. But the hold time has increased to 1 plus 3, that is 4 nanoseconds. So here the hold time increases and the setup time decreases; it can even go negative, and as I said that means you can set up the data after the clock edge.
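Both skew cases follow the same bookkeeping, sketched below. The function name and its parameters are assumptions made for illustration; the numbers are the ones used in the lecture, and the delays are treated as fixed values in nanoseconds.

```python
def effective_setup_hold(tsetup, thold, data_delay=0.0, clock_delay=0.0):
    """Setup and hold times seen at the external reference point (ns).

    data_delay  : delay in the data path ahead of the flip-flop's D input
    clock_delay : delay in the clock path ahead of the flip-flop's clock input
    """
    setup = tsetup + data_delay - clock_delay
    hold = thold - data_delay + clock_delay
    return setup, hold


# Flip-flop spec from the lecture: setup 2 ns, hold 1 ns.
print(effective_setup_hold(2, 1, data_delay=2))   # (4, -1)
print(effective_setup_hold(2, 1, clock_delay=3))  # (-1, 4)
```

A delay in the data path moves both windows earlier with respect to the clock, and a delay in the clock path moves them later, which is why one number grows while the other shrinks and can go negative.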
This was the last part we discussed. Now we are coming to designing circuits and systems. Take a simple example like a 60-second timer. We have seen the design: you take a clock oscillator and divide it down to get a 1 hertz pulse, then you have a BCD counter counting from 0 to 9, which feeds a mod-6 counter, and then BCD to 7-segment decoders. The count goes from 0 to 59, back to 0 and so on. We said it is a fairly simple circuit that we designed from one end to the other, but there are still issues. What should the clock frequency be? The higher the better, because with a higher frequency the drift will be less, and for the same drift, since the higher clock frequency gets divided down, the accuracy will be much better. But if you choose a higher clock frequency the divider will occupy a lot of area. We also said that instead of a BCD counter followed by a mod-6 counter we can have a single mod-60 counter. Here there are 4 flip-flops plus 3 flip-flops, whereas a mod-60 counter needs only 6 flip-flops. That is quite natural, because in a BCD counter the flip-flops are not used to the full extent: 4 flip-flops can count 16 states, but we use only 10. But there is no guarantee that a mod-60 counter with a mod-60 to 7-segment decoder will have less area than these put together. We also said that the output has to drive LEDs, so it has to supply high current. So even in a simple problem there is the question of area and speed. As for the divider, mind you, if the clock is 10 megahertz the flip-flops should work at 10 megahertz, which is a comparatively high frequency. So there are timing issues, area, speed, electrical issues like currents, and all kinds of things.
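The flip-flop counts mentioned above can be checked with a short calculation. This is only a sketch: the helper name is an assumption, and the divider is assumed to be a plain binary counter that divides 10 MHz down to 1 Hz.

```python
def flip_flops_needed(modulus):
    """Smallest number of flip-flops that can hold `modulus` distinct states."""
    if modulus < 1:
        raise ValueError("modulus must be at least 1")
    return max(1, (modulus - 1).bit_length())


bcd_then_mod6 = flip_flops_needed(10) + flip_flops_needed(6)  # 4 + 3
single_mod60 = flip_flops_needed(60)
divider_10mhz = flip_flops_needed(10_000_000)  # 10 MHz down to 1 Hz

print(bcd_then_mod6, single_mod60, divider_10mhz)  # 7 6 24
```

The count says nothing about the size of the next-state logic or the decoders, which is why the mod-60 version is not guaranteed to be smaller overall.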
So even in a simple circuit, if you put your mind to it, there are a lot of issues to handle, accuracy among them. But this is a flat design; given the spec you can design it from one end to the other. Now we are going to look at a reasonably complex design, so let us look at how to design an 8-bit microprocessor. I am not going into a complete design of the microprocessor. I am taking it as an example of a reasonably complex design case to illustrate how you go about a complex design problem: what the steps are and how you handle them. That is our focus. I am not guaranteeing that at the end we will have completely designed an 8-bit CPU, though by the end of the course you can do it. There is nothing specifically great about it; there are a lot of functional details and a lot of timing details, but it can be done. We are also looking at a very simple architecture, not a very efficient one. The main point is to illustrate the process of design in a lucid way, so I am avoiding things like high-performance design or single-clock design. So let us look at the spec of the microprocessor. An 8-bit microprocessor means it has an 8-bit ALU and an 8-bit data bus, D7 to D0. It has 4 registers of 8 bits each and 16 instructions. The purpose of choosing 4 registers is this: 16 instructions will occupy a 4-bit opcode, and in the instruction we have to specify the registers, so 2 bits for the source and 2 bits for the destination make it 8 bits. So most of the instructions can be 1 byte wide. But instructions like jump or call definitely have to specify an address, and those can be multiple bytes, say 8 bits for the opcode and 16 bits for the address.
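The 4 + 2 + 2 bit split can be sketched as a pack and unpack pair. The lecture does not fix the position of the fields inside the byte, so the layout used here (opcode in bits 7 to 4, source in bits 3 to 2, destination in bits 1 to 0) and the opcode value in the example are assumptions made for illustration.

```python
def encode(opcode, src, dst):
    """Pack a 4-bit opcode and two 2-bit register numbers into one byte."""
    if not (0 <= opcode < 16 and 0 <= src < 4 and 0 <= dst < 4):
        raise ValueError("field out of range")
    return (opcode << 4) | (src << 2) | dst


def decode(byte):
    """Split an instruction byte back into (opcode, src, dst)."""
    return (byte >> 4) & 0xF, (byte >> 2) & 0x3, byte & 0x3


# Hypothetical opcode 5 with source register 2 and destination register 1.
instruction = encode(0b0101, 2, 1)
print(hex(instruction))     # 0x59
print(decode(instruction))  # (5, 2, 1)
```

With 16 opcodes and 4 registers every field is used fully, which is exactly why a register-to-register instruction fits in a single byte.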
It has a 64 KB address space, A15 to A0. Since the program counter and the stack pointer are the ones that supply the address, they have to be 16 bits. There is no separate I/O space, which means the I/O devices are mapped into the memory space. That is easier; otherwise you need separate instructions that handle the I/O space. The main idea is that if you have enough address space, you can accommodate the memory and the I/O in a single space. The controller is hardwired, which means it is a finite state machine; it is not microcoded as in very early designs. It uses a demultiplexed address and data bus, so there is a separate address bus. There is one interrupt. That is the specification. The first thing is that from a mere look at it, it is almost clear that we cannot design this the way we designed the 60-second counter. You cannot just say let us put in an ALU, because an ALU has addition, subtraction, complement, logical operations, comparison and all kinds of things. So there is no way we can just start putting down blocks and interconnecting them as we did for the seconds counter. There are registers that need to be designed, and there is a program counter; if you have not designed one at least once, you do not even know how to design it.
So this is a problem which cannot be designed from one end to the other in a flat way. In such cases you have to break the huge design into pieces. We call it partitioning: we partition the design into functional blocks with interfaces. We might divide this into the ALU, the registers, the program counter and the stack pointer, and then say how they are interconnected. That is the basic idea of partitioning. And we go top down. We start from the CPU at the top as a single block, then we come down and break it into level 1 pieces like the ALU, the registers, the program counter and the stack pointer. Then we take the ALU and design it further; it is composed of an adder, a subtractor, a comparator, the logical operations and so on. As we come down to the bottom there are a lot more blocks; each block at level 1 is exploded into level 2 blocks and so on. That is called top-down design, and at each level we have to look at the functional specification, the timing specification, the electrical specification and so on. Top-down design is applicable to any design, whether it is software design, aircraft design or organizing a complex function. Say you have to organize a big program: you have committees looking after the overall arrangement, the speakers, the food, the mementos, the finance and a lot of other things. Any complex thing has to be broken down into pieces and handled. The only thing is that you must have experience in the game, otherwise you cannot do it. If you have not organized a big function before, it is very difficult to organize one the first time.
So you have to get into a team, get the experience and then do it. Similarly, an aircraft design is composed of various pieces, and only someone who has gone through it for a long time and gained enough experience can design one. It is the same with a digital system. When you want to design a complex thing like a processor, you should know about processors: what the blocks are, how to design them and how to integrate them. So you need very good domain knowledge, experience and expertise. But my point is that the only way to do it is a top-down approach, going from the top block all the way down to the small blocks. So let us look at the CPU at the top level. In our case this is the block diagram: clock, reset and interrupt are inputs, read, write and address are outputs, and data is a bidirectional port. Many people hesitate to draw such a block diagram at the top level. It may be simple, but you have to draw it to bring clarity. What are the functions at this level? There is a lot of functional spec you have to decide at this point: whether the CPU executes an instruction in a single clock cycle or in multiple clock cycles, whether it has to be pipelined, what the instruction format is, what types of instructions you support. All that comes under the functional spec. When it comes to the timing spec, you might think there is not much timing at this point, but there is definitely timing on the clock, reset and interrupt. The very important thing is that the memory and the peripherals are connected to this data bus, whether they are in a single chip along with the CPU or external; it does not matter.
And you know that there are bus cycles, which access the memory and the peripherals. A bus cycle may go like this: in the first clock cycle the address comes, in the second clock cycle read bar comes, and then the data comes on the data bus. So you need to specify how many clock cycles a bus cycle takes for an instruction that accesses memory or a peripheral, and the bus cycle could be asynchronous or synchronous. So there is a timing spec. When it comes to the electrical spec, we know that the address lines, the data lines and these control lines are connected to multiple peripherals and memories. So they should have good drive: they have to source and sink large currents, not in microamperes but of the order of milliamperes; only then can they drive multiple loads. All these are the specs at the top level, which we call level 0, and now we break this into level 1. I must say that this is something I have put together quickly for the purpose of instruction. I do not claim great accuracy, and there could be some timing issues, but the basic idea is sound; you can follow it and make a CPU. This is a very simple CPU that works in multiple cycles: an instruction takes multiple clock cycles to complete, and one instruction is fetched and executed, and only then is the next instruction fetched. So it is a multi-cycle CPU. Let us look at the partition. There is an instruction register and decoder, a controller, an ALU with temporary registers, 4 registers, a program counter and a stack pointer.
The program counter or the stack pointer, depending on the instruction or the situation, drives the address bus. There is a single internal data bus which connects all the data path elements, that is, all the registers and combinational circuits. It is basically the registers, because a combinational circuit cannot be connected directly to the data bus: there is a single bus and only one output can drive it at a time, so if you have a combinational circuit you should put tri-state buffers between it and the data bus. So there is one internal data bus. A modern CPU might have separate buses so that many things can happen in parallel; there may be one bus for the instruction to get into the instruction register, another bus between the registers and the ALU, and so on. Of course they have to be interconnected at some point, but for simplicity we assume a single data bus. Let us look at these 4 registers. They are 8-bit registers; each gets its data input from the data bus and also drives the data bus. Sometimes an ALU output has to come to a register, or part of an instruction gets loaded into a register; sometimes a register output has to go over the data bus to the ALU input or to the memory, depending on the instruction set. So the register inputs and outputs are both tied to the data bus. The inputs are not a problem, because all the inputs can be driven by the data bus. But only one of the outputs should drive the data bus, so there have to be 8 tri-state gates between each register output and the data bus, and the controller makes sure that only one of the registers, or one of the other sources, drives the data bus.
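The one-driver rule on the shared bus can be modelled in a few lines. This is a behavioural sketch, not a circuit: the function name, the driver names and the use of None for a floating bus are all assumptions made for illustration.

```python
def resolve_bus(drivers):
    """Return the value on a shared tri-state bus.

    drivers maps a source name to an (enable, value) pair. At most one
    enable may be active; None is returned when nobody drives the bus.
    """
    active = [(name, value) for name, (enable, value) in drivers.items() if enable]
    if len(active) > 1:
        names = ", ".join(name for name, _ in active)
        raise RuntimeError("bus contention between: " + names)
    return active[0][1] if active else None


# The controller enables only register A in this cycle.
drivers = {"reg_a": (True, 0x3C), "reg_b": (False, 0x55), "alu": (False, 0x00)}
print(hex(resolve_bus(drivers)))  # 0x3c
```

In hardware nothing raises an error when two drivers are enabled together; the outputs simply fight each other, which is why the controller has to guarantee that only one enable is active at a time.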
Another important point is that there are 3 control signals to each register. One is the clock, which goes to all the registers. Then there is a register A enable for register A: if the output of register A has to come onto the data bus, the controller drives that enable signal high and the data is enabled onto the data bus. Similarly, if there is data on the data bus that needs to be latched into the register, the controller gives the register A latch as a level signal, and when the positive clock edge comes the data gets latched into register A. That is the operation, and the same is true for the temporary registers: there is a latch signal and a clock signal, and when latch is 1 and the active clock edge comes, the data gets latched. This is a very convenient way of latching the data. You could think of another way in which the controller itself clocks the register: the register clock is initially 0, the controller makes it 1 and then 0 again, so the register gets a positive edge and the data gets latched. But there is an issue there: if you have to load a register every clock cycle, the controller has to toggle that signal 1, then 0, then 1 and so on. In our scheme it is enough to keep latch high for an integral number of clock periods; if it is high for 4 active clock edges, the data gets loaded 4 times in succession. So this scheme has that advantage. This block is the instruction register. We said that execution takes multiple clock cycles; since it is not a single-cycle scheme, this register holds the instruction while it is decoded and passed on to the controller.
Now look at the program counter. Its output is the one driving the address bus, and sometimes the stack pointer drives the address bus. You know that when an instruction like call comes, the content of the program counter is pushed onto the stack, and the stack is a part of the memory; at that time the memory address is given by the stack pointer. So the stack pointer drives the address bus, and the address value in the program counter is driven onto the data bus and goes to the memory location addressed by the stack pointer. You know that the program counter is incremented after every instruction. One issue here is that the program counter is 16 bits but the data bus is only 8 bits. So for a jump instruction, the opcode of the jump comes and gets latched here, then one byte of the address comes, then the second byte. When the first byte comes it has to be stored temporarily, and then the second byte is loaded into the program counter along with the stored byte. So the loading has to happen byte by byte. Similarly, when the program counter content has to be pushed onto the stack, the value has to be split into two parts and pushed byte by byte. So it involves a lot of multiplexers to select the various parts, and these are the control signals. All these control signals are given by the controller. The controller is the one which gives the latch signals to the various registers, the enable signals for the outputs, like register A enable and ALU enable, and the select inputs of the combinational blocks; in the case of the ALU, whether the operation is addition, subtraction, logical or comparison is specified by the controller.
So in digital system design we divide the design into two parts. One is the data path, where the data moves. Here the data moves into the registers, from the registers to the temporary registers, from there through the ALU back to the registers; sometimes data comes to the program counter or the stack pointer, and sometimes it moves from the program counter out to the memory. All these paths where the data moves and where the computation happens are called the data path. The other part is the controller, which controls the data path: it gives the latch signals and enables the various paths through the muxes. The controller does not do any computation. It just sequences all the operations correctly, because there are a lot of blocks and everything has to be done in order. That sequencing is done by the controller through the control signals; all the computation is done by the data path. That you have to remember. This scheme allows you to partition the design along functional boundaries without worrying about the sequence of operations. When you divide the whole design into blocks, you do not need to worry about what happens when the add instruction comes, what happens when the compare comes, or which way the data moves; all that can be controlled through the control signals. So this separation of controller and data path allows us to partition the design properly. That is the level 1 design. At this point we have to worry about the function of the various blocks, the timing of the signals between them, such as the timing of the control signals, and the electrical spec, such as the source currents and the voltages. All that needs to be addressed at this level.
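The split between the two parts can be illustrated with a toy model in which the controller only emits control words and the data path does all the data movement and arithmetic. Everything named here is an assumption made for illustration: the temporary registers T1 and T2, the control word fields, and the three-step sequence for adding B to A are not taken from the lecture's slides.

```python
class Datapath:
    """Registers on one shared bus plus an ALU fed by two temporary registers."""

    def __init__(self):
        self.regs = {"A": 0, "B": 0, "T1": 0, "T2": 0}

    def clock_edge(self, enable, latch, alu_op=None):
        """One active clock edge: `enable` names the single bus driver,
        `latch` lists the registers that load the bus value."""
        if enable == "ALU":
            if alu_op != "add":
                raise ValueError("only 'add' is modelled in this sketch")
            bus = (self.regs["T1"] + self.regs["T2"]) & 0xFF
        else:
            bus = self.regs[enable]
        for name in latch:
            self.regs[name] = bus


def add_a_b_sequence():
    """Hypothetical control words for A <- A + B, one per clock cycle.
    The controller only sequences; no arithmetic happens here."""
    return [
        {"enable": "A", "latch": ["T1"]},
        {"enable": "B", "latch": ["T2"]},
        {"enable": "ALU", "latch": ["A"], "alu_op": "add"},
    ]


dp = Datapath()
dp.regs["A"], dp.regs["B"] = 200, 100
for control_word in add_a_b_sequence():
    dp.clock_edge(**control_word)
print(dp.regs["A"])  # 44, that is (200 + 100) modulo 256
```

Adding another instruction to this model means adding another sequence of control words; the data path class itself does not change, which is the benefit of the partition.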
At this point we are not in a position to say the design is complete, because the ALU is a complex block and even the registers are not yet broken down into known pieces, or higher-level blocks as we call them. So these have to be designed further. When we talk about the data path, I want to mention that a data path is composed of registers and combinational circuits. In our case there are the registers, the ALU, and the combinational circuit within the program counter. The controller is a finite state machine, and that too is composed of registers and combinational circuits, as we will see. One important issue is how many controllers are required for a data path. In a simple case only one controller is required. But if there are two asynchronous activities going on in the data path, that is, some operation in one part of the system is not synchronous with another part, then you need a separate controller for each of the two mutually asynchronous parts. In principle, if everything is synchronous to one clock you need only one controller, but then the controller can become very complex. When there is that much complexity we might divide the data path into multiple blocks, with a controller for each part and a top-level controller controlling those level 1 controllers. So one can think of a hierarchy of state machines in the case of a complex design.
So let us look at these registers and how to design them. This is an 8-bit register, so it involves 8 flip-flops. The easiest thing to handle is the output enable: since only one source can drive the data bus, we can put 8 tri-state gates at the output of the register, controlled by a common enable signal which comes from the state machine. But what about the input? We said the input should be latched only when the latch signal is given and the clock edge comes. So let us look at a simple design for this particular register as a level 2 design. This is one possible scheme. You have 8 flip-flops, and Q7 to Q0 go to 8 tri-state gates whose enables are controlled by the common register enable. Do not be confused by this connection: it just shows that the input and the output are connected to the same data bus. That is all my intent; do not think that it is some kind of feedback. Now what we want is that the latch signal along with the clock enables the latching of the data at the input: when the clock edge comes and the latch signal is high, the data gets latched. So we can AND them together; the clock is qualified by the latch signal. It is a simple scheme. There are some timing issues, which we will discuss later. But if you look at it, the design is over, because we have broken down the register into 8 flip-flops, 8 tri-state gates and one AND gate, all of which are known blocks and very simple circuits. So at level 2 the register design is over: it has an 8-bit register, 8 tri-state gates and one 2-input AND gate.
Now let us go back and look at how to design the program counter. As I said, the program counter is 16 bits. So you can already imagine a 16-bit register which holds the program counter value and drives the address bus. This 2-to-1 mux is controlled by the controller: when the program counter needs to drive the address bus the controller selects this path, and when the stack pointer needs to drive it the controller selects that path. Now look at the program counter operation internally. Say a jump instruction with an address comes. The jump opcode gets latched in the instruction register, and the address comes byte by byte, the least significant byte first. As I said, that byte is stored in a temporary register. It cannot be loaded directly into one half of the program counter, because then the address would be wrong: one part would hold the old address and the other part the new address. So it is stored in a temporary register, and when the most significant byte comes, it is loaded along with the stored value, in parallel, into the 16-bit register. So there is an input select which selects among the various inputs to the program counter. You know that at power-on reset the program counter has to be loaded with some specific starting location. So at the input of the program counter register there are different paths: one from the data bus and one from the reset value. Similarly, when an interrupt comes the program counter has to point to some address location, and that is another input to the program counter register. Another operation in the program counter is that after the execution of an instruction the program counter has to be incremented.
So there is an incrementer inside. After every instruction the program counter is incremented, and that incremented value is loaded back onto the program counter register. And ultimately, when the program counter output drives the data bus, it has to be through a tri-state gate, so this enables it. In an instruction like call, the current program counter value is stored on the stack. In that case there is a 16-bit value which has to go onto an 8-bit bus in 2 steps, and that is controlled by this output select.

So my point is that if you know the function of the program counter and you describe it, you already know what the internal structure of the program counter is. Knowing the operation of the program counter, we can infer the blocks and how they are interconnected. We do not need to worry too much about the timing, because the timing is given by the controller; that can be handled much later. So when you design a block at level 2, you need to know everything that the block does, and then you should be able to design that block.

So let us look at the program counter design. Before that, there is one more register design, which I will cover only briefly. I have shown one possible implementation of a register, and here is another possible implementation. In the earlier case the latch signal was ANDed with the clock, but in this case the latch signal goes to a 2-to-1 MUX. When the latch signal is high, the input comes to the flip-flops, and when the clock comes it gets latched. But you see, the clock is not controlled: the clock is continuously clocking the flip-flops. So at every clock the input gets latched, and when the latch signal is 0 we have to give the register its own output as the input, otherwise its contents will get corrupted. So we feed back, or recirculate, the output to the input when the latch signal is inactive.
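The recirculating scheme can be sketched in the same behavioural style. This is only an illustrative Python model with made-up names (`RecirculatingRegister8`, `clock_edge`); the point it shows is that the flip-flops load on every clock edge and the 2-to-1 MUX decides what they load.

```python
class RecirculatingRegister8:
    """Behavioural model: 8 flip-flops with a 2-to-1 MUX in front of the D inputs."""

    def __init__(self) -> None:
        self.q = 0

    def clock_edge(self, latch: int, data_in: int) -> None:
        # 2-to-1 MUX: new data when latch is high, otherwise the register's own output.
        d = (data_in & 0xFF) if latch else self.q
        # The clock is free running, so the flip-flops load D on every edge.
        self.q = d


if __name__ == "__main__":
    reg = RecirculatingRegister8()
    reg.clock_edge(latch=1, data_in=0x3C)   # load new data
    reg.clock_edge(latch=0, data_in=0xFF)   # recirculate: contents unchanged
    print(hex(reg.q))
```

Compared with ANDing the latch signal with the clock, the clock line here is left untouched and the control acts only on the data input.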
So this is a very nice way of controlling the register, a very useful way. The earlier scheme has its own problems, and as I said we will discuss them.

So let us get back to the program counter. We have described its operation, so let us look at the design. When it comes to the level 2 design of the program counter, it looks like this, and I will explain it. The program counter is a 16-bit register, and it is broken into two parts: the most significant byte, called PC1, and the least significant byte, called PC0. Both get the clock and a latch signal so that the input can be latched. Now you can see that the most significant part is the blue line, 8 bits, and the least significant byte is the red line, which is also 8 bits. You can see that this 8 bits and this 8 bits go to a 16-bit incrementer, or plus-1 circuit, and the result goes back to the respective program counter sections. So normally, after an instruction is executed, this particular path is selected so that the incremented value of the program counter is loaded back into the program counter.

Now this green path shows how the program counter is loaded with a new address, as in the case of a jump or a call. In that case the least significant byte of the address comes here and goes to a temporary register, which has a latch signal and the clock signal coming from the controller. So at the first byte the latch signal is given, and upon the clock the least significant byte gets latched here. When the most significant byte comes, the controller selects this green path through the input select line of the MUX. So the most significant byte comes here, the temporary register byte comes here, the controller gives a latch signal, and both get latched. That is what this path is about. Similarly, upon reset the program counter is loaded with a specific starting location, and the red line shows that path.
So at the time of reset the controller selects this red path and the reset value gets loaded. Similarly, when an interrupt comes, at the appropriate time the interrupt location is loaded into the program counter by giving the proper input select, by giving the latch signal, and so on. So that is about the input. That is how this 4-to-1 MUX is used: one input to choose the incremented path, one to choose the path from the data bus, one to choose the reset address, and one to choose the interrupt address.

Now you can see that the program counter drives the address bus through a 2-to-1 MUX which is 16 bits wide; the other port comes from the stack pointer, and the select line comes from the controller. That is the address bus. We have also said that when there is a call instruction or an interrupt, the CPU goes to a subroutine and has to return to the main program. So the return address has to be stored somewhere, and normally it goes to the memory. So we have to drive the values in program counter 1 and 0 onto the data bus, and that has to be done byte by byte. You can see that the orange line goes here and the blue line goes here. When such a scenario comes, the controller gives an output select to select the least significant byte and enables the tri-state gates, and the byte goes to the data bus. In the next clock cycle the other path is selected and it drives the data bus.

So that is the design of the program counter. Knowing the operation of the program counter, we are able to break it into pieces and design the data path through this maze of blocks. Essentially it is composed of three 8-bit registers and two 4-to-1 multiplexers in which each path is 8 bits wide, so we can say two 8-bit 4-to-1 multiplexers. It has a 16-bit incrementer, a 16-bit 2-to-1 multiplexer, an 8-bit 2-to-1 multiplexer and 8 tri-state gates.
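Putting these pieces together, the level 2 program counter can be sketched as a behavioural model in Python. Everything named here is an assumption made for illustration: the class and method names, the `InputSelect` encoding, the choice that `output_select=0` picks the least significant byte, and in particular the `RESET_ADDRESS` and `INTERRUPT_ADDRESS` values, which the lecture does not specify.

```python
from enum import Enum
from typing import Optional

RESET_ADDRESS = 0x0000      # assumed value, not given in the lecture
INTERRUPT_ADDRESS = 0x0008  # assumed value, not given in the lecture


class InputSelect(Enum):
    INCREMENT = 0   # PC + 1 from the 16-bit incrementer
    DATA_BUS = 1    # high byte from the data bus, low byte from the temp register
    RESET = 2
    INTERRUPT = 3


class ProgramCounter:
    """Three 8-bit registers (PC1, PC0, temp), input MUXes, incrementer, output MUX."""

    def __init__(self) -> None:
        self.pc1 = 0   # most significant byte
        self.pc0 = 0   # least significant byte
        self.temp = 0  # holds the low byte of a new address

    @property
    def value(self) -> int:
        return (self.pc1 << 8) | self.pc0

    def clock_edge(self, *, latch_pc: int = 0, latch_temp: int = 0,
                   input_select: InputSelect = InputSelect.INCREMENT,
                   data_bus: int = 0) -> None:
        # The 4-to-1 input MUXes pick the candidate next value from current state.
        if input_select is InputSelect.INCREMENT:
            nxt = (self.value + 1) & 0xFFFF
        elif input_select is InputSelect.DATA_BUS:
            nxt = ((data_bus & 0xFF) << 8) | self.temp
        elif input_select is InputSelect.RESET:
            nxt = RESET_ADDRESS
        else:
            nxt = INTERRUPT_ADDRESS
        next_temp = (data_bus & 0xFF) if latch_temp else self.temp
        # All registers update together on the clock edge.
        if latch_pc:
            self.pc1, self.pc0 = nxt >> 8, nxt & 0xFF
        self.temp = next_temp

    def drive_data_bus(self, enable: int, output_select: int) -> Optional[int]:
        # 8-bit 2-to-1 output MUX followed by 8 tri-state gates (None = high impedance).
        if not enable:
            return None
        return self.pc1 if output_select else self.pc0


def address_bus(pc: ProgramCounter, stack_pointer: int, select_sp: int) -> int:
    """16-bit 2-to-1 MUX between the program counter and the stack pointer."""
    return (stack_pointer & 0xFFFF) if select_sp else pc.value


if __name__ == "__main__":
    pc = ProgramCounter()
    pc.clock_edge(latch_pc=1, input_select=InputSelect.RESET)
    pc.clock_edge(latch_pc=1)                                   # normal increment
    pc.clock_edge(latch_temp=1, data_bus=0x34)                  # jump: low byte first
    pc.clock_edge(latch_pc=1, input_select=InputSelect.DATA_BUS, data_bus=0x12)
    print(hex(pc.value))
```

The controller is deliberately absent from this sketch. The keyword arguments of `clock_edge` stand in for the control signals it would generate, which matches the idea that sequencing can be handled later, once the data path blocks are fixed.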
So at this point all the blocks are known. These are simple registers, these are MUXes, an incrementer and tri-state gates, so we can say the design is completed. We might code the design in a hardware description language or we might draw a schematic; it does not matter. We have broken down the complex block into simple blocks, which are known high-level combinational blocks like MUXes, encoders, decoders, adders and subtractors. We know those circuits, and they can be designed. So that is how we design from top to bottom. Maybe you can handle the stack pointer yourself; give it a try.

So we have looked at the flat design, where a simple counter is designed, and at a top-down design in the case of the CPU. As I said, any complex system is designed top down. There is also something called bottom-up design, where you start with a very simple block: say you design the program counter, then you design the registers, then you design the ALU, and then you try to integrate them. That does not work in practice. But what can be done is this: suppose you have tried to design the CPU, you have made the level 1 diagram and then the level 2 diagram, and at that point you are not very clear about the program counter. Then maybe you can start bottom up by putting the blocks together. Otherwise every complex design has to be top down, or it will not work. And at each level you have to worry about the functionality, timing, electrical characteristics, power dissipation and so on.
So that is, in a nutshell, how you design a complex circuit. In today's lecture we have jumped from a very simple design to a reasonably complex design. We have looked at the design of a CPU, and we said that it cannot be handled in one shot as a flat design. It has to be partitioned into known functional blocks, and then each block may have to be further partitioned and designed down to a level where we know the blocks, that is, to the level of registers and known combinational blocks. Then we can stop the design. That is top-down design, and it is applicable to any complex system.

We have looked at the level 1 diagram of the CPU, and we have looked at what a data path is. The data path is where the data operations happen; it is composed of registers and combinational circuits. In the CPU, most of the data moves between the memory and the instruction register, or from the various registers to the ALU and back to the registers, between registers and memory, between the memory and the program counter, and so on. All of that is where the data movement happens, so all of that forms the data path. Then there is a controller, which controls the latching of the data, the enabling of data onto the data bus, and the choice of the various paths in the data path. All of that is the job of the controller. The controller does absolutely no computation; it coordinates and sequences the operations in the data path.
So this kind of division allows us to concentrate on the functional partition without worrying about sequencing. We can handle the whole sequencing, or timing, of the control signals later, when the controller is designed. Knowing the operation, we can break the block into pieces, and then we can handle the timing.

After that we looked at the registers and how to design them; we broke a register down into some gates and flip-flops. Then we looked at the various operations of the program counter and came up with a scheme where the program counter is designed in detail. We found that it is composed of registers, multiplexers of various types and tri-state gates. That is what you are going to see in a design: you will see a lot of multiplexers choosing various paths. So that is how you partition, go from top to bottom, and design the circuit.

In the next lecture we will look at the important part, the controller: what the behaviour of the controller is, what its structure is, what the basic idea is by which that structure works, and how to design a controller. All that we will study in the next lecture. We are now starting the design seriously, and you may not be used to designing complex designs, so please go back and revise. If you are not familiar with microprocessors, read some book on microprocessors and grasp the fundamentals so that you can follow this comfortably. So please work on this topic. I wish you all the best, and thank you.