Hello everyone, welcome to session 4 of the labs. Before this lab, I would recommend that each of you go through all the lecture slides discussed for Design Compiler and all three earlier labs; you should be very comfortable with them by the end of lab 3. In lab 4 I will discuss some miscellaneous items: the different formats in which we can write out a design, and how we can read a design back in from those formats. Then I will discuss some advanced concepts, take up one more example design, and clarify a few related points. Since this is somewhat advanced, you should understand everything that is discussed in lab 3 before going on to lab 4. One clarification before starting: I am using an alias called lt. It is not a Unix command. I have defined it as an alias in my source file, and it lists the files in order of modification time. You could choose to use a different alias, so please do not be confused: lt is not a Unix command, it is an alias. This time I am not using the design from the previous labs; I will use a different design. I have copied the RTL again and I am working in the work directory. So, this is the design. The top level is called chip level. It has a 16 x 16 multiplier, an 8 x 8 multiplier, a MUX module, a path segment module, a couple of 8-bit adders, a comparator, a counter, a 16-bit adder, and again a comparator at the top level. This is not a real-world design; it is just a bunch of datapath components put together to help us understand some aspects of the tool.
So, you could look at the RTL files; they are comparatively very simple. Take adder 16: it registers its inputs and then adds them. A in and B in are two 16-bit inputs, carry in is also one of the inputs, and the carry out and sum are A in plus B in plus carry in. Most of the modules are of this kind: adder 16, adder 8, multiplier 8 x 8, multiplier 16 x 16. Let us go to the work directory. I have written out a bunch of DDCs and Verilog netlists; I have tried three or four different flavors of synthesis. I have already synthesized them to save time and to show you quickly what I want to show. Let us first discuss the difference between writing out DDC and writing out Verilog. Let me open DC. Now, the write command supports multiple formats; the two most popular formats are DDC and Verilog. DDC is Synopsys's internal format, an internal binary format, and it is recommended if you are using Synopsys's own tools for the later steps. However, if you are using any third-party tools for the back end, they will not read DDC, so you will need Verilog. There is one major difference between DDC and Verilog: DDC contains the constraints as well, not simply the design. So, I will read in the DDC of one of the flavors of synthesis. As the name suggests, this is the DDC of the run with no IO delays. Now see: it has read the DDC file, but it cannot resolve the library cells referenced in the DDC, because I have not set the link library and the target library. So I will start again; I will remove the design.
So, this is to show you that you should first have all the variables set and only then read in the design. I have taken the important variables; these are the important four commands, and I have set all of them. Now I will read this DDC again. Whenever we read in a design, whether from DDC, from Verilog, or from RTL, we should always make sure that the top level is set correctly. It tells me here that the current design is chip level, and you can also verify with current_design that the current design is indeed the top-level design. Now, the next step. When we read in the complete RTL using the analyze command, we run elaborate on the top-level design. What does elaborate do? Elaborate converts the RTL into a generic, technology-independent (GTECH) design. But here it is not RTL; it is already a netlist, already in gate-level form. So we do not use elaborate; we use the command called link. Link makes sure that all the components in the design are resolved against the libraries we have loaded, so that every component is linked to one library or the other. Here it lists what is loaded: the DDC that we have read, and the library. So this linking is OK. Earlier it complained about the library because the DDC contains references to library cells and the target library and link library were not set; now it is correct. So this is an already synthesized design. It also has constraints, because it is a DDC and not Verilog; when we write out a DDC, the constraints are written out too. It already has a clock, a clock of 2 nanoseconds, which report_clock shows. It does not have any input or output delays; I check that with report_port -verbose. What it does have is loads on the outputs.
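The read-back sequence described above can be sketched as a short dc_shell script. This is only an illustration: the directory, library file names, and DDC file name are placeholders, and the transcript does not say which four setup variables the lab actually sets, so the four shown here are an assumption.

```tcl
# dc_shell (Tcl mode) sketch: set the library variables BEFORE reading the DDC.
# All paths and file names below are hypothetical placeholders.
set search_path       [list . ./libs]
set target_library    [list my_stdcells.db]
set synthetic_library [list dw_foundation.sldb]
set link_library      "* $target_library $synthetic_library"

# Read the gate-level design (with its constraints) from DDC.
read_ddc chip_level_no_io_delays.ddc

# Confirm which design is current; it should be the top level.
current_design

# The design is already a netlist, so use link rather than elaborate.
link

# Inspect the constraints that came along with the DDC.
report_clock
report_port -verbose
```

If the library variables are set only after read_ddc, the design has to be removed and read again, which is the situation the transcript walks through.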
So, the input delay is empty and the output delay is also empty. It does have a transition: a transition of 45 on the inputs. So I have defined just the environment constraints, but no IO delays. I synthesized it without IO delays, with just the clock. These are the constraints I used: I set the operating conditions, I set the input transition, I set the load, I created the clock and its settings, and I just did the compile with the standard options. Once this was done I wrote out the DDC, and now I have read the DDC in again. If I had read in the Verilog instead, only the design would have been in place and I would have had to reapply the constraints. What we can also do is write out the constraints using a command called write_script. Let us see the help for write_script. Constraints can also be written in different formats. The best-known format is SDC, Synopsys Design Constraints, and there is also dctcl, which is the DC Tcl format. I will use the DC Tcl format and write to a Tcl file with the -output option. Let us open the file and see what it wrote. It wrote the units, the operating conditions, the library settings, and the loads. Now see: the single command that I issued, set_load on all outputs with a value of 10, translates into many set_load commands, one on each of these ports. The file also contains the clock information, the input information, and so on. So these are the constraints that DC wrote out for you. So there are two choices when I want to read the synthesized design: either I have a DDC, or I have a Verilog netlist plus constraints, either written out like this or the original constraints in Tcl. Usually, when I deliver to the back end, I give the Verilog plus this kind of constraint file.
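The two delivery options described above (DDC alone, or Verilog plus a constraint file) might be written out as follows. The output file names are hypothetical. Note that in Design Compiler the SDC file is produced by the separate write_sdc command, while write_script produces the DC Tcl form.

```tcl
# Option 1: DDC carries both the netlist and the constraints.
write -format ddc -hierarchy -output chip_level.ddc

# Option 2: Verilog netlist plus a separate constraint file.
write -format verilog -hierarchy -output chip_level.v

# Constraints in DC Tcl format (expands e.g. one set_load on all
# outputs into one set_load per port).
write_script -format dctcl -output chip_level_constraints.tcl

# Constraints in SDC format, the form most back-end tools read.
write_sdc chip_level.sdc
```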
Now, let us go back to Design Compiler; the design is already loaded. I want to show you the reports on the constraints. With this particular DDC, let us first look at how to get the reports, and then at some report_timing examples. First, let us look at which constraints are violated: report_constraint -all_violators shows that, and there is also the -verbose option. For this DDC, these are the only violations in the design, and all of them are hold violations. As I have discussed before, do not worry much about small hold violations. They are expected, and we are not expected to fix them in synthesis; they are better solved later, after the layout is done. I will come to that later. So there are no setup violations at all, but the fact is that there are no input and output delays. The only constrained paths are the register-to-register paths. If I do a report_timing, by default it reports the worst path for each path group. Since there is only one clock, there is just one path group, which is clock, and it will tell me the worst path in it. So it is telling me that this is the worst path. I will not discuss the report_timing format again; please make sure you have understood it clearly. It shows the complete path, and we see that there is no violation: the slack is not negative. It is met with a slack of zero, which means timing is barely met. Now, there is something called the critical range, set with set_critical_range, which has several options. It matters because DC optimizes the design over multiple iterations, as we have seen.
Now, by default, when you do not issue any special command, DC will try to fix the worst violating paths first. It focuses its energy on the worst violators, the paths for which the slack is less than zero. In this case you see that the slack is exactly 0. For some special cases, people may not want the paths to sit right at zero slack; they want some margin, say 100 picoseconds or 0.5, something like that. You can use set_critical_range to control how far beyond the worst path DC works. Again, it is an advanced feature, used for very specific cases. You specify the value to which the critical_range attribute is to be set, and you can specify which designs it should be set on. The compile command uses this attribute from the top-level design, and for the path groups that do not have it set, the default value is 0, which means DC works only on the worst path. We can assign the critical range using the set_critical_range command; it can also be given in group_path. The value is a floating-point number of 0.0 or greater. A critical range of 0 means only the most critical paths are optimized. If we specify a non-zero value, the near-critical paths within that amount of the worst path are also optimized where possible. So you could set the critical range to 10 ps, for example. Since the worst slack here is 0, DC will then try to optimize all the paths that have slack between 0 and 10 ps. You could try it out; if time permits, maybe in the next session we can try one example of this, but it is a very special use case and you will not encounter it very soon. So now, what happens to the input and output paths?
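As a rough illustration of the critical range setting described above, the commands might look like this. The value 0.1 is arbitrary and assumes the library time unit is nanoseconds (so 0.1 would be 100 ps); the group name INPUTS is also hypothetical.

```tcl
# Set the critical range on the current (top-level) design: paths within
# 0.1 time units of the worst path in each group are also optimized.
set_critical_range 0.1 [current_design]

# Alternatively, give a critical range to one path group when creating it.
group_path -name INPUTS -from [all_inputs] -critical_range 0.1

# The setting takes effect at the next compile.
compile
```

A larger critical range makes DC work on more near-critical paths, which usually costs runtime, so it is normally kept small.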
So, we reported paths, and we saw that there are only hold violations and no setup violations. How do we report the paths starting from the inputs, the worst such path? You could say something like report_timing -from [all_inputs]. This gives us the worst path: the start point is an input port and the end point is a register clocked by clock. But the path group is none and the path is unconstrained, because there is no input delay. The input delay is assumed to be 0, but DC does not know which clock it is relative to; it cannot assume a default for that. Whenever we specify an input or output delay, we also tell which clock it refers to. When no input delay is defined, DC assumes a value of 0, since it cannot assume anything else, but it does not know what clock it is on. So it will just say that the path is unconstrained, because there is no target time by which this data should arrive. Please also note that there are a lot of full adders on this path; it is going through a multiplier. So, two important things: the path group will be none and the path will be unconstrained if you do not specify an input delay. Similarly for outputs: to report timing to the outputs we say report_timing -to [all_outputs]. Again the path starts from a register, the end point is an output port, and the path is again unconstrained. Here DC does not even show an output delay line, because the output delay forms part of the data required time. If you remember, there are two sections in the report: data arrival time and data required time. The output delay is part of the data required time, which is not present here because there is no required time on an unconstrained path. The input delay is part of the data arrival side, so there it shows the assumed 0.
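The two reports discussed above can be reproduced with the commands below. The comments summarize what the transcript says to look for in a design with no IO delays.

```tcl
# Worst path starting at any input port.
# Expect: path group "(none)", input external delay 0, path unconstrained.
report_timing -from [all_inputs]

# Worst path ending at any output port.
# Expect: no output delay line and no data required time section,
# because slack = data required time - data arrival time cannot be
# computed without a required time.
report_timing -to [all_outputs]
```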
So, these paths are unconstrained. DC is free to reclaim area on them, and if it reclaims area, those paths will be very slow. Let us see what the area is: the total cell area is 5205. Now, is it good or bad not to have input and output delays? Let us see. One more example of report_timing: I want to report the worst path that starts from a register and ends at a register, the worst among all register-to-register paths. In this case it is easy. Because input and output delays are not specified, I just give report_timing, and the worst path that comes out is in fact a register-to-register path. Why? Because all the input paths are unconstrained and all the output paths are unconstrained. Remember that report_timing by default reports the single worst path in each clock group. If you want to report multiple paths, you can say -max_paths, let us say 5, and it will give you the 5 worst paths. One more type of path starts from an input and ends at an output: report_timing -from [all_inputs] -to [all_outputs]. These paths are completely combinational. For example, this one goes from select bit 0 to the MUX output. We see that the input external delay is again 0, but the path is unconstrained. So we saw that input to register is unconstrained, register to register is fine, register to output is unconstrained, and input to output is unconstrained. Now, I have a second version of this design where I have specified an input delay of 1 on all inputs and an output delay of 1. I will remove this design and read in the DDC of the design with IO delays.
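The reports used in this part of the walkthrough can be sketched as:

```tcl
# Total cell area of the current design.
report_area

# By default report_timing shows one worst path per path group;
# -max_paths asks for more (here the 5 worst).
report_timing -max_paths 5

# Purely combinational paths: from an input port to an output port.
report_timing -from [all_inputs] -to [all_outputs]
```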
So, I do remove_design -designs (I will not use -all), and then I read in the DDC with the IO delays. The current design is correctly set, I do a link, and it is fine. Let us see report_constraint first. It gives a message that it is updating the design: updating graph, updating timing. It has the design and it has the constraints; it just needs to do a little processing under the hood to calculate the timing and give you all this. Now there are setup violations; these are the end points. We see that there are some outputs on which there are violations; the others might be on input-to-register or register-to-register paths, we do not know yet. So now there are a lot of violations, plus the hold violations, compared to the run where we gave no IO delays. Let us analyze where these violations lie. First I want to check the violations from the inputs, so again I do report_timing -from [all_inputs], and this is the worst path from the inputs. I see that the input delay is 1 and that DC has done some optimization. If you remember, in the earlier run, where there was no input delay, the input-to-register path was very long; the worst path was about 3.11 nanoseconds. Here DC has done some optimization because there was a constraint, and since there is a violation we know that DC has worked on this path. We do not see so many levels of logic. You could go and compare the timing of the input-to-register path in this DDC and in the no-IO-delay DDC; you would find that the paths are much more optimal here, because there is a constraint. DC can only optimize where there is a constraint.
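Switching from one synthesized flavor to the other, as done above, might look like this in dc_shell. The DDC file name is a hypothetical placeholder.

```tcl
# Remove the designs currently in memory (libraries stay loaded).
remove_design -designs

# Read the flavor that was synthesized with input and output delays.
read_ddc chip_level_io_delays.ddc
link

# List every violated constraint with its end point; the first report
# after reading triggers a timing update.
report_constraint -all_violators

# Then look at the worst path that starts from an input port.
report_timing -from [all_inputs]
```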
So, there is a violation on the input-to-register path. The reason is easy to see: the input here is not registered; it goes through many levels of combinational logic before it reaches a register. With this I know that the design is not good; it is not a real-world design. All the inputs should have been registered: there should be a flop right after the input, with no combinational logic in between. In that case it would meet the input requirement; it could even meet 1.2 nanoseconds. The design is very simple, so what I want you to do, as an assignment, is to correct the design so that the input-to-register paths are good for timing. How do you do that? Register all inputs. Similarly, I want to see the output timing. I will say report_timing -to [all_outputs]. What it gave us is that the worst path ending at an output in fact starts from an input. This is why it is the worst: the output delay is 1 and the input delay is 1, so you are left with 0 for the path itself. The clock period is 2 nanoseconds; out of that total you gave 1 nanosecond on the input side and 1 nanosecond on the output side, so there is nothing left for the operation you wanted to do inside. For such cases (I will come to this later) we have to make sure that the IO constraints are not set in such a way that combinational paths are constrained like this. It is a very bad constraint, because it does not leave anything for the internal logic; it gives all the timing away on the input side and the output side. A practical solution would be to reduce the input delay or the output delay. There is one more solution; I will come to that later.
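The budget problem described above can be made concrete with a small constraint sketch. The clock and port name clk is an assumption (the transcript only says the clock is called "clock"), and the reduced delay values of 0.5 are arbitrary illustrations of the "reduce the input or output delay" fix.

```tcl
# Hypothetical clock/port name; adjust to the real design.
set clk_name clk
create_clock -name $clk_name -period 2.0 [get_ports $clk_name]

# All input ports except the clock port itself.
set data_inputs [remove_from_collection [all_inputs] [get_ports $clk_name]]

# The constraint from the lab: 1 ns outside on each side.
set_input_delay  1.0 -clock $clk_name $data_inputs
set_output_delay 1.0 -clock $clk_name [all_outputs]
# Budget for an input-to-output combinational path:
#   2.0 (period) - 1.0 (input delay) - 1.0 (output delay) = 0.0 ns

# One practical fix: give less of the period away at the ports.
set_input_delay  0.5 -clock $clk_name $data_inputs
set_output_delay 0.5 -clock $clk_name [all_outputs]
# Budget now: 2.0 - 0.5 - 0.5 = 1.0 ns for the internal logic.
```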
So, now I want to report the worst register-to-register path. Let us do a plain report_timing and see what it does. It tells us the worst path, with its start point and its end point register, clocked by clock. Now, all the paths in this design are in the path group clock, whether they are input to output, input to register, register to register, or register to output. So what is the way to report only a register-to-register path? You could use the command all_registers. all_registers gives you a collection of all registers, but it has options: all_registers -data_pins returns a collection of all the data pins of the registers, and -clock_pins returns all the clock pins of all the registers. I know that in a register-to-register path the start point is the clock pin of a register and the end point is a data pin. So for -from I build a collection of all the clock pins of all the registers, and for -to a collection of all the data pins of all the registers. If I do this, it tells me the worst register-to-register path. So this is the path, ending at the data pin of register U7. Now we see a violation. Compared to the design without IO delays, all we have changed is that we have applied input and output delays. We are just playing with input and output delay values, so do we expect the register-to-register timing to get worse? No, we did not expect that. Something is wrong; there is some mistake we have made, so that the slack is violated on a register-to-register path. Ideally, on the input-to-register and register-to-output paths, you can play with the delay values.
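The register-to-register filter described above, written out as a single command:

```tcl
# Start points: clock pins of all registers.
# End points:   data pins of all registers.
# Together they restrict the report to register-to-register paths only.
report_timing -from [all_registers -clock_pins] \
              -to   [all_registers -data_pins]
```

This is useful whenever all paths share one path group, because a plain report_timing would otherwise show whichever path is worst, including IO paths.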
You can even change the design at its boundaries, and so on. But the register-to-register paths are typically the limiting factor for your performance, not the input-to-register or register-to-output paths. The IO paths should not be timing critical; what is usually timing critical is the register-to-register path. Imagine a CPU: the arithmetic and logic operations, the core of the ALU, are what is timing critical, not the incoming ports to a register or a register to an output. Now, what is the problem? Why did DC not optimize this path? We know that DC can optimize it, because we have seen in the earlier example that without input and output delays DC was able to meet timing on this path. But in this case it does not. Why? The problem is that DC works on path groups. Here there is only one group, which is clock. DC groups the timing paths by the clock that constrains them. In this case the core is on this clock, the input delay is on the same clock, and the output delay is on the same clock, so DC has no way of knowing whether you want to prioritize register to register, input to register, or register to output. So it works on all the paths together. Let us see report_constraint. This is a very important thing; please make sure that you understand this concept. In this case we know that the top violators are input-to-register paths. DC works on the top violators, but it finds that it cannot optimize the input-to-register paths any more; they are optimized to the maximum and still have so many violations, and the register-to-output paths again have so many violations. So what it does is spend all its time and energy on those top violators.
If there were only one path violating, DC would eventually move on to the other paths, but if there are multiple paths violating at the top, it will focus its energy on those. It cannot fix them, so it will leave out some of the register-to-register paths and not optimize them. Why? Because every synthesis run has some effort limit, even though this is a very small design. Since all the paths are in the same clock group and DC does not know any priority, it assumes that all paths have the same priority. It sees so many violations starting from the inputs or going to the outputs that it spends all its time and energy on them, and in the process it might leave out some of the register-to-register paths. We do not want that. Now, there is a much more sophisticated technique, and I recommend that you always use it. What I do is synthesize the design again, and the one extra thing I do is use a command called group_path. So, the first design we saw did not have any input or output delay. The second flavor had input and output delays, and we saw the problems on the register-to-register paths. Now I issue group_path: I create a new group called input group and say that all the paths starting from the inputs go into this group. Again, I say that all the paths ending at the outputs go into the output group. Now I read in the DDC of the run in which I used the group_path commands as well, I do a link, and again I do report_constraint -all_violators. We see that there are violations. It now has three groups: clock, input group, and output group. Two of the groups we have created explicitly. So now it reports the violations per group.
So, setup is violated in the input group. That is expected, as we saw in the earlier reports; these are all input group violations. Reporting also becomes much better now. We know intuitively that input group means that, at these end points, the violation starts from some input port. And that is it: there are no setup violations in the group clock, and there are no violations in the output group either. Now, let us do a report_timing. report_timing reports the worst path in each group, whether or not it is violating. If none of the paths is violating and all the timing is met, it still reports the path with the worst slack, so the reported slack could be negative or positive. The first path it reports is for the path group clock. The path group clock now in fact contains only the register-to-register paths, because we have created separate groups for input and output. And this path is met. So we are back to a good synthesis result. We saw in trial one that without IO delays the register-to-register timing was met. In the second trial we saw that the timing went bad. In this one we are able to restore the timing. How did we achieve that? By group_path. Now, report_timing also reports the worst path for the input group and for the output group. The second path it reports is for the input group. Timing is violating here because it is a long path. Then it reports the output group. The output group is also meeting; in the earlier run there was a violation on paths to the outputs, and now it is not there. How did DC achieve this? DC was able to achieve this because now it sees the groups; earlier there was only one group. For as many groups as DC sees, it tries to meet the timing for each of those groups individually. So it tries to meet timing for clock.
Now, the clock group has no competition from the input group. Clock is a group in itself, containing all the register-to-register paths; DC tries to meet timing on it, and we saw that it is able to. Then it works on the input group. The input group's paths are timing critical because they are long, with a lot of combinational logic, so it tries to optimize them. It might happen that paths lower down in the list could be optimized but are not, because there are a number of paths violating by more than these paths. So within the same group, the top violators can prevent optimization of the paths that are violating by a smaller amount. Then the output group: we saw that in the last round the timing to the outputs was also violating, but now it is not violating. So DC works on every group, and it is a very good practice to create separate groups for input and output. All these things change when multiple clocks come into the picture. So the best practice is, first, to register all inputs and outputs, so that timing closure becomes very easy. Second, if you find that your register-to-register timing is violating where you expected it to pass, look at the other groups, look at the list of violators, and make sure that you segregate the groups. Things become much more complex when you have multiple clocks. Then you have to make sure that the inputs get their input delay with respect to the correct clock, not a wrong one, or you could use group_path to create separate groups of your own. Read more about group_path in the man page; it provides a lot of functionality.
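The path grouping described above might be set up as follows before compiling. The group names INPUTS and OUTPUTS are hypothetical stand-ins for the "input group" and "output group" of the lab, and plain compile is an assumption about the compile command used.

```tcl
# Separate the IO paths from the register-to-register paths.
group_path -name INPUTS  -from [all_inputs]
group_path -name OUTPUTS -to   [all_outputs]

# Synthesize; DC now works on each group's worst paths separately.
compile

# Violations are listed per path group (clock group, INPUTS, OUTPUTS).
report_constraint -all_violators

# One worst path is reported for each of the three groups.
report_timing
```

With this grouping, the clock group is left holding only the register-to-register paths, which is why their timing recovers in the third trial.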
So, in fact, you could also set a weight in group_path. Once you have separate groups created, you can tell DC that this group has higher priority, so that DC spends more effort on it, by changing the weight value. If you create a group without specifying the weight and the critical range, then for DC all the groups are equal and each works with a critical range of zero. You could tell DC that this group is more important, so please spend more time on it. You can read more about this weight feature and try it; maybe in the next lab session I will try it out, but not in this one. I just wanted to introduce the concept of path grouping and to make sure that you understand how DC optimizes within a particular group, and how the top violators in one group can affect the optimization of paths that are violating by a smaller amount. I hope the concept is clear now.

Let us see a few other things. We discussed IO delays and how applying or not applying them affects your design. We discussed the write command, we read back a DDC, and we discussed group_path. Now let us see a command called change_names. For this particular design I have written out a Verilog netlist; let us look at the one with no IO delays. If you notice, there are backslashes in the names of wires. Probably this wire was 16 bits wide and it was split like this: backslash, the wire name, slash, n16, and so on. We see that the instance names have backslashes as well. This is the way Design Compiler operates when it works on a design, and when we write out Verilog it writes the names out like this, but this netlist will have problems when some third-party tool reads it.
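A minimal sketch of the weight feature mentioned above. The group name and the weight value of 2.0 are arbitrary illustrations; to my understanding the default weight of a path group is 1.0, but check the group_path man page before relying on that.

```tcl
# Give the paths ending at outputs a higher priority than the
# other groups (which keep their default weight).
group_path -name OUTPUTS -to [all_outputs] -weight 2.0

# The weight influences how the next compile trades off violations
# between groups.
compile
report_constraint -all_violators
```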
Why? Because the naming convention does not follow the Verilog rules. So what do we do to fix this? There is a command called change_names. What I did was apply change_names on this design: I said change_names -verbose, which prints the list of changes, I said -rules verilog to tell DC to follow the Verilog rules and correct the names, and I said -hierarchy. Now it is telling me that in this design this object is a cell and it is changing its name to this. So you see that the slash in the instance name is removed. It gives you a list of all the names it changed. Let us look at a net, for example: the net had a slash in it, but here it is replaced by an underscore. The replacement of slash by underscore is configurable; there is a command called define_name_rules with which you can define your own rule. But for all practical purposes you can use change_names with the Verilog rules over the complete hierarchy, and then write out the design. So let us look at the design written after change_names. Now the instance names are corrected; there are no backslashes or slashes in the wire names, and so on. So now it is safe to deliver this Verilog to a third-party tool; there will not be any problem in reading it. Ideally you should include change_names before writing out the Verilog. So, this was all about change_names.

Now let us look at something about the DW module. We discussed in the lecture that DW is the short form for DesignWare; DesignWare is a group of datapath components from Synopsys itself. Let us look at the design: there is a DesignWare component in it, I think in path segment. So here you could use something like this: DW02_mult, the DesignWare model for a multiplier.
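Going back to the change_names step covered earlier in this passage, the sequence run just before writing the netlist can be sketched as below. The output file name is a hypothetical placeholder.

```tcl
# Rename cells, nets, and ports across the whole hierarchy so that
# they are legal Verilog identifiers; -verbose lists every change.
change_names -rules verilog -hierarchy -verbose

# Only then write the netlist that third-party tools will read.
write -format verilog -hierarchy -output chip_level_change_names.v
```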
The hash-8 parameters tell the operand widths: it multiplies two vectors A and B, and we are telling it that the first is 8 bits and the second is also 8 bits. You should look at the documentation before using any DW component like this. This is how it is instantiated, so you can use it directly. You do not need a Verilog model for this, because DC has the knowledge of this DW component. Before we go back to dc_shell and start the synthesis from scratch, one more thing: let us look at the netlist. Let us look at the Verilog of the run without IO delays and try to find if there is any DW component. There is no DW component in this netlist. Let us go to the path segment, because the DW component was instantiated there; but here we do not find any DW component. There is no design by the name of DW. Why? Because by default, during compile, Design Compiler ungroups the DesignWare components. You cannot see the actual architecture of a DesignWare component because it is proprietary; you only have the information about what it does and how to instantiate it. And in the compile process its hierarchy is ungrouped, and you are left with just its gates. How can we avoid that? Let us see. I will remove all the designs and read in the RTL; I will start from RTL. I will set up everything and apply the constraints. Let us not apply input and output delays, because what I want to show you does not depend on them. Now let us search for a variable which controls the ungrouping. There is a variable called compile_ultra_ungroup_dw. It is set to true; let us set it to false. And now let us compile. It is compiling. If you read the compile log, it shows that it is using the DesignWare building block library.
It shows which version it is using, and which license it is using. So, it is mapping. It is also adding DesignWare to the key list of the comparator design. Now, there are two places where DesignWare gets used. One is when you instantiate it explicitly, as we did in the path segment. So the log says "implement synthetic" for the path segment, which means it is implementing a DesignWare component for the path segment; a synthetic library is nothing but DesignWare components. The other is when you simply write A plus B: if the DesignWare library is loaded, DC will use one of the DesignWare components for it, for example a DW adder. The log tells us it is implementing synthetic for adder 16, for multiplier 16x16, for the counter and for the comparator also. So there are a lot of designs already available in DesignWare, and DC will try to use them as much as possible. Synthesis is done. There is a command called report_resources; you run it over the hierarchy, and it tells us what resources this complete design is using. It told us that in the path segment module the cell U100 is using DW02_mult, which is explicit, with width 8, and it lists the operation it contains. So this one is explicit, but this adder is not explicit. We saw only one DW component in the RTL; this other one was chosen by DC by looking at the RTL. We will look at the path segment to see what is causing this. So a DW adder is instantiated by DC; this is the cell, the width is 16, and it contains these operations. The report covers each of the designs; for example, here it says there are no multiplexors to report. Now, there is one more design that it did not ungroup. In the path segment there was a DW02_mult instance. It did not ungroup it, and that is why it shows up as a separate design.
Apart from this explicit one, it used a lot of other DW components, and none of them is ungrouped, because the ungroup variable is set to false. There is one more design: for multiplier 16x16 it used an unsigned multiplier DW component, and this is the implementation report. It tells us that for this multiplier it used the apparch implementation. So apparch is a type of implementation of a multiplier; I think pp means partial product, but I do not remember exactly what the name stands for. It is a partial-product-based architecture, I think somewhere along the lines of the Booth architecture, and it is optimized for area. So it chose this. For multiplier 8x8 it also chose a similar kind of architecture. For the comparator, which we will see in detail since I have one more example of that, it chose a DW comparator component. So report_resources tells us about the resources, meaning the DW components such as adders or multipliers, that the compile command used. Again, for adder 8 it used a DP_OP, which is again a kind of DW component, and so on. This is the complete report. You should read this report and try to understand what it means; it is not too complex. It tells you what the operation is, what architecture was used, and so on. It also tells you whether it shared some of the resources; here it says there is no resource sharing information to report. So there are three things this report covers: resource sharing, implementation and multiplexors. Now let us look at one example and try to map it to the RTL. Let us look at adder 8. We will open the adder 8 file and see what it does. This is adder 8.
It is very simple, but what is causing DC to use a DesignWare component is this assignment statement: a plus b plus carry-in. These two plus signs cause DC to use a DW component. So for adder 8 it used something called DP_OP, which is a kind of adder. DP_OP is nothing but a datapath operator. This datapath operator contains two operations, add 20 and add 22. The report tells us that there are three variables, because we saw a plus b plus carry-in: three primary inputs and one primary output. All of these are unsigned, because by default they are unsigned. The width is 8 for the two operands and 1 for the carry-in. The output will obviously be 9 bits wide, and the expression is I1 plus I2 plus I3, that is, a plus b plus carry-in. The report also tells us what the implementation is. One datapath operator can have multiple implementations. You can look at the DesignWare documentation to know more, but for example a multiplier will have a lot of implementations; some of them will be good for area, others will be good for timing. For this DP_OP it is choosing an implementation called str, which is optimized for area. I do not remember what str stands for; maybe after seeing the netlist you can guess. Have I written out the netlist? Yes, I have this netlist, which is the design with the DW components kept. Now let us search for anything called DW. We see multiplier 8x8; let us look at the path segment. This is the module path segment, and we see that there is an adder instance which DC has used. It has not ungrouped it, so you can see that it has been created in the netlist as a separate module. Now this can be compared to the RTL. The RTL has a plus b; let us open the RTL as well. There is the plus sign.
So this is the block: the multiplier is U100, and then there is the adder here. There is an ALU which works on a data and b data, both 16 bits. It uses a DW multiplier to calculate the product, and based on the operation code, if it is 0 the output is a plus b, and if it is 1 the output is the product. So, to implement it, it needs one multiplier, which is instantiated explicitly, and one adder. Let us look at the netlist now. This is the adder. U100 comes from the RTL, but this one was chosen by DC: a data, b data, carry-in. It is a type of adder; it is DW01_add. So this is the way DC chooses: it sees a plus sign, and it finds a DW component for it based on the design constraints. We can look inside these modules to see what they contain, but it would not be of much use, because these implementations are not easy to understand from the netlist. This is the multiplier module; it is a big module with a lot of combinational logic, adders and so on. It is not possible to understand from this netlist what kind of architecture it is. But rest assured that these architectures are documented. Although the exact code is not given in the documentation, it does tell you what each implementation is suited for and how area and performance compare across the implementations. Let us look at module adder 8. I will come to why the name ends in 0 or 1. Module adder 8 contains a DP_OP module which performs the actual addition. This is the DP_OP module, and it is just a combination of full adders, which was expected: if you add three bits, that means a full adder, and this is a full adder.
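The remark that adding three bits means a full adder can be sketched in Python as below. This is a hedged illustration only: it chains full adders in a ripple-carry arrangement, which is an assumption made for the sketch and not necessarily the structure DC picked for the DP_OP module; the function names are hypothetical.

```python
def full_adder(a: int, b: int, c: int):
    """Add three single bits and return (sum_bit, carry_bit)."""
    sum_bit = a ^ b ^ c
    carry_bit = (a & b) | (a & c) | (b & c)
    return sum_bit, carry_bit


def ripple_add(a: int, b: int, carry_in: int, width: int = 8) -> int:
    """Compute a + b + carry_in for unsigned width-bit operands,
    using one full adder per bit position (ripple-carry, assumed)."""
    carry = carry_in & 1
    result = 0
    for i in range(width):
        bit, carry = full_adder((a >> i) & 1, (b >> i) & 1, carry)
        result |= bit << i
    # The final carry becomes the extra top bit, so the result
    # is width + 1 bits wide (9 bits for 8-bit operands).
    return result | (carry << width)
```

For 8-bit operands the largest result is 255 + 255 + 1 = 511, which fits in the 9-bit output that the report shows for a plus b plus carry-in.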
Now, the number of full adders and the way they are connected define whether the implementation is area optimized or timing optimized. In this case we saw from the report_resources report that it is area optimized. So, the moral of the story: use the report_resources command to know what resources are used in the design. You can infer a multiplier by just using a multiplication sign, or you can directly instantiate a DesignWare component; that is up to you. But rest assured that for every plus, minus or multiplication sign that DC sees, it will try to find a suitable DW implementation, as long as DesignWare is available to it. Now let us come to the mystery of the module name of the adder. Let us look at the design. We see that there are multiple instances of adder 8: in cascade mode there are two adder 8s. This is a case of uniquification. We did not specify anything, but by default the compile command uniquifies the design, that is, it makes different copies of it. So let us open the netlist and look at the list of modules. See adder 16: the module name is preserved because there is only one instance of adder 16. Adder 16 is used only once; adder 8 is used multiple times, and the comparator is also used multiple times, a comparator here and a comparator here. So in the list of modules, adder 8 has two modules: adder8_0 and adder8_1. That means the netlist has two different designs, adder8_0 and adder8_1. Although in the RTL the module name is the same, when it comes to the netlist DC makes independent copies.
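The naming effect of uniquification described above can be modelled in a few lines of Python. This is a sketch under assumptions: the instance names are hypothetical, the `{module}_{index}` naming style is chosen to match the adder8_0 and adder8_1 names seen in the netlist, and clashes with already existing module names are not handled.

```python
from collections import Counter


def uniquify(instances, naming_style="{module}_{index}"):
    """Map each instance to the module name it refers to after uniquification.

    instances: list of (instance_name, module_name) pairs.
    A module instantiated once keeps its name; a module instantiated
    several times gets one independent copy per instance.
    """
    counts = Counter(module for _, module in instances)
    next_index = Counter()
    result = {}
    for instance, module in instances:
        if counts[module] == 1:
            result[instance] = module  # single use: name preserved
        else:
            result[instance] = naming_style.format(
                module=module, index=next_index[module]
            )
            next_index[module] += 1
    return result
```

For example, with one instance of adder16 and two instances of adder8, the first keeps the name adder16 while the other two become adder8_0 and adder8_1.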
When the design is synthesized, adder8_0 has nothing to do with adder8_1. Although the functionality is the same, the implementation might be different, because each of them gets optimized according to its own design goal. Similarly for comparator_1 and comparator_0. What name gets used after the underscore can be controlled by a variable in dc_shell. Just as the change_names behaviour can be controlled, all these things can be controlled: if you want a specific prefix or a specific suffix to be present, you can control that. So, we saw the report_resources command, which is very useful to know what resources are used, and we discussed the uniquification process. Now let me show you one more command which is important for the back end: it is called set_fix_multiple_port_nets. RTL is logical; in the logical part you are only worried about connectivity, and you are not worried about physical needs. So many times in RTL you will have assigned an output to be equal to some input. When that goes to synthesis, DC will not do anything special; it will just connect the output to the input. But when it goes to physical design, these kinds of assign statements are usually a problem. You always want at least one buffer separating the output from the input, and then, according to the design rules, more buffers can be added or the buffer can be upsized. You might also have a case of one register driving multiple outputs. In that case the loading on the register increases, but DC will fix it only to the extent of meeting the constraints on that net.
But when it goes to back-end physical design, it might happen that those outputs are not placed close together; the ports may be placed far away from each other. In that case you want buffers between the register and the ports. What you are telling DC by this command is to fix all such cases. You can read more about it in the man page. It applies an attribute for fixing multiple port nets: it inserts buffers to isolate input ports from output ports, it inserts buffers so that no cell driver pin drives more than one output port, and so on. This is just to make the job of the back end easier. Logically, it does not change the functionality of the design. So, before compiling, you use this command; in any case, before writing out the netlist make sure you have used it. All these things are to make the physical implementation, the physical design, smooth. Now that this command is also done, let us look at the different options of compile_ultra. We have been using -no_autoungroup, and we have discussed it. Then there is -no_seq_output_inversion: without it, DC inverts the sequential output in some cases where it finds that it can optimize the design better. What does inverting the sequential output mean? Let us say there is a flop, and it drives a cloud of logic. If DC finds that by inverting the output of this flop the logic can be optimized better, then, if you do not give this option, it will do that, and it will invert again. Many times you do not want that for a flop; when we come to formal equivalence I will tell you why you do not want it. So, ideally, to start with, you do not want to invert the output of a sequential element.
So in most cases you should use this option. You can read about -exact_map, but I will not go into it. Boundary optimization we have discussed: it can move inverters across hierarchical boundaries. Again, to start with you do not want this, but you can allow it later when you are comfortable with the compile command. I will talk about this later in the lectures. Incremental we have talked about. You can also split your compile into two runs with -no_design_rule and -only_design_rule, but in most cases you do not need to give either. Then there are the options for scan and gate clock; gate clock I will discuss first in the lecture and then I will have a lab on it. Scan is simple. I have done a compile with -scan; let us see how it looks. Let us first look at the non-scan one, the Verilog of the run without IO delays, and look at one of the flops. These DFF cells are non-scan flops; each has data, clock and Q. Now let us look at the netlist compiled with scan. Here the DFFs are replaced by SDFFs. It just replaces all the flops which are non-scan by their scannable versions. So DFF is replaced by SDFF, which means scan DFF. It has extra pins called SI and SE: scan in and scan enable. When scan enable is 0, D goes to Q; when scan enable is 1, SI goes to Q. Since it is just a replacement of a D flip-flop by a scan flip-flop, DC does not stitch any scan chain; it only converts the flops. So all the scan enable pins SE are tied to 0, and the scan in pins SI are tied to 0. This is the default behaviour when you give -scan. Of course, to stitch the chain you can use any DFT tool; you can also use DFT Compiler, which is a feature of Design Compiler. So this is what -scan does. With this we have covered most of the options of compile_ultra.
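The scan flop behaviour described above (scan enable 0 passes D, scan enable 1 passes SI) can be captured in a tiny behavioural model. This is a sketch for illustration only; the class name and method are hypothetical and do not correspond to any library cell model.

```python
class ScanFlop:
    """Behavioural model of a scan D flip-flop: a 2-to-1 mux in front of a flop."""

    def __init__(self):
        self.q = 0

    def clock_edge(self, d: int, si: int = 0, se: int = 0) -> int:
        """On a clock edge, capture SI when scan enable is 1, otherwise D."""
        self.q = si if se else d
        return self.q


def behaves_like_plain_dff() -> bool:
    """With SE tied to 0 (scan chain not stitched), Q simply follows D."""
    flop = ScanFlop()
    return all(flop.clock_edge(d=d, si=0, se=0) == d for d in (0, 1, 1, 0, 1))
```

The second function reflects the point made above: after compiling with scan but before any chain is stitched, SE and SI are tied to 0, so the scan flop behaves exactly like the D flip-flop it replaced.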
Now let me talk about virtual clocks and how to constrain a completely combinational design. Let me go to dc_shell and remove all the designs. Let us look at the RTL of the comparator. The comparator is very simple; we cannot have a much simpler module in RTL. It has a 1-bit output: if A in is less than B in it reports 1, otherwise it reports 0. The RTL is very simple, but the hardware is not, because the comparison is between 16 bits of A and 16 bits of B. I believe you already know what the hardware looks like, so let us see what DC does. Let us say you are given the task of designing a module such as this. It looks very simple. Let us read it in. Instead of analyze and elaborate, since there is a single module, I will use read_verilog; it does the analyzing and the elaboration both. There is nothing special in the comparator, so it is very quick. It has analyzed and elaborated it. Now I have to compile it. What is my first step? I look at the ports: all_inputs gives all the inputs, and all_outputs gives the output. There is no clock in the design, so how do I constrain it? Let us not constrain it for now; let us just do a compile_ultra. There is no need for any of the options: no need for -no_autoungroup, no need for -no_seq_output_inversion, and no need for -no_boundary_optimization, because there is no hierarchy; it is just one module. So we can just run compile_ultra and see what happens. There is no timing violation; this column represents the timing violation and there is nothing there. The area is 90.7. Without constraints DC optimizes for area, so the design will be slow, whatever it is; the area is 90.7.
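To see why a one-line comparison in RTL still needs per-bit logic in hardware, here is a hedged Python sketch of a bit-by-bit magnitude comparison. It is only one way to picture the function (scanning from the most significant bit); it is an assumption for illustration and not the gate structure DC produced.

```python
def less_than(a: int, b: int, width: int = 16) -> int:
    """Return 1 if unsigned a < b, else 0, deciding bit by bit from the MSB."""
    for i in reversed(range(width)):
        a_bit = (a >> i) & 1
        b_bit = (b >> i) & 1
        if a_bit != b_bit:
            # The first differing bit from the top decides the result.
            return 1 if b_bit else 0
    return 0  # all bits equal, so a is not less than b
```

Every bit position contributes an equality check and a less-than check, which is why the synthesized netlist contains many gates even though the RTL is a single expression.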
Let us write out a Verilog netlist and look at it. Is it simple? It is not so simple: it has a lot of gates, such as AND, OR and AOI gates. So this is the netlist. Now I want to constrain it. Say your manager tells you that this comparator will sit between two flops, but those flops are not part of your design. So A and B come from a flop and C out goes to another flop, and all of that works in a 500 MHz domain. What should I do? How should I constrain it? There are two ways of doing it, and one is very simple. Let us say you are asked to synthesize this and make sure that it meets a 500 MHz clock, and there is no clock in the design. What I would do is assume, and it is a valid assumption, that between the flops lying on either side and this comparator there is no other combinational logic. That means one flop is launching the data and the other flop is capturing the data. So out of the complete period of 2 nanoseconds, which corresponds to the 500 MHz clock, maybe I could take 1.5 nanoseconds. It is a valid assumption because there is no other combinational logic in the picture. The command I would use is called set_max_delay. I say set_max_delay from all inputs to all outputs, and the value I give is, let us say, 1.5, which is what I assumed. I am telling DC that every path that starts from any input of the design and goes to an output should take no more than 1.5 nanoseconds. Now let us do a compile like this. Is the area more? Yes. Why? Because now it is trying to optimize for timing. Now let us look at report_timing.
In report_timing, see: it starts with an input external delay of 0, which does not matter here; it launches A in at 0, goes through all these gates and reaches C out; and the max delay is 1.5. So it goes from the input to the output, the total time is 0.99, and the max delay constraint was 1.5. So it tells us that the constraint is met, with about 0.51 of slack. You can check that all the paths are constrained by using report_constraint. The design is done; it is constrained and we are good. Your manager is happy that you were able to do this. Now, there is a second way of doing it. You already know that the clock is 500 MHz. What I will do first is clean up the constraints. There is a very simple command to clean up the constraints. We know that remove_design removes the design; but if I want to keep the design and remove only the constraints, I say reset_design. It removes all the constraints; the only constraint we had applied here was the max delay. Now I create the clock. The clock is not part of my design, but I can still create it by using a concept called a virtual clock. I say create_clock -period 2, that is, I create a 500 MHz clock which I know lies outside my design, and -name, let us say, clock. It gives me a warning that it is creating a virtual clock named clock with no sources, which is fine; I expect that. Now, here I wanted a maximum of 1.5 through the design. So I set an input delay of 0 with respect to this clock (an input delay is always with respect to some clock) on all the inputs. Then I set an output delay of 0.5, with respect to the same clock, on all the outputs. Usually the input delay is something that is given to you, and the output delay is also something that is given to you. But consider a fully combinational path: how do you decide the input and output delays?
The sum of the input and output delays is subtracted from the period, and the time left is what your design can use. Let us go the other way round: I know that I need 1.5 ns for my design, and that leaves 0.5 ns. So the 0.5 ns is equal to input delay plus output delay. I chose the input delay to be 0 and the output delay to be 0.5. We could choose the other way round; we could also choose 0.25 input delay and 0.25 output delay. It does not matter for this kind of purely combinational path; only the sum matters. Having specified these commands, let us see what happens. Again I will do a compile. The area is very similar, probably exactly the same. Now I will do a report_timing. Again you see the results are exactly the same: the input delay of 0 and the output delay of 0.5 that we applied, with respect to the clock. This is the way we use a virtual clock to constrain combinational paths. A virtual clock has many other uses as well. First of all, why do you create a virtual clock? A virtual clock is created only because you want to specify some input delay or output delay with respect to it, that is, you want to model the conditions that exist in the external world. Which is superior, max delay or virtual clock? There is no verdict like that, that one is superior to the other, but the virtual clock is used very extensively and has many other uses too. Coming back to constraining combinational logic: the combinational logic might also exist as part of a bigger design. In the complete design we saw that there were paths directly from input to output. Those are the cases where, starting from an input, a lot of combinational logic goes directly to an output. You can constrain this type of combinational path either by using max delay or by using a virtual clock; which one you use is up to you. But we will see the power of the virtual clock in unit 5.
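The budgeting arithmetic behind the two constraint styles can be checked with a small Python sketch. The numbers are the ones used in this lab (500 MHz, 1.5 ns budget, 0.99 ns arrival); the function names are hypothetical, and effects such as clock uncertainty or library setup time are deliberately ignored as a simplifying assumption.

```python
def period_ns(freq_mhz: float) -> float:
    """Clock period in nanoseconds for a frequency given in MHz."""
    return 1000.0 / freq_mhz


def combinational_budget(period: float, input_delay: float, output_delay: float) -> float:
    """Time left for an input-to-output path under a virtual clock."""
    return period - input_delay - output_delay


def slack(required: float, arrival: float) -> float:
    """Positive slack means the path meets its requirement."""
    return required - arrival


if __name__ == "__main__":
    period = period_ns(500)  # 2.0 ns for the 500 MHz virtual clock
    # Both splits of the 0.5 ns external budget leave the same time for the design.
    print(combinational_budget(period, 0.0, 0.5))
    print(combinational_budget(period, 0.25, 0.25))
    print(round(slack(1.5, 0.99), 2))
```

Both delay splits leave 1.5 ns for the comparator, which is the same requirement that the max delay constraint expressed directly, so it is consistent that the two compiles give the same timing report.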
Virtual clocks are much more sophisticated than max delay, and we will see what powerful things we can do with a virtual clock. So please spend some time reading about the set_max_delay command and what it does, and try it out. The chip-level design that I have provided is good in the sense that it is a simple design, and you can make modifications to it according to your needs. You should try to make this design better in terms of timing. Do not be too concerned about area; you cannot do much there, since it depends on the standard cell library and on the kind of design you have. The design is very simple, but you should try to make modifications to it so that you can constrain it properly and make sure that there are no timing violations. That can be achieved very easily: you just have to make sure that inputs are registered and outputs are registered, and you should have good timing constraints for the inputs and outputs. Try to synthesize the design for 500 MHz and make sure that all the paths meet timing. It could be a very good assignment. In the next labs I will probably have one or two sessions on power, and in one of them I will focus on the gate clock option of compile_ultra and on the power engine. Thank you.