Hello everyone, welcome to the first lecture session of unit 5. Unit 5 is all about static timing analysis. In this session, we will look at the constraints and the flow. Since the constraints part is very similar to Design Compiler, there will be some repetition of the constraints, but now we will go much deeper into all of them. In synthesis we are mostly focused on performance, but in STA we are focused on the fact that the chip should work under all possible conditions. So, we will look at STA as a concept: what it is and what the principles behind the flow are. We will look at some mathematical equations related to the constraints, so it is better if you have your pen and paper ready before continuing.

So, the agenda: we will look at where STA fits in the ASIC design flow. We will look at the importance of STA. We will look into the operating conditions in some detail. Then we will go on to the timing paths and constraints. We will have another look at the different sorts of checks that STA performs. We will take a look at the clocks; however, there will be a separate session in which I will discuss the clocks in much more detail, the clock being the most important part of the constraints. We will look at something called timing exceptions, and then we will look at the flow part, that is, what steps STA contains. So, this session will give you a nice overview of STA, and in the following sessions we will look at each of the parts in more detail: in one of the sessions we look at clocks in more detail, we look at exceptions, we look at the interconnects and so on.

So, this is the ASIC design flow. What we are focusing on in this course is the synthesis part, plus STA. We have all seen that the job of a synthesis tool like Design Compiler is to convert RTL into gates, given a proper set of constraints, right.
So, I am assuming a proper set of constraints. In this diagram, yellow is an output or an input; the blue part represents the flow part, or the tool part. So, RTL is the input to synthesis, and the output is the gate-level netlist plus constraints. Constraints are an input as well: when we synthesize, we write out the constraints from Design Compiler in the SDC format, which is understood by most place and route tools. These constraints plus the gate-level netlist go as input to the back-end tools for floor planning and place and route, and the output of place and route is a post-layout netlist and the parasitics.

So, now you have a post-layout netlist and a pre-layout netlist. Formal verification can be used to verify that the netlist after synthesis is functionally equivalent to the RTL. Please note that formal verification is not vector based; it is a formal, again a static, method, so the coverage is very good. In fact, it can be used to verify that any two designs, whether in netlist form or RTL form, are equivalent. The typical flow is RTL versus pre-layout netlist (the netlist after synthesis we will call the pre-layout netlist), and the next step is pre-layout netlist versus post-layout netlist.

Now, this is the functionality part; now we come to the timing part. STA is used to make sure that the netlist is good from the timing point of view, and this STA is called post-layout STA. It takes into account three things: the constraints, the post-layout netlist and the parasitics.
These three are the inputs that go into an STA tool like PrimeTime, and using the process of STA we make sure that the design is good for manufacturing, good for fabrication. This process of making sure that the design is good for manufacturing is called sign-off, and the output of this process is the GDS. There are also a lot of back-end verification checks that run on the placed and routed netlist; the most popular ones are LVS and DRC. Once we have done sign-off using PrimeTime, and LVS and DRC using the back-end tool, we can say that the design is good for fabrication. The output is GDS, a graphic data stream (GDSII) file that we send to a fabrication lab like TSMC, and they use it to manufacture the chip.

STA is done at two levels. There is the post-layout STA I was talking about, and there is one more STA, at the pre-layout level, which is what we were doing in Design Compiler. We were reporting timing in Design Compiler, we were applying the constraints in Design Compiler; all this is part of pre-layout STA. Now, pre-layout STA only helps us in finalizing the constraints and nothing more; any timing result at the pre-layout level is not good enough for sign-off, which has to be done with complete parasitic data. So, the idea of pre-layout STA is to clean up the constraints: make sure you have all the exceptions in place, make sure the clock frequency is correct, make sure that your design does not have big violations, and if it does have big violations, redo synthesis. So, pre-layout STA makes sure that the synthesized netlist is of good quality, but the real violations, the real timing problems, will only show up after the layout has been done. Hopefully we will see some examples in the lab, where we will do one exercise with post-layout data.
So, what does STA check? STA checks all possible paths; it is not vector driven. Simulation, on the other hand, is vector driven, in the sense that you give some values at the inputs and you expect some values at the outputs, and you have to make sure that you cover all possible combinations. It is not the same with STA: STA is not vector driven, and it checks all the possible timing paths in the design. The checking is quite fast compared to dynamic timing simulation, that is, compared to the verification that we do by providing stimulus. It does not replace simulation, though. The important fact is that it only checks timing; it is not a functionality check. How do you check functionality? By doing functional verification plus formal verification. So, STA does not replace functional verification.

STA is comprised of two major parts: the first part is delay calculation, the second part is timing checks. That means the delays are calculated first, and then they are checked against the constraints.

What are the types of checks that STA performs? Setup, hold, recovery, removal and so on. You can also have some sophisticated user-defined data-to-data timing constraints for your interfaces, where you need to check that one data line arrives no later, or no earlier, than some time with respect to some other data line. Then clock gating: we saw clock gating in Design Compiler, and there are clock gating setup and hold checks which we can do with STA. Minimum period and pulse width checks can be done. All design rules can be checked: minimum and maximum transition time, capacitance, fanout. Again, an important point: all these checks that comprise sign-off will be done at the post-layout level.
Now, when we talk about STA we have to talk about operating conditions, since STA is done at multiple operating conditions. Just try and compare this with synthesis: synthesis is always done at the worst operating conditions. Why? Because there we are only concerned about the performance of the design. But here we are concerned that the chip should work at all possible operating corners. These corners are called operating conditions: P, V and T. P is for process, V is for voltage, T is for temperature.

Now, what does process mean? Let us talk about the wafer first. The wafer has multiple chips, and all the chips are not going to have the same performance: the chips on the periphery of the wafer will have different performance compared to chips that are placed in the middle of the wafer. Now let us talk about the die, that is, one chip. On one die, not all NMOS and PMOS devices will behave in a similar fashion; some of them will be faster, some of them will be slower. This is called process variation. In fact, the whole purpose of STA is to make sure that your chip works in spite of these process variations. There are two types of variation. I will try to fit in a lecture which talks about variation, because STA is very tightly tied to all the variations that take place in the chip.

Some of the variations come under systematic variation. That means these variations affect either all the wafers in a similar fashion, or all the dies on the same wafer in a similar fashion. For example, let us say your target is a 65 nanometer process. Now let us say the whole fabrication process at a particular fabrication lab is not optimized, and there is some error in their formula or some error in the manufacturing process, which results in all devices having a channel length greater than 65 nanometers.
This is called a systematic variation. A systematic variation can be corrected by making some corrections in the fabrication flow. So, if the fabrication guys find out that all the devices on a 65 nanometer technology node are not actually 65 nanometers, but, let us say, 67 or 68, then they need to make a correction to make sure that systematic variations are minimal.

Second is the random variation. On such a small scale, let us say again we talk about 65 nanometers, not all PMOS and NMOS devices will come out exactly the same.

Now, voltage. All the cells do not get the same voltage: there is IR drop in the power rail. Even the power supply itself varies: let us say you are targeting a 1.0 volt power supply; even the power supply is going to have some variation, maybe 1 percent or 2 percent. So, all the cells do not get the same voltage.

Let us talk about temperature. The chip will heat up during operation, and not all parts, or all cells, on the chip will heat up by the same degree. So, one part of the chip will be different from another part of the chip, whether in terms of process or voltage or temperature. STA should guarantee that the chip works across the different corners, right. This is called corner-based STA. In today's world we do corner-based STA, but there is also something called statistical STA, which is gaining importance because all such device parameters, all such variations, can be captured in a statistical way. And as technology shrinks, the number of corners is increasing.
So, we will not talk in detail about this in this lecture. I will probably present one more session on variation; there we will see how, and in what manner, variation affects the STA flow. All the discussions we have in unit 5 will be focused on corner-based STA. That means we have well-defined corners, and on each of these corners we have to make sure that our chip meets the timing constraints.

So, this is what I am talking about. The delay of a cell, at the standard cell library level, depends on the input transition, as we have seen: delay is a function of input slew and load. Now, for every technology, the delay of each cell will be characterized for each corner that we have chosen. So, ultimately delay depends on PVT as well. For one PVT there is one library; in each of those libraries you have the cell, and its characterization is based on the input slew and the output load. So, there are five parameters here: P, V, T, input slew and load.

Now, there are hundreds of parameters, channel length, thickness of the oxide and so on, that are affected by process variation. Let us look at one particular parameter: channel length. Let us say we choose a 65 nanometer technology and we measure the channel length of each and every device on the chip. We will see that a graph of this kind emerges, right. A graph like this is called a Gaussian curve. The Gaussian curve is centered around a typical value, in this case 65 nanometers, and there will be some devices which show a channel length less than 65, and there will be devices which show a channel length greater than 65. So, the distribution looks like this: channel length on the x axis versus the number of samples.
Now, let us plot two such parameters; one is channel length, on the x axis. But first let us clarify some doubts about the Gaussian curve on the left. This Gaussian curve has two parameters: one is called the mean, or the average, which is the typical value; the other is called the standard deviation. The standard deviation tells you the width of this curve, how wide it is. Sigma is nothing but the standard deviation, and typically more than 99 percent of this area comes under plus or minus 3 sigma. Let us say 65 nanometers is the average channel length and the sigma is 1 nanometer, so 3 sigma means 3 nanometers. Then more than 99 percent of this area will come between the values 65 minus 3 and 65 plus 3, that is, from 62 to 68. If you take the values from 62 nanometers to 68 nanometers, they will cover more than 99 percent of the area. So, typically the corners are selected to cover plus or minus 3 sigma, right. Things will become clearer when we talk about variation.

Now, what we have done here is plot the channel length and the threshold voltage of the devices on two axes, channel length on the x axis and threshold voltage on the y axis. Note that a greater channel length and a greater threshold voltage mean a slower device. So, somewhere around this point the devices are slow, and somewhere around this point the devices are fast. In the middle the devices are typical. That means devices at the slow corner will have more delay, and devices at the fast corner will have less delay. So, we have chosen two corners at the two different extremes, and we know that, in older technologies, these cover the two extremes of the process area. We are trying to make sure that this complete area in the middle is covered, right.
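To put a number on the "more than 99 percent" figure for plus or minus 3 sigma, here is a minimal Python sketch. It assumes the channel length follows an ideal Gaussian with the lecture's example values (mean 65 nm, sigma 1 nm); the function name `fraction_within` is only illustrative.

```python
import math

def fraction_within(k_sigma: float) -> float:
    """Fraction of an ideal Gaussian population within +/- k_sigma of the mean."""
    return math.erf(k_sigma / math.sqrt(2.0))

mean_nm = 65.0   # typical channel length, from the lecture example
sigma_nm = 1.0   # standard deviation, from the lecture example

low_nm = mean_nm - 3 * sigma_nm    # lower 3-sigma bound
high_nm = mean_nm + 3 * sigma_nm   # upper 3-sigma bound
coverage = fraction_within(3.0)    # area under the curve between the bounds

print(f"{low_nm} nm to {high_nm} nm covers {coverage:.2%} of devices")
```

Under these assumptions the 62 nm to 68 nm range covers roughly 99.7 percent of the devices, which is why corners chosen at plus or minus 3 sigma leave only a small tail of devices outside the analysed range.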
Obviously, the actual distribution of the devices will be clustered around the middle; most of the time it will be closer to the typical. There will be some devices which are towards the slower side, and there will be some devices which are towards the faster side. But we have to do STA at the fast corner and the slow corner. What does it mean? First, it means that we are trying to make sure that only very few of the devices fail. There will be a few devices which are even beyond the slow corner, because of the manufacturing process variation. So, we are trying to make sure here that most of the devices, more than 99 percent of the devices, work.

But again, since the majority of the devices lie somewhere in the middle, this analysis will be pessimistic for them. Let us say for the slow corner we are assuming a worst-case delay for every gate, but most of the gates will be around the typical, right, so they will not have that high a delay. So, corner-based STA is in most cases pessimistic; that means you are taking margins beyond the typical. But the idea here is to make sure that almost all the devices, or sorry, almost all the chips work, and that they all meet the required constraints. So, by doing corner-based STA we are pessimistic, but the idea is to make sure that the customer does not send the chips back, right.

So, if you want to manufacture a processor which meets 1 gigahertz, you have to make sure that it meets 1 gigahertz at the slow corner, right. If we meet that, and we make sure that all the timing constraints are met at the fast corner and the slow corner, good.
Most of the chips manufactured under such conditions will probably meet frequencies which are maybe 10 to 20 percent greater than 1 gigahertz, because STA is pessimistic, right. You will probably appreciate the pessimism of corner-based STA when we discuss variation.

Now, what operating conditions might we choose? On all the technology nodes from 90 down to 65 nanometers, we should have at least two corners, worst case and best case, as depicted by this graph. For the worst-case corner, the process type is worst, that means NMOS and PMOS are both slow; voltage is low; temperature is high. The cell delay is high, which is setup critical. For the best-case corner, the process type is best, voltage is high, temperature is low, which means that all the devices are on the faster side; it is critical for hold. We will see why the worst-case corner is critical for setup and why the best-case corner is critical for hold.

For 65 nanometers, the typical operating condition can be 1.0 volt and 25 degrees C. The worst-case corner in this case would be 0.9 volt, which is 10 percent below the typical VDD, at a temperature of 125 degrees C. The best-case corner could be 1.1 volt, which is 10 percent more than the typical VDD, at 0 degrees C. The temperature depends on the type of environment in which the chips will be used. If the chips are used for space applications this temperature would be negative; if the chips are used for consumer applications like televisions, mobile phones and so on, this could be 0. So, it depends on the type of environment conditions we are targeting.
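The two corners described above can be written down as data. The following Python sketch derives them from the lecture's 65 nm example numbers (1.0 V nominal, plus or minus 10 percent, 125 degrees C and 0 degrees C). The `Corner` class, the `make_corners` helper and the "slow-slow" / "fast-fast" process labels are illustrative assumptions, not part of any tool or library.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Corner:
    name: str
    process: str          # illustrative label for the process type
    voltage_v: float
    temperature_c: float
    critical_for: str     # which check this corner stresses most

def make_corners(nominal_v, v_tolerance, t_hot_c, t_cold_c):
    """Build worst-case and best-case corners around a nominal supply voltage."""
    worst = Corner("worst", "slow-slow",
                   round(nominal_v * (1 - v_tolerance), 3), t_hot_c, "setup")
    best = Corner("best", "fast-fast",
                  round(nominal_v * (1 + v_tolerance), 3), t_cold_c, "hold")
    return worst, best

# Values from the lecture's 65 nm example.
worst, best = make_corners(nominal_v=1.0, v_tolerance=0.10,
                           t_hot_c=125.0, t_cold_c=0.0)
for corner in (worst, best):
    print(corner)
```

For a space application, only the `t_cold_c` argument would change (to a negative value); the structure of the corner definition stays the same.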
So, this temperature will change based on the application.

I have already discussed pre-layout and post-layout STA, so let me summarize. In pre-layout STA we have no information about the nets; it is all estimated. Wire load models are used, and the net delays are estimated this way. Pre-layout STA is used for resolving bottlenecks and finalizing your constraints; this is very important. You do not wait till the post-layout data comes in and then iron out the problems in your constraints. You start working from the pre-layout level; post-layout will take some time, and meanwhile you can finalize the constraints. You choose the worst corner, same as synthesis, and ideal clocks. What that means is that there is no clock tree yet. Ideal means that the clock reaches all the registers at the same time: let us say you have a single clock in your design with 1000 registers; the clock will reach all the 1000 registers at the same time. In most cases the clock delay will be 0, because you do not have any information about the actual clock network.

Post-layout STA uses back-annotated net information, which is contained in one of these formats, SPEF or DSPF. It is done at various corners; we saw that we need at least the worst case and the best case. The clock network is fully implemented: the clock tree is done. So, based on whether we are doing pre-layout or post-layout STA, a lot of things change about the clock. The clock constraints will change; we will see them later. Other constraints mostly remain the same, but something about the clock will change.

Now, what are the steps of doing STA, I mean, what does the tool actually do once you feed in the design and the parasitic information, whether it is pre-layout with the wire load model or post-layout with parasitics?
The tool will do three steps. It will first break down the circuit into a set of timing paths; we saw timing paths in synthesis, and we will review them in STA. Then, for each path, the delay is calculated. And then these path delays are checked to make sure the timing constraints have been met.

Now, let us see a couple of important things. For combinational cells, the only important thing is the propagation delay; it is a function of output load and input slew. For sequential cells, apart from the propagation delay from clock to Q, you also have timing checks like setup and hold, recovery and removal. Timing checks are a function of the constrained pin transition and the related pin transition; we have seen this when we looked at the standard cell library. If you do not remember this, please go back, look at the standard cell library and see what the indices of the setup lookup table are.

Delay calculation: the gate delays are taken from the library. Gate delay is nothing but cell delay; it comes from the standard cell library. Net delays come from the post-layout data. The gate delay is interpolated from the non-linear delay model lookup table; we saw that in an earlier unit. Net delays are either based on the wire load model, in which case they are estimated, or, when we feed in the actual parasitics through SPEF or DSPF, they are calculated. SPEF and DSPF do not contain delays; they contain the R and C information, the resistance and capacitance of each and every net. Using these R and C values, the STA tool has to calculate the delay, right. They are calculated using algorithms, and these algorithms, let us say in PrimeTime, try to match SPICE. So, SPICE is the reference. But SPICE is very time consuming; you cannot run SPICE on the complete chip.
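As an illustration of how a gate delay can be read out of a lookup table indexed by input slew and output load, here is a small Python sketch using bilinear interpolation. The table values, axis points and the `cell_delay` function are hypothetical, made up for the example; a real library defines its own indices and a real tool may interpolate differently.

```python
from bisect import bisect_right

# Hypothetical lookup table: rows are input slew (ns), columns are output load (pF).
SLEW_NS = [0.1, 0.5, 1.0]
LOAD_PF = [0.01, 0.05, 0.10]
DELAY_NS = [
    [0.10, 0.20, 0.32],
    [0.14, 0.25, 0.38],
    [0.20, 0.33, 0.47],
]

def _bracket(axis, x):
    """Indices (i, i + 1) of the axis segment used for x.

    Points outside the table reuse the nearest segment, so they are
    extrapolated linearly rather than clamped.
    """
    i = bisect_right(axis, x) - 1
    i = max(0, min(i, len(axis) - 2))
    return i, i + 1

def cell_delay(slew_ns, load_pf):
    """Bilinear interpolation of the delay table at (slew_ns, load_pf)."""
    i0, i1 = _bracket(SLEW_NS, slew_ns)
    j0, j1 = _bracket(LOAD_PF, load_pf)
    ts = (slew_ns - SLEW_NS[i0]) / (SLEW_NS[i1] - SLEW_NS[i0])
    tl = (load_pf - LOAD_PF[j0]) / (LOAD_PF[j1] - LOAD_PF[j0])
    # Interpolate along the load axis on both slew rows, then along slew.
    d_lo = DELAY_NS[i0][j0] * (1 - tl) + DELAY_NS[i0][j1] * tl
    d_hi = DELAY_NS[i1][j0] * (1 - tl) + DELAY_NS[i1][j1] * tl
    return d_lo * (1 - ts) + d_hi * ts

print(cell_delay(0.3, 0.03))  # a point between table entries
print(cell_delay(0.5, 0.05))  # a point exactly on a table entry
```

A query that lands exactly on a table entry returns that entry; anything in between is a weighted mix of the four surrounding entries. Net delay calculation from R and C data is a separate step that this sketch does not cover.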
So, the most accurate static timing analysis would be done in SPICE, but if you do that for a full chip it will take days and days. What STA tries to do is make the process much faster, and to do that it does two things. One thing: it does not work at the NMOS and PMOS transistor level; it works at the gate level, and that helps a lot in terms of abstraction. Second thing: it uses some faster algorithms to calculate the net delay, but it also tries to make sure that the delays are within, let us say, 1 or 2 percent of the SPICE data and do not deviate much. So, the reference is SPICE, but you do not have to worry about SPICE when doing STA; this is the job of the tool, not the job of the user. The tool, for example PrimeTime, claims that its delay calculation is within some percentage of SPICE, so we will just take its word for it, right.

Let us take a look at the timing paths. The tool will start at an input port and stop at the first sequential element it encounters: this is an input-to-register path. Again, it will start at the clock input of a register and stop at the data input of any flop it finds: this is called a register-to-register path. Again, it will start at the clock input of a register and continue until it finds an output port; obviously, there should not be any sequential logic in between. This kind of path is called a register-to-output path. If it starts at an input port, does not find any sequential element and directly encounters an output port, it is called a combinational or feedthrough path. These are the four basic kinds of paths; there are more than that.

The path start point is the launch point. You have seen in the timing report that the path starts with either an input port or the clock pin of a sequential element. So, in this case it is either this port or the clock pin of the sequential element.
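The four basic path types can be summarised as a lookup on where a path starts and where it ends. This is a minimal Python sketch; the kind labels (`input_port`, `clock_pin`, `data_pin`, `output_port`) and the `classify_path` function are illustrative names chosen for the example, not tool terminology.

```python
# Map (start point kind, end point kind) to the basic path type.
PATH_TYPES = {
    ("input_port", "data_pin"): "input-to-register",
    ("clock_pin", "data_pin"): "register-to-register",
    ("clock_pin", "output_port"): "register-to-output",
    ("input_port", "output_port"): "input-to-output (feedthrough)",
}

def classify_path(start_kind: str, end_kind: str) -> str:
    """Return the basic path type for a start point kind and an end point kind."""
    try:
        return PATH_TYPES[(start_kind, end_kind)]
    except KeyError:
        raise ValueError(
            f"not a valid timing path: {start_kind} -> {end_kind}"
        ) from None

print(classify_path("clock_pin", "data_pin"))
print(classify_path("input_port", "output_port"))
```

Any other combination, for example a path that claims to start at a data pin, is rejected, which mirrors the rule that paths start only at input ports or clock pins and end only at data pins or output ports.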
The end point, the capture point, can be either an output port or the data pin of a sequential element. So, paths are captured here, here and here. In every check there is a launch path, a data path and a capture path, right; we have seen this in timing reports, and we will see a lot more of it.

Timing paths are always grouped according to a principle: they are grouped into path groups by the clock which controls the end point. For example, take the path starting at A and being captured at D of FF2: its path group is clock 1. So, path 1 is from A to D of FF2, and path 1 comes under the group clock 1 because the capturing clock is clock 1. Path 2 is from FF2 to FF3; it comes under the group clock 2 because the capture clock is clock 2. Path 3 is from the clock pin of this flop, clocked by clock 2, to Z. Now, if you do not specify any output delay on Z, path 3 will come under the default group. Similarly, path 4 starts at A and ends at Z, and assuming we have not specified any output delay on Z, it will come under default. However, if you specify an output delay on Z with respect to, let us say, clock 2, these paths will come under the path group clock 2. So, STA tools like PrimeTime generate timing reports and sort them by path group; the timing report will show the path group, for example clock 2.

To complete the list: we have seen register-to-register paths. How are they constrained? Just by defining the clock. We have seen this in unit 3; we will review that, and we will also see some equations in the following slides. Input-to-register paths are constrained by defining the clock and the input delay. Register-to-output paths are constrained by defining the clock and the output delay. So, what do you do? First you define the clock, then you define the input delays, then you define the output delays.
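The path group rule from the four-path example (group by the clock constraining the end point, otherwise fall into the default group) can be sketched in Python as follows. The pin names such as `FF3/CLK`, the `ENDPOINT_CLOCK` table and the `group_paths` function are hypothetical; in particular, the start point of path 3 is assumed to be the clock pin of FF3.

```python
from collections import defaultdict

def group_paths(paths, endpoint_clock):
    """Group path names by the clock that constrains each path's end point.

    End points with no constraining clock fall into the "default" group.
    """
    groups = defaultdict(list)
    for name, _start, end in paths:
        groups[endpoint_clock.get(end, "default")].append(name)
    return dict(groups)

# (path name, start point, end point), following the lecture's example.
PATHS = [
    ("path1", "A", "FF2/D"),
    ("path2", "FF2/CLK", "FF3/D"),
    ("path3", "FF3/CLK", "Z"),   # start point is an assumption
    ("path4", "A", "Z"),
]
ENDPOINT_CLOCK = {"FF2/D": "clock1", "FF3/D": "clock2"}

# No output delay on Z: paths ending at Z land in the default group.
before = group_paths(PATHS, ENDPOINT_CLOCK)
# Output delay on Z relative to clock2: those paths move to group clock2.
after = group_paths(PATHS, {**ENDPOINT_CLOCK, "Z": "clock2"})

print(before)
print(after)
```

The only difference between the two calls is the extra entry for Z, which plays the role of specifying an output delay on Z with respect to clock 2.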
This will constrain most of your paths. Input-to-output paths, the fully combinational paths, can be constrained using virtual clocks, as we have seen, or using max delay and min delay constraints. Then there are separate path groups for clock gating; clock gating paths are special because the clock gating element is special. Then you have async default, for recovery and removal. So, these are again separate path groups. So, apart from the basic four, register to register, input to register, register to output and input to output, you have the clock gating default, the async default and the default path groups.

Again, we have seen this: net delay is the total time needed to charge or discharge all the parasitics of a net. Cell delay is nothing but the delay a signal takes to traverse a cell from input to output. So, the path delay is the sum of the net delays and the cell delays.

Now, there are various terminologies. You are already familiar with most of them, but I will expand on each: setup and hold times, recovery and removal, pulse width, signal slew, clock latency, clock skew, slack and critical path, and timing exceptions. Let us look at these.

So, I will define this again: setup and hold times, for an edge-triggered element like a flip-flop. The time interval before the active clock edge during which the data should be unchanged is called the setup time. So, in this diagram the data should be stable at least the setup time before the active edge; this is a constraint, a property of the sequential element. Similarly, hold is the time interval after the active clock edge during which the data should be unchanged; this is called the hold time. Setup and hold are properties of the sequential element. They are calculated from the lookup tables given in the library, right, you have seen that.

I call this the classic timing problem: a path between a register FF1 and a register FF2.
The delay of the combinational logic is Tcom, Tpd is the propagation delay of flip-flop 1, and Ts and Th are the setup and hold times for flip-flop 2. So, now, let us review the timing equations. We will see how the setup and hold times actually affect the timing, and what happens if they are violated.

Now, the setup check: Tpd plus Tcom. We will go back and forth here. The data will be launched by flip-flop 1 at the active edge of the clock. It will take Tpd to traverse the flip-flop, plus it will take Tcom to reach D of flip-flop 2. D0 is the old data and D1 is the new data, so the data launched by flip-flop 1 reaches flip-flop 2 after Tpd plus Tcom.

Let us review the setup and hold times for a sequential element like a flip-flop. For such an element, the time interval before the active clock edge during which the data should not change is called the setup time. Let us look at the figure. This is the active clock edge; for at least the setup time before this active clock edge, the data should remain stable. The hold time represents the timing constraint after the active clock edge, during which the data should not change. So, the data should remain stable from at least the setup time before the active clock edge until the hold time after the active edge. During this window the data should not change, to make sure that the sequential element captures the data properly.

Now, let us look at the classic timing problem: data is launched by flip-flop FF1 and captured by flip-flop FF2. Tpd is the propagation delay of FF1; Ts and Th are the setup and hold time constraints of FF2. Tcom is the delay of the combinational logic between them. Now, FF1 will launch data at the active edge. D0 is the old data and D1 is the new data; FF1 launches D1.
So, flip-flop 1 will launch the new data D1, and after Tpd plus Tcom, D1 will start reflecting at D, the data pin of FF2; here the data will start changing. Since the data changes here, to meet the setup constraint of FF2 the data should not change in this Ts window, right. If we represent this using a timing equation, it is something like this:

Tpd + Tcom < Tck - Ts

That is, the combinational delay plus the delay of the launching flop should be less than Tck minus Ts. Tck is the complete clock period, the delay between this active edge and the next active edge. We subtract the setup time because the data should not change within Ts before the capturing edge, so Tpd plus Tcom should be less than Tck minus Ts, right.

The total path delay is Tpd plus Tcom: the combinational delay plus the launch delay of the sequential element. Now, look at this equation. The left-hand side is the path delay and the right-hand side is the constraint. The equation is of the form: path delay less than constraint. Since it is "less than", Tck minus Ts is nothing but the maximum value that the path delay can have.

Tck is in most cases fixed: when designing a CPU for 1 gigahertz, Tck will be the time period of the 1 gigahertz clock, right. Ts in most cases lies in a very tight range. Ts is nothing but the setup time of the flop, and the setup time of the flop depends on the transition values at the data pin and the clock pin. So, it is confined within a very tight range; it is not in user control, it is not design dependent, right. So, Tck minus Ts gives us the maximum delay that the path, the launching flop plus this combinational logic, can have.
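The setup condition Tpd + Tcom < Tck - Ts can be turned into a small budget calculation. In this Python sketch, Tck is 1 ns (the lecture's 1 gigahertz example), while the values of Ts and Tpd are hypothetical numbers picked only to make the arithmetic concrete; the function names are illustrative.

```python
def setup_met(tpd, tcom, tck, ts):
    """True when the path delay (tpd + tcom) is below the constraint (tck - ts)."""
    return tpd + tcom < tck - ts

def max_comb_delay(tpd, tck, ts):
    """Upper bound on tcom implied by tpd + tcom < tck - ts."""
    return tck - ts - tpd

TCK_NS = 1.0    # clock period of a 1 GHz clock, from the lecture
TS_NS = 0.05    # hypothetical setup time of the capturing flop
TPD_NS = 0.10   # hypothetical clock-to-Q delay of the launching flop

budget_ns = max_comb_delay(TPD_NS, TCK_NS, TS_NS)
print(f"combinational logic must be faster than {budget_ns:.2f} ns")
print(setup_met(TPD_NS, 0.80, TCK_NS, TS_NS))  # inside the budget
print(setup_met(TPD_NS, 0.90, TCK_NS, TS_NS))  # outside the budget
```

Since Tck, Ts and Tpd are essentially fixed, the budget for Tcom is the only quantity the designer can work against, which is the point the equation makes.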
In fact, T_PD is also not under user control to any appreciable extent, right, because it is just the delay of a single flop. So, where does the design come in? It comes in while designing the combinational path. So, T_COM is what is actually in the user's control. So, your path delay should be less than some maximum value; that is why the setup check is also called the max delay check. What is slack? Slack is constraint minus delay. What is the constraint? T_CK minus T_S. What is the delay? T_PD plus T_COM. If the slack is positive, timing is met; if the slack is negative, timing is violated. Now, what happens if setup is violated on a chip? Let us say I designed a CPU, I had some problem in my STA, I did not know how to do it properly, the chip is fabricated and now it does not work at the target T_CK. Now, externally we can control T_CK. There is a crystal which we can program for the clock. Say I program it for a 900 MHz clock. If setup is violated, then by increasing the value of T_CK, at some point the timing might start to meet. So, even if setup is violated in STA and the chip is fabricated, the chip might work at a lower frequency. That is why, when you go to the market to buy processors, the processors come in various frequency grades. You might get a Core 2 Duo for 3.2 GHz, or for 3 GHz, or for 2.8 GHz, and the 2.8 GHz part will be cheaper than the 3.2 GHz part, right. You all know that. So, are they different chips? No, they are the same chips. It does not mean that STA was not done properly, but due to process variation some of the slower chips do not meet 3.2 GHz. They do meet 2.8 GHz, so they can be sold for less, right. So, the chip still works in the case of a setup violation, but at a lower frequency, right. Now let us look at the situation from a corner-based STA point of view. What happens at the worst corner?
At the worst corner T_PD is more and T_COM is more. T_CK is fixed: it is the performance, the frequency that we specify. So T_CK is fixed, but T_PD and T_COM increase, right. T_S might also increase a bit because the temperature is increasing. So, if we make sure that all our setup constraints are met at the worst corner, then we have made sure that they are also met at the best-case (minimum) corner, because T_PD plus T_COM attains its maximum value at the worst corner. This is why setup is called a max delay check. This is also why corner-based STA says, going back to the corner-based STA slide, that the worst-case corner is setup critical: this is the corner that is bad for setup, and you have to make sure that the design works in this corner. So, just by looking at this equation we are able to deduce that the setup check is a maximum value check and that it is most critical at the worst-case corner, ok. Let us go to the hold check. Just look at this figure carefully. What is the setup check trying to do? It is trying to make sure that FF2 captures D1 properly, meaning D1 should be set up properly before FF2 can capture it. Ok, in this case the figure is not too good: the negative edge is shown as the capture edge, but it does not matter, you could shift the capture to the other edge. So, the setup check makes sure that FF2 properly captures the data D1, the new data that FF1 launches. Now, let us look at the hold slide. In hold, what is the check? FF1 will launch the new data D1, replacing D0.
So, D0 will transition into D1 at some point in time. What is that time? It is nothing but T_PD plus T_COM, same as for setup. The same thing goes for hold, but now we have to make sure that this transition from D0 to D1 happens only after the T_H time, after the hold time, right. I will repeat, because this is very important to understand: FF1 launches the new data D1, the old data was D0, D0 transitions into D1, but the constraint here is that the data at FF2 should not change within this hold window. So, T_PD plus T_COM should be greater than T_H. What this check makes sure is that the old data D0 is captured properly, right. So, the setup check makes sure that the data is set up properly before the clock edge, and the hold check makes sure that the data is held for at least T_H after the clock edge before it can transition into D1. In terms of the equation, T_PD + T_COM > T_H. Again, let us analyze this. The path delay is T_PD plus T_COM, same as before. T_H comes from the library; it is the hold time constraint. What is the type of the equation? Path delay greater than a minimum value. So, the hold check constrains the path delay to a minimum value; that is why this check is also called the min delay check. Now, if we combine the setup and hold equations, we see that the path delay T_PD plus T_COM remains the same, the maximum value is T_CK minus T_S, and the minimum value is T_H, right. So, for every timing path, as you will see in the timing reports, there is a maximum value and a minimum value. The setup check is also called the max delay check, and the hold check is also called the min delay check, right.
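The two checks can be written down directly as slack computations. This is only a minimal sketch of the register to register equations above; the class name and all numeric values are made up for illustration, and an ideal clock (no skew) is assumed.

```python
from dataclasses import dataclass


@dataclass
class RegToRegPath:
    """FF1 -> combinational logic -> FF2, ideal clock assumed (hypothetical values, ns)."""
    t_pd: float   # clock-to-Q propagation delay of the launching flop FF1
    t_com: float  # combinational delay between FF1 and FF2
    t_s: float    # setup time of the capturing flop FF2
    t_h: float    # hold time of the capturing flop FF2

    def path_delay(self) -> float:
        return self.t_pd + self.t_com

    def setup_slack(self, t_ck: float) -> float:
        # Max delay check: path delay must be less than T_CK - T_S.
        return (t_ck - self.t_s) - self.path_delay()

    def hold_slack(self) -> float:
        # Min delay check: path delay must be greater than T_H.
        # Note that the clock period does not appear here at all.
        return self.path_delay() - self.t_h


path = RegToRegPath(t_pd=0.5, t_com=6.0, t_s=1.0, t_h=0.25)
print(path.setup_slack(t_ck=10.0))  # 2.5  -> setup met
print(path.setup_slack(t_ck=7.0))   # -0.5 -> setup violated at the faster clock
print(path.hold_slack())            # 6.25 -> hold met
```

Positive slack means the check is met and negative slack means it is violated, for both checks.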
What happens if hold is violated on silicon? Nothing can be done. You can throw the chip away, because there is nothing you can play with: once it is manufactured everything is fixed, and there is no clock frequency in the hold equation that you can adjust. This is why, when you go to the industry and start working on real chips, you will see that setup is checked at probably one or two corners, while hold is checked at six or eight corners. Why? Because we are very sensitive about hold. With a small setup violation you can still manage, but not with hold; you have to make sure that all the hold timing constraints are met. Again, let us match this equation to corner-based STA. What happens at the worst corner? At the worst corner T_PD is more and T_COM is more, which is good for hold, right: there is more delay, so it will meet the minimum value constraint. No problem. When you go to the best-case corner, T_PD starts decreasing and T_COM starts decreasing, so the best-case corner is bad for hold. That is why we say that the best-case corner is hold critical. So, you should be very clear that the setup check is the max delay check and the hold check is the min delay check, and that in STA hold must be met in the best-case corner. In fact, let us say you choose N corners: the bottom line is that all the setup and hold constraints should be met in all the corners. That is the ideal, but we know that the worst-case corners, the corners that have more delay, will be more critical for setup, and the corners that have less delay will be more critical for hold, right. So, again, the bottom line is max delay and min delay; these are the two things you have to remember. Max delay is setup and min delay is hold. You can understand all timing paths if you understand how these equations are formed.
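The corner argument can be illustrated with a small sketch. The two corners and all delay values below are hypothetical, chosen only so that setup fails at the worst corner and hold fails at the best corner; an ideal clock is assumed.

```python
def setup_slack(t_pd, t_com, t_s, t_ck):
    # Max delay check: (T_CK - T_S) - (T_PD + T_COM)
    return (t_ck - t_s) - (t_pd + t_com)


def hold_slack(t_pd, t_com, t_h):
    # Min delay check: (T_PD + T_COM) - T_H, independent of T_CK
    return (t_pd + t_com) - t_h


# Hypothetical per-corner delays in ns (not from any real library).
CORNERS = {
    "worst": {"t_pd": 1.0, "t_com": 8.5, "t_s": 1.0, "t_h": 0.5},
    "best": {"t_pd": 0.25, "t_com": 0.125, "t_s": 1.0, "t_h": 0.5},
}


def check_all_corners(t_ck):
    """Return {corner: (setup_slack, hold_slack)} for every corner."""
    return {
        name: (
            setup_slack(c["t_pd"], c["t_com"], c["t_s"], t_ck),
            hold_slack(c["t_pd"], c["t_com"], c["t_h"]),
        )
        for name, c in CORNERS.items()
    }


print(check_all_corners(t_ck=10.0))
# {'worst': (-0.5, 9.0), 'best': (8.625, -0.125)}
print(check_all_corners(t_ck=12.0))
# {'worst': (1.5, 9.0), 'best': (8.625, -0.125)}
```

Slowing the clock from 10 ns to 12 ns repairs the setup violation at the worst corner, but the hold violation at the best corner is unchanged, which is the reason hold gets checked at more corners.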
I will repeat: path delay less than a maximum constraint is the max delay check, which is T_PD + T_COM < T_CK - T_S. For hold, path delay greater than the constraint is the min delay check. That is it; that is the whole summary of STA. Now, what we will do is apply this principle of the max delay check and the min delay check to the other timing paths, and we will see how it matches up, right. Now, let us talk about the input to register path. Here the input comes from the external world, external to our design, external to our chip. There is something called the input arrival time, which defines the time interval during which a data signal can arrive at a pin in relation to the nearest active clock edge that triggers the data transition. Now, let us say this is the clock on which the external device operates. The external device launches the data, and this data is captured inside our chip. For both rise and fall there is a timing window during which the data can transition; this is the timing window. Let us say we are talking about some interface. Let us say the data can come 2 ns after this clock edge at the minimum, and for the maximum it could be 5 ns. The external device is also a chip, so it will have process variations, and there will be wires on the circuit board that connect the two chips together, which will have some variation. So, we are saying that the data can arrive at your boundary between 2 and 5 nanoseconds after the trigger edge, right. So, the min value is 2 and the max value is 5. This is the way the input arrival time is characterized. Now, what is input delay? Let us look at the equations now.
So, now we know that there is something called the input arrival time, and for most of the well-known interfaces this input arrival time, or input delay, is already given; it is already published, right. So, you can use the same values; you just have to make sure that you use the proper max value for the max delay and the proper min value for the min delay. In some cases you have to estimate this input delay, and this slide tells you how to estimate it. In the external world, let us say the delay is T_IN; this is the input arrival time. After T_IN the data arrives at the input pin. The combinational logic and FF1 are internal to our chip. Since the data is captured at FF1, we are worried about its setup and hold times, T_S and T_H. Now, for setup the equation is again of the same type: T_IN + T_COM should be less than T_CK - T_S, right. T_S is the setup time constraint, and this is the max delay equation again. Now, what I am doing here is arranging the equation so that on the left-hand side I keep what I need to estimate. I need to estimate the input delay, so I keep T_IN, which is external to our chip, on the left side. On the right side I keep everything that is internal to our chip and known. T_COM, T_CK and T_S are all part of my design and are known to me. So, T_IN < T_CK - T_S - T_COM. Please note again that it is a setup timing constraint, so you need to represent this equation in the form estimate less than constraint: this is the estimate part, and this is the maximum that it can have. So, estimate less than the max delay constraint is the max delay equation, or the setup equation. So, how do I estimate an input delay?
Now, let us say the clock period is 10 ns, the setup time of this flop is 1 ns, and I have some combinational element which takes, let us say, 3 ns. So, with a clock period of 10, a setup timing constraint of 1 and a combinational delay of 3, I say 10 minus 1 minus 3, which means I have 6 nanoseconds that can be given to the external world. This is the way you can estimate the input delay if it is not already known from a published interface. Now, we will do it for hold, right. For hold, again, we have to present the equation in the form estimate greater than a minimum value, right. For hold, T_IN + T_COM should be greater than T_H of this flop; no clock period comes into play here. Again I rearrange: what I know I keep on the right-hand side, and what I am trying to estimate comes on the left-hand side. Again, the estimate is greater than some value. What is the value? T_H minus T_COM. Now, let us look at it in the practical sense. Hold times of flops are not big values; for, say, 45 nanometer they will be in the range of 0.1 ns, 0.2 ns at most, right. So, if you have a significant combinational delay you do not need to worry about hold. Why? Because it is already met: T_IN would be greater than T_H minus T_COM. Even if you have an input delay of 0 for the hold case, this value would be negative, because T_COM is, let us say, 1 ns and hold is, let us say, 0.1 ns. So, this would be negative, and with a value of 0 the condition would be met. So, if you have combinational delay from the input to the first flop that captures it, it helps for hold but does not help for setup, right. For setup, since the constraint is a maximum, we want the delay to be less; for hold we want the delay to be more, and that is the STA problem.
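The two input-delay bounds can be sketched as follows, using the numbers from the example above (10 ns period, 1 ns setup, 3 ns combinational delay for the max case; 0.1 ns hold and 1 ns combinational delay for the min case). The function names are illustrative only.

```python
def max_input_delay(t_ck, t_s, t_com_max):
    """Upper bound on T_IN from the setup equation: T_IN < T_CK - T_S - T_COM."""
    return t_ck - t_s - t_com_max


def min_input_delay(t_h, t_com_min):
    """Lower bound on T_IN from the hold equation: T_IN > T_H - T_COM."""
    return t_h - t_com_min


upper = max_input_delay(t_ck=10.0, t_s=1.0, t_com_max=3.0)
lower = min_input_delay(t_h=0.1, t_com_min=1.0)

print(upper)            # 6.0 ns can be given to the external world
print(round(lower, 3))  # -0.9, so even T_IN = 0 meets hold
print(0.0 > lower)      # True
```

The lower bound being negative is the point made above: with significant combinational delay after the input port, hold on the input path is met even with zero external delay.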
We need to make sure that all path delays meet on both sides, on the min side as well as on the max side, right. Let us look at register to output paths. There is something called the output required time. On the input side it was the input arrival time, right; on the output side it is the output required time. The output required time specifies the data required time on output ports. We launch the data from the flop, and the required time specifies the timing through the logic outside. Let us see it in terms of equations again. On the output side we have a flop that launches data after T_PD, we have a combinational delay T_COM, and let us assume that in the external world there is a flop that has timing constraints T_S and T_H. Now let us look at the equation. Let us say there is some combinational cloud on the output side with delay T_OUT. What is the path delay? T_PD + T_COM + T_OUT, and T_PD + T_COM + T_OUT + T_S is less than T_CK. You could also write it with T_CK minus T_S on the right; it is the same thing. Again I rearrange: on the left-hand side I keep what belongs to the external world, T_OUT and T_S. So, T_OUT + T_S should be less than T_CK - (T_PD + T_COM). What this tells me is: let us say the clock period is again 10 ns, the propagation delay is 2 ns and the combinational delay is 2 ns. So, 2 plus 2 is 4, and 10 minus 4 is 6. It tells me that I can give 6 ns to the external world on the output side, because I am consuming 4 ns internally, right. So, this helps us in estimating the output delay for the external world. Hold is much more interesting and slightly complex. For hold, what is the path delay? T_PD + T_COM + T_OUT, the same path delay, and it should be greater than T_H. This is the equation, right. I rearrange: the left-hand side is the estimate part, and T_H is part of the estimate because it belongs to the external flop, which I do not know.
So, I take T_H to the left-hand side: T_OUT - T_H on the left, and on the right-hand side I have -(T_PD + T_COM), right. Now, the output delay constraint. How do you specify the output delay? We use set_output_delay. How do you separate setup and hold? You say -min for hold. Please remember: if you have a constraint like set_output_delay -min, it is for hold, and if the value there is positive you need to check it again, because this equation tells me that you have to give a negative value. The negative value does not mean anything by itself; it only means that this is the way STA tools understand the data. How do we arrive at this? We wrote the equation and rearranged it to conform to a standard. What is that standard? For hold, your estimate should be greater than something. So we rearrange, and the right-hand side value is negative. set_output_delay -min should almost always be negative. If you are giving a positive value you should rethink: giving a positive value is most of the time an error, and it will result in an optimistic analysis, so please be very careful. Go through these equations. You just have to remember two things: how to write the equation for a max delay and how to write the equation for a min delay. These equations are of almost the same format as the register to register equations, right. Other important things that STA checks are pulse width and signal slew. Pulse width is the time between the active and inactive states of the same signal. When is pulse width important? Pulse width is very important for memories. Hard memories, that is, full-custom memories like single-port memories or register files, have a min pulse width check on the clock. What it means is that if you give a clock whose pulse width is less than some value, either for the high pulse or for the low pulse, the memory will not work.
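Going back to the output-side equations, the two bounds can be sketched like this. The max case uses the lecture's numbers (10 ns period, 2 ns propagation delay, 2 ns combinational delay); the best-case delays and the external hold time in the min case are hypothetical, and the function names are illustrative.

```python
def max_external_budget(t_ck, t_pd_max, t_com_max):
    """Setup side: T_OUT + T_S < T_CK - (T_PD + T_COM). Returns the right-hand side."""
    return t_ck - (t_pd_max + t_com_max)


def min_external_bound(t_pd_min, t_com_min):
    """Hold side: T_OUT - T_H > -(T_PD + T_COM). Returns the right-hand side."""
    return -(t_pd_min + t_com_min)


# Setup: 10 - (2 + 2) = 6 ns available to the external world.
print(max_external_budget(t_ck=10.0, t_pd_max=2.0, t_com_max=2.0))  # 6.0

# Hold, with assumed best-case internal delays of 1 ns each.
bound = min_external_bound(t_pd_min=1.0, t_com_min=1.0)
print(bound)  # -2.0, a negative bound as the rearranged equation predicts

# Hypothetical external side: no board delay and a 0.5 ns hold requirement.
t_out, t_h_ext = 0.0, 0.5
print(t_out - t_h_ext > bound)  # True, hold is met
```

The sign of the hold-side bound is what to look at: it comes out negative whenever the chip has any internal launch and combinational delay.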
So, a memory needs a minimum pulse width on both the high pulse and the low pulse of the clock, right. This check is independent of the setup and hold window, right, so you need to make sure that it is met. If you do report_constraint -all_violators, the report will tell you all the setup violations and the hold violations, and it will also tell you the min pulse width violations, right. Then there is signal slew: the amount of time it takes a signal to transition from 0 to 1 or from 1 to 0. It takes into account the uncertainty in the transition. It can be measured from 10 percent of VDD to 90 percent of VDD, or from 20 percent of VDD to 80 percent of VDD; this is defined by the technology library, as we saw in unit 3, right. So, these are the two things, apart from the setup and hold constraints, that STA checks, right. For signal slew there is a max transition constraint. Also, please note that slower transitions, as we saw, result in more short-circuit power, so it is good to have faster transitions. Next is something called clock latency. Clock latency is measured from the port of the design where we create the clock. Now, let us say you have a clock; the figure shows that this is one clock pin, this is another and this is a third. The clock reaches clk a, clk b and clk c through three different networks. Now, let us say there is a path from clk a to clk b, right. Now, what is latency? Let us consider clk a in isolation and forget about clk b and clk c for now. The latency at clk a has two values, a rise and a fall. Now, let us say the clock rises here at the source. Along the chain it goes fall, rise, fall, rise; there are four inverters in this path, right. The rise delay here is 7 and the fall delay is 4.
So, what is the total delay the clock takes from clk to clk a? It has two rise transitions and two fall transitions, right. Two rise transitions means 7 plus 7, which is 14, and two fall transitions means 4 plus 4, which is 8. So, 14 plus 8 is 22, and the clock latency at clk a is 22. Similarly, for clk b: the clock rises here, the clock falls here, there is a buffer here, and the clock falls here, right. So, the fall latency at clk b is fall plus fall, 4 plus 4, which is 8, because a rise at the clock source means a fall here; rise and fall are always stated in relation to the output. How do we calculate the rise latency at clk b? A rise at this point costs 7, and a rise at this point means a fall at the previous point, which costs 4, so 4 plus 7 is 11, right. So, this way you can calculate latency. Latency is the time it takes from the clock creation point to the point we are interested in, which is mostly the clock pin of a particular flop. The second thing is called clock skew. Clock skew is defined in relation to two points: it is the difference in delay between two points, the two points being the clock pins of two registers. So, when we talk about skew, there is a skew between clk a and clk b, between clk b and clk c, and so on. Pick any two clock pins: the difference in their arrival times is the clock skew. So, how do I calculate clock skew? For example, for clk a and clk b you calculate the rise latency at clk a and the rise latency at clk b. The clock rise skew is the rise latency at clk b minus the rise latency at clk a, taken as an absolute value. Similarly you can calculate the fall skew.
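The clk a calculation can be reproduced with a small sketch. It models only what is stated above: a chain of four inverters where every rising output costs 7 and every falling output costs 4 (no units are given in the lecture). The function names are illustrative, and for clk b the rise latency of 11 is simply taken from the lecture rather than modelled, since the exact cells on that branch are only visible in the figure.

```python
RISE_DELAY = 7  # delay when a stage output rises (value from the lecture)
FALL_DELAY = 4  # delay when a stage output falls (value from the lecture)


def inverter_chain_latency(num_inverters, source_edge):
    """Accumulate delay through a chain of inverters.

    Each inverter flips the edge; the delay charged at a stage depends on
    whether that stage's OUTPUT is rising or falling.
    Returns (total_latency, edge_seen_at_the_end_of_the_chain).
    """
    edge = source_edge
    total = 0
    for _ in range(num_inverters):
        edge = "fall" if edge == "rise" else "rise"
        total += RISE_DELAY if edge == "rise" else FALL_DELAY
    return total, edge


def clock_skew(latency_x, latency_y):
    """Skew between two clock pins: absolute difference of their latencies."""
    return abs(latency_y - latency_x)


# clk a: four inverters, source rising -> fall, rise, fall, rise = 4 + 7 + 4 + 7
latency_a, edge_a = inverter_chain_latency(4, "rise")
print(latency_a, edge_a)  # 22 rise

# clk b: rise latency of 11 as stated in the lecture (not modelled here).
rise_latency_b = 11
print(clock_skew(latency_a, rise_latency_b))  # 11
```

With four inverters the edge at clk a has the same polarity as the source edge, and both the rise and the fall latency come out as 22 because each case sees two rising and two falling stage outputs.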
So, you know the clock latency at clk a, clk b and clk c, that is, the rise and fall latencies separately, and the clock skew between any two points is simply the difference of the clock latencies at those two points. We will see how clock skew is very, very important when we look at post-layout STA. Clock skew changes the timing window. So, now let us go back to the setup and hold times; let us go to this setup window. Here we are assuming that the clock arrives at the same time at the two flops FF1 and FF2. This is called an ideal clock; this is the pre-layout clock, right. What about post layout? In post layout the active clock edge will come either later or earlier. If the capture edge comes earlier, your window is shortened and your setup timing is more constrained, right. Now let us talk about hold. What happens if the capture clock comes later? You are more constrained for hold; it is a problem for hold when the capture clock comes later. So, this is where clock skew is very important. In post-layout STA, PrimeTime calculates the clock latency for every clock pin, so it knows the clock skew between every pair. It does not calculate clock skew separately, but since it knows the latency at every point, you can see the clock skew in the timing report, right. So, this is why clock skew is very important. We will see how we estimate clock skew in pre-layout and how we handle clock skew in post-layout, right. So, we saw clock latency and clock skew. Next, slack and critical path. We have discussed what slack is: it is the difference between the constraint and the path delay.
Negative slack means that the constraint has not been satisfied; positive slack indicates that the constraint is met. The STA tool calculates the slack of each timing path in order to find the critical path. A critical path is any logic path that violates timing constraints; that means any path with negative slack is a critical path, right. There can be multiple critical paths. We saw how DC handles critical paths: it tries to reduce the worst negative slack and the total negative slack on the critical paths; that is the idea of optimization. This slide just gives a summary of how we calculate slack. We have seen in the setup and hold timing equations how the slack is calculated. You can review this slide later; it tells how the slack is calculated for input to register, register to output, register to register and input to output paths, right. So, the most important thing for you is to understand the equations I have discussed before. If you understand the equations, you can actually write equations for any kind of timing path and you can understand how setup and hold timing are checked, right. Some paths can get complex, and certain interfaces have complex timing checks, but if you know how to write the equations for setup and hold, you can understand any path and any timing report. What are recovery and removal times? Setup and hold timing is characteristic of synchronous signals; recovery and removal is characteristic of asynchronous signals. Recovery time is the time available between the asynchronous signal going inactive and the active clock edge.
For example, when reset N goes inactive, it should go inactive at least some time before the active clock edge; this constraint is called the recovery time, and it is very similar to setup. Removal time is similar to hold: it is the time between the active clock edge and the asynchronous signal going inactive. The difference between recovery and removal on one side and setup and hold on the other is this: setup and hold apply to both transitions. There is nothing called active and inactive in the case of a synchronous data signal; whether the data goes from 0 to 1 or from 1 to 0, there is a setup and hold time for both. But for asynchronous signals like reset, only the transition going from active to inactive is important. Why? Because when it goes active the functionality is very clear: whenever reset goes active, the output goes to 0, that is it, no timing constraint. But when the signal goes inactive, let us say the signal is going inactive here and this clock edge is the active clock edge. We have to make sure that whatever data is present at this edge is captured by the flop. But if the reset violates here, if it changes somewhere here and violates the recovery time, the Q of this flop may remain 0, because the flop still considers reset to be active, since the recovery condition is not met. Similarly for removal: this is again the active edge, but this edge should not have any effect. Why? Because reset is still active here. But let us say the reset is removed earlier; then this clock edge might affect the data. Since reset is active here, the data should be 0, but since you have removed it earlier it might be a problem. So, that is why the reset edges going inactive are important, not the edges going active. This is why you will see that recovery and removal are characterized in only one direction.
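As a rough sketch, the recovery and removal windows around one active clock edge can be checked like a setup and hold window, but only for the reset release (the active to inactive transition). The function name and all time values are hypothetical.

```python
def reset_release_ok(t_release, t_edge, t_recovery, t_removal):
    """Check one reset release time against one active clock edge.

    Recovery (setup-like): a release before the edge must come at least
    t_recovery before it. Removal (hold-like): a release after the edge
    must come at least t_removal after it.
    """
    if t_release <= t_edge:
        return (t_edge - t_release) >= t_recovery
    return (t_release - t_edge) >= t_removal


# Hypothetical numbers in ns: clock edge at 10.0, recovery 0.5, removal 0.25.
for t in (9.0, 9.75, 10.125, 10.5):
    print(t, reset_release_ok(t, t_edge=10.0, t_recovery=0.5, t_removal=0.25))
# 9.0 True      released well before the edge
# 9.75 False    recovery violated
# 10.125 False  removal violated
# 10.5 True     released well after the edge
```

There is no corresponding check for the reset going active, which matches the one-directional characterization described above.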
Let us say you have an active-low reset: recovery is characterized only for reset going high, that is, going inactive, and not the other way. You can open up the standard cell library, go to a flop and see how recovery and removal times are characterized. Let us talk about timing exceptions. I will briefly introduce false paths and multicycle paths. So, what are false paths? False paths are paths in a design that exist according to the STA tool, but they are not possible: they are not sensitized under any input condition. Let us see one example. Let us say you have two muxes here, and both are controlled by the same select, right. If select is 0, A is passed on at the first mux, and here C is passed on to the output. If select is 1, B is passed on at the first mux. At the second mux there are two branches: one comes through the first mux and the other comes in directly, and the selected one is passed on to the output. Now consider the path starting from A and going through this branch. Is this path possible? No. Why? Because the select is the same. If select is 0, A goes through the first mux, but it is blocked at the second mux, because that branch is on the 1 input. So, this path is not activated. Now consider the case when select is 1: the branch at the second mux is activated, but A is not passed by the first mux. So, this path is not possible. But then why is this path even shown? If you do not do anything, PrimeTime will show paths which are not logically possible in the design.
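The shared-select example can be checked by brute force. The sketch below assumes one particular wiring consistent with the description: the first mux passes A when S is 0 and B when S is 1, and the second mux passes C when S is 0 and the first mux's output when S is 1. The exact pin assignment in the figure may differ, so treat the wiring and the function names as assumptions.

```python
from itertools import product


def mux(sel, in0, in1):
    """2:1 mux: returns in0 when sel is 0, in1 when sel is 1."""
    return in1 if sel else in0


def shared_select(a, b, c, s):
    m1 = mux(s, a, b)      # first mux: S=0 -> A, S=1 -> B
    return mux(s, c, m1)   # second mux: S=0 -> C, S=1 -> first mux output


def independent_selects(a, b, c, s1, s2):
    m1 = mux(s1, a, b)
    return mux(s2, c, m1)


def a_can_reach_output(circuit, num_other_inputs):
    """True if toggling A changes the output for at least one input pattern."""
    for others in product([0, 1], repeat=num_other_inputs):
        if circuit(0, *others) != circuit(1, *others):
            return True
    return False


print(a_can_reach_output(shared_select, 3))        # False: A -> output is a false path
print(a_can_reach_output(independent_selects, 4))  # True: with separate selects it is real
```

A static tool looks only at the structure, where the path from A through both muxes exists in either circuit; the logic values that make it unsensitizable in the shared-select case are exactly what it does not consider.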
So, PrimeTime will show this path, this path, this path and this path; it will show all possible paths. Why? Because it is static in nature: it does not consider logic values, it considers all possibilities. It does not consider any logic value on S; it assumes S can be anything, and therefore all paths are possible. It does not consider the fact that S controls both the muxes. So, as users it becomes our job to tell the tool what to do. What can we do? We can tell the tool that this path is not true. You can do this in multiple ways. One of the well-known ways is set_false_path: you say set_false_path and you pick the from and to points. There is one more way which we will talk about later, but this is what is meant by a false path. False paths are paths in the design that are not logically possible but are still shown by the static timing analysis tool, ok. One more example: let us say you know that there is some flop in your design which never changes its value, which is always static. There can be multiple reasons; it could be a software-programmable register, and software-programmable registers are mostly written only one time, or they do not change at all and have a fixed value. Any setup check starting from this flop or any hold check starting from this flop is false. Why? Because the value never changes. No change in value means that it is not going to trigger any setup or hold check. PrimeTime does not know this on its own; you have to tell the STA tool that this is a false path. This is one example; we will see a lot more of this in later sessions. The other type of exception is the multicycle path. Multicycle paths are data paths that require more than one clock period, for example.
Now, let us say you have a big combinational cloud and you know that it is going to take more than one clock period, and you do not want to add any pipeline stage here. So, you can tell the tool the number of cycles, right. Both false path and multicycle path constraints are also important for synthesis. You do not want the synthesis tool to optimize false paths, and you do not want synthesis to work unnecessarily on a combinational cloud which you know is going to violate a single-cycle check. So, you have to tell the tool: set_multicycle_path, from this, to this, will take more than one cycle. We will see the commands later, do not worry about the commands now, but this is the way you can tell the tool that my logic will take more than one clock period. This is very interesting; we will see how STA handles this and what the timing report looks like. The idea of this session is just to introduce you to the concepts of multicycle paths and false paths. What else does the STA tool check? Boundary conditions. They are the same as what Design Compiler uses, but please note that post-layout STA has the actual net parasitics. The boundary conditions are something you should specify for proper STA, and again they are the same commands, set_load, set_driving_cell, set_input_transition, same as what we discussed in an earlier unit. So, let us summarize the STA flow. It is very similar to synthesis, nothing different: read the design data, that is, the library, the design and the parasitics. The parasitics are important here; this is what is new in post-layout STA compared to synthesis. Then apply constraints. These constraints will be slightly more than what we had in synthesis. We will see why: because we are talking about post layout, some things will change; the nature of the clocks will change.
So, we will see how to modify or add to these constraints and what important things we need to focus on, but know for sure that the constraints here will be slightly more than what we applied in synthesis, right. Next we check that the design is fully constrained and that there are no DRC violations. What happens when there are DRC violations? You cannot rely on the timing data. I have talked a lot about this; we will come back to how DRC violations affect timing in later sessions. The last part is reporting: we report all the timing information. We do report_timing for setup, we do report_timing for hold, and we do report_constraint -all_violators, and then we move on, right. This is an exercise. I will not solve it here, but it is a very simple thing and you can apply your equations here. This figure tells you that this is the setup launch edge, this is the setup capture edge, and these two edges are the hold check, or these two edges, it does not matter. So, what you could do is take one of the labs, do a report_timing, and see that what this figure tells actually matches the report_timing output. First you can use the equations I have talked about in the previous slides and calculate. What should you calculate? You should calculate the setup slack and the hold slack, and you can estimate the input delay and the output delay. Please note that what is special here is that I have included max and min values. So, the combinational logic here will not have the same value for the worst case and the best case, right. The combinational logic has a greater value in the max case, which is 5 ns here, and a lesser value for the best case, as shown here. The clock period does not change, and I am assuming that the setup and hold constraints do not change, right.
Clock to Q delay: again, the clock-to-Q delay is more for the worst case and less for the min case. Calculate the setup slack and the hold slack, and please make sure you use the correct values. Then estimate the input delay and the output delay. There is no combinational logic on the boundaries: assume 0 combinational logic here and 0 combinational logic here, assume this goes to an output port directly and this comes from an input port directly. You should be able to estimate the input delay and the output delay. Please do this for both the max column and the min column, ok. I hope you will be able to solve this very easily; you just have to go back, understand the equations, come back here and solve it. It is just a 2 minute job, right. Summary. What we covered in this session was the concept of STA: how the tool breaks the design into timing paths, considers all the paths, and groups them into path groups such as register to register and input to register. STA has two parts. The first is delay calculation: cell delays are calculated from library data, and net delays come from the parasitics file; we will see more about that in one of the later sessions. So, the delay is calculated, and then the delay is checked against the timing constraints. There are two types of timing constraints: max delay constraints and min delay constraints. Max delay constraints are critical at the worst-case corner and are of the setup constraint type; min delay constraints are critical at the best-case corner, and min delay checks are also called hold checks. That is all for this session. In the next session we will discuss these aspects in more detail. Thank you.