In the previous lecture we completed the testing and improvement steps of the device modeling procedure. In this lecture we shall take up the very last step of device modeling, namely parameter extraction. The first point to note is that parameter extraction is the process of determining the parameters in a model so that the model fits the measured data as closely as possible. Parameters should preferably be determined using (a) the specific model equation in which they are employed and (b) the device geometries, biases and measured quantities relevant to the application. Let me explain these two points.
First, we should extract the parameters using the specific model equation in which they are employed. What does this mean? You know that there are, for example, the surface potential based model, the BSIM model, and other models. If you extract, let us say, the mobility using the BSIM model, you should not use this mobility in the surface potential based model or some other model. Now you might ask: isn't the mobility of the MOSFET the same irrespective of the model employed? Yes, in reality this is true. In practice, however, we use the mobility parameter in an equation, and as we have remarked throughout this course, any model is derived based on some approximations. Different models use different sets of approximations. Since we are trying to fit the model to measurements, the extracted parameters get adjusted a little so that the particular model expression fits the measurements. The surface potential based model will fit the measurements, the BSIM model will fit the measurements, and some other model will also fit the measurements, but since the set of approximations is different in each, the parameter value that you use in any one model will differ from the value you use in another.
So the mobility in the different models would be slightly different, because its value gets adjusted to fit the model to the measured data.
Now let us take up the second point. What do we mean by saying that the parameters should be extracted using the device geometries, biases and measured quantities relevant to the application? Consider geometry first. Suppose my device has a short channel. I should not extract parameters using a long channel MOSFET and then use these parameters to predict the short channel device characteristics. For example, I should not use a mobility extracted from long channel characteristics in the short channel model. If I do that, the model prediction may not correspond to the measured data for the short channel device.
What about biases? Again take the example of mobility. If I extract the mobility at low Vgs values and use it for characteristics corresponding to high Vgs, the model predictions in the high Vgs range may not match the measured data.
What about measured quantities relevant to the application? Take the example of substrate doping. I could extract the substrate doping from capacitance-voltage characteristics, or I could extract it from measured DC current-voltage characteristics. What is being said here is that if I want to match the DC current-voltage model to measured current-voltage characteristics, I had better obtain the doping by a parameter extraction procedure that fits the DC current-voltage model to the measured data, rather than extracting the doping from the CV characteristics and then using it in the model for the current-voltage characteristics. That is what I should avoid.
Now let us illustrate with an example how parameter extraction happens. We will take up the commonly used square law current-voltage model for above threshold characteristics. What are the parameters that need to be extracted in order to predict the current-voltage characteristics in the super threshold region using this model? The first important parameter is the threshold voltage Vt, which appears here in the current-voltage equation. Then we need the values of W and L, the mobility parameters, and the oxide capacitance Cox. Let us look at the threshold voltage. If I want to predict the threshold voltage as a function of substrate bias, then I need the flat band voltage Vfb, the parameter φt, which is 2φf plus 6 times the thermal voltage, and the body effect parameter γ.
As far as the W and L values are concerned, we should note that what we observe in a microscope, or from the top view, are the mask parameters Wm and Lm, that is, the gate width and gate length realized in photolithography. However, the actual width and channel length of the MOSFET are different from the mask dimensions Wm and Lm. To illustrate this point we should go to the device structure. Here is the cross section, which I have repeated from an earlier module, and here is the top view. You can see that the channel length is the distance between this junction at the source and this junction at the drain. So this is your channel length L, whereas this is your mask length Lm. What about the mask width? You can see that the mask width is something like this; this is what you see in a top view. However, the actual width of the MOSFET would be a little different; it will be smaller. For that purpose let us see the 3D view.
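For reference, the two equations referred to so far can be written out as below. This is a reconstruction from the spoken description, since the slides are not part of this transcript. It assumes the non-saturated form of the square law, with α being the parameter discussed in earlier modules, and it writes the thermal voltage as kT/q to keep it distinct from the threshold voltage Vt.

```latex
\begin{align}
I_{ds} &= \frac{W}{L}\,\mu_{ns}\,C_{ox}
          \left[(V_{gs}-V_t)\,V_{ds}-\frac{\alpha}{2}\,V_{ds}^{2}\right] \\
V_t &= V_{fb}+\phi_t+\gamma\sqrt{\phi_t+V_{sb}},
\qquad \phi_t = 2\phi_f+6\,\frac{kT}{q}
\end{align}
```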
So this is the device along the channel length, and along the channel width you have this. Now let me expand this part of the device along the width and show you how it looks. It would look something like this, and your gate is something like this. This is your gate oxide thickness, whereas this is your field oxide thickness. I am assuming the LOCOS process; this is the local oxidation region. What you observe on the mask is this length, Wm, whereas the channel width is actually this W, which is less than Wm. How much is Wm minus W? It is this region plus this region. Similarly, here we have this region and this region; these are the ΔL regions, the difference between Lm, that is this, and L, that is this. So we need to know Wm and Lm, which we can get from the microscope, but we also need to know ΔW and ΔL. These can only be known after the device is fabricated, because the device undergoes some complex processing.
If you want to predict the mobility for various biases, then this is your equation. Here you need to know μn0, E0 and n. The field Ey,eff0 is a function of the gate-to-source voltage. I want to emphasize that this is the formula for the effective mobility near the source; that is why you have the suffix 0. As we have remarked, in MOSFET modeling the mobility varies from source to drain. However, the effect of the transverse electric field on mobility is maximum near the source, because the transverse electric field is maximum near the source. So we normally use the mobility near the source to predict the measured current, because this is the worst-case situation.
Let us summarize our discussion. We need to extract Vt, φt, Vfb and γ; then ΔW and ΔL, which appear in these equations for channel width and channel length in terms of mask width and mask length; and finally the mobility
parameters μn0, E0 and n. These are the parameters.
Now how do we go about this extraction? We use the Id in strong inversion, that is for Vgs greater than Vt plus 0.5 V, and we set Vds at a small value of 50 millivolts. Why do we choose these conditions on Vgs and Vds? Small Vds implies that the drain-to-bulk current is negligible, because the reverse bias across the drain-to-bulk junction is small. Further, a not too large Vgs implies that Idg, the leakage current from drain to gate, is negligible. Therefore, we can assume that the terminal drain current Id is the drain-to-source current Ids under these conditions; the effects of gate leakage and bulk leakage are neglected. We are also going to assume that our device does not have source and drain resistances, or that these are small. You might ask what happens if source and drain resistances are present; in the next module we will discuss that point.
Under these conditions the square law model that we want to use can be written as shown here. It has been written in a slightly different form than we saw on the previous slide. All that has been done is that the square law term has been slightly rearranged: one Vds has been taken out of the bracket, and the remaining Vds term is clubbed along with Vt.
Now let us discuss the extraction of Vt. Here is a typical measured Id versus Vgs curve for Vds equal to 50 millivolts and Vbs at some constant value. When you sweep Vgs, the weak inversion, moderate inversion and strong inversion regions are as shown here. The approach to get Vt is to extrapolate the tangent from the point of inflection to the Vgs axis. The dashed line is the tangent to the measured data at the point of inflection. Why do we do it at the point of inflection? Well, the point is that if you say, just fit a straight line
to the approximately linear region of the Id versus Vgs data, then different people will choose different sets of points of this data to fit a straight line, and so they would get different threshold voltage values. In order to introduce some precision, we need to tell exactly what range of data points the straight line must be fitted to. Rather than telling a range, you just tell the point. Now what is the point of inflection? As we know, it is the point where the slope of the curve is maximum. So we choose that point, which will be determined identically by different people for the same data, and at that point you draw a tangent. Wherever the tangent meets the Vgs axis, that is your threshold voltage.
Let me point out that if you want to be very accurate, then, since this is the equation being used, the current becomes 0 where Vgs is equal to Vt plus (α/2)Vds. So there is a small correction that you need to apply: from the measured intercept you must subtract αVds/2 to get the threshold voltage. What value of α should you use? Since this is a small correction, we do not need to spend a lot of effort in deriving the exact α. Typically people use α equal to 1.2; we know that α is greater than 1. In previous modules we have discussed this parameter α: it tries to capture the variation of the depletion width in the channel from source to drain in the square law model. Since our Vds is 50 millivolts, Vds/2 is 25 millivolts, and 25 times 1.2 is 30 millivolts, which is the correction you need to apply. So from the measured value of this intercept you subtract 30 millivolts, and you get the threshold voltage.
Let us discuss the extraction of φt, Vfb and γ. We need them to predict the body effect on threshold
voltage, that is, the effect of Vsb on Vt. The approach is as follows. You can see that if I plot Vt versus the square root of φt plus Vsb, then according to this equation it should be a straight line whose intercept gives you Vfb plus φt and whose slope gives you γ. But how do you know φt? If you know φt, then this will work. We know that φt is 2φf plus 6 times the thermal voltage. Now 2φf depends on doping, but only logarithmically. You know the formula for φf: it is the thermal voltage times ln(Nd/ni) for an n-type substrate, and the thermal voltage times ln(Na/ni) for a p-type substrate. Since 2φf depends logarithmically on doping, even though the doping may vary over a wide range, the 2φf term will not vary too much. So you can assume some value, let us say about 0.8 V or 0.9 V, for φt, and put that value here. Then you arrange your Vt data as a function of the square root of that value of φt, let us say 0.9 V, plus Vsb, and you plot the measured points. Let us say for Vsb equal to 1 volt you have got some Vt; you take the square root of 0.9 plus 1, that is the square root of 1.9, and that is where you plot this point. For Vsb equal to 2 volts you have got another Vt; you calculate 2 plus 0.9, take the square root, and plot that point. You go on like this.
Evidently, unless your φt corresponds to the exact value in the device, these points will not fall on a straight line. So what do you do? You go on changing φt. The shape of this curve will go on changing, and for some φt all these points will fall on a straight line. That is the value of φt that corresponds to the actual device. How do you achieve this in practice? You fit a straight line to these measured data and calculate the root mean square error. If all the points are perfectly
on a straight line, then the root mean square error is 0. So you vary φt over a range, say from 0.7 V to 1.1 V, and find the value of φt for which the error is minimum. That value is the φt corresponding to the actual device. In short: fit a straight line to the Vt versus square root of (φt plus Vsb) data, and adjust φt until the root mean square error is minimum. Once you have this value of φt, the remaining things are straightforward. From the slope I get γ, and from the intercept I get Vfb: the intercept is Vfb plus φt, and I already have φt, so I subtract φt from the intercept to get Vfb.
Now let us discuss the extraction of ΔW and ΔL. Wm and Lm are known from layout information, or by observation under a microscope. ΔL and ΔW depend on the fabrication process. The approach is as follows. Consider Vds equal to 50 millivolts, which is much less than Vgs minus Vt. If Vds is this small, I can simply neglect the (α/2)Vds term, and my equation simplifies to this. Now I take the conductance G, which is the ratio of Id to Vds under this small Vds condition. According to this formula, the conductance is equal to the average surface mobility μns times Cox times (Vgs minus Vt) times W by L, where W is written as Wm minus ΔW. So what do I do? When I am fabricating the integrated circuit, I include a test structure with devices having different values of mask width Wm. During parameter extraction I measure the conductances of these devices and plot them as a function of Wm. The points will be something like this: as Wm increases, the conductance increases, because the current is increasing. Now, according to this formula, for different values of Wm the ΔW is the same, because it is determined by the
process, and the same process has been used to realize all these devices with different widths. Similarly, the other parameters μns, Cox and L are also the same for all these devices, you are applying the same Vgs, and Vt is also the same. So G versus Wm should be a straight line. You fit a straight line and extrapolate it to G equal to 0. Please note that I may not be able to measure this point, because G equal to 0 means infinite resistance, but we can extrapolate the straight line to the Wm axis, and the intercept gives you ΔW.
An analogous approach can be adopted for ΔL. All that we do is take the resistance R instead of G. If I take the reciprocal of the same equation, R is equal to Vds by Id, and it comes out in this form. Now I write the channel length L in terms of the mask length Lm and the process dependent parameter ΔL. You would have guessed what I am going to do: I am going to measure the resistances for different values of mask channel length Lm. For all these devices the channel width is kept the same, and all the other parameters are also the same, so this factor is a constant for the devices with different Lm. If I plot R as a function of Lm, once again all the points fall on a straight line, and if I extrapolate this straight line, the intercept on the Lm axis is ΔL.
Finally, let us discuss the extraction of the mobility parameters μn0, E0 and n. This is your formula, and it is applicable for Vds much less than L times Ec. For that condition, the average surface mobility in our model equation, which is μn,eff0 divided by (1 plus Vds by L Ec), is approximately equal to μn,eff0. So only the transverse field dependence occurs here for small Vds; the
horizontal field dependence does not come into the picture, because when Vds is small the horizontal field itself is very small. That μn,eff0 is given by this formula, where Ey,eff0 is equal to (Vgs plus Vt) divided by 6 tox. This was given to you as an assignment: for an n-channel device with an n+ polysilicon gate, you can show that the y-directed effective field at the source is simply (Vgs plus the threshold voltage) divided by 6 times the oxide thickness. We are going to use this result.
Let us write the reciprocal of the average surface mobility μns. It is approximately equal to (1 by μn0) times [1 plus ((Vgs plus Vt) by (6 tox E0)) raised to the power n]. We have substituted the Ey,eff0 formula here and then taken the reciprocal. Why have we taken the reciprocal? Because we want to arrange the data in the form of a straight line, so that from the straight line fit we can extract the parameters of interest. It is always easy to fit a straight line; it is not that easy to fit other functions, which are nonlinear. Therefore, we always arrange the measured data in such a way that we end up fitting a straight line. That is why we are taking the reciprocal here.
Now how do I get 1 by μns? From the formula here for small Vds, where once again this term does not appear, we can write 1 by μns equal to (W by L) times Cox times (Vgs minus Vt) divided by G, where G is Ids by Vds. The same equation has simply been rearranged to give the reciprocal of the average mobility. Here we already know W and L, because we have extracted ΔL and ΔW previously. This is an important point: you must extract the parameters in a certain sequence, so that you can use the extracted values in later steps. So the mobility parameters should be extracted after extracting the threshold voltage,
body effect parameter, ΔL, ΔW and so on. Once you have extracted these parameters, I can use this formula to calculate 1 by μns for various Vgs, and then plot 1 by μns versus Vgs plus Vt. W, L, Cox, Vgs, Vt and G are all known from measurement or from the earlier extraction steps; I put those values in this formula, get the reciprocal, and plot the points versus Vgs plus Vt. That is what is shown here. If you would like to know how to get Cox: that is very easy, because the oxide thickness can be obtained from CV measurements, and that oxide thickness can be used here in the current-voltage model. You will say that I am violating my own suggestion that all parameters should be extracted from the quantities relevant to the application, but as far as tox is concerned we will violate our guideline a little bit.
It turns out that in many situations the power n is 1 or very close to 1, and then 1 by μns versus Vgs plus Vt is a straight line. In that case it is straightforward to get 1 by μn0 and E0 from the intercept and the slope; I leave it to you as an exercise. Suppose n is not unity; what do you do then? Even that I will leave to you as an assignment. Here is the assignment question: extract the parameters μn0, E0 and n of the effective mobility expression from the following mobility data, derived from the measured output conductance of a MOSFET for small Vds. Here is the data: these are the mobilities, and these are the gate-to-source voltages at which the mobilities have been measured, as we showed in the previous slide. We have also measured that the device has a threshold voltage of 0.6 volts and an oxide thickness of 5 nanometers. You can use these quantities to extract E0, n and μn0.
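The four extraction steps described above (Vt, then φt, γ and Vfb, then ΔW and ΔL, then the mobility parameters) can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the procedure of any particular extraction tool: all function names are hypothetical, only the standard library is used, the inflection point is approximated by the sampled point of maximum central-difference slope, and the mobility step covers only the n = 1 case. The numbers in the demo are made up and are not the assignment data.

```python
import math

def fit_line(xs, ys):
    """Least-squares fit y = slope*x + intercept; also returns the RMS error."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = my - slope * mx
    rms = math.sqrt(sum((y - slope * x - intercept) ** 2
                        for x, y in zip(xs, ys)) / n)
    return slope, intercept, rms

def extract_vt(vgs, ids, vds=0.05, alpha=1.2):
    """Tangent at the maximum-slope point, extrapolated to Id = 0,
    then corrected by subtracting alpha*Vds/2."""
    # Central-difference slope at every interior point of the sweep.
    slopes = [(ids[i + 1] - ids[i - 1]) / (vgs[i + 1] - vgs[i - 1])
              for i in range(1, len(vgs) - 1)]
    k = max(range(len(slopes)), key=lambda i: slopes[i]) + 1  # index into vgs
    v_intercept = vgs[k] - ids[k] / slopes[k - 1]
    return v_intercept - alpha * vds / 2

def extract_body_effect(vsb, vt, phi_candidates):
    """Scan phi_t; keep the value for which Vt vs sqrt(phi_t + Vsb)
    is closest to a straight line (minimum RMS error)."""
    best = None
    for phi in phi_candidates:
        xs = [math.sqrt(phi + v) for v in vsb]
        gamma, intercept, rms = fit_line(xs, vt)
        if best is None or rms < best[0]:
            best = (rms, phi, gamma, intercept - phi)  # intercept = Vfb + phi_t
    return best[1], best[2], best[3]  # phi_t, gamma, Vfb

def extract_delta(mask_dims, values):
    """Axis intercept of the line fitted to G vs Wm (gives delta W)
    or to R vs Lm (gives delta L)."""
    slope, intercept, _ = fit_line(mask_dims, values)
    return -intercept / slope

def extract_mobility_n1(vgs, mu, vt, tox):
    """For n = 1: 1/mu = 1/mu0 + (Vgs + Vt)/(6*tox*E0*mu0),
    which is a straight line in (Vgs + Vt)."""
    slope, intercept, _ = fit_line([v + vt for v in vgs],
                                   [1.0 / m for m in mu])
    mu0 = 1.0 / intercept
    e0 = intercept / (6.0 * tox * slope)
    return mu0, e0

if __name__ == "__main__":
    # Synthetic width series: G = k*(Wm - dW) with dW = 0.2 um (made-up numbers).
    wm = [2.0, 4.0, 6.0, 8.0, 10.0]
    g = [1e-5 * (w - 0.2) for w in wm]
    print("delta W (um):", extract_delta(wm, g))
```

The functions are meant to be called in the order in which they are defined, since the later steps use the results of the earlier ones, as emphasized in the lecture. For n different from 1, one possible route (again an assumption, not something stated in the lecture) is to scan n the same way φt is scanned and keep the value that gives the smallest straight-line fitting error.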
Towards the end of our module on testing, improvement and parameter extraction, let me quickly spend some time on the evolution of MOSFET models for circuit simulation. The first model that was proposed was part of SPICE; SPICE stands for Simulation Program with Integrated Circuit Emphasis. The first version, proposed in 1973, was called level 1. After that, in 1987, 1990 and 1994, researchers at the University of California, Berkeley proposed models called BSIM (Berkeley Short-channel IGFET Model), in three versions: BSIM1, BSIM2 and BSIM3. In 1994, alongside BSIM3, Philips also proposed a model called MM9. In 1996 yet another MOSFET model was proposed, called EKV. The abbreviation EKV comes from the first letters of the names of the people who proposed this model, at EPFL, a university in Europe. I leave it to you as an assignment to figure out what the MM in MM9 stands for and what the 9 is. In 1996 a council was formed, called the Compact Model Council. This council evaluated all the models proposed until then, based on various tests, and chose BSIM3 as the standard model, because with so many models available people did not know which one to use. So this is a sort of testing agency, which decided to test all the models and suggest which one is probably the best. After that, in 2001, a surface potential based model was proposed by Penn State University; simultaneously the HiSIM model was proposed by Hiroshima University, and yet another version of the MM model was proposed by Philips.
As we have discussed earlier, the models for circuit simulation were threshold based to begin with, because it was easy to relate such a model to what circuit designers do in practice. However, as the device size started reducing and the physics became more and
more complex, it was found that the threshold based model was becoming more and more complicated. At first it was thought to be simple from the computational point of view, but it kept getting more complicated; in fact, the definition of the threshold voltage itself was becoming very difficult. Therefore, people realized that one should go back to the surface potential based model and adapt it to small geometry devices. The 2001 model was the first such effort. Subsequently, in 2003, the BSIM group came up with another version of the threshold based model. In 2004, that is about 8 years after 1996, the Compact Model Council once again decided to test the various models available until then, and in 2006 PSP, a surface potential based model from Penn State University, was elected as the standard model. That is the history of the evolution of MOSFET models for circuit simulation.
Now let me discuss in a little more detail the differences between these models. The models proposed during the period 1972 to 1985 are called first generation compact models. Their common feature was that they emphasized device physics without due consideration to mathematical representation, leading to convergence problems in circuit simulation. Examples are the SPICE level 1, 2 and 3 models. The SPICE level 1 model, proposed in 1972, was the IV model of Shichman and Hodges, later improved by Meyer. In 1975 came the level 2 model. This was Meyer's IV model plus field dependent mobility plus subthreshold Id; that is, field dependent mobility and subthreshold current were added to the Meyer model to get the SPICE level 2 model. After the first release, a CV model was also included for transient and small signal predictions. Mathematically this is more complicated than level 1, and so it has more convergence problems than level 1. So people tried to incorporate more phenomena to come closer
to reality, but then mathematically things became difficult. SPICE level 3 was a semi-empirical, simplified version of level 2. It was popular for simulating digital circuits, but it was not very scalable, and discontinuities in the first derivatives of the drain current exist. It was based on the experience of levels 1 and 2: level 1 was too simple but computationally efficient, while level 2 incorporated more phenomena to come closer to reality and to predict the measured data better, but was computationally very difficult. The level 3 model tried to obtain both features: better prediction, as in level 2, together with the simplicity and computational efficiency of level 1. The consequence was that it was semi-empirical, because you cannot include all phenomena physically and yet get a simple equation. Empiricism was the price paid for computational efficiency.
What do we mean by not very scalable? Suppose you extracted the parameters using a device of a certain geometry, and then tried to use them after the channel length was reduced over a period of time. It was found that if the parameters extracted for longer channel devices were used for shorter channel devices, the predictions were way off from the measured data.
Then came the second generation compact models, in the period 1986 to 1996. Their features were improved mathematical representation, and so better convergence, at the cost of physical basis. Here mathematical properties were given a lot of importance, because circuit size was increasing and so circuit simulation time was increasing. It became important to improve the model mathematically so that the circuit could be simulated in a reasonable time.
The use of empirical parameters weakens the link between process and model parameters and complicates parameter extraction; this was the price paid for improving the computational efficiency. Examples are the first BSIM model, that is the SPICE level 4 model, published in the IEEE Journal of Solid-State Circuits in 1987. This was mainly a digital model. It was a regional model, and so it retained the discontinuities in the first derivatives of the IV and CV characteristics. It was inaccurate for submicron FETs, and the use of curve fitting polynomials in this model caused negative conductance and convergence problems. Another model in this second generation category was BSIM2, which came out in 1990. It improved accuracy and convergence to suit analog circuit simulation; however, the regional nature, and so the discontinuities of the first derivatives, remained. Then there was the HSPICE model, proposed in 1996. Its features were an accurate description of the transition region from weak inversion to strong inversion, that is the moderate inversion region, and good convergence to suit analog circuit simulation.
Then came the third generation compact models, 1997 to 2003. These achieved both physical basis and mathematical fitness by using smoothing functions to represent the different IV and CV regimes with a single equation. The model that we discussed in detail, using the smoothing functions Vds effective and Vgst effective, follows this latest approach to getting both physical basis and mathematical fitness. Discontinuities in the IV and CV characteristics are eliminated. These models were accurate for submicron technologies down to 0.01 micron. Examples were BSIM3, published in IEEE Transactions on Electron Devices in 1997, MM9, published in 1994, and BSIM4, published in 2003. These were source-referenced, threshold voltage based models, that is, the threshold voltage was referenced to the source. Other examples were the unified charge control model, UCCM, published in 1990 in IEEE Transactions on Electron Devices, and the EKV model, published in the journal Analog
Integrated Circuits and Signal Processing. These were inversion charge based, bulk-referenced models, which means the threshold voltage was referenced to the bulk. In fact, that is how we have derived our MOSFET model equations: we said that while in circuit applications the MOSFET is used with the source as the common terminal, from the modeling or device physics point of view it is easier to visualize this device with the bulk as the reference.
Then came the fourth generation compact models: the MOS model MM11 and the PSP model. These provide the advantages of third generation models using fewer parameters, and in addition work for sub 0.01 micron, or sub 10 nanometer, devices. Examples are MM11 and PSP, published in IEEE Transactions on Electron Devices in 2006. These are surface potential based, bulk-referenced models. Those who want to read more about these things are encouraged to read the article by C. T. Sah, of Pao-Sah model fame. We talked about the Pao-Sah model; Pao and Sah were two researchers, of whom Sah was the senior. He has a lot of experience in developing models for MOSFETs, and so he wrote this review article, A History of MOS Transistor Compact Modeling. It is available on the internet; those who are interested can read it.
With that we have come to the end of the module, so let us quickly summarize the important points. At the end of this module you should be able to: explain how the threshold voltage based model is improved with the help of smoothing functions to get the BSIM model; explain how the threshold voltage, φt, flat band voltage, body effect parameter γ and the effective mobility parameters are extracted from measurements; and describe the evolution of MOSFET models for circuit simulation.
Let us recapitulate some points we have made regarding these. As far as testing is concerned, we said that there are criteria for testing the physical basis of the solution, two of which are dimensional correctness and
prediction of limiting cases. We applied these criteria to the threshold based model. Yet another method of testing the physical basis of the model is to check the consistency of the solution with the approximations. These are the approximations, and we need to check whether the model is consistent with them. One example of this checking was given to you as an assignment: how to check the gradual channel approximation. Other approximations can similarly be tested for consistency. Yet another criterion for judging the physical basis is simply the number of empirical parameters used in the model.
Then you also test the model for accuracy. What are the criteria? You compare the model results with accurate simulation, and you compare the model with measured data. Comparison with measured data is the ultimate test. When fabrication of the device is costly and involves a lot of time and effort, then as a stopgap arrangement we can check the model results against accurate simulation. What is the meaning of accurate simulation? We said that in any modeling there are two sets of approximations. Approximation set one corresponds to the approximations made while deriving the qualitative model. These approximations are used to write the equations of the device, and when you solve these equations simultaneously you get a numerical solution. When we talk of accurate simulation, we are talking of this numerical simulation. Some approximations are indeed made here also, but they are fewer, and you can always relax these approximations, write more complicated equations, and then get a numerical solution. To get the analytical model, which is the model we are trying to compare, we make another set of approximations, which are approximations of these equations. So what we are essentially comparing is the results of the analytical model with the results of the numerical solution.
Testing gives you the directions of improvement. Essentially, the model is modified to improve one or
more of the criteria, namely GCAPS: generality, continuity, accuracy, physical basis and simplicity. We showed, for example, that you can take the threshold based MOSFET model and improve it on all fronts of the GCAPS criteria using the BSIM approach, where smoothing functions are used to eliminate the regional model and the discontinuities across regions. Two smoothing functions are used. One is the smoothing function for Vgs, called Vgst effective; the t is included because you are basically replacing the term Vgs minus Vt with this function. The other function is Vds effective. The function Vgst effective gives you a continuous variation from subthreshold to super threshold. It is plotted here: Vgst effective versus Vgs has this kind of shape, which corresponds to the shape of Id versus Vgs on a semi-log plot. The Vds effective function, on the other hand, removes the partition between the linear and saturation regions. It has the shape shown here as a function of Vds, which is the same as the shape of the Id versus Vds curve above threshold. The Vdsat term used in the Vds effective equation is Vgst effective plus 2 times the thermal voltage, divided by the parameter α, which is called the body effect parameter.
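To make the behaviour of the two smoothing functions concrete, here is a small Python sketch. It is a simplified illustration and not the full BSIM expressions: the Vgst effective form below keeps only a softplus-type term, the Vds effective form is the commonly quoted square-root smoothing, and the function names, the slope factor n, the smoothing parameter delta and all default values are assumptions introduced here. Only the relation Vdsat = (Vgst effective + 2 × thermal voltage) / α is taken from the lecture.

```python
import math

V_THERMAL = 0.02585  # thermal voltage kT/q near room temperature, in volts

def vgst_eff(vgs, vt, n=1.5, v_thermal=V_THERMAL):
    """Smooth replacement for (Vgs - Vt): tends to Vgs - Vt well above
    threshold and decays exponentially (but stays positive) below it."""
    x = (vgs - vt) / (2.0 * n * v_thermal)
    if x > 30.0:  # ln(1 + e^x) equals x to within rounding here
        return vgs - vt
    return 2.0 * n * v_thermal * math.log1p(math.exp(x))

def vd_sat(vgs, vt, alpha=1.2, n=1.5, v_thermal=V_THERMAL):
    """Vdsat = (Vgst_eff + 2 * thermal voltage) / alpha, as stated in the lecture."""
    return (vgst_eff(vgs, vt, n, v_thermal) + 2.0 * v_thermal) / alpha

def vds_eff(vds, vdsat, delta=0.01):
    """Smooth replacement for Vds: follows Vds in the linear region and
    levels off just below Vdsat in saturation, with no kink in between."""
    a = vdsat - vds - delta
    return vdsat - 0.5 * (a + math.sqrt(a * a + 4.0 * delta * vdsat))

if __name__ == "__main__":
    for vgs in (0.2, 0.6, 1.0, 1.6):
        print(f"Vgs={vgs:.1f} V  Vgst_eff={vgst_eff(vgs, 0.6):.3e} V")
    for vds in (0.0, 0.2, 1.0, 3.0):
        print(f"Vds={vds:.1f} V  Vds_eff={vds_eff(vds, 1.0):.4f} V")
```

Substituting these two quantities for Vgs minus Vt and Vds in the square law expression gives a single equation valid across regions, which is the idea summarized in the recap; the particular functional forms used here are only one way of realizing it.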
Now these functions are used in the square law model, and you can generate the entire range of current-voltage curves, from subthreshold to super threshold and from linear to saturation, without any loss of continuity; these equations are C-infinity continuous. Then we discussed the final step of device modeling, namely parameter extraction. We took up the square law model and showed how to extract the threshold voltage, the parameter φt used in predicting the threshold voltage with source-to-bulk bias, the parameter Vfb and the body effect parameter γ. We also discussed how to extract the parameters ΔW and ΔL, which are process dependent and which are the differences between the mask width and the actual channel width, and between the mask length and the actual channel length, of a MOSFET. Then we discussed the extraction of the mobility parameters μn0, E0 and n, which are used in the formula for the dependence of mobility on the transverse electric field. Finally, we discussed the evolution of MOSFET models for circuit simulation from 1973 to 2006.