Welcome to this lecture on field programmable gate arrays in the course Digital System Design with PLDs and FPGAs. In the last lecture we started with field programmable gate arrays and contrasted the FPGA architecture with the CPLD. The CPLD has a central interconnecting switch between the logic blocks, whereas in an FPGA the switch is distributed, so the whole architecture is scalable. The second feature of the CPLD was very wide product terms, which are not required in practice and waste a lot of area, so in an FPGA the combinational circuit has fewer inputs. Essentially, these two things allow the FPGA to scale to quite a complex size.

We have seen the evolution from ASIC through standard cell to the field programmable gate array, the general architecture of an FPGA in a nutshell, and the different types of configuration technology. We have also seen some commercial devices from the major vendors; maybe those are the only vendors as far as FPGAs are concerned. Today we will choose a specific FPGA, the Xilinx FPGA, and look in detail at its architecture, which will enable you to understand even much more complex FPGAs from Xilinx as well as from other manufacturers, because most vendors use a somewhat similar architecture. Unlike CPLDs there are variations, but I will highlight them with examples wherever there is a drastic architectural variation. So we will quickly look at the last lecture's slides and then get on with today's lecture. Let us move on to the slides.
So we looked at the evolution from very low-level design to using higher-level blocks and interconnecting them, with the interconnection made in the foundry. In an FPGA the programmable interconnection is built into the array of logic resources, so that it can be programmed or configured in the field. The trade-off is in the NRE (non-recurring engineering) cost, which is very high for the ASIC, medium for the intermediate option on the slide, and low for the FPGA. So the ASIC works out for huge volumes, the intermediate option for middle volumes, and the FPGA for low volumes. The design turnaround time is likewise very high for the ASIC and low for the FPGA: one may run into years and the other into weeks, depending on the complexity.

Apart from that, one should not go overboard. The FPGA definitely has disadvantages: a complex FPGA can be quite big and costly, and one cannot expect an FPGA to appear in mass-produced items like cell phones. Nevertheless there is a definite role for the FPGA in every digital chip design, because almost all chips are implemented first in a field programmable gate array and then move to ASIC.

Though the name suggests an array of gates, that is not true. The FPGA is an array of logic resources, higher-level resources, with programmable interconnection. When we say logic resources, we know that we have to implement data paths and controllers, so the array has combinational circuits and flip-flops. For the combinational circuit most FPGAs use a lookup table as the element, some use multiplexers and some use gates; mainly I would say it is the Actel antifuse FPGAs that use multiplexers or gates.
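To make the lookup-table idea concrete, here is a minimal Python sketch of a 4-input lookup table used as a combinational element. It is only a behavioural model under stated assumptions: the helper names `make_lut` and `lut_read` and the example function are hypothetical, and no vendor's actual bit ordering is implied.

```python
def make_lut(func, n_inputs=4):
    """Precompute the truth table of `func` as a list of 2**n_inputs bits.

    This plays the role of the configuration bits stored in the lookup table.
    """
    table = []
    for index in range(2 ** n_inputs):
        # Input i of the function is bit i of the table address.
        bits = [(index >> i) & 1 for i in range(n_inputs)]
        table.append(func(*bits) & 1)
    return table


def lut_read(table, *inputs):
    """Evaluate the lookup table: the inputs simply form an address."""
    index = sum(bit << i for i, bit in enumerate(inputs))
    return table[index]


# Hypothetical example function: y = (a AND b) OR (c XOR d)
table = make_lut(lambda a, b, c, d: (a & b) | (c ^ d))
```

The point of the sketch is that any function of the four inputs costs the same 16 stored bits; only the table contents change.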
The programmable interconnection technologies are called SRAM (an unfortunate name), flash and antifuse. There are special resources like phase-locked loops, delay-locked loops, RAMs, FIFOs, memory controllers, network interfaces and processors, and many times, depending on the requirement, these are part of the silicon. It is the hard-wired blocks I am talking about here. So that is the field programmable gate array in a nutshell.

We have seen commercial FPGAs: Spartan and Virtex from Xilinx; Cyclone, Stratix and Arria from Altera; and from Actel the ProASIC Plus, the Axcelerator, the radiation-tolerant version of the Axcelerator, the SmartFusion with an ARM core, and things like that. And as I said, Xilinx has the Zynq with a dual-core ARM and Altera has the soft-core Nios processor.

This is the general structure of an FPGA: I/O pins around an array of logic blocks. Normally this array runs into tens or hundreds of blocks across each row and each column. Typical numbers for a small device may be 50 by 50 to 100 by 100, and it may go all the way to huge numbers. That shows that you cannot interconnect those blocks with a single switch. So wires are laid out in a regular pattern, with switches at the junctions. Here the left and the bottom are the inputs to the logic block from these wires, and the top and the right are the outputs. This is very general; I am not talking about a specific FPGA. Maybe when you take a specific FPGA the input is from the bottom and the output is from the right-hand side. I have not verified this against a chip layout, which may not be available in the data sheet. These diagrams are schematic, and the real chip may not look precisely like this, but nevertheless it resembles the implemented architecture.
So you can imagine there is an output coming out, and there is a switch here where the output can be connected to any of these wires. And you have a big switch here which allows the interconnection of the vertical wires, top and bottom, and the horizontal wires, left and right. All the junctions are switches which allow connection. So when the tool puts some hardware into the FPGA, it will be placed across multiple logic blocks, depending on the size of the circuit, and interconnected using these general wires.

There are wires which span a single logic block, wires which span maybe two logic blocks, four logic blocks, six, and wires which run from one end to the other. These lengths come from statistics: it depends on what is inside a logic block and how many blocks need to be interconnected, and how close they are. For a normal circuit, on an average, it might be that if you take a 4 by 4 matrix of blocks, most of the interconnections fall within it, and some go beyond that boundary, on average up to the fourth or sixth block, so wires of those lengths are useful. As I said, when you interconnect there will be a lot of switches programmed on the way. So the interconnect delay in an FPGA can be quite high compared to a CPLD, but the advantage of the FPGA is the complexity itself: a complex circuit can be implemented.

This shows a very detailed diagram where these are the switch blocks with horizontal and vertical wires. This wire could be connected here, here or here, and you can program all that; similarly, the output can be connected to any of the wires. There are 5 inputs going to the logic block and they can be connected to any of these wires. So there is a lot of programmability, and this is called a configurable logic block. Because the logic block itself is quite big (we will discuss why it is big), it needs a lot of configuration.
This shows a switch diagram, where a vertical wire is connected to a horizontal wire, this wire and this wire. At least one wire is connected to the left, right and bottom, and the diagram shows various types of connection. Each vendor uses different types of connection depending on the statistics. We need not be seriously worried about that, because our aim is to understand the FPGA and use it in our design rather than to design the FPGA, so I will skip it.

This next diagram is from the Xilinx data sheet. It is very old, but it highlights something we should not forget. There are logic blocks, which consist of combinational circuits and flip-flops, and there are interconnect wires, but underlying all of it is the configuration circuitry. There is a lot of configuration overhead in terms of addressing these switches and programming them on or off. Compared to a normal ASIC this makes the FPGA quite big, and maybe it dissipates more power.

So the raw speed of an FPGA may not be as high as, say, 1 GHz or 2 GHz. Maybe the serial transceivers go at high speed, because that is easy to design, but a general lookup-table speed may not be that high. Any reasonable circuit you place and route may not clock at 1 GHz; maybe 200 MHz or 300 MHz, and sometimes, if it is too complex or you are not floor planning it properly, it might even go below 100 MHz. But then it is not a processor. It is up to the designer to exploit the available resources with parallel computing, pipelining and so on, and implement a very high throughput design. That is how one achieves performance with an FPGA.
So an FPGA consists of I/O blocks, which have tri-state outputs and inputs and a synchronizing flip-flop, because the input can come from another chip or another clock domain. There is an array of configurable logic blocks, and there are horizontal and vertical wires with programmable switches in between. These wires are single length, double length, quad, hex and long lines; as I said, it depends on the statistics of how spread out the circuit or the connections are. There are resources available to the user in terms of logic blocks, memory, PLLs and all that, and there are resources for configuring the programmable switches in the interconnect structures and the logic blocks. We will see what there is to configure in a logic block.

This is where we stopped. The programming technologies are SRAM, flash and antifuse. We said that SRAM basically uses an NMOS transistor for interconnecting two wires: if the gate is 1 it makes the connection, and if the gate is 0 it cuts off the connection. At the gate there is a flip-flop which holds the status of the connection: if the flip-flop is set to 1 the transistor conducts, and if it is 0 it does not. That means all these flip-flops need to be programmed at power-on, because at power-on they have no specific state. It is too much of an overhead to address and program each one independently, so you do not program them serially; you program them in parallel.
So these flip-flops are organized as a static RAM of width 8 or 16 bits, and the vendor knows the positions and decides on the format of the configuration file. That is why it is called SRAM. When you say the programming technology is SRAM, there is no way an SRAM can interconnect something: the pass transistor is the interconnect technology, the flip-flops store the status of the gate, and the flip-flops are organized as static RAM for ease of programming. In the previous families of FPGAs it was 8 bits wide, and the current FPGAs use 16-bit-wide programming. I am describing the raw internal structure; there is also a way to program serially as far as the user is concerned, and we will see that.

Now look at this whole block. I hope you remember that these 4 are logic cells, these are the horizontal wires and these are the vertical wires. We are talking about the interconnection at these big switches and the output connecting to the vertical wire; that is just an explanation of the diagram. Now we are blowing up this one switch. We have a switch, which is a transistor, and a flip-flop. That is what is shown here: the gate of the switch is connected to a flip-flop, which is nothing but cross-connected inverters. You have to force this pair of inverters to 1 or 0, so there is a write transistor. Suppose you want to force this node to 0. You put a 1 here and enable the write transistor; this side is forced to 1, so the inverter output becomes 0, the 0 comes back here, it is latched, and the input can be removed.
That means you give a value, you give a pulse here, and it gets latched into the flip-flop, and the pass transistor is on or off depending on what you give. The only thing is that, as far as this structure is concerned, you give the opposite of what you want.

The important thing to remember is that the flip-flop is made of two inverters, and one inverter is made of two transistors, a PMOS to VDD and an NMOS to ground. So the two inverters are 2 plus 2, which is 4 transistors, the write transistor is the fifth and the pass transistor is the sixth, and the write transistor with its isolation can sometimes be big. So one switch means six transistors, and you can imagine it occupies quite a lot of area. Here 7 wires are shown per side, so 7 into 4 is 28 wires. Even if in practice one wire is connected to only about 3 others, in the extreme case where every wire can connect to every other wire there are 28C2, that is 28 into 27 by 2, possible connections, and each one costs 6 transistors. So these switches occupy quite a lot of area.

That is one disadvantage of the SRAM technology: the switches occupy a lot of area. And of course there is a delay, the delay of the channel, so each switch introduces a certain delay, and when you cascade multiple switches from source to destination it adds up. As I said, the whole thing is organized as SRAM cells of a certain width, hence it is called static RAM technology.

When it comes to flash, we have seen it with respect to PLDs and CPLDs: it is a normal MOS transistor with a floating gate. It conducts when it is not programmed, and it can be electrically programmed off or on. We have seen the basic working of it.
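The area arithmetic for the SRAM switch described above can be checked with a few lines of Python. This is only the upper-bound count from the lecture (every wire able to reach every other wire); the constant names are illustrative, and a real switch block implements far fewer connections.

```python
from math import comb

WIRES_PER_SIDE = 7          # wires shown on each side of the switch block
SIDES = 4                   # left, right, top, bottom
TRANSISTORS_PER_SWITCH = 6  # 2 inverters (4) + write transistor + pass transistor

total_wires = WIRES_PER_SIDE * SIDES                 # 7 x 4 wires
possible_connections = comb(total_wires, 2)          # 28C2 = 28 * 27 / 2
transistor_count = possible_connections * TRANSISTORS_PER_SWITCH

print(total_wires, possible_connections, transistor_count)
```

With these figures the fully connected case comes to 378 programmable connections and 2268 transistors for a single switch block, which is why the switches dominate the area.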
So you have an N-channel MOS transistor with an N-type source, an N-type drain and a P substrate. You have silicon oxide, a polysilicon floating gate, then again silicon oxide and the control gate. In the normal case you apply a positive voltage, the channel forms and it conducts. The trick is to trap some electrons on the floating gate to make the threshold voltage quite high. Then, if you apply the normal gate voltage, because electrons are sitting there you need a higher field to form the channel. So even if you drive the control gate with a logic high it does not conduct, and that is how it is turned off.

That is shown here: the source is grounded, the drain is given the supply voltage and a higher-voltage pulse is applied. The electrons tunnel and get in here, and then the normal gate voltage will not form the channel and the transistor will not conduct. When you want to erase it you do the opposite: you ground the gate, you apply a pulse here, and the electrons get out.

So one advantage of flash is that it can be programmed on or off electrically, and more than that, if the chip has a port for doing so, you can program it within the circuit. If somebody makes an FPGA or a CPLD with that technology, it can be updated on the board, at the destination. Nowadays there is more and more over-the-network update, sometimes called over-the-air update, where the firmware is updated through a wireless channel. With programmable hardware, not only the flash memory but the hardware itself can be updated remotely through the network. These technologies enable that.

The third one is called antifuse. You know that in a chip there are layers which implement the transistors and layers with the interconnecting wires.
You cannot have the wires in the same layer as the transistors, so they are on different layers, and the layer-to-layer connection is made through a hole (a via). Normally on one layer the wires run horizontally and on the next layer they run vertically, because that is easy to interconnect and also the coupling, the capacitance between the tracks, is less. So assume this is a horizontal wire and this is a vertical wire: normally, at the cross point, a hole is made and it is made conducting.

In an antifuse FPGA the connection is made through this hole, but initially the hole is not conducting. A special fuse material is deposited in it. The slide shows a cross section: this is the horizontal wire on the top layer, and this is a vertical wire coming out of the screen towards you. You apply a high voltage (when I say high voltage, something nearing 10 volts), the material fuses and makes a connection which is permanent and cannot be erased.

That has certain advantages. Noise will not reverse the status of a connection once it is programmed. So this technology is used mostly in space applications, because anything sent to outer space is exposed to a lot of radiation, which can flip the flip-flops and cause problems. The antifuse is also somewhat resistant to reverse engineering: if somebody tries to peel the chip and photograph the layers, one cannot make out whether a hole is programmed on or off.
So that is an additional protection from hardware piracy. In SRAM FPGAs, you know that the status of these flip-flops has to be stored somewhere and programmed at power-on, and if somebody can read that memory the design can be pirated. Nowadays, though, there are encryption technologies available to protect the configuration file stored in the memory.

The summary is shown here. Flash is non-volatile: once you program it, it retains the state till you erase it. It is reprogrammable in circuit, the delay is quite large and the area is medium. SRAM is volatile, which means that at power-on it has to be reprogrammed. Flash has some limitation on the number of times you can program it. That is not a serious limitation as far as FPGAs are concerned, because when you design with an FPGA you do not program it 1000 times a day: it takes time to design, and designers do a lot of simulation and verify everything before going to program the chip. Still, SRAM has no such limitation on the number of times it can be programmed. So it is very good for prototyping, and, coming from an academic institution, I would say it is very good to give to students: they can program and reprogram, and you can use it for 3 or 4 years without trouble. SRAM is also in-circuit reprogrammable; the delay is large and the area is large, as we have seen.

Antifuse is non-volatile and one-time programmable; it is not reprogrammable.
But you see the delay is small, because the connection is not through an active circuit like a transistor; it is a fused connection. And there is no extra area, because the interconnection of the wires anyway requires the holes, which are called vias, and the deposit sits in those vias. Definitely, to apply the programming voltage you need transistors to isolate that voltage, so there have to be write transistors, but still it does not occupy much area.

Now this programming technology has some effect on the size of the logic block. This is an important point, though since most of the complex FPGAs nowadays are SRAM it tends to be forgotten. When you take an antifuse-based FPGA, since the area of the interconnection is very small, the vendor is able to make the logic block small.

This has to do with fragmentation, as you would have seen in a hard disk, where there are sectors. Nowadays you have gigabytes or terabytes of space on the hard disk, but there are basic units: the hard disk does not allocate storage byte by byte, it allocates sector by sector. Earlier the sectors were 512 bytes, but now they are 4K or 8K, depending on the total size of the hard disk. The advantage of choosing a small size is that less space is wasted. Suppose the sector size is 4K and, on an average, the files are smaller than 4K; suppose 60% of the files stored on the disk are less than 4K. Then a lot of area is wasted. We can say there is fragmentation, or unused area, in each sector. So choosing the sector size depends on the average size of the files. It would be ideal to choose it small enough that nothing is wasted.
If the sector is 512 bytes, the maximum waste is about 512 bytes, for a file which is very small. In the case of 4K, almost the whole 4K can be wasted, for a very small file or for a file which is just a little above 4K. That is the fragmentation I am talking about.

Coming back to the slide, the same argument applies here. In a fine-grain FPGA the fragmentation is less, and that is possible because the interconnection area is small. But with an SRAM FPGA the interconnection occupies a lot of area. If you make the logic block very small to avoid wastage within the logic block, then the switches will be much bigger than the logic block. So what these FPGA vendors do is make the logic block quite big, because interconnection is costly. They make sure that quite a lot is packed into the logic block, but then there is a great danger that only part of it is used. This kind of FPGA is called a coarse-grain FPGA, and the other type is called fine-grain.

In a coarse-grain FPGA the vendors make sure that the logic blocks are very flexible, so that everything can be used: if you are using one part of a logic block, the other part can still be used. We have seen the CPLD. In a CPLD, or in a PLD like the 22V10, the output of the AND-OR section goes to a flip-flop. You can bypass the flip-flop, no problem, but if you bypass it that flip-flop cannot be used separately, so it is wasted. We will soon see that in an FPGA the vendors make sure that if the combinational logic alone is used in a particular logic block, the flip-flop can be used separately for something else.
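The sector-size analogy used above for fragmentation can be put into numbers with a small Python sketch. The file sizes are made up purely for illustration; only the two sector sizes (512 bytes and 4K) come from the lecture, and the function names are hypothetical.

```python
def wasted_bytes(file_size, sector_size):
    """Unused bytes in the last sector allocated to one file."""
    sectors = -(-file_size // sector_size)  # ceiling division
    return sectors * sector_size - file_size


def total_waste(file_sizes, sector_size):
    """Total unused space over a set of files for a given sector size."""
    return sum(wasted_bytes(size, sector_size) for size in file_sizes)


# Hypothetical file sizes in bytes: mostly small files, one just above 4K.
files = [100, 600, 4100, 5000]

waste_small_sectors = total_waste(files, 512)
waste_large_sectors = total_waste(files, 4096)
print(waste_small_sectors, waste_large_sectors)
```

In the same way, a large logic block wastes more when a design uses only part of it, which is why coarse-grain vendors work to make every part of the block independently usable.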
These kinds of features are available, and that is how the vendors get around the disadvantage of the large size of the static RAM switch. I can say they are mostly successful in doing that; they are not bogged down by it. So though there is this division, there is no great disadvantage with the coarse-grain architecture. That is what I want to highlight.

I will just show a picture. This is a coarse-grain FPGA, basically a Xilinx FPGA. You can see the configurable logic block. There are two identical parts, which they call slices, and within a slice there are two identical blocks, each with a four-input lookup table and a flip-flop. So a logic block has four lookup tables and four flip-flops, and each lookup table has four inputs, which allows you to implement any function of four variables. So this is quite huge; it is coarse grain, and we will see it in detail later.

This next one is taken from the Actel data sheet, the 54SX FPGA. You see that the essential combinational circuit is a 4-to-1 multiplexer. You know that a multiplexer is nothing but an AND-OR circuit, and it is possible to implement a Boolean function with it. Say you connect a variable a here and b here on the select lines; then the data lines represent a bar b bar, a bar b, a b bar and a b. If you want a b bar or a bar b, you choose 0 1 1 0 on the data inputs and you get that implementation. That is very simple, two variables. But in this case there is also an AND gate with two inputs and an OR gate with two inputs on the select lines. Normally you connect one input of such a gate to a constant (logic 1 for the AND gate) and use the other for the variable.
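The multiplexer trick just described can be sketched in Python as follows. This models only a plain 4-to-1 multiplexer, not the Actel cell with its extra AND and OR gates; the function names are illustrative, and the three-variable example (a majority function) is my own choice of function rather than one from the slide.

```python
def mux4(d0, d1, d2, d3, s1, s0):
    """4-to-1 multiplexer: the select lines (s1, s0) choose one data input."""
    return (d0, d1, d2, d3)[(s1 << 1) | s0]


def xor_via_mux(a, b):
    # Data inputs correspond to a'b', a'b, ab', ab.
    # Choosing 0, 1, 1, 0 keeps the terms a'b and ab', which is a XOR b.
    return mux4(0, 1, 1, 0, a, b)


def majority_via_mux(a, b, c):
    # Feeding a third variable into the data inputs instead of constants
    # gives a three-variable function: here, the majority of a, b, c.
    return mux4(0, c, c, 1, a, b)
```

For example, `xor_via_mux(1, 0)` selects the third data input, which is tied to 1.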
But in some cases, if you give a, b, c, d to those gates, some combinations of product terms of 5 variables are possible. Also remember that if you connect a and b to the select lines and, instead of giving 1 or 0 on the data inputs, you connect a third variable there, like c, c bar and so on, you can make a 3-variable implementation. So with one variable here and 4 there, maybe some combinations of 5 variables can be implemented; otherwise any 3-variable function can be done. So that is a fine-grain logic cell of an FPGA.

Now the design methodology. We discussed this during the VHDL lectures, but in case you have not gone through those lectures and are looking only at the FPGA part, we will illustrate it again. Currently you start with a hardware description language like VHDL or Verilog and describe your circuit in it. After that you simulate; this is called functional simulation, and sometimes people call it behavioral simulation. These are just names; some people say behavioral simulation is functional simulation plus timing details. Nevertheless, by functional simulation I mean that the basic code is simulated. It is not the circuit you are simulating: when you say y is assigned a and b, you are not simulating the AND gate, you are just simulating the AND function. So there is no circuit here, only the basic source code. It is very fast, so if you make syntax errors or logical errors they can be quickly corrected at this point, before generating the circuit.

Then you go to a step called synthesis, wherein the source code you have written is synthesized into a netlist of logic gates and flip-flops. Maybe it is a Boolean equation in the case of a combinational circuit, but a logic circuit is generated out of your description.
Now if the designer is a novice, he or she can go for a logic simulation. That is different from the functional simulation, because here you are simulating the logic generated by the synthesis tool, whereas there you are simulating the code you have written in human-readable form. Mind you, at neither point are there any delays, because the design is not yet implemented in a device. So if you give a and b to an AND gate at 10 nanoseconds, the output y also comes out at 10 nanoseconds. There is no logic delay or interconnect delay. In one case you are simulating the code; in the other you are simulating the circuit without delay.

If you find errors in this simulation, if there are synthesis errors, you can go back and iterate; so you iterate here and you iterate there. But an expert designer, who knows how to write the proper VHDL code for a particular circuit so that the synthesis gives what is wanted, as I taught you in the VHDL lectures, does not really require this logic simulation.

Then you go to a step called place and route. Going back to this diagram: the synthesis tool generates a logic circuit, and the FPGA has an array of logic blocks.
These are identical blocks in an array, and the synthesized circuit has to be placed into the logic blocks and interconnected. That is not a very easy task. If there are something like 100,000 logic blocks, any part of the synthesized circuit could be placed anywhere, and if there is no constraint the number of possibilities is too high. Placing the circuit in the logic blocks is the placement, and interconnecting it is called routing. So that is place and route.

At that point, since the FPGA is a general-purpose circuit, your input and output signals can be assigned to any I/O pin. So you can specify I/O constraints, saying please map this particular signal to such and such a pin. One advantage of the FPGA is that even before designing the chip you can start the PCB design: you assign the pins of the chip to particular signals and go ahead with the PCB design while you are designing the chip. Ultimately some printed circuit board has to be used to mount the FPGA, so that work can go on in parallel. So the I/O pin assignment is specified in the constraints.

Another thing which can be specified relates to placement, which, as I said, is a very critical and complex thing to do. Suppose you want some performance: you design a counter with flip-flops and next-state logic, and you plan to clock the counter at, say, 300 MHz. Consider what happens if the counter flip-flops are placed far apart in the chip. Let us come back to the chip diagram here.
Suppose a part of the counter is here and another part is there. For the counter to work they have to be interconnected, and that incurs a lot of delay. It is better if they are placed close together, so that the interconnection does not suffer a lot of delay and you get a faster counter.

Such constraints, called timing constraints, can be specified at this point. They say that for a combinational circuit, from an input to a particular output, the delay should be less than a certain time, like 5 nanoseconds for 200 MHz. Or you can say that from one register, through a combinational circuit, to another register, including all the delays (the clock-to-output delay tcq, the combinational delay, the wire delay, the switch delay and the setup time), everything should be less than 5 nanoseconds. Then the place and route tool will make sure that the circuits concerned are placed close together, that the number of interconnecting switches is small, and so on, and it can iterate over that.

So you have the timing constraints and the I/O constraints, and the place and route tool, which is called the PAR tool, does the placement and routing. In the case of PLDs this step is called fitting. That is just a term; in principle you could use the term place and route there also, but in a PLD there is no great variation, there is a single switch between the logic blocks, so how to fit the given circuit into the logic blocks is the main concern. That is why it is called fitting. Once place and route is done, all the delays are known to the tool: all the combinational circuit delays, wire delays, register delays and I/O delays.
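The register-to-register constraint described above is just a sum of delays compared against the clock period. A minimal Python sketch, with made-up delay values (only the 5 ns / 200 MHz figure comes from the lecture, and the function names are hypothetical):

```python
def register_to_register_delay_ns(t_cq, t_logic, t_wire, t_switch, t_setup):
    """Total delay of one register-to-register path, in nanoseconds."""
    return t_cq + t_logic + t_wire + t_switch + t_setup


def clock_period_ns(clock_mhz):
    """Clock period in nanoseconds for a frequency given in MHz."""
    return 1000.0 / clock_mhz


def meets_constraint(path_delay_ns, clock_mhz):
    """True if the path fits within one clock period."""
    return path_delay_ns <= clock_period_ns(clock_mhz)


# Hypothetical delays in nanoseconds, for illustration only.
delay = register_to_register_delay_ns(
    t_cq=0.5, t_logic=1.5, t_wire=1.0, t_switch=1.2, t_setup=0.4
)
print(delay, meets_constraint(delay, 200), meets_constraint(delay, 300))
```

With these assumed numbers the path fits the 5 ns budget for 200 MHz but not the shorter period needed for 300 MHz, which is the situation where placing the logic closer together (smaller wire and switch delay) helps.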
So a timing model is generated. Now you can simulate your synthesized, placed and routed circuit with time delays: you give an input and the output comes with a delay. That is timing simulation. You should know that these are different. This is the original VHDL code you have written, on which you do the functional simulation. This is the synthesized code, which itself may be written in VHDL, and which is simulated. So the language may be VHDL, but it is not the code you have written. And when it comes here it can still be VHDL code, though many times Verilog is used, for certain reasons, for the timing model with the time delays. So it can be in the same language, but it is totally different code.
So that is timing simulation, but it takes a long time. One thing to understand is that when you give the input, the simulator has to calculate all the time delays, and each one is composed of a lot of individual delays in between; it is not one gross number. The I/O delay, the interconnect delay, the wire delays, the lookup table delay and the delays of the various combinational circuit components all add up. So there is quite a lot of computation to do, and timing simulation can be very time consuming. For a complex circuit it might take a day to simulate. So at the beginning, when you do the first iteration of place and route, if you go for timing simulation, take 6 hours to simulate, find a small error, and then come back and correct it, that is a very hard way to work. So there is something called static timing analysis.
Timing simulation is dynamic, in the sense that you give the input and literally make the circuit work from input to output. Here, instead, the tool looks at the block delays and interconnect delays and estimates the time from input to output for a combinational circuit, or from register to register for a data path or sequential circuit. But the problem is that no inputs are specified, so the state machines or the controllers are not active. What is available is a set of registers, a combinational circuit and another set of registers, and the tool adds up the delay along all possible paths and reports it. Now that can be mistaken: it might report a path from one register to another register that is never used in the real circuit. You know that the controller is the one controlling the data path. Maybe there are 4 source registers, some combinational circuit and, say, 5 destination registers. The real operation of the circuit involves something close to a one-to-one mapping: the result from one register goes to one destination register, and maybe one of the source registers gives its result to 2 destination registers, sometimes to one and sometimes to the other. But if you give this to a static timing analysis tool, it does not know all that, because it has no idea of the status of the control signals. So what it assumes is 4 sources and 5 destinations, a total of 20 possible paths, where in real life there are only 5, and it reports the timing from source to destination for all 20 paths. As a designer you might find that 6 of them violate the timing constraints you have put. But none of them may be used in the real circuit, so you can be misled in this game. So it is very important that when one does static timing analysis, this information is given to the tool, saying which paths are valid.
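The 4-source, 5-destination example can be sketched in a few lines. This is a toy illustration under assumed values, not how any real timing tool works: the register names, the set of valid paths and the delay numbers are all hypothetical. It shows the tool's view (every source to every destination) against the designer's view (only the paths the controller actually exercises).

```python
from itertools import product

sources = ["S0", "S1", "S2", "S3"]
destinations = ["D0", "D1", "D2", "D3", "D4"]

# What a static timing analysis assumes without extra information:
# every source register can reach every destination register.
all_paths = set(product(sources, destinations))

# Hypothetical set of paths the controller really uses: three one-to-one
# transfers, plus one source that feeds two destinations at different times.
valid_paths = {("S0", "D0"), ("S1", "D1"), ("S2", "D2"), ("S3", "D3"), ("S3", "D4")}

# Hypothetical delays in nanoseconds: 4.0 everywhere except six slow paths,
# none of which is ever exercised by the controller.
slow_paths = {("S0", "D3"), ("S0", "D4"), ("S1", "D3"),
              ("S1", "D4"), ("S2", "D4"), ("S2", "D0")}
delays = {p: (6.0 if p in slow_paths else 4.0) for p in all_paths}


def report_violations(paths, delays, limit_ns):
    """Return the paths whose delay exceeds the timing constraint."""
    return sorted(p for p in paths if delays[p] > limit_ns)


raw_report = report_violations(all_paths, delays, limit_ns=5.0)
filtered_report = report_violations(all_paths & valid_paths, delays, limit_ns=5.0)

print(len(all_paths), "paths analysed,", len(raw_report), "violations reported")
print(len(valid_paths), "valid paths,", len(filtered_report), "real violations")
```

Restricting the report to the valid paths is the toy equivalent of telling the tool which paths matter; real tools take this information as constraints rather than as a filter on the report.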
There could be another situation, where there is a source register and a destination register, and the destination is not clocked on every clock. The source gets its input data, but the result gets latched only after 2 clock cycles. That is done through the enable of the destination register, driven by a state machine or a controller, which the timing analysis tool has no way to know. It will assume that all the registers are working all the time, on every clock edge. So it will overestimate how critical that path is, and it may report errors that are not real. So this has to be told to the timing analysis tool; then the analysis will be very fair.
So after place and route one does the static timing analysis. If the timing is not met, you iterate back. Maybe you go back and change the constraints, or you change some synthesis options, whichever is the correct option, or you even go back and modify the circuit. Once everything is done you do the timing simulation, and again you may go back, up to here or up to here, and iterate. When everything is done and the designer is satisfied, a configuration file is generated with which you can program the chip.
So that is the typical tool flow. There can be more tools, more advanced tools, but the bare minimum required for FPGA design is: an editor, which can be internal or external; a synthesis tool; a place and route tool or a fitting tool; a programming tool; a constraint editor; a simulator that can do functional simulation, logic simulation and timing simulation (normally any simulator supports all three); and a static timing analysis tool. Many times the vendors give everything put together as a single integrated design environment, an IDE. Xilinx calls theirs ISE, an integrated environment of this kind. And for some of these tools there are vendors who give a very good synthesis tool alone.
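Going back to the destination register that is enabled only every second clock: timing tools commonly call this a multi-cycle path. A minimal sketch of the arithmetic follows, with hypothetical function names and numbers; it only shows why the same path looks like a violation when analysed as single-cycle and passes once the tool is told it has two cycles.

```python
def required_time_ns(clock_period_ns: float, cycles: int = 1) -> float:
    """Time available to the path: one period per allowed clock cycle."""
    return clock_period_ns * cycles


def slack_ns(path_delay_ns: float, clock_period_ns: float, cycles: int = 1) -> float:
    """Positive slack means the path meets timing; negative means a violation."""
    return required_time_ns(clock_period_ns, cycles) - path_delay_ns


# Hypothetical numbers: a 200 MHz clock (5 ns period) and an 8 ns path.
period = 5.0
path_delay = 8.0

default_slack = slack_ns(path_delay, period)               # tool assumes every clock edge
two_cycle_slack = slack_ns(path_delay, period, cycles=2)   # designer says: latched after 2 cycles

print("assumed single-cycle slack:", default_slack)
print("declared two-cycle slack:  ", two_cycle_slack)
```

The path delay itself does not change; only the requirement it is compared against does, which is why this information has to come from the designer.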
And the FPGA vendors allow you to use such a synthesis tool along with their own tools: you can replace the synthesis tool supplied by the vendor with a third-party synthesis tool, and so on. But normally the place and route is done with the vendor tool, because the vendor knows the internal details of the FPGA, which are very much required for doing it properly. Many times it is not wise for them to give out these details, because people can make out what is really inside. We have the data sheet, but not all the details are exposed, in order to protect the intellectual property.
So that is the design methodology. Now let us look at the commercial tools, just to give a flavor. This is being recorded in January 2014. I am not sure that if you listen to this lecture 2 years later these tools will remain; the companies, the vendors, are given in brackets, and maybe one gets acquired by somebody else. I have no guarantee. But as of January 2014 you have the ModelSim simulator from Mentor Graphics, which is a very good simulator. Similarly you have the Active-HDL simulator from Aldec. These are good simulators, but the vendors have their own simulators too. For synthesis you have Synplify Pro from Synopsys, a very good synthesis tool, and Precision Synthesis from Mentor Graphics. When it comes to the vendors, Xilinx ISE has everything: synthesis, simulation, PAR, programming, static timing analysis, constraint editor, power analysis, floor planning, you name it, everything is there. The ISE tool works up to the chips of the Spartan-6 and Virtex-6 families, but for Virtex-7 they have a different, much better tool called Xilinx Vivado. It is a very good tool, a quantum leap in the whole tool technology. ISE is quite old, and Vivado is quite new and good, but currently it supports only the Virtex-7 series. Maybe as the years go by ISE will disappear and Vivado alone will be there.
And Altera has Quartus II, which once again has everything; whatever is there in the Xilinx tools is there too. Actel Libero is again the same: they have everything required for FPGA design with their family of FPGAs. Now if you are a novice designer starting with small projects, then what you need is only this. But if you are doing very complex things and you want very high performance, a very tight fit, a reduction in power, or a refinement in the area of the implementation, then you need to use these kinds of tools. And these vendor tools are improving. Compared to 5 years back, the current tools are really good, and you probably do not need to use a third-party tool. But that is up to you to judge. What you do is evaluate: if you feel that you are not getting the performance you want, and you have done everything under the sun to get it, but you need a little more, then you can probably move from the vendor tool to a third-party synthesis tool to get that performance out of it. And of course, coming to the other tools, you have the major VLSI design tool companies, Cadence, Synopsys and Mentor; these are basically the synthesis tools and the simulators, very good simulators. But then you definitely have to do the place and route with the vendor tool, which these also support.
I thought of going further, but a good introduction gives you a good grip on the subject, so it is good to stop here. We will see the internal details of the Xilinx Virtex FPGA; I am taking that as an example.
So what we have seen today is the programmable interconnect technology: SRAM, flash and antifuse, their features, and their advantages and disadvantages. We have seen how this forces a coarse-grain or a fine-grain architecture, and how the coarse-grain vendors make sure that nothing is wasted. And we have looked at the design methodology, the steps of the tool flow in designing with FPGAs, and we have seen some commercial tools. Now I think I have touched upon all the introductory material needed to get a very good grip. We will go into detail with one architecture, the Virtex architecture, which is not used at all now, but it is a very good starting point: once you understand it, any other complex FPGA architecture can be understood very nicely. So we will go very much into the details of the Virtex FPGA architecture. On your part, you can go through my slides, and you can also read the vendors' data sheets and application notes; there is a lot of material available. Textbooks are maybe not the ideal place to learn FPGAs from; do not waste your time on most of the textbooks. Frankly, I have not followed them, but you can use a lot of the data sheets and so on. So we will see the details of the FPGA in the next lecture. I wish you all the best, and thank you.