Hello friends, welcome to the 20th session of ARM based development. From this lecture onwards we will look at things that sit outside the processor. Today we start with the coprocessor; after that I will give an introduction to the floating point format, followed by the vector floating point processor. The units we have covered so far dealt with what is inside the ARM processor core. We talked about a system, an SoC, built around an ARM based processor, and what we saw was what happens inside the ARM core: how it executes the different instructions, how it talks to the different kinds of memory, how it reads instructions from there and how it writes results back. So at most we covered the instructions related to memory accesses and whatever is executed inside the core. From now on we are going to look around the ARM core. We have completed the instruction set, including the Thumb state and the ARM state, and the handling of interrupts, where peripherals are connected and we touched upon how the ARM processor handles their interrupts. Now we look at the processors, or cores, that sit outside the ARM core, and the first one is the coprocessor. The name itself says what it is: it is not the main processor, it is an add-on to the ARM processor. It is also inside the SoC, alongside the ARM core, the memory, perhaps a cache, and the ARM specific buses, which we will talk about later and which connect the different peripherals and the coprocessor.
Now our interest is in expanding the functionality of the ARM core beyond what it supports. You know that ARM supports two's complement integers, both signed and unsigned. What if the ARM SoC you are using is meant for a scientific application that needs floating point operations? I will talk about floating point arithmetic as well as the format later. If we want to expand the functionality of ARM, we could do it by adding those features to the ARM core itself, or by putting the application specific features into a separate coprocessor that runs along with the ARM core. You may wonder why we do it this way. In a system design we may not always need a coprocessor; we might be happy with only an ARM core along with the peripherals and the memory. If all the functionality of a coprocessor, such as floating point support, were built into the ARM core, it would unnecessarily increase the area, the power consumption and the complexity of the chip. We do not want a function inside the processor where it is not required. That is the reason why specialized functions, such as DSP (digital signal processing), floating point processing or network processing, which are specific to the application the ARM based system is going to be used for, are kept out of the core. The system designers may decide to have coprocessors for those functions and add them alongside the ARM core. This effectively enhances the functionality of ARM and at the same time preserves the simplicity of a single core. That is the intent of going in for a coprocessor.
In this lecture we will talk about the ARM coprocessor interface. In the next lecture I will talk about the instructions related to the coprocessor, then I will explain the floating point format, followed by a lecture on what is inside a vector floating point processor. So we start with how a coprocessor is interfaced with ARM, then the instructions ARM supports for it, and then how it is implemented to support floating point arithmetic. This will give you a complete picture of what a coprocessor is, what it is used for and how it is implemented along with the ARM processor. These are the topics for today: why we need a coprocessor, how ARM and the coprocessors are connected to each other, what their interface is and what signals they exchange, and then pipelining in ARM, how that relates to the coprocessor and whether the coprocessor has a pipeline of its own. I told you why we need a coprocessor: to enhance functionality. Additional, specialized instructions can be added using coprocessors, for example for DSP or floating point operations. Now, when I say we are adding more instructions: you know that ARM has a 32 bit wide instruction set. How many instructions could there be using those 32 bit combinations? There could be 2 to the power 32 instructions. Does ARM have that many instructions? It does not.
It has a limited set of instructions. Off the top of my head I do not recall how many are supported in ARM7, maybe 30 or 40, with many variants because the condition codes may differ and an instruction may be used with the S option or without it. So multiple combinations are possible, but even if you count all of them it is nowhere close to the number of options available with 32 bit wide instructions. You know the ARM opcode, and we have talked about how every instruction divides the 32 bit word into multiple fields and how each field is used. The bit combinations that are not used by ARM, the gaps in the ARM instruction set, are used for the coprocessor instructions. So the bit patterns of coprocessor instructions are different from the ones ARM uses for its own instructions. This is what we are going to use for supporting different functionality. In this talk we are not going to look inside a coprocessor instruction; the only thing you should remember is that the coprocessor instruction encodings are different from what ARM has reserved for its own instructions. That is enough for now. Then we will see how this is supported in terms of connecting a coprocessor to ARM, and what signals are exchanged between the ARM core and the coprocessor.
Then we will talk about which instructions are supported for coprocessors and how different coprocessors can use them to support their different functions. One thing we have to remember is that coprocessor instructions are also 32 bits wide. You may then wonder whether there is a Thumb state for a coprocessor, because we know that the ARM core has a Thumb state, which means a 16 bit wide instruction format. If ARM is in Thumb state it executes 16 bit instructions. Does the coprocessor also have a Thumb state? It does not. So if the ARM processor is in Thumb state we cannot use any coprocessor instruction. Remember that coprocessor instructions are all 32 bits wide, so a coprocessor instruction can appear only when ARM is in ARM state and not in Thumb state. Let me give you an example. Suppose you have a typical ADDS r1, r2, r3. Is this a coprocessor instruction or a normal ARM instruction? It is a normal ARM instruction, with three operands and the S option, and in ARM state it is a 32 bit instruction. Assume it sits at some location in memory. Next we write another instruction, say FADD, a floating point add, on some floating point registers, say rf1 and rf2. I am only giving you some syntax here; it is not specific to any particular coprocessor and could be different for different coprocessors. That is followed by, say, a SUB r1, r2. So you are writing an assembly program like this, and every instruction takes 4 bytes.
So, if the ADDS is at address 100, the FADD will be at 104 and the SUB at 108. The FADD that I have written here is specific to a coprocessor, maybe a floating point coprocessor; its exact format depends on the particular coprocessor, and we are not bothered about that, but it is not an ARM instruction. What I am explaining is how a coprocessor instruction gets embedded among typical ARM instructions. Assume we have an SoC with an ARM core and one coprocessor, say a floating point coprocessor. You can write code by embedding floating point instructions in between the ARM instructions. They are all 32 bits wide, both the ARM instructions and the floating point instructions, which happen to be coprocessor instructions. They all come through the same instruction stream and get into the pipeline of the ARM core as well as that of the coprocessor, in our case the floating point processor, which is always watching the data bus and stays in sync with the ARM core. When the ARM processor decodes the FADD, it realizes that this is a floating point operation that it cannot execute itself. If there is a coprocessor that supports this instruction, that coprocessor may pick it up. So when the instruction comes into the execute stage, ARM checks whether a coprocessor is ready to take it. If there is a floating point coprocessor in the system, it has also been reading every instruction that the ARM core reads.
So it knows that there is an instruction meant for it and it will execute it, but only when that instruction reaches the execute stage of the ARM processor. There are two pipelines: one inside the floating point processor and one inside ARM. All the instructions flow through both pipelines. If the coprocessor instruction that ARM encounters is supported by a coprocessor in the system, that coprocessor picks up the instruction, starts executing it and informs ARM that it will take care of it. This is the kind of handshake we will cover now; I just want to give you an overview of how the whole thing works. ARM instructions are there, and floating point instructions are embedded in between them. They also occupy 32 bits, but they follow a different format from typical ARM instructions. The ARM processor knows that it is not its own instruction, so it looks for any takers in the system. There could be a floating point coprocessor and there could also be a DSP coprocessor. You may have a floating point instruction and then, say, a filter instruction on some registers. The first will be picked up by the floating point processor and the filter instruction by the DSP processor.
Now you may wonder how this works. Instruction fetching is done by the ARM core, but whatever is fetched is also seen by the coprocessors, which watch the data bus. They have a pipeline similar to the ARM core's three stage pipeline, and they keep filling it with the instructions they see while ARM is fetching them from memory. Typically the coprocessors run on the same main clock as the ARM core and stay in sync with the ARM pipeline. Have this background in mind; it will help you understand the rest of the topic. Let us go into the details. Coprocessors are separate processing units that are tightly coupled to the ARM processor. A coprocessor could be a DSP processor, a floating point processor, a network processor, anything. What does a typical coprocessor contain? It contains an instruction pipeline that follows the ARM pipeline in lock step, which means that on every clock the pipeline in the coprocessor does the same thing as the one in the ARM core. ARM executes its own instructions, of course. What does the coprocessor do with an ARM instruction? It ignores it. Similarly, when ARM encounters a coprocessor instruction, it knows that it cannot execute it itself. When that coprocessor instruction reaches the execute stage, ARM asks whether any coprocessor is willing to take it and, if one is, lets that coprocessor execute it while ARM itself ignores it. So there is one instruction pipeline in ARM and one in each of the coprocessors, and they all advance in lock step.
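To make the lock step idea concrete, here is a minimal sketch of two three stage pipelines fed from the same instruction stream. It is only an illustration under simplifying assumptions: one instruction is fetched per clock, nothing stalls, and the names `PROGRAM` and `run_lockstep` as well as the FADD syntax are made up for this example.

```python
from collections import deque

# Hypothetical program: (address, mnemonic, owner). "FADD" stands in for a
# coprocessor instruction; its syntax is invented for illustration only.
PROGRAM = [
    (100, "ADDS r1, r2, r3", "arm"),
    (104, "FADD rf1, rf2", "cp"),
    (108, "SUB r1, r1, r2", "arm"),
]


def run_lockstep(program):
    """Clock an ARM pipeline and a coprocessor pipeline together.

    Both pipelines are [fetch, decode, execute] and receive the same word
    on every clock. Returns (cycle, executing unit, address, mnemonic)
    for each instruction that reaches the execute stage.
    """
    arm_pipe = [None, None, None]
    cp_pipe = [None, None, None]
    stream = deque(program)
    log = []
    cycle = 0
    while stream or any(slot is not None for slot in arm_pipe[:2]):
        cycle += 1
        fetched = stream.popleft() if stream else None
        # Same clock, same data bus: both pipelines shift identically.
        arm_pipe = [fetched] + arm_pipe[:2]
        cp_pipe = [fetched] + cp_pipe[:2]
        assert arm_pipe == cp_pipe  # the coprocessor tracks ARM exactly
        in_execute = arm_pipe[2]
        if in_execute is not None:
            address, mnemonic, owner = in_execute
            unit = "ARM" if owner == "arm" else "coprocessor"
            log.append((cycle, unit, address, mnemonic))
    return log


if __name__ == "__main__":
    for record in run_lockstep(PROGRAM):
        print(record)
```

Each instruction is executed by exactly one unit, but both pipelines carry every instruction, which is why the coprocessor always knows what ARM has in its execute stage.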
A coprocessor has its own instruction decoding logic. It can understand its own instructions but not ARM instructions; it knows just enough to recognize an ARM instruction and ignore it. It also has handshake logic to communicate with ARM. By handshake logic I mean a set of signals that it uses to talk to the ARM core; I will tell you which signals these are. The coprocessor also has its own register bank and special processing logic with its own datapath. We have seen ARM's datapath: how an instruction goes through the stages, how the registers are accessed and how many register ports there are. Similarly, there is a separate datapath inside the coprocessor, and only coprocessor instructions can use it to work with the coprocessor's own registers. That is a separate job, mostly a specialized operation. This gives you a flavor of how a coprocessor behaves. A coprocessor is an add-on that sits outside the ARM core, in the sense that the SoC contains the ARM core and the coprocessor, and I will tell you how they are connected to each other. To summarize, the coprocessor has its own instruction pipeline and decoding logic, it talks to the ARM core using some signals that ARM also watches, and it has its own register bank and its own datapath. Keep this in the back of your mind; only then will you be able to appreciate how a coprocessor works.
You will understand how all of this comes together to make the coprocessor function properly when we go into the details of the signals. A coprocessor is connected to the same data and control buses as ARM. Please remember that it is connected to the data and control buses and not to the address bus. That is very important. Why is the address bus not connected? Because address generation and instruction fetching are handled by ARM. But while ARM is fetching an instruction, the instruction is also read by the coprocessors. ARM issues a memory read to fetch the instruction; the address goes out on the address bus and the instruction comes back on the data bus, which is also connected to the coprocessors. There are also control signals between ARM and the coprocessors. The coprocessor continuously monitors the transactions on the bus, knows when instructions are being fetched and takes those instructions in as well. Each instruction goes through the coprocessor's pipeline while the same instruction goes through ARM's pipeline. At the decode stage both ARM and the coprocessor realize that a particular instruction is meant for the coprocessor and not for ARM. When that instruction reaches the execute stage, the coprocessor informs ARM that it is taking the instruction, executes it, and ARM ignores it. What does ARM do then? It goes on with the next instruction.
ARM ignores this special instruction; it does not even know whether it is going to take 2 cycles or 10 cycles or 50 cycles. That is implemented entirely by the coprocessor. It could be, say, an e to the power x operation, which might take a long time. ARM carries on with its own job while the coprocessor performs the e to the power x operation and stores the result in its own internal register. I am repeating this background because you have to remember it to understand how instructions are handed over between them. The coprocessor tracks the pipeline of the ARM processor. This means that the coprocessor also decodes the instructions in the instruction stream, because it has to know whether an instruction is meant for itself, for ARM or for some other coprocessor. It may not be the only coprocessor; there may be a few others in the system. Each coprocessor looks for its own instructions. How does it know which instructions belong to it? There is a coprocessor ID in the instruction that tells the coprocessor the instruction belongs to it. It executes the instructions it supports and ignores those meant for the ARM processor or for other coprocessors, simply carrying on with the pipeline, similar to what ARM does. Each instruction progresses down both the ARM and the coprocessor pipelines at the same time. Please remember that they are on the same clock, MCLK, the main clock, which is connected to both the coprocessor and ARM.
So they run together, and the instructions go through the pipeline stages together, although physically the two are separate pieces of hardware running in parallel. They all look at the same instruction stream. If a particular instruction belongs to a coprocessor, the coprocessor executes it; otherwise the coprocessor ignores it. So the execution of instructions is shared between the ARM core and the coprocessors: ARM instructions are executed by the ARM core, and any coprocessor instruction that comes along is executed by the coprocessor. They are in sync, which means that when a particular instruction is in the fetch stage of the ARM core, the same instruction is in the fetch stage of the coprocessor. This is the basic principle of how a coprocessor is connected to the ARM core and how they function together. Now I will introduce the different signals and interfaces, which will make sense now. Look at this diagram. It shows n coprocessors; the maximum is 16, since ARM7 supports up to 16 coprocessors. The ARM core is here and the coprocessors are there, all inside one SoC. Please remember that they are not outside the chip; they are in the same SoC as the ARM core. The control signals, which I will explain in the next slide, are connected to ARM as well as to the coprocessors, and the data bus is also connected to all of them.
So while ARM is reading an instruction, the same instruction also goes to all the coprocessors. Let us assume there are 3 coprocessors and call the last one coprocessor 3. When an instruction is fetched, it goes into the ARM core as well as into all the coprocessors. Every coprocessor instruction is 32 bits wide and has a coprocessor ID field; I will call it CPID. If an instruction with CPID 3 comes, coprocessor 3 picks it up and executes it, and ARM and the other coprocessors ignore it. If the next 32 bit instruction is also a coprocessor instruction, but with a CPID of 2, then coprocessor 2 takes it and the other coprocessors and the ARM core ignore it. That is how instructions get to the processors. Now, how does ARM know whether a particular instruction has been taken over by a coprocessor or not? These are the signals; let me explain them in detail. This one is nCPI. When the ARM core has recognized that an instruction it has seen is a coprocessor instruction and not an ARM instruction, it drives this signal low; the n in nCPI means the signal is active low. All the coprocessors read the nCPI line, so they all come to know that the ARM core has recognized a coprocessor instruction. When is this signal generated? Only when the coprocessor instruction has reached the execute stage.
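The routing by CPID can be sketched in a few lines. The lecture does not go into the instruction format, so the bit positions below are my assumption, based on the classic ARM encoding: bits 27 to 24 equal to 1110 for coprocessor data operations and register transfers, bits 27 to 25 equal to 110 for coprocessor loads and stores, and the coprocessor number in bits 11 to 8. Please check them against the ARM reference manual. The function names are hypothetical.

```python
def is_coprocessor_instruction(word):
    """True if the 32-bit word falls in the coprocessor encoding space.

    Assumed encoding: bits [27:24] == 0b1110 or bits [27:25] == 0b110.
    """
    top = (word >> 24) & 0xF
    return top == 0b1110 or (top >> 1) == 0b110


def coprocessor_number(word):
    """Extract the coprocessor ID, assumed to sit in bits [11:8]."""
    return (word >> 8) & 0xF


def dispatch(word, present_ids):
    """Say who executes the word: ARM, one coprocessor, or nobody."""
    if not is_coprocessor_instruction(word):
        return "ARM"
    cp = coprocessor_number(word)
    return f"CP{cp}" if cp in present_ids else "nobody"


if __name__ == "__main__":
    present = {1, 2, 3}
    for word in (0xE0921003, 0xEE000300, 0xEE000200, 0xEE000700):
        print(hex(word), dispatch(word, present))
```

A 4 bit ID field gives 16 possible values, which matches the limit of 16 coprocessors mentioned earlier. The "nobody" case is the one where no coprocessor in the system claims the instruction.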
Remember, a coprocessor instruction first comes into the fetch stage, and when it moves into the decode stage the ARM core recognizes that it is a coprocessor instruction and not its own. Then it may enter the execute stage. Why do I say may? Because the coprocessor instruction may never reach the execute stage if the previous instruction was a branch. The branch would have taken control to some other instruction, leaving the coprocessor instruction at the decode stage without executing it. So whether a particular coprocessor instruction gets executed is actually in the hands of the code being executed by ARM. If ARM has executed a branch just before the coprocessor instruction, control goes somewhere else and the coprocessor instruction never enters the execute stage. In that case nCPI is not asserted. Suppose the coprocessor instruction had a CPID of 2, so coprocessor 2 would have picked it up. It now sees that nCPI is not going low, which means ARM has decided that the instruction is not to be executed, either by ARM or by any coprocessor. You may wonder why it is implemented this way. You might have written code such as: if some integer i is equal to 1, do a floating point add, FADD; if it is not equal to 1, continue with some other ARM instruction. In the second case you do not want the FADD to be done. We are mixing floating point instructions with normal ARM control flow, so in that case the coprocessor must not be allowed to execute the instruction.
That is the reason why the ARM processor decides that this instruction is not to be executed and does not assert nCPI. The coprocessor then ignores the instruction even though it belongs to it. So remember this logic: nCPI is generated only when the coprocessor instruction reaches the execute stage. One more possibility is a conditional instruction. Suppose you write FADDEQ, which means you want the instruction to be executed only if the zero flag in the ARM core is equal to 1; otherwise you do not want it executed. So the ARM core decides whether to execute a particular instruction or not, including the floating point ones. The control is with ARM. ARM is the main control processor and the others are coprocessors: they do the job ARM expects them to do. They know how to do the job and ARM does not, but ARM controls what the coprocessors should do and when they should do it, and that control is exercised through the nCPI signal. Now assume a coprocessor instruction has come into the ARM core, has reached the execute stage of the pipeline and has to be executed. ARM cannot do anything with a coprocessor instruction; the only thing it can do is inform the coprocessors in the system. Please remember that ARM does not even know whether these coprocessors exist; there is no way for it to know. It just looks at the instruction, decides it is not its own, asserts nCPI and waits to see whether anybody takes the instruction. How does it find out? There are two signals for that: CPA, the coprocessor absent signal, and CPB, the coprocessor busy signal.
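The rule for when nCPI goes low can be written down as a small function. This is only a sketch of the behaviour described above, not the actual ARM logic: the instruction is represented as a dictionary, only the EQ, NE and AL conditions are modelled, and the names `condition_passes` and `ncpi_level` are hypothetical.

```python
def condition_passes(cond, flags):
    """Evaluate a condition code against the ARM flags (EQ, NE, AL only)."""
    z = flags["Z"]
    if cond == "EQ":
        return z == 1
    if cond == "NE":
        return z == 0
    if cond == "AL":
        return True
    raise ValueError(f"condition {cond!r} is not modelled in this sketch")


def ncpi_level(instr, flags, flushed_by_branch):
    """Return the level of the active-low nCPI line for the execute stage.

    0 means asserted (ARM offers the instruction to the coprocessors);
    1 means not asserted (the coprocessors must ignore the instruction).
    """
    if instr["kind"] != "coprocessor":
        return 1  # ARM executes its own instructions itself
    if flushed_by_branch:
        return 1  # a preceding branch kept it out of the execute stage
    if not condition_passes(instr["cond"], flags):
        return 1  # for example FADDEQ with the Z flag clear
    return 0


if __name__ == "__main__":
    faddeq = {"kind": "coprocessor", "cond": "EQ"}
    print(ncpi_level(faddeq, {"Z": 1}, flushed_by_branch=False))
    print(ncpi_level(faddeq, {"Z": 0}, flushed_by_branch=False))
```

The point of the sketch is that all three checks happen on the ARM side; the coprocessor only sees the resulting level of nCPI.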
I will explain CPA and CPB on the diagram itself so that you can follow the whole flow; then we will go through the slides to understand the exact sequence. After asserting nCPI, ARM waits for these 2 inputs to come from any takers present in the system. Suppose the instruction that has reached the execute stage has CPID 1. The CPID matches the ID of coprocessor 1, so coprocessor 1 picks the instruction up. Assume the coprocessor is not busy with a previous instruction; this is the first time it is encountering one of its own instructions. It says that it is not busy, which means it drives CPB to 0. Is it absent? No, it is present, so it drives CPA to 0 as well. The other coprocessors may not drive these signals at all, and even if they do, the lines are combined through an AND gate, so one coprocessor driving 0 makes the output 0. So the other coprocessors indicate that they are not the right ones for this instruction, and only the coprocessor whose ID matches the CPID says that it is interested in executing it. This means that each coprocessor has its own ID built into it, perhaps hardwired: this one has 1, this one has 2 and, let us assume, this one has 3. A coprocessor responds only when it encounters a coprocessor instruction whose CPID matches its own ID.
It picks the instruction up and drives the signals accordingly: it is not busy, and it is not absent, so it drives 0 on both. The ARM core then knows that it announced a coprocessor instruction in the execute stage and that one of the coprocessors has signalled back that it is ready to execute it. Please remember that coprocessors cannot share a common ID. The IDs need to be unique; otherwise two of them would claim the same instruction and there would be confusion. So the system designer who puts coprocessors into the system along with ARM has to make sure that each coprocessor has a unique ID. When a matching instruction comes along, the coprocessor picks it up and informs the ARM core. Now the question is what ARM does next. Once the instruction has been accepted by a coprocessor, ARM does not wait for it to be executed. If it waited, what would be the advantage of having a coprocessor? I want a coprocessor in the system to speed up the whole thing, and I do not want the ARM processor to be blocked by floating point, DSP or network processing operations. If the ARM core waited for them to complete, the coprocessors would not add any value. So as soon as one of the coprocessors tells the ARM core that it is taking the instruction, ARM moves that coprocessor instruction out of the execute stage and the next decoded instruction enters the execute stage.
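The CPA and CPB handshake can be sketched as follows. The wired AND of the lines and the "both 0 means accepted" case come from the explanation above. What ARM does in the other two cases (taking the undefined instruction trap when CPA stays high, and busy-waiting while CPB is high) has not been covered in the lecture yet; it is how I understand the ARM7TDMI documentation, so treat it as an assumption. All function names are hypothetical.

```python
def wired_and(levels):
    """Combine one line from every coprocessor: any 0 pulls the result to 0."""
    result = 1
    for level in levels:
        result &= level
    return result


def coprocessor_response(own_id, instr_cpid, busy):
    """Return the (CPA, CPB) levels one coprocessor drives once nCPI is low."""
    if own_id != instr_cpid:
        return (1, 1)  # not my instruction: leave both lines high
    return (0, 1 if busy else 0)  # present; busy or ready


def arm_action(cpa, cpb):
    """What the ARM core does with the combined CPA and CPB levels."""
    if cpa == 1:
        return "undefined instruction trap"  # assumption: nobody claimed it
    if cpb == 1:
        return "busy-wait"  # assumption: claimed, but not ready yet
    return "hand over and continue"


def handshake(coprocessor_ids, instr_cpid, busy_ids=()):
    """Run the handshake for one coprocessor instruction in the execute stage."""
    responses = [
        coprocessor_response(cp, instr_cpid, cp in busy_ids)
        for cp in coprocessor_ids
    ]
    cpa = wired_and(r[0] for r in responses)
    cpb = wired_and(r[1] for r in responses)
    return arm_action(cpa, cpb)


if __name__ == "__main__":
    print(handshake([1, 2, 3], instr_cpid=1))
    print(handshake([1, 2, 3], instr_cpid=1, busy_ids=(1,)))
    print(handshake([1, 2, 3], instr_cpid=7))
```

Because the IDs are unique, at most one coprocessor pulls the lines low for any given instruction, which is exactly why duplicate IDs would cause confusion.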
Once the instruction is handed over, the ARM core simply goes on with its own job; it does not care what the coprocessor does with the instruction. You may wonder how the two can then be tied together, since they are part of a common application. We will talk about that later. For the moment, remember that once a coprocessor instruction is picked up by a coprocessor, the ARM core takes the next instruction, and if it happens to be an ARM instruction it can execute it right away, as long as it does not depend on the coprocessor instruction. What is the advantage? ARM executes its own instructions while, in parallel, the coprocessor executes its own instruction. You can have multiple coprocessor instructions in sequence, each coprocessor loaded with one of them, so a whole lot of work happens at the same time. That is the advantage of having coprocessors, and without this arrangement you would not get the benefit. You may wonder why the control of fetching instructions and handing them over to the coprocessors is given to ARM. It is because there should be one master and multiple slaves; a stable system cannot have multiple masters. ARM is the master here, so the control of fetching instructions from memory is with the ARM core, whereas the knowledge of how to execute a particular specialized instruction, and the special functions for it, are built into the coprocessors, and they do that job.
So, I hope you can understand this now. See, there is one ARM instruction followed by a CP1 instruction, followed by a CP2 instruction, then another ARM instruction, followed by a CP3 instruction, and then an ARM instruction. What is happening at this moment? The CP1 instruction is going on, the CP2 instruction is going on, the CP3 instruction is also going on, and ARM is ahead with another instruction which is in its execute stage now. So, you can see that 4 instructions are getting executed simultaneously. That is what is happening here. I hope this is clear to you. So, this is how the signals are sent back to the ARM core. We will go into the details of these signals, and why we have them, shortly. Before that, I will explain the other control signals which are connected to the coprocessor. MCLK is the main clock, and nWAIT is used to delay that clock. Suppose there is a slow peripheral or memory and we want to stretch the cycle; that is what it is for. And then there is reset. Please note that reset does not go only to the ARM core; it also goes to the coprocessors, because a coprocessor is executing something along with ARM. If reset is given only to ARM and not to the CP, the coprocessor will not know that a reset has happened. So, the reset signal has to be given to all of them, and the coprocessor also looks at it. Now, let us see these signals one by one. The signals used to interface ARM with the coprocessor are grouped into 4 categories. Two of these signals we have seen earlier: memory request (nMREQ) and sequential (SEQ). If you recall, we have seen the 4 different cycles the ARM core can be in: nonsequential cycle, sequential cycle, internal cycle and coprocessor cycle. So, whenever ARM is executing a coprocessor instruction, it is in a coprocessor cycle.
Otherwise it is a sequential or nonsequential memory cycle, or ARM is doing something internally, some register transfer or something. So, the coprocessor also should see all these cycles. And then what is this signal, nTRANS? We will talk about it later, but at least know that if it is low, ARM is in user mode, otherwise it is in a privileged mode. So, a coprocessor may be designed to execute some instructions only when ARM is in a privileged mode. It may ignore the instruction if it is issued in user mode. That is a possibility; you can build the system like that. In that case the coprocessor looks at the nTRANS signal, and if ARM is not in a privileged mode, it may ignore the instruction. So, it is there for control purposes. What is nOPC? OPC stands for opcode, and the n means the signal is active low. So, the signal is low during an opcode fetch. Let me draw it. This is the ARM core, and this is CP1, coprocessor 1. I told you that the data bus is connected to both of them. This is the address bus, and this is the memory. Now, what is going on the data bus could be an instruction or data. As a coprocessor, you should know which it is: is an opcode being read by the ARM core, or is it some other memory cycle? ARM may be doing an LDM or LDR; it may be doing its own memory accesses. To know that, the coprocessor looks at the nOPC signal. If it is low, an instruction is being fetched, and the coprocessor knows that it has to take that instruction into its own pipeline. That is why this signal is also important for the coprocessors to look at. Now, what is TBIT? TBIT will be high when ARM is in Thumb state.
So, what is the difference between ARM state and Thumb state? 32 bit instructions versus 16 bit instructions. If ARM is in Thumb state, it is fetching 16 bit instructions from memory, whereas coprocessor instructions are always 32 bit. So, when 16 bit instructions are going on the data bus, the coprocessors just ignore them, because they cannot make any sense of a 16 bit instruction. As long as ARM is in Thumb state, it is not going to issue any coprocessor instruction. The nCPI signal will not be activated at all, because coprocessor instructions are always 32 bit and cannot appear among Thumb instructions. So, effectively, if you are writing Thumb code, you cannot embed floating point instructions in it, because you would be mixing 32 bit instructions with 16 bit instructions, which is not valid. That is why coprocessor instructions can be embedded along with ARM instructions, but not with Thumb instructions. Please remember this very important point. These other signals I have already explained to you, and this is the data bus. Now, let us understand what the different signal combinations mean. Let me use some other color here to break the monotony. CPA: the A means absent, the coprocessor is not available. CPA is 0 means the coprocessor is present. CPB: the B means busy. CPB is 0 means it is not busy. So, when both are 0, the coprocessor is present and not busy. That means it can take up this instruction. You can read the rest on the slide; I will not go through it. So, ARM does not execute the instruction itself; the coprocessor picks it up. Now, what is this next combination? The coprocessor is there, but it is busy. What does it mean?
It is busy with the previous instruction. I will give you an example. Suppose there is an ARM instruction, after that an FADD, a floating point add, immediately followed by an FMUL, a floating point multiply, and then some ARM instruction. I told you that any floating point operation may take more time, more cycles. Now, FADD has reached the execute stage and ARM has handed it over to the floating point coprocessor, because the coprocessor responded with 0 0. So, the floating point coprocessor is now executing the add. Immediately after that, FMUL is coming. What will the ARM processor do? It will try to hand over this FMUL also to somebody. It will ask: are there any takers for this? Now, there cannot be two floating point coprocessors, there can be only one, because the CP ID would be the same. So, there is one coprocessor, the floating point coprocessor, and it is already busy with the add instruction. It cannot take the multiply instruction until it completes the add. So, because it is busy with FADD, the FMUL, which happens to be the very next instruction, cannot be taken. Because of that, the ARM pipeline stalls. I will tell you why. This is the ARM pipeline: fetch, decode and execute. And this is the floating point coprocessor; it has also got fetch, decode and execute. The add instruction is here in its execute stage. The FMUL has also been read by the floating point coprocessor into its pipeline. Both instructions have to be handled by this same coprocessor, but it is already executing the add in its own pipeline, so it cannot execute the FMUL now. Let us follow what happened. First FADD came into the execute stage and was handed over to the coprocessor. At that time FMUL was one stage behind, in decode.
Now that FADD has been picked up and has moved out, FMUL comes into the execute stage, and behind it some ARM instruction is sitting, let us assume. FMUL is to be taken over by the same coprocessor again, but the coprocessor cannot take it up because the add is still going on; it may take 4, 5 or 6 cycles, assume. Until this instruction is taken over by the coprocessor, the ARM pipeline stalls. You may wonder why it should stall now. For FADD it did not stall, it carried on, whereas for FMUL it is waiting. The reason is that there is a coprocessor which can take over this FMUL: CPA is 0, which means it is not absent, the floating point coprocessor is present. But CPB is 1, which means it is busy. So, there is somebody who can take care of this instruction, but he is busy with his previous instruction. So, ARM has to wait. If it does not wait, what will happen? The FMUL will go out of the pipeline and ARM can no longer keep track of the flow. I told you that ARM is the master and the others are slaves, so the master has to make sure that the control flow is properly maintained. And why did it go ahead in the case of FADD? Because somebody had taken over the instruction, and ARM did not want to wait for it. So, if you are a smart programmer, you should not put two instructions belonging to the same coprocessor one after the other, because then you are unnecessarily stalling the pipeline of the ARM core. Suppose there is something else to be done by ARM before the FMUL is needed; those instructions can be put here instead, and the FMUL can come later. What happens during this gap? While ARM is busy with its own instructions, the FADD gets completed by the CP and it is no longer busy. When the FMUL then comes into the flow, it is taken over by the coprocessor straight away.
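To make the scheduling point concrete, here is a small toy model in Python. It is only a sketch under assumed numbers: the 4 cycle coprocessor latency, the one cycle ARM instructions and the names `CP_LATENCY` and `stall_cycles` are made up for illustration and are not ARM7 timings.

```python
CP_LATENCY = 4  # assumed cycles the coprocessor stays busy per instruction


def stall_cycles(program):
    """Count the cycles ARM busy-waits for a toy instruction stream.

    Each entry is "ARM" or "CP". An ARM instruction takes one cycle.
    A "CP" instruction is handed over in one cycle if the coprocessor
    is free; otherwise ARM waits until the previous one has finished.
    """
    cycle = 0        # current cycle at ARM's execute stage
    cp_free_at = 0   # first cycle at which the coprocessor is idle again
    stalls = 0
    for kind in program:
        if kind == "CP":
            if cycle < cp_free_at:           # present but busy (CPA=0, CPB=1)
                stalls += cp_free_at - cycle
                cycle = cp_free_at
            cp_free_at = cycle + CP_LATENCY  # accepted; coprocessor is now busy
        cycle += 1                           # instruction leaves execute stage
    return stalls


back_to_back = ["ARM", "CP", "CP", "ARM"]                # FADD then FMUL
interleaved = ["ARM", "CP", "ARM", "ARM", "ARM", "CP"]   # ARM work in the gap
print(stall_cycles(back_to_back), stall_cycles(interleaved))
```

With these assumed latencies, the back to back stream makes ARM wait, while the interleaved stream hides the coprocessor latency behind useful ARM instructions.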
So, that is the kind of mixing we have to do. The floating point instructions and the ARM instructions need to be mixed properly, so that you make use of the gaps in between for the ARM core to go on with its own execution without waiting for the coprocessor to complete. It is left to the developer, the programmer, to make use of this. Only then will you really be taking advantage of the coprocessor. Now, why is this next combination invalid? It says the coprocessor is absent, which means there are no takers for this particular coprocessor instruction, but somebody in the coprocessor list says "I am not busy". See, it is a contradiction. It is saying "I cannot execute this instruction, but I am not busy". So, it is a bad combination. Suppose there are 2 coprocessors, CP1 and CP2, and both of them see an instruction which has coprocessor ID 3. Assume that in the system only these 2 are there. Both of them say "this is not for me". That means they would not drive CPA to 0; they will keep it at 1, because they are not going to take up this instruction at all. As far as this instruction is concerned, the coprocessor is not present in the system. And they should also drive CPB to 1, to say "I am busy", so that the processor is not confused. Saying "I am not busy, but I cannot take this instruction" is not a valid response. That is why I am saying that this combination will not happen. Remember that the signal ARM sees is the AND of what the multiple coprocessors are driving. So, nobody should drive CPB to 0 in this situation; if CPA is 1, CPB also should be 1.
Now, this last one is a valid combination, because there are no takers for this coprocessor instruction, and all of them, whether they are busy or not, say "I cannot take this instruction at all". What does the processor do then? Let us see. Let me use blue color for a change, or maybe brown. This is the ARM core: fetch, decode, execute. It found that there is a coprocessor instruction with coprocessor ID 5, say, but in the system there are 3 coprocessors with IDs 1, 2 and 3. As soon as it sees this, the ARM core would have driven the nCPI signal low: here is a coprocessor instruction, any takers for this? Each of the three will say "I cannot take it", because it is a coprocessor 5 instruction. Now what does the processor do? That is a valid question to ask. It cannot stall the pipeline, because there are no takers for this particular instruction. If there were a taker, he would have said "I am busy now, but I am present"; that means CPA would be 0 and CPB would be 1. Then ARM can wait, because there is somebody who recognizes this instruction but is currently busy, so he may take it up later. But when they say "busy" as well as "I am not the one to recognize this instruction", it means there are no takers in the current system, and ARM has to act on its own. It will generate an undefined instruction exception. You have heard about the undefined instruction exception. ARM will go to that exception, and the handler has to decide what to do with this instruction. You may wonder why we support both. See, as a system designer I may decide to have the coprocessor in hardware, or I may decide to emulate it in software. The floating point operations can be done in software also. So, I may emulate the coprocessor in software.
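The four CPA and CPB combinations discussed above can be written down as a small lookup. This is a sketch, not ARM's implementation: the function names `combine` and `arm_action` and the returned strings are mine, and the AND of the coprocessor outputs is modelled in software.

```python
def combine(responses):
    """AND together the (CPA, CPB) pairs driven by all the coprocessors.

    A coprocessor that does not recognise the instruction leaves both at 1.
    """
    cpa = int(all(a == 1 for a, _ in responses))
    cpb = int(all(b == 1 for _, b in responses))
    return cpa, cpb


def arm_action(cpa, cpb):
    """What the ARM core does after driving nCPI low and sampling CPA/CPB."""
    if (cpa, cpb) == (0, 0):
        return "hand over"        # present and not busy
    if (cpa, cpb) == (0, 1):
        return "busy-wait"        # present but busy: the pipeline stalls
    if (cpa, cpb) == (1, 1):
        return "undefined trap"   # nobody accepted the instruction
    return "invalid"              # absent yet not busy: must not occur


# CP2 accepts while CP1 and CP3 stay out of the way.
print(arm_action(*combine([(1, 1), (0, 0), (1, 1)])))
# Nobody recognises the instruction (for example CP ID 5).
print(arm_action(*combine([(1, 1), (1, 1), (1, 1)])))
```

Note that a system with no coprocessor at all gives an empty list, which also comes out as 1 and 1, so the same undefined instruction path is taken.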
In that case the emulation should be in the handler. The instruction has to be handled by somebody; it cannot be left untouched. I hope you have understood these combinations. So, if a particular instruction is not recognized by any coprocessor, ARM generates an undefined instruction exception and it is taken care of by software. If we decide that we do not want a hardware coprocessor and we will handle it in software, that option is available. If there are no takers for a particular coprocessor instruction, we can take the undefined instruction trap and execute in software whatever was supposed to be done by the coprocessor. What do we lose here? The parallelism, because ARM itself is doing the job of handling the coprocessor instruction. It is going to affect the performance, since there is no hardware running in parallel with ARM executing the coprocessor instruction. So, it will be pretty slow, but we do not need the extra hardware, the extra power and the area. If that is acceptable, it can be handled in software. So, if a particular instruction is not picked up by a coprocessor, ARM generates an exception to handle it. Very good. Now, let us see what each side does. The ARM processor evaluates the instruction type as well as the condition code, and then decides whether the instruction is to be executed by a coprocessor or not. I told you about EQ, NE and all those conditions. If a particular coprocessor instruction does not have to be executed because the condition code failed, ARM will not generate the nCPI signal, which means the coprocessor will also ignore the instruction. Now, I told you that ARM is connected to the address bus, not the coprocessor. So, coprocessors are not connected to the address bus.
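The hardware versus software choice can be sketched as follows. Everything here is hypothetical and only illustrates the idea: `FP_CP_ID`, the `FADD` and `FMUL` names, the `SOFT_FP` table and both functions are made up, and a real undefined instruction handler would decode the faulting instruction word rather than receive ready-made arguments.

```python
import operator

FP_CP_ID = 10  # assumed ID of the floating point coprocessor being emulated
SOFT_FP = {"FADD": operator.add, "FMUL": operator.mul}  # emulated operations


def undefined_handler(cp_id, op, a, b):
    """Software path taken when no coprocessor accepts (CPA=1, CPB=1)."""
    if cp_id != FP_CP_ID or op not in SOFT_FP:
        # No hardware and no emulation: the program cannot continue.
        raise RuntimeError("unsupported coprocessor instruction")
    return SOFT_FP[op](a, b)


def run(cp_id, op, a, b, hardware_ids):
    """Route one coprocessor instruction to hardware or to the handler."""
    if cp_id in hardware_ids:
        return ("hardware", None)   # executes in parallel, result not modelled
    return ("software", undefined_handler(cp_id, op, a, b))


print(run(10, "FADD", 1.5, 2.25, hardware_ids=set()))   # emulated by ARM
print(run(10, "FADD", 1.5, 2.25, hardware_ids={10}))    # handed to hardware
```

The software path gives the same result, but ARM does the work itself, so the parallelism is lost.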
So, only ARM generates the addresses required for instruction fetch. If a floating point or other coprocessor instruction needs a memory access, that address is also generated by ARM. The coprocessors cannot generate any address because they are not even connected to the address bus. So, they cannot access memory directly; they have to go through the ARM core. ARM takes the undefined instruction trap if no coprocessor accepts the instruction. How does it know? Both CPA and CPB happen to be 1, which means there is no coprocessor present which can take this instruction. So, it has to be handled by ARM itself: it generates a trap on the undefined instruction. At the same time, what does the coprocessor do? It decodes all the instructions to determine whether it can accept each one or not. It ignores the instruction if it does not belong to it. That means, if the instruction is an ARM instruction, it ignores it. If the CP ID does not match, the coprocessor instruction does not belong to it; it may be for some other coprocessor. Only the one coprocessor whose ID matches the ID in the instruction will say "I am taking up this instruction". Otherwise the instruction is ignored. That is the logic used. The coprocessor indicates whether it can accept the instruction or not using CPA and CPB, when the instruction is in the execute stage and nCPI is made low by ARM. Please remember, the instruction should be in the execute stage and nCPI should be low. If it is not low, what does it mean? ARM decided that the condition code has failed. Say the instruction was FADDEQ and the Z flag in the ARM core is not 1; then this instruction should not be executed, because the EQ condition is not satisfied.
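The ID match a coprocessor performs can be sketched as below. The bit positions are not from this lecture (the instruction format comes in the next session); they are taken from the standard 32 bit ARM encoding, where coprocessor data operations and register transfers have bits 27 to 24 equal to 1110, coprocessor loads and stores have bits 27 to 25 equal to 110, and the coprocessor number sits in bits 11 to 8. The function names are mine.

```python
def coprocessor_number(word):
    """Return the CP number if `word` is a coprocessor instruction, else None."""
    is_ldc_stc = ((word >> 25) & 0b111) == 0b110           # LDC / STC
    is_cdp_mrc_mcr = ((word >> 24) & 0b1111) == 0b1110     # CDP / MRC / MCR
    if is_ldc_stc or is_cdp_mrc_mcr:
        return (word >> 8) & 0xF                           # bits 11..8
    return None


def accepts(word, my_id):
    """True only for the one coprocessor whose ID matches the instruction."""
    return coprocessor_number(word) == my_id


print(coprocessor_number(0xEE000A00))   # a CDP-class word for coprocessor 10
print(coprocessor_number(0xE0810002))   # ADD r0, r1, r2: an ARM instruction
print(accepts(0xEE000A00, 11))          # coprocessor 11 ignores it
```

Every coprocessor runs the same check on every opcode it sees, and because the IDs are unique, at most one of them accepts.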
So, please remember that the condition code is evaluated in the ARM core, not in the floating point coprocessor. The coprocessor may also maintain its own status and all those things, but the condition codes I mention in the instructions are evaluated with respect to the C, Z, N, V flags in the ARM core. You may wonder why the CZNV flags of ARM are used. It is because the decision on whether to execute an instruction or not is with the ARM core; ARM is the master. It can decide only based on its own flags; it cannot take a decision based on a flag in some coprocessor. So, the conditions all have to do with the CZNV flags of the ARM processor. That is how the design is, and it is really the only possible design in this scenario, because one guy has to decide the control flow of the whole program, and that is the ARM core. Next, the coprocessor fetches any values required from its own register bank, as mentioned by the instruction. Once the coprocessor instruction is picked up by the coprocessor, it looks at the operands. For most of the operations done by the CP, these operands will be its own registers, because I told you that the coprocessor also has its own register set. So, the operands may come from that register file. Take FADD: suppose it says, perform a floating point addition using registers inside the coprocessor, maybe called F1 and F2, and put the result in F3. Then the coprocessor adds the values in F1 and F2 and puts the value in F3, all inside the CP. So, the instructions may fetch their operands from the coprocessor registers and perform the operation required by the instruction.
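Here is a minimal sketch of how ARM's own flags decide whether nCPI goes low. The condition names and their flag tests are the standard ARM ones, but only a subset is shown, and the functions `condition_passed` and `ncpi` are my own naming for illustration.

```python
def condition_passed(cond, n, z, c, v):
    """Evaluate a few standard ARM condition codes against ARM's N, Z, C, V."""
    table = {
        "EQ": z == 1, "NE": z == 0,
        "CS": c == 1, "CC": c == 0,
        "MI": n == 1, "PL": n == 0,
        "VS": v == 1, "VC": v == 0,
        "GE": n == v, "LT": n != v,
        "AL": True,
    }
    return table[cond]


def ncpi(is_coprocessor_instruction, cond, flags):
    """nCPI is active low: 0 only for a coprocessor instruction that passes."""
    if is_coprocessor_instruction and condition_passed(cond, *flags):
        return 0
    return 1


# FADDEQ with Z clear: the condition fails, so nCPI stays high.
print(ncpi(True, "EQ", (0, 0, 0, 0)))
# FADDEQ with Z set: nCPI goes low and a coprocessor may commit.
print(ncpi(True, "EQ", (0, 1, 0, 0)))
```

The flags passed in are always ARM's; nothing in the coprocessor's own status takes part in this decision.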
What happens to an unaccepted instruction? If no coprocessor in the system responds with an acceptance of the instruction, ARM generates an undefined instruction exception; I told you this. The programmer can choose to emulate the coprocessor function in software by writing an undefined instruction exception handler; I told you this also. This is for the case where a dedicated hardware coprocessor is not integrated with the ARM core to handle the instruction. If the hardware is not there, it has to be done by software. You cannot say "I will put the coprocessor instruction in the instruction stream, but the system does not have the hardware and I will not handle it either". Then what happens to that instruction? ARM cannot do anything with it, there is no coprocessor which can execute it, and your program will fail. So, we cannot have that condition at all. If you know that there is no coprocessor which can take that instruction, you have to support it by writing the undefined instruction exception handler properly. Now, what is the coprocessor ID? Up to 16 coprocessors can be referenced by a system, each with a unique coprocessor ID number to identify it. Please remember, in a system, in an SOC, ARM is there and CP1, CP2, CP3 are there, and they need to be unique. There cannot be 2 hardware coprocessors both with the ID CP1. That is a cause of failure; it is a bad design and it cannot be done. Now, some coprocessor IDs are reserved. One is for the debug communication channel; I will talk about that when we touch upon the debug module of ARM. So, that is the debug controller. And then 13 to 8 are reserved, which means ARM has reserved these numbers.
So, only these are available for users. That means, if you are a hardware designer and you are building a coprocessor, you can use only 7 to 4, if you are using it with ARM7. I am not talking about the later processor families; we are only concerned with ARM7 in this course. So, we are talking about the coprocessor IDs supported by ARM7. There are 16 coprocessor IDs, out of which 13 to 8 are reserved, these others are reserved for special purposes, and this lowest group is also reserved. So, you have only 4, 5, 6 and 7 for which you can develop your own coprocessor. You may wonder where the floating point coprocessor is. It is actually inside this reserved range: 10 and 11 are used for it; we will talk about that later. This one is debug and this one is system control. System control is another coprocessor; I will touch upon that at some later point in time. So, these are different IDs which are reserved for different purposes, that is all. The ID will be mentioned along with the instruction; I will talk about the instruction format in the next session, so you will understand that. Next, pipeline following. I talked about this already, but for completeness' sake I have put it in the slide. Every coprocessor in the system must contain a pipeline follower to track the instructions in the ARM processor pipeline. The coprocessors connect to the ARM data bus as well as to these signals, which you know already. It is essential that the two pipelines remain in step. That means the coprocessor pipeline and the ARM pipeline should be in synchronization: when a particular instruction moves from one stage to the next in the ARM pipeline, the same transition happens in the coprocessor pipeline also. And not only in one coprocessor; if there are n coprocessors, in all of them the instruction will transition to the next stage.
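Going back to the coprocessor ID allocation, it can be captured in a small lookup. The numbers 14 for debug and 15 for system control are not spoken in the lecture; they are taken from the ARM7TDMI documentation, and the function name `cp_id_class` is mine, so treat this as a sketch under those assumptions.

```python
def cp_id_class(cp_id):
    """Classify a coprocessor ID under the ARM7 allocation described above."""
    if not 0 <= cp_id <= 15:
        raise ValueError("coprocessor ID must be in 0..15")
    if cp_id == 15:
        return "system control"
    if cp_id == 14:
        return "debug"
    if 8 <= cp_id <= 13:
        return "reserved"      # 10 and 11 are the ones used for floating point
    if 4 <= cp_id <= 7:
        return "user"          # the only IDs open to your own coprocessor
    return "reserved"          # 3 to 0


user_ids = [i for i in range(16) if cp_id_class(i) == "user"]
print(user_ids)
```

A designer building a custom coprocessor for an ARM7 system would pick its ID from the user group only.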
So, you may wonder how it is done. MCLK is common anyway; all of them are connected to MCLK. They also have nOPC and the other signals, which are seen by ARM as well as by the coprocessors. So, they can be in sync with each other. Then there is flushing and refilling of the ARM pipeline. When will it happen? Suppose ARM is executing a branch or a branch with link instruction. What happens to the pipeline? It gets flushed. And what happens to the coprocessor pipelines? They also get flushed. So, they mimic what ARM does; all the coprocessors are dancing to the tune of ARM. Whatever ARM does, the coprocessors also do, because only then can they be in sync in terms of the pipeline. There are no coprocessor instructions in the Thumb instruction set; I told you this. So, the coprocessor must monitor the state of TBIT, because TBIT will be high when ARM is in Thumb state. That means, during Thumb state the coprocessor just ignores what is happening on the bus. It only monitors TBIT and can otherwise just sleep. It does not have to carry the Thumb instructions in its pipeline at all, because it is not going to encounter any coprocessor instruction among Thumb instructions. Thumb instructions are 16 bit instructions; there cannot be one odd man out, a 32 bit instruction, in between them. So, it is very safe for the coprocessor to just sleep in Thumb state. Otherwise, the coprocessor decodes the instruction currently in the decode stage of its pipeline to check whether it belongs to it. If the coprocessor number matches its own ID, it will generate the required signals.
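A pipeline follower can be pictured with the toy class below. It is a sketch only: the class name, the method `clock` and its boolean arguments are hypothetical, and a real follower would derive "opcode fetch", "Thumb state" and "flush" from the bus signals rather than receive them as ready-made flags.

```python
class PipelineFollower:
    """Toy 3-stage follower that mirrors ARM's fetch/decode/execute stages."""

    def __init__(self):
        self.fetch = self.decode = self.execute = None

    def clock(self, bus_word, is_opcode_fetch, thumb_state, flush=False):
        """Advance one MCLK cycle, staying in step with the ARM pipeline."""
        if flush:                      # ARM took a branch: mimic the flush
            self.fetch = self.decode = self.execute = None
            return
        if not is_opcode_fetch:        # data cycle (LDR, LDM, ...): hold
            return
        # In Thumb state the 16 bit opcodes are of no interest: track a gap.
        incoming = None if thumb_state else bus_word
        self.execute, self.decode, self.fetch = self.decode, self.fetch, incoming


f = PipelineFollower()
f.clock(0xEE000A00, is_opcode_fetch=True, thumb_state=False)
f.clock(0xE0810002, is_opcode_fetch=True, thumb_state=False)
print(hex(f.decode))   # the first word has moved into the decode stage
```

Because every coprocessor applies the same rules on the same clock, all the followers stay in step with ARM and with each other.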
Suppose the instruction currently in the decode stage is a relevant coprocessor instruction. The coprocessor is looking at the instruction in the decode stage of its pipeline, and the coprocessor ID in the instruction matches its own ID: this is CP1 and the ID in the instruction is also 1. What does it do now? The coprocessor attempts to execute the instruction; it starts executing it. You may wonder why it starts execution already. Typically coprocessor instructions take more time, and most of them work on the coprocessor's own registers. It knows that the ID matches, that the instruction belongs to it, and that it is not busy. The instruction moves on only when ARM moves it to the execute stage, but the coprocessor starts executing it a little ahead, just to save a few cycles. So, it may execute the instruction ahead of time. The coprocessor handshakes with the ARM core using CPA and CPB. It informs the ARM core "I am taking this instruction"; CPA and CPB will both be 0. That means it is taking that instruction. But please remember that the nCPI signal from ARM to the coprocessor has not come yet. nCPI has not gone low because ARM may still be busy with its own earlier instruction. Let me explain this a little more; I hope you will understand it. The coprocessor instruction is in the ARM core, and the same coprocessor instruction is in the coprocessor also, in its decode stage. The coprocessor has identified that this instruction belongs to it and that it is not busy now, and it starts executing.
So, the coprocessor can start executing the instruction, but it will not write the result into its register file, say of some addition it is computing, until it gets the nCPI signal from the ARM core. Why? Because there is a possibility that a branch has happened and this instruction was never meant to be executed at all. In that case the coprocessor also should not execute it. Or there is a possibility that the condition code has failed, the EQ condition or some other condition; then also the instruction should not be executed. I said that the coprocessor starts executing it, but if it does not write the result into the register file, it is as good as not executing. It might have done some work, but it ignores whatever it has done in between, drops everything and starts afresh. So, the state maintained in the registers, take the example of a floating point coprocessor, is not affected by this instruction, because ARM decided that this instruction should not be executed. I hope this is clear to you. So, the coprocessor handshakes with the ARM core and informs it "I am taking it" as soon as it has decoded the instruction. It does not have to wait for nCPI to be low to start, but it must not commit the result until nCPI becomes low. I think I have explained the rest already. Now, here is a quiz. There is nothing new in it, but I want you to take a short break, look at the question and the options, and see which options are correct. There may be multiple correct options in general. Try to look at it and then come back. If all the options look correct to you, there is something wrong; only option C is right. That means the condition code might have failed, so the coprocessor instruction should not be executed.
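The "execute ahead, but commit only on nCPI" rule can be sketched like this. The class `ToyCoprocessor`, its methods and the register names F1 to F3 are illustrative only, following the FADD example used earlier in the lecture.

```python
class ToyCoprocessor:
    """Computes a result early but writes it back only if nCPI goes low."""

    def __init__(self):
        self.regs = {"F1": 1.5, "F2": 2.0, "F3": 0.0}
        self.pending = None                 # (destination, value) not yet committed

    def start_add(self, dst, src_a, src_b):
        """Begin executing ahead of ARM; the register file is untouched."""
        self.pending = (dst, self.regs[src_a] + self.regs[src_b])

    def resolve(self, ncpi):
        """nCPI low (0): commit. nCPI high (1): drop the speculative work."""
        if self.pending is not None and ncpi == 0:
            dst, value = self.pending
            self.regs[dst] = value
        self.pending = None


cp = ToyCoprocessor()
cp.start_add("F3", "F1", "F2")
cp.resolve(ncpi=1)          # condition failed or branch taken: nothing written
print(cp.regs["F3"])
cp.start_add("F3", "F1", "F2")
cp.resolve(ncpi=0)          # ARM confirms: result is committed
print(cp.regs["F3"])
```

Keeping the result in a pending slot is what makes the early start safe: if ARM never confirms, the coprocessor state is exactly what it was before.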
I mentioned a branch as another possibility, but are the other options valid? Let me go through them one by one. "The coprocessor instruction being executed may not be valid." See, the coprocessor has already started the instruction, which means the ID was matching; that is why it started executing. So this is not true. The instruction cannot be an invalid instruction, because some coprocessor picked it up; that is why it is going ahead with the execution and only waiting for nCPI to become low to commit the result. So, this option is not correct. Then, "ARM might have decided to take the undefined instruction exception for the instruction." This is also not true. Why? Once a coprocessor has taken up the instruction, it drives CPA and CPB; it is executing ahead, and it would have made both of them 0. That means it says "I am taking this instruction, this is a valid instruction ID for me". After hearing from a coprocessor that somebody is ready to execute the instruction, the ARM core cannot decide to take an exception. Only when no coprocessor is ready to execute the instruction should it take the exception. So, this is not a valid option. Can this next one be valid? "Some other coprocessor might have also responded to the same instruction by driving CPA and CPB low." Is it true? I told you that CP IDs are unique. If another coprocessor were also driving the signals, ARM might not even know, because the signals go through an AND gate. If both coprocessors are driving CPA and CPB as 0, ARM will not see any difference. But is that possible? It should not be; in a system you cannot have two coprocessors with the same ID. So, that is why only option C is the correct answer.
So, to summarize the execution of a coprocessor instruction as it progresses down the ARM pipeline: a coprocessor instruction is executed if all of the following are true. The coprocessor instruction has reached the execute stage. The ARM processor cannot execute the instruction itself, because it is a coprocessor instruction and it is in the undefined part of the ARM instruction set. The instruction has passed the conditional execution test, so nCPI is low. And CPA and CPB show that some coprocessor has accepted it. Then the coprocessor can commit the instruction to execution. That is all. I hope the whole thing is clear to you. We have not said so far what the coprocessor instructions are. I only said there are coprocessor instructions, and I took floating point as an example. We will see which instructions are supported and what can be done using them in the next class. For now, I hope you have understood how ARM reads the instruction and hands it over to a coprocessor, how the coprocessor executes it, and how they communicate with each other. If you have understood this, then I think we have done the job of today's class. So, happy talking to you. Thank you very much for your attention. See you in the next class, have a nice day.