Hello friends, welcome to the 21st session of ARM based development. In this session we will cover the coprocessor instructions. In the last session we saw how a coprocessor is interfaced with ARM and what the signal interaction between them is. In this talk I will cover specifically the various types of instructions that are supported. Depending on the coprocessor, these instructions can be redesigned and the mnemonics can be different, but they all fall under the three categories I am going to show. So, these are the three categories of instructions we will be talking about today. Any floating point coprocessor, DSP coprocessor or encryption coprocessor has to fit its instructions into one of these three categories, and it has the freedom to use these three categories to carry its own particular instructions, which will be given along with the ARM instructions. The tool which assembles these instructions, the assembler, needs to be aware of them so that it can generate the equivalent encoding based on what ARM has defined. With this introduction, let us go into the instruction format and the specific details of the instructions, and then we will touch upon the three fundamental categories. First, the coprocessor data processing instruction: the name itself says that this has to do with the specific functionality of a coprocessor. We are back again to the 32-bit instruction format. The condition code is the same as what we saw in the ARM instructions; the same encoding is followed here also. Then this is a bit pattern which distinguishes this as a coprocessor instruction. Once the ARM processor looks at this pattern, the shaded portions, which are in blue, are not meant for the ARM processor to decode.
So, these are all specific to the coprocessors. As soon as ARM sees this pattern it will not even look into those bits, because it cannot make anything out of them; they are very specific to the coprocessor which takes this instruction and processes it. Then this is another unique bit pattern, which differentiates between the CDP instruction, the data processing instruction, and the register transfer instruction, which I will be showing towards the end. Both the coprocessor and ARM rely on it. Based on these bit patterns ARM decides that this particular instruction is a coprocessor instruction. As I told you in the last session, while ARM is decoding this instruction and trying to understand it, the same instruction is being decoded by the coprocessors also. When the instruction enters the execution stage and its condition code is satisfied, ARM will drive the nCPI signal, if you remember. It will make this low to indicate to all the coprocessors in the system that there is a coprocessor instruction which it has found and which is in the execution stage. The coprocessors then respond with the CPA and CPB signals, that is coprocessor absent and coprocessor busy, to indicate whether any of the coprocessors is willing to take that instruction or not. Once one of them takes it up, ARM goes forward if it is a data processing instruction, but if ARM has to be there for servicing the instruction it will stay in the execution stage. We will talk about that later. Now, when a CDP instruction is seen, it is supposed to be handled by the coprocessor.
So, once some coprocessor takes this instruction, that means it responds saying that it is going to take it up, and if it does not have to wait, the ARM processor will carry on with the next instruction; otherwise ARM will do a busy wait. I will show you the example. The instruction is executed if the condition is true. This class of instruction is used to tell a coprocessor to perform some internal operation. What kind of internal operation is specified by these bits and these bits; these two fields are the coprocessor operation and the coprocessor type. Then these two are the coprocessor operand registers and this is the destination register inside the coprocessor. Anything prefixed with C indicates a coprocessor register. And what is this? This is the coprocessor number; it can vary from 0 to 15, that is why 4 bits are given. Based on this CP ID, the coprocessor ID, a specific coprocessor in the system will acknowledge, saying that it is taking this instruction. The coprocessors know that these are the bit patterns they need to interpret, and they perform the operation accordingly. In the color coding, whatever I am showing in blue is a parameter meant for the coprocessors and ignored by ARM. Similarly, the coprocessors ignore the bit patterns meant for ARM and look at the other patterns which are specific to them. Here no result is communicated back to ARM, and ARM will not wait for the operation to complete. It will only wait for the instruction to be taken up: if there is a coprocessor present in the system but it is busy with something else, then ARM will wait, and once the instruction is taken up by the coprocessor ARM will carry on with the next instruction. The coprocessor could contain a queue of such instructions awaiting execution.
Suppose ARM is here and a coprocessor is here, and the coprocessor has confirmed that it is taking this instruction. It may not have to execute it immediately, because it may maintain a FIFO of instructions. Whatever instruction it reads from the data bus it may keep in the FIFO and execute internally later; that is quite possible. Why is this required? I gave you an example in the last class also: assume there is a floating point add, followed by maybe a floating point subtract and then a floating point multiply. All of these are CDP instructions, you agree, because they are processing to be done by the coprocessor using the registers of the coprocessor. They have nothing to do with the ARM registers. ARM is here, and suppose the coprocessor is a floating point coprocessor. These instructions will be executed by the coprocessor using the registers mentioned in the operands, which are actually inside the coprocessor. Now, ARM does not have to wait for these instructions to be completed. Why? Because they are all internal to the coprocessor, and it can take its own time in completing them as long as none of the results affect the flow of the ARM code. If the results are not immediately needed to control the flow of the ARM code, then ARM can continue with the execution of the remaining instructions, but that is not possible if these instructions are not accepted by the coprocessor. Now, assume there is a two-stage FIFO: only two instructions can be in the FIFO and one can be in the execute stage. Suppose that when the FADD was given to the coprocessor, the coprocessor was not busy.
So, it accepts it; that means FADD goes into the execute stage, and the two-stage FIFO can accept the next two instructions, because the FIFO keeps the instructions which are pending execution by the coprocessor. So, this one will go into the FIFO and this one will also go into the FIFO. Once these instructions are accepted by the coprocessor, the execute stage of ARM becomes free and ARM can take up some other ARM instruction. Suppose they are followed by ARM instructions which are specific to the internals of ARM; it can now carry on with those. So, it is quite possible that the coprocessor has a FIFO and keeps accepting instructions. But what happens if the FIFO becomes full and the coprocessor is not able to accept any more instructions? Then ARM has to wait, because if an instruction is not accepted by the coprocessor and ARM continues with the next instruction, that instruction is lost. So, it should either be accepted by the coprocessor or the exception handler should be called. Only if one of these conditions is true will ARM continue with the execution. So, a FIFO is possible, and the coprocessor's execution can overlap other activity, allowing the coprocessor and ARM to perform independent tasks. ARM is doing its own processing and the coprocessor is doing its own job. They are running in parallel; we should keep that in mind, because that is the advantage of having coprocessors. If ARM and the coprocessor are not running in parallel, then we do not get the benefit of designing hardware specifically for processing the coprocessor instructions. CDP instructions are not available in Thumb state. In fact no coprocessor instruction is available in Thumb state, but I am stating it clearly here for CDP.
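To make the two-stage FIFO example concrete, here is a minimal Python sketch of the acceptance behaviour described above. The class name `CoprocessorQueue`, its methods and the mnemonics FADD, FSUB, FMUL and FDIV are hypothetical, chosen only for illustration; a real coprocessor implements this in hardware and its queue depth is implementation dependent.

```python
from collections import deque


class CoprocessorQueue:
    """Toy model: one execute slot plus a small FIFO of pending instructions."""

    def __init__(self, fifo_depth=2):
        self.executing = None          # instruction currently in the execute stage
        self.fifo = deque()            # accepted but not yet executing
        self.fifo_depth = fifo_depth   # assumption: two-stage FIFO as in the example

    def try_accept(self, instr):
        """Return True if the instruction is accepted, False if ARM must busy-wait."""
        if self.executing is None:
            self.executing = instr
            return True
        if len(self.fifo) < self.fifo_depth:
            self.fifo.append(instr)
            return True
        return False

    def finish_current(self):
        """Complete the executing instruction and pull the next one from the FIFO."""
        done = self.executing
        self.executing = self.fifo.popleft() if self.fifo else None
        return done


cp = CoprocessorQueue()
accepted = [cp.try_accept(i) for i in ("FADD", "FSUB", "FMUL", "FDIV")]
print(accepted)  # the fourth instruction finds the FIFO full
```

In this sketch the first three instructions are accepted, so ARM is free to continue with its own instructions, while the fourth is refused until `finish_current()` frees a slot; that refusal corresponds to the busy wait on the ARM side.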
That means in Thumb state you cannot introduce a CDP instruction in between; CDP is the coprocessor data processing instruction. Now, I am just explaining the bit patterns. Bits 24 to 31, this whole thing, are meant for ARM, and bit 4 is also interpreted by ARM. The remaining bits are used by the coprocessor. The field names shown above are used by convention, but the coprocessor designer is free to use them in any way; they can even decide to merge two fields or swap their locations, but the location of this one, the CP ID, is fixed. Can you see the reason behind this? Let us go back to ARM and the coprocessors CP1, CP2, CP3. They are all sharing the data bus. This is the memory, sorry about these lines, and this is the data bus. The data bus is connected to CP1, CP2 and CP3, and this is the ARM. Now, why do I say the location of the CP ID has to be fixed, that is, bits 8 to 11 need to hold the CP ID? Let me put the question as a system designer. We are building an SoC, a system on a chip. We bought the ARM IP core from ARM, and I am designing this for company A. Assume this company A is not an expert in DSP and does not have expertise in floating point. Whether they have the expertise or not is a different issue; they are not in the business of designing these coprocessors, but company A is designing the SoC. They buy IPs from different companies and build an SoC; they make a design and give it to companies like TSMC, because, assume, this company A is a fabless company.
It does not have a fab of its own; it is not Intel or Samsung or one of those companies which have their own fabs. Assume company A is working with TSMC, the Taiwan Semiconductor Manufacturing Company, which is a third-party company that builds chips for other companies and hands over the chip, but to them we need to give a design. So, company A is designing an SoC for its own application by buying IPs from different sources. It need not build these IPs in house. It builds the SoC by integrating them over the AMBA bus; I will talk about this bus later, it is also a specification defined by ARM. So, they build the SoC using the ARM IP, maybe a memory controller from a different company, where the memory will be outside and the controller inside; there may be a cache; and then there will be several coprocessors, each of which is given a dedicated ID. The DSP core may come from one specific company and the floating point core may come from another company, each an expert in providing IPs which can be integrated into the SoC, and another company may provide a networking or an encryption IP built as a coprocessor. Now, if the coprocessors are provided by different companies, they need to have some standard on how they are going to interpret the instructions. Why? Because the instructions which go on this data bus are going to be seen by all of them. Remember, though the DSP instructions are handled by the DSP coprocessor, they will be seen by this coprocessor also and by this coprocessor also.
The only thing is, when they see that the coprocessor ID does not match their own, they may ignore the instruction, but they will certainly see those instructions if they are on the data bus. So, they need to have a common understanding that the bit pattern here indicates the coprocessor ID. Not only this, they also need to know what pattern ARM provides for the CDP instruction. This should be known to the coprocessor vendors, so that when they are building the decode logic they put this information in while decoding the instructions which are coming into their pipeline. I might be building a DSP core, but I might get instructions which are meant for the other cores also. My DSP instructions will have the CP ID which is specific to my coprocessor; if the ID is not matching, I will ignore the instruction. The other coprocessor vendors will do the same thing, but they need to have an idea of what pattern is followed by the instructions and where they have to look for the CP ID in the instruction. They will be getting 32 bits from the bus, so they need to know how to interpret them; that is why the location of the CP ID is fixed. Whereas once I know that the instruction is meant for me, the coprocessor may decide to interpret the remaining bits in any way it wants. Internally I may have only 8 registers; in that case, though ARM provides me a 4-bit pattern for each of the operands, I can restrict it to a 3-bit pattern. And inside the coprocessor I may have many operations to be performed, so I cannot encode them in this limited 4-bit pattern and I may need more bits for that.
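As a rough illustration of the fixed and conventional fields discussed above, here is a small Python sketch that splits a 32-bit CDP word into its conventional fields and checks whether a given coprocessor should claim it. The bit positions follow the standard ARM CDP layout (condition in bits 31 to 28, the pattern 1110 in bits 27 to 24, CP ID in bits 11 to 8, bit 4 equal to 0); the function and field names are my own, and the example word is made up for illustration.

```python
from typing import NamedTuple


class CdpFields(NamedTuple):
    cond: int    # bits 31..28, condition code (interpreted by ARM)
    cp_opc: int  # bits 23..20, coprocessor operation (by convention)
    crn: int     # bits 19..16, first operand register (by convention)
    crd: int     # bits 15..12, destination register (by convention)
    cp_num: int  # bits 11..8, coprocessor ID (fixed location)
    cp: int      # bits 7..5, coprocessor type/info (by convention)
    crm: int     # bits 3..0, second operand register (by convention)


def is_cdp(word: int) -> bool:
    """Bits 27..24 == 1110 and bit 4 == 0 identify a CDP instruction."""
    return (word >> 24) & 0xF == 0b1110 and (word >> 4) & 1 == 0


def decode_cdp(word: int) -> CdpFields:
    if not is_cdp(word):
        raise ValueError("not a CDP instruction")
    return CdpFields(
        cond=(word >> 28) & 0xF,
        cp_opc=(word >> 20) & 0xF,
        crn=(word >> 16) & 0xF,
        crd=(word >> 12) & 0xF,
        cp_num=(word >> 8) & 0xF,
        cp=(word >> 5) & 0x7,
        crm=word & 0xF,
    )


def claims(my_cp_id: int, word: int) -> bool:
    """A coprocessor claims a CDP word only if the CP ID field matches its own ID."""
    return is_cdp(word) and (word >> 8) & 0xF == my_cp_id


word = 0xEEA21103  # hypothetical CDP for coprocessor 1
print(decode_cdp(word))
print(claims(1, word), claims(2, word))
```

Only `is_cdp` and the `cp_num` extraction reflect what every vendor must agree on. A coprocessor with, say, only 8 registers could replace the rest of `decode_cdp` with its own interpretation of the remaining bits.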
So, how they interpret the remaining part is completely implementation dependent. We cannot say whether they will combine this field and this field, or in whatever order they do it; it does not matter. This layout is what is recommended by ARM, the company: you can have these parameters like this, and again it is left to the developer of the coprocessor. But they need to be consistent about where they keep the CP ID and how they interpret the condition code. So, this is the convention, and a coprocessor may redefine the use of all fields except the CP ID; that is what I have explained to you now. The CP# field contains an identifying number for each coprocessor, and a coprocessor will ignore any instruction which does not match its ID. The conventional interpretation of the instruction is that the coprocessor should perform the operation specified in this field and this field; the operation to be performed is conventionally represented in these bit positions, these are the operands, CRn and CRm, and this is the destination. This is the convention; you are free to follow it if you like, otherwise as a coprocessor designer you can decide to do it a different way. I hope this is clear to you; I took a little more time to explain it. Sorry, there is a small difference in the contents here, but that is the convention. Now, this looks very complex, but it is very easy to understand. This is MCLK, which all of you are aware of; it is the main clock. These are the different instructions in the pipeline: this is the fetch stage, and this shows ARM's pipeline. This is the order in which the instructions are given; this is the starting address and the instructions go like this.
Now, you see the add come in; time runs like this. This is the first clock cycle, and after that the second, third, fourth, like this. Add comes into fetch first, correct? Then add moves into the decode stage and sub gets into fetch, and then add moves into the execute stage, sub moves into the decode stage and the CDP instruction gets fetched. Please remember, while this is getting fetched by ARM, the coprocessor is also fetching these instructions in the same way. The instructions flow through its pipeline too. Now, what happens when add reaches the coprocessor pipeline? The coprocessor will ask, is this my instruction? Of course not; it is an add, and the bit pattern shows an ARM instruction, so it will ignore it. Next, sub will enter. Actually these do not have to come to the coprocessor's execute stage at all, because in decode itself it will know that add and sub are not its own instructions. So, it may throw them away there itself, and the execute stage may not do anything. Now assume the CDP is coming. It gets fetched here in this clock cycle and then it gets decoded here. When the CDP gets decoded, remember there may be multiple coprocessors; all of them will look at the instruction and one of them will decide, based on the ID, that it is its own. Assume that this is coprocessor 1 and this instruction is also meant for coprocessor 1. So, coprocessor 1 will pick it up; it will say that it wants to execute it. On the ARM side the instruction was decoded and then it entered the execute stage of ARM. Similarly, it should enter the execute stage of the coprocessor also, but look at the signals here; these three signals are important.
As soon as the CDP instruction enters the execute stage, which means the condition code is satisfied and ARM decides that this instruction should be executed, ARM will bring this signal low to indicate that it has a coprocessor instruction to be executed. It is asking whether any of the coprocessors in the system is willing to take it. Now, see here, the CPA signal has become low. What does it mean? One of the coprocessors in the system has claimed it: coprocessor absent being low means a coprocessor is present. So, one of the coprocessors says, this instruction belongs to me and I want to take it, but unfortunately I am busy now. It may be executing a previous instruction; we do not know, because what happened before this I am not showing here. So, the coprocessor is not currently ready to take the instruction. What happens then? Can the ARM processor carry on with its own job? It cannot, because until this instruction is accepted by the coprocessor ARM has to wait. Nobody has taken it yet; one coprocessor has said that it can take it, but it is busy now. This is called busy wait. ARM waits for both signals to be low. One of them being low does not guarantee that the instruction has been handed over. So, ARM is waiting, and what has happened in the meantime? Two more instructions have been fetched from memory after the CDP, see here, this one as well as this one. Because the CDP is in the execute stage and ARM has a fetch stage and a decode stage behind it, the decode stage can hold the TST instruction and the fetch stage can hold the SUB instruction while the CDP is in the execute stage.
So, two more instructions after the CDP have gone into the pipeline of ARM, and similarly they have gone into the pipeline of the coprocessor also. Now, the coprocessor makes CPB low here. What does that mean? It is accepting this instruction; it is now going to execute whatever the CDP instruction conveys. At this moment ARM knows that somebody owns it. The CDP instruction is accepted, so ARM takes the TST instruction to the execute stage and makes nCPI high, and at that moment the coprocessor which took the instruction also makes these two signals, CPA and CPB, high; that is the way the coprocessor is designed. When the next coprocessor instruction comes, one of the coprocessors may drive the signals low again based on the ID. But at this moment the coprocessor which has accepted the instruction must make them high, because it should not continue to keep them low. Once it has accepted the instruction it should immediately bring them up, so that ARM does not misinterpret the next instruction as accepted. If the signals continue to be low even when the next instruction in the execute stage does not belong to this coprocessor, ARM may interpret the signals and decide that this instruction is also accepted by a coprocessor. So, these two signals should be brought high immediately to avoid the confusion, because, remember, CPA and CPB are driven by multiple coprocessors. If you recall, all the coprocessors drive the inputs of an AND gate, and the AND output of these signals finally goes to the ARM core. That is how the interpretation is happening; this is how the handshake happens.
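The handshake just described can be sketched in a few lines of Python. This is only a logical model, assuming each coprocessor holds CPA and CPB high when it is not claiming an instruction and that the lines are ANDed before reaching the ARM core, as stated above. The function name `arm_view` and the result strings are hypothetical.

```python
def arm_view(cpa_lines, cpb_lines):
    """What ARM concludes while nCPI is low, given each coprocessor's CPA/CPB outputs.

    True models a high signal, False a low signal. The combined signals are the
    AND of all the individual coprocessor outputs.
    """
    cpa = all(cpa_lines)  # combined "coprocessor absent"
    cpb = all(cpb_lines)  # combined "coprocessor busy"
    if cpa:
        return "absent"      # nobody claimed it: the exception handler is called
    if cpb:
        return "busy-wait"   # claimed, but the coprocessor is not ready yet
    return "accepted"        # claimed and taken: ARM moves on


# Two coprocessors; the first one owns the instruction, the second stays idle (high).
print(arm_view([True, True], [True, True]))    # no coprocessor responds
print(arm_view([False, True], [True, True]))   # owner present but busy
print(arm_view([False, True], [False, True]))  # owner accepts
```

Because an idle coprocessor contributes only high inputs, it never disturbs the AND result. This is also why the owning coprocessor has to release both lines as soon as the instruction is accepted: a line left low would dominate the AND for the next instruction as well.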
So, a coprocessor instruction may be accepted immediately by the coprocessor, or ARM may have to wait; that is what is called the busy-wait sequence. I hope this is clear to you; once you understand this, you will be able to follow the remaining timing sequences easily. Now, what is the cycle time taken by this instruction? Let us go back: I said 1S + bI, where b is the number of busy-wait cycles, which are internal cycles. How is that? The time taken by the coprocessor instruction is the time for which it occupies the ARM execute stage; the number of cycles for which an instruction occupies the execute stage of ARM is its cycle time. So, how many busy waits are there? ARM waits here, here and here, 3 clock cycles, while it is waiting for the coprocessor to be free, and then it spends 1 sequential cycle to come out of it. Though the coprocessor has brought the signal low at this moment, ARM cannot execute the next instruction immediately; it can execute it only here. So, 1 sequential cycle goes here and 3 busy-wait cycles happen here. So, you have 1 sequential cycle plus as many internal cycles as there are busy waits. I do not want to call them wasted, but of course, if it is a busy wait, it is a waste only. There is a possibility that there is no busy wait: if the coprocessor is free then b will be 0, only 1 sequential cycle will be taken, and ARM can go to the next instruction immediately; otherwise it has to wait that many cycles. Here, 3 is the number of cycles spent in the coprocessor busy-wait loop.
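The 1S + bI count can be checked with a tiny Python sketch. It assumes we have a per-cycle trace of the combined CPB signal while the CDP sits in the ARM execute stage, with True meaning high (busy). The function name `cdp_cycle_count` and the trace representation are hypothetical, used only to illustrate the arithmetic.

```python
def cdp_cycle_count(cpb_trace):
    """Count cycles for a CDP: b internal (busy-wait) cycles plus 1 sequential cycle.

    cpb_trace holds one sample of CPB per clock while the CDP is in execute;
    True means CPB is high (coprocessor busy), False means it has gone low.
    """
    busy = 0
    for cpb_high in cpb_trace:
        if not cpb_high:
            break
        busy += 1
    return {"S": 1, "I": busy, "total": 1 + busy}


print(cdp_cycle_count([True, True, True, False]))  # three busy waits, as in the diagram
print(cdp_cycle_count([False]))                    # coprocessor free: b = 0
```

With three busy-wait cycles the instruction occupies the execute stage for 1S + 3I, four cycles in all. When the coprocessor is free it costs just the single sequential cycle.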
So, a busy wait needs to be done in case the coprocessor is present but busy with some other operation. ARM busy-waits until the coprocessor instruction is accepted by the coprocessor; that is why that many internal cycles are spent. I hope it is clear to you why it has to wait so long. While ARM is waiting for the coprocessor to accept the instruction, the ARM pipeline is stalled; of course, once the execute stage is stalled, everything behind it is also stalled. Once the instruction is accepted, ARM continues with the execution. Because it is a data processing instruction, ARM does not have to do anything; it is entirely to do with the internal operation of the coprocessor. The instruction has all the operands required for the data processing, and the processing has to be done by the coprocessor. ARM does not have to do anything except wait till the instruction is accepted and then carry on with its flow; that is why ARM continues with the next instruction as soon as it is accepted. The coprocessor may take its own time to complete the instruction, because that is not in the control of ARM. How many cycles the coprocessor takes depends on the complexity of the instruction and on how the particular coprocessor is implemented; it is internal to the coprocessor and has to do with what it is doing with its own register set. So, this has nothing to do with ARM. Very good. Now, what is the format?
It looks complex, but see, for the condition code you can mention EQ, NE, MI (minus) or anything else. CDP is an indicator that this is a coprocessor data processing instruction, but the mnemonic, what you want to write here, can be different; suppose you want to write FADD, you can do that also. These instructions are interpreted by the assembler, and it needs to have awareness of the coprocessor. If it has that awareness, you can write whatever instruction you want and it will generate the equivalent encoding for this particular coprocessor instruction. Then p# has to be given; this is the coprocessor number, the coprocessor ID, which will go into the location I mentioned, bits 8 to 11 in the format. The CP ID will go there, and the other parameters to the coprocessor are left to the individual implementation. They are coprocessor registers, a specific type field and the operation to be performed; these are all very specific to the particular coprocessor implementation. So, this is the typical format. Here p1 indicates coprocessor ID 1, and here this is coprocessor ID 2. The coprocessor needs to do the data processing based on what is mentioned by this operation field; this one may be indicating a floating point addition and that one a floating point multiplication, we do not know. It is specific to the coprocessor implementation. Then this is the coprocessor type and these are the coprocessor registers. In this typical example, if the convention is followed, C2 and C3 are taken as operands, this operation is performed and the result is put into C1. Similarly, in the second example C2 and C3 are the operands, this operation is performed and the result is put in C1.
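To see what an assembler does with such a line, here is a minimal Python sketch that packs the conventional CDP operands into a 32-bit word. The field positions are the standard ARM CDP layout; the function name `encode_cdp`, the operation values 10 and 5 and the type value 2 are assumptions made up for the example, since the real values depend on the coprocessor.

```python
# Condition codes used in the discussion (standard ARM encodings).
COND = {"EQ": 0x0, "NE": 0x1, "MI": 0x4, "AL": 0xE}


def encode_cdp(cp_num, cp_opc, crd, crn, crm, cp=0, cond="AL"):
    """Encode CDP{cond} p<cp_num>, <cp_opc>, c<crd>, c<crn>, c<crm>, <cp>."""
    for name, value, limit in (
        ("cp_num", cp_num, 15), ("cp_opc", cp_opc, 15), ("crd", crd, 15),
        ("crn", crn, 15), ("crm", crm, 15), ("cp", cp, 7),
    ):
        if not 0 <= value <= limit:
            raise ValueError(f"{name} out of range: {value}")
    return (
        (COND[cond] << 28)   # condition, checked by ARM
        | (0b1110 << 24)     # coprocessor data processing pattern
        | (cp_opc << 20)
        | (crn << 16)
        | (crd << 12)
        | (cp_num << 8)      # CP ID, fixed location in bits 11..8
        | (cp << 5)
        | crm                # bit 4 stays 0 for CDP
    )


# CDP p1, 10, c1, c2, c3  (operation value 10 is hypothetical)
print(hex(encode_cdp(1, 10, 1, 2, 3)))
# CDPEQ p2, 5, c1, c2, c3, 2  (operation 5 and type 2 are hypothetical)
print(hex(encode_cdp(2, 5, 1, 2, 3, cp=2, cond="EQ")))
```

A coprocessor-aware assembler would simply map a friendlier mnemonic such as FADD onto a call like this with the right operation value filled in.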
In that second example the operation happens only provided the zero flag is set, that is, when the condition is true. This is the conventional way of interpreting the instruction; the rest is left to the implementation. So, we have seen the data processing instruction; now data transfer. Let me explain why these instructions are required. This is the ARM core; it has an address bus and a data bus to the memory, and all the coprocessors are connected to the data bus. Now, these numbers are unique; as I said, bits 8 to 11 hold the CP ID. So, maybe I call them CP10 and CP11, or whatever; say CP10 and CP11 are floating point coprocessors. And I add CP9 here. So, let us say CP8, CP9 and CP10 are the coprocessors. The memory is here, and ARM is accessing the instructions from the memory; the memory has both data and instructions. Now, I told you that only ARM generates the address, for accessing instructions as well as data. What will the coprocessor do? It has its own registers, fine, no problem, and it gets the instructions along with the ARM core, that is also no problem. But do you think the coprocessors can do their job without communicating with the memory? It is not possible, because after all the coprocessors, or any processors for that matter, including ARM, have a limited number of registers. To do a meaningful job, say a scientific application, the data has to be maintained in the memory, brought into the processor and written back into the memory.
So, even coprocessors need some data from the memory, and they should have a facility to write into the memory and read from the memory. But I told you that the coprocessors are not connected to the address bus and cannot generate addresses; the master is ARM, and only ARM can generate the address and access the memory. In that case the coprocessors will not be able to keep any of their data in the memory, correct? To help them do that, there are some instructions which let them transact with the memory, with the limitation that they cannot generate the address. So, what happens is that ARM generates the address for them, and the coprocessor gives the data to memory or takes the data from memory. Can you see this? ARM will generate the address, but the data will be handled by the coprocessor. The data transfer can be a load or a store. Load means some value from memory is copied into a register. So, a load operation may be done to bring some value from memory into the coprocessor's registers, the coprocessor does some operation with it, and then it does a store operation to write the result into the memory. This is possible now with these instructions; the only caveat is that the addresses have to come from ARM while the data is provided by the coprocessor. The two have to be coordinated, please remember; it cannot be arbitrary, because memory is a common element: it is getting the address from ARM and the data from the coprocessor. As far as the memory is concerned, once the address is provided, if it is a read operation it will put the data out.
If it is a write operation, once the address is given it will expect somebody to put the data onto the data bus. So, there should be close coordination between the three parties. Who are the parties? ARM, one of the coprocessors, and the memory. When any data needs to be transferred between a coprocessor and memory, ARM provides the address and the coprocessor provides the data or accepts the data. You should understand this concept very well; otherwise the rest of the session will not make sense. So, it is a data transfer between coprocessor and memory, but the address is provided by ARM. I will tell you later why the control is always with ARM, but for the moment please remember that the address is generated by ARM, and depending on whether it is a read or a write operation the data is either accepted by the coprocessor or provided by the coprocessor. Let us go on. This is the format. You will be able to understand it if you recall the load and store instructions, LDM and STM, that we covered earlier. The instruction format looks similar; these are all things which you have seen earlier in those instructions. This part might be different; I do not recall what encoding was used for LDM and STM, but this whole pattern indicates that this is a data transfer instruction between a coprocessor and memory. Now, let us see which operands are read by the coprocessor and which are read by ARM. I did the color coding so that blue indicates that the bit pattern or operand is read by the coprocessor. So, this is the coprocessor source or destination register. This is the coprocessor number; please remember, the coprocessor ID should be in bits 8 to 11. Now, this is the offset, which is used in the address calculation. And what is this?
This is the base register. Is it an ARM register or a coprocessor register? It is an ARM register. Why? The base register is what is used for address generation, and who is generating the address? ARM is. So the instruction should indicate which ARM register is to be used as the base register for generating the address. This bit field is therefore used by ARM; in fact, all the patterns except the ones in blue are used by ARM. Because of the size restriction only an 8-bit immediate offset is given, but it is given in words. That means whatever offset is given is shifted left by 2 bits so that it becomes a byte offset. Then the source or destination register is a coprocessor register. Why do we need to mention a coprocessor register? ARM is there, memory is there and the coprocessor is there. Take a load: some value in memory has to be copied into a register of the coprocessor. Take a store: some register in the coprocessor has to be written to a particular address in memory. That address has to come from ARM, which computes it for the memory transaction, whereas the data is handled by the coprocessor: on a store it provides the data to be written into memory, and on a load it takes the data given out by memory into some register. So you need a coprocessor register to be mentioned here, and that is called CRd.
Similarly, for generating the address you need a base register, and that is called Rn. So Rn is the ARM register used for address generation, CRd is the destination or source register in the coprocessor used for the data values, and the offset has to do with the arithmetic ARM uses to generate the address. Based on whether it is a load or a store, this bit will be set. Whether the newly computed address is written back into the base register is decided by this bit. The transfer length bit says whether it is a single word or a multiple word transfer; multiple does not say how many, only that multiple transfers are happening. So one bit selects either a single transfer or a multiple transfer. Then there is whether the offset is added or subtracted and whether that is done before or after the transfer, similar to the addressing modes you learnt for the ARM core. I hope this is clear to you. This is the crux of the matter; you should know that this is how the instruction is interpreted. This class of instruction is used to load (LDC) or store (STC) a value or a subset of the coprocessor's registers directly from or to memory. So you can have either one value or multiple values, and I will tell you how it is done. ARM is responsible for supplying the memory address, the coprocessor supplies or accepts the data, and the coprocessor also controls the number of words transferred. This is very important, so let me mark it in a different colour: apart from accepting or providing the data, the coprocessor also controls the number of words. I will tell you how that is done in the next slide.
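To make the field split concrete, here is a minimal Python sketch that pulls the fields out of a coprocessor data transfer word. It assumes the classic 32-bit ARM LDC/STC layout (condition in bits 31 to 28, the pattern 110 in bits 27 to 25, then P, U, N, W, L, Rn, CRd, the coprocessor number and the 8-bit word offset); the function name and the example word are illustrative and are not taken from the slides.

```python
def decode_cp_data_transfer(word):
    """Split a 32-bit LDC/STC instruction word into its fields.

    Assumes the classic ARM coprocessor data transfer layout.
    """
    if (word >> 25) & 0b111 != 0b110:
        raise ValueError("not a coprocessor data transfer instruction")
    return {
        "cond": (word >> 28) & 0xF,    # condition code, checked by ARM
        "P": (word >> 24) & 1,         # 1 = pre-indexed, 0 = post-indexed
        "U": (word >> 23) & 1,         # 1 = add offset, 0 = subtract
        "N": (word >> 22) & 1,         # transfer length, read by coprocessor
        "W": (word >> 21) & 1,         # 1 = write address back to Rn
        "L": (word >> 20) & 1,         # 1 = load (LDC), 0 = store (STC)
        "Rn": (word >> 16) & 0xF,      # ARM base register
        "CRd": (word >> 12) & 0xF,     # coprocessor register
        "cp_num": (word >> 8) & 0xF,   # coprocessor ID, bits 11 to 8
        "offset_words": word & 0xFF,   # 8-bit offset, counted in words
    }


# Illustrative word: a store with base R5, coprocessor register 3,
# coprocessor 2 and a word offset of 6.
fields = decode_cp_data_transfer(0xEDA53206)
print(fields)
```

Of these fields, only CRd, N and the coprocessor number are meant for the coprocessor; the rest are what ARM needs to generate the address.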
Now, CP# identifies the coprocessor that has to do the job, and a coprocessor will respond only if its number matches; this you know. Rn is the base register, as I explained. CRd and the N bit contain information for the coprocessor, which may be interpreted in different ways by different coprocessors. By convention, CRd is the register to be transferred, or the first register where more than one is to be transferred. The transaction can be a single word, in which case one register is transferred between the coprocessor and memory. The other option is a multiple transfer: you mention the first register, and the coprocessor transfers from there onwards. Suppose the coprocessor has 16 registers, CR0 to CR15, you mention CR4, and the N bit is set for a multiple transfer. That may indicate to the coprocessor that everything from CR4 to CR15 is to be stored to memory or loaded from memory. It is all implementation dependent; you cannot say that this is how it will always be implemented, and your coprocessor can internally decide how many transfers are to be done. So the N bit selects one of two transfer length options: N equal to 0 may mean a single transfer, and N equal to 1 may mean all registers starting from some particular coprocessor register. This gives you some understanding of how the number of transfers is decided. In the timing diagram, everything up to this point is the same as what I explained earlier.
So we do not have to go through that pipeline again. At this moment the LDC instruction, which is load coprocessor, has come to the execute stage of ARM, and similarly in the coprocessor. The coprocessor has said that it is available: coprocessor absent is low, which means it is present, and busy is low, which means it is not busy. That means the coprocessor has accepted the instruction and is going to perform the data transfer with the help of ARM. Now see what happens after the LDC reaches the execute stage. This next instruction has also been fetched from memory by ARM. From here up to this point, the data is provided or accepted by the coprocessor; what is going on the data bus is coprocessor data. And who is generating the address? The address is generated by ARM, and the coprocessor handles the data. Now, how does ARM know when to stop generating addresses? When it is a multiple transfer, the address has to be incremented. Please remember that all the transfers are 32 bits, similar to LDM and STM; 32 bits means 4 bytes, so the address has to be incremented by 4. The starting address is computed based on the LDC instruction, and then ARM keeps incrementing it by 4. Suppose address 100 is generated by ARM first; from then onwards it will be 104, 108 and so on. Also remember that for a transfer to happen in a given cycle, its address is generated one cycle ahead. That is why you see that while this instruction is being fetched in this cycle, the new address for the data transfer is already being put out, and then the data values are transacted over the bus.
So who is reading or writing this data? The coprocessor is accepting it or generating it, but the address is generated by ARM. When this busy signal goes up, that indicates that the coprocessor is done with the multiple transfers. The coprocessor decides how many transfers happen; let me erase this to show you how. The number of transfers is indicated by the coprocessor by keeping this signal low for that many cycles. In this case 4 words have been transferred, and that has been controlled by the coprocessor, because CPB is generated by the coprocessor. By keeping it low, it informs ARM: do not move on from the LDC execution, continue to generate addresses. Please remember this is not like CDP, where ARM can simply go to the next instruction once somebody has accepted the instruction. Here ARM cannot go to the next instruction, because ARM has the job of generating the address. It cannot throw the LDC out of the execute stage and take up the next instruction, the TST, because it will be able to generate the addresses only while the LDC instruction is in the execute stage of the ARM pipeline. So the LDC instruction will continue to be in the execute stage of ARM until this busy signal goes high. When that happens, the coprocessor is indicating: I am done with the transfer, you can stop generating addresses. Only at that moment does ARM generate the address for fetching the next instruction and go on with the execution of the TST, the next instruction that was waiting in the pipeline.
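The division of labour in a multiple-word load can be sketched in a few lines of Python. This is a toy model under stated assumptions, not a cycle-accurate one: the class `ToyCoprocessor`, its method `wants_more_transfers` (standing in for the coprocessor holding ARM in the execute stage through CPB) and the memory contents are all made up for illustration, and the one-cycle-ahead address timing is not modelled.

```python
class ToyCoprocessor:
    """Hypothetical coprocessor that decides how many words it loads."""

    def __init__(self, first_register, word_count):
        self.registers = {}
        self.next_register = first_register
        self.remaining = word_count

    def wants_more_transfers(self):
        # Stands in for the handshake that keeps LDC in ARM's execute stage.
        return self.remaining > 0

    def accept(self, value):
        # The coprocessor, not ARM, takes the data off the data bus.
        self.registers[self.next_register] = value
        self.next_register += 1
        self.remaining -= 1


def run_ldc(memory, start_address, coprocessor):
    """ARM's side: generate word addresses until the coprocessor is done."""
    addresses = []
    address = start_address
    while coprocessor.wants_more_transfers():
        addresses.append(address)
        coprocessor.accept(memory[address])  # memory drives the data bus
        address += 4                         # every transfer is one word
    return addresses


memory = {100: 11, 104: 22, 108: 33, 112: 44}
cp = ToyCoprocessor(first_register=4, word_count=4)
addresses = run_ldc(memory, 100, cp)
print(addresses, cp.registers)
```

Note that `run_ldc` never learns the word count in advance; it simply keeps generating addresses until the coprocessor stops asking, which is the point made above.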
So, can you see this? The transfer is happening with the help of ARM, the data is either provided or accepted by the coprocessor, and ARM has to keep the data transfer instruction in the execute stage depending on how CPB is driven by the coprocessor, until the transfer completes. This is very important; I hope it is clear to you. Now the cycle time, which I have explained before for the ARM instructions. There is one non-sequential cycle, because the first data transfer uses a newly generated address, and the rest of the data transfers are sequential, because they are all to adjacent addresses. So it is 1 N, then the number of words transferred minus 1 as S cycles, and then the instruction was in the execute stage for one more S cycle, as you can see here. The busy-wait was 0 here because the instruction was accepted immediately; otherwise the busy-wait cycles are also added. So that is 1 N, n minus 1 S for the words, plus 1 S for the execution and address generation. Effectively this will be n S and 1 capital N, the non-sequential cycle for the first address, plus the busy-wait I cycles. Count the cycles in the diagram: there are 5, and 4 words are transferred. The 4 words account for 1 non-sequential and 3 sequential cycles, that is 3 plus 1, 4 cycles, and the remaining one is the extra S cycle. So if there is a busy-wait, the instruction will have that many internal cycles, and then 1 non-sequential cycle followed by n minus 1 sequential cycles, where n is the number of words transferred.
So ARM is responsible for providing the address used by the memory system, as I have explained already. The addressing modes available are similar to those of the ordinary load and store instructions. Note, however, that the offset is limited: in the ARM single data transfer it was 12 bits, but in the coprocessor instruction it is only an 8-bit offset, and it is shifted left by 2 bits because the offset is given in words; we are sure that only word transfers can happen. Whether the offset is added or subtracted is based on this bit, because writing plus or minus before the offset decides between addition and subtraction. Whether the addition or subtraction is performed before or after the transfer, that is pre-indexing or post-indexing, is decided by this bit, based on whether you write the offset inside or outside the brackets. You should go back to the load and store addressing modes to recall these bit patterns. Then W is the write-back bit: whether the modified address is written back to the base register or not is given by this. In pre-indexed addressing, the value of the base register modified by the offset is used as the address for the transfer of the first word. So the computed address is used for the first word transfer, and after that it is incremented: the second word will be at an address 4 bytes higher, because it is always a word transfer, and the address is incremented by one word for each subsequent transfer. As I told you, sequential addresses are always incremented. This is the general syntax; the details are again application specific. You write LDC or STC, then the condition code, then the coprocessor ID, which coprocessor register the value is copied to or from, and how to generate the address. These are some example usages of the instruction.
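The addressing rules just described can be sketched as a small Python function. It is a sketch under assumptions: the name `cp_transfer_address` is made up, the base register value is passed in as a plain integer, and write-back is modelled as happening only when W is set, for both pre-indexed and post-indexed forms.

```python
def cp_transfer_address(base, offset_words, p, u, w):
    """Return (address of first word, new base register value).

    base         -- current value of the ARM base register Rn
    offset_words -- the 8-bit offset field, counted in words
    p            -- 1 for pre-indexing, 0 for post-indexing
    u            -- 1 to add the offset, 0 to subtract it
    w            -- 1 to write the modified address back to Rn
    """
    offset_bytes = offset_words << 2           # words to bytes
    modified = base + offset_bytes if u else base - offset_bytes
    first_address = modified if p else base    # post-indexing uses old base
    new_base = modified if w else base
    return first_address & 0xFFFFFFFF, new_base & 0xFFFFFFFF


def word_addresses(first_address, word_count):
    """Addresses of a multi-word transfer: one word (4 bytes) apart."""
    return [(first_address + 4 * i) & 0xFFFFFFFF for i in range(word_count)]


# Pre-indexed, add, write-back: base 0x1000 with a 6-word (24-byte) offset.
first, new_base = cp_transfer_address(0x1000, 6, p=1, u=1, w=1)
print(hex(first), hex(new_base), [hex(a) for a in word_addresses(first, 3)])
```

With the values above, the first word goes to 0x1018 and the base register ends up holding 0x1018 as well; with post-indexing the first word would go to 0x1000 instead.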
So, take R5 as the base address, add 24 to it, and use that value to access memory. It is a store, so coprocessor register C3 is copied to this address, and then R5 is updated because the exclamation mark is there, so write-back happens. And what is this p2? It is the coprocessor number: coprocessor 2 is used for this transfer. Although the address offset is expressed in bytes here, internally it is stored in words. I told you that the offset is shifted left by 2 bits because it is a word transfer, so the offset can be shifted right by 2 bits and stored, and internally the ARM processor shifts it left by 2 bits and uses that value. The assembler will adjust the offset accordingly. Now there is one special case: if Rn is R15, the value used will be the address of the instruction plus 8 bytes. Let me explain. If by chance we write Rn as R15, we should know which R15 value is used: it is whatever the PC is at that point. Suppose this instruction is at address 100; then the R15 value used will be 108. And write-back into R15 should not be used. So if you are using R15 instead of R5, you should not specify write-back; that is the condition, and the assembler will give an error and not accept that kind of instruction. Coprocessor data transfer is not available in Thumb state, just like any other coprocessor instruction. One more thing: a data abort is possible in a coprocessor data transfer too, if the address is legal but the memory manager generates an abort.
Now, an abort has to be possible irrespective of whether the transfer is between ARM and memory or between a coprocessor and memory. The abort should be allowed because it may be, for example, that the floating-point data is not currently available in memory, and then an abort has to be generated. Who will execute the abort handler? ARM will, not the coprocessor. ARM executes the abort handler, does what is required to remove the cause of the abort, and then restarts the instruction. So it is similar to a data abort happening in the ARM processor: the write-back of the modified base will take place, but all other processor state will be preserved. So if write-back is specified, for example with Rn and an immediate offset, the base register will hold the written-back value even though the abort happened. How this is dealt with is specific to how you write the abort handler for the coprocessor. I am not going into the details because they are very specific, but I want you to keep in the back of your mind that a data transfer between a coprocessor and memory can also cause an abort, and to ask how ARM behaves in that case. I say the coprocessor is partly responsible because the data is provided by the coprocessor and the address is generated by ARM; it is a three-party transfer involving ARM, memory and the coprocessor. So if the instruction has to be restarted after the abort, the coprocessor also has to be involved in that.
So the coprocessor may need to know that this abort has happened; there may be some signal that tells it there is an abort, because a pipeline flush will happen in ARM and similarly in the coprocessor. The parties work together to resolve this issue; memory is anyway a dumb entity here, but ARM and the coprocessor work on the abort handling and must ensure that any subsequent actions the coprocessor takes can be repeated when the instruction is retried. On a data abort, you remember, the instruction is retried, which means it is executed again after the abort handler has completed. So we have to make sure that the coprocessor and ARM keep their register state in such a way that the correctness of the program is not affected. Do not bother too much about it, but I want you to be aware that a data abort is possible in a coprocessor transfer too. Now, the last instruction type, which is the register transfer. Let me come back to the picture: ARM is there, the coprocessor is there, memory is there, and the data bus connects them. We saw that the coprocessor can do its own job using the CDP instruction. We saw that data transfer between a coprocessor and memory can happen using the coprocessor data transfer instructions. So an internal operation is done by CDP and the transfer with memory is done by LDC and STC, but there is one more transfer possible. ARM is the master and the coprocessor is a slave, and I also told you that conditional execution is based on the flags inside the ARM processor. Let me give an example. Suppose you have written some ARM code, and in it a coprocessor FADD is done. Assume the FADD result is written into coprocessor register CR1, and suppose the result is 0.
Suppose a 0 value is written there at the end of this instruction. Now, somewhere down the line, I want to branch if the floating-point operation resulted in 0; say I want to go to the label L1, and otherwise I want to execute this instruction. The floating-point operation is done along with the ARM instructions, so there should be some overlap between them. Otherwise what is the use of having some ARM code running and some DSP code running? They need to be in sync, and some results of the coprocessor need to reflect on the control flow of the ARM instructions; otherwise what is the use of the whole thing running in a single application? So some control flow should be possible based on the outcome of the operation done by the coprocessor. How can it be done? I am taking floating point as the example because it is easy to comprehend; assume floating point is a scientific notation, which I will be covering in the next class. The coprocessor also has some flags, but these flags are not accessible from the ARM code; they are internal to the coprocessor. ARM cannot change its control flow based on them without having access to them. So, to provide this option, there is a way to transfer some information between ARM and the coprocessor.
This next set of instructions is called register transfer. What is a register transfer? ARM has a set of registers and the coprocessor has a set of registers, and values can be transferred between them. But how ARM interprets the values and how the coprocessor interprets them are different: in ARM they are all integers, whereas the coprocessor may be doing floating-point arithmetic. If the transfer instruction converts the format before putting the value in the ARM core, then values can be exchanged between the ARM core and the coprocessor, and condition flags can also be exchanged between them. So there are two things: register values, and flags, which can also be exchanged between a coprocessor and the ARM processor. Suppose I have a zero flag in the coprocessor. If I am able to transfer it to the zero flag in ARM, then I can do this operation: if my floating-point arithmetic resulted in 0, I want control to go to L1, and L1 is some code which ARM is executing. So I can control the flow of ARM based on the outcome of what is happening in the coprocessor. There should therefore be some provision for the floating-point registers, as well as the flags of the coprocessor, to be transferred to the ARM processor, so that such a conditional flow can be done effectively. This is the need for this set of instructions. Let us see how these instructions help us and how they are implemented. There are two, MRC and MCR. If you recall, there were MRS and MSR instructions earlier in ARM. What were they? MRS transfers the CPSR to some register; the convention is always that the destination comes first. With MSR you transfer a register to the CPSR, where S stands for the status register that holds the flags.
Those were the instructions we saw some time back for the ARM core. The convention here is the same: coprocessor to ARM register is the MRC instruction, and ARM register to coprocessor is the MCR instruction. You can read MRC as R equal to C if you want to remember it: R is for the ARM register and C is for the coprocessor register, so the coprocessor register is written into the ARM register. MCR is the other way: the ARM register is written into the coprocessor register. That is all, very simple. Similarly, in MRS and MSR, S is the status register and R is the ARM register; that was the convention we followed there. Now, this instruction is not as complex as the memory transfer instruction; the only thing is that there are more parameters. Here CRn is a coprocessor operand register, this is another coprocessor operand register, and Rd is an ARM register. So either the coprocessor does an operation and the result is written into the ARM register, or the ARM register is written into some register in the coprocessor. It is called Rd, but that does not mean it is only a destination register; it can also be a source register. Similarly, the coprocessor registers can be source or destination. So the transfer can be from coprocessor to ARM or from ARM to coprocessor. It could be a load or a store, and you can even perform some operation and then transfer the result. I will give you an example. Suppose you have a floating-point coprocessor, and ARM is there.
Suppose the value 1.0 is stored in coprocessor register CR0, and you want to transfer it to R0 of ARM, where the 1.0 should become 1. Because ARM uses two's complement notation and the coprocessor uses a floating-point representation, you cannot copy the same bit pattern and then try to interpret it as 1. There has to be a conversion: the pattern has to be converted to the ARM format so that it reads as 1. That operation can be specified here in the instruction, so that the coprocessor performs the conversion before transferring the value and then pushes it across. The values are represented differently in the ARM register and in the coprocessor register, so you cannot have the same bit pattern in both places and expect it to work; some transformation has to be done, and it is specified by this operation field. So with this type of instruction you communicate information directly between ARM and the coprocessor. An example is a FIX of a floating-point value held in a coprocessor. This is the fix I mentioned, where 1.0 becomes 1 in ARM. FIX converts a floating-point value into an integer value; it is an instruction used by the floating-point coprocessor, where the floating-point number is converted into a 32-bit integer and the result is then transferred to an ARM register. Similarly, suppose 3 is in an ARM register and it has to be written into a floating-point register; it has to be written as 3.0. That transformation is called FLOAT. So there are different instructions which are used.
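A quick way to see why the same bit pattern cannot be shared is to look at the two representations side by side. The Python sketch below assumes the coprocessor holds IEEE 754 single-precision values, which the lecture has not stated at this point; the function names `fix` and `flt` are only stand-ins for the FIX and FLOAT conversions mentioned above, and `fix` simply truncates toward zero rather than modelling any particular rounding mode.

```python
import struct


def float_bits(value):
    """Bit pattern of a single-precision float, as a 32-bit integer."""
    return struct.unpack("<I", struct.pack("<f", value))[0]


def fix(cp_register_bits):
    """FIX-like conversion: float bit pattern -> two's complement integer."""
    value = struct.unpack("<f", struct.pack("<I", cp_register_bits))[0]
    return int(value) & 0xFFFFFFFF   # truncation toward zero (assumption)


def flt(arm_register_value):
    """FLOAT-like conversion: integer -> float bit pattern."""
    return float_bits(float(arm_register_value))


cr0 = float_bits(1.0)
print(hex(cr0))        # the pattern the coprocessor holds for 1.0
print(fix(cr0))        # what ARM should receive in R0
print(hex(flt(3)))     # the pattern the coprocessor should hold for 3
```

The pattern for 1.0 is 0x3F800000, which ARM would read as a large integer if it were copied across unchanged; that is why the conversion has to happen as part of the transfer.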
We do not have to remember them, but you should understand the concept: when data is transferred between ARM and a coprocessor, and the coprocessor interprets the data in one way and ARM in another, the data needs to be transformed into the other format, and that is achieved while transferring the values. That is all that is important here. Now, an important use of the MRC instruction is to communicate control information directly. We said that you may want to change the CPSR flags of ARM. Let me check whether you remember this. In ARM, I mentioned that the MSR and MRS instructions are used for the CPSR. You cannot have a MOV instruction operating on the CPSR; you cannot say MOV R1, CPSR. If that is what you want, you first say MRS R2, CPSR, so that the CPSR is copied into R2, and then MOV R1, R2; or, if you want it in R1, you can name R1 directly in the MRS. Why do I go around this? Because MOV R1, CPSR means something I cannot do: I cannot access the CPSR with ordinary instructions, and I need a special instruction to move it to a register. Let me come back. I told you that if you want to change the control flow, the flags in the coprocessor have to be transferred to ARM, because ARM uses the flags in the CPSR to change the control flow of the ARM instructions. How can that be done? There is a special way, and you can use only the MRC instruction for it, which moves from coprocessor to register. Suppose you give R15 as the ARM register, and the coprocessor supplies its flags.
Now, what happens is that the top 4 bits of what you transfer are not written into R15; they are written into the CPSR, which has the 4 flag bits. When a coprocessor register is transferred to ARM with R15 as the destination register, bits 31 to 28 of the transferred word are copied into the flags; the other bits remain unaffected, and the PC is also not affected. So it is an indirect way of saying: I want to transfer the flags inside the coprocessor to the flags in the ARM processor. What should you do? The coprocessor will have its own set of instructions, and there may be some instruction that brings the flags of the coprocessor into one of the coprocessor registers. You execute that instruction and then transfer that register to R15 of ARM. Once you do that, the flag values have come into the CPSR of ARM, and then you can have ARM code with a BEQ or some other conditional check to change the flow of control. That is how you execute ARM code in sync with the coprocessor instructions; ARM provides a provision for this job. As for the instruction fields, I have explained them already: the ARM register is source or destination based on whether it is MCR or MRC, certain fields are interpreted by the coprocessor, and CP# indicates the coprocessor ID. One more thing: suppose you are doing a register transfer with R15 as the source register, that is, you want to do an MCR from ARM.
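The R15 special case can be sketched in Python as a one-line mask operation. The function names and the example CPSR value are made up for illustration; the flag positions (N, Z, C, V in bits 31 to 28 of the CPSR) are the standard ARM ones.

```python
FLAG_MASK = 0xF0000000   # N, Z, C, V live in bits 31 to 28


def mrc_to_r15(cpsr, transferred_word):
    """MRC with Rd = R15: only the top four bits reach the CPSR flags.

    The remaining CPSR bits are kept, and the PC is not touched at all.
    """
    return (cpsr & ~FLAG_MASK & 0xFFFFFFFF) | (transferred_word & FLAG_MASK)


def z_flag(cpsr):
    return (cpsr >> 30) & 1


# Hypothetical case: the coprocessor reports "result was zero" in bit 30.
old_cpsr = 0x000000D3
new_cpsr = mrc_to_r15(old_cpsr, 0x40001234)
branch_taken = z_flag(new_cpsr) == 1    # what a following BEQ would test
print(hex(new_cpsr), branch_taken)
```

The low bits of the transferred word (0x1234 here) are simply dropped, which is why the coprocessor has to place its flags in the top four bits before the transfer.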
So, suppose the transfer is from an ARM register to a coprocessor register, and the register you mention on the ARM side, Rd, which is the source register here, is R15. So the R15 value is going to some coprocessor register, say register 1 of the coprocessor named by the CP ID. You may expect to get the address of the current instruction, the place R15 is pointing at. What you actually get is PC plus 12. It is a good question to think about why that is so. Let me give an example: there is an ADD, one ARM instruction, then you are executing the MCR, then there is a SUB instruction, then a TST instruction, and then a MUL instruction. The MCR is register to coprocessor, and you have mentioned that R15 needs to be transferred to coprocessor register 1. When the MCR is in the execute stage, this instruction has been fetched and this one is being fetched; suppose the MCR is at address 100, so the PC is at 108. Now, the register transfer has to happen through the data bus, because that is the only path between ARM and the coprocessor. That means instruction fetch has to be stopped for a while. So after fetching this instruction, the address is incremented by another 4, and that value is written into R15 before the transfer is carried out. As I told you, when an instruction is in the execute stage the next two instructions have already been fetched and the PC has already been incremented. Since we need the data bus for transferring the value to the coprocessor, we cannot fetch the following instruction at that time.
So the fetching has to be stopped, but the instruction address would already have been incremented to the next value, which happens to be the instruction address plus 12. So if you give R15 as the source register to be moved into a coprocessor register, what you will get is the plus 12 value; that is what is explained by this bullet. I hope this is clear to you. This is the timeline. The MCR has come into the execute stage, and ARM is bringing this signal low; it is a register transfer between the coprocessor and the ARM processor. ARM delays it by one cycle because the data bus is already in use for the memory access and it has to complete this fetch. So it completes the fetch and then makes the signal low. If the coprocessor accepts immediately, the value in the register, either in the coprocessor or in the ARM processor, is exchanged, and then the signal comes back. Please remember that this may not be completed in one cycle; it can take multiple cycles. As I mentioned just now, during a FIX or FLOAT transfer the coprocessor can perform some operation, and especially when it is supposed to transfer a coprocessor value into an ARM register it will take some time to do that, so the busy-wait will be longer. But from ARM to the coprocessor, the register value is copied into the coprocessor and the coprocessor then performs its job, so ARM can continue with its execution during that time. So the delay will be there only for a coprocessor to ARM transfer, because the coprocessor has to perform some transformation and then send the value to ARM. That is not true in the other direction: the integer value in ARM is ready to be transferred to the coprocessor, and the transformation the coprocessor has to do is internal to it, so ARM may not have to wait for it. So the busy-wait ends early.
That way ARM can continue with its execution and the transfer completes. This is a subtle point that you should remember.

Now the cycle time. It is one sequential cycle for the execution, then b busy-wait cycles if there is a wait, and then one coprocessor cycle. It is called a coprocessor cycle because the transfer is between ARM and the coprocessor. And why is there a plus 1 apart from the busy-wait? That is only for MRC, which is coprocessor to ARM register. When the value comes from the data bus into the ARM core, it first lands in the data-in register, and from there it is taken into the register set. That move takes one internal cycle. That is why the plus 1 is there only for coprocessor to register and not for register to coprocessor: in that direction the register value is put into the data-out register and goes out immediately. Reading from a coprocessor is similar to reading from memory, because the value comes in through the data bus, so it takes one more internal cycle to transfer it into the destination register.

And this is the format. Do not worry too much about it; it is very specific to the implementation. This is the CP ID, and then you mention which ARM register you are talking about and which coprocessor registers. These are different examples of the instruction. In this one the coprocessor ID is 2 and it is an MRC, that is, coprocessor to register: the coprocessor performs operation 5 on c5 and c6 and transfers the single 32-bit word result into the ARM destination register.
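The cycle accounting and the operand fields described above can be sketched together in Python. Treat this as a hedged illustration: the function names are hypothetical, the bit positions follow ARM's published MRC/MCR encoding rather than anything stated in the lecture, and R3 as the destination register in the usage line is an assumption.

```python
def transfer_cycles(to_arm, busy):
    """Cycle breakdown: S = sequential, I = internal or busy-wait, C = coprocessor."""
    # The extra internal cycle (data-in register to register set) applies
    # only to coprocessor-to-ARM transfers.
    internal = busy + (1 if to_arm else 0)
    return {"S": 1, "I": internal, "C": 1}


def encode_transfer(to_arm, cp_num, opc1, rd, crn, crm, opc2=0, cond=0xE):
    """Pack an MRC (to_arm=True) or MCR (to_arm=False) into a 32-bit word."""
    limits = {"cond": (cond, 15), "cp_num": (cp_num, 15), "opc1": (opc1, 7),
              "rd": (rd, 15), "crn": (crn, 15), "crm": (crm, 15),
              "opc2": (opc2, 7)}
    for name, (value, limit) in limits.items():
        if not 0 <= value <= limit:
            raise ValueError(f"{name}={value} does not fit its field")
    # cond | 1110 | opc1 | L | CRn | Rd | CP number | opc2 | 1 | CRm
    return (cond << 28 | 0b1110 << 24 | opc1 << 21 | int(to_arm) << 20 |
            crn << 16 | rd << 12 | cp_num << 8 | opc2 << 5 | 1 << 4 | crm)


# Coprocessor 2, operation 5 on c5 and c6, result to R3 (assumed destination).
word = encode_transfer(True, cp_num=2, opc1=5, rd=3, crn=5, crm=6)
print(hex(word), transfer_cycles(to_arm=True, busy=0))
```

Only the direction bit and the extra internal cycle distinguish the two directions here; everything else in the word is left for the coprocessor to interpret.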
So the coprocessor does some operation on those registers and transfers the result to the ARM register. Now, what is the MCR? It moves R4 to the coprocessor, which uses the operands to do some operation based on the value coming to it and then writes the result into c6. So whatever value is read from R4, the operation is done on it. And this last one is conditional: if the condition is true, it is a coprocessor-to-register transfer where the operation is 9 and the type is 2. The coprocessor uses both c5 and c6 and transfers the value into R3. That is what each instruction means.

With this we have come to the end of coprocessor instructions. This has been a high-level view of how the coprocessor instructions are provided and how they are used. This knowledge is enough: once you have this background, you can go into the details of any particular coprocessor implemented for ARM, and you will be able to understand its instructions, because each of them will be encoded as one of the three types of instruction. You will also be able to appreciate how the transfer happens, how many cycles it takes, and, if memory is involved, how the address is generated by ARM and how the coprocessor is interfaced with it.

I hope this talk was useful for you to understand this. In the next class we will talk about the coprocessor for floating-point arithmetic and its representation, and then we will take a look at the vector floating-point processor in the following lecture. Thank you very much for your attention. I enjoyed sharing this knowledge with you; I hope it was useful, and enjoy your reading. Thank you very much. See you in the next class.