Hello and welcome to this presentation of the STM32H7 CORDIC coprocessor block. It will cover the main features of this block, which is used to accelerate trigonometric functions.

The CORDIC coprocessor provides hardware acceleration of certain mathematical functions, notably trigonometric functions, commonly used in motor control, metering, signal processing, and many other applications. It speeds up the calculation of these functions compared to a software implementation, allowing a lower operating frequency or freeing up processor cycles to perform other tasks.

The CORDIC block is an AHB slave that inserts wait states when the Cortex-M requests the result, until the operation is completed. No input/output driver is therefore needed. Another approach consists of letting the Cortex-M handle other processing while the CORDIC calculation is in progress. In this case, an interrupt request indicates that the result is available. DMA channels can be used to provide the arguments from memory and to write the results to memory. The CORDIC block supports pipelined operation: the next arguments can be provided while the calculation with the current arguments is in progress. Note that the CORDIC block is a fixed-point arithmetic accelerator.

CORDIC, which stands for coordinate rotation digital computer, is a hardware-efficient iterative method which uses rotations to calculate a wide range of elementary functions. In trigonometric, or circular, mode, the sine and cosine of an angle θ are determined by rotating the unit vector (1, 0) through successively decreasing angles until the cumulative sum of the rotation angles equals the input angle. The x and y Cartesian components of the rotated vector then correspond respectively to the cosine and sine of θ. Inversely, the angle of a vector (x, y), corresponding to arctan(y / x), is determined by rotating (x, y) through successively decreasing angles to obtain the unit vector (1, 0).
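The two circular modes just described can be illustrated with a short software model. This is only a sketch under stated assumptions: it uses floating point rather than the fixed-point arithmetic of the hardware, and the names cordic_rotate, cordic_vector and _gain are hypothetical and do not come from any ST library.

```python
import math

def _gain(iterations):
    # Each elementary rotation stretches the vector by sqrt(1 + 2^-2i);
    # this is the accumulated stretch after all iterations.
    g = 1.0
    for i in range(iterations):
        g *= math.sqrt(1.0 + 2.0 ** (-2 * i))
    return g

def cordic_rotate(theta, iterations=24):
    """Rotation mode: return (cos(theta), sin(theta)), theta in radians, |theta| <= pi/2."""
    # Start from the unit vector (1, 0), pre-divided by the gain.
    x, y, z = 1.0 / _gain(iterations), 0.0, theta
    for i in range(iterations):
        d = 1.0 if z >= 0.0 else -1.0        # turn towards the remaining angle
        x, y = x - d * y * 2.0 ** -i, y + d * x * 2.0 ** -i
        z -= d * math.atan(2.0 ** -i)        # angle still to be covered
    return x, y

def cordic_vector(x, y, iterations=24):
    """Vectoring mode: return (modulus, phase in radians) of (x, y), for x > 0."""
    z = 0.0
    for i in range(iterations):
        d = -1.0 if y >= 0.0 else 1.0        # turn the vector towards the x-axis
        x, y = x - d * y * 2.0 ** -i, y + d * x * 2.0 ** -i
        z -= d * math.atan(2.0 ** -i)        # cumulative sum of the rotation angles
    return x / _gain(iterations), z
```

Calling cordic_rotate(math.pi / 6) should return values close to the cosine and sine of 30 degrees, and cordic_vector(3.0, 4.0) should return a modulus close to 5.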
In that inverse case, the cumulative sum of the rotation angles gives the angle of the original vector. The CORDIC algorithm can also be used for calculating hyperbolic functions (sinh, cosh, atanh) by replacing the successive circular rotations by steps along a hyperbola.

This slide lists the 10 supported mathematical functions. The first step when using the coprocessor is to select the required function by programming the FUNC field of the CORDIC_CSR register accordingly. Consequently, only one function is active at a time. Several functions take two input arguments, arg1 and arg2, and some generate two results simultaneously, res1 and res2. This is a side effect of the CORDIC algorithm and means that only one operation is needed to obtain two values. This is the case, for example, when performing polar-to-rectangular conversion: sin θ also generates cos θ, while cos θ also generates sin θ. The same applies to rectangular-to-polar conversion, with phase(x, y) and modulus(x, y), and to the hyperbolic functions, with cosh θ and sinh θ.

In Q1.31 format, numbers are represented by one sign bit and 31 fractional bits, or binary places. The numeric range is therefore -1 (0x80000000) to 1 - 2^-31 (0x7FFFFFFF). The precision is 2^-31, around 5 × 10^-10. In Q1.15 format, the numeric range is -1 (0x8000) to 1 - 2^-15 (0x7FFF). This format has the advantage that two input arguments can be packed into a single 32-bit write, and two results can be fetched in one 32-bit read. However, the precision is reduced to 2^-15, around 3 × 10^-5. Angles are expressed in radians divided by π. Consequently, only the interval [-1, +1] is used.

Several of the functions specify a scaling factor, SCALE.
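The Q1.31 and Q1.15 formats described above can be sketched as a few host-side conversion helpers. The function names are hypothetical, and the packing order (arg1 in the lower half-word, arg2 in the upper half-word) is an assumption that should be checked against the reference manual.

```python
def to_q31(value):
    """Convert a float to a 32-bit Q1.31 word, saturating at the range limits."""
    raw = int(round(value * 2 ** 31))
    raw = max(-(2 ** 31), min(2 ** 31 - 1, raw))
    return raw & 0xFFFFFFFF

def from_q31(word):
    """Convert a 32-bit Q1.31 word back to a float."""
    raw = word - 2 ** 32 if word & 0x80000000 else word
    return raw / 2 ** 31

def to_q15(value):
    """Convert a float to a 16-bit Q1.15 half-word, saturating at the range limits."""
    raw = int(round(value * 2 ** 15))
    raw = max(-(2 ** 15), min(2 ** 15 - 1, raw))
    return raw & 0xFFFF

def pack_q15(arg1, arg2):
    """Pack two Q1.15 arguments into one 32-bit write.

    Assumption: arg1 goes in bits 15:0 and arg2 in bits 31:16.
    """
    return (to_q15(arg2) << 16) | to_q15(arg1)
```

For example, an angle of π/2 radians is first divided by π, giving 0.5, and to_q31(0.5) then yields 0x40000000.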
The scaling factor allows the function input range to be extended to cover the full range of values supported by the CORDIC without saturating the input, output, or internal registers. If a scaling factor is required, it should be calculated in software and programmed into the SCALE field of the CORDIC_CSR register. The input arguments should be scaled accordingly before programming the scaled values in the CORDIC_WDATA register. The scaling should also be undone on the results read from the CORDIC_RDATA register. Note that the scaling factor entails a loss of precision due to truncation of the scaled value.

The precision of the result depends on the number of CORDIC iterations. The algorithm converges at a constant rate of one binary digit per iteration for trigonometric functions. For hyperbolic functions (hyperbolic sine, hyperbolic cosine, natural logarithm), the convergence rate is less constant due to the peculiarities of the CORDIC algorithm. The square root function converges at roughly twice the speed of the hyperbolic functions.

The format of arguments and results is independently programmed in the ARGSIZE and RESSIZE fields of the CORDIC_CSR register, as either Q1.15 or Q1.31. Internally, the CORDIC accelerator implements the Q1.23 format. This means that rounding errors start to become significant at a precision of 2^-19. Continuing CORDIC iterations after the maximum precision has been reached will gradually degrade the precision. For maximum precision, Q1.31 format should be used for input and output. However, given the format implemented internally, the output is limited to 20-bit precision at best. If Q1.15 format is used for input, the precision will be limited to Q1.15, whatever the output format.

The precision obtained depends on the number of iterations, which has to be programmed in the PRECISION field of the CORDIC_CSR register. The number of iterations is equal to the value programmed in this field multiplied by 4.
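As an illustration of how these fields combine, here is a small sketch that builds a CORDIC_CSR value on the host. The bit positions and function codes used below are assumptions written from memory of the register map, not something stated in this presentation, so they should be verified against the reference manual. The helper names are hypothetical.

```python
# Assumed function codes for the FUNC field (to be checked in the reference manual).
FUNC_CODES = {
    "cos": 0, "sin": 1, "phase": 2, "modulus": 3, "atan": 4,
    "cosh": 5, "sinh": 6, "atanh": 7, "ln": 8, "sqrt": 9,
}

def csr_word(func, precision, scale=0, q15_args=False, q15_results=False):
    """Build a CORDIC_CSR value from the fields discussed above.

    Assumed layout: FUNC bits 3:0, PRECISION bits 7:4, SCALE bits 10:8,
    RESSIZE bit 21, ARGSIZE bit 22 (1 selects Q1.15, 0 selects Q1.31).
    """
    if not 1 <= precision <= 15:
        raise ValueError("PRECISION must be in 1..15")
    if not 0 <= scale <= 7:
        raise ValueError("SCALE must be in 0..7")
    word = FUNC_CODES[func]
    word |= precision << 4
    word |= scale << 8
    word |= int(q15_results) << 21
    word |= int(q15_args) << 22
    return word

def iterations(precision):
    """Number of iterations is the PRECISION field value multiplied by 4."""
    return 4 * precision

def precision_for_bits(bits):
    """Smallest PRECISION value giving at least `bits` binary digits,
    assuming one digit per iteration (trigonometric functions only)."""
    return -(-bits // 4)  # ceiling division
```

With one binary digit per iteration, the 20-bit limit mentioned above corresponds to precision_for_bits(20), that is a PRECISION value of 5 and 20 iterations.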
For maximum speed, the minimum number of iterations for the required precision should be programmed. Note that for most functions, the recommended range for this field is 3 to 6.

This slide describes the features of the cosine function. The primary argument is the angle θ in radians. It must be divided by π before programming arg1. The secondary argument, m, is the modulus. If m is greater than 1, a scaling must be applied in software to adapt it to the Q1.31 range of arg2. The primary result, res1, is the cosine of the angle multiplied by the modulus. The secondary result, res2, is the sine of the angle multiplied by the modulus.

This slide describes the features of the sine function. The primary argument is the angle θ in radians. It must be divided by π before programming arg1. The secondary argument, m, is the modulus. If m is greater than 1, a scaling must be applied in software to adapt it to the Q1.31 range of arg2. The primary result, res1, is the sine of the angle multiplied by the modulus. The secondary result, res2, is the cosine of the angle multiplied by the modulus.

This slide describes the features of the phase function. The primary argument is the x-coordinate, that is, the magnitude of the vector in the direction of the x-axis. If |x| is greater than 1, a scaling must be applied in software to adapt it to the Q1.31 range of arg1. The secondary argument is the y-coordinate, that is, the magnitude of the vector in the direction of the y-axis. If |y| is greater than 1, a scaling must be applied in software to adapt it to the Q1.31 range of arg2. The primary result, res1, is the phase angle θ of the vector v. Res1 must be multiplied by π to obtain the angle in radians. Note that values close to π may sometimes wrap to -π due to the circular nature of the phase angle. The secondary result, res2, is the modulus, given by |v| = sqrt(x^2 + y^2).
If the modulus |v| is greater than 1, the result in res2 will be saturated to 1.

This slide describes the features of the modulus function. The primary argument is the x-coordinate, that is, the magnitude of the vector in the direction of the x-axis. If |x| is greater than 1, a scaling must be applied in software to adapt it to the Q1.31 range of arg1. The secondary argument is the y-coordinate, that is, the magnitude of the vector in the direction of the y-axis. If |y| is greater than 1, a scaling must be applied in software to adapt it to the Q1.31 range of arg2. The primary result, res1, is the modulus, given by |v| = sqrt(x^2 + y^2). If |v| is greater than 1, the result in res1 will be saturated to 1. The secondary result, res2, is the phase angle θ of the vector v. Res2 must be multiplied by π to obtain the angle in radians. Note that values close to π may sometimes wrap to -π due to the circular nature of the phase angle.

This slide describes the features of the arctangent function. The primary argument, arg1, is the input value x = tan θ. If |x| is greater than 1, a scaling factor of 2^-n must be applied in software such that -1 < x × 2^-n < 1. The scaled value x × 2^-n is programmed in arg1 and the scale factor n must be programmed in the SCALE parameter. Note that the maximum input value allowed is tan θ = 128, which corresponds to an angle θ = 89.55 degrees. For |x| greater than 128, a software method must be used to find arctan x. The secondary argument, arg2, is unused. The primary result, res1, is the angle θ = arctan x. Res1 must be multiplied by 2^n × π to obtain the angle in radians. The secondary result, res2, is unused.

This slide describes the features of the hyperbolic cosine function. The primary argument is the hyperbolic angle x.
Only values of x in the range -1.118 to +1.118 are supported. Since the minimum value of cosh x is 1, which is beyond the range of the Q1.31 format, a scaling factor of 2^-n must be applied in software. The factor n = 1 must be programmed in the SCALE parameter. The secondary argument, arg2, is unused. The primary result, res1, is the hyperbolic cosine, cosh x. Res1 must be multiplied by 2 to obtain the correct result. The secondary result, res2, is the hyperbolic sine, sinh x. Res2 must be multiplied by 2 to obtain the correct result.

This slide describes the features of the hyperbolic sine function. The primary argument is the hyperbolic angle x. Only values of x in the range -1.118 to +1.118 are supported. As for the hyperbolic cosine, a scaling factor of 2^-n must be applied in software, and the factor n = 1 must be programmed in the SCALE parameter. The secondary argument, arg2, is unused. The primary result, res1, is the hyperbolic sine, sinh x. Res1 must be multiplied by 2 to obtain the correct result. The secondary result, res2, is the hyperbolic cosine, cosh x. Res2 must be multiplied by 2 to obtain the correct result.

This slide describes the features of the hyperbolic arctangent function. The primary argument is the input value x. Only values of x in the range -0.806 to +0.806 are supported. The value x must be scaled by a factor 2^-n, where n = 1. The scaled value x × 0.5 is programmed in arg1, and the factor n = 1 must be programmed in the SCALE parameter. The secondary argument, arg2, is unused. The primary result, res1, is the hyperbolic arctangent, atanh x. Res1 must be multiplied by 2 to obtain the correct value. The secondary result is not used.

This slide describes the features of the natural logarithm function. The primary argument is the input value x.
Only values of x in the range 0.107 to 9.35 are supported. The value x must be scaled by a factor 2^-n, such that x × 2^-n is less than 1 - 2^-n. The scaled value x × 2^-n is programmed in arg1, and the factor n must be programmed in the SCALE parameter. The secondary argument is unused. The primary result, res1, is the natural logarithm, ln x. Res1 must be multiplied by 2^(n+1) to obtain the correct value. The secondary result is not used.

This slide describes the features of the square root function. The primary argument is the input value x. Only values of x in the range 0.027 to 2.34 are supported. The value x must be scaled by a factor 2^-n, such that x × 2^-n is less than 1 - 2^(-n-2). The scaled value x × 2^-n is programmed in arg1, and the factor n must be programmed in the SCALE parameter. The secondary argument is unused. The primary result, res1, is the square root of x. Res1 must be multiplied by 2^n to obtain the correct value. The secondary result is not used.

The software that subcontracts a calculation to the CORDIC block does not need to poll a flag to determine when this calculation is completed. It simply initiates a read request of the CORDIC_RDATA register through the AHB bus. As with any AHB transaction, the slave is permitted to insert wait states by holding the HREADY signal low. Once the results are available, the CORDIC block asserts HREADY, which completes the transaction. In the meantime, the Cortex-M processor is stalled. This approach is called zero-overhead mode. As soon as the results have been read from CORDIC_RDATA, with one or two reads depending on the value of NRES, the pending operation is started. A new set of arguments and settings can be written, as long as there is no operation pending.
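The ordering of writes and reads in this pipelined, zero-overhead use can be sketched with a small host-side mock. Everything here is hypothetical: MockCordic is not a real driver, it computes the sine in floating point, and it only models the rule that one operation can be running while at most one other is pending.

```python
import math
from collections import deque

class MockCordic:
    """Hypothetical stand-in for the peripheral: one running and one pending operation."""

    def __init__(self):
        self._results = deque()   # results of the running and pending operations

    def write_arg(self, angle_over_pi):
        # A new argument is only accepted if no operation is already pending.
        if len(self._results) >= 2:
            raise RuntimeError("an operation is already pending")
        self._results.append(math.sin(angle_over_pi * math.pi))

    def read_result(self):
        # On the real block this read would stall until the result is ready;
        # reading it also lets the pending operation start.
        return self._results.popleft()

def sine_array(cordic, angles_over_pi):
    """Compute the sine of each angle, overlapping writes with the running operation."""
    results = []
    if not angles_over_pi:
        return results
    cordic.write_arg(angles_over_pi[0])          # start the first operation
    for nxt in angles_over_pi[1:]:
        cordic.write_arg(nxt)                    # queue the next arguments
        results.append(cordic.read_result())     # fetch the current result
    results.append(cordic.read_result())         # final read for the last operation
    return results
```

The loop writes the next argument before reading the current result, and one extra read after the loop collects the last result.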
Pipelining in this way means that the time spent waiting for a CORDIC operation to complete can be used to prepare the next operation, and the CORDIC is never idle. The CORDIC_CSR register can be reprogrammed while a calculation is in progress without affecting the result of the ongoing calculation.

The sequence described in this slide summarizes the use of the CORDIC IP in zero-overhead mode, assuming a single-shot operation. No further calculation is scheduled, so the processor simply waits for the completion of the current operation.

The sequence described in this slide summarizes the use of the CORDIC IP in zero-overhead mode, assuming pipelined operations. By iterating steps three to six, software can re-execute the same operation for an array of arguments. The seventh step is required to obtain the result of the last operation.

The sequence described in this slide summarizes the use of the CORDIC IP in polling mode. When a new result is available in the CORDIC_RDATA register, the RRDY flag is set in the CORDIC_CSR register. The flag can be polled by reading this register. It is reset by reading the CORDIC_RDATA register once or twice, depending on the NRES field of the CORDIC_CSR register. Polling the RRDY flag takes slightly longer than reading the CORDIC_RDATA register directly, since the result is not read as soon as it is available. However, the processor and bus interface are not stalled while reading the CORDIC_CSR register, so this mode may be of interest if stalling the processor is not acceptable, for instance when low-latency interrupts must be serviced.

The sequence described in this slide summarizes the use of the CORDIC IP in interrupt mode. By setting the interrupt enable, or IEN, bit in the CORDIC_CSR register, an interrupt will be generated whenever the RRDY flag is set. The interrupt is cleared when the flag is reset.
Interrupt mode allows the result of the calculation to be read in an interrupt service routine, and hence given a priority relative to other tasks. However, it is slower than directly reading the result or polling the flag, due to the interrupt handling delays.

DMA mode is very efficient when performing multiple calculations using the same settings. It is not possible to modify the CORDIC_CSR register by DMA. Consequently, if the settings need to be changed, the DMA should be stopped first and restarted once the new settings have been programmed. DMA writes can be combined with DMA, polling, or interrupt read methods. Pipelining is always used in DMA mode. DMA write requests are enabled by setting the DMAWEN bit in the CORDIC_CSR register. DMA read requests are enabled by setting the DMAREN bit in the CORDIC_CSR register.

The CORDIC unit is active in run, low-power run, sleep, and low-power sleep modes. DMA is not available in the other low-power modes.

These peripherals may need to be specifically configured for correct use with the CORDIC block. Please refer to the corresponding peripheral training modules for more information.
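As a worked illustration of the software scaling described earlier for the arctangent, natural logarithm and square root functions, the following sketch picks the scale factor n and undoes the scaling on the result. The helper names are hypothetical. The inequalities are taken from the descriptions above, reading the square root limit as 1 - 2^(-n-2). Starting the search for the logarithm at n = 1 is an assumption.

```python
import math

def atan_scale(x):
    """Smallest n such that |x| * 2^-n < 1 (inputs of 128 or more are rejected)."""
    if abs(x) >= 128.0:
        raise ValueError("use a software method for |x| >= 128")
    n = 0
    while abs(x) * 2.0 ** -n >= 1.0:
        n += 1
    return n

def ln_scale(x):
    """Smallest n >= 1 such that x * 2^-n < 1 - 2^-n."""
    if not 0.107 <= x <= 9.35:
        raise ValueError("x outside the supported range")
    n = 1
    while x * 2.0 ** -n >= 1.0 - 2.0 ** -n:
        n += 1
    return n

def sqrt_scale(x):
    """Smallest n such that x * 2^-n < 1 - 2^(-n-2)."""
    if not 0.027 <= x <= 2.34:
        raise ValueError("x outside the supported range")
    n = 0
    while x * 2.0 ** -n >= 1.0 - 2.0 ** (-n - 2):
        n += 1
    return n

def undo_atan(res1, n):
    """res1 multiplied by 2^n * pi gives the angle in radians."""
    return res1 * 2.0 ** n * math.pi

def undo_ln(res1, n):
    """res1 multiplied by 2^(n+1) gives ln x."""
    return res1 * 2.0 ** (n + 1)

def undo_sqrt(res1, n):
    """res1 multiplied by 2^n gives the square root, as stated above."""
    return res1 * 2.0 ** n
```

For example, for ln 5 the search gives n = 3, so 5 × 2^-3 = 0.625 is the value to convert to Q1.31 and program in arg1, and the result read back is multiplied by 2^4.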