Andy David says:
Here's my 32-bit routine as written for the 17C43, taken from a mail I sent Scott just after I wrote it, hence the comments about the implementations I used. It looks a lot like Scott's original 16-bit sqrt. As the root is going to be a 16-bit number the last subtract is awkward, so the 24-bit sqrt method wasn't appropriate. I did actually write this myself rather than automatically converting Scott's code to 32 bit. I DID, however, consciously and unashamedly steal two parts: how to carry out the final 17-bit subtraction, and how to count iterations. The extra 'counting' bit in the mask was quite a devious idea. This one took a little longer to write than the 24-bit, probably because it needs to iterate more times...
Standard disclaimer applies
;=========================================================================
; brSQRT32
;
; Calculates the square root of a thirtytwo bit number using the
; binary restoring method.
;
; Result in ACCaHI:ACCaLO
; Mask in ACCbHI:ACCbLO
; Input in ACCcHI:ACCcLO:ACCdHI:ACCdLO
;
; Takes between 392 and 439 cycles (incl. call and return).
; Uses 58 words ROM, 8 bytes RAM including 4 holding the input.
;
;-------------------------------------------------------------------------
brSQRT32:
        movlw   0x40            ; Initial value for Result is...
        movwf   ACCaHI          ; ... 01000000 00000000
        clrf    ACCaLO,f        ;
        movlw   0xC0            ; Initial value for mask is...
        movwf   ACCbHI          ; ... 11000000 00000000
        clrf    ACCbLO,f        ; (second '1' is loop counter).
Sub_Cmp:
        movfp   ACCaLO,WREG     ; Compare root-so-far with current
        subwf   ACCcLO,f        ; ... remainder.
        movfp   ACCaHI,WREG     ;
        subwfb  ACCcHI,f        ;
        btfss   ALUSTA,C        ;
        goto    brstr           ; (result is -ve, need to restore).
In1:
        movfp   ACCbLO,WREG     ; Set the current bit in the result.
        iorwf   ACCaLO,f        ;
        movfp   ACCbHI,WREG     ;
        iorwf   ACCaHI,f        ;
ShftUp:
        rlcf    ACCdLO,f        ;
        rlcf    ACCdHI,f        ;
        rlcf    ACCcLO,f        ;
        rlcf    ACCcHI,f        ;
        rrcf    ACCbHI,f        ; Shift mask right for next bit, whilst
        rrcf    ACCbLO,f        ; ... shifting IN MSB from remainder.
        btfsc   ACCbHI,7        ; If MSB is set, unconditionally set the
        goto    USet1           ; ... next bit.
        movfp   ACCbLO,WREG     ; Append '01' to root-so-far
        xorwf   ACCaLO,f        ;
        movfp   ACCbHI,WREG     ;
        xorwf   ACCaHI,f        ;
        btfss   ALUSTA,C        ; If second '1' in mask is shifted out,
        goto    Sub_Cmp         ; ... then that was the last normal iteration.
        movfp   ACCaLO,WREG     ; Last bit generation.
        subwf   ACCcLO,f        ; ... The final subtract is 17-bit (15-bit root
        movfp   ACCaHI,WREG     ; ... plus '01'). Subtract 16-bits: if result
        subwfb  ACCcHI,f        ; ... generates a borrow, last bit is 0.
        btfss   ALUSTA,C        ;
        return
;Peter Harrison says these next two lines are required to set Z correctly
        movfp   ACCcLO,WREG     ; Clear zero flag if remainder non zero.
        iorwf   ACCcHI,WREG     ; OR of both bytes: Z is set only if all 16 bits are 0.
;End section added by Peter Harrison.
        movlw   1               ; If result is 0 AND msb of ACCdHI is '0', result bit
        btfsc   ALUSTA,Z        ; ... is 0, otherwise '1'.
        btfsc   ACCdHI,7        ;
        xorwf   ACCaLO,f        ;
        return
USet1:
        btfsc   ALUSTA,C        ; If mask has shifted out, leave. Final bit
        return                  ; ... is already set in the result.
        bcf     ACCbHI,7        ; Clear bit shifted in from input.
        movfp   ACCbLO,WREG     ; Append '01' to root-so-far
        xorwf   ACCaLO,f        ;
        movfp   ACCbHI,WREG     ;
        xorwf   ACCaHI,f        ;
        movfp   ACCaLO,WREG     ; This subtraction is guaranteed not to
        subwf   ACCcLO,f        ; ... cause a borrow, so subtract and
        movfp   ACCaHI,WREG     ; ... jump back to insert a '1' in the
        subwfb  ACCcHI,f        ; ... root.
        goto    In1             ;
brstr:
        movfp   ACCaLO,WREG     ; A subtract above at Sub_Cmp was -ve, so
        addwf   ACCcLO,f        ; ... restore the remainder by adding.
        movfp   ACCaHI,WREG     ; The current bit of the root is zero.
        addwfc  ACCcHI,f        ;
        goto    ShftUp          ;
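For checking results on a PC, the binary restoring method named in the header can be sketched in a few lines of Python. This is a textbook form of the method (two input bits brought down per step), not a transcription of the PIC code; the function name `isqrt_restoring` and its `bits` parameter are made up for this illustration.

```python
def isqrt_restoring(n, bits=32):
    """Integer square root by the binary restoring method (illustrative sketch)."""
    root = 0   # root-so-far, one new bit per step
    rem = 0    # running remainder
    for shift in range(bits - 2, -1, -2):
        rem = (rem << 2) | ((n >> shift) & 0b11)  # bring down the next two input bits
        trial = (root << 2) | 1                   # root-so-far with '01' appended
        root <<= 1
        if rem >= trial:                          # the subtraction would not borrow
            rem -= trial
            root |= 1                             # current root bit is 1
        # otherwise the remainder stays as it was (the 'restore' case), bit is 0
    return root
```

A 32-bit input needs 16 such steps, which is why the 16-bit root makes the last trial value one bit wider than the registers holding it.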
Comments:
32 bit Square Root binary restoring routine.
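This routine has an error in the last bit generation: the Z bit is incorrectly set after the final subtraction, because after the two-byte subtract it reflects only the high byte of the remainder rather than the whole 16-bit result.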
I have a routine for the PIC18F with this error corrected - two extra lines of code inserted before the decision to set the final bit. It now works correctly on all the numbers tested so far (several thousand!!)
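The flag problem can be illustrated with a small Python model of a two-byte subtract. It is only a sketch of the flag behaviour, assuming that Z after the second (high-byte) subtract describes that byte alone; the names `sub16_bytewise` and `remainder_is_zero` are hypothetical.

```python
def sub16_bytewise(rem, root):
    """16-bit subtract done as two 8-bit steps, like a subwf/subwfb pair.

    Returns (result, carry, z_last_byte); carry = 1 means no borrow.
    """
    lo = (rem & 0xFF) - (root & 0xFF)
    borrow = 1 if lo < 0 else 0
    hi = (rem >> 8) - (root >> 8) - borrow
    carry = 0 if hi < 0 else 1
    result = ((hi & 0xFF) << 8) | (lo & 0xFF)
    z_last_byte = (hi & 0xFF) == 0      # Z as left by the high-byte subtract only
    return result, carry, z_last_byte


def remainder_is_zero(result):
    """Zero test over both bytes: the OR is 0 only when every bit is 0."""
    return ((result & 0xFF) | (result >> 8)) == 0


# 0x0005 - 0x0003 = 0x0002: the high byte is zero, the remainder is not.
result, carry, z = sub16_bytewise(0x0005, 0x0003)
```

Here `z` comes out true although the remainder is 2, while `remainder_is_zero(result)` is false, which is the distinction the final-bit decision depends on.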
{ed: here is Peter's complete PIC18F routine. His corrections have also been added to the routine above. Thank you Peter!}
;========================================================================================
; SQRT32
;
; Calculates the square root of a thirtytwo bit number using the
; binary restoring method.
;
; Input:  acc2_4:acc2_1
; Mask:   acc3_2:acc3_1
; Result: acc1_2:acc1_1
;
; Takes between 395 and 457 cycles (incl. call and return).
; Uses 64 words ROM, 8 bytes RAM including 4 holding the input.
;========================================================================================
;
sqrt32
        movlw   0x40            ; Initial value for Result is...
        movwf   acc1_2          ; ... 01000000 00000000
        clrf    acc1_1          ;
;
        movlw   0xC0            ; Initial value for mask is...
        movwf   acc3_2          ; ... 11000000 00000000
        clrf    acc3_1          ; (second '1' is loop counter).
;
compare
        movf    acc1_1,w        ; Compare root-so-far with current
        subwf   acc2_3,f        ; ... remainder.
        movf    acc1_2,w        ;
        subwfb  acc2_4,f        ;
        bc      $+4             ;
        goto    restore         ; result is -ve, so need to restore
;
setcurr
        movf    acc3_1,w        ; Set the current bit in the result.
        iorwf   acc1_1,f        ;
        movf    acc3_2,w        ;
        iorwf   acc1_2,f        ;
;
shftUp
        rlcf    acc2_1,f        ;
        rlcf    acc2_2,f        ;
        rlcf    acc2_3,f        ;
        rlcf    acc2_4,f        ;
;
        rrcf    acc3_2,f        ; Shift mask right for next bit, whilst
        rrcf    acc3_1,f        ; ... shifting IN MSB from remainder.
        btfsc   acc3_2,7        ; If MSB is set, unconditionally set the
        goto    setnext         ; ... next bit.
;
        movf    acc3_1,w        ; Append '01' to root-so-far.
        xorwf   acc1_1,f        ;
        movf    acc3_2,w        ;
        xorwf   acc1_2,f        ;
;
        bc      $+4             ; If second '1' in mask is shifted out,
        goto    compare         ; ... then that was the last normal iteration.
;
        movf    acc1_1,w        ; Last bit generation.
        subwf   acc2_3,f        ; ... The final subtract is 17-bit (15-bit root
        movf    acc1_2,w        ; ... plus '01'). Subtract 16-bits: if result
        subwfb  acc2_4,f        ; ... generates a borrow, last bit is 0.
        bc      $+4             ;
        return
;
        movf    acc2_3,w        ; Clear zero flag if remainder non zero.
        iorwf   acc2_4,w        ; OR of both bytes: Z is set only if all 16 bits are 0.
;
        movlw   1               ;
        bnz     $+4             ; If result is 0 AND msb of acc2_2 is '0', result bit
        btfsc   acc2_2,7        ; ... is 0, otherwise '1'.
        xorwf   acc1_1,f        ;
        return
;
;
setnext
        bnc     $+4             ; If mask has shifted out, leave. Final bit
        return                  ; ... is already set in the result.
        bcf     acc3_2,7        ; Clear bit shifted in from input.
;
        movf    acc3_1,w        ; Append '01' to root-so-far
        xorwf   acc1_1,f        ;
        movf    acc3_2,w        ;
        xorwf   acc1_2,f        ;
;
        movf    acc1_1,w        ; This subtraction is guaranteed not to
        subwf   acc2_3,f        ; ... cause a borrow, so subtract and
        movf    acc1_2,w        ; ... jump back to insert a '1' in the
        subwfb  acc2_4,f        ; ... root.
        goto    setcurr         ;
;
restore
        movf    acc1_1,w        ; A subtract above at compare was -ve, so
        addwf   acc2_3,f        ; ... restore the remainder by adding.
        movf    acc1_2,w        ; The current bit of the root is zero.
        addwfc  acc2_4,f        ;
        goto    shftUp          ;
;
;========================================================================================
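To try the control flow on a PC before burning a part, the routine can be followed step by step in a word-level Python model. This is a sketch under stated assumptions, not a cycle-accurate simulation: registers are modelled as 16-bit words `a` (result), `b` (mask), `c` (remainder) and `d` (low input word), the carry flag as `carry`, and the final zero test is taken over the whole 16-bit remainder. The function name `sqrt32_model` is hypothetical.

```python
def sqrt32_model(n):
    """Word-level model of the 32-bit binary restoring square root routine."""
    a = 0x4000                  # result: root-so-far, starts as '01' at the top
    b = 0xC000                  # mask: the second '1' doubles as the loop counter
    c = (n >> 16) & 0xFFFF      # remainder (high word of the input)
    d = n & 0xFFFF              # low word of the input, shifted up into c
    while True:
        # Compare: trial subtract; on borrow the remainder is restored unchanged.
        set_bit = c >= a
        if set_bit:
            c -= a
        carry = 1               # C is set after a good subtract and after a restore
        while True:
            if set_bit:         # set the current bit in the result
                a |= b
            # Rotate c:d left through carry, then rotate the mask right, taking
            # in the bit that fell out of the top of the remainder.
            wide = (((c << 16) | d) << 1) | carry
            overflow = (wide >> 32) & 1
            c, d = (wide >> 16) & 0xFFFF, wide & 0xFFFF
            mask_out = b & 1
            b = (b >> 1) | (overflow << 15)
            if not overflow:
                break
            # Remainder needs 17 bits, so the next root bit is unconditionally 1.
            if mask_out:
                return a        # counter bit shifted out; bit 0 of a is already 1
            b &= 0x7FFF         # clear the bit shifted in from the remainder
            a ^= b              # append '01' to root-so-far
            carry = 1 if c >= a else 0
            c = (c - a) & 0xFFFF  # the hidden 17th bit absorbs any borrow
            set_bit = True
        a ^= b                  # append '01' to root-so-far
        if mask_out:
            break               # that was the last normal iteration
    # Last bit: compare the 17-bit remainder (c, then bit 15 of d) with a then '1'.
    if c < a:
        return a                # borrow: last bit stays 0
    c -= a
    if c != 0 or (d & 0x8000):  # zero test covers the whole 16-bit remainder
        a ^= 1
    return a
```

For example, `sqrt32_model(66562)` returns 257. The final remainder in that case is 0x0101, a nonzero value whose two bytes are equal, which makes it a useful input for exercising the zero check.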