Andy David says:
Here's my 32-bit routine as written for the 17c43, taken from a mail I sent Scott just after I wrote it, hence the comments about the implementations I used. It looks a lot like Scott's original 16-bit sqrt. As the root is going to be a 16-bit number, the last subtract is awkward, so the 24-bit sqrt method wasn't appropriate. I did actually write this myself rather than automatically converting Scott's code to 32-bit. I DID, however, consciously and unashamedly steal two parts: how to carry out the final 17-bit subtraction and how to count iterations. The extra 'counting' bit in the mask was quite a devious idea. This one took a little longer to write than the 24-bit, probably because it needs to iterate more times...
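As a rough illustration of the 'counting' bit, the sketch below models only the result and mask registers in Python. It is not part of the original routine: `walk_mask` and `decisions` are hypothetical names, the subtract/restore outcomes are supplied by the caller rather than computed, and the remainder bit that the real code shifts into the top of the mask is ignored.

```python
def walk_mask(decisions):
    """Trace the 16-bit result register for a list of root-bit decisions.

    decisions: values of 0 or 1, most significant root bit first
               (hypothetical input; the real routine derives each bit
               from a subtract on the remainder).
    Returns (result, iterations) once the counting bit leaves the mask.
    """
    result = 0x4000  # '01' in the top two bits
    mask = 0xC000    # two adjacent ones; the lower one also counts iterations
    iterations = 0
    for bit in decisions:
        iterations += 1
        if bit:
            result |= mask   # keep a '1' at the current root position
        carry = mask & 1     # bit about to fall out of the mask
        mask >>= 1
        result ^= mask       # clear the old trial '1', set the next one lower
        if carry:
            break            # counting bit shifted out: last normal iteration
    return result, iterations


if __name__ == "__main__":
    result, n = walk_mask([1, 0])
    print(f"{result:016b} after {n} iterations")
```

With the initial mask of 11000000 00000000, the lower '1' falls out on the 15th shift, so no separate loop counter is needed; the 16th root bit is then handled by the special final subtract.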
Standard disclaimer applies
;=========================================================================
; brSQRT32
;
; Calculates the square root of a thirty-two bit number using the
; binary restoring method.
;
; Result in ACCaHI:ACCaLO
; Mask in ACCbHI:ACCbLO
; Input in ACCcHI:ACCcLO:ACCdHI:ACCdLO
;
; Takes between 392 and 439 cycles (incl. call and return).
; Uses 58 words ROM, 8 bytes RAM including 4 holding the input.
;
;-------------------------------------------------------------------------

brSQRT32:
        mov W, #$40             ; Initial value for Result is...
        mov ACCaHI, W           ; ... 01000000 00000000
        clr ACCaLO              ;

        mov W, #$C0             ; Initial value for mask is...
        mov ACCbHI, W           ; ... 11000000 00000000
        clr ACCbLO              ; (second '1' is loop counter).

Sub_Cmp:
        movfp ACCaLO,WREG       ; Compare root-so-far with current
        sub ACCcLO, W           ; ... remainder.
        movfp ACCaHI,WREG       ;
        subwfb ACCcHI,f         ;
        sb ALUSTA.C             ;
        jmp brstr               ; (result is -ve, need to restore).

In1:
        movfp ACCbLO,WREG       ; Set the current bit in the result.
        or ACCaLO, W            ;
        movfp ACCbHI,WREG       ;
        or ACCaHI, W            ;

ShftUp:
        rlcf ACCdLO,f           ;
        rlcf ACCdHI,f           ;
        rlcf ACCcLO,f           ;
        rlcf ACCcHI,f           ;

        rrcf ACCbHI,f           ; Shift mask right for next bit, whilst
        rrcf ACCbLO,f           ; ... shifting IN MSB from remainder.

        snb ACCbHI.7            ; If MSB is set, unconditionally set the
        jmp USet1               ; ... next bit.

        movfp ACCbLO,WREG       ; Append '01' to root-so-far
        xor ACCaLO, W           ;
        movfp ACCbHI,WREG       ;
        xor ACCaHI, W           ;

        sb ALUSTA.C             ; If second '1' in mask is shifted out,
        jmp Sub_Cmp             ; ... then that was the last normal iteration.

        movfp ACCaLO,WREG       ; Last bit generation.
        sub ACCcLO, W           ; ... The final subtract is 17-bit (15-bit root
        movfp ACCaHI,WREG       ; ... plus '01'). Subtract 16-bits: if result
        subwfb ACCcHI,f         ; ... generates a carry, last bit is 0.
        sb ALUSTA.C             ;
        ret

        mov W, #1               ; If result is 0 AND msb of ACCdHI is '0', result bit
        snb ALUSTA.Z            ; ... is 0, otherwise '1'.
        snb ACCdHI.7            ;
        xor ACCaLO, W           ;
        ret

USet1:
        snb ALUSTA.C            ; If mask has shifted out, leave. Final bit
        ret                     ; ... has been set by iorwf at In1.
        clrb ACCbHI.7           ; Clear bit shifted in from input.

        movfp ACCbLO,WREG       ; Append '01' to root-so-far
        xor ACCaLO, W           ;
        movfp ACCbHI,WREG       ;
        xor ACCaHI, W           ;

        movfp ACCaLO,WREG       ; This subtraction is guaranteed not to
        sub ACCcLO, W           ; ... cause a borrow, so subtract and
        movfp ACCaHI,WREG       ; ... jump back to insert a '1' in the
        subwfb ACCcHI,f         ; ... root.
        jmp In1                 ;

brstr:
        movfp ACCaLO,WREG       ; A subtract above at Sub_Cmp was -ve, so
        add ACCcLO, W           ; ... restore the remainder by adding.
        movfp ACCaHI,WREG       ; The current bit of the root is zero.
        addwfc ACCcHI,f         ;
        jmp ShftUp              ;
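
For checking results off-target, a plain Python model of the binary restoring method may help. This is a sketch, not a translation of the routine above: `isqrt32` is a hypothetical name, it shifts the remainder two bits per step instead of moving a mask, and it needs no special case for the last bit because Python integers do not overflow.

```python
def isqrt32(n):
    """Return floor(sqrt(n)) for 0 <= n < 2**32, one root bit per pass."""
    if not 0 <= n < 1 << 32:
        raise ValueError("n must fit in 32 bits")
    rem = 0    # running remainder
    root = 0   # root built so far, most significant bit first
    for _ in range(16):
        # Bring down the next two input bits into the remainder.
        rem = (rem << 2) | (n >> 30)
        n = (n << 2) & 0xFFFFFFFF
        # Trial value: root-so-far with '01' appended.
        trial = (root << 2) | 1
        root <<= 1
        if rem >= trial:
            rem -= trial   # subtract succeeds: this root bit is 1
            root |= 1
        # Otherwise the remainder is left as it was (the "restore"),
        # and this root bit stays 0.
    return root


if __name__ == "__main__":
    for value in (0, 1, 15, 16, 0xFFFE0001, 0xFFFFFFFF):
        print(hex(value), isqrt32(value))
```

Comparing `isqrt32(n)` with the value left in ACCaHI:ACCaLO for a handful of inputs, including edge cases such as 0, 0xFFFE0001 (65535 squared) and 0xFFFFFFFF, should catch most transcription slips.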