Andy Shaw wrote:
>
> Hi Folks,
> Does anyone out there have any code for a reasonably
> fast divide by 10 routine. I'm just looking for a rounded
> up 8 bit signed value as the result I don't need the
> remainder.

A little over a year ago we had this same discussion. I'll
post some of the pertinent (or at least interesting) results:

In general, when you find that you need to divide by a constant,
it is often easier to multiply by the reciprocal of that constant.
For example, to divide by some X, we can find the closest Y that
satisfies the equation:

1/X = Y/ (2^n)

Then division by X is transformed to multiplication by Y followed by a
shift of n bits. Note that this is actually converting the fraction
1/X into a binary fraction.

When X is an integer, then either Y will be an integer or will be a
repeating pattern (somewhat analogous to how 1/3 = .333333). If it is
a repeating pattern, then pray to the god of numbers that it is a
simple one. Fortunately, 1/10 is a very simple pattern:

1/10 = 0.00011001100110011001100110011001100 ..... (base 2)

This can be rewritten to emphasize the optimization:
1/10 = (3/16 + 3/256 + 3/4096 + ... + 3/(2^(4*m)) ) / 2
where m is an integer corresponding to how many terms you wish to keep.
Or more compactly:

           i->infinity
             ___
        3    \
1/10 = --- * /__ 2^(-4*i)
        2    i=1

This can be checked by using the formula for the sum of a geometric
series:

i=k-1
___            1 - r^k
\            ----------
/__ r^i  =     1 - r
i=0

In our case, r = 1/16. Taking into account the fact that our summation
begins at 1 and not zero and also that k is infinity, we get

1/10 = 3* [1/(1 - 1/16) - 1]/2 = 3 ( 16/15 - 1)/2 = 1/10


We can also express the multiplications and divisions by shifting

X / 10 = 3 * (X>>4 + X>>8 + X>>12 + X>>16 + ...) << 2
Since you are only interested in 16 bit integers divided by 10 then
all you need are the first four terms. To reduce roundoff, you will
want to re-arrange this slightly,
X / 10 ~ 3 * ((X<<12 + X<<8 + X<<4 + X) >> 17)


Then after that,

Thomas Coonan wrote:
>
> >Here's a 16-bit divide-by-10 algorithm (y = x/10):
> >
> >    y = x/4
> >
> >    for i = 1 to 7
> >        y = x - y
> >        y = y/4
> >    next i
> >
> >    y = y/2
> I understand why powers of two are good... And I sort of see how you
> divide down close to the target before you do too much shifting...
>
> So, what's your general problem-solving technique for this class of
> problem?  Are you applying some basic number theory tricks about rational
> numbers, or is this trial-n-error?

and

Andrew Warren wrote:
>
> In this case, my divide-by-10 code is just a divide-by-5 with an
> extra divide-by-2 tacked on at the end.  I came up with the original
> divide-by-5 routine by noticing the following:
>
>     x      4x        x     5x
>    --- =  ----, and --- = ----.
>     5      20        4     20
>
> Therefore,
>
>     x      x     x
>    --- =  --- - ----
>     5      4     20
>
>            x      x/4
>        =  --- - -------
>            4       5
>
>            x       (x/4)     (x/4)/4
>        =  --- - ( ------- - --------- )
>            4         4          5
>
>        etc...
>

Another way to skin Andy's cat is to return to the binary fraction of
x/5 and do some manipulation:

  x
 --- = 0.001100110011001100110011...
  5
             3     3      3             3
     = x *( -- +  --- + ---- + ... + ------ )
            16    256   4096         2^(4*m)

     = x * 3 * ( 1>>4 + 1>>8 + 1>>12 + ... + 1>>(4*m) )        (1)

Now, x * 3 can be rewritten as x<<2 - x (i.e. x*3 = x*4 - x)

  x
 --- = (x<<2 - x) * ( 1>>4 + 1>>8 + 1>>12 + ... + 1>>(4*m) )
  5

     = x * ( 1>>2 + 1>>6 + 1>>10 + ... + 1>>(4*m-2) ) -
       x * ( 1>>4 + 1>>8 + 1>>12 + ... + 1>>(4*m) )

     = x * ( 1>>2 - 1>>4 + 1>>6 - 1>>8 + 1>>10 -+  ... -+ 1>>(2*m) )

Now, assuming x is a 16 bit integer we only need to keep terms up to m=7

  x
 --- ~= x * ( 1>>2 - 1>>4 + 1>>6 - 1>>8 + 1>>10 - 1>>12 + 1>>14 )
  5

      = x * (1 - (1 - (1 - (1 - (1 - (1 - 1>>2)>>2)>>2)>>2)>>2)>>2)>>2
      =     (x - (x - (x - (x - (x - (x - x>>2)>>2)>>2)>>2)>>2)>>2)>>2

This is Andy's algorithm unrolled.


Just for grins, Pete Brink's cat can be skinned similarly:

First note that x * 3 can be rewritten as x<<1 + x. Substitute this
into equation (1) from above and you get:

  x
 --- = (x<<1 + x) * ( 1>>4 + 1>>8 + 1>>12 + ... + 1>>(4*m) )
  5
     = x * (1>>3 + 1>>4 + 1>>7 + 1>>8 + ... )
     = (x>>3 + x>>4 + x>>7 + x>>8 + ...)
     = (x + x>>1 + x>>4 + x>>5 + ...) >> 3
or,
  x
 --- = (x + x>>1 + x>>4 + x>>5 + ...) >> 4
  10

------------------------------------------------------------
And finally, for grins I wrote a simply minded loop-and-divide-by-10
routine which is short in code space but takes a "long time" (which
is just as vague as "reasonable fast") to run:

;----------------------------------
;divby10
;  Divides the unsigned integer N_hi:N_lo by the constant 10.
;
;Input
;      N_lo - Low byte of  the 16 bit dividend
;      N_hi - High     "                  "
;Output
;      R_lo - Low byte of the result
;      R_hi - High    "         "
;
; 17  words
; 152 cycles

divby10_ver3
        CLRC
        RRF     N_hi,F
        RRF     N_lo,F
        CLRF    R_lo            ;Only need to clear R_lo. R_hi is
cleared by shifting(below)
        MOVLW   13
        MOVWF   count           ;
v3_1    MOVLW   0x50            ;If the high byte is greater than or
equal to 0xa0,
        SUBWF   N_hi,W          ;then this subtraction causes no borrow
(i.e. C=1)
        SKPNC
          MOVWF N_hi            ;Replace N_hi with N_hi - 0xa0 if

        RLF     R_lo,F          ;Shift result left one bit and
        RLF     R_hi,F          ;   pick up the carry bit in the
process.
        RLF     N_lo,F          ;Adjust N for the next iteration.
        RLF     N_hi,F
        DECFSZ  count,F
          goto  v3_1

        RETURN

In another message, I alluded to the possibility of a 16bit-divide-by-
10 routine that would execute in 50 cycles. Maybe Dmitry will want
to take a whack at it?

I hope this is enough to keep you busy for a while.

Scott