> John Payson wrote:
[binary to decimal routine]

> > Anyone want to give a go at figuring out how it works?
>
> Sure!
>
> <code snipped>

> This gives us five equations:
> b_0 = 6*(a_3 + a_2 + a_1) + a_0
> b_1 = a_3*9 + a_2*5 + a_1
> b_2 = 2*a_2
> b_3 = 4*a_3
> b_4 = 0
>
> Which as John says, must be "normalized". Normalization in
> this context means we need to reduce the b_j's such that
>   0 <= b_j <= 9
>
> In other words we need to find:
>
> c_0 = b_0 mod 10
> b_1 = (b_1 + (b_0 - c_0)/10)
> c_1 =  b_1 mod 10
> b_2 = (b_2 + (b_1 - c_1)/10)
> c_2 = b_2 mod 10
> b_3 = (b_3 + (b_2 - c_2)/10)
> c_3 = b_3 mod 10
> c_4 = (b_4 + (b_3 - c_3)/10) mod 10
>     = (b_2 - c_2)/10
>
> Division by 10 can be done quite efficiently (as was shown
> in another thread several months ago). However, it does require
> a significant amount of code space compared to say repeated
> subtractions. Unfortunately, there can be very many subtractions
> that are required. For example, b_1 could be as large as 15*16,
> or 240. 10 would have to be subtracted 24 times if you wish to
> compute 240 mod 10. I presume John realized this inefficiency
> and thus sought to express N so that repeated subtractions
> could be used and that the total number of subtractions are
> minimized. This leads to the next way that N can be expressed:

Actually, a bigger concern than that was that not multiplications by 5 and
9 are particularly efficient; if I were to make the routine perform the
shifts in proper sequence I might be able to do those decently, but I'd
lose the other advantage of my routine: one half of the number is processed,
then the other; this allows it to be run effectively with FSR/IND.

In any event, the most important aspect of the pre-normalization process is
that all of the b_ items fit in 8 bits and that they all have the same sign.
I could probably have improved the efficiency a little if I'd used a "zipper"
algorithm (e.g. some terms are known to be positive while others are known to
be negative) but such things can be a little bit tricky.

> Carrying this concept out for the rest of the b_i's we have.
>
> b_0 = a_0 - 4*(a_3 + a_2 + a_1) - 20
> b_1 = 6*a_2 + 2*a_1 + 2 - 140
>     = 6*a_2 + 2*a_1 - 138
> b_2 = a_3 + 2*a_2 + 14 - 60
>     = a_3 + 2*a_2 - 46
> b_3 = 4*a_3 + 6 - 70
>     = 4*a_3 - 64
> b_4 = 0 + 7
>     = 7
>
> And if you look at John's code closely, you will see this is
> how he has expressed the b_j's.

Yup.  I think if I were putting this in a production environment, the three
biggest changes I'd probably make would be:

  [1] Replace "ANDLW $0F; ADDLW $F0" with "IORLW $F0" as someone was kind
      enough to suggest.

  [2] Probably add code so if b_0<=-128, b_0+=100 and b_2-=1 and likewise for
      b_1 -> b_3.

  [3] Possibly add code to cast out 30's [in this case, I would make it so
      all the terms were known-positive and subtract 30's until they went
      negative then add 10's to make them positive again.