Hello Scott.

> If you add two horizontal registers together,
> only two instructions are needed:
>
>   movf  r1,w
>   addwf r2,f
>
> If you're implementing phase accumulators, you
> need the msb of the sum and:
>   movf  r1,w
>   addwf r2,f
>   rlf   r2,w
>   rlf   phase,f
>
> If you're implementing multibyte adders (e.g. 16 bits) you need 6
> instructions for the addition. If you're using this as phase accumulator
> then two more instructions are needed to extract the msb. The number of
> instructions required for say 8 counters is:
>   (6 + 2) * 8 = 80


(6 + 2) * 8 = 64, surely? Where did the other 16 cycles go?


> With Dmitry's 6-cycle per stage vertical adder (actually the first stage
> only needs to be two cycles) you need:
>   6 * (stages -1 ) + 2 instructions
> And for 16 bits that comes out to 92 instructions. For 14 bits, the
> horizontal and vertical counters are equivalent and for fewer than 14 (but
> more than 8) the vertical is faster!


Actually, I've noticed another interesting thing that makes vertical
counters both faster and less memory-hungry than horizontal ones.
Phase accumulators are usually used to generate sine and cosine square
waves, i.e. two square waves a quarter period apart.
Let us recall the following trick:

X = 00, 01, 10, 11

X.1 follows the sine square wave ( 0_0_1_1_.. )
and X.1 xored with X.0 follows the cosine ( 0_1_1_0_.. )
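
To check the trick, here is a small Python model (not PIC code, just a sketch). The function name `quadrature` is mine and only for illustration; it takes the top two bits X.1 and X.0 of a phase value and returns the two square-wave bits.

```python
def quadrature(x):
    """Return (sine_bit, cosine_bit) for a 2-bit phase value x."""
    b1 = (x >> 1) & 1      # X.1
    b0 = x & 1             # X.0
    return b1, b1 ^ b0     # sine = X.1, cosine = X.1 xor X.0

# Step the 2-bit phase through 00, 01, 10, 11.
sine = [quadrature(x)[0] for x in range(4)]
cosine = [quadrature(x)[1] for x in range(4)]
print(sine, cosine)
```

In a vertical counter X.1 and X.0 are whole registers holding that bit for all 8 counters, so the single xor gives the cosine bits for every counter at once.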

Generating both sine and cosine requires only one additional stage of
the vertical counter. With 7-bit counters, the horizontal code for one
sine/cosine pair looks like this:

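movlw   const_1         ;W = phase increment
addwf   count_1s,f      ;advance the sine counter
addwf   count_1c,f      ;advance the cosine counter
rlf     count_1s,w      ;msb of the sine counter -> C
rlf     phase_l,f       ;shift C into phase_l
rlf     count_1c,w      ;msb of the cosine counter -> C
rlf     phase_l,f       ;shift C into phase_l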

That is 7 clocks per sine/cosine pair, so 7 * 8 = 56 clocks for 8 pairs,
and it requires 2*8 + (1 or 2) (temp_phase) cells of memory.

With the vertical implementation ( 8 + 1 stages in all ):
6 * 8 + 2 + 1 (an additional xorwf to obtain the cosine) = 51 clocks, and
it requires only 9 cells of memory.
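
For anyone who wants to play with the idea off-chip, below is a rough Python model of vertical counters. It is only a sketch under my own assumptions: it uses a plain ripple-carry add across bit planes (not the 6-cycle stage routine itself), and the names `pack`, `unpack`, `vertical_add`, `STAGES` and `LANES` are hypothetical. Each plane is one byte holding the same bit of all 8 counters.

```python
STAGES = 9   # assumption: one bit plane per memory cell, as counted above
LANES = 8    # eight counters, one per bit position of each plane

def pack(values, stages=STAGES):
    """Turn per-counter integers into bit planes, least significant first."""
    return [sum(((v >> k) & 1) << lane for lane, v in enumerate(values))
            for k in range(stages)]

def unpack(planes, lanes=LANES):
    """Inverse of pack: rebuild each counter's integer value."""
    return [sum(((p >> lane) & 1) << k for k, p in enumerate(planes))
            for lane in range(lanes)]

def vertical_add(acc, inc):
    """Add inc to acc plane by plane; all lanes are processed in parallel."""
    carry = 0
    out = []
    for a, b in zip(acc, inc):
        out.append(a ^ b ^ carry)            # sum bit for every lane
        carry = (a & b) | (carry & (a ^ b))  # carry into the next plane
    return out                               # carry out of the top plane is dropped

counters = [0, 1, 100, 255, 256, 300, 510, 511]
steps = [1, 3, 5, 7, 11, 13, 17, 19]
acc = vertical_add(pack(counters), pack(steps))
totals = unpack(acc)

sine_plane = acc[-1]                # top plane: sine bit of all 8 counters
cosine_plane = acc[-1] ^ acc[-2]    # one xor gives all 8 cosine bits
```

The last two lines are the point: once the counters are stored vertically, the sine bits are already collected in one register and the cosine bits cost a single xor, with no per-counter rlf pair.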

There is probably a way to achieve much better performance once we
understand what John proposed in his 3-clock routine.

WBR Dmitry.

PS. Playing with a RISC uC is a pleasure ;)