Adam Bryant wrote:
> A small, slower approach is to rotate the source byte right one bit
> (putting the rightmost bit in the carry flag), then rotate the
> destination byte left one bit (putting the carry flag in the rightmost
> bit of your destination register). Repeat this cycle 8 times. Your
> destination byte will have to be a temp register, then when done move the
> temp register back to the original source register. I would estimate
> this could be done in 10 to 15 bytes of code and 50 to 70 instruction
> cycles.
>
> Obviously I am thinking like the C programmer I am and someone like
> Dimitry could probably give you a method for accomplishing the same thing
> in 5 bytes of code and 10 cycles. :-)

Adam is correct about the options, but even in assembler the trade-offs remain:

1) A direct flow routine:
   8 instructions shifting bits out
   8 instructions shifting bits in
   1 instruction swapping bytes
   Total Program Instructions: 17
   Total Instructions Executed: 17

2) A loop routine:
   1 instruction setting loop count = 8
   1 instruction shifting bit out
   1 instruction shifting bit in
   1 instruction decrementing count and looping back
   1 instruction to swap bytes
   Total Program Instructions: 5
   Total Instructions Executed: 26

3) Lookup table (16 bytes of nibbles):
   1 instruction to set the lookup base address
   1 instruction to save the original byte
   1 instruction to AND the original byte lower nibble
   1 instruction to get the table nibble
   1 instruction to save this table byte
   1 instruction to retrieve the original byte
   1 instruction to swap original byte nibbles
   1 instruction to AND the original byte lower nibble
   1 instruction to get the table nibble
   1 instruction to OR with previous table nibble
   1 instruction to swap final byte nibbles
   Total Program Instructions: 11 + 16 (table)
   Total Instructions Executed: 11

4) Lookup table (256 bytes):
   1 instruction to set the lookup base address
   1 instruction to get the table byte
   Total Program Instructions: 2 + 256 (table)
   Total Instructions Executed: 2

As you can see, execution speed always costs a lot in program size.
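
Since Adam mentioned thinking in C, here is a rough C sketch of the loop and the two table approaches listed above, only to make the size/speed trade-off concrete. All names (reverse_loop, reverse_nibbles, reverse_table, nibble_rev, byte_rev, init_byte_rev, check_reversers) are made up for illustration, and the instruction counts above apply to hand-written assembler, not to whatever a C compiler emits. The nibble version shifts the reversed low nibble into the high half before OR-ing, which is one possible way to arrange the combine step.

```c
#include <assert.h>
#include <stdint.h>

/* Loop approach: move one bit per pass from the low end of src
 * into the low end of dst, eight passes in total. */
static uint8_t reverse_loop(uint8_t src)
{
    uint8_t dst = 0;
    for (int i = 0; i < 8; i++) {
        dst = (uint8_t)((dst << 1) | (src & 1u)); /* shift bit in  */
        src >>= 1;                                /* shift bit out */
    }
    return dst;
}

/* 16-entry table: nibble_rev[n] is the 4-bit value n with its bits reversed. */
static const uint8_t nibble_rev[16] = {
    0x0, 0x8, 0x4, 0xC, 0x2, 0xA, 0x6, 0xE,
    0x1, 0x9, 0x5, 0xD, 0x3, 0xB, 0x7, 0xF
};

/* Nibble-table approach: the reversed low nibble becomes the new high
 * nibble, and the reversed high nibble becomes the new low nibble. */
static uint8_t reverse_nibbles(uint8_t src)
{
    return (uint8_t)((nibble_rev[src & 0x0F] << 4) | nibble_rev[src >> 4]);
}

/* 256-entry table. On a small micro this would normally be a constant
 * table in program memory; filling it at startup is an assumption made
 * here to keep the sketch short. */
static uint8_t byte_rev[256];

static void init_byte_rev(void)
{
    for (int i = 0; i < 256; i++)
        byte_rev[i] = reverse_loop((uint8_t)i);
}

/* Full-table approach: a single indexed read. */
static uint8_t reverse_table(uint8_t src)
{
    return byte_rev[src];
}

/* Sanity check: all three versions must agree for every input byte. */
static void check_reversers(void)
{
    init_byte_rev();
    for (int i = 0; i < 256; i++) {
        uint8_t b = (uint8_t)i;
        assert(reverse_loop(b) == reverse_nibbles(b));
        assert(reverse_loop(b) == reverse_table(b));
    }
}
```

Calling check_reversers() once at startup fills the 256-byte table and confirms the three versions agree; after that, reverse_table(0x01) gives 0x80, at the price of 256 bytes of table.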