On Sat, Oct 20, 2012 at 10:52 AM, Bob Ammerman wrote:
> The FF1L instruction will let you find the one bit.
> You should then be able to use a multibit shift with the shift count in the
> register to do the variable shift at just a couple of instructions per word.
>
> -- Bob Ammerman

I just woke up, so I may be misunderstanding part of the original question,
but the OP and Bob have noted that you can find the Most Significant 1 Bit
(MS1B) with the FF1L instruction. Can't you then place the value in an
accumulator and use the SFTAC instruction as follows:

SFTAC Acc,Wn

This uses the barrel shifter to do a multibit shift in one cycle.

-Scott

> RAm Systems
>
>
> -----Original Message-----
> From: piclist-bounces@mit.edu [mailto:piclist-bounces@mit.edu] On Behalf Of
> Joe Mickley
> Sent: Saturday, October 20, 2012 10:47 AM
> To: Microcontroller discussion list - Public.
> Subject: Re: [PIC] Fast 32 bit by 32 bit divide in a dsPIC30F
>
> Follow on to my last post. Started working on the implementation suggested
> in Hackers Delight. I'm working from Figure 9-3 and converting it to ASM as
> I go. Book in one hand, keyboard in one hand, coffee in one hand (wait a
> minute, how is that working ...)
>
> The initial results are NOT good. My original goal was to improve on the
> divide from the Microchip lib implementation which takes around 400
> instruction cycles and which I consider too long. I have gotten part way
> thru the Hackers Delight implementation and I can clearly see that this is
> not going to produce a substantially (lets just define "substantial" as at
> least 2:1 better in time) better solution.
>
> The problem is that the routine starts off by normalizing the denominator
> left in the 32 bit register pair until the first "1" bit in the denominator
> is aligned with the MS bit of the INT32. That means that if the denominator
> is a smallish value there will be a lot of left shifting.
> Further, the routine takes the numerator along with it as it shifts left.
> Since the numerator was already an INT32, internally it now becomes an
> INT64 as part of the shift process.
>
> I reduced the total number of left shifts by first looking to see if there
> were any "1"s at all in the MS word of the INT32 denominator. If there are
> none there, then my routine simply uses MOV instructions to promote the
> INT16 words up by 1 word in the registers. After that it then looks to see
> where the first "1" is in the register and iteratively shifts left until
> that bit aligns with the MS bit of the INT32 register.
>
> The problem is that all of that left shifting (denominator as an INT32 and
> numerator as an INT64) takes about 11 cycles/bit. The result is that for a
> small denominator (worst case denominator = 1) the total time is about 200
> cycles. Thats before I actually get to the section of code that is going to
> do the actual dividing.
>
> The routine has 2 DIVIDE operations. I know each one will take 17 cycles.
> That is 34 more. There is (not yet coded) an iterative "multiply and
> compare" function. That too is multi-word. MUL goes fast but many compare
> (CP) instructions will still count up (somewhere in there are some decision
> functions, else why compare) and I haven't looked at the iteration
> criteria.
>
> This is not good. I'm not even close to being done and I have already
> burned about 260 cycles worst case. The "substantial improvement" idea is
> dead and it's going to get worse. I am going to keep plugging away for a
> bit longer, but I don't think this is going to be the solution.
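
To make the idea concrete, here is a rough C model of the normalization
step with the bit-at-a-time loop replaced by "find the first one, then do
a single variable shift". It is only a sketch of the logic, not dsPIC
assembly, and it says nothing about cycle counts. The names ff1l16,
norm_t and normalize32 are made up for this example, and ff1l16 is my
assumption of how FF1L reports its result (1-based position counted from
the MS bit, 0 if no bit is set), so check that against the programmer's
reference before relying on it.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for FF1L on one 16-bit word: returns the 1-based
 * position of the first set bit counted from the MS bit (1..16), or 0 if
 * the word is zero. This convention is an assumption for the sketch. */
static unsigned ff1l16(uint16_t w)
{
    for (unsigned pos = 1; pos <= 16; pos++) {
        if (w & 0x8000u)
            return pos;
        w = (uint16_t)(w << 1);
    }
    return 0;
}

/* Result of normalizing: denominator with its MS bit set, numerator
 * widened to 64 bits and shifted by the same amount, and the shift used. */
typedef struct {
    uint32_t den;
    uint64_t num;
    unsigned shift;
} norm_t;

/* Normalize a non-zero 32-bit denominator and carry the numerator along. */
static norm_t normalize32(uint32_t den, uint32_t num)
{
    assert(den != 0);               /* divide by zero is handled elsewhere */

    norm_t r = { den, num, 0 };

    /* Step 1: if the MS word is empty, promote the LS word by a whole
     * word (the MOV trick from the quoted post) instead of shifting. */
    if ((r.den >> 16) == 0) {
        r.den <<= 16;
        r.shift = 16;
    }

    /* Step 2: locate the first one in the MS word and shift once by the
     * remaining distance, rather than looping one bit at a time. */
    unsigned extra = ff1l16((uint16_t)(r.den >> 16)) - 1;
    r.den <<= extra;
    r.shift += extra;

    /* Step 3: apply the same total shift to the widened numerator. */
    r.num <<= r.shift;
    return r;
}
```

With this model the worst case Joe mentions (denominator = 1) works out
to one word promotion plus one shift by 15, for a total shift of 31,
instead of 31 separate single-bit shifts. On the real part the 32-bit and
64-bit shifts would still have to be built from 16-bit operations, so the
saving depends on how cheaply those multiword shifts can be coded.
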
-- 
http://www.piclist.com PIC/SX FAQ & list archive
View/change your membership options at
http://mailman.mit.edu/mailman/listinfo/piclist