I've thought a little about a PIC code optimizer, but quickly got discouraged at how much work it was compared to just writing another program and going on... I was thinking along the lines of what I understand sophisticated optimizing compilers do.

Most high-level languages enforce rules that make a program conform to a tree structure. In C, main() calls other functions, which in turn call other functions... Some high-level-language IDEs even display this call tree in a little window. Individual statements in most languages are also tree structures, grammatically. The parsing process generally creates a parse tree in memory. The optimizer usually starts on this parse tree, applying heuristics that recognize various patterns in the tree and rearrange the branches so that the code will be more efficient.

For example, if you write x = b + a*c, on many processors it is more efficient to do the multiply first, which leaves a*c in some register, then add b to it. Math (addition is commutative) says it's OK to do so.

A parse tree:

        =
       / \
      x   +
         / \
        b   *
           / \
          a   c

The tree can be optimized by recognizing that the + has two things "hanging" from it, one a simple variable and the other the result of some other computation. The heuristic says "swap them so that the complicated one comes first, and the simple one after it." This is a pretty dumb and simple example, but hopefully it gives you the shape of the thing...

The first thing I would like to get is the call tree. That alone could give a definitive answer to the question "can I have a stack overflow?" At that point, it might even be possible for the tool to automatically identify any code that is called from only one place, and replace the call-return with goto-goto, or even drop it inline. That alone just might fix the stack overflow. While we are at this level, the optimizer might also be able to recognize when temporary variables are used in such a way that they could share the same physical RAM location.
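To make the swap heuristic concrete, here is a minimal sketch of it in Python. Everything here (the Node class, the function names) is illustrative, not from any real compiler: a commutative operator with one leaf operand and one subtree operand gets its operands reordered so the subtree is evaluated first.

```python
# Hypothetical sketch of the swap heuristic described above.
# Assumption: only binary operators, and + and * are the commutative ones.

class Node:
    def __init__(self, op, left=None, right=None):
        self.op = op          # operator symbol or variable name
        self.left = left
        self.right = right

    def is_leaf(self):
        return self.left is None and self.right is None

COMMUTATIVE = {"+", "*"}

def swap_commutative(node):
    """Recursively reorder commutative operands so the deeper subtree comes first."""
    if node is None or node.is_leaf():
        return node
    node.left = swap_commutative(node.left)
    node.right = swap_commutative(node.right)
    if node.op in COMMUTATIVE and node.left.is_leaf() and not node.right.is_leaf():
        node.left, node.right = node.right, node.left
    return node

# Right-hand side of  x = b + a*c
tree = Node("+", Node("b"), Node("*", Node("a"), Node("c")))
tree = swap_commutative(tree)
# The a*c subtree now comes first: (a*c) + b
```

A real optimizer would of course work over the full parse tree from the statement level down, but the shape of the pass (recurse, then pattern-match and rearrange) is the same.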
The next thing I would like is for my optimizer to recognize "modules" of code and analyze how closely they are related to one another on the basis of their calls and gotos. Two modules with a lot of calls and gotos between them are more closely related than two modules that neither call nor goto one another. Clearly it would be beneficial to put closely related modules in the same page. So the optimizer would compute pairwise "relatedness" scores for the modules, and then start dropping them into pages on that basis. The next thing it needs to know is how big the modules are, so that it can tell when a page is filled up. Initially this is just an estimate, but it probably won't be very far off.

Perhaps you get the idea. Eventually, a code spitter just goes through the whole thing and spits out the actual code. It could be a little trying to debug, since it will no longer be in the same order as the source code, but...

The problem with doing all this kind of stuff with assembly code is that assembly code does not have to be structured in a way that makes it work (it's legal to goto or call from anywhere to anywhere), and at the same time it provides very few clues about the actual structure of the code. There's nothing simple like the {} around a function in C. Assembly language just doesn't lend itself to automatic optimization very well, which is why there are languages like C and optimizing C compilers.

So, in the end, if I did all this and did a really good job of it, I might wind up with something almost as good as a really good optimizing C compiler. But other people have already done some pretty good C compilers... So, I can hand write and hand optimize assembly code, or I can use a C compiler. Sometimes the hand work seems justified, and results in squeezing the application into a one-size-smaller chip than I could get with C. Other times the extra labor of assembler doesn't pay and hurts the time-to-market.
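The relatedness-then-packing idea above can be sketched in a few lines. This is a toy, not a real tool: the module names, sizes, and reference counts are made up, the page size assumes a mid-range PIC, and the placement is a simple greedy pass (seed each page with the largest remaining module, then keep adding whichever fitting module is most related to what is already on the page).

```python
# Toy sketch of page placement by pairwise relatedness.
# Assumptions: PAGE_SIZE of 512 words (mid-range PIC), and made-up
# module sizes and call/goto counts for illustration.

PAGE_SIZE = 512

# module -> estimated size in words
sizes = {"main": 120, "uart": 200, "lcd": 300, "isr": 90, "math": 250}

# (caller, callee) -> number of call/goto references between them
refs = {("main", "uart"): 8, ("main", "lcd"): 2,
        ("uart", "isr"): 6, ("lcd", "math"): 5}

def relatedness(a, b):
    """Symmetric score: total calls/gotos in either direction."""
    return refs.get((a, b), 0) + refs.get((b, a), 0)

def place_modules(sizes):
    """Greedy placement: fill each page with the most related modules that fit."""
    pages, unplaced = [], set(sizes)
    while unplaced:
        # seed the page with the largest remaining module
        seed = max(unplaced, key=lambda m: sizes[m])
        page, free = [seed], PAGE_SIZE - sizes[seed]
        unplaced.discard(seed)
        while True:
            fits = [m for m in unplaced if sizes[m] <= free]
            if not fits:
                break
            # pick the module most related to anything already on the page
            best = max(fits, key=lambda m: sum(relatedness(m, p) for p in page))
            page.append(best)
            unplaced.discard(best)
            free -= sizes[best]
        pages.append(page)
    return pages

print(place_modules(sizes))
```

A real tool would iterate, since inserting or removing page-setting code changes the module sizes (exactly the instability the quoted message below wrestles with), but the greedy pass gives the flavor of it.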
> -----Original Message-----
> From: Roman Black [mailto:fastvid@EZY.NET.AU]
> Sent: Wednesday, May 02, 2001 12:19 PM
> To: PICLIST@MITVMA.MIT.EDU
> Subject: Re: [PIC]: Super Assembler (parser)
>
> Douglas Wood wrote:
> >
> > It would have to be at least a three pass assembler.
> >
> > How would you account for the movement of code near the end of the
> > program as a result of the insertion or deletion of page directives
> > earlier in the code? E.g. You process a forward goto that you see
> > does not require a page setting because it is in the same page, but
> > later you have to add page setting code for another goto between the
> > first one and the first one's target. Now the first goto's target is
> > pushed out of range and requires paging code. Going back to insert
> > it might even cause the second goto to be pushed in range of its
> > target, and removal of its page setting code may bring the first
> > goto's target back in range.
>
> Not necessarily. I imagined something more "modular". The parser knows
> the PIC type and how many pages there are.
>
> Then maybe each function is treated as a separate module, in the way
> that each code page would be filled with "modules", and no module
> would span a page boundary.
>
> A trick would be to allow an AVERAGE number of words (code space) for
> every branch when placing modules. So as it starts from the front,
> each module has its branch code modified; then, hopefully, because you
> allowed for the average size, some will compress and some will expand,
> so most modules will not need to be moved. A little extra headroom
> might help too.
>
> I like the module idea. It's rare that any one function would be more
> than 512 words (or especially 2k words), so a "smart" parser that can
> move function modules around is not too hard to implement.
> -Roman
>
> --
> http://www.piclist.com hint: PICList Posts must start with ONE topic:
> [PIC]:,[SX]:,[AVR]: ->uP ONLY! [EE]:,[OT]: ->Other [BUY]:,[AD]: ->Ads