I've thought a little about a PIC code optimizer, but quickly got discouraged at how much work it was compared to just writing another program and going on... I was thinking along the lines of what I understand sophisticated optimizing compilers do.

Most high-level languages enforce rules that make a program conform to a tree structure. In C, main() calls other functions, which in turn call other functions... Some high-level-language IDEs even display this call tree in a little window. Individual statements in most languages are also tree structures, grammatically. The parsing process generally creates a parse tree in memory. The optimizer usually starts on this parse tree, applying heuristics that recognize various patterns in the tree and rearrange the branches so that the code will be more efficient.

For example, if you write x = b + a*c, on many processors it is more efficient to do the multiply first, which leaves a*c in some register, then add b to it. Math (addition is commutative) says it's OK to do so.

A parse tree:

        =
       / \
      x   +
         / \
        b   *
           / \
          a   c

The tree can be optimized by recognizing that the + has two things "hanging" from it, one a simple variable and the other the result of some other computation. The heuristic says "swap them so that the complicated one comes first, and the simple one after it." This is a pretty dumb and simple example, but hopefully it gives you the shape of the thing...

The first thing I would like to get is the call tree. That alone could give a definitive answer to the question "can I have a stack overflow?" At that point, it might even be possible for the tool to automatically identify any code that is called from only one place, and replace the call-return with goto-goto, or even drop it inline. That alone just might fix the stack overflow. While we are at this level, the optimizer might also be able to recognize when temporary variables are used in such a way that they could share the same physical RAM location.
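To make the swap heuristic concrete, here is a minimal sketch of it in Python. Everything here (the Node class, the function names) is illustrative, not from any real compiler: a commutative operator with one leaf operand and one subtree operand gets its operands reordered so the subtree is evaluated first.

```python
# Hypothetical sketch of the swap heuristic described above.
# Assumption: only binary operators, and + and * are the commutative ones.

class Node:
    def __init__(self, op, left=None, right=None):
        self.op = op          # operator symbol or variable name
        self.left = left
        self.right = right

    def is_leaf(self):
        return self.left is None and self.right is None

COMMUTATIVE = {"+", "*"}

def swap_commutative(node):
    """Recursively reorder commutative operands so the deeper subtree comes first."""
    if node is None or node.is_leaf():
        return node
    node.left = swap_commutative(node.left)
    node.right = swap_commutative(node.right)
    if node.op in COMMUTATIVE and node.left.is_leaf() and not node.right.is_leaf():
        node.left, node.right = node.right, node.left
    return node

# Right-hand side of  x = b + a*c
tree = Node("+", Node("b"), Node("*", Node("a"), Node("c")))
tree = swap_commutative(tree)
# The a*c subtree now comes first: (a*c) + b
```

A real optimizer would of course work over the full parse tree from the statement level down, but the shape of the pass (recurse, then pattern-match and rearrange) is the same.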
The next thing I would like is for my optimizer to recognize "modules" of code and analyze how closely they are related to one another on the basis of their calls and gotos. Two modules with a lot of calls and gotos between them are more closely related than two modules that neither call nor goto one another. Clearly it would be beneficial to put closely related modules in the same page. So the optimizer would compute pairwise "relatedness" scores for the modules, and then start dropping them into pages on that basis. The next thing it needs to know is how big the modules are, so that it can tell when a page is filled up. Initially this is just an estimate, but it probably won't be very far off.

Perhaps you get the idea. Eventually, a code spitter just goes through the whole thing and spits out the actual code. It could be a little trying to debug, since it will no longer be in the same order as the source code, but...

The problem with doing all this kind of stuff with assembly code is that assembly code does not have to be structured in a way that makes it work (it's legal to goto or call from anywhere to anywhere), and at the same time it provides very few clues about the actual structure of the code. There's nothing simple like the {} around a function in C. Assembly language just doesn't lend itself to automatic optimization very well, which is why there are languages like C and optimizing C compilers.

So, in the end, if I did all this and did a really good job of it, I might wind up with something almost as good as a really good optimizing C compiler. But other people have already done some pretty good C compilers... So, I can hand write and hand optimize assembly code, or I can use a C compiler. Sometimes the hand work seems justified, and results in squeezing the application into a one-size-smaller chip than I could get with C. Other times the extra labor of assembler doesn't pay and hurts the time-to-market.
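The relatedness-then-packing idea above can be sketched in a few lines. This is a toy, not a real tool: the module names, sizes, and reference counts are made up, the page size assumes a mid-range PIC, and the placement is a simple greedy pass (seed each page with the largest remaining module, then keep adding whichever fitting module is most related to what is already on the page).

```python
# Toy sketch of page placement by pairwise relatedness.
# Assumptions: PAGE_SIZE of 512 words (mid-range PIC), and made-up
# module sizes and call/goto counts for illustration.

PAGE_SIZE = 512

# module -> estimated size in words
sizes = {"main": 120, "uart": 200, "lcd": 300, "isr": 90, "math": 250}

# (caller, callee) -> number of call/goto references between them
refs = {("main", "uart"): 8, ("main", "lcd"): 2,
        ("uart", "isr"): 6, ("lcd", "math"): 5}

def relatedness(a, b):
    """Symmetric score: total calls/gotos in either direction."""
    return refs.get((a, b), 0) + refs.get((b, a), 0)

def place_modules(sizes):
    """Greedy placement: fill each page with the most related modules that fit."""
    pages, unplaced = [], set(sizes)
    while unplaced:
        # seed the page with the largest remaining module
        seed = max(unplaced, key=lambda m: sizes[m])
        page, free = [seed], PAGE_SIZE - sizes[seed]
        unplaced.discard(seed)
        while True:
            fits = [m for m in unplaced if sizes[m] <= free]
            if not fits:
                break
            # pick the module most related to anything already on the page
            best = max(fits, key=lambda m: sum(relatedness(m, p) for p in page))
            page.append(best)
            unplaced.discard(best)
            free -= sizes[best]
        pages.append(page)
    return pages

print(place_modules(sizes))
```

A real tool would iterate, since inserting or removing page-setting code changes the module sizes (exactly the instability the quoted message below wrestles with), but the greedy pass gives the flavor of it.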
> -----Original Message-----
> From: Roman Black [mailto:fastvid@EZY.NET.AU]
> Sent: Wednesday, May 02, 2001 12:19 PM
> To: PICLIST@MITVMA.MIT.EDU
> Subject: Re: [PIC]: Super Assembler (parser)
>
> Douglas Wood wrote:
> >
> > It would have to be at least a three pass assembler.
> >
> > How would you account for the movement of code near the end of the
> > program as a result of the insertion or deletion of page directives
> > earlier in the code? E.g. You process a forward goto that you see
> > does not require a page setting because it is in the same page, but
> > later you have to add page setting code for another goto between the
> > first one and the first one's target. Now the first goto's target is
> > pushed out of range and requires paging code. Going back to insert
> > it might even cause the second goto to be pushed in range of its
> > target, and removal of its page setting code may bring the first
> > goto's target back in range.
>
> Not necessarily. I imagined something more "modular". The parser knows
> the PIC type and how many pages there are.
>
> Then maybe each function is treated as a separate module, in the way
> that each code page would be filled with "modules", and no module
> would span a page boundary.
>
> A trick would be to allow an AVERAGE number of words (code space) for
> every branch when placing modules. So as it starts from the front,
> each module has its branch code modified; then, hopefully, because you
> allowed for the average size, some will compress and some will expand,
> so most modules will not need to be moved. A little extra headroom
> might help too.
>
> I like the module idea. It's rare that any one function would be more
> than 512 words (or especially 2k words), so a "smart" parser that can
> move function modules around is not too hard to implement.
> -Roman
>
> --
> http://www.piclist.com hint: PICList Posts must start with ONE topic:
> [PIC]:,[SX]:,[AVR]: ->uP ONLY! [EE]:,[OT]: ->Other [BUY]:,[AD]: ->Ads