On Fri, 14 Mar 2008, Tamas Rudnai wrote:
> Sergio,
>
> On the 'XCSB samples and circuits' section, the led-01.bas is _not_ compiled
> with code optimization I suppose? (It's not a new flame, I am just curious
> about how compact code can be generated by an HLL, how could it be compared
> to what I would write in asm).
>
> Many thanks,
> Tamas

Hi Tamas,

Yes, it is compiled with code optimisation as standard, but the example you
see was compiled with a much older version of XCSB than is available today.

In this example you will find the program startup (present in startup.asm),
the interrupt service prolog and epilog code and the heartbeat timer code
(all of which is also present in startup.asm), and the 32 bit integer
addition and comparison code (used by the big delays).

The few lines of code given in this example would probably compile to more
or less what you might write in assembler. You may also find that a really
good C compiler will produce about the same (for this example). However, a
cheap and cheerful C compiler will generate much more code and it will be
much less efficient (take longer to execute). The same is true of many BASIC
compilers, including the expensive ones.

The real strength of an HLL lies in compiling big programs (certainly much
bigger than the simple example shown). With big programs (ASM and HLL) you
tend to break them down into subroutines (or functions). Each subroutine
would typically need input parameters, local variables and return results.

To make ASM subroutines better than straightforward inline ASM you tend to
try to make them reusable within the same code. So you end up defining a
protocol to pass info between the subroutine and the caller.

You might decide to pass a value to the subroutine by loading it into W and
return the result again in W. This is OK for an 8 bit value, but what about
16 bit values or multiple parameters? You would probably end up reserving
some RAM locations to pass the parameters, and you might need to do the same
thing when you return the result because it won't fit in W. Another
complication is that you might need to use W to calculate the address of the
subroutine you are calling (maybe to set up PCLATH).

To clarify: in one place you might want to pass the value of a variable as
the parameter to your subroutine, in another place you might want to pass a
constant, and in yet another place you might want to pass the value of an
element of an array. So the obvious solution is to define a general purpose
way of passing the parameter through a RAM location.

e.g. (1)

        movlw   (MAX_DLY >> 8) & 0xff
        movwf   arg0+1
        movlw   MAX_DLY & 0xff
        movwf   arg0+0
        call    wait

e.g. (2)

        movf    max_dly+1,w
        movwf   arg0+1
        movf    max_dly+0,w
        movwf   arg0+0
        call    wait

This has two serious implications:

(1) it takes time and instructions to set up the parameter before you call
    the subroutine

(2) the place where you store the parameter must be safe (you cannot put
    anything else there until the called subroutine has finished with it)

An HLL compiler can do all this for you automatically. It will keep track of
the protocol and generate all the necessary code AND it will keep track of
memory locations used for passing those parameters, reusing them for other
subroutines only when it is safe.
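Just to make the callee side of that protocol concrete: the body of the wait
routine called above isn't shown, but a minimal sketch (assuming a mid-range
PIC, with the 16 bit count arriving in arg0+0/arg0+1 as in the examples)
might look something like this:

wait                            ; hypothetical 16 bit busy-wait
        movf    arg0+0,w        ; W = low byte of count
        iorwf   arg0+1,w        ; Z set only if both bytes are zero
        btfsc   STATUS,Z
        return                  ; count exhausted - done
        movlw   1
        subwf   arg0+0,f        ; decrement low byte
        btfss   STATUS,C        ; borrow out of the low byte?
        decf    arg0+1,f        ; yes - propagate into the high byte
        goto    wait

Note that it counts arg0 down in place, which is exactly why the caller has
to treat arg0 as off limits until the call returns - implication (2) above.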
NOW COMES THE MAGIC!!!

As you develop your code, you will notice that sometimes a subroutine you
have written only ever takes a certain variable or constant as a parameter.
So you go back and massage the subroutine so that it no longer needs that
parameter; instead you EMBED the constant or variable directly into the code
of the subroutine. Then you might notice that your subroutine is doing a lot
less work because it no longer needs to play around with extra RAM accesses,
or the pointer you were passing no longer needs to be dereferenced because
you can access the memory directly. So your code gets tighter and tighter
until you notice that it's reduced to only a few instructions, at which
point you might decide to turn it into a macro.

Now imagine a compiler that does all that checking for you each and every
time you make a change.

IT GETS BETTER!!!

The HLL might also be able to optimise the way the result is generated so
that instead of passing the result back through some reserved RAM it
actually computes the result in situ: in the variable where the result needs
to be stored.

e.g.

        proc int sum(int a, int b, int c)
                return a + b + c
        endproc

        x = sum(a, b, c)

generates the same code as:

        x = a + b + c

without any call, return or copy overhead. And this goes even further:

        proc int diff(j, k)
                return j - k
        endproc

        x = sum(a, diff(b, c), d)

generates the same code as:

        x = a + (b - c) + d
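As a rough illustration of what that means at instruction level, here is a
sketch of what both forms could compile down to - assuming, for brevity,
8 bit unsigned variables on a mid-range PIC (a 16 bit int version would just
repeat the pattern for the high bytes with carry handling):

        movf    a,w             ; W = a
        addwf   b,w             ; W = a + b
        addwf   c,w             ; W = a + b + c
        movwf   x               ; result computed straight into x

Three parameter copies, a call, a return and a result copy have all
disappeared.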
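And the nested x = sum(a, diff(b, c), d) could, under the same assumptions,
come out as:

        movf    c,w             ; W = c
        subwf   b,w             ; W = b - c
        addwf   a,w             ; W = a + (b - c)
        addwf   d,w             ; W = a + (b - c) + d
        movwf   x

with diff folded straight into the middle of the expression - no trace of
either call remains.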
TIP OF THE ICEBERG!!!

Now imagine your assembler source code in front of you. You should be able
to break this down into groups of instructions. Within each group you will
have a protocol - the way information passes from instruction to
instruction. You will concentrate on optimising each group so that

e.g. (1)

        movlw   0x80
        iorwf   fred,f

becomes

        bsf     fred,7

e.g. (2)

        movlw   (0x1000000 >> 24) & 0xff
        movwf   fred+3
        movlw   (0x1000000 >> 16) & 0xff
        movwf   fred+2
        movlw   (0x1000000 >> 8) & 0xff
        movwf   fred+1
        movlw   0x1000000 & 0xff
        movwf   fred+0

becomes

        movlw   (0x1000000 >> 24) & 0xff
        movwf   fred+3
        clrf    fred+2
        clrf    fred+1
        clrf    fred+0

and the compiler will do the same as you. But the compiler will also check
obscure things that would probably escape your notice, like

e.g.

MDF     .equ    16777216

        movlw   (MDF >> 24) & 0xff
        movwf   fred+3
        movlw   (MDF >> 16) & 0xff
        movwf   fred+2
        movlw   (MDF >> 8) & 0xff
        movwf   fred+1
        movlw   MDF & 0xff
        movwf   fred+0

Here the compiler would notice that MDF is actually 0x1000000 and use clrf
for the low bytes instead.

Ok, so this intra-group-protocol thing may seem like a big "so what", but it
has real benefits when you start looking at the size of variables. Consider
adding an 8 bit unsigned variable (jack) to a 16 bit variable (fred). You
can either convert the 8 bit value (jack) to a 16 bit value (jack2) and then
add that (to fred) - the simple one-size-fits-all solution

e.g.

        movf    jack,w
        movwf   jack2+0
        clrf    jack2+1

        movf    jack2+0,w
        addwf   fred+0,f
        btfsc   STATUS,C
        incf    fred+1,f
        movf    jack2+1,w
        addwf   fred+1,f

or you can go the extra mile and generate the following optimised code

        movf    jack,w
        addwf   fred+0,f
        btfsc   STATUS,C
        incf    fred+1,f

Now we have three protocols: one that deals with 8 bit to 8 bit, one that
deals with 16 bit to 16 bit and one that deals with 8 bit to 16 bit.

8 bit to 8 bit and 16 bit to 16 bit are easy to do in assembler using
macros, but the more combinations you have to deal with in assembler the
easier it is to get it wrong (cause a bug), because the assembler cannot
keep track of the size of a variable and use the appropriate macro - the
assembler programmer has to do that.

Ok, only 3 protocols, not a big deal you might argue. What about adding some
more useful ones then:

        8 bit to 32 bit
        16 bit to 32 bit
        32 bit to 32 bit

Then there are also the constant variations:

        8 bit constant to 8 bit
        16 bit constant to 16 bit
        32 bit constant to 32 bit
        8 bit constant to 16 bit
        16 bit constant to 32 bit

And then there are the variations where W holds the least significant 8 bits
of the value, and variations dealing with the value being pointed to by FSR.

The point is that there are lots of different intra-group-protocols to
choose from in order to generate optimised code. Just defining a few macros
won't cut it.

You will also have a protocol between groups of instructions so that (e.g.)
one group leaves something ready for another group to process, or (e.g.) a
loop counter is set up at the start of a loop, used in the loop and
decremented at the end of the loop. The compiler is also performing
optimisations based on the way these groups interact

e.g.

        for j=0 while j<10 step j=j+1 do
        done

would generate

        movlw   10
        movwf   j
lab1
        decfsz  j,f
        goto    lab1

Note that the compiler has replaced the up-counting loop and its compare
against 10 with a simple down-count, because decfsz is far cheaper than an
explicit comparison.

AND FINALLY!!!

I keep talking about protocols as though they are a big thing - well, they
are. If you have a rigid protocol then you can end up with some severe
limitations. e.g. in C, if you use a dynamic stack and you insist on passing
everything on the stack, then there is little that you can do to optimise a
call. Similarly, if all your arithmetic is based on 16 bit ints then there
is a big penalty and optimisation suffers.

A good compiler will have lots of protocols at its disposal; it will analyse
the HLL source code and it will be constantly selecting the protocol which
allows it to generate the best executable code. A really good assembler
programmer CAN do all that, but a good compiler WILL do all this every time
there is the slightest change to the source.

Regards
Sergio

-- 
http://www.piclist.com PIC/SX FAQ & list archive
View/change your membership options at
http://mailman.mit.edu/mailman/listinfo/piclist