mary 1.0a51 - an optimizing native code Forth compiler for PIC microcontrollers
Released in March 2000
Copyright Francisco Rodrigo Escobedo Robles, 2000
Visit http://www.pepix.net/proyectos/glazz/mary/ for info and latest versions
Contact frer@pepix.net for suggestions

mary is free as in freedom, as it's an Open Source work
This version is covered by GNU General Public License version 2

mary is currently in alpha state, so it's released to be used with caution.
I honestly believe it works, but it can cause unknown errors, for which I
cannot be held responsible. Particularly, if the entire Universe collapses
into itself, I am innocent.

This is the mary-manual.txt file

"Size does matter. Who wants a dinosaur in his/her desktop?" - FRER

CONTENTS
--------

0 - Acknowledgements
1 - General
    1.1 - What is mary?
    1.2 - Where does the name mary come from?
    1.3 - WHY IS THE SOURCE IN ALL CAPS?
    1.4 - What is Forth?
    1.5 - What is this PIC thing?
    1.6 - What is the current status of mary?
    1.7 - Why use mary instead of assembler or C?
    1.8 - How good are the optimizations?
    1.9 - What are the system requirements?
    1.10 - Licensing issues
2 - Forth model
    2.1 - What kind of Forth is mary?
    2.2 - How is it implemented on the PIC?
    2.3 - How is mary made?
    2.4 - Is it possible to mix assembler code with Forth?
    2.5 - What is this strange CONST, word?
    2.6 - I can't find my favourite words...
    2.7 - Highlights of new/improved words
    2.8 - Examples of code that take full advantage of optimizations
3 - Files
    3.1 - mary
    3.2 - savestate.fs
    3.3 - restorestate.fs
    3.4 - picasm.fs
    3.5 - picforthsupp.fs
    3.6 - picforth.fs
    3.7 - 16cXX-data.fs
    3.8 - 16cXX-consts.fs
    3.9 - 16cXX-code.fs
4 - Vocabularies
    4.1 - PICASM, the PIC assembler
        4.1.1 - Assembler pseudodirectives
        4.1.2 - Support words
        4.1.3 - PIC assembly language support
        4.1.4 - PIC assembler mnemonics
        4.1.5 - Utilities
    4.2 - PICFORTH, the free mary PIC Forth optimizing native code compiler
        4.2.1 - Forth words
        4.2.2 - Forth support words
        4.2.3 - PIC specific support words
5 - Programming with mary
    5.1 - A little example
Appendixes
    A - A Taoist-like story about Forth

-----

0 - Acknowledgements
====================

- to all the people that have created the marvelous free tools I am using
  including, but not limited to, gforth, all the tools from the GNU project,
  the Linux kernel itself, and so on
- to Charles Moore for creating Forth
- to Wojciech Zabolotny for testing and reporting bugs. The bugfix in the
  HEX file generator is his :)
- to too many more to be listed here. Thanks a lot

1 - General
===========

1.1 - What is mary?
-------------------

"This is a code generator-less compiler for a syntax-less language in which
variables are constants" - FRER

mary is an optimizing native code Forth compiler for PIC microcontrollers.
It generates machine code instructions and optimizes sequences of them,
constant values and logical operators.

mary is part of the Glazz project, whose original aim was world domination,
later changed to the creation of a native code Forth system with multiple
targets. The idea behind Glazz is to create an interesting and portable
application development framework that runs fast and without consuming
resources as if they were unlimited.

Right now, as spare time is scarce, mary, the free PIC Forth native code
optimizing compiler, is the first work in the Glazz project to see the light.
It was planned to run on the Glazz Forth system, but that doesn't exist right
now (at least in executable form).

Back to mary. The code generated is optimized for speed, and the
optimizations try to make it small. The result is (hopefully) an efficient
substitute for assembly language programming.

mary contains an essential but useful subset of Forth, as well as some
extensions unique to PIC programming.

The current target is the mid-range of PIC processors, the ones with a 14-bit
wide instruction set. Low-range PICs, while useful, are not contemplated for
a future release, but who knows. Anyway, since mary does optimize the use of
the hardware return stack, it's possible that mary would compile well for a
2-level stack such as that of low-range PICs. High-range PICs are more likely
to be supported in the future but, again, there are not currently any plans
to do so.

Supplied with version 1.0a50 of mary are the necessary files for compiling
Forth code for the 16C63 and 16C84 PIC models. It's easy to add new PIC
models, so support for the full mid-range is projected for future releases.

1.2 - Where does the name mary come from?
-----------------------------------------

You know how these things are. This is a Forth compiler for PIC
microcontrollers. A PIC Forth. Mary PIC Forth. Do you know who Mary Pickford
was? Remember those good ol' movies? I do.

1.3 - WHY IS THE SOURCE IN ALL CAPS?
------------------------------------

ANS Forth states that a program that uses lowercase for standard definition
names or that depends on case sensitivity has an environmental dependency.
Although mary is not (yet) aimed at ANS Forth compliance, I decided to make
this little effort so that it compiles inside an ANS Forth system. Comments
suffered the same fate, for no good (nor bad) reason.

Depending on the host Forth system it's possible that you can use lowercase
when writing programs with mary, but don't bet on that.

1.4 - What is Forth?
--------------------

There are probably more definitions than there are stars in the sky. Let's
say that Forth is an unusual but useful programming language with a strong
philosophic background. If you learn Forth well, it will help you in your
everyday life too and even in your understanding of the Universe as a whole.
You see, I told you there were too many definitions.

In the late 1960s, Charles "Chuck" Moore worked at a radiotelescope at Kitt
Peak, Arizona (U.S.A.). The main language for developing his control and
statistics programs was FORTRAN. The edit-compile-run-debug cycle was so
boring and inefficient that he was almost forced to find a new way to do
things. He started to write assembly language subroutines, later connected
in address lists. After some time, the first Forth system was born. He
thought of his language as a fourth-generation language (on the
third-generation computers of the time), but the IBM 1130 he worked on
rejected names with more than 5 characters, so FOURTH became FORTH (it was
the all-caps FORTRAN era, too).

Since then, there have been several Forth standards over time, such as
FORTH-78, FORTH-79, FORTH-83 and FIG-FORTH. In 1994, the American National
Standards Institute, Technical Committee X3J14, issued what was called ANS
Forth (X3.215-1994). Its goal was to collect all common programming
practices and create a global standard that contemplated them all or, at
least, invalidated the smallest possible number of programs.

But there are areas in which a full ANS Forth system is not required. As
Chuck Moore, the inventor of Forth, put it: "Underground Forths are still
needed". That is, non-ANSI systems.

ANS Forth systems are regarded as "dinosaurs" by some people. They even think
that programmers educated in the ANSI standard but without previous Forth
knowledge tend to use Forth the same way as (for example) C, writing poor
code as they don't really understand the Forth concept. Maybe that's true,
but I am not an ANS Forth-minded person myself.
I started programming in Forth in 1984, left it in 1986, came back in 1989
(ANSI TC X3J14 was already working on ANS Forth), left again in 1991 and came
back again in 1999. I have always found Forth new and refreshing, and regret
not having used it more. It could have helped me with my life (really).

So, Forth means different things to different people.

1.5 - What is this PIC thing?
-----------------------------

PIC stands for Peripheral Interface Controller, and was the name for a
dedicated type of processor. Nowadays, Arizona Microchip Technologies, or
Microchip for short, manufactures PIC microcontrollers for many uses. They
are small computers, with a processor, ROM, RAM, some built-in peripherals
and I/O ports.

Microchip is currently the #2 microcontroller manufacturer in the world
(second to Motorola), and the range of applications for their PIC chips is
amazing. From remote controllers to microbots, from doorbells to serial
encoders.

PICs are normally programmed in assembly language or C. Now there is a free
(as in freedom) alternative: Forth.

1.6 - What is the current status of mary?
-----------------------------------------

Version 1.0a50 is an alpha version. It needs some testing before entering
beta stage.

A useful library of words and examples on how to manage peripherals is
currently lacking, but not difficult to implement.

The documentation is fairly complete for a start, but Forth words are listed
in definition order. More examples and a full tutorial are needed.

1.7 - Why use mary instead of assembler or C?
---------------------------------------------

Although assembler is the most powerful language, it tends to be cumbersome
and boring for large programs, not to mention error-prone. It shows its
excellence when used for short, carefully handcrafted routines. It's my
belief as a computer scientist that no programmer should be working without
at least a grasp of what assembly programming is.

C can be a great language.
I used to program a lot in it. But (like most other languages), it has a
syntax, which you must deal with in order to express your problem's
solution.

Forth, on the other hand, gives you much greater freedom for developing your
systems. It's perfect for embedded applications. You control the whole
environment and make your own operating system, totally tailored for your
application.

mary is a free optimizing native code Forth compiler whose aim is to
substitute assembly language programming while being fun and powerful.

1.8 - How good are the optimizations?
-------------------------------------

There is no code generator, so there is no optimizer at all. Every word
generates its own code and makes its own optimizations. There is little
context sharing, as the only thing a word can do is look at the previously
generated code and try to optimize it by merging, suppressing or modifying
it.

All the optimizations have been done thinking about what I would have
written if the sequence had originally been in assembly language. Several
tradeoffs had to be taken into account, resulting in the following kinds of
optimizations:

- constant optimization: when all the operands to a word are known at
  compile time, the code for the constant result is generated; in some cases
  there are important optimizations that can be done if only one of two
  operands is known. E.g.: "1 CONST, 2 CONST, +" is compiled as "3 CONST,".
  See below for an explanation of CONST,
- operation combinations: adding up constant optimization and operation
  combinations, we can get rid of some redundant operations, like 1+
- logical expression reduction: when a logical operator precedes an IF, they
  can be merged into more compact code
- dead code elimination: if a constant expression is used as discriminator
  in a conditional control flow word (like IF or UNTIL), a branch can never
  be reached, so that code is deleted.
  There is currently an exception to this, see the code for IF for
  explanations
- exit code compaction: when exiting a word prematurely, a jump is done
  instead of a call

1.9 - What are the system requirements?
---------------------------------------

Currently mary uses only standard ANS Forth words, but it doesn't provide a
Forth system of its own. So an ANS Forth system is required to use mary. As
of version 1.0a50, gforth 0.4.0 has been used for developing and testing.

I use gforth on a Linux/Intel PC, so reports on the use of different Forth
systems and architectures are welcome. I think that I have got rid of all
the octet order dependencies, but one never knows.

One important environment dependency is that mary must be run on a Forth
system whose cell width is at least 16 bits. This is so because mary uses
system cells to contain PIC instruction words. Mid-range PICs use 14 bits,
but high-range ones use 16 bits. A system with 14-bit words could be used
for mid-range.

mary generates an Intel HEX file that can be fed to the Microchip suite of
PIC programming utilities. Some non-commercial programming systems accept
HEX files as input also.

1.10 - Licensing issues
-----------------------

mary is an Open Source project. As such, I hold the Copyright, but give
explicit permission to copy and redistribute it as long as it is unchanged.
If you change something, then it can't be called mary and the resulting code
is not my responsibility. You must give the original sources and all the
changes. The preferred method, however, is to send me patches (with
'diff -uNr') or suggestions in order to coordinate all changes and preserve
the code quality.

Applications can contain code present in the source libraries of mary. I
haven't any intention to claim rights on this, so you don't have to give the
sources for your application to your customers.
However, if you have a nice app written in mary and you would like the world
to know about it, you may ask me to add a link on the mary project web page.
Similarly, if you include my page in your links, I would like to know it.

2 - Forth model
===============

2.1 - What kind of Forth is mary?
---------------------------------

An "underground" Forth, that is, a non-ANS Forth. It follows tradition where
possible, and incorporates some additions. It can't recompile itself because
mid-range PICs are not powerful enough to use their own model for
regenerating code.

There are some improvements in mary in order to substitute assembly
programming, such as : and ; being usable several times in a word.

As for word description, I use the common stack notation written as:

    ( n1 n2 ... nn -- n1' n2' ... nm' )

read as: "before word execution, data items n1, n2, ..., nn were on the
stack, nn being the top of stack (TOS); after word execution, items n1',
n2', ..., nm' are on the stack, nm' being the top of stack". Parentheses
indicate this is a comment.

2.2 - How is it implemented on the PIC?
---------------------------------------

The PIC architecture is a Harvard one, so data, code and even stack spaces
are different. There are different address and word widths in each of them,
making it difficult to mix code and data. Besides, the return stack is not
directly addressable. All of this adds up to make it very difficult to build
an efficient traditional indirect threading Forth system.

So, the way to go is a native code system. The words are compiled as
subroutines, and the hardware return stack is used as the return stack in
mary. There are no return stack manipulation words in mary.

The data stack is implemented using the indirect addressing register,
holding the TOS in the W register. The data stack depth is a
processor-specific feature: it can be up to a recommended maximum of 16
words (as in 16C63), or reduced to 8 words (as in 16C84).
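To make the "TOS in W" idea concrete, here is a minimal host-side sketch in Python (not mary source, and not PIC code). It only models the bookkeeping: one variable stands in for the W register, a small list stands in for the RAM cells reached through the indirect addressing register, and cells are 8 bits wide. The class name, the growth direction of the stack and the overflow checks are assumptions made for illustration; the code mary really generates may differ.

```python
class DataStackModel:
    """Toy model of a data stack whose top item (TOS) is cached in a register."""

    def __init__(self, depth=16):
        self.depth = depth      # e.g. 16 for the 16C63, 8 for the 16C84
        self.ram = [0] * depth  # memory cells behind the cached TOS
        self.sp = 0             # stands in for the indirect addressing register
        self.w = 0              # stands in for W, which holds the TOS
        self.count = 0          # number of items logically on the stack

    def push(self, value):
        if self.count == self.depth:
            raise OverflowError("data stack full")
        if self.count > 0:
            self.ram[self.sp] = self.w  # spill the old TOS to memory
            self.sp += 1
        self.w = value & 0xFF           # cells are 8 bits wide
        self.count += 1

    def pop(self):
        if self.count == 0:
            raise IndexError("data stack empty")
        value = self.w
        self.count -= 1
        if self.count > 0:
            self.sp -= 1
            self.w = self.ram[self.sp]  # refill the TOS from memory
        return value

    def nip(self):
        # ( n1 n2 -- n2 ): the TOS stays in W, only the pointer moves.
        if self.count < 2:
            raise IndexError("NIP needs two items")
        self.sp -= 1
        self.count -= 1
```

In this model a word that only touches the TOS never reaches memory, and NIP reduces to a single pointer adjustment, which fits the kind of saving a cached TOS is meant to give.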
The limit of 16 words is proposed for 2 reasons:

- Chuck Moore estimated a stack depth of 18 as effectively "infinite", so 16
  is not far from that
- there are PIC models in which the highest 16 data words in every bank are
  shared (or, said another way, the highest 16 data words in banks 1, 2 and
  3 are mapped to the real ones in bank 0)

There are 3 locations reserved for temporary data, named TMP1, TMP2 and
TMP3, intended to be used only in the compiler, not in applications.

mary currently addresses only 2 banks of RAM, instead of the possible 4 that
some PIC models have. The reason is that indirect addressing can't go
further, and the stack width (8 bits) doesn't allow it either. In the future
this situation may change, but for now think of using the other banks for
specific words that deal with them as buffers (e.g.: serial communication
drivers, D/A samplers, etc).

Code is stored in ROM, as are constant tables. Variables are stored in RAM,
as is the data stack.

Cell size is 8 bits, so the stack and the variable width are also 8 bits.
Currently, no double number operators are available.

2.3 - How is mary made?
-----------------------

First, a PIC assembler was made in Forth. It contains every mnemonic
supported by the Microchip assembler for mid-range PICs, with the reverse
"syntax" typical of Forth assemblers. The only changes are the following:

- TRIS instruction is _not_ supported, as it is marked as deprecated by
  Microchip
- OPTION instruction is _not_ supported for the same reason as TRIS
- default destination is W instead of F. Always use ,F or ,W to specify the
  destination. There was no easy solution for this problem, but I am open to
  suggestions

The assembler has some pseudodirectives in order to make it easy to use and
to support mary. Keep in mind that, while a backward reference can be used
from many points in the code, a forward reference can only be resolved once.
This is not such a great problem, as its purpose is only to be used inside
the PIC assembler.

2.4 - Is it possible to mix assembler code with Forth?
------------------------------------------------------

Yes, but you must follow the conventions in order not to interfere with
mary. Besides, you may need to use NOOPT to tell mary that optimization is
not possible.

When writing assembler code, it's not necessary to use CONST, for the
constants.

2.5 - What is this strange CONST, word?
---------------------------------------

I knew you would ask that. In order to compile a constant in the code,
CONST, must be used. Example: "2 3 +" in Forth becomes "2 CONST, 3 CONST, +"
in mary.

This is just the same as in standard Forth, the only difference being that
there you never see it. The reason is that the text interpreter must be
changed to do this automatically, and this is an alpha release. CONST, is
also used internally for compiling constants, so it will not go away in the
future.

When a _portable_ text interpreter is written, I will do my best to signal
errors like the use of words not in mary. Currently you must learn the
vocabularies and adhere to them.

Note that for defining a named constant, you would write:

    31 CONSTANT LIMIT

so no CONST, is to be used in this case.

2.6 - I can't find my favourite words...
----------------------------------------

Most of the Forth words not implemented fall in one of these categories:

- not applicable to PIC architecture
- not applicable to a target-only, non-recompiling, non-interactive Forth
- easily substitutable by reordering arguments
- optimization eliminates the need for them

The rest of the words not implemented fall in one of these categories:

- intended to be written by me in the future
- not intended to be written by me in the future

The words in this last category fall in one of these categories:

- intended to be written by anyone else in the future
- not intended to be written by anyone else in the future

That said, let's see some of the words that didn't get implemented and the
reasons:

- ( : mary currently uses ( from the Forth host system
- 1+ : "1 CONST, +" is a perfect optimized equivalent. The same goes for 1-
- 0< : "0 CONST, <" is a perfect optimized equivalent
- DO ... LOOP : too complex, use WHILE instead
- >R R> : there is no user manageable return stack in mary
- ROT : usually reordering or repeating the arguments does the trick
- PICK : this is Forth, you know. Factor, factor, factor
- strings: some basic support planned for the future
- floating point: fixed point is faster
- fixed point: */ planned for the future. Scaling is fast, too :)
- <# # #S #> : probably in the future
- EXIT : use ; instead
- ?DUP : normally used before IF, use AIF instead (see AIF)
- local variables: this is Forth, again, do you remember?
- CASE ... OF ... ENDOF ... ENDCASE : probably in the future.

If you are an ANS Forth-minded person, it will take you some more effort to
love mary. Sorry, one has to make decisions. Suggestions (and code) are
welcome.
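To see why words like 1+ can be left out without losing anything, here is a minimal sketch in Python (not mary source, and not Forth) of a word that looks back at the previously generated code before emitting its own. The ToyCompiler class and the LIT, FETCH, ADD, ADD_LIT and INC_TOS operation names are invented for illustration; they are neither PIC opcodes nor mary internals, and the real compiler may take different decisions.

```python
class ToyCompiler:
    """Toy model: each word emits code and may rewrite the tail of it."""

    def __init__(self):
        self.code = []  # list of (op, arg) tuples; hypothetical operations

    def const(self, n):
        # Models "n CONST,": compile a literal (8-bit cell).
        self.code.append(("LIT", n & 0xFF))

    def fetch(self, name):
        # Models "NAME @": the value is only known at run time.
        self.code.append(("FETCH", name))

    def plus(self):
        # Models "+": inspect what was generated just before.
        code = self.code
        if len(code) >= 2 and code[-1][0] == "LIT" and code[-2][0] == "LIT":
            b = code.pop()[1]
            a = code.pop()[1]
            code.append(("LIT", (a + b) & 0xFF))  # both operands known: fold
        elif code and code[-1][0] == "LIT":
            n = code.pop()[1]
            if n == 1:
                code.append(("INC_TOS", None))    # plays the role of 1+
            else:
                code.append(("ADD_LIT", n))       # one operand known
        else:
            code.append(("ADD", None))            # nothing known: plain add


folded = ToyCompiler()
folded.const(1)
folded.const(2)
folded.plus()           # models "1 CONST, 2 CONST, +"

incremented = ToyCompiler()
incremented.fetch("MYVAR")
incremented.const(1)
incremented.plus()      # models "MYVAR @ 1 CONST, +"
```

Here folded.code ends up holding the single literal 3, and incremented.code ends with the increment operation, so in this model a separate 1+ word would add nothing that "1 CONST, +" does not already produce.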
2.7 - Highlights of new/improved words
--------------------------------------

mary words fall in one of these categories (again :)

- assembly language instructions
- assembly directives
- Forth compiler support
- Forth words
- specific processor word versions
- specific processor constants

For more details, see the documentation for each of these.

Forth words that are improved include:

- : : can be used as traditionally or to make several entry points to a
  subroutine (word). Example:

      : ENTRY1 ... ( no ; here )
      : ENTRY2

- ; : can be used as traditionally or to make several exit points from a
  subroutine (word). Example:

      ... IF DO-IF-TRUE ; ( exit point 1 )
      ELSE DO-IF-FALSE ; ( exit point 2 )
      THEN ; ( traditional use )

- AIF : alternative/advanced IF. It doesn't drop its argument, generating
  much better code than a ?DUP (nonexistent in mary) followed by a normal
  IF. Don't forget to DROP the argument after ELSE, unless you want to chain
  several AIFs.

2.8 - Examples of code that take full advantage of optimizations
----------------------------------------------------------------

@ and ! are optimized for constants. They generate optimum code when
preceded by the variable:

    MYVAR1 @ MYVAR2 !

is efficient but

    MYVAR1 @ MYVAR2 ! MYVAR3 @ MYVAR4 !

is even more efficient, as the restacking between "MYVAR2 !" and "MYVAR3 @"
is suppressed. If you want to know, the first example generates 6
instructions while the second generates 8 instead of 12, as it eliminates 2
instructions for restoring TOS and avoids generating 2 more for saving TOS.

Similarly, sequences like

    1 CONST, MYVAR1 !

are fully optimized, as they use a special sequence to save TOS, generating
only 4 instructions. Again, things like

    1 CONST, MYVAR1 ! 2 CONST, MYVAR2 !

generate 6 instructions instead of 8, as the TOS resaving (restore/save) is
avoided.

mary can't optimize after THEN, so

    ... 0 < IF MYVAR1 @ ELSE MYVAR2 @ THEN ...

produces _much_ better code than

    ... 0 < IF MYVAR1 ELSE MYVAR2 THEN @ ...
not to mention that "0 <" is optimized also, not requiring a specific 0<
word. IF optimizes the < > and 0= operators and performs dead code
elimination.

Constants are optimized as much as possible, so

    1 CONST, +

generates one instruction incrementing TOS (register W), while

    2 CONST, +

generates one instruction adding 2 to the TOS.

+! is heavily optimized, so

    1 CONST, MYVAR +!

generates one instruction directly incrementing MYVAR,

    2 CONST, MYVAR +!

generates two instructions incrementing MYVAR, and

    37 CONST, MYVAR +!

generates 5 instructions saving TOS, adding 37 to MYVAR and restoring TOS.
Similarly,

    37 CONST, MYVAR1 +! 42 CONST, MYVAR2 +!

generates 8 instructions instead of 10, suppressing one restore/save
sequence.

3 - Files
=========

3.1 - mary
----------

This is the compiler caller. Customize here the Forth system (currently
gforth) and the prefix directory (currently /usr/local/lib/mary).
Optionally, you can select a different default processor (currently 16c63).

All the needed files are joined in the appropriate order. One of them, the
one with the processor-specific constants, is included twice in the list:
once for use with the assembler, once for use with Forth.

A typical invocation would be:

    mary -p 16c84 myfile.fs

which selects a processor different from the default one (16c84 in this
case) and tries to compile myfile.fs

3.2 - savestate.fs
------------------

A minimal file for saving the host Forth system status.

3.3 - restorestate.fs
---------------------

A minimal file for restoring the host Forth system status.

3.4 - picasm.fs
---------------

The mid-range PIC assembler. It exists to support the Forth compiler,
although it can also be used for writing low-level subroutines. However,
accessing peripherals doesn't need this, as they are mapped in memory, which
is very efficiently accessed in mary.

3.5 - picforthsupp.fs
---------------------

Some words needed for supporting the Forth compiler itself.
They deal mainly with identifying code for the corresponding optimizations.

3.6 - picforth.fs
-----------------

The Forth compiler, except for the memory access words and some other
processor-specific features. All the optimizations are done here.

3.7 - 16CXX-data.fs
-------------------

Variables and constants defining the specific processor, mainly memory
space.

3.8 - 16CXX-consts.fs
---------------------

Constants taken from Microchip documentation for the specific processor.
Included twice, once for the assembler, once for the compiler.

3.9 - 16CXX-code.fs
-------------------

Processor-specific code for dealing with memory access and some other
peculiarities.

4 - Vocabularies
================

4.1 - PICASM, the PIC assembler
-------------------------------

PICASM is a traditional Forth-style assembler. Its target is (currently) the
mid-range of Microchip PIC microcontrollers.

Thanks to the simple addressing modes of the PIC family, there is only one
word for every possible PIC instruction. There are additional words for
defining every type of instruction, but these are of use only in defining
the assembler.

Several tools and pseudodirectives are available, some of them for use in
the assembler, others for supporting the Forth compiler. Normally you only
worry about them when creating some kinds of words for the compiler, not for
normal programming.

4.1.1 - Assembler pseudodirectives
----------------------------------

These are the words that allow you to control the assembly.

$
    Variable containing the current assembly location pointer. Use it
    through the other manipulators.

ORG ( N -- )
    Sets the current assembly location pointer to PIC address N.
    Example: 200 ORG

LABEL ( -- ; -- ADDR )
    Creates a named label with the current value of $. When executing, it
    behaves as a constant that represents a PIC program memory address. As
    such, it can be back referenced from any place in the program.
    Example: LABEL RIGHTHERE ...
             RIGHTHERE GOTO

FLABEL ( -- ; -- )
    Prepares a _SINGLE_ forward label reference for use with GOTO and CALL.
    The reason it only supports one forward reference is that its only use
    is in the assembler for supporting Forth, and that's the only
    requirement that arose. Feel free to submit new code for dealing with
    more than one reference, if you will. It probably implies making a sort
    of stack or array with associative addressing.
    Example: FLABEL FARAWAY GOTO ... FARAWAY

HIGH ( N -- N )
    Leaves the high bits of a PIC address. Use in assembly programming when
    defining special memory access words (as TABLE).
    Example: 173 HIGH

LOW ( N -- N )
    Leaves the low bits of a PIC address. Use in assembly programming when
    defining special memory access words (as TABLE).
    Example: 173 LOW

4.1.2 - Support words
---------------------

These are the words that form the base for the assembler and support the
Forth compiler in some way.

$+ ( N -- $+N )
    Aid for forward references. Calculates the address that is at N words
    from $. Used for jumping.
    Example: 3 $+ GOTO ( 3 instructions to be skipped here )

>PICROM ( PICADDR -- REALADDR )
    Translates a PIC address to a real memory address. Use only in the
    assembler.
    Example: 100 >PICROM

$>PICROM ( -- ADDR )
    Translates $ to a real memory address. Use only in the assembler.
    Example: $>PICROM

PICROM@ ( ADDR -- OPCODE )
    Gets the content at the specified PIC ROM address. Use only in the Forth
    compiler.
    Example: 100 PICROM@

$@ ( -- OPCODE )
    Gets the content at the current compiling address. Use only for
    optimizations in the Forth compiler.
    Example: $@

$! ( OPCODE -- )
    Sets the content at the current compiling address. Use only in the
    assembler.
    Example: 0800 $!

$+1 ( -- )
    Advances the current compiling address pointer. Use only for
    optimizations in the Forth compiler.
    Example: $+1

$-1 ( -- )
    Backs up the current compiling address pointer. Use only for
    optimizations in the Forth compiler.
    Example: $-1

$, ( OPCODE -- )
    Compiles value at $ and advances 1 word. Use only in the assembler.
    Example: 0080 $,

PATCHJUMP ( ADDR -- )
    Patches a back reference of a forward jump.
    Example: (see the definition of FLABEL or the control flow words in
    PICFORTH for practical examples)

PHERE ( -- ADDR )
    Current compiling address; like HERE in the host Forth. Name changed for
    several reasons.
    Example: PHERE

NOOPT ( -- )
    Marks a spot as non-optimizable. Use it when you know the optimizer
    would want to alter code you prefer it not to. Freely usable.
    Example: : VALIDDEF 2* ;
             : NULLDEF NOOPT ; ( NULLDEF would be empty if not )

NODEADCODE ( -- )
    Marks a spot as cancelling dead code elimination.
    Example: NODEADCODE

4.1.3 - PIC assembly language support
-------------------------------------

These are the words that make PICASM possible. They deal with the various
instruction types in the PIC instruction set. Use them in the assembler
definition only.

HIGH5BITS ( N -- N )
    Gets the high 5 bits of an address. For HIGH.
    Example: 173 HIGH5BITS

LOW8BITS ( N -- N )
    For constants, and registers with ,F or ,W.
    Example: 173 LOW8BITS

LOW7BITS ( N -- N )
    For registers in bit operations.
    Example: 173 LOW7BITS

LOW11BITS ( N -- N )
    For addresses.
    Example: 34582 LOW11BITS

CONST ( OPCODE N -- OPCODE )
    Embeds an 8-bit constant in an opcode.
    Example: 3F00 17 CONST

,F ( N -- N )
    Selects the file register as destination. Use when assembling.
    Example: 37 ,F DECF

,W ( N -- N )
    Selects the W register as destination. Use when assembling. THIS IS THE
    DEFAULT IN PICASM, AS OPPOSED TO THE MICROCHIP ASSEMBLER.
    Example: 37 ,W DECF

REG ( OPCODE N -- OPCODE )
    Defaults the destination in an OPCODE to W. Use ,F or ,W for selecting
    the destination.
    Example: 0700 17 REG

REGBIT ( OPCODE N -- OPCODE )
    Selects the bit for bit operations.
    Example: (see definition of BIT-OPCODE)

BITSEL ( OPCODE BIT -- OPCODE )
    For bit selection.
    Example: 1800 3 BITSEL

ADDR ( OPCODE N -- OPCODE )
    Embeds an address in an opcode.
    Example: 2800 173 ADDR

IMMED-OPCODE ( N -- ; -- )
    Defines an assembler mnemonic with immediate addressing mode (no
    arguments).
    Example: 0100 IMMED-OPCODE CLRW

CONST-OPCODE ( N -- ; CONST -- )
    Defines an assembler mnemonic with an embedded constant.
    Example: 3800 CONST-OPCODE IORLW

REG-OPCODE ( N -- ; REG -- )
    Defines an assembler mnemonic with register addressing.
    Example: 0A00 REG-OPCODE INCF

BIT-OPCODE ( N -- ; REG BIT -- )
    Defines an assembler mnemonic with register bit addressing.
    Example: 1000 BIT-OPCODE BCF

ADDR-OPCODE ( N -- ; ADDR -- )
    Defines an assembler mnemonic with an embedded address.
    Example: 2000 ADDR-OPCODE CALL

CFGMASKER ( MASK -- ; -- )
    Creates a configuration bit masker. When used, the configuration word
    is affected, being ANDed with the MASK. After INIT-PICASM the
    configuration word is set to FFFF, so you can use the configuration bit
    maskers as usual.
    Example: 3FFB CFGMASKER _WDT_OFF ( DEFINING THE CONFIGURATION BIT MASKER )
             ...
             _WDT_OFF ( USING THE CONFIGURATION BIT MASKER )

4.1.4 - PIC assembler mnemonics
-------------------------------

They are all identical to the ones used by the Microchip assembler. Don't
forget to use the operands in reverse order:

    BTFSC STATUS,C    becomes    STATUS C BTFSC
    CALL SUB1         becomes    SUB1 CALL
    DECF COUNT,F      becomes    COUNT ,F DECF

As you can see, the STATUS register bits have names equal to the Microchip
assembler definitions, but they don't have a comma prepended. That's because
they are named constants.

The complete mnemonic list:

    ADDLW ADDWF ANDLW ANDWF BCF BSF BTFSC BTFSS CALL CLRF CLRW CLRWDT
    COMF DECF DECFSZ GOTO INCF INCFSZ IORLW IORWF MOVLW MOVF MOVWF NOP
    RETFIE RETLW RETURN RLF RRF SLEEP SUBLW SUBWF SWAPF XORLW XORWF

4.1.5 - Utilities
-----------------

These are the words that can be used externally, although most of them are
used inside the assembler. Documenting them is mandatory, you know...

INIT-PICASM ( -- )
    Initializes the PIC assembler, namely clears the ROM space, etc.
    Example: INIT-PICASM

PICPROGSIZE ( -- N )
    Leaves on the stack the current program size.
    Example: PICPROGSIZE

.PICPROGSIZE
    Shows a self-explanatory text about the current program size.
    Example: .PICPROGSIZE

HEX8. ( N -- )
    Shows an 8-bit hex number.
    Example: 37 HEX8.

HEX16. ( N -- )
    Shows a 16-bit hex number.
    Example: 34582 HEX16.

MEMDUMPLINE ( ADDR -- )
    Shows 8 PIC program memory words starting at ADDR.
    Example: 100 MEMDUMPLINE

MEMDUMP ( ADDR COUNT -- )
    Shows COUNT MEMDUMPLINEs starting at ADDR. Useful for debugging the
    compiler.
    Example: 100 4 MEMDUMP

HEXDUMPLINE ( ADDR -- )
    Shows 8 PIC program memory words in Intel HEX format.
    Example: 100 HEXDUMPLINE

HEXDUMP ( ADDR COUNT -- )
    Shows COUNT HEXDUMPLINEs starting at ADDR.
    Example: 100 4 HEXDUMP

HEXDUMPCFG ( ADDR -- )
    Shows the configuration word contents in HEX format. Used by PROGDUMP.
    Example: CONFIG HEXDUMPCFG

PROGDUMP ( -- )
    Dumps the current program in Intel HEX format from 0 to $. Used for
    generating a text describing your program so that it can be burned into
    a PIC. The mary driver calls it accordingly, so you don't need to care
    about this.
    Example: PROGDUMP

4.2 - PICFORTH, the free mary PIC Forth optimizing native code compiler
-----------------------------------------------------------------------

PICFORTH (mary) is an "underground" (in Chuck Moore's sense) Forth. As such,
it tries to be useful by containing a subset of Forth well suited to PIC
programming. Some words have changed their usual semantics, and several
others have been added for unique PIC functions.

4.2.1 - Forth words
-------------------

These are the words making up the Forth vocabulary in which you will write
your programs most of the time. Most of them are semantically equivalent to
their homonyms in any normal Forth system. There are some variations and
additions.

All arithmetic words are optimized for constants. Additionally, several
logical operators (<, > and 0=) are optimized in IF. Other ones (= and <>)
are not, because they are not further optimizable.

All words whose name is (WORD) are assembler subroutines intended for use by
the compiler.
Don't use them directly, as they can't optimize anything and are always
compiled as a subroutine call, even when a faster solution may exist.

CONSTANT ( N -- )
    Creates a named constant.
    Example: 173 CONSTANT WHATEVER

VARIABLE ( -- )
    Creates a named global variable.
    Example: VARIABLE COUNT

TABLE ( -- ; N1 -- N2 )
    Creates a header for indexed table access. Unique to PIC programming.
    At compilation time, no parameters are needed; just add elements
    with ",". At run time, it takes the TOS as an index into the table
    and leaves the value of that element. Optimized for constants.
    Example: TABLE CONVERSION
             ...
             4 CONVERSION

, ( N -- )
    Compiles a value at the current compiling address. Use it only for
    adding elements to a TABLE.
    Example: TABLE CONVERSION 4 , 3 , 2 , 1 , 0 ,

DROP ( N -- )
    Drops the top of stack (TOS). Discards the TOS by making the second
    on stack the new TOS.
    Example: DROP

DUP ( N -- N N )
    Copies the TOS to a new TOS. The old TOS becomes second on stack.
    Example: DUP

NIP ( N1 N2 -- N2 )
    Equivalent to "SWAP DROP". A handy word, and it compiles to only
    1 PIC instruction.
    Example: NIP

OVER ( N1 N2 -- N1 N2 N1 )
    Copies the second on stack to a new TOS.
    Example: OVER

SWAP ( N1 N2 -- N2 N1 )
    Exchanges the 2 topmost elements on the stack.
    Example: SWAP

S>D ( U -- UD )
    Converts to double precision. In fact, it pushes a 0 onto the stack
    and does no sign extension.
    Example: S>D

D>S ( UD -- U )
    Converts to single precision. Drops the TOS.
    Example: D>S

- ( N1 N2 -- N1-N2 )
    Subtracts N2 from N1.
    Example: -
             1 - ( optimization gets rid of 1- )

+ ( N1 N2 -- N1+N2 )
    Adds N1 to N2.
    Example: +
             1 + ( optimization gets rid of 1+ )

2* ( U -- U*2 )
    Multiplies TOS by 2.
    Example: 2*

2/ ( U -- U/2 )
    Divides TOS by 2.
    Example: 2/

ABS ( N -- U )
    Calculates the absolute value of TOS.
    Example: ABS

NEGATE ( N -- -N )
    Calculates the negative of TOS.
    Example: NEGATE

AND ( N1 N2 -- N3 )
    Bitwise and.
    Example: AND

INVERT ( N1 -- !N1 )
    Bitwise not.
    Example: INVERT

OR ( N1 N2 -- N3 )
    Bitwise or.
    Example: OR

XOR ( N1 N2 -- N3 )
    Bitwise XOR (exclusive-or).
    Example: XOR

< ( U1 U2 -- F )
    Leaves a TRUE flag if U1 < U2.
    Example: <
             0 < ( optimization gets rid of 0< )
             < IF ( optimization possible )

<> ( N1 N2 -- F )
    Leaves a TRUE flag if N1 is not equal to N2.
    Example: <>

= ( N1 N2 -- F )
    Leaves a TRUE flag if N1 = N2.
    Example: =
             = IF

> ( U1 U2 -- F )
    Leaves a TRUE flag if U1 > U2.
    Example: >

0= ( N -- F )
    Logical invert.
    Example: 0= IF ( optimization possible )

IF ( -- ADDR CF ; F -- )
    Condition for IF ... ELSE ... THEN. At run time, it checks and
    discards the TOS: if 0, it skips the code up to the next ELSE or
    THEN; if not 0, it executes the following code. It optimizes <, >
    and 0=, and deals with dead code suppression.
    Example: MYVAR @ IF SOMETHING ELSE OTHER THEN
             ( executes SOMETHING if MYVAR is not 0, else executes OTHER )

AIF ( -- ADDR CF ; F -- F )
    Alternative-IF for AIF ... ELSE ... THEN. Similar to IF, but it
    doesn't discard the TOS. The construction "?DUP IF ... THEN" becomes
    "AIF ... ELSE DROP THEN" in mary. It optimizes <, > and 0=, and
    deals with dead code suppression.
    Example: MYVAR @ AIF ... ELSE DROP THEN ( optimized for TRUE branch )

ELSE ( ADDR1 CF -- ADDR2 CF ; -- )
    FALSE condition branch for IF/AIF ... ELSE ... THEN. Deals with dead
    code suppression.
    Example: MYVAR @ IF SOMETHING ELSE OTHER THEN
             ( executes SOMETHING if MYVAR is not 0, else executes OTHER )

THEN ( ADDR CF -- ; -- )
    Ends IF/AIF ... ELSE ... THEN. Deals with dead code suppression.
    Example: MYVAR @ IF SOMETHING ELSE OTHER THEN
             ( executes SOMETHING if MYVAR is not 0, else executes OTHER )

BEGIN ( -- ADDR CF ; -- )
    Starts a BEGIN loop. Use it for BEGIN ... AGAIN, BEGIN ... UNTIL and
    BEGIN ... WHILE ... REPEAT loops.
    Example: BEGIN BORING AGAIN ( a sure way to get bored )

AGAIN ( ADDR CF -- ; -- )
    Ends a BEGIN ... AGAIN loop. This is an indefinite loop: you must
    exit from inside it, or else it will loop forever (if the computer
    is eternal, at least).
    Example: BEGIN BORING AGAIN ( a sure way to get bored )

UNTIL ( ADDR CF -- ; F -- )
    Ends a BEGIN ... UNTIL loop. When execution reaches UNTIL, it loops
    back to BEGIN if the value at TOS is not FALSE. TOS is discarded.
    Example: 5 CONST, BEGIN OTHER 1 - DUP UNTIL ( does OTHER 5 times )

WHILE ( ADDR1 CF -- ADDR1 ADDR2 CF ; F -- )
    Condition for a BEGIN ... WHILE ... REPEAT loop. When execution
    reaches WHILE, it exits the loop if the value at TOS is FALSE. TOS
    is discarded. Execution loops back from REPEAT to BEGIN.
    Example: 5 CONST, BEGIN 1 - DUP WHILE OTHER REPEAT
             ( does OTHER 4 times )

REPEAT ( ADDR CF -- ; -- )
    Ends a BEGIN ... WHILE ... REPEAT loop. When execution reaches
    REPEAT, it loops back to BEGIN.
    Example: 5 CONST, BEGIN 1 - DUP WHILE OTHER REPEAT
             ( does OTHER 4 times )

: ( -- )
    Creates a named subroutine entry point. It can be used for creating
    a word in the traditional way or for creating several entry points
    (useful when replacing assembler code, where this is a well-known
    technique).
    Example: : ENTRY1 ... : ENTRY2 ... ;

; ( -- )
    Creates a return point. It makes the word/routine return immediately
    to the caller. It optimizes return stack use, so that a word call
    just before ; translates to a jump when possible, saving one stack
    level.
    Example: : MYWORD IF OTHER ; ELSE ANOTHER ; THEN ; ( 3 exit points )

(!) ( N ADDR -- )
    Stores N in ADDR. _DON'T USE THIS WORD DIRECTLY_, use ! instead.
    Example: 5 CONST, MYVAR (!)

(+!) ( N ADDR -- )
    Adds N to the contents of ADDR. _DON'T USE THIS WORD DIRECTLY_, use
    +! instead.
    Example: 5 CONST, MYVAR (+!)

(@) ( ADDR -- N )
    Fetches the contents of ADDR. _DON'T USE THIS WORD DIRECTLY_, use @
    instead.
    Example: MYVAR (@)

(*) ( U1 U2 -- U1*U2 )
    Multiplies U1*U2, giving an 8-bit result. _DON'T USE THIS WORD
    DIRECTLY_, use * instead.
    Example: 5 CONST, 6 CONST, (*)

(/MOD) ( U1 U2 -- U1\U2 U1/U2 )
    Divides U1/U2; U1\U2 is the remainder. _DON'T USE THIS WORD
    DIRECTLY_, use /MOD instead.
    Example: 5 CONST, 2 CONST, (/MOD)

(LSHIFT) ( U1 U2 -- U3 )
    Left shifts U1 by U2 bits, giving U3. _DON'T USE THIS WORD
    DIRECTLY_, use LSHIFT instead.
    Example: 5 CONST, 3 CONST, (LSHIFT)

(RSHIFT) ( U1 U2 -- U3 )
    Right shifts U1 by U2 bits, giving U3. _DON'T USE THIS WORD
    DIRECTLY_, use RSHIFT instead.
    Example: 35 CONST, 3 CONST, (RSHIFT)

/ ( U1 U2 -- U1/U2 )
    Divides U1/U2, giving an 8-bit result.
    Example: 5 CONST, 2 CONST, /

/MOD ( U1 U2 -- U1\U2 U1/U2 )
    Divides U1/U2; U1\U2 is the remainder. All are 8-bit entities.
    Example: 5 CONST, 2 CONST, /MOD

* ( N1 N2 -- N1*N2 )
    Multiplies N1*N2, giving an 8-bit result.
    Example: 5 CONST, 6 CONST, *

MOD ( U1 U2 -- U1\U2 )
    Calculates the remainder of U1/U2.
    Example: 5 CONST, 2 CONST, MOD

@ ( ADDR -- N )
    Fetches the contents at ADDR. Optimum code is generated when it is
    preceded by a variable.
    Example: MYVAR @

INCDECLOOP ( ADDR N -- )
    Generates code for +! INCF/DECF loops. Internal use only.
    Example: INCDECLOOP

+! ( N ADDR -- )
    Adds N to the contents of ADDR.
    Example: 5 CONST, MYVAR +!

! ( N ADDR -- )
    Stores N in ADDR. Optimum code is generated when it is preceded by
    its arguments.
    Example: 5 CONST, MYVAR !

LSHIFT ( U1 U2 -- U3 )
    Left shifts U1 by U2 bits, giving U3.
    Example: 35 CONST, 3 CONST, LSHIFT

RSHIFT ( U1 U2 -- U3 )
    Right shifts U1 by U2 bits, giving U3.
    Example: 35 CONST, 3 CONST, RSHIFT

4.2.2 - Forth support words
---------------------------

These are the words needed to support the Forth compiler. A few of them are
meant to be used when programming, but they live here instead of in the
Forth compiler because some of the words in this group need them.

INIT-PIC ( -- )
    Minimal startup code for PIC Forth. Currently, it gives an initial
    value to the indirect addressing register, which is used as the
    data stack pointer.
    Example: INIT-PIC

INIT-PICFORTH ( -- )
    Initializes the PIC Forth system. The mary driver uses it for
    starting the system.
    Example: INIT-PICFORTH

TOPATCH ( -- ADDR 0 )
    An address to be patched later by PATCHJUMP. Uses the current value
    of $.
    Example: TOPATCH

DONTPATCH ( -- ADDR )
    Marks a spot so that it will not be patched. Used in dead code
    suppression.
    Example: DONTPATCH

DEADCODE ( -- )
    Marks a spot as the beginning of dead code, to be suppressed later
    by KILLDEADCODE.
    Example: DEADCODE

DEADCODE? ( ADDR -- ADDR F )
    Leaves a TRUE flag if we are in dead code.
    Example: PHERE DEADCODE?

KILLDEADCODE ( ADDR -- )
    Suppresses dead code by skipping back over it.
    Example: PHERE ... KILLDEADCODE ( would suppress back to PHERE )

OPT ( -- )
    Marks a spot as optimizable. Use it when you know an optimization is
    possible but the shape of the code doesn't permit it. Freely usable.
    Example: : ENTRY1 ... : ENTRY2 OPT ( normally can't optimize here )

OPT? ( -- F )
    Leaves a TRUE flag if optimization is possible.
    Example: OPT?

RETURN? ( -- F )
    Leaves a TRUE flag if the last assembled opcode is a return.
    Example: RETURN?

CALL? ( -- OPCODE F )
    Leaves OPCODE and a TRUE flag if call optimization is possible.
    Example: CALL?

CALL>JUMP ( OPCODE -- )
    Converts a CALL into a jump (GOTO in PIC parlance). A kind of
    optimization. Use _ONLY_ in conjunction with CALL?
    Example: CALL? IF CALL>JUMP

POP ( -- )
    Generates code to pop W from the stack. Used by DROP, etc. Use it in
    the compiler.
    Example: POP

PUSH ( -- )
    Generates code to push W onto the stack. Used by CONST, and so on.
    Use it in the compiler.
    Example: PUSH

RESTACK? ( -- F )
    Leaves a TRUE flag if the restacking POP/PUSH optimization is
    possible.
    Example: RESTACK?

CONST, ( N -- )
    Embeds a constant in the code. Use it manually for this task; the
    compiler uses it internally to embed any constant.
    Example: 37 CONST, MYVAR ! ( store 37 in MYVAR )

CONST? ( -- N F )
    Leaves the constant and a TRUE flag if constant optimization is
    possible.
    Example: CONST? IF KILLCONSTANT DROP ( excerpt from DROP )

KILLCONST ( -- )
    Backs up $ in order to suppress a constant.
    Example: CONST? IF KILLCONSTANT DROP ( excerpt from DROP )

SAVETOS ( -- )
    Generates code to save TOS.
    Used by the compiler when optimizing 1 level of operands.
    Example: SAVETOS

RESTORETOS ( -- )
    Generates code to restore TOS. Used by the compiler when optimizing
    1 level of operands.
    Example: RESTORETOS

ABS8 ( N -- |N| )
    Calculates the absolute value of N for 8-bit data values. Used
    internally.
    Example: -3 ABS8
             42 ABS8

?PAIRS ( F1 F2 -- )
    Checks the parts of a flow control structure. Aborts with an error
    message if flags F1 and F2 are different.
    Example: (see definition of ELSE for practical examples)

0=? ( -- F )
    TRUE flag if 0= optimization is possible. _USE ONLY IN IF_
    Example: (see definition of IF for practical examples)

>? ( -- F )
    TRUE flag if > optimization is possible. _USE ONLY IN IF_
    Example: (see definition of IF for practical examples)

4.2.3 - PIC specific support words
----------------------------------

These are words for dealing with the rather strange PIC data space. They
must be redefined for every PIC model, although some models share the same
memory layout, so the code can be copied. PIC data space is laid out in
banks, possibly with not all RAM locations implemented and with differing
amounts of free RAM (peripherals are mapped into this space too). There can
be up to 4 different banks in mid-range PICs. Don't use these words
directly; they are meant for the compiler only.

ALLOC ( N -- ADDR )
    Allocates N contiguous cells and leaves the starting ADDR. Don't use
    it for dynamic allocation, as there is no provision for freeing
    memory later.
    Example: 15 ALLOC

ALLOT ( N -- )
    Reserves N cells in variable space. _USE ALLOC INSTEAD_
    Example: 15 ALLOT

BANK0 ( ADDR -- )
    Generates code for switching to RAM bank 0 if needed. mary assumes
    we are always in bank 0.
    Example: MYVAR BANK1 FETCH BANK0

BANK0FREE ( -- N )
    Gives the free space in bank 0 for variable allocation. ALLOC uses
    it.
    Example: BANK0FREE

BANK1 ( ADDR -- ADDR )
    Generates code for switching to RAM bank 1 if needed. mary assumes
    we are always in bank 0. ADDR remains on the stack for easy
    chaining.
    Example: MYVAR BANK1 FETCH BANK0

BANK1FREE ( -- N )
    Gives the free space in bank 1 for variable allocation. ALLOC uses
    it.
    Example: BANK1FREE

BVARIABLE ( N -- )
    Creates a named global "big" variable, N cells in size. It uses
    ALLOC, so it allocates in the proper bank. If your program has
    memory problems, remember these easy rules:
    - allocate variables in descending size order (biggest first)
    - you can define different constants to index a shared memory area
    - think first of the real memory requirements and switch to a
      better processor if needed (most of the time the previous rules
      do the trick)
    Example: 5 BVARIABLE MYVAR

FETCH ( ADDR -- )
    Generates code for fetching from a constant address. Takes bank
    changes into account. Used by @
    Example: MYVAR FETCH

INIT-PICMEM ( -- )
    Initializes the PIC memory system. Called from INIT-PICFORTH.
    Example: INIT-PICMEM

STORE ( ADDR -- )
    Generates code for storing to a constant address. Takes bank changes
    into account. Used by !
    Example: MYVAR STORE

5 - Programming with mary
=========================

5.1 - A little example
----------------------

This is a very little example. More to come when the docs leave alpha
stage.

Let's say our file is example.fs (fs meaning "Forth stream", as opposed to
a bunch of Forth screens (1k-octet blocks), the classical way). Our project
is a silly one: connect 4 switches to RA0 ... RA3 in the PIC, and 4 LEDs
with series resistors to RB0 ... RB3. Don't forget to provide an oscillator
and a power supply. The program couldn't be simpler:

-----8<-----
PICFORTH
INIT-PICFORTH ( CLEARS MEMORY )
INIT-PIC
: SILLYTEST
  BEGIN
    PORTA @ PORTB !
  AGAIN ;
-----8<-----

Compile it with:

    mary -p 16c84 example.fs > example.hex

and burn it into the C84 with your favorite programmer. And that's all. The
program doesn't use any external resources, so the code is very small. In
the general case, we don't use INIT-PICFORTH, as there are some subroutines
needed for words like @.

See the examples directory for more code.
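
If you are curious about what ends up in example.hex: it is a plain text
file in Intel HEX format. The Python sketch below is not part of mary and
does not reproduce its real output. It only illustrates how one data record
for 14-bit PIC program words is commonly laid out, assuming the usual
convention of storing each word low byte first at byte address 2 * word
address. The name hex_record and the three hand-assembled opcodes (MOVF
PORTA,W / MOVWF PORTB / GOTO 0) are illustrative assumptions.

```python
def hex_record(word_addr, words):
    """Build one Intel HEX data record (type 00) for 14-bit PIC words.

    Assumption: each word is stored low byte first and the record address
    is the byte address, i.e. twice the word address.
    """
    data = bytearray()
    for w in words:
        data.append(w & 0xFF)         # low byte first
        data.append((w >> 8) & 0x3F)  # upper 6 bits of the 14-bit word
    byte_addr = word_addr * 2
    body = bytes([len(data),
                  (byte_addr >> 8) & 0xFF,
                  byte_addr & 0xFF,
                  0x00]) + bytes(data)
    # The checksum is the two's complement of the sum of all record bytes.
    checksum = (-sum(body)) & 0xFF
    return ":" + (body + bytes([checksum])).hex().upper()


# Standard end-of-file record that closes every Intel HEX file.
EOF_RECORD = ":00000001FF"

if __name__ == "__main__":
    # Hand-assembled stand-in for the switch-to-LED loop, not mary output.
    program = [0x0805, 0x0086, 0x2800]
    print(hex_record(0, program))  # prints :060000000508860000283F
    print(EOF_RECORD)
```

A quick sanity check for any record: add up every byte after the colon,
checksum included, and the low 8 bits of the sum must be zero.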
A - A Taoist-like story about Forth
-----------------------------------

I have said that Forth could have helped me with my life. I have even said
that Forth can help you to understand the Universe. This appendix doesn't
try to teach you Forth, Taoism, the Universe, life or anything. It simply
is here, and you are there. Or maybe "here" and "there" mean the same, or
even nothing.

For learning Forth, there are many texts around; I will do my best to
recommend one if I can find an adequate one. About Taoism, the definitive
text is the "Tao Te King", translated into too many languages to list here.
It's interesting and appropriate for this appendix, which implies nothing
about religion or about following a given philosophy. As for your life, the
one who knows it best and can help you is yourself. A mirror can help.
Forth can help. A friend can help. But the only one able to change what is
to be changed (in case that is the case) is you.

That said, let's begin.

In Forth, the variables are constants. The constants are executed. There is
no compiler, no code generator, no optimizer, no syntax nor syntax checker,
no lexical analyzer. There are no programs. But everything works.

Frequently, Forth novices, seeing too many differences from what they
learned before, give up in despair. Paradoxically, the simplicity of Forth
seems to them a great complexity when compared to other languages. Time to
unlearn and learn again. This is an important lesson for life.

The computer memory, when clear, is like the Void: it has the potential to
contain anything. Only by deciding to change its content to any value we
like do we give it a sensible meaning. The meaning is not in the value, but
in us. The memory remains memory, and any value change is not permanent
(obviously true for RAMs; but ROMs do break, you know). The memory is
permanent, and its content can be data or code.
Variables in Forth are really constants, representing the address where we
can change or read the values. But constants are Forth words, and as such,
executable code. So, the variables are executable, too. If data represent
inanimate objects (matter, if you prefer) in our everyday lives, by
realizing that variables are code, we now see that matter is a process. And
everything in Forth is executable, so in our analogy, everything is a
process, matter and energy. In the computer (in a Von Neumann architecture,
at least), code and data seem all alike. Only the fact that the processor
reads one rather than the other makes a difference. A difference in the
observer's mind, be it a machine or a human being.

There is no such thing as a Forth compiler (not even mary). Seen in
perspective from a certain distance, it's possible to say that there is a
compiler, but we must keep in mind that the program being compiled is
actively calling every word needed to make the compilation possible. So,
there is no compiler, only compiling words that generate their own code and
make their own optimizations. They don't exist outside a Forth system, nor
outside the memory when running. There are no big programs that do
everything, but everything is done.

Forth applications are not programs, either. They are words defined to do a
job when used as a whole. Individually, though, they may seem to do nothing
useful. They are little parts of a whole, each one taking care of an aspect
of the problem, until there is no problem at all. The "Not Doing" as a
practical philosophy. Once you have defined words suited to your problem,
you have the right tool, and the "program" to solve it is done by itself
(really).