Reasons For Inventing The Wheel
The Cybiko computer is built around the Hitachi H8S processor which, from the
programmer's point of view, has many nice features, most notably excellent
instruction timing, 32-bit architecture (including a "flat" address
space), and plenty of registers. When you need to squeeze the last cycle out
of some time-critical piece of code, control hardware, or manage power
savings, the H8S is great. But when it is necessary to come up with short
programs, the H8S fails. There are several reasons for this:
1. The
H8S is a successor to a 16-bit processor, designed to maintain maximum
compatibility with its predecessor. These Hitachi processors might appear
to be like the Intel i80x86 processors at first glance. But while (in
32-bit "flat" mode) Intel "sacrificed" 16-bit
instructions (whenever you employ an instruction that operates upon 16-bit
operand[s], it gets prefixed with an extra byte), Hitachi
"sacrificed" some 32-bit instructions. Most notably, 16-bit push/pop take 2 bytes
each, while their 32-bit counterparts take 4 (!) bytes each. And remember,
these are the instructions that the GNU C compiler uses for passing all
function arguments to vararg functions and all arguments (except the
first 3) to other functions.
2. The
H8S has an extensive set of instructions for operating upon bits, bit
masks, etc., and these instructions have rather short opcodes. It is
amazing what one can do with a carry bit. This is great for operating
system (especially device driver) writers, and for die-hard assembler
addicts. But from the compiler's perspective, they are almost useless, and
simply eat up opcode space, making "useful" (from the compiler's
point of view) opcodes much longer.
3. The
H8S has a RISC-like architecture in many respects. First and foremost, the
lengths of all its opcodes are multiples of 2 (the length of a machine
word). From the perspective of optimizing an opcode set in terms of average
code length (by Hitachi engineers), this was a bad thing. RISC architecture
also implies that, in any opcode, all involved registers must be denoted
explicitly — all you can do with the memory operand is load or store it.
4. It seems that in the famous "time vs. space" tradeoff, Hitachi engineers have always favoured time. The most notable example is the execution flow control instruction subset. There are two "flavours" of most such opcodes: "branches" (opcode names are derivations of 'branch'; the opcodes are 2 bytes long and employ 8-bit relative displacements) and "jumps" (derivations of 'jump'; 4 bytes long with 24-bit displacements). The funny thing is that, while instructions must be aligned on 2-byte (word) boundaries, "branch" opcodes address bytes, not words. That is, a displacement stored within an instruction never has its lowest bit set. Even if you (via some cheating) store 1 there, it will be ignored by the internal processor logic. So why not store disp/2 and "expand" it (i.e. left-shift it) just before use, enlarging the range reachable with a short branch (and thus reducing the need for the dword-sized jumps) by a factor of two? It seems that this would require one extra processor cycle.
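The arithmetic behind the last point can be sketched in C. This is only an illustration: the function names are made up for this example, the displacement is assumed to be relative to the address of the next instruction, and the "scaled" encoding is the hypothetical alternative discussed above, not something the H8S implements.

```c
#include <stdint.h>

/* Short branch as described in the text: the 8-bit field is a signed BYTE
   displacement. Instructions are word-aligned, so the lowest bit is ignored. */
static int32_t branch_target_bytes(int32_t next_pc, int8_t disp8)
{
    return next_pc + (disp8 & ~1);
}

/* Hypothetical alternative: the field holds disp/2 and is "expanded"
   (multiplied by 2, i.e. left-shifted) just before use. */
static int32_t branch_target_scaled(int32_t next_pc, int8_t disp8)
{
    return next_pc + (int32_t)disp8 * 2;
}
```

With the same 8 bits, the scaled form reaches from -256 to +254 bytes instead of -128 to +126.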
The features mentioned above are great if you're dealing with a controller, firmware for the "frozen" built-in system, or just something time-critical. But Cybiko features an extensive object-oriented OS (with pre-emptive multitasking), GUI, an extensive and ever-growing set of applications, and much more. And these features, and much more, must fit in only 250k RAM. So, we started looking for something less expensive in terms of consumed memory, and came up with our bytecode interpreter (which, BTW, currently uses only 950 bytes by itself!).
The above list could be expanded, of course, but there is also another big reason for implementing a bytecode interpreter, which can be explained in even greater detail. Fortunately, Sun Microsystems did the boring work for us by advertising their Java virtual machine with a bytecode interpreter of its own. So to learn more about the huge and undisputed advantages :-) of interpreted vs. native code, you're probably better off with Sun's The Java Language Environment: A White Paper (by James Gosling and Henry McGilton).
A good interpreter has to be small and fast. These days, "small" is no longer considered a mandatory property, but things look very different when all you have is 250k of RAM shared with other (potentially even more demanding) applications running concurrently. So we have taken some steps to pursue both goals; specifically:
1. Our
interpreter, as a virtual machine, works with registers, not with a
[virtual] stack. There are two virtual registers, R0 and R1 (which
correspond to real registers er0 and er1 of the H8S processor) which get
used as operands. This does not lead to opcode bloat, however: each
register has a hard-coded role — in other words, no single opcode contains
a "register bit". For example, push
implies register R0, pop implies register R1, and
so on. This may sound weird, at first, but there are very clever ways to
make use of such opcodes, and our existing C compiler proves that. As for
opcode implementations, please consider this one:
add:
    add.l er1, er0 ; 2 bytes, 1 cycle
    bra scheduler  ; 2 bytes, 2 cycles
It is that simple.
2. There
is no translation layer between CyOS and the hardware on one side, and the
bytecode interpreter on the other. For example, the push, pop, calln,
and retn opcodes make use of the regular
stack and not an emulated one. Upon calls to CyOS and to so-called extension functions (see below), which expect the
first 3 parameters to be in registers, those parameters' values get placed
into the proper registers as a result of standard expression evaluation
sequences (e.g. extension functions, written
in "regular" C, expect their third argument to be 'this'
and expect it to be passed in register er2, and the bytecode interpreter
always keeps 'this' in er2). That means that the
interpreter does not have to move registers' values around before each
call. Thus the bytecode interpreter and the rest of the system are pretty
well integrated.
3. The
bytecode interpreter has neither 'stack frame pointer' nor 'code buffer
pointer'. For those of you familiar with the i80x86 processors: 'stack frame
pointer' used to be BP in 16-bit programs, and EBP in 32-bit. Compiler
writers managed to always compute local variables' addresses relative to
ESP, and thus freed EBP for use as a general register. In GNU C/C++, this
is known as 'omit frame pointer' optimization. Our C compiler always
does this optimization, so we dropped the very notion of 'stack frame
pointer' out of our virtual machine specification.
Almost the same applies to 'code buffer pointer' - there is no such thing.
Even such an instruction as leag.u contains
displacement relative to the very next instruction, not any
"absolute" offset. Other instructions, such as calln.s and jump.c
are similar in this respect. In other words, the code is essentially
position-independent (a.k.a. PIC).
4. The
maximum size of a bytecode module is 64k; therefore, "local"
address space is essentially 16-bit and, consequently, "static"
offsets (including those used in calln.s
opcode) are 2-byte. Furthermore, stack frames are addressed with unsigned
1-byte displacements (thus, a function cannot have more
than ((255-4-1)&~3)==248 bytes of arguments and 'auto' variables in
total); objects are also addressed with unsigned 1-byte displacements
(therefore, no more than 256 bytes of an object are addressable via 'this',
although objects
themselves could be of any size up to 64k). But as soon as any
address gets loaded into a register (say, as a result of leal.b bytecode execution, which effectively sums
1-byte offset it contains and current value of stack
pointer and places result in R0),
it becomes a valid 32-bit address, fully compatible with those used by the
rest of the system.
5. There are bytecodes for "object" commands, that is, opcodes that operate upon data addressed relative to the special 'this' pointer. In other words, there are provisions for object-oriented languages, such as C++. The current implementation of C has some OO extensions implemented via the use of these opcodes (which save considerable space). See leat.b for more information.
6. Not
all arithmetic opcodes are 32-bit. Multiplication and division are
essentially 16-bit in that if their operands do not fit within the
-32768..32767 range, results are unpredictable (this does not apply to the
dividend). This is a design decision.
7. Unsigned data types are not supported, just like in Java. The only supported data types are signed char, short (synonym for int), and long. Unlike Java, however, the Cybiko bytecode interpreter supports neither an unsigned shift right nor a specialized Unicode character type. Note that the H8S has a 24-bit address space and no virtual memory, so any address is guaranteed to have its 8 highest bits clear; in other words, addresses can safely be treated as signed entities and still compare correctly. We used this fact for a major optimization: we excluded all opcodes for unsigned comparisons.
8. Opcodes
are bytes, so there may be up to 256 of them. The
thing is that we take the famous "profile anything, assume
nothing" rule quite seriously. From this perspective, we want to
build as perfect an instruction set as possible while maintaining maximum
(including full backward) compatibility with current implementation.
Therefore, we decided to implement a minimalist set of opcodes, wait until
large pieces of software which make use of that set appear, and then
"profile" them to see what bytecode sequences are most used, and
then make them into new bytecodes. Similar approaches already proved to be
highly efficient with our proprietary compressor.
9. The
bytecode interpreter is fully re-entrant. Any number of applications
executing bytecode modules (even with different sets of extension functions) share a single in-memory
image of the respective dynamic library (bytecode.dl).
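Two of the numeric claims in the list above can be checked with a few lines of C. This is a minimal sketch with made-up function names: the first function just evaluates the frame-size formula quoted in item 4, and the other two compare addresses as signed and as unsigned values, which should agree for any pair of 24-bit addresses as item 7 states.

```c
#include <stdint.h>

/* Frame-size limit from item 4: unsigned 1-byte displacement (0..255),
   minus the 4 and 1 bytes subtracted in the text's formula, rounded down
   to a multiple of 4. */
static int max_frame_bytes(void)
{
    return (255 - 4 - 1) & ~3;
}

/* Item 7: with the 8 highest bits clear, an address has the same ordering
   whether it is viewed as a signed or as an unsigned 32-bit value. */
static int addr_less_signed(uint32_t a, uint32_t b)
{
    return (int32_t)a < (int32_t)b;
}

static int addr_less_unsigned(uint32_t a, uint32_t b)
{
    return a < b;
}
```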
Structure And Use of Bytecode Modules
The structure of a module is simple. At the very beginning, there is a table containing an even number of 16-bit words: one word, then some number of pairs of words, then one more word. The very first word is an offset to the module entry point: function main(), or whatever you call it; please see the VCC1 compiler documentation for more info on how that compiler handles the module entry point (however, your treatment may vary; see below). Each pair holds the offsets (again, relative to the start of the module) of the data and the code of an "exported" object, respectively (remember: all words, including double words, are stored in big-endian (BE) order!). The very last word is a "terminating NULL". Currently, the following tasks are up to the programmer:
1. How to load the module into memory. You should probably load it like any other resource - from the .app or .dl application archive. You may then keep it in memory (and keep p-code running) until your program terminates, or you may free it upon losing focus (thus effectively suspending p-code execution) and then re-load upon getting focus again.
2. How to interpret data and code pointed to by those offsets at the beginning of the module. For example, Cylandia treats the first offset as the module initialization function's offset (which contains relocation records and objects' ctors), and treats remaining offsets (except for the very last one, of course) as offsets to data and code of game actors. Our C compiler thinks of them as exported structures and respective exported methods.
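The table layout described above could be read with something like the following C sketch. The type and function names (export_t, read_be16, parse_module_header) are invented for this example, and the sketch assumes that the data offset of a real export is never zero, so that a zero word in that position can only be the terminator.

```c
#include <stddef.h>
#include <stdint.h>

/* One (data, code) offset pair from the table; hypothetical type name. */
typedef struct {
    uint16_t data_off;
    uint16_t code_off;
} export_t;

/* All words in a module are stored in big-endian order. */
static uint16_t read_be16(const uint8_t *p)
{
    return (uint16_t)((p[0] << 8) | p[1]);
}

/* Reads the entry-point offset and up to max_exports pairs.
   Returns the number of pairs found before the terminating zero word. */
static size_t parse_module_header(const uint8_t *module, size_t size,
                                  uint16_t *entry_off,
                                  export_t *out, size_t max_exports)
{
    size_t pos = 2;
    size_t n = 0;

    if (size < 4)
        return 0;                       /* too short to hold a table */
    *entry_off = read_be16(module);

    while (pos + 2 <= size) {
        uint16_t data = read_be16(module + pos);
        if (data == 0)
            break;                      /* terminating NULL word */
        if (pos + 4 > size || n == max_exports)
            break;                      /* truncated table or no room left */
        out[n].data_off = data;
        out[n].code_off = read_be16(module + pos + 2);
        n++;
        pos += 4;
    }
    return n;
}
```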
The only way to execute bytecodes is to call one of the vm_exec
family of functions found in the bytecode.dl library, like this (below, we assume
that the entire module has been loaded into 'word_t* bytecode'):
/* 1) call module initialization routine */
vm_exec( NULL,                    /* no active objects/actors yet */
         (byte_t*) bytecode +
           bytecode[ bytecode[ I_FIRST_EXPORT ] + I_STARTUP ], /* startup code */
         extension_functions );   /* functions imported by p-code */

/* 2) enter main loop */
vm_exec_3( NULL,                  /* no active objects/actors yet */
           (byte_t*) bytecode +
             bytecode[ bytecode[ I_FIRST_EXPORT ] + I_MAIN ], /* main(): module entry point */
           extension_functions,   /* functions imported by p-code */
           argc, argv, TRUE );    /* arguments */

/* 3) call module cleanup routine */
vm_exec( NULL,                    /* no active objects/actors yet */
         (byte_t*) bytecode +
           bytecode[ bytecode[ I_FIRST_EXPORT ] + I_CLEANUP ], /* cleanup code */
         extension_functions );   /* functions imported by p-code */
Important note:
all functions that are called via vm_exec() must return with the retf opcode (in other words,
they are considered far functions, as opposed to near functions callable
via calln.s).
The very last argument to vm_exec() is the address of the table of
functions prototyped as follows (also, see callx.b
opcode description):
typedef dword_t (*import_t)( dword_t arg0, dword_t arg1, void* this_ptr );
import_t extension_functions[] = { /* ... */ };
The primary use of extension functions is for implementation of time-critical
pieces of code and filling in the gaps which currently exist in the
interpreter's import abilities. For example, currently there is no way to
import CyOS' global variables, so you'll probably have to provide an extension
function which returns the address of a variable (say, a font handle) to
get access to it.
Please see the attached example program for further details.
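As a rough illustration of the idea, here is a host-side C sketch of an extension function table. Everything except the import_t prototype is an assumption made for this example: g_font_handle stands in for a CyOS global, call_extension only imitates an indexed call, and dword_t is mapped to uintptr_t so that the sketch also works on a 64-bit host (on the H8S it is a 32-bit value).

```c
#include <stddef.h>
#include <stdint.h>

/* Host-side stand-in for the 32-bit dword_t of the real system. */
typedef uintptr_t dword_t;

typedef dword_t (*import_t)(dword_t arg0, dword_t arg1, void *this_ptr);

/* Hypothetical global that the bytecode cannot import directly. */
static int g_font_handle = 42;

/* Extension function that exposes the address of the global. */
static dword_t ext_font_handle_addr(dword_t arg0, dword_t arg1, void *this_ptr)
{
    (void)arg0; (void)arg1; (void)this_ptr;
    return (dword_t)&g_font_handle;
}

/* Extension function doing a small piece of "native" work. */
static dword_t ext_add(dword_t arg0, dword_t arg1, void *this_ptr)
{
    (void)this_ptr;
    return arg0 + arg1;
}

import_t extension_functions[] = { ext_font_handle_addr, ext_add };

/* Imitates a call by table index; not the interpreter's actual code. */
static dword_t call_extension(import_t *table, size_t index,
                              dword_t arg0, dword_t arg1, void *this_ptr)
{
    return table[index](arg0, arg1, this_ptr);
}
```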
Internally, the bytecode interpreter (i.e. the vm_exec() function found in the
bytecode.dl dynamic library) uses the following register assignments:
er0
er1
er2
er3
er4
er5
er6
er7
The bytecode assembler, the vas utility, understands the following options:
-i
-d
-h
-o outfile
-O
-v
If no options are given, vas reads from stdin, writes to stdout, does not optimize the output code, and does not print statistics messages. The preferred extension for the output file is .bin, which stands for binary. While parsing input:
Note: forward references (i.e. references to labels which have not been defined yet) are OK. Moreover, the assembler operates so that even resolving backward references is always postponed until the second pass.
Associates
all the following instructions (until next .ln directive) with a
particular line number within source code in a higher-level language (say,
C, C++, BASIC, Pascal, etc.) Compilers should use this directive
extensively while emitting code so that the assembler can report errors and
other problems in association with the source line that caused them.
Introduces
a global label. Label name is a number in range [99, 32768].
It does not matter (at all) what that label denotes —
program code or data, or even something else (like control information);
the label is just assigned the current value of the Program Counter (PC).
Two, or more, labels on sequential lines are OK; these labels will be
identical.
.disp <global_label> <displacement>
Allocates
2 bytes and then stores there the unsigned distance from the current
position of the Program Counter forward to the label
<global_label> + <displacement>.
I.e. the following sequence
.disp 555 0
.label 555
stores 0, not 2; while the following sequence stores 7:
.disp 333 0
.skip 7
.label 333
Currently, this directive is used in conjunction with the switch opcode only.
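The two examples above suggest that the distance is measured from the position just past the 2 bytes that .disp itself allocates. A minimal C sketch of that reading follows; the function name and parameters are made up for this illustration, and the rule is inferred from the examples rather than taken from the assembler's source.

```c
#include <stdint.h>

/* pc_at_directive: Program Counter where the .disp directive starts.
   label_addr:      Program Counter value assigned to the label.
   displacement:    the extra <displacement> argument of the directive. */
static uint16_t disp_value(uint16_t pc_at_directive, uint16_t label_addr,
                           int16_t displacement)
{
    /* The directive occupies 2 bytes; measure from the byte after them. */
    uint16_t pc_after = (uint16_t)(pc_at_directive + 2);
    return (uint16_t)(label_addr + displacement - pc_after);
}
```

For the first example the directive sits at 0 and the label at 2, giving 0; for the second the label sits at 2 + 7 = 9, giving 7.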
.ldisp <global_label> <displacement>
Allocates 4 bytes and then stores there the distance from the current position of the Program Counter to the label <global_label> + <displacement>.
Allocates
2 bytes and stores an absolute offset to the label <global_label>.
This directive is currently used for building export
tables only.
If
the current value of the Program Counter is an odd number, then this directive
emits a single zero byte. Please note that there is no need to align code, so this
directive is only useful for data elements bigger than one byte.
Advances
Program Counter by specified <number_of_bytes> (i.e. allocates <number_of_bytes>
zero bytes).
Emits
byte <number>. Preferred way to initialize static data.
Emits word <number>.
.string <text_in_double_quotes>
Emits
string literal with the value <text_in_double_quotes>, plus terminating
0. Unfortunately, there is currently no way to escape any symbols, notably
newline and quotable quotation marks. As a workaround, the .byte directive could be used to emit any bytes, but
please be warned that, in the future, characters may no longer be single
bytes.
Starts the 'bss' (Blank Storage Segment) segment. Preferred place for uninitialized global variables.
Starts segment of code.
Starts segment of data.
Starts segment of data. Duplicate segments of this type will be merged by vlink.
Declares a public symbol for use in other modules.
Declares an external symbol used in this module.
Specifies
logical end of the source file. Source files that do not use this directive
may not compile with future versions of the assembler.
All bytecodes have one
explicit operand, at most. However, assembler instructions for some of them seem
to allow for one extra argument (see below). Please note that this is only a
notation device — a way to tell assembler to adjust target address after
converting label number or base address into a final offset (stack-, object-,
or buffer-relative).
Suppose the Cybiko C compiler sees something like this:
int my_array[ 3 ];
...
my_array[ 2 ] = 1;
Obviously, there is no need to load index, scale it (since array elements are
word-sized), then load address of the array, then add it to scaled index,
etc. Instead,
our compiler will produce something like this:
.label 777 ; my_array
.skip 6
...
leag.u 777 4 ; my_array
move
load1
storeis
Here, 777 is label number (ordinal), 6 is the size of my_array[], and 4 is
extra displacement relative to the label 777. Note that the assembler will turn 'leag.u 777 4' into a single opcode followed by a
word-sized displacement (hence the .u
suffix).
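The folding of a constant index into the extra displacement can be sketched as follows. The helper name and the SIZEOF_INT constant are assumptions made for this example (the text states elsewhere that int is a synonym for short, i.e. 2 bytes); the real compiler's code generator is certainly more involved.

```c
#include <stdio.h>

/* Per the text, int is a synonym for short, so an int element is 2 bytes. */
enum { SIZEOF_INT = 2 };

/* Formats a leag.u instruction for label[index], folding the scaled
   constant index into the extra displacement argument.
   Returns the number of characters written (as snprintf does). */
static int format_leag(char *buf, size_t n, int label, int index, int elem_size)
{
    return snprintf(buf, n, "leag.u %d %d", label, index * elem_size);
}
```

Calling format_leag(buf, sizeof buf, 777, 2, SIZEOF_INT) produces the text "leag.u 777 4", matching the listing above.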
Note that instruction suffixes designate optional data that may follow (and not
the size of the operands):
.c char (that is, signed byte),
.b byte (unsigned byte),
.s short (16-bit signed word),
.u unsigned short (16-bit unsigned word),
.w word (16-bit unsigned word),
.l long word (32-bit signed double word).
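Since opcodes are single bytes, the suffix alone determines the encoded length of an instruction. A small C sketch of that mapping, with function names invented for this example:

```c
#include <stddef.h>

/* Number of data bytes that follow the opcode for a given suffix letter. */
static size_t suffix_data_bytes(char suffix)
{
    switch (suffix) {
    case 'c':               /* signed byte */
    case 'b':               /* unsigned byte */
        return 1;
    case 's':               /* 16-bit signed word */
    case 'u':               /* 16-bit unsigned word */
    case 'w':               /* 16-bit unsigned word */
        return 2;
    case 'l':               /* 32-bit signed double word */
        return 4;
    default:                /* no suffix: the opcode stands alone */
        return 0;
    }
}

/* Total encoded size: one opcode byte plus the optional data. */
static size_t instruction_bytes(char suffix)
{
    return 1 + suffix_data_bytes(suffix);
}
```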
Compares
R0 to R1, sets R0 to 1 if registers are equal, or to 0 otherwise.
Compares
R0 to R1, sets R0 to 1 if R1 is less than R0, or to 0 otherwise.
Compares
R0 to R1, sets R0 to 1 if R1 is less than or equal to R0, or to 0
otherwise.
Compares
R0 to R1, sets R0 to 1 if R1 is greater than R0, or to 0 otherwise.
Compares
R0 to R1, sets R0 to 1 if R1 is greater than or equal to R0, or to 0
otherwise.
Tests
R0 and sets it to 1 if it was zero, or to 0 otherwise, effectively
performing 'logical not' upon contents of R0. Roughly equivalent to the
following opcode sequence: move load0
seteq.
Tests
R0 and sets it to 1 if it was non-zero; roughly equivalent to the following
opcode sequence: move load0
setne. This is effectively 'convert to
boolean' operator performed upon contents of R0.
Compares
R1 to signed byte <num> (-128..127), sets R0 to 1 if they are equal,
or to 0 otherwise. Same as the 'load.c <num>
seteq' sequence but more efficient.
Compares R1 to signed word <num> (-32768..32767), sets R0 to 1 if they are equal, or to 0 otherwise. Same as the 'load.s <num> seteq' sequence but more efficient.
Compares
R1 to signed double word <num>, sets R0 to 1 if they are equal, or to
0 otherwise. Same as the 'load.l <num>
seteq' sequence but more efficient.
After this opcode, there must be a table of 16-bit words formed with the respective number of .disp directives, which denote jump displacements for switch values 0, 1, 2, etc. The table must contain an entry for every switch value expected, so the number of .disp directives must exceed the maximum switch value. No bounds checks are currently performed at run time (for the sake of efficiency), so use of this opcode may be tricky! Used for the implementation of the switchfast C statement.
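A table lookup of this kind might look like the C sketch below. It is not the interpreter's code: the function name is invented, and it assumes that each entry is a big-endian word whose value is counted from the byte just past that entry, which is how the .disp directive is described. Like the opcode itself, the sketch performs no bounds check.

```c
#include <stddef.h>
#include <stdint.h>

/* code:      start of the module image.
   table_off: offset of the first .disp entry (right after the opcode).
   value:     the switch value (0, 1, 2, ...).
   Returns the offset of the jump target within the module. */
static size_t switch_target(const uint8_t *code, size_t table_off, uint32_t value)
{
    size_t entry = table_off + (size_t)value * 2;      /* 2 bytes per entry */
    uint16_t disp = (uint16_t)((code[entry] << 8) | code[entry + 1]);
    return entry + 2 + disp;     /* distance counts from just past the entry */
}
```

A value beyond the end of the table would read whatever bytes follow it, which is why the table must cover every value that can occur.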
Unconditional branch to instruction that is within the -128..127 range relative to the instruction immediately following this one. If target <label> is not within the reach of this instruction, the assembler will try to use its 'long' counterpart, jump.s.
If R0 is zero, then branch to instruction that is within the -128..127 range relative to the instruction immediately following this one. If target <label> is not within the reach of this instruction, the assembler will try to use its 'long' counterpart, jumpz.s.
If R0 is not zero, then branch to instruction that is within the -128..127 range relative to the instruction immediately following this one. If target <label> is not within the reach of this instruction, the assembler will try to use its 'long' counterpart, jumpnz.s.
Unconditional
branch to instruction that is within the -32768..32767 range relative to the
instruction immediately following this one.
If R0
is zero, then branch to instruction that is within the -32768..32767 range
relative to the instruction immediately following this one.
If R0
is not zero, then branch to instruction that is within the
-32768..32767
range relative to the instruction immediately following this
one.
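The choice between the short and long forms can be sketched in C as follows. The names are made up for this example, and the sketch assumes a 1-byte opcode followed by a 1-byte (short form) or 2-byte (long form) displacement. A real assembler also has to cope with the fact that widening one jump moves the labels after it; that is ignored here.

```c
#include <stdint.h>

typedef enum { JUMP_SHORT, JUMP_LONG, JUMP_UNREACHABLE } jump_form_t;

/* jump_addr:   offset of the jump opcode within the module.
   target_addr: offset of the target label.
   Displacements are relative to the instruction following the jump. */
static jump_form_t pick_jump_form(int32_t jump_addr, int32_t target_addr)
{
    int32_t d_short = target_addr - (jump_addr + 2);   /* opcode + 1 byte  */
    int32_t d_long  = target_addr - (jump_addr + 3);   /* opcode + 2 bytes */

    if (d_short >= -128 && d_short <= 127)
        return JUMP_SHORT;
    if (d_long >= -32768 && d_long <= 32767)
        return JUMP_LONG;
    return JUMP_UNREACHABLE;
}
```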
Push
address of the instruction immediately following call itself on stack, then
jump to the instruction that is within the -32768..32767 range from the immediately
following one.
Indirect
call: push address of the instruction immediately following call itself on
stack, then jump to the instruction whose address is currently within R1.
Calls the extension function identified by its index within
the table of extension functions.
Call the CyOS function
which takes 0, 1, or 2 arguments. For 1-argument system functions, simply
evaluate (just before call) the necessary expression so that the evaluation
result is in R0, then use calls12.w. For 2-argument functions, evaluate
the second argument, use move,
then evaluate the first argument, then use calls12.w (or, if R1 is
likely to be lost in the process of evaluation of the first argument, evaluate
second argument, use push, evaluate first
argument, use pop, then use calls12.w).
Calls
a CyOS function which takes 3 or more arguments. This opcode first pops the third argument off the stack, then acts exactly as calls12.w. So you must first evaluate and push arguments 3 and beyond in reverse order,
then evaluate arguments 2 and 1, as shown above,
then use calls3.w.
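The evaluation order described for calls12.w and calls3.w can be modelled with a toy two-register machine in C. This is only a model for illustration: the vm_t structure, the op_* helpers and demo_sys are invented here, and placing the call result in R0 is an assumption.

```c
#include <assert.h>

/* Toy model of the virtual machine state: R0, R1 and a small stack. */
typedef struct {
    long r0, r1;
    long stack[16];
    int sp;                     /* number of values currently on the stack */
} vm_t;

static void op_load(vm_t *vm, long v) { vm->r0 = v; }
static void op_move(vm_t *vm)         { vm->r1 = vm->r0; }
static void op_push(vm_t *vm)
{
    assert(vm->sp < 16);
    vm->stack[vm->sp++] = vm->r0;
}

typedef long (*sys3_t)(long a1, long a2, long a3);

/* Pops the third argument, then calls with R0 = arg 1 and R1 = arg 2. */
static void op_calls3(vm_t *vm, sys3_t fn)
{
    assert(vm->sp > 0);
    long a3 = vm->stack[--vm->sp];
    vm->r0 = fn(vm->r0, vm->r1, a3);   /* assumption: result lands in R0 */
}

/* Hypothetical 3-argument system function; encodes the argument order. */
static long demo_sys(long a1, long a2, long a3)
{
    return a1 * 100 + a2 * 10 + a3;
}

/* Sequence a compiler might emit for demo_sys(a1, a2, a3). */
static long call_demo(long a1, long a2, long a3)
{
    vm_t vm = {0};
    op_load(&vm, a3); op_push(&vm);    /* argument 3: evaluate, push   */
    op_load(&vm, a2); op_move(&vm);    /* argument 2: evaluate, move   */
    op_load(&vm, a1);                  /* argument 1: evaluate into R0 */
    op_calls3(&vm, demo_sys);
    return vm.r0;
}
```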
Calls
function located within CyOS dynamic library or even application that
exports [some of] its functions. Very similar to calls12.w
and even uses the same calling convention. However, it differs in that it
must be immediately followed by the .disp
directive which should define offset to a 4-byte static variable that holds
address of export table of respective dynamic library. It is that table
that is indexed with argument <index>.
Just
like calld12.w, calls function located
within CyOS dynamic library or even application that exports [some of] its
functions. Very similar to calls12.w (uses
the same calling convention) and to calld12.w
(requires .disp directive); see these opcodes
for details.
"Return
far" — return from vm_exec().
"Return
near" — return from function called via calln.s.
Same
as retn, but pop arguments off the stack upon
return (here, <num> is number to be added to stack pointer just before return).
Add
-128..127 to stack pointer.
Push
R0.
Pop R1.
The
same as consecutive execution of the pop and add
operations. New
in virtual machine version 11 (included in SysPack 55).
Move
R0 to R1; same as "push pop"
sequence, but by far more efficient. The "push pop" sequence, however, may also be more
convenient sometimes, since it allows you to preserve R1. See calls12.w opcode description.
Load
0 (zero) into R0.
Load
1 (one) into R0.
Load
-128..127 into R0.
Load
-32768..32767 into R0.
Load
double word into R0.
Same as loadic, but the parameter of this operation will be added to the byte pointer before execution. New in virtual machine version 11 (included in SysPack 55).
Same as loadic, but the parameter of this operation will be added to the word pointer before execution. New in virtual machine version 11 (included in SysPack 55).
Same as loadic, but the parameter of this operation will be added to the double word pointer before execution. New in virtual machine version 11 (included in SysPack 55).
leal.b <offset> <displacement>
Load
effective address of 'auto' variable into R0.
leat.b <offset> <displacement>
Load
effective address of 'object' variable into R0. If byte-sized offset
specified in the instruction is 0, then such instruction actually means
'load this into R0'. Compilers for C++ and other OO languages may
make use of this feature.
Load effective address of 'static' variable or function into R0.
If the effective address precedes the instruction, the assembler will try to use
its 'back' counterpart, leagb.u.
Load char
pointed to by contents of R0 into R0.
Load short
pointed to by contents of R0 into R0.
Load long
pointed to by contents of R0 into R0.
Store
char contained in R0 at the address pointed to by contents of R1.
Store
short contained in R0 at the address pointed to by contents of R1.
Store
long contained in R0 at the address pointed to by contents of R1.
loadlc.b <offset> <displacement>
Load char
located at the address 'stack pointer' + 'next byte', into R0.
loadls.b <offset> <displacement>
Load short
located at the address 'stack pointer' + 'next byte', into R0.
loadll.b <offset> <displacement>
Load long
located at the address 'stack pointer' + 'next byte', into R0.
loadtc.b <offset> <displacement>
Load char
located at the address 'this' + 'next byte', into R0.
loadts.b <offset> <displacement>
Load short
located at the address 'this' + 'next byte', into R0.
loadtl.b <offset> <displacement>
Load long
located at the address 'this' + 'next byte', into R0.
storelc.b <offset> <displacement>
Store
char from R0 at the address 'stack pointer' + 'next byte'.
storels.b <offset> <displacement>
Store
short from R0 at the address 'stack pointer' + 'next byte'.
storell.b <offset> <displacement>
Store
long from R0 at the address 'stack pointer' + 'next byte'.
storetc.b <offset> <displacement>
Store
char from R0 at the address 'this' + 'next byte'.
storets.b <offset> <displacement>
Store
short from R0 at the address 'this' + 'next byte'.
storetl.b <offset> <displacement>
Store
long from R0 at the address 'this' + 'next byte'.
loadgc.u <label> <displacement>
loadgs.u <label> <displacement>
loadgl.u <label> <displacement>
Add 1
to R0.
Add 2
to R0.
Add 4
to R0.
Add 1 to
byte addressed by eR0 register. New
in virtual machine version 11 (included in SysPack 55).
Add 1 to
word addressed by eR0 register. New
in virtual machine version 11 (included in SysPack 55).
Add 1 to
double word addressed by eR0 register. New
in virtual machine version 11 (included in SysPack 55).
Subtract
1 from R0.
Subtract
2 from R0.
Subtract
4 from R0.
Subtract
1 from byte addressed by eR0 register. New
in virtual machine version 11 (included in SysPack 55).
Subtract
1 from double word addressed by eR0 register. New
in virtual machine version 11 (included in SysPack 55).
The words following this opcode form a list of patches, terminated by a zero word (.word 0). Each word is a relative offset (.disp) to a label that marks an .ldisp entry (the address of a variable). The absolute address of the byte following each .ldisp value is added to that value. New in virtual machine version 11 (included in SysPack 55).
Arithmetically
shift R0 left by 1 bit.
Arithmetically
shift R0 left by 2 bits.
Arithmetically
shift R0 right by 1 bit.
Arithmetically
shift R0 right by 2 bits.
Add R1 to R0, result in R0.
Add
-128..127 to R0, result in R0.
Add
-32768..32767 to R0, result in R0.
Add
double word to R0, result in R0.
Subtract
R0 from R1 (!), result in R0.
Negate
R0.
Multiply
R0 by R1, result in R0.
Multiply
eR0 by parameter of this command. New
in virtual machine version 11 (included in SysPack 55).
Same
as consecutive execution of the "mul.c" and "add"
commands. New
in virtual machine version 13 (included in SysPack 56).
Jump,
if eR0 <= eR1. In that case eR0 = 0 after execution, otherwise eR0 = 1. New
in virtual machine version 13 (included in SysPack 56).
Jump,
if eR0 >= eR1. In that case eR0 = 0 after execution, otherwise eR0 = 1. New
in virtual machine version 13 (included in SysPack 56).
Divide
R1 by R0 (!), result in R0.
Divide
R1 by R0 (!), remainder (result) in R0.
Bitwise
AND of R0 and R1, result in R0.
Bitwise
OR of R0 and R1, result in R0.
Bitwise
eXclusive OR of R0 and R1, result in R0.
Syspack V. | Bytecode V. | Own version in root.inf | Changes
53, 54 | 10 | N/A | initial implementation
55 | 11 | N/A | decic1, incic1, decis1, incis1, decil1, incil1, popadd, patch, loadic.b, loadis.b, loadil.b, mul.c
56 | 13 | N/A |
57 | 14 | 1.3.3 | "root.inf" added
N/A | 15 | 1.4.4 | lshift, rshift, urshift1, urshift2, urshift, native, bcopy.b, swap, push2, pop2, move2, nop
N/A | 16 | 1.5.1 | loadiuc, loadius, loadiuc.b, loadius.b, loadluc.b, loadlus.b, loadtuc.b, loadtus.b, loadguc.u, loadgus.u, cuwd, cubd, deciuc1, inciuc1, decius1, incius1, setb, setbe, seta, setae, mulu.c, mulu, divu, modu
N/A | 18 | 1.7.1 | callni.b
Since we assumed nothing, we'll profile anything :-) and come up with the
rest of 256-147=109 opcodes. Most probably, there will appear direct
loads/saves for static variables and better (read: some :-) support for
increments and decrements for variables (as opposed to memory locations
denoted with address expressions).
Assembler should become macro assembler (not a top priority task, though).
Life should become easier :-).
Copyright © 2001 Cybiko, Inc. All rights reserved.