After many, many days without a working monitor, I’ve been practically forced to plan things out on paper. The good news: the newer version will finally have a compiled form!
If you haven’t seen the files in the Zip archive (downloads/interpreter14d3.zip), then you probably won’t recognize some of this.
Strings will be stored natively in v2 (ASCII-compatible types only), along with C-style concatenation by adjacency. Unlike C, it is not limited to static literals:
`1 31` results in "131".
`"foo"==>a; " fighter"==>b; a b` results in "foo fighter".
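A rough sketch of concatenation by adjacency, written in Python rather than in the interpreter's own code. The function name `concat_adjacent`, the `variables` dictionary, and the plain decimal rendering of numbers are all assumptions made for illustration:

```python
def concat_adjacent(values):
    """Join adjacent values into one string, converting numbers to text first."""
    parts = []
    for value in values:
        if isinstance(value, str):
            parts.append(value)
        else:
            # Assumption: numbers are rendered in plain decimal form.
            parts.append(str(value))
    return "".join(parts)


# Hypothetical variable store standing in for the interpreter's identifier lists.
variables = {}
variables["a"] = "foo"        # "foo"==>a
variables["b"] = " fighter"   # " fighter"==>b

print(concat_adjacent([1, 31]))                            # 131
print(concat_adjacent([variables["a"], variables["b"]]))   # foo fighter
```

The point is only that adjacency works on run-time values (variables included), not just on literals known at parse time.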
Identifier storage and look-up will use a faster approach without consuming too much memory. The identifier parser will calculate a 2-bit checksum, a simple sum of the name characters. Scanning the local and global lists will first check the length (stored minus one) and the hash (checksum and leading character combined) before comparing the full name. Identifiers will be limited to 256 characters, so the length minus one fits in a single byte, even though the high-level form has allowed lengths of up to 31 bits (over 2.1 billion characters).
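Here is a minimal sketch of that look-up in Python. The names `make_key` and `lookup`, the tuple layout of the list entries, and the exact way the checksum and leading character are packed together are assumptions, since the text above does not fix the bit layout:

```python
def make_key(name):
    """Build the cheap comparison key: (length - 1, checksum + leading character)."""
    data = name.encode("ascii")
    if not 1 <= len(data) <= 256:
        raise ValueError("identifier must be 1 to 256 characters")
    length_minus_one = len(data) - 1      # 0..255, fits in one byte
    checksum = sum(data) & 0x03           # low two bits of the character sum
    leading = data[0]
    # Hypothetical packing: checksum placed in the bits above the leading character.
    hash_value = (checksum << 8) | leading
    return length_minus_one, hash_value


def lookup(local_list, global_list, name):
    """Scan the local list, then the global list; entries are (key, name, value)."""
    key = make_key(name)
    for table in (local_list, global_list):
        for entry_key, entry_name, value in table:
            # Cheap checks first; the full name is compared only on a key match.
            if entry_key == key and entry_name == name:
                return value
    return None


print(make_key("foo"))   # (2, 102)
print(make_key("bar"))   # (2, 354)
```

Most mismatches are rejected on the small key alone, so the full string comparison only runs for names with the same length, checksum, and first character.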
There are three ways of addressing each:
Static | name := initial value |
Global/Static | value==>name |
Local | value-->name |
The compilation process will be an optional extension on the parser, storing functions as identifier binary data (as opposed to value/string data), and constants and variable references in global or local blocks (similar to object file storage).
What sped up getting the compiled form to work, and easily, was the realization that the expression stack in the parser already stored all of the necessary operator and value information. It was just a matter of reverse-copying this data to the function allocation. Two benefits came with this. First, automatic optimization (conceived 2-3 years ago: the precalculation of constants, left-to-right, bailing when a non-constant appears) was made easy. Second, bit #7 of the opcode is already defined: it is used in run-on multiplication (`2(17)` -> 34), where evaluation continues after using the stored number, and it stays valid and is carried into the compiled form.
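The constant precalculation can be sketched roughly as follows, in Python. This is a simplification and not the parser's actual stack format: the expression is assumed to be a first operand plus a list of `(operator, operand)` steps applied strictly left to right, variables are represented as strings, and `fold_leading_constants` is a made-up name:

```python
import operator

# Small operator subset, enough for illustration.
OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}


def fold_leading_constants(first, steps):
    """Precompute the constant prefix of an expression.

    first: initial operand (number = constant, str = variable name).
    steps: list of (op, operand) pairs applied left to right.
    Returns the folded first operand and the steps that remain.
    """
    if isinstance(first, str):
        return first, list(steps)       # starts with a variable: nothing to fold
    acc = first
    done = 0
    for op, operand in steps:
        if isinstance(operand, str):
            break                       # bail at the first non-constant
        acc = OPS[op](acc, operand)
        done += 1
    return acc, list(steps[done:])


# 2 + 3 * 4 + x * 5, evaluated strictly left to right in this sketch
print(fold_leading_constants(2, [("+", 3), ("*", 4), ("+", "x"), ("*", 5)]))
# (20, [('+', 'x'), ('*', 5)])
```

Everything before the first variable collapses into a single constant, and the rest is left for run time.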
After six more prototypes, I’ve got the new instruction set, as close as I could get it to the high-level interpreter. There are only 41 types in all.
Instruction Set (Octal):
hi+ | x0x | x1x | x2x | x3x | x4x | x5x | x6x | x7x |
0xx | ldv | cpy | srt | abs | neg | not | sno | stv | -- nop mnemonic: ldv w/ no operand
1xx | ldx | asr | shr | shl | eor | ior | and | stx | -- see Line-0/1 types below
2xx | add | sub | mul | pow | div | dvs | mod | mds |
3xx | clt | cge | cgt | cle | ceq | cne | cid | cni |

Low operand types:
+0 | param, uint8, or stack pull (no byte) if ≥0x40/ldx |
+1 | param, uint32 |
+2/3 | stack, int8/32 |

Line-0 secondary, low: push prior to operation; 'not'+4 == '+not'

Line-0/1 secondary types:
hex | instruction |
38 | lad; list add |
3C | ret (no operand); return |
3D-F | clm; call/run-on multiply |
44 | cnc; concat |
4E | ads; add stack |
66/7 | bra (relative); branch |
7E/F | bnz (rel.); .. if not zero |
76/7 | brz (rel.); .. if zero |
7C | ldi; load immediate |

Line-2/3 secondary operands (≥0x7C/ldi): immediate: 8/32 bits; double, 32/64 bits
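To show how the octal layout reads, here is a small Python sketch that splits an opcode byte into its three octal digits and looks up the primary mnemonic. It assumes the high digit selects the row, the middle digit the column, and the low digit the operand type. The secondary types (lad, ret, clm, and so on) reuse some of these codes and are deliberately not modeled. The names `PRIMARY` and `decode` are made up:

```python
# Primary mnemonics, indexed by [high octal digit][middle octal digit].
PRIMARY = [
    ["ldv", "cpy", "srt", "abs", "neg", "not", "sno", "stv"],  # 0xx
    ["ldx", "asr", "shr", "shl", "eor", "ior", "and", "stx"],  # 1xx
    ["add", "sub", "mul", "pow", "div", "dvs", "mod", "mds"],  # 2xx
    ["clt", "cge", "cgt", "cle", "ceq", "cne", "cid", "cni"],  # 3xx
]


def decode(opcode):
    """Split an opcode byte into (primary mnemonic, low operand-type digit)."""
    if not 0 <= opcode <= 0o377:
        raise ValueError("opcode must fit in one byte")
    hi = (opcode >> 6) & 0o3    # row: 0xx..3xx
    mid = (opcode >> 3) & 0o7   # column: x0x..x7x
    low = opcode & 0o7          # operand type (+0, +1, +2/3, ...)
    return PRIMARY[hi][mid], low


print(decode(0o000))   # ('ldv', 0)
print(decode(0x40))    # ('ldx', 0)
print(decode(0o271))   # ('mds', 1)
```

A real decoder would check the secondary codes first and only fall back to this table for the remaining values.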
Interpreter script: high-level form
Don't return NULL on ':' out of place for func '::'. Bump up version to 1.5 for this.
High-level keywords: 'if'/endif, while/endw/endwhile, and end for func.
Negated position for line = goto.
endw stores returning line, func start holds end for skip.
Possibly do `&&` for string concat w/ space (HyperCard form) -- interferes with bool-and op.

Compilation: string form (canceled)
Low 5 bits always zero for termination.
Store strings in reverse order.
Not exactly C-string -- numbers may have nulls (evalvalue).

***OBSOLETE***
000 | end |
001 | str |
010 | getv (includes func) |
011 | setv |
100 | push |
101 | pull |
110 | bnz (`?`) |
111 | bra (`:`) |
Nix all that - CISC order for RISC set. Nix that too.

Use wadfile method for searches - 2/4 characters in ID, 2/4 byte length -- fastest search done by packing all initial search data in 32 bits.
Offset to name string in string buffer, maybe chained.
item/word/para-range for modifying text (ref to alloc instead of copy - `@` = manual copy).
MAX_VALUE_SIZE 8
Internal ctype array, putting bit #7 as the “keyword/name char.”
BCN (optional): Binary Coded Numeric: 0-9, A=zero/ten, B=base number (digits prior), C=minus (invert current sign), D=decimal/fraction point, E=notation (1e+9), F=terminator.
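As a very tentative illustration of the BCN idea, here is a Python sketch that decodes only the codes whose meaning is unambiguous in the note: digits 0-9, C (invert sign), D (decimal point), and F (terminator). The A, B, and E codes are left unimplemented because their exact behavior is not spelled out. The input is assumed to be an already unpacked list of nibble values, and `decode_bcn` is a made-up name:

```python
from decimal import Decimal


def decode_bcn(nibbles):
    """Decode a sequence of 4-bit BCN codes (subset: 0-9, C, D, F) into a Decimal."""
    negative = False
    seen_point = False
    chars = []
    for code in nibbles:
        if 0 <= code <= 9:
            chars.append(str(code))
        elif code == 0xC:
            negative = not negative          # invert the current sign
        elif code == 0xD:
            if seen_point:
                raise ValueError("more than one decimal point")
            seen_point = True
            chars.append(".")
        elif code == 0xF:
            break                            # terminator: ignore anything after it
        else:
            # A, B and E are not modeled in this sketch.
            raise NotImplementedError("unsupported BCN code: %X" % code)
    if not any(ch != "." for ch in chars):
        chars = ["0"]                        # no digits at all: treat as zero
    value = Decimal("".join(chars))
    return -value if negative else value


print(decode_bcn([0xC, 1, 2, 0xD, 5, 0xF]))   # -12.5
```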