STOIC (STACK ORIENTED INTERACTIVE COMPILER)

SAO VAX/VMS version implemented by Roger Hauck
Smithsonian Institution Astrophysical Observatory
Cambridge, Massachusetts
June, 1980

STOIC is a general purpose interactive program which incorporates the capabilities of a compiler, assembler, debugger, loader, and operating system within a single consistent architecture. It is core-efficient while retaining high running speeds. In addition, the language is extremely flexible, permitting the user to develop a working vocabulary of subroutines tailored to his specific application.

The single most prominent feature of STOIC is its principal data structure, called the dictionary. The dictionary is an ordered list of entries called words. Associated with the dictionary entry for each word is a name; a legal name for a word is any string of up to 255 ASCII characters. Punctuation, numerics, and most other special characters may be freely used within a name. Illegal characters within a name are:

    SPACE  TAB  CARRIAGE RETURN  FORM FEED  LINE FEED  RUBOUT  NULL

A literal is a sequence of characters which describes a constant. The basic STOIC supports two types of literal: 32-bit integer and string. The language may be extended to include other data types such as floating point or double precision integer.

NOTE: Floating point literals (double-precision or "D"-floating) have now been implemented on the VAX version.

An integer literal is a sequence of digits optionally preceded by a plus or minus sign. All digits must be less than the current radix. No spaces may be embedded within the literal. Integer literals must be in the range of -2,147,483,648 to 2,147,483,647. Although the default radix is 16 (hexadecimal), the examples presented in this manual use a default radix of 10 (decimal) to avoid confusion.
EXAMPLES:

    -1234    is a legal literal
    +-100    is not a legal literal
    -AFC0    is a legal literal if radix is hexadecimal

String literals may take one of two forms:

    1) A string enclosed in double quotes: "STRING"
    2) A string preceded by single quote and terminated by space or tab: 'STRING

In either of the above types of string literal, the end of the line (return or form feed) may serve as a terminator.

EXAMPLES:

    "THIS IS A STRING LITERAL"
    '17715

STOIC syntax is quite simple. A legal command line consists of a sequence of literals and/or names of words separated by spaces or tabs, and terminated by carriage return.

Programming in STOIC consists primarily of defining a set of new words based on words which have already been defined. An initial vocabulary of about one hundred words called the KERNEL enables the user to get started.

The principal vehicle for communication between words is the parameter stack, frequently called "THE STACK". Typically, the parameters upon which a word will operate are pushed on the stack; the word pops its parameters from the stack and pushes its results on the stack. Communication through variables in fixed locations is also used.

STOIC uses reverse-polish notation for all operations. This means that all operands precede their operators; parentheses are never necessary.

EXAMPLES:

    1 1 + 2 *    in algebraic notation is (1+1)*2
    1 2 3 * -    in algebraic notation is 1-(2*3)

Unlike most other higher-level languages, STOIC enables the user to manipulate addresses as well as data. It is very important, however, for the user to remain aware of the distinction between an address and its contents.

There are three common types of words which push numbers on the stack: literals, constants, and variables. A reference to a literal or a constant causes its value to be pushed on the stack. A reference to a variable causes its address to be pushed on the stack. The three operators "@", "!" and "<-" are used to obtain and modify the value of a variable.
They are defined as follows:

    @     Replace the address on the top of the stack by the contents of that address. This word is used to load the contents of a memory location onto the stack.
    !     Store at the address on the top of the stack the number next to top of the stack. Both numbers are removed from the stack.
    <-    Store the number on the top of the stack at the address next to top of the stack. Both numbers are removed from the stack.

EXAMPLES: In the following examples, X, Y, and Z are variables, while A, B, and C are constants.

    100 X !            set value of X to 100.
    X 100 <-           set value of X to 100.
    X 100 !            store the address of X in location 100 (under normal circumstances, this would result in a memory-access violation).
    X @ Y !            set the value of Y to the value of X.
    X Y !              set the value of Y to the address of X.
    X @ Y @ + Z !      add the values of X and Y and store the result in Z.
    X A + Y !          store (address of X)+A in Y.
    X A B + + @ Y !    set value of Y to the contents of location X+A+B.

1.0 OPERATORS

STOIC provides the user with an unusually large number of fixed-point operators from which the following more common examples have been extracted. Unless otherwise specified, all numbers are 32-bit integers.

1.1 Unary Operators

The following operators replace the top of the stack with their result. The number on the top of the stack is called "A".

    MINUS    -A
    ABS      Absolute value of A
    NOT      Logical complement of A
    2*       A*2
    4*       A*4
    2/       A/2
    4/       A/4
    1+       A+1
    2+       A+2
    4+       A+4
    1-       A-1
    2-       A-2
    4-       A-4
    EQZ      -1 if A equal to 0, 0 otherwise
    NEZ      -1 if A not equal to 0, 0 otherwise
    LTZ      -1 if A less than 0, 0 otherwise
    LEZ      -1 if A less or equal to 0, 0 otherwise
    GEZ      -1 if A greater or equal to 0, 0 otherwise
    GTZ      -1 if A greater than 0, 0 otherwise

1.2 Binary Operators

The following operators replace the top two numbers on the stack with their result. The number on the top of the stack is called "A", the next to top is called "B".
    +      B+A
    -      B-A
    *      B*A
    /      B/A
    MAX    maximum (B,A) (signed)
    MIN    minimum (B,A) (signed)
    MOD    remainder of B/A
    AND    logical AND of B,A
    OR     logical OR of B,A
    XOR    logical EXCLUSIVE OR of B,A
    EQ     -1 if B equal to A, 0 otherwise
    NE     -1 if B not equal to A, 0 otherwise
    LT     -1 if B less than A, 0 otherwise
    LE     -1 if B less or equal to A, 0 otherwise
    GE     -1 if B greater or equal to A, 0 otherwise
    GT     -1 if B greater than A, 0 otherwise

1.3 Stack Order Operators

A number of operators are also provided whose sole function is to reorganize the elements of the stack. In the table below, stack contents are listed with the top of the stack first.

    NAME       STACK BEFORE    STACK AFTER    DESCRIPTION
    DUP        A               A A            duplicates top of stack
    OVER       A B             B A B          duplicates top - 1
    2OVER      A B C           C A B C        duplicates top - 2
    3OVER      A B C D         D A B C D      duplicates top - 3
    UNDER      A B             A              stores top at top - 1
    2UNDER     A B C           B A            stores top at top - 2
    3UNDER     A B C D         B C A          stores top at top - 3
    DROP       A B             B              discards top
    2DROP      A B C           C              discards top 2 stack entries
    3DROP      A B C D         D              discards top 3 stack entries
    UNDROP     A               X A            restores a stack entry; usable after EQ_IF or EQZ_IF (see "CONDITIONALS")
    2UNDROP    A               Y X A          restores two stack entries, as "UNDROP", above
    SWAP       A B             B A            exchanges top and top - 1
    2SWAP      A B C           A C B          exchanges top - 1 and top - 2
    FLIP       A B C           C B A          exchanges top and top - 2
    +ROT       A B C           B C A          roll top 3 stack entries up
    -ROT       A B C           C A B          roll top 3 stack entries down
    DDUP       A B             A B A B        duplicate the top 2 stack entries

1.3.1 [ And ] - The words [ and ] (square brackets) are special stack manipulation words used to count parameters. When a ] is found, the number of words on the stack after the last occurrence of [ is pushed onto the top of the stack.

EXAMPLE:

    0> [ 2 3 4 ] = = = =
    3 4 3 2

NOTE: Since [ and ] use the loop stack (see appendix A), the ] must occur at the same level of nesting as the [ it corresponds to.

1.4 I/O Words

"N" is the number on the top of the stack.

    COLUMN    Variable containing the current column number. NOTE: "COLUMN" is not yet implemented in VAX STOIC.
    TYO       Output the ASCII character in the rightmost 8 bits of N. COLUMN is incremented unless the character output is a return, in which case it is zeroed.
    CR        Output a return followed by a line feed. COLUMN is zeroed.
    IFCR      Output a return and line feed if COLUMN is non-zero. NOTE: "IFCR" is currently not implemented in VAX STOIC.
    SPACE     Output a space.
    SPACES    Output N spaces.
    TAB       Tab to column N. If already at or beyond column N, nothing is output. NOTE: "TAB" is not yet implemented in VAX STOIC.
    TYI       Input a character. The ASCII value of the character is placed in N.
    =         Output N in the current radix followed by space.
    ?         Output the contents of the location addressed by N followed by space.
    TYPE      Output N characters starting at byte pointer at top - 1.

1.5 Words Which Change The Current Radix

    OCTAL      Set current radix to OCTAL
    DECIMAL    Set current radix to DECIMAL
    HEX        Set current radix to HEXADECIMAL

1.6 Words Used To Reference Memory Locations

A represents the top of the stack, B represents the next to top.

    0<-         store 0 at location A
    -1<-        store -1 at location A
    +!          add B to the contents of location A
    1+!         increment location A
    1-!         decrement location A
    MOVE        copy the contents of the location addressed by B to the location addressed by A.
    EXCHANGE    exchange the contents of locations A and B.
    MVBYTES     copies bytes sequentially from one area to another. The byte count is at top, source byte address at top - 2, destination byte address at top - 1.

EXAMPLES:

    0> 1 MINUS =
    -1
    0> -1 ABS =
    1
    0> -1 EQZ =
    0
    0> 0 EQZ =
    -1
    0> 1 1 + =
    2
    0> 1 1 1 + + =
    3
    0> 1 2 3 = = =
    3 2 1
    0> 1 2 SWAP = =
    1 2
    0> 1 2 DUP = = =
    2 2 1
    0> 1 2 1 - EQ =
    -1
    0> 100 X ! X ?
    100
    0> -1 5 MAX =
    5
    0> -1 5 MIN =
    -1

1.7 Words Used To Move Bytes

The words used to move single bytes are:

    1B!  -1B!  0B!  B!  B<-  B@

These words perform like their longword-moving counterparts, but they reference 8-bit bytes rather than full longwords.
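The operator semantics of this chapter can be modeled outside STOIC. The following is a minimal Python sketch, not part of STOIC itself: a Python list stands in for the parameter stack (the end of the list is the top), only a handful of the words above are covered, and 32-bit overflow is ignored. The names `run`, `to_flag`, `UNARY`, and `BINARY` are hypothetical and chosen only for this illustration.

```python
# Illustrative model only; all names here are assumptions, not STOIC internals.

def to_flag(condition):
    """Comparison words leave -1 for true and 0 for false."""
    return -1 if condition else 0

# Unary words replace the top of the stack (A) with their result.
UNARY = {
    "MINUS": lambda a: -a,
    "ABS": abs,
    "1+": lambda a: a + 1,
    "1-": lambda a: a - 1,
    "2*": lambda a: a * 2,
    "EQZ": lambda a: to_flag(a == 0),
    "LTZ": lambda a: to_flag(a < 0),
}

# Binary words are called as f(b, a): A is the top, B the next to top.
BINARY = {
    "+": lambda b, a: b + a,
    "-": lambda b, a: b - a,
    "*": lambda b, a: b * a,
    "MAX": max,
    "MIN": min,
    "EQ": lambda b, a: to_flag(b == a),
    "LT": lambda b, a: to_flag(b < a),
}

def run(line, stack=None):
    """Interpret one command line; return (stack, values typed by "=")."""
    stack = [] if stack is None else stack
    typed = []
    for token in line.split():
        if token in UNARY:
            stack.append(UNARY[token](stack.pop()))
        elif token in BINARY:
            a = stack.pop()
            b = stack.pop()
            stack.append(BINARY[token](b, a))
        elif token == "DUP":
            stack.append(stack[-1])
        elif token == "OVER":
            stack.append(stack[-2])
        elif token == "SWAP":
            stack[-1], stack[-2] = stack[-2], stack[-1]
        elif token == "DROP":
            stack.pop()
        elif token == "=":
            typed.append(stack.pop())
        else:
            stack.append(int(token))  # decimal integer literal
    return stack, typed

print(run("1 2 SWAP = ="))   # ([], [1, 2])
print(run("1 2 1 - EQ ="))   # ([], [-1])
```

The dispatch order matters: operator names such as "1-" and "-" are checked before a token is treated as a literal, so "-1" is still read as a number while "1-" is read as a word.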
2.0 COLON DEFINITIONS

STOIC provides the capability of defining a new word in terms of previously defined words by means of the colon definition. Its syntax is as follows:

    'NEWWORD : WORD1 WORD2 ... WORDN ;

This creates a new dictionary entry called NEWWORD which, when executed, will in turn execute WORD1, WORD2, ..., WORDN. Each of the words WORD1, WORD2, ... must already exist as entries in the dictionary. If not, a fatal error message will be generated. For instance, if WORD2 in the above example is not yet defined when NEWWORD is defined, the fatal error message will be

    WORD2 undefined, compiling... WORD 2 ...in line 'NEWWORD : WORD1 WORD2 ... WORDN ;

A word may be redefined at any time. In this case, all prior definitions which referenced that word will still execute the old version. All subsequent definitions, however, will execute the most recently defined version.

EXAMPLES:

    'AVERAGE : + 2/ ;

This defines the word "AVERAGE" which computes the average of the top 2 numbers on the stack.

    0> 2 4 AVERAGE =
    3

    'SPACE : 32 TYO ;

This defines the word "SPACE" which types a space. (The ASCII code for space is 32 decimal.)

If the name of a word being redefined appears within the new definition, its old meaning will be used. The procedure for making recursive calls is described below under "RECURSION".

3.0 CONDITIONALS

3.1 IF ... ELSE ... THEN

STOIC has a powerful IF ... ELSE ... THEN construction which allows complicated logical tests to be performed. Conditionals may be nested subject to the normal restrictions on overlapping ranges, i.e. any conditional which is initiated within the range of another conditional must be terminated within that same range.

For the purposes of the conditional, "TRUE" is considered to be any odd value; "FALSE" is any even value.

    N IF T1 T2 ... TN THEN

The top of the stack, "N", is tested. If true (odd) the words T1, T2, ... TN are executed. If false (even) control passes to the word following "THEN".

    N IF T1 T2 ...
TN ELSE F1 F2 ... FN THEN

The top of the stack, "N", is tested. If true (odd) the words T1, T2, ... TN are executed; control then passes to the word following "THEN". If false (even) control passes to the word following "ELSE". The words F1, F2, ... FN are executed.

NOTE: Since byte relative jumps are used to save space in the compiled code, no more than 128 bytes can be used inside any "IF" conditional or in any loop. Using more will invoke the error message "Byte Displacement Overflow". If more space is necessary, a new word should be defined for the contents of the conditional or loop.

EXAMPLES:

    'ABS : DUP LTZ IF MINUS THEN ;

This defines the word "ABS" which replaces the top of the stack with its absolute value.

    'MAX : DDUP GT IF DROP ELSE UNDER THEN ;

This defines the word "MAX" which compares the top two stack entries and leaves the larger of the two.

3.2 EQ_IF, EQZ_IF, And UNDROP

The words EQ_IF and EQZ_IF combine the functions of a test and a conditional. However, the values that are popped off the stack for the call are stored, and can be retrieved by using the word UNDROP directly after the EQ_IF, EQZ_IF, or ELSE associated with either of these.

The words "NE_IF", "GT_IF", "LE_IF", "GE_IF", "LT_IF", "NEZ_IF", "GTZ_IF", "LEZ_IF", "GEZ_IF", and "LTZ_IF" can be used the same way; each combines its respective test with the conditional "IF".

4.0 ITERATION

STOIC provides five means for iterative execution of a sequence of words, namely:

    N ( ... )                 Execute the words included in parentheses N times.
    BEGIN ... END             Execute words between "BEGIN" and "END" until a condition is satisfied.
    BEGIN ... IF ... REPEAT   Execute words between "BEGIN" and "IF". If the condition is met, execute the words between "IF" and "REPEAT", then loop back to "BEGIN". If the condition is not met, exit, skipping the words between "IF" and "REPEAT".
    DO ... LOOP               Execute the words between "DO" and "LOOP", running an index from a lower to an upper limit, incrementing by 1 each time.
    DO ... N +LOOP            Execute the words between "DO" and "+LOOP", running an index from a lower to an upper limit, incrementing by N each time.

4.1 ( ... )

A sequence of words may be executed repetitively using the following syntax:

    N ( WORD1 WORD2 ... WORDN )

This causes the sequence WORD1, WORD2, ... to be executed N times where N is the number on the top of the stack. If N is zero or negative, the sequence of words is not executed at all and control passes to the word following the ")".

EXAMPLES:

    'DINGDING : 2 ( DING ) ;

This definition is functionally equivalent to:

    'DINGDING : DING DING ;

In either case, executing "DINGDING" causes the word "DING" to be executed twice.

    'SPACES : ( SPACE ) ;

This is a definition of the word "SPACES". Thus "20 SPACES" causes "SPACE" to be executed 20 times.

4.2 BEGIN ... END

The BEGIN ... END syntax permits the user to execute a sequence of words and then, depending on a computed logical variable, either loop back or continue on:

    BEGIN WORD1 WORD2 ... WORDN END

The sequence WORD1, WORD2, ... is executed once. When the "END" is reached, the top of the stack is popped and tested. If it is true (odd) then control passes to the word following "END". If it is false (even) control passes back to the word following "BEGIN".

EXAMPLE:

    'EXAMPLE : BEGIN 1- DUP DUP = EQZ END DROP ;

This defines the word "EXAMPLE" which might be called as follows:

    0> 5 EXAMPLE
    4 3 2 1 0

Each time through the loop, the top of the stack (initially the number 5) is decremented, printed, and compared to zero. If it is not zero, the loop is repeated; when it becomes zero, the loop terminates.

4.3 BEGIN ... IF ... REPEAT

BEGIN ... IF ... REPEAT is similar to BEGIN ... END except that the test is made in the middle of the loop rather than at the end. The words from "BEGIN" to "IF" are executed. If the top of the stack is true (odd) the words between "IF" and "REPEAT" are executed and control then passes back to the word following "BEGIN".
If the top of the stack is false (even), control passes to the word following "REPEAT".

EXAMPLE:

    0> BEGIN EOF NOT IF READ_RECORD REPEAT

This example might be used to read the contents of a file. "EOF" returns a -1 if end of file has been encountered; a 0 otherwise. "READ_RECORD" reads the next entry in the file. By testing for EOF at the beginning of the loop, the case of a zero-length file is properly handled.

4.4 DO Loops

A do loop facility is provided by STOIC for indexing through a sequence of words. There are two forms of do loop:

    HIGH LOW DO WORD1 WORD2 ... WORDN LOOP
    HIGH LOW DO WORD1 WORD2 ... WORDN INCR +LOOP

The limits "HIGH" and "LOW" (the top two stack entries) are compared. If "HIGH" is less than or equal to "LOW", control passes to the word following "LOOP" or "+LOOP". Otherwise, the sequence WORD1, WORD2, ... is executed. "LOOP" causes the lower limit ("LOW") to be incremented by 1 and compared to the upper limit ("HIGH"). If "LOW" is equal to or greater than "HIGH", the loop is terminated. Otherwise another iteration is performed.

"+LOOP" is identical to "LOOP" with the exception that "LOW" is incremented by the word on the top of the stack ("INCR"). "INCR" is normally a positive number.

Within the range of the loop, the current value of the loop index is available by using the word "I". If do loops are nested, "I" always contains the value of the innermost index. The next outer indices are available using the words "J" and "K". The word "I'" is used to obtain the value of "HIGH"+"LOW"-I-1. This is used to run an index backwards from "HIGH"-1 to "LOW". The words "J'" and "K'" are similarly defined.

When parentheses are nested with "DO" loops, they count as one level of indexing. "I" used within the range of a parenthesis iteration will return the current value of the iteration count (which runs from its initial value downwards to one).
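The loop rules above can be restated as a short Python sketch. It is a model of the described behavior, not STOIC code, and it assumes that "LOW" in the formula for "I'" means the initial lower limit and that "INCR" is positive. The function names `do_loop` and `paren_loop` are hypothetical.

```python
def do_loop(high, low, incr=1):
    """Yield (I, I') for each pass of HIGH LOW DO ... LOOP or INCR +LOOP.

    Assumption: incr is positive, as the text says it normally is.
    """
    if incr <= 0:
        raise ValueError("this sketch only models a positive INCR")
    index = low
    # No pass is made when HIGH <= LOW; the loop ends once index >= HIGH.
    while index < high:
        yield index, high + low - index - 1   # I' = HIGH + LOW - I - 1
        index += incr


def paren_loop(count):
    """Yield the value "I" reports on each pass of COUNT ( ... )."""
    while count > 0:          # zero or negative counts execute nothing
        yield count
        count -= 1


print([i for i, _ in do_loop(5, 0)])            # [0, 1, 2, 3, 4]
print([rev for _, rev in do_loop(5, 0)])        # [4, 3, 2, 1, 0]
print([rev for _, rev in do_loop(24, 0, 4)])    # [23, 19, 15, 11, 7, 3]
print(list(paren_loop(3)))                      # [3, 2, 1]
```

With an increment of 4, the reversed index does not land on the lower limit, because "I'" is computed from "HIGH"-1 rather than from the last index actually visited.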
The word "EXIT" causes the innermost loop in which it is embedded to unconditionally terminate on the next cycle, whether a do loop or a parenthesis loop.

The word "LAST_I", if executed immediately after leaving a loop, will push onto the stack the value of "I" at the time the word "EXIT" was executed. If the word "EXIT" was never executed, "LAST_I" will push the value of "HIGH".

EXAMPLES:

    0> 5 0 DO I = LOOP
    0 1 2 3 4

    0> 4 0 DO 4 0 DO J 4 * I + = LOOP CR LOOP
    0 1 2 3
    4 5 6 7
    8 9 10 11
    12 13 14 15

    0> 5 0 DO I' = LOOP
    4 3 2 1 0

    0> 0 21 1 DO I + DUP = 2 +LOOP DROP
    1 4 9 16 25 36 49 64 81 100

When using "I'" (or "J'" or "K'") in conjunction with "+LOOP", "HIGH" should be replaced by "HIGH"-"INCR"+1, if it is desired to produce the same set of indices as with "I".

EXAMPLES:

    0> 24 0 DO I = 4 +LOOP
    0 4 8 12 16 20

    0> 24 0 DO I' = 4 +LOOP
    23 19 15 11 7 3

    0> 24 4 - 1 + 0 DO I' = 4 +LOOP
    20 16 12 8 4 0

Iterations may be nested subject to the same restrictions under which conditionals may be nested.

5.0 RECURSION

Since STOIC is a stack oriented language, recursion is handled quite easily. The word "EXEC" takes as an argument on the stack the address of a word to be executed and calls it. By placing the address of the calling routine itself on the stack, a recursive call is generated.

EXAMPLE:

    0 'SELF VARIABLE
    'FACTORIAL : DUP 1 NE IF DUP 1- SELF @ EXEC * THEN ;
    () FACTORIAL SELF !

This defines a recursive factorial routine.

6.0 USING STOIC FROM THE KEYBOARD

When typing in a command from the keyboard, the rubout key may be used to delete the last character; typing a CTRL-U deletes the entire command line.

When activated, STOIC types a prompt message consisting of the current nesting depth (see below) followed by "> " to indicate that it is awaiting keyboard input. At this point the operator types in a command line. As soon as RETURN is typed, STOIC compiles the command line and, in the absence of compilation errors, executes it.
Upon completion of command execution, the compiled code from the last command is discarded and STOIC again types its prompt message and waits for the next command line.

6.1 Comments

The word "%" appearing on a line causes all text up to and including a second appearance of the word "%" to be ignored. If there is no second occurrence of the word "%", the remainder of that line will be ignored. Note that the "%", to be recognized as a word, must be preceded and followed by space or tab.

EXAMPLES:

    X @ Y ! % this is a comment
    X @ % this is a comment %

6.2 Nesting Depth And Continuation Lines

STOIC maintains a nesting depth which is used for syntax checking and to determine when a multi-line command has been completed and is ready to execute. Initially, the nesting depth is set to zero; it is incremented whenever any of the following words are encountered during compilation:

    IF  ELSE  (  BEGIN  DO  :

The nesting depth is decremented by the following words:

    THEN  ELSE  )  END  LOOP  +LOOP  ;
    REPEAT (decrements nesting depth by 2)

If the nesting depth ever becomes negative, the fatal error "SYNTAX ERROR" is given. A "SYNTAX ERROR" is also generated if the nesting depth is non-zero either at the beginning or at the end of a colon definition.

After compiling a line, STOIC checks the nesting depth; if it is zero, the line is executed; if non-zero, it continues compilation on the next line. For example:

    0> 3 0 DO
    1> 2 0 DO
    2> I =
    2> LOOP
    1> CR
    1> LOOP
    0 1
    0 1
    0 1
    0>

Thus, the execution of the do loop is automatically postponed until the nesting depth returns to zero, i.e. when the "LOOP" matching the first "DO" is encountered. Similarly, a multi-line colon definition is extended to include all words up to the matching ";".

Execution of compiled code may be postponed even if the nesting depth is zero by using the word "^". If "^" is used anywhere on a line, compilation will unconditionally continue on to the next line.
This feature may be used as follows:

    0> 5 0 DO I = LOOP ^
    0> 5 0 DO I = LOOP
    0 1 2 3 4 0 1 2 3 4
    0>

A more reasonable application of "^" involves the use of string literals. Since compiled code outside of a colon definition is discarded after execution, string literals typed in on one line are not accessible to subsequent lines unless "^" is used to force the compiled string literal to be saved. For example:

    0> "THIS IS A LONG MESSAGE WHICH USES THE ENTIRE LINE." ^
    0> MSG
    THIS IS A LONG MESSAGE WHICH USES THE ENTIRE LINE.
    0>

NOTE: The "^" feature described in the preceding paragraphs has not yet been implemented in VMS STOIC.

6.3 Repeating The Last Command Line

Typing a line feed causes STOIC to recompile and re-execute the last command line executed.

NOTE: The LINE FEED word is not yet implemented in VMS STOIC.

EXAMPLES:

    0> 2 2 + =
    4
    0> (line feed typed)
    4
    0> (line feed typed)
    4

6.4 Error Handling

The following words are used to handle fatal errors in STOIC:

    ABORT    Clears the parameter, return, and loop stacks, resets the nesting depth to zero, and forces control to return to the keyboard. "ABORT" is invoked by CTRL-C.
    ERR      Types the string at the top of the stack followed by the name of the word last scanned from the input stream, and, if input is not coming from the keyboard, the line last compiled from the input file. "ABORT" is then invoked to reset STOIC.
    WHERE    Types the address of the word named by the string on top of the stack.
    WHAT     Types the name of the last word whose address is at or before the address on the top of the stack.

Also, the word ERROR_TRACE types all words entered, in reverse order.

7.0 DEFINING CONSTANTS, VARIABLES, AND ARRAYS

7.1 Constants

A constant is a dictionary entry which causes a 32-bit integer to be pushed on the parameter stack. To define a constant, the word "CONSTANT" is used:

    VALUE 'NAME CONSTANT

Here, "VALUE" is the number on the top of the stack and "NAME" is the name to be assigned to the constant.
EXAMPLE:

    5 'NPTS CONSTANT

This sets up a dictionary entry with name "NPTS". Executing the word "NPTS" will cause a 5 to be pushed on the stack.

7.2 Variables

A variable is a dictionary entry which contains a 32-bit integer as its value. When executed, it causes the address of its value to be pushed on the parameter stack. Variables are defined as follows:

    VALUE 'NAME VARIABLE

"VALUE" is the number on the top of the stack and "NAME" is the name to be assigned to the variable. "VALUE" is used to set the initial value of the variable.

EXAMPLE:

    100 'X VARIABLE

This defines a variable "X" with an initial value of 100.

7.3 Arrays

7.3.1 Defining Arrays - While STOIC does not have a built-in array handling facility, its ability to perform address arithmetic makes subscripting possible.

There are several methods for setting aside storage for an array. The simplest is to use the word "ARRAY":

    LENGTH 'NAME ARRAY

This defines and zeroes an array whose length (in 32-bit words) and name are specified. The array is just a variable with extra storage locations reserved. Referencing an array thus causes the address of the zeroth element to be pushed.

EXAMPLE:

    100 'BUFFER ARRAY

This defines and zeroes a 100-word array named "BUFFER".

A convenient technique for defining small arrays is:

    0 '3VEC VARIABLE 1 ,D 2 ,D

This definition sets up a 3-vector called "3VEC" whose initial value is (0,1,2). The word ",D" takes a number from the top of the stack and appends it to the end of the data area, automatically extending the length of the most recent definition. This serves the dual functions of reserving space in the dictionary and initializing the array.

The contents of an array may be initialized to zero or any constant value using the following words:

    FILL     Fills an array whose address is at top-2 and has length (in words) at top-1 with the constant value at top.
    0FILL    Zero fills an array whose address is at top-1 and has length (in words) at top.
EXAMPLE:

    100 'X ARRAY X 100 -1 FILL

This defines a 100-word array "X" which is then filled with -1's. Equivalently, the same action is performed by:

    -1 'X VARIABLE 99 ( -1 ,D )

7.3.2 Referencing Array Elements - To reference an element of an array, all that is necessary is to add an appropriate offset to the address of the zeroth element. Note that since the first element has offset zero, the Nth element has offset N-1.

EXAMPLE:

    10 'X ARRAY
    10 0 DO I X I 4* + ! LOOP

The above code defines a 10 element array "X" and fills it with the numbers 0 to 9. Note that since addresses are in bytes, the index must be multiplied by 4.

Multidimensional subscripting is handled in a similar fashion.

EXAMPLE:

    100 'X ARRAY
    10 0 DO 10 0 DO I J + J 10 * I + 4 * X + ! LOOP LOOP

This example sets up a 100 element array "X" which is treated as a 10 by 10 matrix and then stores I+J in the element (I,J).

8.0 THE DICTIONARY

The dictionary starts in low core and grows upward toward the top of memory. Each definition, as it is made, is appended to the end of the dictionary. The variable ".D" is a pointer to the first free word following the end of the dictionary.

The address of a dictionary entry may be obtained by invoking the following word:

    () name    Pushes on the stack the address of the parameter field of the named word.

This word makes it possible to modify the value of a constant.

EXAMPLE:

    0> 100 'N CONSTANT
    0> 200 () N !

The first line defines a constant "N" whose value is 100. The next line resets the value to 200. This feature should not be used in any program which will subsequently be committed to read-only memory, or saved as a core image using the "IMAGE" word.

The following word is used to obtain the address of a word in the dictionary given its name:

    ADDRESS    Takes a string argument on the stack which is looked up in the dictionary. If not found, an error message is typed. If found, the address of the parameter field of the word is returned on the stack.
NOTE: "ADDRESS" is not yet implemented in VAX STOIC.

EXAMPLE:

    'NEWVAL : ADDRESS 200 SWAP ! ;

This defines a word "NEWVAL" which sets the parameter field of its argument to 200. Thus,

    'N NEWVAL

is equivalent to the previous example.

9.0 EXECUTING PROGRAMS

STOIC programs are stored as ASCII source files. Executing a file causes STOIC to accept its commands from the file instead of the keyboard. File execution is controlled by the following words:

    LOAD    Execute a program; address of name is on top of stack.
    LIST    Lists the contents of a program on the terminal; address of name is on top of stack.
    ;F      Terminates execution of the current file and returns control to the calling input file or to the system keyboard. (If ";F" is typed on the keyboard, it causes STOIC to terminate.)

STOIC maintains a "file stack". Each time LOAD is executed a new file is opened; when STOIC finishes compiling and executing the line which contains the word "LOAD", compilation will continue from the newly opened file. Upon reaching end of file or the word ";F", the file is closed and popped from the file stack. Compilation continues from the file now on top of the file stack. Currently the file stack has a capacity of four levels. Upon invoking STOIC, the file stack has only one entry, namely the console.

EXAMPLES:

    'DEFS LOAD

This executes a file called "DEFS".

    'LDDEFS : . = 'DEFS LOAD ;

This defines "LDDEFS" which types out the current value of the dictionary pointer and executes "DEFS".

    'A LOAD 'B LOAD 'C LOAD

This command line causes the contents of file "C" to be executed, then file "B", then file "A".

10.0 STRING HANDLING

STOIC strings are stored in memory in the following format:

    STRING

The string may include nulls. The length field must be less than 65,536 decimal.

EXAMPLE:

    <4>ABCD

This is the internal representation of the string "ABCD".

Executing a string literal causes a pointer to the length word of the string to be pushed on the stack.
This pointer may be converted into byte-pointer, byte-count form by using the word "COUNT" described below.

    COUNT    Takes a pointer to a string as input on the stack and leaves the length of the string on the top of the stack, and a byte pointer to the first character following the length byte at top - 1.
    MSG      Is equivalent to "COUNT TYPE". Given a pointer to a string at top, it causes the string to be typed.
    ASCII    Converts the next word in the input stream to a single character and pushes its ASCII value onto the stack.

EXAMPLES:

    'LARK : 'NONSENSE MSG ;

Executing "LARK" causes "NONSENSE" to be typed.

    DECIMAL ASCII A =

This causes a 65 (the ASCII value of "A", in decimal) to be typed out.

10.1 String Variables

String variables are stored in memory in the following format:

    STRING

A string variable is defined as follows:

    80 'NAME SVARIABLE

This defines a string variable of maximum length 80 and current length 0. "NAME" causes a pointer to the current-length word to be pushed onto the stack.

    MOVE_STRING     Causes the string whose address is at top-1 to replace the contents of the string variable whose address is at top.
    .MOVE_STRING    Causes the string whose starting address is at top-2 and whose byte count is at top-1 to replace the contents of the string variable whose address is at top.
    STRAP           Causes the string whose address is at top-1 to be appended to the string variable whose address is at top.
    .STRAP          Causes the string whose starting address is at top-2 and whose byte count is at top-1 to be appended to the string variable whose address is at top.
    STAB            Causes the byte (low-order 8 bits of the 32-bit longword) at top-1 to be appended to the string variable whose address is at top.
    .STREQ          Compares the string with byte count at top and address at top-1 to the string with byte count at top-2 and address at top-3; returns -1 if the strings are equal or 0 if they are not.
    SEARCH_STRING   Searches the string with byte count at top and address at top-1 for occurrences of the string with byte count at top-2 and address at top-3. Returns a true or false value at top indicating whether the string was found or not, a byte count and address for the remainder of the source string at top-1 and top-2, and a byte count and address for the unchanged pattern string at top-3 and top-4. Note that SEARCH_STRING could be immediately executed again after the logical value is popped off.

EXAMPLES:

    0> 80 'SBUF SVARIABLE
    0> "HELLO" SBUF MOVE_STRING
    0> SBUF MSG
    HELLO
    0> " - GOODBYE" SBUF STRAP
    0> SBUF MSG
    HELLO - GOODBYE
    0> 33 SBUF STAB    (33 decimal is ASCII !)
    0> SBUF MSG
    HELLO - GOODBYE!
    0> " " SBUF MOVE_STRING
    0> SBUF MSG

    0> "L" COUNT SBUF COUNT SEARCH_STRING =
    -1
    0> TYPE CR TYPE
    LO - GOODBYE
    L

10.2 Number Output Conversion

STOIC performs number conversion using a small but powerful set of words which permit a wide variety of output formats to be generated. Number conversion proceeds from right to left, converting one digit at a time. The conversion takes place in the current radix. Digits are stored in the number conversion stack, a string variable which is set up by "<#".

    RADIX    Variable which contains current input and output radix.
    <#       Initiate number conversion.
    #PUT     Output a character from top of stack.
    #A       Convert the number on the top of the stack to an ASCII digit.
    #        Compute the next digit and output it. An integer is input from the top of the stack and divided by "RADIX". The remainder is converted to a digit and output; the quotient remains on the stack.
    #S       Compute digits and output them until the remaining number is zero. "#S" always outputs at least one digit.
    #>       Terminate the number conversion, leaving a byte count on top of stack, a byte pointer at top - 1.
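The right-to-left conversion performed by these words can be illustrated with a small Python model. This is a sketch under stated assumptions, not the STOIC implementation: the class `NumberConversion` and its method names are hypothetical, a Python list stands in for the number conversion stack, and only non-negative numbers in radix 2 through 16 are handled.

```python
DIGITS = "0123456789ABCDEF"


class NumberConversion:
    """Hypothetical model of the words <#, #, #S, #PUT and #>."""

    def __init__(self, radix=10):      # plays the role of "<#" plus RADIX
        self.radix = radix
        self.chars = []                # built right to left

    def put(self, code):               # "#PUT": output one character code
        self.chars.insert(0, chr(code))

    def digit(self, number):           # "#": output one digit, keep quotient
        quotient, remainder = divmod(number, self.radix)
        self.chars.insert(0, DIGITS[remainder])
        return quotient

    def digits(self, number):          # "#S": always at least one digit
        while True:
            number = self.digit(number)
            if number == 0:
                return number

    def finish(self):                  # "#>": hand back the finished text
        return "".join(self.chars)


def dollars(cents):
    """Format a count of cents with two digits after the period."""
    conv = NumberConversion(radix=10)
    rest = conv.digit(cents)           # units of cents
    rest = conv.digit(rest)            # tens of cents
    conv.put(46)                       # ASCII 46 is "."
    conv.digits(rest)                  # remaining dollars
    conv.put(36)                       # ASCII 36 is "$"
    return conv.finish()


print(dollars(100))    # $1.00
print(dollars(1000))   # $10.00
```

Because each character is placed to the left of what is already there, the words are issued in the reverse of reading order: cents first, dollar sign last.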
Higher level words are used to provide number output in a default format:

<#>     Convert the signed number on the top of the stack and leave byte count and byte pointer on the stack. "<#>" is defined as:

             '<#> : DUP LTZ IF 55 #PUT THEN #> ;

=       Convert and output the number on the top of the stack. (signed)

?       Convert and output the number addressed by the top of the stack. (signed)

EXAMPLE:

     '$= : <# # # 46 #PUT #S 36 #PUT #> TYPE ;

     0> 100 $=
     $1.00
     0> 1000 $=
     $10.00

"$=" types the top of the stack in an unsigned dollar and cents format. Number conversion is initiated by "<#", the cents are output by the two "#"'s, a period is output by "46 #PUT", the remaining dollars are output by "#S", and the dollar sign is output by "36 #PUT". "#>" terminates the conversion process and leaves a byte pointer and byte count for the resultant string on the stack as input to "TYPE", which outputs the string.

Justification of numeric output may be achieved as follows:

     '4= : <#> 4 OVER - SPACES TYPE ;

The word "4=" will type out numbers right justified in a four column field by typing a variable number of spaces preceding the number.

10.3 TYI And TYO

TYI causes a character to be input from the terminal and pushed onto the stack as the low-order byte of a longword with zero fill.

TYO causes the low-order byte of the longword at top to be typed on the terminal.

11.0 VOCABULARIES

In addition to the parameter stack, STOIC uses another stack called the vocabulary stack. This stack is used to control dictionary lookup operations. When STOIC looks up a word in the dictionary, it starts its search at a location whose address is stored on the top of the vocabulary stack. If the search fails, it searches again using the next to top of the vocabulary stack, etc., until the bottom of the vocabulary stack is reached. If the name is still not found, the search fails.

The dictionary may have any number of branches.
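The search order through the vocabulary stack can be pictured with a short model. The following Python sketch is illustrative only. Each vocabulary is represented as a plain dictionary, which is an assumption and not a description of how STOIC stores its entries, and both vocabularies and the definition strings are hypothetical.

```python
# Illustrative model of dictionary lookup through the vocabulary stack.
# Assumptions: vocabularies are plain dicts; both branches and the
# definition strings are hypothetical.

def lookup(name, vocabulary_stack):
    """Search from the top of the vocabulary stack (end of the list) downward."""
    for vocabulary in reversed(vocabulary_stack):
        if name in vocabulary:
            return vocabulary[name]
    raise KeyError(f"{name} not found")    # the search fails


main_branch = {"SHOW": "SHOW from the main branch",
               "DUP": "DUP from the main branch"}
side_branch = {"SHOW": "SHOW from the side branch"}

vocabulary_stack = [main_branch]
print(lookup("SHOW", vocabulary_stack))    # found in the main branch

vocabulary_stack.append(side_branch)        # push a second vocabulary
print(lookup("SHOW", vocabulary_stack))    # the side branch is searched first
print(lookup("DUP", vocabulary_stack))     # falls through to the main branch
```

The same name resolves to different entries depending only on which vocabulary is on top of the stack at the time of the search.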
Different words having exactly the same names may be resolved if they are on separate branches of the dictionary. A branch of the dictionary is called a vocabulary. When STOIC defines a new word, the dictionary entry is appended to the branch pointed to by the variable "CURRENT".

The kernel has two vocabularies, called "STOIC<" and "ASSEMBLER<". Initially, the vocabulary stack contains only the address of "STOIC<". "ASSEMBLER<" is used when making code definitions and contains special words used only by the STOIC assembler. "STOIC<" is the main branch of the dictionary, and "ASSEMBLER<" is a side-branch.

A new branch of the dictionary is created as follows:

     'NAME BRANCH

Referencing a vocabulary causes its address to be pushed on the vocabulary stack. The new branch is attached at the end of the existing branch pointed to by "CURRENT".

>       Discards the top of the vocabulary stack.

Attempting to pop the last entry off the bottom of the vocabulary stack causes the error: "VOCABULARY STACK EMPTY".

To set "CURRENT" to point to a new vocabulary, the following word is used:

DEFINITIONS     Sets "CURRENT" to link new definitions onto the same branch of the dictionary as is pointed to by the top of the vocabulary stack.

The normal procedure when setting up a new vocabulary is as follows:

1) Define the vocabulary using "BRANCH".
2) Push the address of the vocabulary on the vocabulary stack by referencing the vocabulary.
3) Set "CURRENT" using "DEFINITIONS".
4) Define all the words in the vocabulary.
5) Pop the vocabulary stack using ">".
6) Reset "CURRENT" using "DEFINITIONS".

EXAMPLE:

     'VOCAB< BRANCH
     VOCAB< DEFINITIONS
     . . .
     > DEFINITIONS

This defines a new branch of the dictionary called "VOCAB<".

Entire sections of the dictionary may be deleted by using the following words:

     'NAME MODULE
     NAME FORGET

This discards the named dictionary entry and all subsequent entries.
If code is defined into more than one vocabulary in the intervening space, a MODULE should be defined after each switch of "DEFINITIONS", and the MODULEs should be forgotten in reverse order, with the active vocabulary for FORGET the same as the active vocabulary at the MODULE definition. Note that the name given to FORGET is not quoted.

"FORGET" is useful when trying out definitions from the keyboard. The procedure is to first make a module definition:

     'TEST MODULE

Next, test definitions are made (typically by loading a program). If they are unsuccessful, "TEST FORGET" will delete them from the dictionary, and the process is repeated. For convenience, the module definition may be placed at the beginning of the program.

"FORGET" may also be used to provide an overlay facility. Say that a program has been loaded. It may then be deleted by forgetting the initial module definition, after which the dictionary space will be free to load another program.

APPENDIX A

THE LOOP STACK

The loop stack is a secondary stack which contains the high values, low values, and current index values for any DO loops or ( ) loops currently being executed. It also can be used by the program as a second stack, as long as care is taken to restore the stack when the end of any loop currently being executed is reached.

A.1 VALUES KEPT ON THE LOOP STACK

When either of the words "DO" or "(" is reached, STOIC pops the high value and low value off the parameter stack and places them on the loop stack, followed by the starting index (which will be the same as the low value initially). The loop stack then contains:

     index
     low value
     high value

When the "LOOP", "+LOOP", or ")" is reached, the index is retrieved from the loop stack, the increment is determined and added (this will be 1 unless +LOOP is used, in which case it will be the number on the top of the parameter stack), and the index is compared to the high value.
If the index is equal to or greater than the high value the loop is terminated; otherwise the index is replaced and the loop continued. The values of the high value, the low value, and the index are always available to the program (the index can be retrieved using the word "I").

A.1.1 Words Referencing The Loop Stack

Four words are available for directly addressing the loop stack:

MARK    Pushes the parameter stack pointer onto the loop stack. The original value of the parameter stack pointer can be restored using the word "RESTORE".

NOTE    Pops a value off the parameter stack, and pushes it onto the loop stack. The value can be retrieved using the word "RECALL".

RESTORE Pops the loop stack and moves the obtained value into the pointer to the parameter stack.

RECALL  Pops the loop stack and pushes the obtained value onto the parameter stack.

The "MARK" and "RESTORE" words are useful if the program is using a procedure which will leave unwanted values on the parameter stack. Before use of the procedure, "MARK" would be used to save the stack pointer. After the procedure, "RESTORE" would return the parameter stack to its original state. "NOTE" and "RECALL" are useful for saving values that would be "in the way" were they put on the parameter stack.

In any use of the loop stack, care should be taken not to undesirably affect loops by replacing loop values with parameter stack pointers or program data. One rule to follow is this: treat all "MARK"/"RESTORE" and "NOTE"/"RECALL" combinations as "DO"/"LOOP" combinations and follow the same nesting rules. This will ensure that if values are pushed on the loop stack, they are removed before the current loop reaches the end and needs its values back.

Another thing to be careful of: if operations involving the loop stack are in use, the words "I", "J", and "K" may behave unreliably due to values on the loop stack that do not directly affect the loops.
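The effect of an extra loop stack entry on the index words can be seen in a small model. The following Python sketch is illustrative only: the class and method names are hypothetical, and it assumes that "I" reads the top entry of the loop stack while "J" and "K" read the 4th and 7th entries. "MARK" and "RESTORE" are not modeled.

```python
# Illustrative model of the loop stack. Assumptions: all Python names,
# and that I, J and K read entries 1, 4 and 7 counted from the top.

class LoopStack:
    def __init__(self):
        self.items = []                    # the top of the stack is the list end

    def enter_loop(self, low, high):       # DO or (
        self.items += [high, low, low]     # high, low, then the starting index

    def next_iteration(self, increment=1):   # LOOP, +LOOP or )
        index = self.items.pop() + increment
        if index >= self.items[-2]:        # compare with the high value
            del self.items[-2:]            # finished: drop low and high values
            return False
        self.items.append(index)           # replace the index and continue
        return True

    def note(self, value):                 # NOTE
        self.items.append(value)

    def recall(self):                      # RECALL
        return self.items.pop()

    def entry(self, position):             # position 1 is the top of the stack
        return self.items[-position]

    def i(self):
        return self.entry(1)

    def j(self):
        return self.entry(4)

    def k(self):
        return self.entry(7)


def indices(low, high, increment=1):
    """Index values seen by the body of a single loop."""
    loop_stack = LoopStack()
    loop_stack.enter_loop(low, high)
    seen = [loop_stack.i()]
    while loop_stack.next_iteration(increment):
        seen.append(loop_stack.i())
    return seen


print(indices(0, 3))                       # [0, 1, 2]

loop_stack = LoopStack()
loop_stack.enter_loop(0, 3)                # outer loop
loop_stack.enter_loop(10, 12)              # inner loop
print(loop_stack.i(), loop_stack.j())      # 10 0

loop_stack.note(99)                        # extra value above the inner loop
print(loop_stack.i(), loop_stack.j())      # 99 12

loop_stack.recall()                        # remove it before the loop ends
print(loop_stack.i(), loop_stack.j())      # 10 0
```

While the noted value is present, every loop entry sits one position deeper, so the index words return whatever happens to occupy positions 1, 4 and 7.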
If the program needs to recall index values, their positions on the loop stack should first be determined. They can then be obtained by pushing dummy values onto the loop stack to bring them into the 4th or 7th position, where they can be referenced using "J" and "K" respectively.

APPENDIX B

CODE SAVING TECHNIQUES

Experience writing programs in STOIC will demonstrate that using the file loading word, "LOAD", each time one wants to run a STOIC program is annoyingly slow and generally unsatisfactory. Fortunately, STOIC has two ways of storing compiled code so that it can be used quickly. The first of these, the "IMAGE" word, makes .EXE programs that can be run directly and will start up with the STOIC code already loaded. The second stores code that can be quickly read into an existing STOIC image.

B.1 CREATING CORE IMAGES

When a STOIC program is "put into production", users should not be expected to have to load the files containing the program before they can use it; in fact, the best way to make the program available is to require a minimum of action to run the program. The "IMAGE" word, whose syntax is:

     filename IMAGE

(the file name should be quoted) saves the current state of STOIC into the file given. The file type is usually EXE. The file is started using the DCL command:

     $filename STOIC-input-line

Anything that follows the file name in this command is passed to STOIC as the first command line to execute (possibly starting the routine that the image is to be used for). If nothing is present after the file name, the image starts up with the normal "Welcome to STOIC" prompt and awaits user input as usual.

B.1.1 Creating Save Files Using SAVE And XLOAD

Another way of quickly saving and retrieving STOIC object code is by use of the two words:

     filename SAVE
     filename XLOAD

(where the file names are quoted). SAVE will create the specified file and write the "current state of STOIC" into it.
XLOAD will retrieve the state in the specified file. The file type does not matter, but it can be helpful to use a consistent file type (such as .STO) for all STOIC save files.

Some notes about SAVE and XLOAD are in order. They are primarily useful for development of new STOIC programs, as they are more space-efficient than creating new images. The reason for their reduced space requirements is that they do not save all STOIC code defined, as the IMAGE word does; they save all code and data that has been loaded into the current image only. Since this means that the loaded code is dependent on the code in the image being the same as when the code was saved, a "time-stamp" has been added to the save files to prevent the strange errors that would result when a save file was loaded into a STOIC image to which revisions had been made. If XLOAD is done on a file with a different time-stamp from that of the image currently running, the file is rejected and a message is displayed showing the two time-stamps.

There is one exception to the time-stamp rule: the stamping is done when the definitions file (RDEF) is loaded. This means that if a SAVE is done and the current image is a STOIC kernel, it will be possible to XLOAD the file to any STOIC kernel, but not to a STOIC image saved in any other way. This is probably unwise anyway, but if you do save files from a kernel image, be sure to reLOAD them if you make changes to the MACRO source files and re-link.

The time-stamp for the current image can be displayed at the terminal by typing "REVISION" (after RDEF is loaded). The stamp on the current image can be changed to the current date and time by typing SET_REVISION.

APPENDIX C

DEBUGGING TECHNIQUES

The following are techniques which may prove useful in debugging new definitions:

1) It is usually more efficient to make a new definition in a file using the editor rather than from the keyboard if it is more than one line long.
"FORGET" is then used to remove the last version of the definition before trying again.

2) To test a word, merely feed in arguments on the stack, execute the word, and examine the results using "=" and "?". It can be helpful to put an obvious value such as "12345" on the stack before pushing arguments to the word and calling it. This allows you to determine whether the word left any unwanted values on the stack, without causing an access violation (and the resulting "ABORT" and stack reset) when the stack underflows.

3) If a word fails, type in the words which make it up, one at a time, examining the stack as you go along and restoring it by typing the parameters back in, in reverse order.

4) After executing a word and examining the results on the stack, type an extra "=" to make sure the stack is empty. You should get an access violation indicating that the stack is empty. Similarly, the loop stack should be checked by typing "RECALL"; this should generate a similar access violation.

5) Keep track of the radix you are using. This is the source of frequent errors. Within a file being executed, save the radix, set it to the value the program expects, and when done, restore the saved value. For example:

     RADIX @ DECIMAL
     . . .
     RADIX !
     ;F

This saves the current radix on the stack, and sets the radix to decimal. Upon termination, the old radix is restored (assuming the stack has not been disturbed). Saving the old radix in a variable can be safer.

6) The proper selection of lower level words has an enormous effect on all subsequent higher level definitions. Thus it pays to design the lower level words very carefully.

7) When debugging a program, test all lower level words thoroughly before testing the words which call them.

8) Be especially careful with any word which modifies memory; make sure the word is modifying only those locations you intend, and not part of the program.
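The marker-value check described in technique 2 can be sketched as follows. This Python model is illustrative only: the parameter stack is represented as a list, and the function names and the three sample words are hypothetical.

```python
# Illustrative model of testing a word with an obvious marker value
# beneath its arguments. All names here are hypothetical.

SENTINEL = 12345


def run_with_sentinel(word, arguments, result_count):
    """Run `word` on a fresh stack with the sentinel beneath its arguments.

    Returns (results, clean): the top `result_count` values, and True only
    if exactly the sentinel remains beneath them.
    """
    stack = [SENTINEL, *arguments]
    word(stack)
    split = max(len(stack) - result_count, 0)
    return stack[split:], stack[:split] == [SENTINEL]


def add(stack):              # well behaved: two values in, one value out
    b, a = stack.pop(), stack.pop()
    stack.append(a + b)


def leaky_add(stack):        # leaves a stray copy of an argument behind
    b, a = stack.pop(), stack.pop()
    stack.extend([a, a + b])


def greedy_add(stack):       # pops one value more than it should
    b, a, _extra = stack.pop(), stack.pop(), stack.pop()
    stack.append(a + b)


print(run_with_sentinel(add, [2, 3], 1))         # ([5], True)
print(run_with_sentinel(leaky_add, [2, 3], 1))   # ([5], False)
print(run_with_sentinel(greedy_add, [2, 3], 1))  # ([5], False)
```

In the third case the word consumes the sentinel instead of running off the bottom of the stack, so the fault is detected without an underflow.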