FORTH

David Millington presents a program to develop Forth on your Spectrum.

This version of the computer language Forth will run on the 48K Spectrum, but because of many of the POKE addresses used, it is not compatible with Microdrives and the Interface 1 operating system. It consists of two distinct parts: a compiler and editor written in Basic and a set of Forth words in machine code.

The main advantage of Forth as a language is its very fast execution speed and this implementation will run typically 50 times faster than Basic. The speed increase is due to the fact that a Forth program is converted directly into machine code, and the modular nature of Forth makes the process of compilation very easy. Since the compiler in my version is in Basic, it compiles quite slowly, but the code produced will run as fast as commercial Forths.

The theory of simple programming in Forth has been covered in previous issues of Your Computer, but I will recap the simple concepts for those who are new to the language. My implementation is mostly standard Forth and includes all the usual structures, but it differs in its use of variables and strings and is less flexible in the methods of defining words. However, it should serve as an introduction to the advantages of the language and can be used for applications such as graphic games, as I hope to show in a future article.

[When you load the program, you] should be presented with the title, and after a short pause a question mark prompt and a flashing blank cursor will appear at the bottom, awaiting your commands.

The main feature of Forth is the stack, which is simply a pile of numbers. A number can be added to the top of the stack and later the top number can be removed. These two simple operations are the basis of Forth. You should now type in 23 and press ENTER, and this number will be placed on top of the stack. If you now type a full stop and ENTER, the top number on the stack will be removed and printed.
The full stop is an example of a Forth word, many of which do something to the stack, as shown. The word + will fetch two numbers from the stack, add them together, and place the sum back on top. You should now be able to use Forth to add together two numbers and print the result. One way is to enter

    23 45 + .

and the answer 68 will appear. This also illustrates how several items can be entered together, separated by spaces.

Similarly the words -, /, * are available for arithmetic, and complex expressions can be evaluated. Consider the Basic statement

    PRINT (5+11)/(5-3)

The equivalent in Forth is

    5 11 + 5 3 - / .

Both will yield the answer 8. If the Forth version seems strange, study figure 3, which details the effect upon the stack as each command is executed. Forth simply requires each operation to be placed after the operands instead of in between, whether they are numbers or other expressions. This is known as postfix notation, and it automatically removes the need for brackets.
___________________________________________________________

Figure 3.

Expression    5    11    +     5     3     -     /     .
- - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Stack                                3
                   11          5     5     2
              5    5     16    16    16    16    8
___________________________________________________________

You should experiment with various expressions until you are sure what is happening and how the stack is being used. Sooner or later you will discover that Forth operates only on integers. Try

    20 6 / .

and the answer 3 is produced. The range of numbers which can be handled is -32768 to 32767, although any between 32768 and 65535 can be entered and they will be converted to negatives. If you try to enter anything outside this range, then the error message 'Number out of range' will appear. A full list of the system's error messages is given in figure 4 for reference.

[Figure 4, and the other reference tables, can be found at the end of this text file.]
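
The stack behaviour described above can be modelled in a few lines of Python. This is only a sketch for following the reasoning, not part of the Forth system itself: the names wrap16 and run are made up for the illustration, and the division is assumed to round downwards, which the article only demonstrates for positive numbers.

```python
def wrap16(n):
    """Fold an integer into the signed 16-bit range -32768..32767."""
    return (n + 32768) % 65536 - 32768


def run(line):
    """Run one line of postfix input; return (printed values, final stack)."""
    ops = {
        "+": lambda second, top: second + top,
        "-": lambda second, top: second - top,
        "*": lambda second, top: second * top,
        # Assumption: floor division; the article only shows positive operands.
        "/": lambda second, top: second // top,
    }
    stack, printed = [], []
    for token in line.split():
        if token == ".":
            # The full stop removes the top number and prints it.
            printed.append(stack.pop())
        elif token in ops:
            # Operators take the top two numbers and push the result back.
            top = stack.pop()
            second = stack.pop()
            stack.append(wrap16(ops[token](second, top)))
        else:
            number = int(token)
            if not -32768 <= number <= 65535:
                raise ValueError("Number out of range")
            # Entries from 32768 to 65535 become negative numbers.
            stack.append(wrap16(number))
    return printed, stack


print(run("5 11 + 5 3 - / ."))   # the expression from figure 3
print(run("20 6 / ."))           # integer division
print(run("65535 ."))            # converted to a negative number
```

Running it shows 8 for the figure 3 expression, 3 for the division, and -1 for 65535, matching the behaviour described in the text.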

So far we have done only simple calculations in direct mode and you are probably waiting to try a full program. The main point to realise is that a Forth program bears little resemblance to either Basic or machine code. There are no line numbers and there is no sequential order of instructions that constitute an actual program. Instead we store instructions by defining words. You have met five words so far and those which you define yourself can be used in exactly the same way.

Suppose we wish to define a word called average which will calculate the average of three numbers. Enter

    :average + + 3 / .;

There will be a delay while the word is compiled, and then the prompt and cursor will reappear. The syntax for defining a word is a colon followed by the name we wish to use followed by a sequence of operations and terminated with a semicolon. There is no space after the colon or before the semicolon. If you now enter

    10 14 3 average

then 9, the average of these three numbers, will be printed. (The full stop that prints the result is already part of the definition, so none is needed here.)

The word average is now as much a part of the computer's Forth vocabulary as the built in or 'core' words, and all are stored in an area of memory called the dictionary. You can define further words using both core words and your own, and the idea is to evolve a program consisting of nested word definitions until typically only a single word is needed to execute your program. This makes program development easier than in Basic since tasks can be subdivided and appropriate words written and tested separately. The importance of the stack becomes apparent since it is used to pass parameters to and from words, as was demonstrated with the word average.

Figure 5 lists and briefly explains most of the Forth words supported. Those already familiar with the language should be able to try some larger programs, but before embarking on anything too adventurous you will need to know how to edit your work in case of errors.
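
The idea of building a program from nested word definitions can also be sketched in Python. Again this is only a model under stated assumptions: the real compiler turns each definition into machine code, whereas this sketch simply stores the body of each word as a list and interprets it. The names define, run and words, and the extra word demo, are made up for the illustration.

```python
def define(dictionary, source):
    """Store the body of a ':name ... ;' definition under its name."""
    if not (source.startswith(":") and source.endswith(";")):
        raise ValueError("Bad line")
    name, *body = source[1:-1].split()
    dictionary[name] = body


def run(dictionary, line):
    """Execute a line, expanding defined words; return the printed values."""
    ops = {
        "+": lambda second, top: second + top,
        "-": lambda second, top: second - top,
        "*": lambda second, top: second * top,
        "/": lambda second, top: second // top,  # assumed to round down
    }
    stack, printed = [], []

    def execute(tokens):
        for token in tokens:
            if token in dictionary:
                # A user-defined word: run its stored body on the same stack.
                execute(dictionary[token])
            elif token == ".":
                printed.append(stack.pop())
            elif token in ops:
                top = stack.pop()
                second = stack.pop()
                stack.append(ops[token](second, top))
            else:
                stack.append(int(token))

    execute(line.split())
    return printed


words = {}
define(words, ":average + + 3 / .;")
define(words, ":demo 10 14 3 average;")   # a word built from another word
print(run(words, "10 14 3 average"))
print(run(words, "demo"))
```

Both calls print 9. The three numbers reach average through the stack, which is how parameters are passed between words.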
Forth is just as prone to program bugs and typing errors as Basic but, as in any compiled language, errors take a long time to correct.

There are several commands for editing and using peripherals and these are listed in figure 6. It should be noted that these are not part of the Forth language and are by no means standard in any other Forth systems, which instead use screens for editing. To indicate a system command you should begin the line with an asterisk. Enter

    *edit average

and you should find the definition brought to the bottom of the screen just as you first entered it. It can be altered using standard cursor controls and entered when finished. Alter it to

    :average + + + 4 / .;

to allow the average of four numbers to be found. On pressing ENTER you will hear a strange squeak, which is an effect of the Sinclair editor being used at high speed. After several seconds the cursor will reappear with the amendment made to average. In general, the more words you have defined, the longer the delay will be, since a lot of recompiling must be done.

Every definition you enter is stored in a source code buffer in case editing is required later. The command *list will list every definition in the buffer. However, if you type *del then the source buffer is cleared and you will be unable to edit average, although it can still be used in Forth. This explains why you cannot edit the core words.

There are Save and Load commands for both the source buffer and the Forth dictionary, again described in figure 6. After a *sload command there will be a long delay while the source code is compiled. The command *reset will delete everything and start Forth from the beginning again. If you have a ZX Printer then it can be turned on with *pr on, allowing all output to be printed. *pr off will cancel this facility.

As stated before, figure 5 contains brief explanations of the more common Forth words.
Emit is used for printing single characters, and on the Spectrum it is especially useful since it can handle the usual colour and position control characters. For instance

    16 emit 4 emit 42 emit

will print a green star.

There are two words which deal with keyboard input, but only at the single character level. Get will wait for a key or shifted key to be pressed, and returns its code on the stack. It will produce the standard key click, and when it is called rapidly, it will allow keys to repeat. The other word is key, and this works like INKEY$ in Basic. It will return the code of the key which is being pressed, or 0 if none is pressed. There is no implied wait as there is with get.

There are four words for manipulating the stack and these are illustrated diagrammatically in figure 5b. The most useful is dup, as it will duplicate the top number, allowing one copy to be used and the other preserved for later use.

In any computer language, the ability to perform repeated operations using loops, and to make decisions, is very important. You may think that the absence of line numbers and GO TO will make this difficult but, in fact, Forth provides several useful control words and these allow clearer program structures to be set up than in Basic.

There are direct equivalents of Basic's FOR-NEXT statements and these are do and loop. The actual layout of this and the other structures is shown in figure 5. The word ind will place the index counter of the loop on the stack where it can be used. In standard Forth this word is simply I, but I have altered it to avoid confusion with the variable I. The word +loop will allow steps of other than 1. The following direct lines illustrate how different step sizes and directions are catered for, and should be entered individually:

    20 1 do ind . loop
    1 20 do ind . loop
    100 1 do ind . 17 +loop
    -30 30 do ind . 2 +loop

Figure 7 contains the definition of a word called count, which uses a loop to show the speed of Forth.
Two points are shown from this listing. A Forth word definition can span several lines provided the : and ; mark the beginning and end, and also comments can be included provided they are on a separate line and surrounded by brackets. Enter the definition and execute it by typing count, and you should see the numbers from 1 to 10000 being rapidly printed in the top left of the screen. You should also try editing count to see how each line is presented separately at the bottom.

The begin-until loop will allow a block of instructions to be repeated until the condition at the end of the loop is true, and the begin-while-repeat structure will cause looping as long as the condition at the beginning is true. Both these loops will be useful in different circumstances.

The conditions are the results of the operators <, >, <> and =, which all require the two numbers to be compared to be on the stack, and they will return a true value (1) or a false value (0). Until and while both expect such a condition code to be on the stack. They use this value to decide whether to exit or to continue looping. The following line uses a loop to wait for the a key to be pressed:

    begin key a = until

The if-then-else structure will allow two different sections of code to be executed depending on whether a condition is true or false, before resuming with the normal flow of execution. The actual layout of these structures is again explained in figure 5.

It must be emphasised that while all of the control structures can be mixed and nested to any depth, they must not cross or be jumped out of other than by a normal exit. Also, all of a loop must be in the same word definition. The word ind will return the index of the inner-most loop, and will only give the correct value if it occurs in the same word definition as the start and finish of the loop. If you do cross your structures, then a crash is likely.
The ease of crashing Forth is a penalty of its high speed and closeness to machine code. You can place the following instructions in a loop if you think that it might not exit:

    key 32 = if abort then

Pressing the space key will stop the program with no ill effects, and the message Program ABORTed will appear. Abort is the one word which will safely stop execution and jump out of all the nested loops and words.

The rest of figure 7 shows some examples of word definitions to illustrate different aspects of simple programming. Type will allow you to enter a line of text onto the screen, terminated by enter. It shows a begin-until loop in action, and also illustrates the use of Get. Fill is a simple utility to fill the screen with the character of your choice. For example,

    35 fill

will fill the screen with hash signs. This may be slightly slower than you expected, but this is due to the slowness of Sinclair's print routine. The next word, square, will calculate and print the square of the number on the stack. It is called by the final word, squares, which will print a formatted table of square numbers up to any specified value.

When you have entered all these definitions, you may like to save them on tape, to try out the cassette commands. [In fact, they are already on Forth.tzx, under the name "Figure 7", following the main program.] It is important that you should experiment with Forth and the facilities of the compiler.

[The system also has some features which were not described in the article. Some of these are similar to standard Forth features, but most of them are not quite the same as FIG- or Forth-97. They're quite useful, though, so I'll describe them in short. Keep in mind that I've only gleaned all this from the program itself; it is probably accurate, but I may have made a mistake.

First up are a number of memory management words. First of these is allot.
This takes a number from the stack, and allots that number of bytes from a dedicated area between the compiled code and the stack. The base address of this area is then pushed onto the stack. You can safely use this memory to store values in, as described later. Subsequent calls to allot will reserve further memory directly following the previous one.

The word clear will clear both the stack and the allot area. After clear, the next value pushed will be pushed onto the base of the stack just as if the system had just been reset, and the next allot will allot from the start of the allot area. In effect, it's like a *reset, but without losing all your code.

If you want to use your allotted memory, or indeed any memory not part of the stack or your program, you can use ^, ?, ! and @. The ^ word pops an address from the stack, followed by a value, and stores the lower byte of that value at the address specified. For values below 256, that is the value itself. This means that

    255 23692 ^

is the same thing as

    POKE 23692,255

that is, it sets the scroll count to 255. Conversely, ? pops an address, and pushes the contents of that memory location onto the stack. Thus,

    23693 ? .

is the Forth equivalent of PRINT PEEK 23693, which prints the current permanent colours.

The words ! and @ are very similar to ^ and ? respectively, except that they work on two-byte numbers. They store and read the low byte of the value at the specified address and the high byte at the next address, just as described in chapter 25 of the Spectrum manual. For example,

    23606 @

will put the address of the character set on the stack, and

    30000 23606 !

will change it (hopefully to the address where you have loaded your new, beautifully redesigned typeface).

Note that you are not limited to addresses within the Forth system for any of these four words, and that messing about with addresses which you aren't sure are safe is just as dangerous as using careless POKEs in Basic.
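
The byte order used by the two-byte words can be modelled with a short Python sketch. This is an illustration only: the function names poke8, peek8, poke16 and peek16 are made up to stand for ^, ?, ! and @, memory is just an array of 65536 bytes, and the two-byte read returns an unsigned value, whereas the Forth system would presumably show values above 32767 as negative numbers.

```python
memory = bytearray(65536)   # stand-in for the Spectrum's address space


def poke8(value, address):
    """Model of ^ : store the low byte of value at address."""
    memory[address] = value & 0xFF


def peek8(address):
    """Model of ? : read one byte."""
    return memory[address]


def poke16(value, address):
    """Model of ! : low byte at address, high byte at the next address."""
    memory[address] = value & 0xFF
    memory[(address + 1) & 0xFFFF] = (value >> 8) & 0xFF


def peek16(address):
    """Model of @ : rebuild the number from its low and high bytes."""
    return memory[address] | (memory[(address + 1) & 0xFFFF] << 8)


poke16(30000, 23606)                      # like 30000 23606 !
print(memory[23606], memory[23607])       # the low byte, then the high byte
print(peek16(23606))                      # like 23606 @
```

Since 30000 is 117 * 256 + 48, the low byte stored at 23606 is 48 and the high byte at 23607 is 117.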
The Forth system itself starts at 43000, and anything above that and below the UDG area is unsafe to ^ or ! in, except for memory you have requested using allot.

The words ! and @ should not be confused with the next feature, which is that of variables. Like Basic, Forth does use variables, but because of the strong focus on the stack their use is more limited. In the present system, they are even more limited than in most. You have exactly 26 of them; they are identified by a single lower-case letter, a to z. Each can contain a single Forth value, just like a stack entry.

To store a value into one of these variables, push the value onto the stack and then enter the variable name letter, followed immediately, without a space, by a !. To retrieve a variable value, use the name letter followed by a @. For example,

    42 d!

will store the value 42 into variable d, and

    t@ .

will print out the value currently stored in variable t.

Note the difference between this feature and the double-byte peek/poke commands: in variable use, the ! and @ are directly attached to the variable name, while the peek/poke commands are individual words, separated from the address they work on by a space just like any other normal word.

Rather useful as well are the two string functions. If you include any string between double quote marks within your command, this will put that string, with an ENTER (CHR$ 13) appended, into the compiled code, and push its start address on the stack. To complement this, there is the $ word, which takes an address from the stack and prints the string starting at that address and continuing until the first CHR$ 13 found. Do note that this CHR$ 13 is not itself printed, so if you want a message to end with a newline, you will have to use 13 emit. Note also that this string address is a normal 16-bit number, and you can perform arithmetical operations on it just as you could in Basic. For example, try:

    "Oh Hello, World!" 3 + $

(The same thing is true, by the way, for the addresses used by ^ ? ! @ and allot - but take care, again, not to make a mistake and point where you didn't intend to.)

Finally, there is the "insert direct code" facility. This puts bytes, specified by hexadecimal numbers, directly into the code stream. Needless to say, this is both a very powerful and very dangerous feature. It is accessed by using the # symbol followed immediately (again without intervening space) by the hexadecimal codes for the machine code or data to be inserted. Each byte must be exactly two hex digits (so if you want to insert 12, you must use 0c, never just c.) Take care: the code you insert is executed literally, as machine code, as part of your command.

You can, of course, do untold damage with this, but you can also use it to great effect, if you take care. For example, you can use Forth's Pop and Push routines by inserting #cdd4a8 and #cdc8a8 respectively. These put the value from the top of the stack into HL, and push HL back onto the stack. In between, you can do with HL whatever you want; for example, you can shift it for a quick 2 * or 2 /, or even hand it over to the ROM calculator to extend your Forth program with SIN and COS.

Less trickily, you can call simple ROM routines like the ones for BEEP or CLS. Or even more simply, just the one word #c7 inserts the command RST 0, which resets the Spectrum from within Forth! Less drastically, #cf means RST 8, which is the error report facility of the Spectrum ROM; Forth captures the error handling for its own use, so this is not immensely useful, but one very handy use of it is to include #cf16. This triggers error H, STOP in INPUT, which causes the Forth system to stop cleanly without resetting the computer.

And by the way, yes, that does mean that the last undocumented feature of Forth is not a new word, but the way to stop the system from the editor: simply break into the command prompt using the down arrow key, or shift-6.
It's a normal Basic INPUT, disguised with a crafty POKE.]
___________________________________________________________

Figure 4. Forth compiler error messages.

Undefined word
    A word is either undefined in the dictionary or, in the case of *edit, the source code of the definition is unavailable.
Bad line
    The line entered generally does not make sense. This error may also be produced if you exceed the memory reserved for the compiler.
Bad variable
    An illegal variable name has been used.
Division by zero
    1 0 / has been attempted, for example.
Number out of range
    A number outside the range -32768 to 65535 has been entered.
Invalid number
    A number contains a non-numeric character.
BREAK
    BREAK was pressed when using tape, printer or 'scroll?'
Invalid name
    Illegal file name in cassette commands.
Invalid colour code
    Same as in Basic.
Tape loading error
    Same as in Basic.
Program ABORTed
    The word abort has been executed.

After any error message, the line containing the error must be entered again in full. If any error occurs after editing a word, the word will automatically be presented again for editing, starting from the beginning, and you must skip through it by pressing ENTER, until you reach the offending line.
___________________________________________________________

Figure 5. Summary of main Forth words.

Arithmetic operators:

+
    Add two numbers and place result on stack.
-
    Subtract top number from second number.
*
    Multiply top two numbers.
/
    Divide top number into second number. Result is rounded to lowest integer.

Conditional operators:

=,<>,<,>
    Compare top two numbers according to specified condition, and return 1 if the condition holds, otherwise 0.

Stack manipulation: (see Figure 5b)

drop
    Remove top number from stack.
dup
    Duplicate top number on stack.
swap
    Swap top two numbers on stack.
over
    Copy second number on stack to the top, over the original top number.

Control structures:

In the following descriptions, the items in brackets refer to any appropriate block of Forth words. All structures can be nested within themselves and others to any level, but all the words at any one level of nesting must be in the same word definition.

(finish) (start) do (code) loop
    Perform an indexed loop, with index starting at (start) and counting up or down by one until (finish) is reached. (code) is repeated appropriate number of times.
(finish) (start) do (code) (increment) +loop
    Same as previous, except size of increment is specified.
begin (code) (condition) until
    (code) is repeated until (condition) gives a true (non-zero) value.
begin (condition) while (code) repeat
    (code) is repeated as long as (condition) gives a true value. If (condition) is false the first time, then (code) is skipped altogether.
(condition) if (true code) else (false code) then
    If (condition) is true then (true code) is executed, otherwise (false code) is executed. Both parts continue executing after then.
(condition) if (true code) then
    Same as previous, except the false condition is not treated separately, and else and the (false code) are omitted.
ind
    Place index counter of innermost do loop on stack.
abort
    Return to command mode, clearing all nested loops and words.
___________________________________________________________

[ Figure 5a. Non-documented Forth words.

Memory management:

allot
    Pops a number from the stack, and allots that many bytes of memory from the "free area" between the stack and the code. Pushes the base address of the allotted block onto the stack.
clear
    Resets the stack pointer and the allotted block pointer back to their starting values.
^
    Pops two numbers, and loads the memory address of the top number with the lowest 8 bits of the second value. In other words, the same thing as a Basic POKE.
?
    Pops a number, then pushes the one byte found at that address onto the stack. That is, PEEK.
!
    Does the same thing as ^, except that it pokes a full 16-bit number into two subsequent memory positions, in the format described in the Spectrum manual, chapter 25. Note that this is a separate word, and not to be confused with the x! variable storage, described below.
@
    Does the same thing as ?, except that it peeks and pushes a 16-bit, 2-byte number. Like !, not to be confused with variable usage.

Variable storage:

In both these commands, x can be any single, lower-case letter from a to z. They are not to be confused with the two-byte peek and poke words, described above. Those are separate words; these are always part of a command consisting of one variable letter and either ! or @.

x!
    Pops a value and stores it in variable x.
x@
    Pushes the value of variable x on the stack.

String handling:

"string"
    Insert "string", followed by a newline, into memory, and push its address on the stack.
$
    Pop an address from the stack, and print the string starting at that address, up to (not including) the first newline (CHR$ 13) found.

Direct code insertion:

#
    Must be followed directly (no space) by any number of two-digit hexadecimal numbers. These numbers will be inserted directly into the command code.
]
___________________________________________________________

Figure 5b. Effects of stack operations.

Initial state
of stack        dup     drop    swap    over

                30                      20
30              30              20      30
20              20      20      30      20
10              10      10      10      10
___________________________________________________________

Figure 6. Compiler operating commands.

All commands must be prefixed with an asterisk.

*edit 'word'
    Allows word to be edited one line at a time, if the source code is available, then recompiles 'word' and everything after it.
*list
    Lists all word definitions in source code buffer.
*del
    Clears source code buffer, but leaves Forth dictionary intact.
*reset
    Clears everything and restarts Forth.
*dsave 'filename'
    Save entire Forth dictionary in three parts.
*dload 'filename'
    Loads dictionary, and clears source code buffer.
*ssave 'filename'
    Saves source code buffer on tape.
*sload 'filename'
    Loads source code buffer and compiles it into dictionary. There will be a long delay while this is done.

Note that the filename is NOT placed in quotation marks, and if it is omitted in a load command, then the first file found will be loaded.

*pr on
    Sends all further output to ZX printer.
*pr off
    Uses screen again for output.
*dlist
    Lists contents of dictionary, and the address of the machine code routine for each word.
*msave
    Saves Forth program as an independent machine code routine.
___________________________________________________________

Figure 7

*list

:count
10000 1 do
(set up the loop)
22 emit 0 emit 0 emit
(move print pos. to top left)
ind .
(print the loop index)
loop;

:type
begin
(set up loop)
get dup
(get a character from the keyboard and make another copy on the stack)
emit
(print the character)
13 = until
(continue with the loop until the character is ENTER - code 13)
;

:fill
(the code of a character is already on the stack)
22 emit 0 emit 0 emit
704 1 do
(set up loop)
dup emit
(duplicate the character on the stack and print the top copy)
loop drop
(the character is removed from the stack)
;

:square
dup * .
(print the square of the number on the stack)
;

:squares
1 do
(the upper limit of the loop is already on the stack)
ind .
(print the number)
6 emit
(this is a COMMA control character)
ind square
(calculate the square using the previous word we defined)
13 emit
(print on a new line)
loop;

20 squares
1       1
2       4
3       9
4       16
5       25
6       36
7       49
8       64
9       81
10      100
11      121
12      144
13      169
14      196
15      225
16      256
17      289
18      324
19      361
20      400
___________________________________________________________

[ And finally a few notes about the TZX. This contains (of course) the program itself, which is followed by the program used to create the default dictionary.
Following that is a small additional dictionary of tools which I had to create to port a program ("OXO", published in ZX Computing in June 1985, and also available at WoS) to this version of Forth, and which I thought would make a nice sample dictionary. You can load either the source or the compiled dictionary, whichever you prefer. Do be careful not to load it over the top of your own program, as this Forth does not do Merge; load the toolset first, then enter your program.

Most of the words in this dictionary are simple enough to understand. abs and mod do what you would expect them to: abs turns the top stack value into its absolute (i.e., nonnegative) value, like Basic ABS, and mod calculates the remainder of the division of the top two stack values - the complement of Forth's integer /.

Equally unsurprising are and, or and not. They behave just like Basic AND, OR and NOT, including and and or returning one of their arguments if possible. 4 2 and, for example, gives 4, just as 4 AND 2 does in Basic. You may wonder why I included not, when 0 = is just as short. The reason is that 0 = compiles to code that pushes 0 and then calls =, while not compiles to a single call to not (which then in turn inserts 0 and calls =). Thus, not results in rather shorter, but marginally slower, code than 0 =. The choice is yours.

There are two screen functions, which also do just what you would expect them to. at puts the cursor at the screen position given on the stack, like AT in Basic, while cls clears the screen (calling two ROM routines to do so), then puts the cursor back at the top left corner of the screen.

The last two words are not immediately obvious, as they had to be written in machine code. The first of these is rot, which is actually a standard Forth word. It acts something like a three-value swap. It takes the top three values off the stack, then puts them back on with the third value on top, followed by the other two. 1 2 3 4 rot results in the stack holding, from the top down, 2 4 3 1: the 2 has been rotated to the top.
Finally we have leave, also standard Forth, which sets the final limit of the inner-most do-loop to its current index. This means that when the word loop is next executed, the loop will terminate. Note that it is the limit which is altered, not the current index. Strange, but for some reason that's the standard.

One last word of warning: this Forth does not do a lot of error-checking. If you fill the source buffer with more text than will fit, it will merrily discard the rest but keep updating its pointers, resulting in code which will run but cannot be altered any more (and a *list which will terminate in an error). Using *del does solve this in so far that any new code will behave as normal, but of course you still can't edit the old source. More seriously, you can compile more code than will fit between the compiler and the allot area, or push more than 250-ish values and start running over the end of RAM, or even pop more values than are on the stack and (eventually) start popping your compiled code! None of this will be prevented. You have to work at it, but it's all possible.

The control structures also do no error checking at all, and these are rather easier to break. All you have to do is accidentally leave out one word. For example, load the system from fresh (don't do this with your freshly typed program in memory), then type loop as your first line. Just the one word, without its do. Now you'll be careful in the future. Leaving off the structure ending word can be just as disastrous; try if on its own. And yes, it would certainly be possible to add checks for all these, but they would cost a lot of compilation time and above all memory, leaving even less room for your source code, so that would unfortunately defeat the purpose.

Richard Bos, October 2012. ]