Understanding the Stacks

by Doug Hoyte

The concept of a stack is, by far, the most famous aspect of programming in forth. In contrast to most other popular languages, forth allows you to directly manipulate the data structures that control execution both at compile-time and run-time.

Many forth tutorials over-emphasise the importance of the stacks. In terms of understanding the workings of forth, stacks are, of course, crucially important, but they are simply a straight-forward tool for accomplishing the more interesting and innovative aspects of forth.

The data stack
The return stack
The "control" stack
Stack over/underflows

The data stack

The data stack is the most visible stack in forth and it is used for many things:

Temporary storage within words; in a sense, local variables
Passing arguments to and from words
Implementing the "control" stack, as we shall see below

One of the most obvious differences between forth and more "conventional" languages is the direct access to a stack data structure. It turns out that most languages make use of a stack in order to implement control flow throughout procedures/functions/words. In the conventional implementation of C, for instance, every time a function is called (barring compiler optimisations) a new monolithic data structure called a stack frame is created on a stack. This provides C's local variable functionality, among other things. Forth, on the other hand, permits words to push and pop from the stack at will. A word could push several variables from the stack, or it could pop off several variables from the stack, or it could not affect the stack at all.

Here are some example words:

: a . ;

: b 2 ;

: c 2 . ;

Word 'a' will remove a value from the stack, word 'b' will leave a value upon the stack, and word 'c' will not affect the stack.

The return stack

The return stack is forth's method for managing control flow throughout words. In general, a compiled forth program is a list of addresses in memory that correspond to other words. (1)

Consider this piece of forth code:

: duplicate dup ;

: square duplicate * ;

2 square

Once the word 'square' is executed, forth needs to execute the word 'duplicate' but also needs to remember its location inside 'square' so it can come back and execute *. Forth uses the return stack for this.

In the following diagram, the program counter (pc) is pointing to the location '5' in memory. Forth recognises that it needs to execute the code at memory location 12, but has to be able to resume execution at location 6 after the called word is done executing. It does this by pushing the location after the program counter onto the return stack, setting the program counter to 12, and then resuming execution. At some point, the called word will execute the word 'exit' which will simply pop a value from the top of the return stack, place it into the program counter, and then resume execution.

The "control" stack

The control stack is, in most forths, the same stack as the data stack. The control stack is used during compilation of a program. The control stack, when used properly, is a very powerful tool that allows you to define your own branching constructs, looping constructs, and much more. If you are familiar with lisp, control stack manipulation is similar to lisp macros.

The reason why ANSI Forth does not demand that the data stack and the control be one and same is in order to permit various types of pre-compilation: typically compilation on a system different from the system executing the forth code.

Stack over/underflows

In a traditional forth system, stack under/overflows are not checked for after every push/pop for efficieny's sake, but instead are check by the word 'stacks' inside the QUIT loop (which we'll read more about shortly). So in a forth system, you may see something like this:

.
0 STACK UNDERFLOW   ok

Notice that the stack underflow is not detected until after the word '.' is executed, so it prints out whatever value happened to be immediatley below (or above, depending in which direction the stack grows), which just happens to be 0 in our case. Not all forth systems implement the 'stacks' word, and some only check for stack underflows. Keep in mind that the 'stacks' word just makes sure that the stack pointers don't fall outside their designated range, and is helpless to detect stack underflows of the sort:

. 1234
0   ok

This is somewhat a simplification of the truth, as we shall see later on.
Though ANSI Forth compliance does not demand this.