.BP .SN 4 .S1 "DATA ADDRESS SPACE" The data address space is divided into three parts, called 'areas', each with its own addressing method: global data area, local data area (including the stack), and heap data area. These data areas must be part of the same address space because all data is accessed by the same type of pointers. .P Space for global data is reserved using several pseudoinstructions in the assembly language, as described in the next paragraph and chapter 11. The size of the global data area is fixed per program. .A Global data is addressed absolutely in the machine language. Many instructions are available to address global data. They all have an absolute address as argument. Examples are LOE, LAE and STE. .P Part of the global data area is initialized by the compiler, the rest is not initialized at all or is initialized with a value, typically -32768 or 0. Part of the initialized global data may be made read-only if the implementation supports protection. .P The local data area is used as a stack, which grows from high to low addresses and contains some data for each active procedure invocation, called a 'frame'. The size of the local data area varies dynamically during execution. Below the current procedure frame resides the operand stack. The stack pointer SP always points to the bottom of the local data area. Local data is addressed by offsetting from the local base pointer LB. LB always points to the frame of the current procedure. Only the words of the current frame and the parameters can be addressed directly. Variables in other active procedures are addressed by following the chain of statically enclosing procedures using the LXL or LXA instruction. The variables in dynamically enclosing procedures can be addressed with the use of the DCH instruction. .A Many instructions have offsets to LB as argument, for instance LOL, LAL and STL. The arguments of these instructions range from -1 to some (negative) minimum for the access of local storage and from 0 to some (positive) maximum for parameter access. .P The procedure call instructions CAL and CAI each create a new frame on the stack. Each procedure has an assembly-time parameter specifying the number of bytes needed for local storage. This storage is allocated each time the procedure is called and must be a multiple of the wordsize. Each procedure, therefore, starts with a stack with the local variables already allocated. The return instructions RET and RTT remove a frame. The actual parameters must be removed by the calling procedure. .P RET may copy some words from the stack of the returning procedure to an unnamed 'function return area'. This area is available for 'READ-ONCE' access using the LFR instruction. The result of a LFR is only defined if the size used to fetch is identical to the size used in the last return. The instruction ASP, used to remove the parameters from the stack, the branch instruction BRA and the non-local goto instrucion GTO are the only ones that leave the contents of the 'function return area' intact. All other instructions are allowed to destroy the function return area. Thus parameters can be popped before fetching the function result. The maximum size of all function return areas is implementation dependent, but should allow procedure instance identifiers and all implemented objects of type integer, unsigned, float and pointer to be returned. In most implementations the maximum size of the function return area is twice the pointer size, because we want to be able to handle 'procedure instance identifiers' which consist of a procedure identifier and the LB of a frame belonging to that procedure. .P The heap data area grows upwards, to higher numbered addresses. It is initially empty. The initial value of the heap pointer HP marks the low end. The heap pointer may be manipulated by the LOR and STR instructions. The heap can only be addressed indirectly, by pointers derived from previous values of HP. .S2 "Global data area" The initial size of the global data area is determined at assembly time. Global data is allocated by several pseudoinstructions in the EM assembly language. Each pseudoinstruction allocates one or more bytes. The bytes allocated for a single pseudo form a 'block'. A block differs from a fragment, because, under certain conditions, several blocks are allocated in a single fragment. This guarantees that the bytes of these blocks are consecutive. .P Global data is addressed absolutely in binary machine language. Most compilers, however, cannot assign absolute addresses to their global variables, especially not if the language allows programs to be composed of several separately compiled modules. The assembly language therefore allows the compiler to name the first address of a global data block with an alphanumeric label. Moreover, the only way to address such a named global data block in the assembly language is by using its name. It is the task of the assembler/loader to translate these labels into absolute addresses. These labels may also be used in CON and ROM pseudoinstructions to initialize pointers. .P The pseudoinstruction CON allocates initialized data. ROM acts like CON but indicates that the initialized data will not change during execution of the program. The pseudoinstruction BSS allocates a block of uninitialized or identically initialized data. The pseudoinstruction HOL is similar to BSS, but it alters the meaning of subsequent absolute addressing in the assembly language. .P Another type of global data is a small block, called the ABS block, with an implementation defined size. Storage in this type of block can only be addressed absolutely in assembly language. The first word has address 0 and is used to maintain the source line number. Special instructions LIN and LNI are provided to update this counter. A pointer at location 4 points to a string containing the current source file name. The instruction FIL can be used to update the pointer. .P All numeric arguments of the instructions that address the global data area refer to locations in the ABS block unless they are preceded by at least one HOL pseudo in the same module, in which case they refer to the storage area allocated by the last HOL pseudoinstruction. Thus LOE 0 loads the zeroth word of the most recent HOL, unless no HOL has appeared in the current file so far, in which case it loads the zeroth word of the ABS fragment. .P The global data area is highly fragmented. The ABS block and each HOL and BSS block are separate fragments. The way fragments are formed from CON and ROM blocks is more complex. The assemblers group several blocks into a single fragment. A fragment only contains blocks of the same type: CON or ROM. It is guaranteed that the bytes allocated for two consecutive CON pseudos are allocated consecutively in a single fragment, unless these CON pseudos are separated in the assembly language program by a data label definition or one or more of the following pseudos: .DS ROM, BSS, HOL and END .DE An analogous rule holds for ROM pseudos. .S2 "Local data area" The local data area consists of a sequence of frames, one for each active procedure. Below the frame of the current procedure resides the expression stack. Frames are generated by procedure calls and are removed by procedure returns. A procedure frame consists of six 'zones': .DS 1. The return status block 2. The local variables and compiler temporaries 3. The register save block 4. The dynamic local generators 5. The operand stack. 6. The parameters of a procedure one level deeper .DE A sample frame is shown in Figure 1. .P Before a procedure call is performed the actual parameters are pushed onto the stack of the calling procedure. The exact details are compiler dependent. EM allows procedures to be called with a variable number of parameters. The implementation of the C-language almost forces its runtime system to push the parameters in reverse order, that is, the first positional parameter last. Most compilers use the C calling convention to be compatible. The parameters of a procedure belong to the frame of the calling procedure. Note that the evaluation of the actual parameters may imply the calling of procedures. The parameters can be accessed with certain instructions using offsets of 0 and greater. The first byte of the last parameter pushed has offset 0. Note that the parameter at offset 0 has a special use in the instructions following the static chain (LXL and LXA). These instructions assume that this parameter contains the LB of the statically enclosing procedure. Procedures that do not have a dynamically enclosing procedure do not need a static link at offset 0. .P Two instructions are available to perform procedure calls, CAL and CAI. Several tasks are performed by these call instructions. .A First, a part of the status of the calling procedure is saved on the stack in the return status block. This block should contain the return address of the calling procedure, its LB and other implementation dependent data. The size of this block is fixed for any given implementation because the lexical instructions LPB, LXL and LXA must be able to obtain the base addresses of the procedure parameters \fBand\fP local variables. An alternative solution can be used on machines with a highly segmented address space. The stack frames need not be contiguous then and the first status save area can contain the parameter base AB, which has the value of SP just after the last parameter has been pushed. .A Second, the LB is changed to point to the first word above the local variables. The new LB is a copy of the SP after the return status block has been pushed. .A Third, the amount of local storage needed by the procedure is reserved. The parameters and local storage are accessed by the same instructions. Negative offsets are used for access to local variables. The highest byte, that is the byte nearest to LB, has to be accessed with offset -1. The pseudoinstruction specifying the entry point of a procedure, has an argument that specifies the amount of local storage needed. The local variables allocated by the CAI or CAL instructions are the only ones that can be accessed with a fixed negative offset. The initial value of the allocated words is not defined, but implementations that check for undefined values will probably initialize them with a special 'undefined' pattern, typically -32768. .A Fourth, any EM implementation is allowed to reserve a variable size block beneath the local variables. This block could, for example, be used to save a variable number of registers. .A Finally, the address of the entry point of the called procedure is loaded into the Program Counter. .P The ASP instruction can be used to allocate further (dynamic) local storage. The base address of such storage must be obtained with a LOR~SP instruction. This same instruction ASP may also be used to remove some words from the stack. .P There is a version of ASP, called ASS, which fetches the number of bytes to allocate from the stack. It can be used to allocate space for local objects whose size is unknown at compile time, so called 'dynamic local generators'. .P Control is returned to the calling procedure with a RET instruction. Any return value is then copied to the 'function return area'. The frame created by the call is deallocated and the status of the calling procedure is restored. The value of SP just after the return value has been popped must be the same as the value of SP just before executing the first instruction of this invocation. This means that when a RET is executed the operand stack can only contain the return value and all dynamically generated locals must be deallocated. Violating this restriction might result in hard to detect errors. The calling procedure has to remove the parameters from the stack. This can be done with the aforementioned ASP instruction. .P Each procedure frame is a separate fragment. Because any fragment may be placed anywhere in memory, procedure frames need not be contiguous. .DS |===============================| | actual parameter n-1 | |-------------------------------| | . | | . | | . | |-------------------------------| | actual parameter 0 | ( <- AB ) |===============================| |===============================| |///////////////////////////////| |///// return status block /////| |///////////////////////////////| <- LB |===============================| | | | local variables | | | |-------------------------------| | | | compiler temporaries | | | |===============================| |///////////////////////////////| |///// register save block /////| |///////////////////////////////| |===============================| | | | dynamic local generators | | | |===============================| | operand | |-------------------------------| | operand | |===============================| | parameter m-1 | |-------------------------------| | . | | . | | . | |-------------------------------| | parameter 0 | <- SP |===============================| Figure 1. A sample procedure frame and parameters. .DE .S2 "Heap data area" The heap area starts empty, with HP pointing to the low end of it. HP always contains a word address. A copy of HP can always be obtained with the LOR instruction. A new value may be stored in the heap pointer using the STR instruction. If the new value is greater than the old one, then the heap grows. If it is smaller, then the heap shrinks. HP may never point below its original value. All words between the current HP and the original HP are allocated to the heap. The heap may not grow into a part of memory that is already allocated for the stack. When this is attempted, the STR instruction will cause a trap to occur. .P The only way to address the heap is indirectly. Whenever an object is allocated by increasing HP, then the old HP value must be saved and can be used later to address the allocated object. If, in the meantime, HP is decreased so that the object is no longer part of the heap, then an attempt to access the object is not allowed. Furthermore, if the heap pointer is increased again to above the object address, then access to the old object gives undefined results. .P The heap is a single fragment. All bytes have consecutive addresses. No limits are imposed on the size of the heap as long as it fits in the available data address space.