.NH 2 Usage of registers in ACK compilers .PP We will first describe the major design decisions of the Amsterdam Compiler Kit, as far as they concern register allocation. Subsequently we will outline the role of the Global Optimizer in the register allocation process and the interface between the code generator and the optimizer. .NH 3 Usage of registers without the intervention of the Global Optimizer .PP Registers are used for two purposes: .IP 1. for the evaluation of arithmetic expressions .IP 2. to hold local variables, for the duration of the procedure they are local to. .LP It is essential to note that no translation part of the compilers, except for the code generator, knows anything at all about the register set of the target computer. Hence all decisions about registers are ultimately made by the code generator. Earlier phases of a compiler can only \fIadvise\fR the code generator. .PP The code generator splits the register set into two: a fixed part for the evaluation of expressions (called \fIscratch\fR registers) and a fixed part to store local variables. This partitioning, which depends only on the target computer, significantly reduces the complexity of register allocation, at the penalty of some loss of code quality. .PP The code generator has some (machine-dependent) knowledge of the access costs of memory locations and registers and of the costs of saving and restoring registers. (Registers are always saved by the \fIcalled\fR procedure). This knowledge is expressed in a set of procedures for each target machine. The code generator also knows how many registers there are and of which type they are. A register can be of type \fIpointer\fR, \fIfloating point\fR or \fIgeneral\fR. .PP The front ends of the compilers determine which local variables may be put in a register; such a variable may never be accessed indirectly (i.e. through a pointer). The front end also determines the types and sizes of these variables. The type can be any of the register types or the type \fIloop variable\fR, which denotes a general-typed variable that is used as loop variable in a for-statement. All this information is collected in a \fIregister message\fR in the EM code. Such a message is a pseudo EM instruction. This message also contains a \fIscore\fR field, indicating how desirable it is to put this variable in a register. A front end may assign a high score to a variable if it was declared as a register variable (which is only possible in some languages, such as "C"). Any compiler phase before the code generator may change this score field, if it has reason to do so. The code generator bases its decisions on the information contained in the register message, most notably on the score. .PP If the global optimizer is not used, the score fields are set by the Peephole Optimizer. This optimizer simply counts the number of occurrences of every local (register) variable and adds this count to the score provided by the front end. In this way a simple, yet quite effective register allocation scheme is achieved. .NH 3 The role of the Global Optimizer .PP The Global Optimizer essentially tries to improve the scheme outlined above. It uses the following principles for this purpose: .IP - Entities are not always assigned a register for the duration of an entire procedure; smaller regions of the program text may be considered too. .IP - several variables may be put in the same register simultaneously, provided at most one of them is live at any point. .IP - besides local variables, other entities (such as constants and addresses of variables and procedures) may be put in a register. .IP - more accurate cost estimates are used. .LP To perform its task, the optimizer must have some knowledge of the target machine. .NH 3 The interface between the register allocator and the code generator .PP The RA phase of the optimizer must somehow be able to express its decisions. Such decisions may look like: 'put constant 1283 in a register from line 12 to line 40'. To be precise, RA must be able to tell the code generator to: .IP - initialize a register with some value .IP - update an entity from a register .IP - replace all occurrences of an entity in a certain region of text by a reference to the register. .LP At least three problems occur here: the code generator is only used to put local variables in registers, it only assigns a register to a variable for the duration of an entire procedure and it is not used to have some earlier compiler phase make all the decisions. .PP All problems are solved by one mechanism, that involves no changes to the code generator. With every (non-scratch) register R that will be used in a procedure P, we associate a new variable T, local to P. The size of T is the same as the size of R. A register message is generated for T with an exceptionally high score. The scores of all original register messages are set to zero. Consequently, the code generator will always assign precisely those new variables to a register. If the optimizer wants to put some entity, say the constant 1283, in a register, it emits the code "T := 1283" and replaces all occurrences of '1283' by T. Similarly, it can put the address of a procedure in T and replace all calls to that procedure by indirect calls. Furthermore, it can put several different entities in T (and thus in R) during the lifetime of P. .PP In principle, the code generated by the optimizer in this way would always be valid EM code, even if the optimizer would be presented a totally wrong description of the target computer register set. In practice, it would be a waste of data as well as text space to allocate memory for these new variables, as they will always be assigned a register (in the correct order of events). Hence, no memory locations are allocated for them. For this reason they are called pseudo local variables.