We are now considering the synthesis phase of the compiler, which converts a semantically-valid abstract syntax tree into an equivalent program in the target language. Before looking at code generation, we are first studying the run-time system associated with a compiled program. This system consists of the routines that are loaded along with the generated target code at run-time, which enable the target code to execute in the fashion defined by the language.
Last time, we considered the basic issues that underlie a run-time system: the activation of computational processing through procedure calls and the binding of names to storage locations and storage locations to values. The essential point to draw from that discussion is the distinction between static notions that reside in program text and dynamic notions that exist as the program executes.
Today, we learn how to organize run-time storage for the various parts of an executing program. The techniques we study work immediately for languages such as C, Ada, and Pascal. They extend naturally to languages such as Java and C++ and BigRig, in which packages of procedures and data may form higher-level structures.
The storage associated with a running program consists of three kinds:
We can further subdivide the storage devoted to data objects into two categories:
What will the compiler know about storage requirements at the time it generates its output? First, it will know the size of the generated target code. Second, it can also know at compile time the memory needed to hold some of the program's data objects. In general, we would like to maximize the number of objects whose memory needs are known at compile time, because these objects can be assigned to a fixed location in the run-time memory of the target program. The compiler can compute addresses for these objects and write them directly into the target code. This makes the target code simpler, and faster to execute at run time.
The size of some data objects, though, can be determined only at run time. Perhaps the size of a record is not determined until the user running the program specifies the size of a string it contains. Perhaps an array's size depends on the number of records in an external data file. These objects are created dynamically under program control. For such objects, the run-time system must provide a separate source of storage locations. Typically, this source is called the heap, though it may or may not be implemented using a heap data structure.
Some source languages do not allow dynamically-created data, but most modern languages do. We will consider the organization of static and dynamic memory for data objects in more detail later in this session.
What about the control stack? For most languages, the size of the control stack cannot be known at compile time. When a procedure call occurs, the running program must interrupt its execution path and store information about that path (such as the value of the program counter and the machine registers). This information is stored in an activation record or stack frame. When the procedure returns, the executing program restores the program counter and registers so that the interrupted execution path can be resumed. The activation record for a procedure may also include data objects created by the procedure.
We call this structure a control 'stack', but we use the term loosely. Most languages require that an executing program have access to entries on the stack other than the top. For example, a procedure or code block may refer to a variable defined in a wider scope. In such cases, the program will violate the strict definition of a stack to provide such access. Some languages even require a heap for the control stack, because the lifetime of procedure activations may not fit a tree model. (For example, think about a language that permits closures with references to free variables, such as Scheme.)
We will consider the organization of the control stack in more detail later in this session.
In order to account for the configuration of target code, data objects, and control stack, a typical compiler organizes its run-time memory in this way:
The control stack grows 'downward', toward higher addresses, with the dynamic memory storage growing upward. The address of frames lower in the stack can be computed as negative offsets from the top of the stack. This is standard procedure in a compiler because, on many machines, such computations can be done efficiently by storing the value of top in a register.
From this image, we can see one good reason to allocate as many data objects statically as possible: if the compiler must accommodate large dynamic memory and control stack partitions in order to be able to run worst-case programs, then the run-time system will take up much more space than the average program needs.
A common way to lay out an individual stack frame is:
Many modern compilers deviate from this traditional layout by using registers to pass data values into and out of a procedure. Registers are faster and do not require extra space, and many modern computers provide enough registers that the compiler has enough at its disposal to use a few for this purpose.
The optional control link points to the activation record of the calling procedure. The optional access link points to the location of non-local references. Not all languages require these, for a variety of reasons:
The local data objects region holds objects bound explicitly to names within the procedure. The temporary data objects region holds objects created implicitly in the course of evaluating expressions. Consider the expression
a = b + c - d
What temporary objects might the compiler create in implementing this statement?
The order of these items is intentional. The values needed only by the procedure (local and temporary data) appear on top of the stack, while values needed at the times of the procedure's call and return (the argument and return values) appear at the base of the record, where they are most easily accessed based on the size of the caller's activation record.
The size of each region can be computed at run time, when the procedure is called, or at compile time. Whenever possible, the compiler should do these computations, because that allows more efficient allocation of space. Some objects require run-time computation, such as an array whose size is determined at run time.
The amount of storage required for each data object is determined by its type. For the basic integer, floating-point, and character types, the compiler knows the fixed, integral number of bytes to allocate on the target machine. An aggregate data type contains other objects, so its size can be computed only if the number and size of its parts is known at compile time. Many aggregates typically require run-time computation of size.
Even in such cases, the compiler must lay out memory for data objects on the stack at compile time. It keeps track of the number of memory locations allocated at any point in time. This way, it can compute a relative address for each local object with respect to the beginning of the local data region (or the stack frame itself). These offsets can be used to refer to each data object in the generated code.
The details of data object storage have historically been driven by the addressing constraints on the target machine. For example, the integer addition instruction on some machines requires that its arguments be aligned in memory, perhaps at locations with addresses that are divisible by 4. This may require that memory be padded with unused bytes or packed into compressed data. The choice between padding and packing raises a familiar trade-off: padding uses extra space, and packing uses extra execution time to unpack the location before doing the addition. (It also requires code in the run-time system to do the unpacking, which takes space!)
As you can see, there are many decisions to be made in organizing local data within the stack frame. One way to isolate these decisions from the rest of the compiler is to encapsulate as many of the details of the target machine as possible from the machinery that generates code. In doing this, we create abstractions within our compiler, such as a stack frame abstraction. Some authors of more modern compiler texts, such as Appel argue forcefully for a large number of abstractions in compiler design.
We must consider allocation of storage for three kinds of objects:
As you'll soon learn, we can allocate memory for activation records at either compile time or run time, depending upon the nature of the language.
Sometimes a name can be associated with a storage location at compile time. When this is possible, the compiler does not need to generate any run-time support for the object. (Remember: bindings do not change at run time -- only values do!)
But when is this possible? The compiler must be able to know the size of the object as well as any constraints on its position. This eliminates any dynamically-sized aggregates. The compiler must also be able to use a single binding for the name. This eliminates the data objects local to a recursive procedure, because those objects may require multiple bindings for a given name. (But it doesn't necessarily eliminate procedures with tail recursion, which can be compiled using a single activation frame!) So: the compiler will be able to allocate data objects statically in non-recursive procedures and on fixed-sized global objects.
How does the compiler allocate static storage? As described above, the compiler must:
The symbol table can be used to store the target address of a statically-allocated data object. The code generator can then look up the address whenever it needs it.
Note: the compiler may be able to do the same for several components of the stack frame, namely, the information stored about program state at the time of a procedure call.
For languages that do not allow recursion, or that allow only tail recursion, a compiler can allocate activation records in the same way that it allocates static data objects, with one activation record for each procedure located in the static data region. There is no separate control stack, but a stack forms dynamically as called procedures have links pointing back to the procedures that called them. This makes for an incredibly efficient run time! The venerable language Fortran supported this kind of memory layout, and it is one of the sources of Fortran's efficiency in data-intensive scientific computation.
(This also reminds me of how we simulated linked lists in static-only Fortran.)
But... most modern languages do not limit procedure calls in this way. How then does a compiler implement a procedure call?
Each procedure call requires a sequence of instructions to allocate an activation record and then store some initial information there. This is called the calling sequence. When the procedure is finished executing, the compiler invokes a parallel return sequence to restore the state of the machine so that the calling procedure can complete its execution. The instructions in the calling and return sequences are usually shared between the calling procedure and the called procedure, with the particular split determined by requirements imposed by the target language and perhaps the operating system.
Quick Exercise: In general, we would like for the called procedure to do as much of the work as possible. Why?
Quick Answer: Consider the case in which a procedure is called from n different places, say a utility procedure or a method in a commonly-used ADT. Any instructions for the calling sequence that are executed by the callers will be generated n times, once for each caller. The instructions that are part of the called procedure will be generated only once!
The caller usually computes the values of arguments passed to the called procedure, because each call has its own specific values. These are typically stored at the bottom of the activation record, so that the caller can access them without having to know the layout of the rest of record.
Temporary data objects pose something of a problem. While their size and number can be know at compile time, they may not be fixed at the time the record is allocated. (Later phases of the compiler, especially the optimizer, may change their number or size as a part of their tasks!) So they typically come at the top of the frame, where they do not affect the caller or the offsets computed for the rest of the record's fields.
Here is a typical calling sequence. Assume that the register named status points to the end of the region labeled "state of program before call" in our typical stack frame above. The caller can initialize all the fields below this pointer, using addresses computed as negative offsets from it, while the called procedure can initialize and access the fields above the pointer using positive offsets.
The corresponding return sequence might look like this.
Consider this example:
i := sos(2, a);
...
int sos(int x, int y)
{
return x*x + y*y;
}
... work through the example.
Notice that this mechanism allows the calling procedure to pass any number of arguments, as in C's printf function. In such cases, the called procedure must have some way to recover the number of arguments and process them appropriately. In the case of printf, the first argument contains information about the number and the type of the remaining arguments.
If the value of a local name outlives the procedure activation or if the called activation outlives the calling activation, then a stack cannot be used to implement the control "stack" -- because control cannot be modeled as a last-in, first-out process! The compiler must use another way to organize storage, such as a heap. Think about how a Scheme closure might work...
That's what tomorrow is for -- things we can't do today. :-) We'll also tackle the problem of accessing non-local names next time. Neither of these issues is at play in your Klein compiler.
After learning how a compiler organizes run-time memory for a program, you may better appreciate how Java's reference semantics for objects simplifies the Java compiler. Even though objects can be of arbitrary size and thus may need to be allocated from dynamic memory, every Java object variable is a fixed-length reference to such an object. This means that most data objects can be statically allocated, and even objects placed on the control stack as local or temporary objects have a size known at compile time. The small price of dereferencing the object is paid at run-time.