Session 19

A Program's Run-Time Environment


810:155
Translation of Programming Languages


The end of these notes are incomplete. I hope to complete then soon.



Opening Exercise

Write type expressions for these Scheme procedures:

    apply                      ;; recall (apply + '(1 2 3)) → 6

    (define compose
      (lambda (f g)
        (lambda (x)
          (f (g x)))))

How would you represent these type expressions as trees?

You can take a look at answers here.



Segue to Program Synthesis

To date, we have been concerned with the analysis phase of the compiler, those stages that build an abstract picture of the source program to be compiled. This phase includes lexical analysis, syntax analysis, creation of an abstract representation of the program, and semantic analysis.

We now turn our attention to the synthesis phase of the compiler, those stages that convert our semantically-valid abstract representation into an equivalent program in the target language.

Before we dive headlong into this translation process, we must first consider one topic that sits between analysis and synthesis. To date, our concern has been with static features of the source program, those features understandable by examining only the text of the source program. At run time, though, this program will execute on the target machine, and static features of the program will take on their dynamic meaning. For example, the same name may mean different values and even different source locations during the execution of the program. In one case, it may be used as a temporary variable name; in another, as a formal parameter name; and in a third, as the formal parameter on a recursive call.

For the next couple of sessions, we will study the run-time system associated with a compiled program. This system consists of the routines that are loaded along with the generated target code at run-time, which enable the target code to execute in the fashion defined by the language.

Note that the idea of a run-time system is quite general. Some languages, such as Java and Smalltalk, rely on a virtual machine to execute a rather abstract target language. A virtual machine is an especially powerful and portable form of run-time system. (One famous computer scientist once defined a virtual machine as a way to give programmers good error messages, so that when there weren't any error messages the program would run successfully. That's not of much help if you don't already know what we mean by a VM, but it is perfect if you do!)

Our focus in these sessions will be on the sort of run-time system associated with a language that runs natively on some "real" machine.

Our study of run-time systems will include the two main elements of a program's execution profile:

These two elements correspond to the two sides of any program: data and behavior.



A Quick Exercise

How does the answer to each of these questions affect how a compiler works?



Run-Time Systems and Source Language

Both memory allocation and procedure activation are determined in large part by the semantics of the source language.

Memory Allocation    How we allocate space for the data in a program depends on the type of the object. We may be able to represent basic types directly by mapping them onto equivalent types in the target machine. Characters, integers, and real numbers are common examples. For other basic types, we will have to construct a representation in terms of types available on the target machine, allocate instances of this representation, and manage these values within the run-time environment. We will have to do this for most or all of the constructed types in our language.

Procedure Activation.    The declaration of a procedure associates a name with a statement, the procedure body. This is a static binding. The execution of the same procedure creates an activation that runs code against the state of the program. This is a dynamic relationship. A program activation allocates memory for the objects required by the procedure.

This is a recurring distinction in the study of programming languages, in the implementation a compiler, and in computer science more generally. You should be becoming familiar with this Big Idea.

Consider this example with a quicksort program, taken from the Dragon book. Lines 3-7 declare the procedure readarray. When the name appears on Line 23, we say that that the programs calls the procedure. In many languages, a procedure call can be a statement, or it can occur as part of another expression (Line 16).

In the declaration of the partition procedure, we also declare two special identifiers, the integers y and z. We call these the formal parameters of the procedure. They are like local variables but are special in that any code that calls partition must provide the initial values for them. The values that are passed are called the actual parameters of the procedure, or the arguments. Line 16 provides the integers m and n as actual parameters. Different calls to a procedure, say, Lines 17, 18, and 24, can provide different actual parameters.

Control Flow Among Procedures.    On a sequential machine, program control flows sequentially, with one step following another. But each procedure call branches to a new sequence and returns to the point of the call, which in effect creates and extends multiple paths. As a result, it is often useful for us to think of control flow among procedures as a tree of activations.

Every time a procedure is called, its formal parameters are initialized with the values of the actual parameters and the statements in its body are executed. In then quicksort program, the partition procedure may be called several times, say, by the first execution of quicksort and by the two recursive executions of quicksort from within quicksort's body. We call each of these executions an activation. The lifetime of activation is the time spent executing the body of the procedure, including any procedure calls it makes. This is a sequence of steps that is determined at run time.

Each time a called procedure finishes execution, it returns to the same point in the calling activation. This means that the lifetimes of any two procedure activations are either disjoint or nested. And this means that we can use a stack to manage the flow of control among procedures.

In this sense, recursive procedures are not really special; it's just that the nested activation is of the same program text as the calling activation. This leads us to the answer to the second question in our quick exercise above... Without recursion, we could simplify storage allocation and reuse a single activation record!

... a tree of activations. Walk it DFS to reconstitute the flow. The State of control stack corresponds to paths in the tree.

Declarations and Bindings    coming soon.



Eugene Wallingford ..... wallingf@cs.uni.edu ..... April 2, 2009