Contents | Prev | Next | The T Language Specification, Version 2 Spring 2006 |
CHAPTER 8
Much of the work in a program is done by evaluating expressions, either for their side effects, such as assignments to variables, or for their values, which can be used as arguments or operands in larger expressions, or to affect the execution sequence in statements, or both.
This chapter specifies the meanings of expressions and the rules for their evaluation.
When an expression in a program is evaluated (executed), the result denotes one of two things: a variable or a value.
Evaluation of an expression can also produce side effects, because expressions may contain embedded assignments and method invocations.
Each expression occurs either in the main block (§7.1) or in the declaration of some class type that is being declared. In a class declaration the expression might occur in a constructor declaration, in a destructor declaration, or in the code for a method.
If an expression denotes a variable, and a value is required for use in further evaluation, then the value of that variable is used. In this context, if the expression denotes a variable or a value, we may speak simply of the value of the expression.
If an expression denotes a variable or a value, then the expression has a type known at compile time. The rules for determining the type of an expression are explained separately below for each kind of expression.
The value of an expression is always assignment compatible (§3.2) with the type of the expression, just as the value stored in a variable is always compatible with the type of the variable. In other words, the value of an expression whose type is T is always suitable for assignment to a variable of type T.
If the type of an expression is an integer, then the value of the expression is an integer. But if the type of an expression is a reference type, then the class of the referenced object, or even whether the value is a reference to an object rather than null, is not necessarily known at compile time.
There are a few places in the T programming language where the actual
class of a referenced object affects program execution in a manner that
cannot be deduced from the type of the expression. They are as follows:
The method used for an invocation o.m(...) is chosen based on the methods that are part of the class that is the type of o. The class of the object referenced by the run-time value of o participates because a subclass may override a specific method already declared in a parent class so that this overriding method is invoked. (The overriding method may or may not choose to further invoke the original overridden m method.)
An array type S[] is allowed to be treated as a subtype of T[] if S is a subtype of T, but this requires a run-time check for assignment to an array component, similar to the check performed for casting of reference types.
The first of the cases just listed ought never to result in detecting a type error, as it is compile-time constrained to be valid. Thus, a run-time type error can occur only when the actual class of the object referenced by the value to be assigned (either implicitly or explicitly) is not compatible with the actual run-time reference variable or component type of the array. In these cases, the program terminates with a Run-Time error (§8.17).
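The run-time check on array component assignment can be illustrated with a short sketch. It is written in Java, which treats S[] as a subtype of T[] in the same way; the class names Animal, Dog, and Cat are hypothetical. Note one difference: Java raises an ArrayStoreException that can be caught, whereas a T program terminates with a run-time error.

```java
// Java sketch (not T): the array subtype rule and its run-time store check.
class ArrayStoreDemo {
    static class Animal {}
    static class Dog extends Animal {}
    static class Cat extends Animal {}

    // Returns true if storing a Cat into a Dog[] viewed as Animal[] is rejected at run time.
    static boolean storeIsRejected() {
        Animal[] animals = new Dog[1];   // allowed: Dog[] is treated as a subtype of Animal[]
        try {
            animals[0] = new Cat();      // type-checks, but the actual component type is Dog
            return false;
        } catch (ArrayStoreException e) {
            return true;                 // T would terminate the program at this point
        }
    }

    // Storing a Dog through the same view passes the run-time check.
    static boolean storeIsAccepted() {
        Animal[] animals = new Dog[1];
        animals[0] = new Dog();
        return animals[0] != null;
    }
}
```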
The T programming language guarantees that the operands of operators appear to be evaluated in a specific evaluation order, namely, from left to right.
It is recommended that code not rely crucially on this specification. Code is usually clearer when each expression contains at most one side effect, as its outermost operation.
The left-hand operand of a binary operator appears to be fully evaluated before any part of the right-hand operand is evaluated. For example, if the left-hand operand contains an assignment to a variable and the right-hand operand contains a reference to that same variable, then the value produced by the reference will reflect the fact that the assignment occurred first.
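As an illustration, the following Java sketch shows the effect of the left-to-right rule when one operand assigns to a variable that the other operand reads. Java gives the same guarantee for binary operators; the class and method names are hypothetical.

```java
// Java sketch (not T): left-hand operands are evaluated before right-hand operands.
class LeftToRightDemo {
    static int assignThenRead() {
        int i = 2;
        // The left operand (i = 3) runs first, so the right operand reads 3.
        return (i = 3) * i;   // 3 * 3
    }

    static int readThenAssign() {
        int i = 2;
        // The left operand reads 2 before the assignment on the right runs.
        return i * (i = 3);   // 2 * 3
    }

    public static void main(String[] args) {
        System.out.println(assignThenRead()); // prints 9
        System.out.println(readThenAssign()); // prints 6
    }
}
```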
The T programming language also guarantees that every operand of an operator appears to be fully evaluated before any part of the operation itself is performed.
T programming language implementations must respect the order of evaluation as indicated explicitly by parentheses and implicitly by operator precedence. An implementation may not take advantage of algebraic identities such as the associative law to rewrite expressions into a more convenient computational order unless it can be proven that the replacement expression is equivalent in value and in its observable side effects for all possible computational values that might be involved.
Note that integer addition and multiplication are provably associative in the T programming language.
For example a+b+c, where a, b, and c are main block variables, will always produce the same answer whether evaluated as (a+b)+c or a+(b+c); if the expression b+c occurs nearby in the code, a smart compiler may be able to use this common subexpression.
In a method or constructor invocation or class instance creation expression, argument expressions may appear within the parentheses, separated by commas. Each argument expression appears to be fully evaluated before any part of any argument expression to its right.
Primary expressions include most of the simplest kinds of expressions, from which all others are constructed: literals, field accesses, method invocations, array accesses and names. A parenthesized expression is also treated syntactically as a primary expression.
Primary:
    ArrayCreationExpression
    Identifier
    PrimaryNoNewArray
PrimaryNoNewArray:
    ParenExpression
    THIS
    FieldAccess
    MethodInvocation
    ArrayAccess
    ClassInstanceCreationExpression
    Literal
A literal denotes a fixed, unchanging value.
The following production from (§1.7) is repeated here for convenience:
Literal:
    INTEGER_LITERAL
    NULL_LITERAL
The type of a literal is determined as follows:
The type of an integer literal is int.
The type of the null literal null is the null type; its value is the null reference.
this
The keyword this may be used only in the body of a method, constructor, or destructor.
When used as a primary expression, the keyword this denotes a value, that is, a reference to the object for which the method was invoked, or to the object being constructed or destroyed.
The type of this is the class C within which the keyword this occurs. At run time, the class of the actual object referred to may be the class C or any subclass of C.
A parenthesized expression is a primary expression whose type is the type of the contained expression and whose value at run time is the value of the contained expression. If the contained expression denotes a variable then the parenthesized expression also denotes that variable.
A class instance creation expression is used to create new objects that are instances of classes.
ClassInstanceCreationExpression:
    new ClassType Arguments
Arguments:
    ( ArgumentList )
    ( )
ArgumentList:
    ArgumentList , Expression
    Expression
We say that a class is instantiated when an instance of the class is created by a class instance creation expression. Class instantiation involves determining what class is to be instantiated, what constructor should be invoked to create the new instance and what arguments should be passed to that constructor.
The class being instantiated is the class denoted by ClassType.
The type of the class instance creation expression is the class type being instantiated.
Let C be the class type being instantiated. To create an instance of C, i, a constructor of C is chosen at compile-time by the following rules:
At run time, a class instance creation expression requires memory space to be allocated for the new class instance. If there is insufficient space to allocate the object, the program terminates with a run-time error.
The new object contains new instances of all the fields declared in the specified class type and all its superclasses. As each new field instance is created, it is initialized to its default value.
Next, the actual arguments to the constructor are evaluated, left-to-right.
Next, the selected constructor of the specified class type is invoked. This results in invoking at least one constructor for each superclass of the class type.
The value of a class instance creation expression is a reference to the newly created object of the specified class. Every time the expression is evaluated, a fresh object is created.
An array instance creation expression is used to create new arrays (§6).
ArrayCreationExpression:
    new ClassType DimensionExpressions Dimensions
    new ClassType DimensionExpressions
    new PrimitiveType DimensionExpressions Dimensions
    new PrimitiveType DimensionExpressions
DimensionExpressions:
    DimensionExpressions DimensionExpression
    DimensionExpression
DimensionExpression:
    [ Expression ]
Dimensions:
    Dimensions Dimension
    Dimension
Dimension:
    [ ]
An array creation expression creates an object that is a new array whose elements are of the type specified by the PrimitiveType or ClassType.
The type of the array creation expression is an array type that can be denoted by a copy of the creation expression from which the new keyword and every DimensionExpression expression have been deleted.
For example, the type of the creation expression:
new int[3][3][]
is:
int[][][]
The type of each dimension expression within a DimensionExpression must be an integer type, or a compile-time error occurs.
An array creation expression specifies the element type, the number of levels of nested arrays, and the length of the array for at least one of the levels of nesting. The array's length is available as an instance variable length.
Each array dimension expression denotes the length of the corresponding array. The array dimensions are evaluated left-to-right.
Next, the values of the dimension expressions are checked. If the value of any DimensionExpression expression is less than zero, then the program terminates with a run-time error.
Next, space is allocated for the new array. If there is insufficient space to allocate the array, then the program terminates with a run-time error.
Then, if a single DimensionExpression appears, a single-dimensional array is created of the specified length, and each component of the array is initialized to its default value.
If an array creation expression contains N DimensionExpression expressions, then it effectively executes a set of nested loops of depth N - 1 to create the implied arrays of arrays.
A multidimensional array need not have arrays of the same length at each level.
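The following Java sketch illustrates these rules, since Java array creation expressions have the same shape. The first method creates the array from the example above, leaving the innermost level unallocated; the second builds an array whose rows have different lengths. The class and method names are hypothetical, and the default values shown (null for references, 0 for int) are Java's.

```java
// Java sketch (not T): array creation with partial dimensions and uneven rows.
class ArrayCreationDemo {
    // Two dimension expressions plus one empty dimension: the type is int[][][].
    // Only the two outer levels are allocated; each innermost slot holds null.
    static int[][][] create() {
        return new int[3][3][];
    }

    // Rows of different lengths: row i has i + 1 components.
    static int[][] jagged() {
        int[][] rows = new int[3][];
        for (int i = 0; i < rows.length; i = i + 1) {
            rows[i] = new int[i + 1];
        }
        return rows;
    }
}
```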
A field access expression may access a field of an object or array, a reference to which is the value of either an expression or the special keyword super.
FieldAccess:
    Primary . Identifier
    super . Identifier
The type of the Primary must be a reference type T, or a compile-time error occurs. The meaning of the field access expression is determined as follows:
If the value of the Primary is null, then the program terminates with a run-time error.
Note, specifically, that only the type of the Primary expression, not the class of the actual object referred to at run time, is used in determining which field to use.
The special forms using the keyword super are valid only in an instance method, constructor, or destructor of a class; these are exactly the same situations in which the keyword this may be used.
Suppose that a field access expression super.name appears within class C, and the immediate superclass of C is class S. Then super.name refers to the field named name of the current object, but with the current object viewed as an instance of the superclass. Thus it can access the field named name that is visible in class S, even if that field is hidden by a declaration of a field named name in class C.
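A small Java sketch may make the field selection rule concrete. Java selects fields by the compile-time type of the Primary in the same way; the field initializers and the method names are assumptions made for the illustration, and S and C mirror the classes named above.

```java
// Java sketch (not T): fields are selected by compile-time type, not run-time class.
class FieldAccessDemo {
    static class S {
        int name = 1;
    }

    static class C extends S {
        int name = 2;                          // hides the field declared in S

        int viaSuper() { return super.name; }  // current object viewed as an S
        int viaThis()  { return this.name; }   // field declared in C
    }

    // The variable has type S, so S's field is used even though the object is a C.
    static int viaStaticTypeS() {
        S ref = new C();
        return ref.name;
    }
}
```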
A method invocation expression is used to invoke a class or instance method.
MethodInvocation:
    Identifier ( ArgumentListopt )
    Primary . Identifier ( ArgumentListopt )
    super . Identifier ( ArgumentListopt )
The first step in processing a method invocation at compile time is to figure out the name of the method to be invoked and which class to check for definitions of methods of that name. There are several cases to consider, depending on the form that precedes the left parenthesis, as follows:
If the form is super . Identifier, then the name of the method is the Identifier and the class to be searched is the immediate superclass of the class whose declaration contains the method invocation.
The second step searches the class determined in the previous step for method declarations. This step uses the name of the method and the types of the argument expressions to locate method declarations that are applicable, that is, declarations that can be correctly invoked on the given arguments. There may be more than one such method declaration, in which case the most specific one is chosen. The descriptor (signature plus return type) of the most specific method declaration is the one used at run time to do the method dispatch.
A method declaration is applicable to a method invocation if and only if both of the following are true:
The class determined by the process described in §8.10.1 is searched for all method declarations applicable to this method invocation; method definitions inherited from superclasses are included in this search.
If the class has no method declaration that is applicable, then a compile-time error occurs.
If more than one method declaration is applicable to a method invocation, it is necessary to choose one to provide the descriptor for the run-time method dispatch. In this case the most specific method is chosen.
The informal intuition is that one method declaration is more specific than another if any invocation handled by the first method could be passed on to the other one without a compile-time type error.
The precise definition is as follows:
A method is said to be maximally specific for a method invocation if it is applicable and there is no other applicable method that is more specific.
If there is exactly one maximally specific method, then it is in fact the most specific method; it is necessarily more specific than any other method that is applicable.
It is possible that no method is the most specific, because there are two or more maximally specific methods. In this case a compile-time error occurs.
The type of the method invocation expression is the result type specified in the compile-time declaration of the most specific method.
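The choice of the most specific method can be illustrated in Java, whose overload resolution for class-typed parameters follows the same intuition. This is a sketch under that assumption; the classes A and B and the methods m and n are hypothetical, and static methods are used only to keep the example short.

```java
// Java sketch (not T): applicable methods and the most specific method.
class MostSpecificDemo {
    static class A {}
    static class B extends A {}

    static int m(A x) { return 1; }
    static int m(B x) { return 2; }

    static int n(A x, B y) { return 1; }
    static int n(B x, A y) { return 2; }

    // Both m(A) and m(B) are applicable; m(B) is more specific because any
    // argument it accepts could also be passed to m(A).
    static int callWithB() { return m(new B()); }

    // Only m(A) is applicable to an argument of type A.
    static int callWithA() { return m(new A()); }

    // Only n(A, B) is applicable here, so there is nothing to choose between.
    static int callN() { return n(new A(), new B()); }

    // n(new B(), new B()) would be rejected at compile time: both declarations
    // are applicable and maximally specific, so no most specific method exists.
}
```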
At run time, method invocation requires four steps. First, a target reference may be computed. Second, the argument expressions are evaluated. Third, the actual code for the method to be executed is located. Fourth, a new activation frame is created and control is transferred to the method code.
There are several cases to consider, depending on which of the three productions for MethodInvocation (§8.10) is involved:
this
.
The argument expressions are evaluated in order, from left to right.
If the target reference is null, a run-time error occurs and the program terminates. Otherwise, the target reference is said to refer to a target object and will be used as the value of the keyword this in the invoked method.
A dynamic method lookup is used. The dynamic lookup process starts from a class S, determined as follows:
If the method invocation uses the form super . Identifier, then S is initially the superclass of the class that contains the method invocation.
The dynamic method lookup uses the following procedure to search class S, and then the superclasses of class S, as necessary, for method m.
We note that the dynamic lookup process, while described here explicitly, will often be implemented implicitly, for example as a side-effect of the construction and use of per-class method dispatch tables, or the construction of other per-class structures used for efficient dispatch.
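The explicit form of the search, starting at class S and moving up through its superclasses, might be sketched as follows. This is a simplified Java model, not the specified procedure: ClassInfo, lookup, and demo are hypothetical names, and methods are matched by name only, whereas a real implementation would match the full descriptor chosen at compile time.

```java
import java.util.HashSet;
import java.util.Set;

// Java sketch (not T): explicit dynamic lookup over a chain of class descriptors.
class DynamicLookupDemo {
    // Hypothetical run-time class descriptor.
    static final class ClassInfo {
        final String name;
        final ClassInfo superclass;              // null for a class with no superclass
        final Set<String> declared = new HashSet<>();

        ClassInfo(String name, ClassInfo superclass, String... methods) {
            this.name = name;
            this.superclass = superclass;
            for (String m : methods) {
                declared.add(m);
            }
        }
    }

    // Search class s, then its superclasses, for a declaration of method m.
    // Returns the name of the class whose declaration would be invoked.
    static String lookup(ClassInfo s, String m) {
        for (ClassInfo c = s; c != null; c = c.superclass) {
            if (c.declared.contains(m)) {
                return c.name;
            }
        }
        return null; // should not happen once compile-time checking has succeeded
    }

    static String demo(String method) {
        ClassInfo base = new ClassInfo("Base", null, "area", "describe");
        ClassInfo mid = new ClassInfo("Mid", base, "area");   // overrides area
        ClassInfo leaf = new ClassInfo("Leaf", mid);          // declares nothing
        return lookup(leaf, method);
    }
}
```

A dispatch-table implementation would precompute the result of this search once per class rather than walking the chain on every call.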
A method m in some class S has been identified as the one to be invoked.
Now a new activation frame is created, containing the target reference and the argument values (if any), as well as enough space for the stack for the method to be invoked and any other bookkeeping information that may be required by the implementation (stack pointer, program counter, reference to previous activation frame, and the like). If there is not sufficient memory available to create such an activation frame, a run-time error occurs and the program terminates.
The newly created activation frame becomes the current activation frame. The effect of this is to assign the argument values to corresponding freshly created parameter variables of the method, and to make the target reference available as this. Before each argument value is assigned to its corresponding parameter variable, it is subjected to method invocation conversion (§3.3).
An array access expression refers to a variable that is a component of an array.
ArrayAccess:
    ExpressionName [ Expression ]
    PrimaryNoNewArray [ Expression ]
An array access expression contains two subexpressions, the array reference expression (before the left bracket) and the index expression (within the brackets). Note that the array reference expression may be a name or any primary expression that is not an array creation expression.
The type of the array reference expression must be an array type (call it T[], an array whose components are of type T) or a compile-time error results. Then the type of the array access expression is T.
The index expression must be of type int.
The result of an array reference is a variable of type T, namely the variable within the array selected by the value of the index expression.
An array access expression is evaluated using the following procedure:
If the value of the array reference expression is null, then a run-time error occurs and the program terminates.
Note: in an array access, the expression to the left of the brackets appears to be fully evaluated before any part of the expression within the brackets is evaluated.
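The note above can be observed with an index expression that reassigns the array variable. The sketch is in Java, which evaluates array accesses in the same order; the array initializer syntax and the names are Java conveniences, not T constructs.

```java
// Java sketch (not T): the array reference is evaluated before the index expression.
class ArrayAccessOrderDemo {
    static int demo() {
        int[] a = {11, 12, 13, 14};
        int[] b = {0, 1, 2, 3};
        // a is evaluated first and yields the original array {11, 12, 13, 14}.
        // The index expression then assigns b to a and evaluates b[3], which is 3.
        // The access therefore selects component 3 of the original array.
        return a[(a = b)[3]];
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints 14
    }
}
```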
The unary operators include -, ! and cast operators. Expressions with unary operators group right-to-left, so that -!x means the same as -(!x).
UnaryExpression:
    - UnaryExpression
    ! UnaryExpression
    CastExpression
CastExpression:
    ParenExpression CastExpression
    ( ArrayType ) CastExpression
    Primary
-
The type of the operand of the unary minus operator must be integer type, or a compile-time error occurs.
The type of the unary minus expression is int.
At run time, the value of the unary minus expression is the arithmetic negation of the value of the operand.
For integer values, negation is the same as subtraction from zero. The T programming language uses two's-complement representation for integers, and the range of two's-complement values is not symmetric, so negation of the maximum negative int results in that same maximum negative number. Overflow occurs in this case.
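The overflow case can be checked with a short Java sketch. It assumes a 32-bit two's-complement int, which is what Java provides; this chapter does not state the width of a T int, so the specific constant is an assumption.

```java
// Java sketch (not T): negating the most negative int yields the same value.
class NegationDemo {
    static int negate(int x) {
        return -x;
    }

    public static void main(String[] args) {
        System.out.println(negate(5));                                      // prints -5
        System.out.println(negate(Integer.MIN_VALUE) == Integer.MIN_VALUE); // prints true
    }
}
```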
!
The type of the operand of the unary logical complement operator must be integer type, or a compile-time error occurs.
The type of the unary logical complement expression is int.
At run time, the value of the unary logical complement expression is 1 if the operand value is 0 and 0 if the operand value is not 0.
Conceptually, the grammar for cast expressions is:
CastExpression:
    ( ReferenceType ) CastExpression
    Primary
However, for technical reasons (to make the grammar LALR(1)), the grammar was rewritten to parse a simple class name as an Expression. This eliminates an ambiguity that exists with one-token lookahead, where a parenthesized name cannot be distinguished from a cast. (See Section 19.1.5 of the first edition of the Java Language Specification for a discussion of this same problem in Java.)
The type of a cast expression is the type whose name appears within the parentheses. (The parentheses and the type they contain are sometimes called the cast operator.) The result of a cast expression is not a variable, but a value, even if the result of the operand expression is a variable.
At compile time, the type of the operand expression must be convertible by casting conversion (§3.4) to the type of the cast operator.
A run-time error (§8.17) occurs if the type of the cast operator is an array or reference type and the run-time type of the cast operand is not assignable to that type. That is, for reference types, the run-time type of the right-hand operand must be the same type as the left-hand type, or it must be a subclass of that type. For array types,
int
; or
Object
.
The operators * and / are called the multiplicative operators. They have the same precedence and are syntactically left-associative (they group left-to-right).
MultiplicativeExpression:
    UnaryExpression
    MultiplicativeExpression * UnaryExpression
    MultiplicativeExpression / UnaryExpression
The type of each of the operands of a multiplicative operator must be integer type, or a compile-time error occurs.
The type of the multiplicative expression is int.
*
The binary * operator performs multiplication, producing the product of its operands. Multiplication is a commutative operation if the operand expressions have no side effects. Integer multiplication is associative.
If an integer multiplication overflows, then the result is the low-order bits of the mathematical product as represented in some sufficiently large two's-complement format. As a result, if overflow occurs, then the sign of the result may not be the same as the sign of the mathematical product of the two operand values.
Despite the fact that overflow, underflow, or loss of information may occur, evaluation of a multiplication operator * never causes a run-time error.
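To illustrate the overflow rule, the following Java sketch compares a wrapped 32-bit product with the exact product computed in a wider type. It assumes a 32-bit int as in Java; the method names are hypothetical.

```java
// Java sketch (not T): an overflowing product keeps only the low-order 32 bits.
class MultiplyOverflowDemo {
    // Product in 32-bit int arithmetic; overflow wraps silently.
    static int wrapped(int a, int b) {
        return a * b;
    }

    // Mathematical product, computed in a 64-bit type that is large enough.
    static long exact(int a, int b) {
        return (long) a * (long) b;
    }

    public static void main(String[] args) {
        System.out.println(exact(46341, 46341));   // prints 2147488281
        System.out.println(wrapped(46341, 46341)); // prints -2147479015
        System.out.println(wrapped(65536, 65536)); // prints 0
    }
}
```

In the first case the mathematical product is positive but the wrapped result is negative, which is the sign change the text describes.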
/
The binary / operator performs division, producing the quotient of its operands. The left-hand operand is the dividend and the right-hand operand is the divisor.
Integer division rounds toward 0. That is, the quotient produced for operands n and d that are integers is an integer value q whose magnitude is as large as possible while satisfying |dq| ≤ |n|; moreover, q is positive when |n| ≥ |d| and n and d have the same sign, but q is negative when |n| ≥ |d| and n and d have opposite signs. There is one special case that does not satisfy this rule: if the dividend is the negative integer of largest possible magnitude for its type, and the divisor is -1, then integer overflow occurs and the result is equal to the dividend. Despite the overflow, no run-time error occurs in this case. On the other hand, if the value of the divisor in an integer division is 0, then a run-time error occurs and the program terminates.
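These division rules can be checked with a Java sketch, since Java integer division also rounds toward 0. A 32-bit int is assumed, and the names are hypothetical. Java signals division by zero with an ArithmeticException that can be caught, whereas a T program terminates.

```java
// Java sketch (not T): rounding toward zero, the overflow case, and division by zero.
class DivisionDemo {
    static int div(int n, int d) {
        return n / d;
    }

    // Returns true if dividing by zero is rejected at run time.
    static boolean divideByZeroFails() {
        try {
            div(1, 0);
            return false;
        } catch (ArithmeticException e) {
            return true; // T would print "ERROR: Divide by zero" and terminate
        }
    }

    public static void main(String[] args) {
        System.out.println(div(7, 2));   // prints 3
        System.out.println(div(-7, 2));  // prints -3
        System.out.println(div(7, -2));  // prints -3
        System.out.println(div(Integer.MIN_VALUE, -1) == Integer.MIN_VALUE); // prints true
    }
}
```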
The operators + and - are called the additive operators. They have the same precedence and are syntactically left-associative (they group left-to-right).
AdditiveExpression:
    MultiplicativeExpression
    AdditiveExpression + MultiplicativeExpression
    AdditiveExpression - MultiplicativeExpression
The type of each of the operands of the additive operators must be integer type, or a compile-time error occurs.
The type of the additive expression is int.
The binary + operator performs addition, producing the sum of the operands. The binary - operator performs subtraction, producing the difference of the operands.
Addition is a commutative operation if the operand expressions have no side effects. Integer addition is associative.
If an integer addition overflows, then the result is the low-order bits of the mathematical sum as represented in some sufficiently large two's-complement format. If overflow occurs, then the sign of the result is not the same as the sign of the mathematical sum of the two operand values.
The binary - operator performs subtraction of its two operands, producing the difference of its operands; the left-hand operand is the minuend and the right-hand operand is the subtrahend. It is always the case that a-b produces the same result as a+(-b).
Note that subtraction from zero is the same as negation.
Despite the fact that overflow, underflow, or loss of information may occur, evaluation of an additive operator never causes a run-time error.
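A Java sketch of the additive rules follows, again assuming a 32-bit int. It shows wrap-around on overflow and checks that a-b agrees with a+(-b) even when b is the most negative int. The method names are hypothetical.

```java
// Java sketch (not T): additive overflow and the identity a - b == a + (-b).
class AdditiveDemo {
    static int add(int a, int b) {
        return a + b;
    }

    static int sub(int a, int b) {
        return a - b;
    }

    static int addNegated(int a, int b) {
        return a + (-b);
    }

    public static void main(String[] args) {
        // The mathematical sum is positive, but the wrapped result is negative.
        System.out.println(add(Integer.MAX_VALUE, 1) == Integer.MIN_VALUE); // prints true
        // The identity holds even though -b itself overflows here.
        System.out.println(sub(3, Integer.MIN_VALUE) == addNegated(3, Integer.MIN_VALUE)); // prints true
    }
}
```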
The relational operators are syntactically left-associative (they group left-to-right).
RelationalExpression:
    AdditiveExpression
    RelationalExpression < AdditiveExpression
    RelationalExpression > AdditiveExpression
< and >
The type of each of the operands of an integer comparison operator must be integer type, or a compile-time error occurs. Signed integer comparison is performed.
The following rules then hold for integer operands:
The value produced by the < operator is 1 if the value of the left-hand operand is less than the value of the right-hand operand, and otherwise is 0.
The value produced by the > operator is 1 if the value of the left-hand operand is greater than the value of the right-hand operand, and otherwise is 0.
The equality operator is syntactically left-associative (it groups left-to-right).
EqualityExpression:
    RelationalExpression
    EqualityExpression == RelationalExpression
The == (equal to) operator is analogous to the relational operators except for its lower precedence. Thus, a < b==c > d is 1 whenever a < b and c > d have the same truth value.
The equality operator may be used to compare two operands of integer type or two operands that are each of either reference type or the null type. All other cases result in a compile-time error. The type of an equality expression is always an integer.
==
If the operands are integers, then an integer equality test is performed.
The following rules then hold for integer operands:
The value produced by the == operator is 1 if the value of the left-hand operand is equal to the value of the right-hand operand; otherwise, the result is 0.
==
If the operands of the equality operator are both of either reference type or the null type, then the operation is object equality.
At run time, the result of == is 1 if the operand values are both null or both refer to the same object or array; otherwise, the result is 0.
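Object equality compares references, not contents. The Java sketch below mirrors the rule by mapping Java's boolean result to the 1 or 0 that T produces; the Point class and the method names are hypothetical.

```java
// Java sketch (not T): == on references tests identity, not contents.
class ReferenceEqualityDemo {
    static class Point {
        int x;
    }

    // Mirrors T's integer result: 1 if both are null or the same object, else 0.
    static int same(Object p, Object q) {
        return p == q ? 1 : 0;
    }

    // Two distinct objects with identical field values are not equal.
    static int distinctObjects() {
        Point p = new Point();
        Point q = new Point();
        return same(p, q);
    }

    // Two variables that refer to one object are equal.
    static int aliases() {
        Point p = new Point();
        Point q = p;
        return same(p, q);
    }

    static int bothNull() {
        return same(null, null);
    }
}
```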
The assignment operator is syntactically right-associative (groups right-to-left). Thus, a=b=c means a=(b=c), which assigns the value of c to b and then assigns the value of b to a.
Assignment:
    LeftHandSide = AssignmentExpression
LeftHandSide:
    Identifier
    FieldAccess
    ArrayAccess
AssignmentExpression:
    EqualityExpression
    Assignment
The result of the first operand of an assignment operator must be a variable, or a compile-time error occurs. This operand may be a named variable, such as a field of the current object or class, or it may be a computed variable, as can result from a field access or an array access. The type of the assignment expression is the type of the variable.
At run time, the result of the assignment expression is the value of the variable after the assignment has occurred. The result of an assignment expression is not itself a variable.
A compile-time error occurs if the type of the right-hand operand cannot be converted to the type of the variable by assignment conversion.
At run time, the expression is evaluated in one of two ways. If the left-hand operand expression is not an array access expression, then three steps are required:
If the left-hand operand expression is an array access expression, then many steps are required:
If the value of the array reference subexpression is null, then a run-time error occurs and the program terminates.
int
, then SC is necessarily the same as TC. The value of the right-hand operand must be of type int
and is stored into the array component.
There is no exception "handling" in the T programming language. Run-Time errors will cause the abrupt termination of the program with an associated error message.
ERROR: Out of memory (line n).
ERROR: Negative array size (line n).
ERROR: Null reference (line n).
null
.
null
.
null
.
A delete statement (§7.9) is attempted on a null reference.
ERROR: Index out of bounds (line n).
length
of
the array.
ERROR: Divide by zero (line n).
ERROR: Invalid cast (line n).
ERROR: Invalid array assignment (line n).
Author(s): Pete Mitchell (§8.1-8.5, §8.17), Jakub Mokny (§8.6-8.8), Christopher Sayles (§8.9, §8.11, §8.16), Dennis Tolstenko (§8.10), Sam Winter (§8.12-8.15) and Spring 2006 CS712/CS812 class (edits, §8.12.3)