CHAPTER 7
Expressions
Much of the work in a program is done by evaluating
expressions, either for their side effects, such as
assignments to variables, or for their values, which can be used as
arguments or operands in larger expressions, or to affect the
execution sequence in statements, or both.
This chapter specifies the meanings of expressions and the rules for
their evaluation.
7.1 Evaluation, Denotation, and Result
When an expression in a program is evaluated
(executed), the result denotes one of two things:
- A variable (2.5) (in C, this would be
called an lvalue)
- A value (2.2, 2.3)
Evaluation of an expression can also produce side effects, because
expressions may contain embedded assignments and method invocations.
Each expression occurs either in the main block (6.1) or in the
declaration of some class type. In a class declaration, the expression
might occur in a constructor declaration, or in the code for an
operator or method.
7.2 Variables as Values
If an expression denotes a variable, and a value is required for use
in further evaluation, then the value of that variable is used. In
this context, if the expression denotes a variable or a value, we may
speak simply of the value of the expression.
7.3 Type of an Expression
If an expression denotes a variable or a value, then the expression
has a type known at compile time. The rules for determining the type
of an expression are explained separately below for each kind of
expression.
The value of an expression is always assignment compatible (3.2) with the type of the expression, just as the
value stored in a variable is always compatible with the type of the
variable. In other words, the value of an expression whose type is
T is always suitable for assignment to a variable of type
T.
7.4 Expressions and Run-Time Checks
If the type of an expression is a reference type, then the class
of the referenced object, or even whether the value is a reference to
an object rather than null, is not necessarily known at
compile time. There are a few places in the maTe programming language
where the actual class of a referenced object affects program
execution in a manner that cannot be deduced from the type of the
expression. They are as follows:
- Method invocation (7.9). The particular
method used for an invocation o.m(...) is
chosen based on the methods that are part of the class that is the
type of o. The class of the object referenced by the
run-time value of o participates because a subclass may
override a specific method already declared in a parent class so
that this overriding method is invoked. (The overriding method may
or may not choose to further invoke the original overridden
m method.)
- Casting (3.4, 7.10.1). The class of the object referenced by
the run-time value of the operand expression might not be compatible
with the type specified by the cast. This may require a run-time
check to ensure that the class of the referenced object, as
determined at run-time, is assignment compatible (3.2) with the target type.
The first of the cases just listed ought never to result in
detecting a type error, as it is constrained at compile time to be
valid. Thus, a run-time type error can occur only in the second case,
when the actual class of the object referenced by the operand value
is not compatible with the type specified by the cast. In this case,
the program terminates with a run-time error (7.18).
7.5 Evaluation Order
The maTe programming language guarantees that the operands of
operators appear to be evaluated in a specific evaluation
order, namely, from left to right.
It is recommended that code not rely crucially on this
specification. Code is usually clearer when each expression contains
at most one side effect, as its outermost operation.
7.5.1 Evaluate Left-Hand Operand First
The left-hand operand of a binary operator appears to be fully
evaluated before any part of the right-hand operand is evaluated. For
example, if the left-hand operand contains an assignment to a variable
and the right-hand operand contains a reference to that same variable,
then the value produced by the reference will reflect the fact that
the assignment occurred first.
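The rule can be illustrated with a short sketch in Java, whose left-to-right operand rule matches the one described here. This is an analogue rather than maTe code, and the class and method names are hypothetical.

```java
// Java analogue (not maTe): an assignment in the left operand is
// visible to a reference in the right operand.
class LeftOperandFirst {
    static int demo() {
        int a = 1;
        // The left operand (a = 5) is fully evaluated first, so the
        // right operand reads the updated value of a.
        return (a = 5) + a;
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints 10
    }
}
```

Code written this way is legal but hard to read, which is why 7.5 recommends at most one side effect per expression.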
7.5.2 Evaluate Operands before Operation
The maTe programming language also guarantees that every operand of an
operator appears to be fully evaluated before any part of the
operation itself is performed.
7.5.3 Evaluation Respects Parentheses and Precedence
maTe programming language implementations must respect the order
of evaluation as indicated explicitly by parentheses and implicitly by
operator precedence. An implementation may not take advantage of
algebraic identities such as the associative law to rewrite
expressions into a more convenient computational order unless it can
be proven that the replacement expression is equivalent in value and
in its observable side effects for all possible computational values
that might be involved.
Note that Integer addition and multiplication are
provably associative in the maTe programming language.
For example a+b+c, where a, b, and c are local variables, will always
produce the same answer whether evaluated as (a+b)+c or a+(b+c);
if the expression b+c occurs nearby
in the code, a smart compiler may be able to use this common
subexpression.
7.5.4 Argument Lists are Evaluated Left-to-Right
In a method or constructor invocation or class instance creation
expression, argument expressions may appear within the parentheses,
separated by commas. Each argument expression appears to be fully
evaluated before any part of any argument expression to its right.
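A minimal sketch of this ordering, written in Java as an analogue (Java evaluates argument lists left to right as well). The helper names note and sum3 are hypothetical and exist only to record the order of evaluation.

```java
import java.util.ArrayList;
import java.util.List;

// Java analogue (not maTe): each argument expression logs a label when
// it is evaluated, so the log shows the evaluation order.
class ArgumentOrder {
    static final List<String> log = new ArrayList<>();

    // Records that an argument expression was evaluated, then yields its value.
    static int note(String label, int value) {
        log.add(label);
        return value;
    }

    static int sum3(int x, int y, int z) {
        return x + y + z;
    }

    static List<String> demo() {
        log.clear();
        sum3(note("first", 1), note("second", 2), note("third", 3));
        return new ArrayList<>(log);
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints [first, second, third]
    }
}
```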
7.6 Primary Expressions
Primary expressions include most of the simplest kinds of
expressions, from which all others are constructed: literals, field
accesses, method invocations and names. A parenthesized expression is
also treated syntactically as a primary expression.
Primary:
Identifier
ParenExpression
this
FieldAccess
MethodInvocation
OperatorInvocation
ClassInstanceCreationExpression
Literal
7.6.1 Lexical Literals
A literal denotes a fixed, unchanging value.
The following production from (1.7) is
repeated here for convenience:
Literal:
IntegerLiteral
NullLiteral
StringLiteral
The type of a literal is determined as follows:
- The type of an integer literal is Integer.
- The type of a string literal is String.
- The type of the null literal null is the null type;
its value is the null reference.
7.6.2 this
The keyword this may be used only in the body of a
method, operator or constructor.
When used as a primary expression, the keyword this
denotes a value, that is a reference to the object for which the
method was invoked, or to the object being constructed. The type of
this is the class C within which the keyword
this occurs. At run time, the class of the actual object
referred to may be the class C or any subclass of C.
7.6.3 Parenthesized Expressions
A parenthesized expression is a primary expression whose
type is the type of the contained expression and whose value at run
time is the value of the contained expression. If the contained
expression denotes a variable then the parenthesized expression also
denotes that variable.
ParenExpression:
( Expression )
7.6.4 Expression Names
The rules for evaluating expression names are given in 4.5.3.
7.7 Class Instance Creation Expressions
A class instance creation expression is used to create new objects
that are instances of classes.
ClassInstanceCreationExpression:
new ClassType Arguments
Arguments:
( ArgumentList )
( )
ArgumentList:
ArgumentList , Expression
Expression
We say that a class is instantiated when an instance of
the class is created by a class instance creation expression. Class
instantiation involves determining what class is to be instantiated,
what constructor should be invoked to create the new instance and what
arguments should be passed to that constructor.
7.7.1 Determining the Class being Instantiated
The class being instantiated is the class denoted by ClassType.
The type of the class instance creation expression is the class type
being instantiated.
7.7.2 Choosing the Constructor and its Arguments
Let C be the class type being instantiated. To create an
instance of C, a constructor of C is chosen at
compile time by the following rules:
- First, the actual arguments to the constructor invocation are
determined.
- The arguments in the argument list, if any, are the arguments to
the constructor, in the order they appear in the expression.
- Once the actual arguments have been determined, they are used to
select a constructor of C, using the same rules as for method
invocations (7.9). As in method invocations, a
compile-time method matching error results if there is no
unique most-specific constructor that is both applicable and
accessible.
7.7.3 Run-time Evaluation of Class Instance Creation Expressions
At run time, a class instance creation expression requires memory
space to be allocated for the new class instance. If there is
insufficient space to allocate the object, the program terminates with
a run-time error.
The new object contains new instances of all the fields declared in
the specified class type and all its superclasses. As each new field
instance is created, it is initialized to its default value.
Next, the actual arguments to the constructor are evaluated,
left-to-right.
Next, the selected constructor of the specified class type is
invoked. This results in invoking at least one constructor for each
superclass of the class type.
The value of a class instance creation expression is a reference to
the newly created object of the specified class. Every time the
expression is evaluated, a fresh object is created.
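The observable parts of this sequence can be sketched in Java, which follows a similar order (default field values, then arguments, then constructors from superclass down). This is an analogue under that assumption, not maTe code; the classes Base and Derived and the helper arg are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Java analogue (not maTe): logs the order of argument evaluation and
// constructor invocation, and checks that each evaluation yields a new object.
class CreationOrder {
    static final List<String> log = new ArrayList<>();

    static class Base {
        Base() { log.add("Base constructor"); }
    }

    static class Derived extends Base {
        int count; // holds its default value 0 until the constructor assigns it

        Derived(int n) { // the implicit super() call runs Base() first
            log.add("Derived constructor, count was " + count);
            count = n;
        }
    }

    static int arg() {
        log.add("argument");
        return 3;
    }

    static List<String> demo() {
        log.clear();
        Derived d1 = new Derived(arg());
        Derived d2 = new Derived(arg());
        log.add("fresh objects: " + (d1 != d2));
        return new ArrayList<>(log);
    }

    public static void main(String[] args) {
        for (String line : demo()) {
            System.out.println(line);
        }
    }
}
```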
7.8 Field Access Expressions
A field access expression may access a field of an
object, a reference to which is the value of either an expression or
the special keyword super.
FieldAccess:
Primary . Identifier
super . Identifier
7.8.1 Field Access Using a Primary
The type of the Primary must be a reference type T, or a
compile-time error occurs. The meaning of the field access expression
is determined as follows:
- If the identifier does not name a member field of type T,
then the field access is undefined and a compile-time error occurs.
- Otherwise, the identifier names a member field of type T
and the type of the field access expression is the declared type of
the field.
- At run time, the result of the field access expression
is computed as follows:
- If the value of the Primary is null, then the
program terminates with a run-time error.
- If not, then the result is a variable, namely, the specified
instance variable in the object referenced by the value of the
Primary.
Note, specifically, that only the type of the Primary
expression, not the class of the actual object referred to at run
time, is used in determining which field to use.
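As a sketch of this rule, the following Java analogue (Java selects fields by compile-time type in the same way) declares a field x in a class and hides it in a subclass. The class names S and C and the int field are hypothetical stand-ins.

```java
// Java analogue (not maTe): the field is selected by the compile-time
// type of the expression, not by the run-time class of the object.
class FieldStaticType {
    static class S { int x = 1; }
    static class C extends S { int x = 2; } // hides S.x

    static int viaS() {
        S s = new C(); // run-time class is C, compile-time type is S
        return s.x;    // selects the field declared in S
    }

    static int viaC() {
        C c = new C();
        return c.x;    // selects the field declared in C
    }

    public static void main(String[] args) {
        System.out.println(viaS()); // prints 1
        System.out.println(viaC()); // prints 2
    }
}
```

This contrasts with method invocation (7.9), where the run-time class of the target object does take part in selecting the code to run.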
7.8.2 Accessing Superclass Members Using super
The special forms using the keyword super are valid
only in an instance method, operator or constructor of a class; these
are exactly the same situations in which the keyword this
may be used.
Suppose that a field access expression super.name
appears within class C, and the immediate superclass of
C is class S. Then super.name refers to the field named
name of the current object, but with the current object viewed as an
instance of the superclass. Thus it can access the field named
name that is visible in class S, even if that field is
hidden by a declaration of a field named name in class C.
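A brief Java analogue of the situation just described, with hypothetical classes S and C that each declare a field called name:

```java
// Java analogue (not maTe): super.name reaches the field visible in the
// superclass even though class C hides it with its own declaration.
class SuperFieldAccess {
    static class S {
        String name = "S.name";
    }

    static class C extends S {
        String name = "C.name"; // hides S.name

        String own()       { return this.name; }
        String inherited() { return super.name; }
    }

    public static void main(String[] args) {
        C c = new C();
        System.out.println(c.own());       // prints C.name
        System.out.println(c.inherited()); // prints S.name
    }
}
```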
7.9 Method Invocation Expressions
A method invocation expression is used to invoke a class or instance method.
MethodInvocation:
Identifier ( ArgumentList_opt )
Primary . Identifier ( ArgumentList_opt )
super . Identifier ( ArgumentList_opt )
7.9.1 Compile-Time Step 1: Determine Class to Search
The first step in processing a method invocation at compile time is to
figure out the name of the method to be invoked and which class to
check for definitions of methods of that name. There are several cases
to consider, depending on the form that precedes the left parenthesis,
as follows:
- If the form is Identifier, then the name of the method is
the Identifier. The class to search is the one whose
declaration contains the method invocation.
- If the form is Primary . Identifier, then the name
of the method is the Identifier and the class to be searched is
the type of the Primary expression.
- If the form is super . Identifier, then the
name of the method is the Identifier and the class to be
searched is the immediate superclass of the class whose declaration
contains the method invocation.
7.9.2 Compile-Time Step 2: Determine Method Signature
The second step searches the class determined in the previous step for
method declarations. This step uses the name of the method and the
types of the argument expressions to locate method declarations that
are applicable, that is, declarations that can be correctly invoked on
the given arguments. There may be more than one such method
declaration, in which case the most specific one is chosen. The
descriptor (signature plus return type) of the most specific method
declaration is the one used at run time to do the method dispatch.
7.9.2.1 Find Methods that are Applicable
A method declaration is applicable to a method invocation if
and only if both of the following are true:
- The number of parameters in the method declaration equals the
number of argument expressions in the method invocation.
- The type of each actual argument can be converted by method
invocation conversion (3.3) to the type of the
corresponding parameter.
The class determined by the process described in 7.9.1 is searched for all method declarations
applicable to this method invocation; method definitions inherited
from superclasses are included in this search.
If the class has no method declaration that is applicable, then a
compile-time error occurs.
7.9.2.2 Choose the Most Specific Method
If more than one method declaration is applicable to a method
invocation, it is necessary to choose one to provide the descriptor
for the run-time method dispatch. In this case the most
specific method is chosen.
The informal intuition is that one method declaration is more specific
than another if any invocation handled by the first method could be
passed on to the other one without a compile-time type error.
The precise definition is as follows:
- If there is only one applicable method, then that method is most
specific.
- Otherwise: Let m be a name and suppose that there are two
declarations of methods named m, each having n
parameters. Suppose that the types of the parameters of one
declaration are T1, . . . , Tn; and suppose, moreover,
that the types of the parameters of the other declaration are
U1, . . . , Un. Then the method with parameter types
T1, . . . , Tn is more specific than the method with
parameter types U1, . . . , Un if and only if:
- Tj can be converted to Uj by method invocation
conversion, for all j from 1 to n.
A method is said to be maximally specific for a method invocation if
it is applicable and there is no other applicable method that is more
specific.
If there is exactly one maximally specific method, then it is in fact
the most specific method; it is necessarily more specific than any
other method that is applicable.
It is possible that no method is the most specific, because there are
two or more maximally specific methods. In this case a compile-time
error occurs.
The type of the method invocation expression is the result type specified
in the compile-time declaration of the most specific method.
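The selection rules of 7.9.2.1 and 7.9.2.2 can be illustrated with a small Java analogue, since Java's overload resolution behaves the same way for this simple case. The classes Animal and Dog and the method describe are hypothetical.

```java
// Java analogue (not maTe): the compile-time types of the arguments
// decide which declarations are applicable and which is most specific.
class MostSpecific {
    static class Animal {}
    static class Dog extends Animal {}

    static String describe(Animal a) { return "describe(Animal)"; }
    static String describe(Dog d)    { return "describe(Dog)"; }

    static String withDogType() {
        Dog d = new Dog();
        // Both declarations are applicable; Dog converts to Animal, so
        // describe(Dog) is more specific and is chosen.
        return describe(d);
    }

    static String withAnimalType() {
        Animal a = new Dog();
        // Only describe(Animal) is applicable to an argument of type
        // Animal, even though the object is a Dog at run time.
        return describe(a);
    }

    public static void main(String[] args) {
        System.out.println(withDogType());    // prints describe(Dog)
        System.out.println(withAnimalType()); // prints describe(Animal)
    }
}
```

For an ambiguous case, consider two declarations with parameter types (Animal, Dog) and (Dog, Animal) invoked with two Dog arguments: both are applicable, neither is more specific than the other, and so there are two maximally specific methods and a compile-time error results.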
7.9.3 Runtime Evaluation of Method Invocation
At run time, method invocation requires four steps. First, a
target reference may be computed. Second, the argument
expressions are evaluated. Third, the actual code for the method
to be executed is located. Fourth, a new activation
frame is created and control is transferred to the method code.
7.9.3.1 Compute Target Reference (If Necessary)
There are several cases to consider, depending on which of the three
productions for MethodInvocation (7.9) is
involved:
- If the first production for MethodInvocation, which
includes an Identifier, is involved, then the target reference
is the value of this.
- If the second production for MethodInvocation, which
includes a Primary, is involved, then the expression
Primary is evaluated and the result is used as the target
reference.
- If the third production for MethodInvocation, which
includes the keyword super, is involved, then the target reference is
the value of this.
7.9.3.2 Evaluate Arguments
The argument expressions are evaluated in order, from left to right.
7.9.3.3 Locate Method to Invoke
If the target reference is null, a run-time error occurs
and the program terminates. Otherwise, the target reference is said
to refer to a target object and will be used as the value of the
keyword this in the invoked method.
A dynamic method lookup is used. The dynamic lookup process starts
from a class S, determined as follows:
- If the invocation is via the keyword super, then
S is initially the superclass of the class that contains the
method invocation.
- Otherwise S is initially the actual run-time class
R of the target object.
The dynamic method lookup uses the following procedure to search class
S, and then the superclasses of class S, as necessary,
for method m.
- If class S contains a declaration for a method named
m with the same descriptor (same number of parameters, the same
parameter types, and the same return type) required by the method
invocation as determined at compile time, then this is the method to
be invoked, and the procedure terminates.
- If S has a superclass, this same lookup procedure is
performed recursively using the direct superclass of S in place
of S; the method to be invoked is the result of the recursive
invocation of this lookup procedure.
We note that the dynamic lookup process, while described here
explicitly, will often be implemented implicitly, for example as a
side-effect of the construction and use of per-class method dispatch
tables, or the construction of other per-class structures used for
efficient dispatch.
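The explicit form of the lookup procedure can be sketched as follows. This is a minimal model in Java, not an actual maTe implementation: the ClassInfo record, the string format of the descriptors, and the null result for a failed search are all assumptions made for illustration.

```java
import java.util.HashSet;
import java.util.Set;

// Sketch of the dynamic lookup procedure over a hypothetical class table.
class DynamicLookup {
    // Hypothetical run-time class record: a name, a direct superclass
    // (null for a class without one), and the descriptors it declares.
    static class ClassInfo {
        final String name;
        final ClassInfo superclass;
        final Set<String> declared = new HashSet<>();

        ClassInfo(String name, ClassInfo superclass, String... descriptors) {
            this.name = name;
            this.superclass = superclass;
            for (String d : descriptors) {
                declared.add(d);
            }
        }
    }

    // Searches class s, then its superclasses, for the descriptor fixed at
    // compile time. Returns the declaring class, or null if none is found
    // (an assumption; compile-time checking should rule that case out).
    static ClassInfo lookup(ClassInfo s, String descriptor) {
        if (s == null) {
            return null;
        }
        if (s.declared.contains(descriptor)) {
            return s;
        }
        return lookup(s.superclass, descriptor);
    }

    static String demo() {
        ClassInfo shape = new ClassInfo("Shape", null, "area()Integer", "label()String");
        ClassInfo circle = new ClassInfo("Circle", shape, "area()Integer");
        ClassInfo unitCircle = new ClassInfo("UnitCircle", circle);
        // The target object's run-time class is UnitCircle in both lookups.
        return lookup(unitCircle, "area()Integer").name + ","
             + lookup(unitCircle, "label()String").name;
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints Circle,Shape
    }
}
```

For an invocation via super, the same procedure would simply start from the superclass of the class containing the invocation instead of from the run-time class of the target object.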
7.9.3.4 Create Frame
A method m in some class S has been identified as the
one to be invoked.
Now a new activation frame is created, containing the target reference
and the argument values (if any), as well as enough space for the
stack for the method to be invoked and any other bookkeeping
information that may be required by the implementation (stack pointer,
program counter, reference to previous activation frame, and the
like). If there is not sufficient memory available to create such an
activation frame, a run-time error occurs and the program terminates.
The newly created activation frame becomes the current activation
frame. The effect of this is to assign the argument values to
corresponding freshly created parameter variables of the method, and
to make the target reference available as this. Before each argument
value is assigned to its corresponding parameter variable, it is
subjected to method invocation conversion (3.3).
7.10 Unary Operators
The unary operators include -, ! and
cast operators. Expressions with unary operators group right-to-left,
so that -!x means the same as -(!x).
UnaryExpression:
- UnaryExpression
! UnaryExpression
CastExpression
CastExpression:
ParenExpression CastExpression
( ReferenceType ) CastExpression
Primary
7.10.1 Cast Operator
Conceptually, the grammar for cast expressions is:
CastExpression:
( ReferenceType ) CastExpression
Primary
However, for technical reasons (to make the grammar LALR(1)), the
grammar was rewritten to parse a simple class name as an Expression.
This eliminates an ambiguity that exists with one-token lookahead,
where a parenthesized name cannot be distinguished from a cast. (See
Section 19.1.5 of the first edition of the Java Language Specification
for a discussion of this same problem in Java.)
The type of a cast expression is the type whose name appears
within the parentheses. (The parentheses and the type they contain
are sometimes called the cast operator.) The result of a cast
expression is not a variable, but a value, even if the result of the
operand expression is a variable.
At compile time, the type of the operand expression must be
convertible by casting conversion (3.4) to the
type of the cast operator.
A run-time error (7.18) occurs if the type of
the cast operator is a reference type and the run-time type of the
cast operand is not assignable to that type. That is, for reference
types, the run-time class of the operand must be the type named by
the cast operator, or it must be a subclass of that type.
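The run-time check can be sketched with a Java analogue. One difference matters here: Java reports a failed cast by throwing ClassCastException, whereas a maTe program terminates with a run-time error. The classes Animal, Dog and Cat are hypothetical.

```java
// Java analogue (not maTe): a cast that is legal at compile time can
// still fail at run time, depending on the class of the actual object.
class CastCheck {
    static class Animal {}
    static class Dog extends Animal {}
    static class Cat extends Animal {}

    // Returns true if the cast to Dog succeeds at run time. The catch
    // block stands in for the program termination maTe would perform.
    static boolean castToDog(Animal a) {
        try {
            Dog d = (Dog) a; // compiles: Animal is convertible to Dog by casting
            return d != null;
        } catch (ClassCastException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(castToDog(new Dog())); // prints true
        System.out.println(castToDog(new Cat())); // prints false
    }
}
```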
7.10.2 instanceof Operator
The grammar for an instanceof expression is:
InstanceOfExpression:
InstanceOfExpression instanceof ReferenceType
RelationalExpression
The type of the instanceof expression is Integer. The
value of the Integer reference is either 0 or 1. It is 1
if and only if the type of InstanceOfExpression is convertible by
casting conversion (3.4) to the type
ReferenceType. If the value of an instanceof expression is 1, a cast
to the same type is guaranteed to succeed.
7.11 Arithmetic Operators
7.11.1 Multiplicative Operators
The operators * and / are called the
multiplicative operators. They have the same precedence and are
syntactically left-associative (they group left-to-right).
MultiplicativeExpression:
UnaryExpression
MultiplicativeExpression * UnaryExpression
MultiplicativeExpression / UnaryExpression
7.11.2 Additive Operators
The operators + and - are called the
additive operators. They have the same precedence and are
syntactically left-associative (they group left-to-right).
AdditiveExpression:
MultiplicativeExpression
AdditiveExpression + MultiplicativeExpression
AdditiveExpression - MultiplicativeExpression
7.12 Relational Operators
The relational operators are syntactically left-associative (they
group left-to-right).
RelationalExpression:
AdditiveExpression
RelationalExpression < AdditiveExpression
RelationalExpression > AdditiveExpression
7.13 Equality Operator
The equality operator is syntactically left-associative (it groups
left-to-right).
EqualityExpression:
InstanceOfExpression
EqualityExpression == InstanceOfExpression
The equality operator may be used to compare two operands for
object equality.
At run time, the result of == is an Integer object
with value of either 0 or 1. The value will be 1 if the two operands
denote the same object, and 0 otherwise.
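A short Java analogue of this identity comparison. Java's == on references yields a boolean, so the hypothetical helper same maps it to 1 or 0 to mirror the convention described here.

```java
// Java analogue (not maTe): == on references compares object identity.
class ReferenceEquality {
    // Mirrors the maTe convention of yielding 1 or 0 rather than a boolean.
    static int same(Object a, Object b) {
        return a == b ? 1 : 0;
    }

    static int[] demo() {
        Object first = new Object();
        Object alias = first;        // a second reference to the same object
        Object second = new Object(); // a distinct object
        return new int[] { same(first, alias), same(first, second) };
    }

    public static void main(String[] args) {
        int[] r = demo();
        System.out.println(r[0]); // prints 1
        System.out.println(r[1]); // prints 0
    }
}
```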
7.14 Assignment Operator
The assignment operator is syntactically right-associative (groups
right-to-left). Thus, a=b=c means a=(b=c),
which assigns the value of c to b and then assigns the
value of b to a.
Assignment:
LeftHandSide = AssignmentExpression
LeftHandSide:
Identifier
FieldAccess
AssignmentExpression:
EqualityExpression
Assignment
The result of the first operand of an assignment operator must be
a variable, or a compile-time error occurs. This operand may be a
named variable, such as a field of the current object or class, or it
may be a computed variable, as can result from a field access. The
type of the assignment expression is the type of the variable.
At run time, the result of the assignment expression is the value
of the variable after the assignment has occurred. The result of an
assignment expression is not itself a variable.
A compile-time error occurs if the type of the right-hand operand
cannot be converted to the type of the variable by assignment
conversion.
At run time, three steps are required:
- The left-hand operand is evaluated to produce a variable.
- The right-hand operand is evaluated.
- The value of the right-hand operand is converted to the type of
the left-hand variable, and the result of the conversion is stored
into the variable.
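The grouping and the value of an assignment expression can be seen in a small Java analogue, where assignment behaves the same way for this purpose. The variable names are arbitrary.

```java
// Java analogue (not maTe): assignment groups right-to-left and the
// assignment expression itself yields the value that was stored.
class ChainedAssignment {
    static int[] demo() {
        int a = 0, b = 0, c = 7;
        a = b = c;           // parsed as a = (b = c)
        int afterChain = a;  // a and b now both hold 7
        int v = (a = 9) + 1; // the value of (a = 9) is 9, so v is 10
        return new int[] { afterChain, b, a, v };
    }

    public static void main(String[] args) {
        for (int x : demo()) {
            System.out.println(x); // prints 7, 7, 9, 10 on separate lines
        }
    }
}
```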
7.15 Input Operator
The in operator is a rudimentary mechanism for accepting
strings as user input and it provides the only way to accept input in the
maTe programming language.
InputOperator:
in
The in operator reads a string of input from
stdin. It skips any leading
whitespace characters (CR, LR, LF, tab) in the input, and stops when
it encounters the first non-leading whitespace character (or EOF). If
there are no non-whitespace characters prior to EOF, the value of the
in expression is a null reference. Otherwise, a new
String object is created, whose characters are those read in by the
operator (except for leading whitespace characters), and the value of
the in expression is a reference to that String object.
The type of the in expression is String.
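The reading behaviour described above can be sketched as a small Java routine. This is an illustration only: the method name readToken is hypothetical, and Java's Character.isWhitespace is used as a stand-in for the whitespace set listed above, which is an assumption.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;

// Sketch of the token-reading behaviour of the in operator.
class InputToken {
    // Skips leading whitespace, then collects characters up to the next
    // whitespace character or EOF. Returns null if EOF is reached before
    // any non-whitespace character is seen.
    static String readToken(Reader in) {
        try {
            int ch = in.read();
            while (ch != -1 && Character.isWhitespace(ch)) {
                ch = in.read();
            }
            if (ch == -1) {
                return null;
            }
            StringBuilder token = new StringBuilder();
            while (ch != -1 && !Character.isWhitespace(ch)) {
                token.append((char) ch);
                ch = in.read();
            }
            return token.toString();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(readToken(new StringReader("  \t hello world"))); // prints hello
        System.out.println(readToken(new StringReader(" \n\t ")));           // prints null
    }
}
```

Note that this sketch consumes the whitespace character that ends the token; the text above does not say whether the in operator does the same.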
7.16 Operator Invocation Expressions
An operator invocation expression is used to invoke a class or instance operator.
OperatorInvocation:
Expression BinaryOperator Expression
UnaryOperator Expression
MinusOperator Expression
Expression MinusOperator Expression
The first step in processing an operator invocation at compile time is
to figure out the operator to be invoked and which class to check for
definitions of operators with the specified operator. There are
several cases to consider, depending on the form, as follows:
- If the form is Expression BinaryOperator Expression, then
the operator is BinaryOperator. The class to search is the
type of the left-most Expression expression. In this form,
the operator takes as its only argument the right-most
Expression expression.
- If the form is UnaryOperator Expression, then the
operator is UnaryOperator. The class to search is the type
of the Expression expression. In this form, the operator
takes no arguments.
- If the form is MinusOperator Expression, then the
operator is MinusOperator (unary minus operator). The class
to search is the type of the Expression expression. In this
form, the operator takes no arguments.
- If the form is Expression MinusOperator Expression, then
the operator is MinusOperator (binary minus operator). The class
to search is the type of the left-most Expression expression. In
this form, the operator takes as its only argument the right-most
Expression expression.
The second step searches the class determined in the previous step for
operator declarations. This step uses the operator and class
determined in the previous step to locate operator declarations that
are applicable, that is, declarations that can be correctly invoked on
the given number of arguments. There may be more than one such
operator declaration, in which case the most specific one is
chosen. The descriptor (signature plus return type) of the most
specific operator declaration is the one used at run time to do the
operator dispatch.
Finding applicable operator declarations is entirely similar to
finding applicable method declarations (7.9.2.1).
Finding the most specific operator declaration is entirely similar to
finding the most specific method declaration (7.9.2.2).
At run-time, operator invocation is entirely similar to run-time
evaluation of a method invocation (7.9.3).
7.17 Expression
An Expression is any assignment expression:
Expression:
AssignmentExpression
7.18 Run-Time Errors
There is no exception "handling" in the maTe programming
language. Run-Time errors will cause the abrupt termination of the
program with an associated error message.
Run-Time Error conditions and messages are as follows:
ERROR: Out of memory.
- A class instance creation expression (7.7)
fails due to insufficient memory available.
ERROR: Null reference.
- A field access (7.8) is attempted when the
value of the object reference expression is null.
- A method or operator invocation expression (7.9, 7.16) that invokes an
instance method is attempted when the target reference is null.
- A null reference is passed to the out statement
(6.10).
ERROR: Divide by zero.
- An Integer division (2.3.3) operation is attempted where the
value of the denominator is zero.
ERROR: Invalid cast.
- A cast (7.10.1) to a reference type is
attempted where the actual type of the operand expression is
incompatible with (not a subclass of) the reference type to which
it is being cast.
ERROR: Index out of bounds.
- The substr method defined by class String
was invoked such that at least one of the
following conditions was true (see 2.3.4):
  - The String reference which invoked the method
  contained 0 characters.
  - The beg or end indices were not legal indices.
  - The end index was smaller than beg.
ERROR: Number format exception.
- The toInteger method defined by class String
was invoked such that at least one of the
following conditions was true (see 2.3.4):
  - The characters of the String reference that
  invoked the method were other than the ASCII characters '0',
  '1', '2', '3', '4', '5', '6', '7', '8', or '9'. An ASCII minus
  sign '-' is the exception, but only if it is the first character
  (index 0).
  - The parsed integer value of the String
  reference that invoked the method was less than -2147483648 or
  greater than 2147483647.
ERROR: Concurrent modification exception.
- The put or remove method defined by
class Table was invoked after the iterator of the Table
reference that invoked the method had been initialized
but before that iterator had reached the end of the Table.