CHAPTER 1
This chapter specifies the structure of the language.
Programs are written in ASCII characters. Line terminators are defined (1.1) to support the different conventions of existing host systems while maintaining consistent line numbers.
The ASCII characters are reduced to a sequence of input elements (1.2), which are white space (1.3), comments (1.4), and tokens. The tokens are the identifiers (1.5), keywords (1.6), literals (1.7), separators (1.8), and operators (1.9) of the syntactic grammar.
An implementation divides the sequence of ASCII characters into lines by recognizing line terminators. This definition of lines determines the line numbers produced. It also specifies the termination of the // form of a comment.
LineTerminator:
the ASCII LF character, also known as "newline"
the ASCII CR character, also known as "return"
the ASCII CR character followed by the ASCII LF character
InputCharacter:
ASCIICharacter but not CR or LF
Lines are terminated by the ASCII characters CR, or LF, or CR LF. The two characters CR immediately followed by LF are counted as one line terminator, not two.
The result is a sequence of line terminators and input characters, which are the terminal symbols for the third step in the tokenization process.
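The line-terminator rule above can be sketched in Python as follows. This is only an illustration of the rule, not part of the maTe specification; the function name split_lines is hypothetical, and treating text after the last terminator as a final line is an assumption.

```python
def split_lines(source: str) -> list[str]:
    """Split text into lines; CR, LF, and CR LF each count as one terminator."""
    lines = []
    current = []
    i = 0
    while i < len(source):
        ch = source[i]
        if ch == "\r":
            # CR immediately followed by LF is a single terminator, not two.
            if i + 1 < len(source) and source[i + 1] == "\n":
                i += 1
            lines.append("".join(current))
            current = []
        elif ch == "\n":
            lines.append("".join(current))
            current = []
        else:
            current.append(ch)
        i += 1
    # Assumption: text after the last terminator forms a final line.
    if current:
        lines.append("".join(current))
    return lines
```

Note that LF followed by CR is two terminators under this rule, so the sketch reports an empty line between them.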
The input characters and line terminators are reduced to a sequence of input elements. Those input elements that are not white space (1.3) or comments (1.4) are tokens.
This process is specified by the following productions:
Input:
InputElements_opt
InputElements:
InputElement
InputElements InputElement
InputElement:
WhiteSpace
Comment
Token
Token:
Identifier
Keyword
Literal
Separator
Operator
White space is defined as the ASCII space, horizontal tab, and form feed characters, as well as line terminators.
WhiteSpace:
the ASCII SP character, also known as "space"
the ASCII HT character, also known as "horizontal tab"
the ASCII FF character, also known as "form feed"
LineTerminator
A comment has the following form:
// text
All text from the ASCII characters // up to the next LineTerminator is ignored.
EndOfLineComment:
/ / CharactersInLine_opt LineTerminator
CharactersInLine:
InputCharacter
CharactersInLine InputCharacter
An identifier is an unlimited-length sequence of letters and digits, the first of which must be a letter. An identifier cannot have the same spelling (ASCII character sequence) as a keyword (1.6), the null literal (1.7.2), the tab string literal, or the newline string literal (1.7.3).
Identifier:
IdentifierChars but not a Keyword or NullLiteral
IdentifierChars:
Letter
IdentifierChars LetterOrDigit
Letter:
any ASCII character that is a letter (see below)
LetterOrDigit:
any ASCII character that is a letter or digit (see below)
The letters include uppercase and lowercase ASCII Latin letters A-Z (0x41-0x5a), and a-z (0x61-0x7a), and the ASCII underscore (_, or 0x5f). The digits include the ASCII digits 0-9 (0x30-0x39).
Two identifiers are the same only if they are identical, that is, have the same ASCII character for each letter or digit.
The following character sequences, formed from ASCII letters, are reserved for use as keywords and cannot be used as identifiers.
Keyword: one of
break
class
continue
else
extends
if
in
instanceof
main
new
newline
null
out
operator
return
super
tab
this
while
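As a rough illustration of the identifier and keyword rules, the following Python sketch checks whether a character sequence is a legal identifier. The name is_identifier is hypothetical; the keyword set is copied from the list above.

```python
import string

# Keywords copied from the Keyword production above.
KEYWORDS = frozenset({
    "break", "class", "continue", "else", "extends", "if", "in",
    "instanceof", "main", "new", "newline", "null", "out", "operator",
    "return", "super", "tab", "this", "while",
})

# Letters are A-Z, a-z, and the underscore; digits are 0-9.
LETTERS = frozenset(string.ascii_letters + "_")
DIGITS = frozenset(string.digits)


def is_identifier(text: str) -> bool:
    """Return True if text is a letter followed by letters or digits and is not a keyword."""
    if not text or text[0] not in LETTERS:
        return False
    if any(ch not in LETTERS and ch not in DIGITS for ch in text[1:]):
        return False
    return text not in KEYWORDS
```

Because null, tab, and newline appear in the keyword set, the single keyword test also excludes the null literal and the two whitespace string literals.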
A literal is the source code representation of a value of the Integer type, the String type, or the null type.
Literal:
IntegerLiteral
NullLiteral
StringLiteral
An integer literal is expressed in decimal (base 10).
IntegerLiteral:
DecimalNumeral
A decimal numeral consists of an ASCII digit from 0 to 9, optionally followed by one or more ASCII digits from 0 to 9, representing a non-negative integer.
DecimalNumeral:
Digits
Digits:
Digit
Digits Digit
Digit: one of
0 1 2 3 4 5 6 7 8 9
An integer literal is of type Integer (2.3.3).
The largest decimal literal is 2147483648 (2^31). All decimal literals from 0 to 2147483647 may appear anywhere an integer literal may appear, but the literal 2147483648 may appear only as the operand of the unary negation operator "-".
A compile-time error occurs if a decimal literal is larger than 2147483648 (2^31), or if the literal 2147483648 appears anywhere other than as the operand of the unary "-" operator.
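The range rule for decimal literals can be sketched as below. This is an illustrative Python model, not a description of any actual maTe compiler; check_decimal_literal and its negated parameter (true when the literal is the operand of unary "-") are hypothetical, and ValueError stands in for a compile-time error.

```python
MAX_LITERAL = 2**31  # 2147483648, legal only as the operand of unary "-"


def check_decimal_literal(digits: str, negated: bool) -> int:
    """Return the value denoted by the literal, or raise ValueError for a compile-time error."""
    if not digits or any(ch not in "0123456789" for ch in digits):
        raise ValueError("not a decimal numeral")
    value = int(digits)
    if value > MAX_LITERAL:
        raise ValueError("decimal literal larger than 2147483648")
    if value == MAX_LITERAL and not negated:
        raise ValueError("2147483648 may appear only as the operand of unary -")
    return -value if negated else value
```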
An integer literal is mapped by the compiler to a class instance creation expression which generates a new Integer object whose value is the value of the integer literal parsed as a signed decimal integer (base 10). The value of the class instance creation expression is a reference to the Integer instance which was created.
The null type has one value, the null reference, represented by the literal null, which is formed from ASCII characters. A null literal has null type (2.1).
NullLiteral:
null
A string literal is of type String.
StringLiteral:
" StringCharacters "
WhitespaceStringLiteral
WhitespaceStringLiteral:
newline
tab
StringCharacters:
StringCharacters ASCIICharacter but not " (double quote), CR, LF or tab
A string literal is mapped by the compiler to a class instance creation expression which generates a new String object. For the quote-delimited string literal, the characters of the new String object will be all characters contained within the double quotes. For the newline string literal, the characters will be a single newline character. For the tab string literal, the characters will be a single tab character. The value of the class instance creation expression is a reference to the String instance which was created.
A string literal is of type String (2.3.4).
The following seven ASCII characters are the separators (punctuators):
Separator: one of
( ) { } ; , .
The following 10 tokens are the operators, formed from ASCII characters:
Operator:
=
==
new
UserDefinedOperator
UserDefinedOperator:
UnaryOperator
BinaryOperator
MinusOperator
UnaryOperator:
!
BinaryOperator:
+
*
/
>
<
MinusOperator:
-
Only UserDefinedOperator operators may be used in class operator declarations (5.6).
CHAPTER 2
The maTe programming language is a strongly typed language, which means that every variable and every expression has a type that is known at compile time. Types limit the values that a variable can hold or that an expression can produce, limit the operations supported on those values, and determine the meaning of the operations. Strong typing helps detect errors at compile time.
The maTe programming language is a pure object-oriented language.
All types are classes. Class types are divided into two categories: predefined types and user-defined types. There are four predefined types: the object type Object, the integer type Integer, the table type Table, and the string type String. User-defined class types are subclasses of one of the predefined types. There is also a special null type. An object is a dynamically created instance of a class type. The values of a class type are references to objects. All objects support the methods of class Object. Names of types are used in declarations, class instance creation expressions, and cast operators.
A variable is a storage location. A variable of a class type T can hold a null reference or a reference to an instance of class T or of any class that is a subclass of T. A variable of type Object can hold a null reference or a reference to any object.
There are two kinds of types in the maTe programming language: predefined types and user-defined types. maTe being a pure object-oriented programming language, there is only one kind of data value that can be stored in variables, passed as arguments, returned by methods, and operated on: class object references.
Type: ReferenceType
There is also a special null type, the type of the expression null, which has no name. Because the null type has no name, it is impossible to declare a variable of the null type. The null reference is the only possible value of an expression of null type. The null reference can always be converted to any reference type. In practice, the programmer can ignore the null type and just pretend that null is merely a special literal that can be of any reference type.
A predefined type is predefined by the maTe programming language and named by its reserved keyword. There are four of these:
PredefinedType:
Object
Integer
Table
String
User-defined types are subclasses of at least one of these types.
There is one kind of reference type: class types.
ReferenceType:
ClassType
ClassType:
TypeName
The sample code:
class Point { Integer metrics; }
declares a class type Point, and uses an Integer to declare the field metrics of the class Point.
An object is a class instance.
The reference values (often just references) may be pointers to these objects, or a special null reference, which refers to no object.
A class instance is explicitly created by a class instance creation expression.
The operators on references to objects are:
==
There may be many references to the same object. Most objects have a state, stored in the fields of objects that are instances of classes. If two variables contain references to the same object, the state of the object can be modified using one variable's reference to the object, and then the altered state can be observed through the reference in the other variable.
The class Object is a superclass of all other classes. A variable of type Object can hold a reference to any object. All classes inherit the methods of class Object, which are summarized here:
Integer equals(Object obj) { . . . }
Integer hashCode() { . . . }
String toString() { . . . }
The equals method defined by class Object returns a new Integer object whose value is 1 if the two references refer to the same object and 0 otherwise. This method can be overridden in subclasses to define a notion of object equality, which is based on value, not reference, comparison.
The hashCode method defined by class Object returns a new Integer object whose value is computed from the Object reference which invoked the method. This method can be overridden in subclasses to define a sensible hashing algorithm for the class.
The toString method defined by class Object returns a new String object whose characters will be "Object". This method can be overridden in subclasses to define an output format for the class.
The Object class defines a constructor:
Object() { }
Note that the body of the constructor is empty. Since Object is the primordial class and has no superclass, there is no call, either implicit or explicit, to the superclass constructor.
The class Integer is a subclass of the class Object. A variable of type Integer can hold a reference to any Integer object or a reference to any object whose class type is a descendant of Integer.
An Integer object holds a single integer value. The smallest legal value is -2147483648 and the largest is 2147483647. Overflow is not viewed as an error for methods defined by class Integer.
All classes which extend Integer inherit all of the methods of Integer, which are summarized here:
Integer equals(Object obj) { . . . }
Integer hashCode() { . . . }
String toString() { . . . }
Integer add(Integer i) { . . . }
Integer subtract(Integer i) { . . . }
Integer multiply(Integer i) { . . . }
Integer divide(Integer i) { . . . }
Integer greaterThan(Integer i) { . . . }
Integer lessThan(Integer i) { . . . }
Integer not() { . . . }
Integer minus() { . . . }
Integer operator + (Integer i) { . . . }
Integer operator - (Integer i) { . . . }
Integer operator * (Integer i) { . . . }
Integer operator / (Integer i) { . . . }
Integer operator > (Integer i) { . . . }
Integer operator < (Integer i) { . . . }
Integer operator ! () { . . . }
Integer operator - () { . . . }
The equals method defined by class Integer returns a new Integer object whose value is either 0 or 1. The value is 1 if and only if obj is convertible by casting conversion (3.4) to Integer and the value of the Integer reference that invoked the method is equal to the value of obj.
The hashCode method defined by class Integer returns a new Integer object whose value is the value of the Integer reference which invoked the method.
The toString method defined by class Integer returns a new String object whose characters are the signed decimal (base 10) representation of the value of the Integer reference which invoked the method.
The add method defined by class Integer returns a new Integer object whose value is the value of i added to the value of the Integer reference which invoked the method. The value of the Integer reference which invoked the method will be unchanged.
The operator + method defined by class Integer is equivalent to the method add. If s and t are Integer references, an invocation of s + t will have all the same effects as an invocation of s.add(t).
The subtract method defined by class Integer returns a new Integer object whose value is the value of i subtracted from the value of the Integer reference which invoked the method. The value of the Integer reference which invoked the method will be unchanged.
The operator - method defined by class Integer is equivalent to the method subtract. If s and t are Integer references, an invocation of s - t will have all the same effects as an invocation of s.subtract(t).
The multiply method defined by class Integer returns a new Integer object whose value is the value of i multiplied by the value of the Integer reference which invoked the method. The value of the Integer reference which invoked the method will be unchanged.
The operator * method defined by class Integer is equivalent to the method multiply. If s and t are Integer references, an invocation of s * t will have all the same effects as an invocation of s.multiply(t).
The divide method defined by class Integer returns a new Integer object whose value is the value of the Integer reference which invoked the method divided by the value of i. The value of the Integer reference which invoked the method will be unchanged.
The following run-time errors must be detected for the divide method:
For all detected run-time errors, the action is to terminate the program with an error message.
The operator / method defined by class Integer is equivalent to the method divide. If s and t are Integer references, an invocation of s / t will have all the same effects as an invocation of s.divide(t).
The lessThan method defined by class Integer returns a new Integer object whose value is either 0 or 1. The value is 1 if and only if the value of the Integer reference that invoked the method is less than the value of i.
The operator < method defined by class Integer is equivalent to the method lessThan. If s and t are Integer references, an invocation of s < t will have all the same effects as an invocation of s.lessThan(t).
The greaterThan method defined by class Integer returns a new Integer object whose value is either 0 or 1. The value is 1 if and only if the value of the Integer reference that invoked the method is greater than the value of i.
The operator > method defined by class Integer is equivalent to the method greaterThan. If s and t are Integer references, an invocation of s > t will have all the same effects as an invocation of s.greaterThan(t).
The not method defined by class Integer returns a new Integer object whose value is either 0 or 1. The value is 1 if and only if the value of the Integer reference that invoked the method is 0. The value of the Integer reference which invoked the method will be unchanged.
The operator ! method defined by class Integer is equivalent to the method not. If s is an Integer reference, an invocation of !s will have all the same effects as an invocation of s.not().
The minus method defined by class Integer returns a new Integer object whose value is the arithmetic negation of the value of the Integer reference which invoked the method. The value of the Integer reference which invoked the method will be unchanged.
The operator - method (invoked with no arguments) defined by class Integer is equivalent to the method minus. If s is an Integer reference, an invocation of -s will have all the same effects as an invocation of s.minus().
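The relationship between the named methods and their operator forms can be modelled with a small Python class. This is a hypothetical model for illustration only: MateInteger is not part of maTe, only a few of the methods are shown, and overflow behaviour (which this section does not define beyond saying it is not an error) is deliberately left out.

```python
class MateInteger:
    """Hypothetical Python model of a few Integer methods; not part of maTe."""

    def __init__(self, value: int = 0):
        self.value = value

    def add(self, i: "MateInteger") -> "MateInteger":
        # A new object is returned; the receiver is left unchanged.
        return MateInteger(self.value + i.value)

    def lessThan(self, i: "MateInteger") -> "MateInteger":
        # Truth values are Integer objects holding 1 or 0.
        return MateInteger(1 if self.value < i.value else 0)

    def not_(self) -> "MateInteger":
        # Named not_ because "not" is reserved in Python.
        return MateInteger(1 if self.value == 0 else 0)

    def minus(self) -> "MateInteger":
        return MateInteger(-self.value)

    # Operator forms delegate to the named methods, mirroring s + t == s.add(t).
    __add__ = add
    __lt__ = lessThan
    __neg__ = minus
```

With s = MateInteger(3) and t = MateInteger(4), the expressions s + t and s.add(t) both produce a new object holding 7, and s still holds 3 afterwards.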
The Integer class defines the following constructors:
Integer()
This constructor defined by class Integer creates a new Integer object whose value is 0.
Integer(Integer i)
This constructor defined by class Integer creates a new Integer object whose value is the same as the value of i.
The class String is a subclass of the class Object. A variable of type String can hold a reference to any String object or a reference to any object whose class type is a descendant of String.
A String object contains an array of ASCII characters. String objects are static in that their characters cannot be modified once instantiated, nor can their size change (characters cannot be added or removed).
All classes which extend String inherit all of the methods of String, which are summarized here:
Integer equals(Object obj) { . . . }
Integer hashCode() { . . . }
String toString() { . . . }
Integer length() { . . . }
String substr(Integer beg, Integer end) { . . . }
String concat(String s) { . . . }
Integer toInteger() { . . . }
String operator + (String s) { . . . }
Integer operator > (String s) { . . . }
Integer operator < (String s) { . . . }
The equals method defined by class String returns a new Integer object whose value will be either 0 or 1. The value will be 1 if and only if obj is convertible by casting conversion (3.4) to String and the characters of the String reference that invoked the method are lexicographically equal to the characters of the String reference obj.
The hashCode method defined by class String returns a new Integer object whose value is the summation of the ASCII values of all characters in the String reference which invoked the method.
The toString method defined by class String returns a new String object whose characters will be the same characters as the String reference which invoked the method.
The length method defined by class String returns a new Integer object whose value is the number of characters in the String reference which invoked the method.
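As a worked illustration of the String hashCode rule, the following Python sketch sums the ASCII values of the characters. The function name string_hash_code is hypothetical, and behaviour for sums beyond the Integer range is not modelled.

```python
def string_hash_code(s: str) -> int:
    """Sum of the ASCII values of all characters, as described for String hashCode."""
    return sum(ord(ch) for ch in s)
```

For example, "ab" hashes to 97 + 98 = 195. Because the sum ignores character order, strings that are rearrangements of each other, such as "abc" and "cba", receive the same hash value.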
The substr method defined by class String returns a new String object whose characters are those at indices beg through end, inclusive. Indices of a String object are assigned in the following manner: the first character of a String object is assigned index 0, the second character is assigned index 1, the third character is assigned index 2, and so on. The characters of the new String object will be in the same order as those in the String reference which invoked the method. The minimum legal index for a String reference is 0. The maximum legal index for a String reference is the number of characters it contains minus 1. Thus, legal indices for a String reference containing 5 characters would range from 0 to 4. A String reference containing 0 characters has no legal indices.
The following run-time errors must be detected for the substr method:
Invoking the substr method on a String reference containing 0 characters.
Invoking the substr method in which the beg or end indices are not legal indices (see above).
Invoking the substr method in which end is smaller than beg.
For all detected run-time errors, the action is to terminate the program with an error message.
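The inclusive indexing and the three error conditions of substr can be sketched in Python as follows. This is a minimal model under stated assumptions: the function operates on a Python str rather than a maTe String, and raising RuntimeError stands in for terminating the program with an error message.

```python
def substr(s: str, beg: int, end: int) -> str:
    """Return the characters of s at indices beg through end, inclusive."""
    if len(s) == 0:
        raise RuntimeError("substr invoked on a String containing 0 characters")
    last = len(s) - 1
    if not (0 <= beg <= last) or not (0 <= end <= last):
        raise RuntimeError("beg or end is not a legal index")
    if end < beg:
        raise RuntimeError("end is smaller than beg")
    # Python slices exclude the upper bound, so add 1 to include index end.
    return s[beg:end + 1]
```

For a 5-character string such as "hello", substr("hello", 1, 3) yields "ell" and substr("hello", 2, 2) yields the single character "l".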
The concat method defined by class String returns a new String object whose value is the characters of String s appended to the characters in the String reference which invoked the method.
The toInteger method defined by class String returns a new Integer object whose value is the decimal value obtained by parsing the characters of the String reference which invoked the method as a signed decimal integer.
The following run-time errors must be detected for the toInteger method:
Invoking the toInteger method on a String reference that contains characters other than the ASCII characters '0', '1', '2', '3', '4', '5', '6', '7', '8', or '9'. An ASCII minus sign '-' is the exception, but only if it is the first character (index 0).
Invoking the toInteger method on a String reference whose parsed integer value is less than -2147483648 or greater than 2147483647.
For all detected run-time errors, the action is to terminate the program with an error message.
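The toInteger checks can be sketched in Python as below. This is an illustrative model only: the name to_integer is hypothetical, RuntimeError stands in for program termination, and rejecting an empty string or a lone "-" is an assumption, since the text above does not address those cases.

```python
def to_integer(s: str) -> int:
    """Parse s as a signed decimal integer within the Integer range."""
    # A minus sign is permitted only as the first character (index 0).
    body = s[1:] if s.startswith("-") else s
    # Assumption: an empty digit sequence ("" or "-") is treated as an error.
    if not body or any(ch not in "0123456789" for ch in body):
        raise RuntimeError("String contains characters other than digits")
    value = int(s)
    if value < -2147483648 or value > 2147483647:
        raise RuntimeError("parsed value is outside the Integer range")
    return value
```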
The operator + method defined by class String is equivalent to the method concat. If s and t are String references, an invocation of s + t will have all the same effects as an invocation of s.concat(t).
The operator > method defined by class String returns a new Integer object whose value will be either 0 or 1. The value will be 1 if and only if the characters of the String reference s lexicographically precede the characters of the String reference that invoked the method.
The operator < method defined by class String returns a new Integer object whose value will be either 0 or 1. The value will be 1 if and only if the characters of the String reference that invoked the method lexicographically precede the characters of the String reference s.
The String class defines the following constructor:
String(String s)
This constructor defined by class String creates a new String object whose characters are the same as the characters in s.
The class Table is a subclass of the class Object. A variable of type Table can hold a reference to any Table object or a reference to any object whose class type is a descendant of Table.
A Table object is an implementation of a hash map. The number of buckets in a Table object is known as its capacity. A Table object's initial capacity is set at the time it is constructed. A Table object's load factor is a measure of how full the Table is allowed to get before its capacity is automatically increased. When the number of entries in a Table exceeds the product of the load factor and the current capacity, the capacity is roughly doubled and the hashCode method is called on all entries in the Table to determine their new bucket placement with the new capacity.
Each entry in a bucket is a key/value pair. Thus each entry in a Table object will contain a key reference and a value reference. Each bucket of a Table object can hold any number of entries. The first bucket is known as bucket 0, the second bucket is bucket 1, the third bucket is bucket 2, and so on. Bucket n is the bucket an entry will be placed in if its key's hash value is n. The hash value for a key is computed by invoking the hashCode method on the key and then taking that number modulo the current capacity of the Table reference.
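The bucket placement and resize rules can be illustrated with two small Python functions. Both names are hypothetical, and the load factor of 0.75 used in the example below is an assumption, since this section does not state its value. The handling of negative hashCode results is likewise not covered here.

```python
def bucket_index(hash_code: int, capacity: int) -> int:
    """Hash value of a key: its hashCode result modulo the current capacity."""
    return hash_code % capacity


def needs_resize(entry_count: int, capacity: int, load_factor: float) -> bool:
    """True when the number of entries exceeds load factor times capacity."""
    return entry_count > load_factor * capacity
```

For a key whose hashCode result is 195 in a Table of capacity 16, the entry is placed in bucket 3. With an assumed load factor of 0.75, the same Table would grow when it holds more than 12 entries.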
A Table reference can be iterated over using the firstKey and nextKey methods. The Table iterator is fail-fast. This means that if an entry is added to or removed from the Table after the iterator has been initialized with a call to firstKey but before a call to nextKey causes it to reach the end of the Table, a run-time error occurs.
All classes which extend Table inherit all of the methods of Table, which are summarized here:
Object get(Object key) { . . . }
Object put(Object key, Object value) { . . . }
Object remove(Object key) { . . . }
Integer firstKey() { . . . }
Object nextKey() { . . . }
The get method defined by class Table returns the value reference of the Table entry whose key reference is equal to key, or a null reference if no such entry exists. To find this entry, inspect bucket n, where n is the hash value for key (computed using the hashCode method and the Table's current capacity, see above). Then, run the following search algorithm:
Invoke the equals method on key using the current entry's key reference as its argument.
If the result is an Integer object with a value of 1, return the current entry's value reference.
The put method defined by class Table inserts a new entry into the Table reference which invoked the method, and returns the old value reference for key or a null reference if there was none. To add a new entry, the hash value for key is first computed (using the hashCode method and the Table's current capacity, see above). Then, bucket n is inspected, where n is the computed hash value, using the following insertion algorithm:
Invoke the equals method on key using the current entry's key reference as its argument.
If the result is an Integer object with a value of 1, remove the current entry from the bucket. Add a new entry to the bucket using key as the key reference and value as the value reference, then return the value reference of the removed entry.
The following run-time errors must be detected for the put method:
Invoking the put method when the iterator of the Table reference which invoked the method has been initialized but has not reached the end of the Table.
For all detected run-time errors, the action is to terminate the program with an error message.
The remove method defined by class Table removes the entry whose key reference is equal to key in the Table reference which invoked the method. If an entry with a matching key reference is found, the value reference of the removed entry is returned. Otherwise, the remove method returns a null reference. To find the entry to remove, inspect bucket n, where n is the hash value for key (computed using the hashCode method and the Table's current capacity, see above). Then, run the following search algorithm:
Invoke the equals method on key using the current entry's key reference as its argument.
If the result is an Integer object with a value of 1, remove the current entry and return its value reference.
The following run-time errors must be detected for the remove method:
Invoking the remove method when the iterator of the Table reference which invoked the method has been initialized but has not reached the end of the Table.
For all detected run-time errors, the action is to terminate the program with an error message.
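The get, put, and remove descriptions above can be sketched together in Python. This is a simplified, hypothetical model (SketchTable is not part of maTe): Python's built-in hash and == stand in for the hashCode and equals methods, None stands in for the null reference, and neither resizing nor the iterator check is modelled.

```python
class SketchTable:
    """Simplified model of Table lookup, insertion, and removal within buckets."""

    def __init__(self, capacity: int = 16):
        self.buckets = [[] for _ in range(capacity)]

    def _bucket(self, key):
        # Bucket n, where n is the key's hash modulo the current capacity.
        return self.buckets[hash(key) % len(self.buckets)]

    def get(self, key):
        for entry_key, entry_value in self._bucket(key):
            if key == entry_key:  # stands in for key.equals(entry key) returning 1
                return entry_value
        return None

    def put(self, key, value):
        bucket = self._bucket(key)
        old_value = None
        for index, (entry_key, entry_value) in enumerate(bucket):
            if key == entry_key:
                # Remove the matching entry and remember its value reference.
                old_value = entry_value
                del bucket[index]
                break
        bucket.append((key, value))
        return old_value

    def remove(self, key):
        bucket = self._bucket(key)
        for index, (entry_key, entry_value) in enumerate(bucket):
            if key == entry_key:
                del bucket[index]
                return entry_value
        return None
```

With the default capacity of 16, the integer keys 1 and 17 land in the same bucket, which makes them a convenient pair for exercising the per-bucket search.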
The firstKey method defined by class Table initializes an iterator over the Table reference which invoked the method. After invoking this method on a Table reference, the next invocation of nextKey will return the key reference of the first entry in the first non-empty bucket of that Table reference, or a null reference if none exist. If the first invocation of nextKey would return a null reference, firstKey returns a new Integer object with a value of 0, otherwise firstKey will return a new Integer object with a value of 1.
The nextKey method defined by class Table advances the iterator and returns the key reference of the entry previously pointed to by the iterator of the Table reference which invoked the method. Advancing the iterator is done as follows:
Table
.
All further calls to nextKey on a Table reference whose iterator has reached the end of the Table will return a null reference until the firstKey method is invoked again.
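The fail-fast behaviour of firstKey and nextKey can be modelled roughly as follows. This Python sketch is hypothetical and simplified: IterableTable is not part of maTe, put does not return the old value, the iterator is a snapshot of the keys in bucket order (equivalent here because modification is forbidden while it is active), and RuntimeError stands in for terminating the program.

```python
class IterableTable:
    """Simplified model of the fail-fast Table iterator."""

    def __init__(self, capacity: int = 16):
        self.buckets = [[] for _ in range(capacity)]
        self._pending = None  # keys not yet returned; None means no active iterator

    def put(self, key, value):
        if self._pending is not None:
            raise RuntimeError("Table modified while its iterator is active")
        bucket = self.buckets[hash(key) % len(self.buckets)]
        for entry in bucket:
            if entry[0] == key:
                entry[1] = value
                return
        bucket.append([key, value])

    def first_key(self) -> int:
        # Collect keys bucket by bucket, starting at bucket 0.
        keys = [entry[0] for bucket in self.buckets for entry in bucket]
        self._pending = keys if keys else None
        return 1 if keys else 0

    def next_key(self):
        if self._pending is None:
            return None  # stands in for the null reference
        key = self._pending.pop(0)
        if not self._pending:
            self._pending = None  # the iterator has reached the end
        return key
```

A typical traversal calls first_key once and then calls next_key until it returns the null stand-in; once the last key has been returned, put is permitted again.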
The Table class defines the following constructors:
Table()
This constructor defined by class Table creates a new Table object whose initial capacity is 16.
Table(Integer n)
This constructor defined by class Table creates a new Table object whose initial capacity is the value of the Integer reference n.
Two reference types are the same type if:
Types are used in declarations, in class instance creation expressions, and in cast operator expressions.
A variable is a storage location and has an associated type, sometimes called its compile-time type, that is, a reference type. A variable's value can be changed by an assignment and a variable may only be assigned a value that is assignment compatible with its type.
Compatibility of the value of a variable with its type is guaranteed by the design of the maTe programming language. Otherwise, default values are compatible and all variable assignments are checked for assignment compatibility at compile time, run time, or both (reference types).
A variable of reference type can hold either of the following:
There are five kinds of variables:
Every variable in a program must have a value before its value is used:
null.
In the maTe programming language, every variable and every expression has a type that can be determined at compile time. Reference types are introduced by type declarations, which include class declarations.
Every object belongs to some particular class: the class that was mentioned in the creation expression that produced the object. This class is called the class of the object. An object is said to be an instance of its class and of all superclasses of its class.
Sometimes a variable or expression is said to have a "run-time type". This refers to the class of the object referred to by the value of the variable or expression at run time, assuming that the value is not null.
The compile-time type of a variable is always declared, and the compile-time type of an expression can be deduced at compile time. The compile-time type limits the possible values that the variable can hold or the expression can produce at run time. If a run-time value is a reference that is not null, it refers to an object that has a class, and that class will necessarily be compatible with the compile-time type.
CHAPTER 3
Every expression written in the maTe programming language has a type that can be deduced from the structure of the expression and the types of the literals, variables, and methods mentioned in the expression. It is possible, however, to write an expression in a context where the type of the expression is not appropriate. In some cases, this leads to an error at compile time.
A specific conversion from type S to type T allows an expression of type S to be treated at compile time as if it had type T instead. In some cases this will require a corresponding action at run time to check the validity of the conversion.
For convenience of description, the specific conversions that are possible in the maTe programming language are grouped into several broad categories:
There are three conversion contexts in which conversion of expressions may occur. The term "conversion" is also used to describe the process of choosing a specific conversion for such a context. For example, we say that an expression that is an actual argument in a method invocation is subject to "method invocation conversion," meaning that a specific conversion will be implicitly chosen for that expression according to the rules for the method invocation argument context.
This chapter first describes the three categories of conversions (3.1). Then the three conversion contexts are described:
Specific type conversions in the maTe programming language are divided into the following categories.
A conversion from a type to that same type is permitted for any type.
This may seem trivial, but it does have practical consequences. It is always permitted for an expression to have the desired type to begin with, thus allowing the simply stated rule that every expression is subject to conversion, if only a trivial identity conversion.
The following conversions are called the widening reference conversions:
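From any class type S to any class type T, provided that S is a subclass of T. (An important special case is that there is a widening conversion to the class type Object from any other class type.)
From the null type to any class type.
Such conversions never require a special action at run time. They consist simply in regarding a reference as having some other type in a manner that can be proved correct at compile time.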
See 5 for the detailed specifications for classes.
The following conversions are called the narrowing reference conversions:
From any class type S to any class type T, provided that S is a superclass of T. (An important special case is that there is a narrowing conversion from the class type Object to any other class type.)
Such conversions require a test at run time to find out whether the actual reference value is a legitimate value of the new type.
There is no permitted conversion to the null type other than the identity conversion.
Assignment conversion occurs when the value of an expression is assigned (7.14) to a variable: the type of the expression must be converted to the type of the variable. Assignment contexts allow the use of an identity conversion (3.1.1) or a widening reference conversion (3.1.2).
If the type of the expression cannot be converted to the type of the variable by a conversion permitted in an assignment context, then a compile-time error occurs.
If the type of an expression can be converted to the type of a variable by assignment conversion, we say the expression (or its value) is assignable to the variable or, equivalently, that the type of the expression is assignment compatible with the type of the variable.
A value of the null type (the null reference is the only such value) may be assigned to any reference type, resulting in a null reference of that type.
Assignment of a value of compile-time reference type S (source) to a variable of compile-time reference type T (target) is checked as follows:
See 5 for the specification of classes.
Method invocation conversion is applied to each argument value in a method or constructor invocation (7.7, 7.9): the type of the argument expression must be converted to the type of the corresponding parameter. Method invocation contexts allow the use of an identity conversion (3.1.1) or a widening reference conversion (3.1.2).
Casting conversion is applied to the operand of the cast operator: the type of the operand expression must be converted to the type explicitly named by the cast operator. Casting conversion allows the use of an identity conversion (3.1.1), a widening reference conversion (3.1.2), or a narrowing reference conversion (3.1.3). Casting using narrowing reference conversion (3.1.3) will require a run-time check to see if the cast is valid. In the event that the cast is not valid the program will terminate with an appropriate error message.
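The three kinds of conversion can be illustrated with a minimal sketch in Java, whose reference conversions closely resemble the ones described in this chapter. The classes Shape and Circle and the helper canNarrowToCircle are hypothetical names introduced only for this example. One difference should be kept in mind: Java reports an invalid cast by throwing ClassCastException, whereas a maTe program terminates with an error message.

```java
class ConversionSketch {
    // Hypothetical classes, introduced only for illustration.
    static class Shape { }
    static class Circle extends Shape { }

    // Mirrors the run-time test behind a narrowing conversion from Shape to Circle.
    static boolean canNarrowToCircle(Shape s) {
        return s instanceof Circle;
    }

    public static void main(String[] args) {
        Circle c = new Circle();
        Shape s = c;              // widening reference conversion, proved at compile time
        Shape n = null;           // null type to a class type, also widening
        Circle back = (Circle) s; // narrowing reference conversion, checked at run time
        System.out.println(back == c);                      // true
        System.out.println(canNarrowToCircle(new Shape())); // false
        try {
            Circle bad = (Circle) new Shape(); // the run-time check fails here
            System.out.println(bad);
        } catch (ClassCastException e) {
            System.out.println("invalid cast detected");
        }
    }
}
```

The assignment to s needs no cast because assignment contexts permit widening; the assignment to back needs one because narrowing is permitted only in a casting context.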
CHAPTER 4
Names are used to refer to entities declared in a program (4.1). A declared entity is a class type, a member (field or method) of a reference type, a parameter (to a method or constructor) or a local variable.
Names in maTe programs are simple, consisting of a single identifier (4.2).
Every declaration that introduces a name has a scope (4.3), which is the part of the program text within which the declared entity can be referred to by a name.
Reference types (that is, class types) have members (4.4). A member can be referred to using a qualified name N.x, where N is a variable of a reference type (or this or super) and x is an identifier that names a member of that type, which is either a field or a method.
In determining the meaning of a name (4.5), the context of the occurrence is used to disambiguate among types, variables, and methods with the same name.
The name of a field, parameter, or local variable may be used as an expression (7.2). The name of a method may appear in an expression only as part of a method invocation expression (7.9). The name of a class type may appear in an expression only as part of a class instance creation expression (7.7), a cast operator (7.10.1) or an instanceof operator (7.10.2).
A declaration introduces an entity into a program and includes an identifier that can be used as a name to refer to this entity. A declared entity is one of the following: a class type, a field, a method, a parameter of a method or constructor, or a local variable.
Constructors are also introduced by declarations, but use a name based upon the name of the class in which they are declared rather than introducing a new name.
A name is used to refer to an entity declared in a program. All names are simple names: a single identifier.
The scope of a declaration is the region of the program within which the entity declared by the declaration can be referred to using a name (provided it is visible). A declaration is said to be in scope at a particular point in a program if and only if the declaration's scope includes that point.
These rules imply that declarations of class types need not appear before uses of the types.
Some declarations may be shadowed in part of their scope by another declaration of the same name, in which case a name cannot be used to refer to the declared entity.
A declaration d of a method parameter or constructor parameter named n shadows the declarations of any fields named n that are in scope at the point where d occurs throughout the scope of d.
Similarly, a local variable in a method or constructor body shadows throughout its scope a parameter or a field with the same name. And, an inner declaration of a local variable shadows throughout its scope an outer declaration of a local variable of the same name that is in scope.
A declaration d is said to be visible at point p in a program if the scope of d includes p, and d is not shadowed by any other declaration at p.
Note that shadowing is distinct from hiding. Hiding applies only to members which would otherwise be inherited but are not because of a declaration in a subclass.
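A short Java sketch may make the shadowing rule for parameters concrete; Java applies the same rule in this case. The class Counter and its members are hypothetical and exist only for illustration.

```java
class ShadowingSketch {
    static class Counter {
        Integer count = 10; // field

        // The parameter named count shadows the field throughout the method body.
        Integer addTo(Integer count) {
            return count + this.count; // bare name is the parameter, this.count is the field
        }

        // Nothing shadows the field here, so the bare name refers to the field.
        Integer current() {
            return count;
        }
    }

    public static void main(String[] args) {
        Counter c = new Counter();
        System.out.println(c.addTo(5));  // prints 15
        System.out.println(c.current()); // prints 10
    }
}
```

Inside addTo the field is in scope but not visible under its simple name, so the qualified form this.count is used to reach it.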
Reference types have members.
This section provides an overview of the members of reference types as background for the discussion of how the meaning of a name is determined.
The members of a class type are fields and methods. Members are either declared in the type, or inherited because they are members of a superclass which are not overridden.
The members of a class type are all of the following:
Constructors are not members.
There is no restriction against a field and a method of a class type having the same name.
A class type may have two or more methods with the same name if the methods have different signatures, that is, if they have different numbers of parameters or different parameter types in at least one parameter position. Such a method member name is said to be overloaded.
A class type may contain a declaration for a method with the same name and the same signature as a method that would otherwise be inherited from a superclass. In this case, the method of the superclass is not inherited. The new declaration is said to override it.
The meaning of a name depends on the context in which it is used. The determination of the meaning of a name requires two steps. First, context causes a name syntactically to fall into one of three categories: TypeName, ExpressionName or MethodName. Second, the resulting category then dictates the final determination of the meaning of the name (or a compilation error if the name has no meaning).
TypeName:
    Identifier
ExpressionName:
    Identifier
MethodName:
    Identifier
A name is syntactically classified as a TypeName in these contexts:
In the extends clause in a class declaration
A name is syntactically classified as an ExpressionName in these contexts:
A name is syntactically classified as a MethodName in this context:
A type name consists of a single Identifier. The identifier must occur in the scope of a declaration of a type with this name, or a compile-time error occurs.
The meaning of a name classified as an ExpressionName is determined as follows.
If an expression name consists of a single Identifier, then:
If an expression name is of the form Q.Id, then Q has already been classified as an expression name. Let T be the type of Q:
A MethodName can appear only in a method invocation expression. The meaning of a name classified as a MethodName is determined as follows.
If a method name consists of a single Identifier, then Identifier is the method name to be used for method invocation. The Identifier must name at least one method of a class within whose declaration the Identifier appears.
If a method name is of the form Q.Id, then Q has already been classified as an expression name. Id is the method name to be used for method invocation. Let T be the type of the expression Q; Id must name at least one method of the type T.
CHAPTER 5
Class declarations define new reference types and describe how they are implemented (5.1).
Each class except Object
is an extension of (that is,
a subclass of) a single existing class (5.1.1).
The body of a class declares members (fields, methods and operators) and constructors (5.1.2). The scope (4.3) of a member (5.2) is the entire declaration of the class to which the member belongs. The members of a class include both declared and inherited members (5.2). Newly declared fields can hide fields declared in a superclass. Newly declared methods can override methods declared in a superclass.
Field declarations (5.3) describe instance variables, which are freshly incarnated for each instance of the class.
Method declarations (5.4) describe code that may be invoked by method invocation expressions (7.9). A method is invoked with respect to some particular object that is an instance of the class type.
Operator declarations (5.6) describe code that may be invoked by operator invocation expressions (7.16). An operator is invoked with respect to some particular object that is an instance of the class type.
Method names may be overloaded (5.4.5).
Constructors (5.5) are similar to methods, but cannot be invoked directly by a method call; they are used to initialize new class instances. Like methods, they may be overloaded (5.5.4).
A class declaration specifies a new reference type:
ClassDeclaration:
    class Identifier Superopt ClassBody
The Identifier in a class declaration specifies the name of the class. A compile-time error occurs if a class has the same name as any other class in the program.
The optional extends clause in a class declaration specifies the direct superclass of the current class. A class is said to be a direct subclass of the class it extends. The direct superclass is the class from whose implementation the implementation of the current class is derived. If the class declaration for any class has no extends clause, then the class has the class Object as its implicit direct superclass.
Super:
    extends ClassType
The following is repeated from 2.3 to make the presentation here clearer:
ClassType:
    TypeName
The ClassType must name a class type, or a compile-time error occurs.
The subclass relationship is the transitive closure of the direct subclass relationship. A class A is a subclass of class C if either of the following is true:
Class C is said to be a superclass of class A whenever A is a subclass of C.
A class C directly depends on a type T if T is mentioned in the extends clause of C. A class C depends on a reference type T if any of the following conditions hold:
It is a compile-time error if a class depends on itself.
For example:
class Point extends ColoredPoint { Integer x, y; }
class ColoredPoint extends Point { Integer color; }
causes a compile-time error.
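The check for a class that depends on itself can be sketched as a walk along the chain of extends clauses. The Java code below is only an illustration under stated assumptions: it considers nothing but extends clauses, and the method dependsOnItself and the map superOf are hypothetical names, not part of any maTe compiler.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

class DependencyCheckSketch {
    // superOf maps a class name to the name in its extends clause.
    // A class without an entry is treated as extending Object directly.
    static boolean dependsOnItself(String name, Map<String, String> superOf) {
        Set<String> seen = new HashSet<>();
        String current = superOf.get(name);
        // Follow the superclass chain until it ends or starts repeating.
        while (current != null && seen.add(current)) {
            if (current.equals(name)) {
                return true; // the chain leads back to the starting class
            }
            current = superOf.get(current);
        }
        return false;
    }

    public static void main(String[] args) {
        Map<String, String> superOf = new HashMap<>();
        superOf.put("Point", "ColoredPoint");
        superOf.put("ColoredPoint", "Point");
        System.out.println(dependsOnItself("Point", superOf)); // true
    }
}
```

The seen set stops the walk when a class merely leads into a cycle it is not part of, so the method always terminates.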
A class body may contain declarations of members of the class, that is, fields (5.3) methods (5.4) and operators (5.6). A class body may also contain declarations of constructors (5.5) for the class.
ClassBody:
    { ClassBodyDeclarationsopt }
ClassBodyDeclarations:
    ClassBodyDeclaration
    ClassBodyDeclarations ClassBodyDeclaration
ClassBodyDeclaration:
    ClassMemberDeclaration
    ConstructorDeclaration
ClassMemberDeclaration:
    FieldDeclaration
    MethodDeclaration
    OperatorDeclaration
The scope of a declaration of a member m declared in or inherited by a class type C is the entire body of C.
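The members of a class type are all of the following:
Members inherited from its direct superclass (5.1.1), except in class Object, which has no direct superclass
Members declared in the body of the class (5.1.2)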
Constructors are not members and therefore are not inherited.
The variables of a class type are introduced by field declarations:
FieldDeclaration:
    Type VariableDeclarators ;
VariableDeclarators:
    VariableDeclarator
    VariableDeclarators , VariableDeclarator
VariableDeclarator:
    Identifier
The Identifier in a VariableDeclarator may be used in a name to refer to the field. Fields are members; the scope (4.3) of a field declaration is specified in 5.1.2. More than one field may be declared in a single field declaration by using more than one declarator; the Type applies to all the declarators in the declaration.
It is a compile-time error for the body of a class declaration to declare two fields with the same name.
Methods, types, and fields may have the same name, since they are used in different contexts and are disambiguated by different lookup procedures (4.5).
If the class declares a field with a certain name, then the declaration of that field is said to hide any declarations of fields with the same name in superclasses of the class.
If a field declaration hides the declaration of another field, the two fields need not have the same type.
A class inherits from its direct superclass all the fields of the superclass that are not hidden by a declaration in the class.
It is not possible for a class to inherit more than one field with the same name.
A hidden field can be accessed by using a field access expression (7.8) that contains the keyword super.
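Field hiding and access through super can be sketched in Java, which follows the same rules for these two points. The classes Base and Derived and the method describe are hypothetical names chosen for this example.

```java
class FieldHidingSketch {
    static class Base {
        Integer id = 1;
    }

    static class Derived extends Base {
        String id = "derived"; // hides Base.id; the two types need not match

        String describe() {
            // The bare name is the field declared here; super.id is the hidden field.
            return id + "/" + super.id;
        }
    }

    public static void main(String[] args) {
        System.out.println(new Derived().describe()); // prints derived/1
    }
}
```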
A method declares executable code that can be invoked, passing a fixed number of values as arguments.
MethodDeclaration:
    MethodHeader MethodBody
MethodHeader:
    ResultType MethodDeclarator
ResultType:
    Type
MethodDeclarator:
    Identifier ( FormalParameterListopt )
A method declaration specifies the type of value that the method returns.
The Identifier in a MethodDeclarator may be used in a name to refer to the method. A class can declare a method with the same name as the class or a field of the class.
It is a compile-time error for the body of a class to have as members two methods with the same signature (5.4.2) (name, number of parameters, and types of any parameters). Methods and fields may have the same name, since they are used in different contexts and are disambiguated by the different lookup procedures (4.5).
The formal parameters of a method or constructor, if any, are specified by a list of comma-separated parameter specifiers. Each parameter specifier consists of a type and an identifier that specifies the name of the parameter:
FormalParameterList:
    FormalParameter
    FormalParameterList , FormalParameter
FormalParameter:
    Type Identifier
If a method, operator or constructor has no parameters, only an empty pair of parentheses appears in its declaration.
If two formal parameters of the same method, operator or constructor are declared to have the same name (that is, their declarations mention the same Identifier), then a compile-time error occurs.
When the method, operator or constructor is invoked (7.9), the values of the actual argument expressions initialize newly created parameter variables, each of the declared Type, before execution of the body of the method, operator or constructor. The Identifier that appears in the FormalParameter may be used as a simple name in the body of the method, operator or constructor to refer to the formal parameter.
The scope of a parameter of a method, operator or constructor is the entire body of the method, operator or constructor.
The signature of a method consists of the name of the method and the number and types of formal parameters to the method.
A class may not declare two methods with the same signature, or a compile-time error occurs.
A method body is a block of code that implements the method.
MethodBody:
    Block
If an implementation requires no executable code, the method body should be written as a block that contains no statements: "{ }".
Because a method always has a return type, every return statement (6.9) in its body must have an Expression.
Moreover, a method may explicitly return only by using a return statement that provides a return value. Otherwise, the method may "drop off" the end of its body by executing an implicit return at the very end of its method body; the value of the expression for this implicit return is the default value for the return type of the method (i.e. null for reference types).
A class inherits from its direct superclass all the methods of the superclass that are not overridden by a declaration in the class.
A method declared in a class C overrides another method with the same signature declared in class A if C is a subclass of A.
A compile-time error occurs if a method has a different return type than the method it overrides.
An overridden method can be accessed by using a method invocation expression (7.9) that contains the keyword super.
If two methods of a class (whether both declared in the same class, or both inherited by a class, or one declared and one inherited) have the same name but different signatures, then the method name is said to be overloaded. This fact causes no difficulty and never of itself results in a compile-time error.
There is no required relationship between the return types of two methods with the same name but different signatures.
Methods are overridden on a signature-by-signature basis.
If, for example, a class declares two methods with the same name, and a subclass overrides one of them, the subclass still inherits the other method.
When a method is invoked (7.9), the number of actual arguments and the compile-time types of the arguments are used, at compile time, to determine the signature of the method that will be invoked (7.9.2). The actual method to be invoked will be determined at run time, using dynamic method lookup (7.9.3).
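The interplay of overloading, overriding and the two-stage selection of a method can be sketched in Java, which resolves invocations in the same two stages. The classes Animal and Dog and the method speak are hypothetical names used only for illustration.

```java
class DispatchSketch {
    static class Animal {
        String speak() { return "..."; }
        String speak(Integer times) { return "animal x" + times; } // overload
    }

    static class Dog extends Animal {
        // Overrides speak() only; speak(Integer) is still inherited.
        String speak() { return "woof"; }
    }

    public static void main(String[] args) {
        Animal a = new Dog();
        // Compile time: the signature speak() is chosen from the type Animal.
        // Run time: the implementation in Dog is found by dynamic lookup.
        System.out.println(a.speak());  // prints woof
        // The overload that was not overridden is inherited unchanged.
        System.out.println(a.speak(2)); // prints animal x2
    }
}
```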
A constructor is used in the creation of an object that is an instance of a class:
ConstructorDeclaration:
    ConstructorDeclarator ConstructorBody
ConstructorDeclarator:
    TypeName ( FormalParameterListopt )
The TypeName in the ConstructorDeclarator must be the name of the class that contains the constructor declaration; otherwise a compile-time error occurs. In all other respects, the constructor declaration looks just like a method declaration that has no result type.
Constructors are invoked by class instance creation expressions, and by explicit constructor invocations from other constructors (5.5.3.1). Constructors are never invoked by method invocation expressions.
Constructors are not members. They are never inherited and therefore are not subject to hiding or overriding.
The formal parameters of a constructor are identical in structure and behavior to the formal parameters of a method.
The signature of a constructor consists of the number and types of formal parameters to the constructor. A class may not declare two constructors with the same signature, or a compile-time error occurs.
The first statement of a constructor body may be an explicit invocation of another constructor of the same class or of the direct superclass (5.5.3.1).
ConstructorBody:
    { ExplicitConstructorInvocationopt BlockStatementsopt }
It is a compile-time error for a constructor to directly or indirectly invoke itself through a series of one or more explicit constructor invocations involving this.
If a constructor body does not begin with an explicit constructor invocation, then the constructor body is implicitly assumed by the compiler to begin with a superclass constructor invocation "super();", an invocation of the constructor of its direct superclass that takes no arguments. A compile-time error occurs if an implicit superclass constructor invocation is assumed by the compiler but the superclass does not have a constructor that takes no arguments.
ExplicitConstructorInvocation:
    this ( ArgumentListopt ) ;
    super ( ArgumentListopt ) ;
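Explicit constructor invocation statements can be divided into two kinds:
Alternate constructor invocations begin with the keyword this. They are used to invoke an alternate constructor of the same class.
Superclass constructor invocations begin with the keyword super. They are used to invoke a constructor of the direct superclass.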
An explicit constructor invocation statement in a constructor body may not refer to any variables or methods declared or inherited in the object being constructed, or use this or super in any expression; otherwise, a compile-time error occurs.
The evaluation of an explicit constructor invocation proceeds in several steps:
"super();", an invocation of the constructor of its direct superclass that takes no arguments. Since class Object is a fundamental superclass of all classes, invocation of the superclass constructor will precede the execution of the local class constructor during the constructor invocation step.
Overloading of constructors is identical in behavior to overloading of methods. The overloading is resolved at compile time by each class instance creation expression.
If a class contains no constructor declarations, then a default constructor that takes no parameters is automatically provided.
The default constructor simply invokes the superclass constructor with no arguments. A compile-time error occurs if a default constructor is provided by the compiler but the superclass does not have a constructor that takes no arguments.
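The order in which chained constructors run can be sketched in Java, which uses the same this(...) and super(...) forms. The classes Base and Point, the log field and the method trace are hypothetical and serve only to record the order of execution.

```java
class ConstructorSketch {
    static StringBuilder log = new StringBuilder();

    static class Base {
        Base() { log.append("Base();"); }
    }

    static class Point extends Base {
        Integer x;
        Integer y;

        Point() {
            this(0, 0); // alternate constructor invocation
            log.append("Point();");
        }

        Point(Integer x, Integer y) {
            // No explicit invocation here, so super() is assumed implicitly.
            this.x = x;
            this.y = y;
            log.append("Point(x,y);");
        }
    }

    // Creates one Point and reports the order in which constructor bodies ran.
    static String trace() {
        log.setLength(0);
        new Point();
        return log.toString();
    }

    public static void main(String[] args) {
        System.out.println(trace()); // prints Base();Point(x,y);Point();
    }
}
```

The superclass constructor body completes first, even though it is reached last in the chain of invocations.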
An operator declares executable code that can be invoked, passing a fixed number of values as arguments.
OperatorDeclaration:
    OperatorHeader OperatorBody
OperatorHeader:
    ResultType OperatorDeclarator
ResultType:
    Type
OperatorDeclarator:
    operator UnaryOperator ( )
    operator BinaryOperator ( FormalParameter )
    operator MinusOperator ( )
    operator MinusOperator ( FormalParameter )
An operator declaration specifies the type of value that the operator returns.
The token immediately after operator in an OperatorDeclarator (UnaryOperator, BinaryOperator or MinusOperator) may be used in a properly formed operator invocation expression (7.16) as a name for the operator.
It is a compile-time error for the body of a class to have as members two operators with the same signature (5.6.2) (operator name, number of parameters, and types of any parameters).
The formal parameter of an operator, if any, is specified by a parameter specifier. The optional parameter specifier consists of a type and an identifier that specifies the name of the parameter:
FormalParameter:
    Type Identifier
If an operator has no parameter, only an empty pair of parentheses appears in the declaration of the operator.
When the operator is invoked (7.16), the values of the actual argument expressions initialize newly created parameter variables, each of the declared Type, before execution of the body of the operator. The Identifier may be used as a simple name in the body of the operator to refer to the formal parameter.
The scope of a parameter of an operator is the entire body of the operator.
The signature of an operator consists of the operator (1.9) and the number and types of formal parameters to the operator.
A class may not declare two operators with the same signature, or a compile-time error occurs.
An operator body is a block of code that implements the operator.
OperatorBody:
    Block
If an implementation requires no executable code, the operator body should be written as a block that contains no statements: "{ }".
Because an operator always has a return type, every return statement (6.9) in its body must have an Expression.
Moreover, an operator may explicitly return only by using a return statement that provides a return value. Otherwise, the operator may "drop off" the end of its body by executing an implicit return at the very end of its operator body; the value of the expression for this implicit return is the default value for the return type of the operator.
A class inherits from its direct superclass all the operators of the superclass that are not overridden by a declaration in the class.
An operator declared in a class C overrides another operator with the same signature declared in class A if C is a subclass of A.
A compile-time error occurs if an operator has a different return type than the operator it overrides.
If two operators of a class (whether both declared in the same class, or both inherited by a class, or one declared and one inherited) have the same operator token but different signatures, then the operator is said to be overloaded. This fact causes no difficulty and never of itself results in a compile-time error.
There is no required relationship between the return types of two operators with the same name but different signatures.
Operators are overridden on a signature-by-signature basis.
If, for example, a class declares two operators with the same name, and a subclass overrides one of them, the subclass still inherits the other operator.
When an operator is invoked (7.16), the number of actual arguments and the compile-time types of the arguments are used, at compile time, to determine the signature of the operator that will be invoked. The actual operator to be invoked will be determined at run time, using dynamic method lookup.
CHAPTER 6
The sequence of execution of a maTe program is controlled by a sequence of statements, which are executed for their effect and do not have values.
Some statements contain other statements as part of their structure; such other statements are substatements of the statement. In the same manner, some statements contain expressions (7) as part of their structure.
Sequences of statements are organized into blocks. There are two primary blocks defined for a maTe program: the main block, which is explained in section (6.1), and statement blocks which are explained in section (6.3).
Statements that will be familiar to C and C++ programmers are the block (6.3), empty (6.4), expression (6.5), if (6.6, 6.7), while (6.8), return (6.9), break (6.11), continue (6.12), and local variable declaration (6.13) statements.
A program shall contain a global construct called main (the main block), which is the designated start of the execution of a program. Exactly one main block must exist for every maTe program. The block is entered when program execution starts. Program execution continues until a return statement in the main block is executed or the end of the main block is reached (or a run-time error is encountered). If the end of the main block is reached, then the result is as if the program executed a return of 0. The declaration of a main block includes the definition of a return type, which must be Integer. The return value exists to provide a way to pass a single status value to the surrounding environment.
All implementations of the main block will have the following definition:
MainFunctionDeclaration:
    Integer main() { MainBlockStatementsopt }
MainBlockStatements:
    MainBlockStatements MainBlockStatement
    MainBlockStatement
MainBlockStatement:
    BlockStatement
BlockStatement:
    Statement
maTe does not allow for arguments to be passed into the main block.
The main block can exist at the beginning of the source file, at the end of the source file or in between any two class definitions within the source file.
Local variable declaration statements (6.13) can be provided at any level of the main block. The scope of a local variable declared in the main block is from its declaration point to the end of the block in which it is declared. It is a compile-time error to declare a local variable with the same name as a previously declared local variable which is still visible from the new local variable's declaration point.
There are many kinds of statements in the maTe programming language. Most correspond to statements in the C and C++ languages. Statements are given by the following grammar:
Statement:
    Block (6.3)
    EmptyStatement (6.4)
    ExpressionStatement (6.5)
    IfThenElseStatement (6.6)
    IfThenStatement (6.7)
    WhileStatement (6.8)
    ReturnStatement (6.9)
    OutputStatement (6.10)
    BreakStatement (6.11)
    ContinueStatement (6.12)
    LocalVariableDeclarationStatement (6.13)
A block is a sequence of statements within braces.
Block:
    { BlockStatementsopt }
BlockStatements:
    BlockStatement
    BlockStatements BlockStatement
The following production from (6.1) is repeated here for convenience:
BlockStatement:
    Statement
A block is executed by executing each of the statements in order from first to last (left to right). It is possible for a block to terminate early through a return statement.
An empty statement does nothing.
EmptyStatement:
    ;
Certain kinds of expressions may be used as statements by following them with semicolons:
ExpressionStatement:
    StatementExpression ;
StatementExpression:
    Assignment
    MethodInvocation
An expression statement is executed by evaluating the expression; if the expression has a value, the value is discarded.
The if-then-else Statement
The if-then-else statement allows a conditional choice of two statements, executing one or the other but not both.
IfThenElseStatement:
    if ( Expression ) Statement else Statement
The Expression must have type Integer, or a compile-time error occurs.
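An if-then-else statement is executed by first evaluating the Expression. Execution continues by making a choice based on the resulting value:
If the value is not 0, then the first contained Statement (the one before the else keyword) is executed.
If the value is 0, then the second contained Statement (the one after the else keyword) is executed.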
An else is associated with the lexically immediately preceding else-less if that is in the same block (but not in an enclosed block).
The if-then Statement
The if-then statement allows a conditional choice of one statement, executing the statement or not executing it.
IfThenStatement:
    if ( Expression ) Statement
The Expression must have type Integer, or a compile-time error occurs.
An if-then statement is executed by first evaluating the Expression. Execution continues by making a choice based on the resulting value:
The while Statement
The while statement executes an Expression and a Statement repeatedly until the value of the Expression is 0.
WhileStatement:
    while ( Expression ) Statement
The Expression must have type Integer, or a compile-time error occurs. A while statement is executed by first evaluating the Expression. Execution continues by making a choice based on the resulting value:
return statement.)
If the value of the Expression is 0 the first time it is evaluated, then the Statement is not executed.
The return Statement
A return statement returns control to the invoker of a method (5.4, 7.9), returns control to the invoker of a constructor (5.5, 7.7), returns control to the invoker of an operator (5.6, 7.16), or terminates the main block (6.1), and is given by the following grammar:
ReturnStatement:
    return Expression ;
    return ;
A return without an expression may be used only within a constructor. A return with an expression in a constructor is a compile-time error.
If a return statement is contained within a method or operator, the value of the Expression becomes the value of the method or operator invocation. More precisely, execution of such a return statement first evaluates the Expression. The value produced by the Expression is communicated to the invoker. A return statement with no Expression is not allowed in this context and will result in a compile-time error.
If a return statement is contained within the main block, the value of the Expression becomes the value of the program. More precisely, execution of such a return statement first evaluates the Expression. The value produced by the Expression is communicated to the surrounding execution environment. A return statement with no Expression is not allowed in this context and will result in a compile-time error.
It is possible to return from the middle of a while or if-then-else block.
A compile-time error occurs if the type of the return expression is not convertible by assignment conversion to the return type of the enclosing method or operator, or to Integer if the return statement is in the main block.
The out Statement
The out statement is a rudimentary mechanism for printing strings and it provides the only way to generate output in the maTe programming language. Its syntax is described by the following grammar:
OutputStatement:
    out Expression ;
If the Expression has type String then the out statement will print to stdout all characters in the String object. If the Expression is not of type String then the toString method will first be invoked on Expression, and the resulting String object will be output in the manner described above.
If the Expression evaluates to null, then a run-time error occurs and the program terminates (7.18).
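The three cases of the out statement can be sketched as a small Java helper. This is only a model of the described behavior, not the maTe implementation: the method render is a hypothetical name, and the use of IllegalStateException stands in for the run-time error that terminates a maTe program.

```java
class OutSketch {
    // Hypothetical helper: computes the text that "out Expression;" would print.
    static String render(Object value) {
        if (value == null) {
            // Stand-in for the maTe run-time error on a null operand.
            throw new IllegalStateException("out: expression evaluated to null");
        }
        if (value instanceof String) {
            return (String) value; // a String is printed as is
        }
        return value.toString();   // any other object is converted first
    }

    public static void main(String[] args) {
        System.out.print(render("total: "));
        System.out.print(render(Integer.valueOf(42)));
    }
}
```

The sketch uses print rather than println because the text above says only that the characters of the String are written to stdout.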
The break Statement
The break statement transfers control out of an enclosing while statement. Its syntax is described by the following grammar:
BreakStatement:
    break ;
A break statement transfers control to the innermost enclosing while statement of the enclosing method or main block; this statement, which is called the break target, then immediately exits. If no while statement encloses the break statement, a compile-time error occurs.
The continue Statement
The continue statement transfers control to the loop-continuation point of an enclosing while statement. Its syntax is described by the following grammar:
ContinueStatement:
    continue ;
A continue statement transfers control to the innermost enclosing while statement of the enclosing method or main block; this statement, which is called the continue target, then immediately ends the current iteration and begins a new one. If no while statement encloses the continue statement, a compile-time error occurs.
Local variable declaration statements may be provided at any level of a main block, class constructor or method body. The scope of a local variable is from its declaration point to the end of the enclosing block in which it was declared. Two variables have the same scope if and only if their scopes terminate at the same point. It is a compile-time error to declare two local variables with the same name in the same scope. If an outer declaration of a variable with the same name exists, it is hidden until the end of the scope of the inner variable, after which the outer variable becomes visible again. It is a compile-time error to declare a local variable in a constructor, method or operator body with the same name as a parameter declared in the enclosing constructor, method or operator's signature. Local variable declaration statements are given by the following grammar:
LocalVariableDeclarationStatement:
Type VariableDeclarators ;
VariableDeclarators:
VariableDeclarator
VariableDeclarators , VariableDeclarator
VariableDeclarator:
Identifier
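The scoping rules for local variables can be sketched as a small checker written in Python. It is a hedged illustration of the three rules stated above (no duplicate name in the same scope, hiding of outer declarations, no clash with parameter names); ScopeChecker and CompileError are hypothetical names and do not describe how any maTe compiler is actually built.

```python
class CompileError(Exception):
    """Hypothetical stand-in for a maTe compile-time error."""


class ScopeChecker:
    """Tracks local variable declarations for one method-like body."""

    def __init__(self, parameters=()):
        self.parameters = set(parameters)
        self.blocks = [{}]  # innermost enclosing block is last

    def enter_block(self):
        self.blocks.append({})

    def exit_block(self):
        # Leaving a block ends the scope of everything declared in it,
        # so any hidden outer declaration becomes visible again.
        self.blocks.pop()

    def declare(self, name, type_name):
        if name in self.parameters:
            raise CompileError(f"local '{name}' has the same name as a parameter")
        if name in self.blocks[-1]:
            raise CompileError(f"'{name}' is already declared in this scope")
        self.blocks[-1][name] = type_name

    def lookup(self, name):
        # The innermost declaration hides any outer one.
        for block in reversed(self.blocks):
            if name in block:
                return block[name]
        raise CompileError(f"'{name}' is not declared")
```

For example, declaring x as Integer, entering a block and declaring x as String makes lookup("x") report String until the block is exited, after which it reports Integer again.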
CHAPTER 7
Much of the work in a program is done by evaluating expressions, either for their side effects, such as assignments to variables, or for their values, which can be used as arguments or operands in larger expressions, or to affect the execution sequence in statements, or both.
This chapter specifies the meanings of expressions and the rules for their evaluation.
When an expression in a program is evaluated (executed), the result denotes one of two things: a variable or a value.
Evaluation of an expression can also produce side effects, because expressions may contain embedded assignments and method invocations.
Each expression occurs either in the main block (6.1) or in the declaration of some class type that is being declared. In a class declaration the expression might occur in a constructor declaration, or in the code for an operator or method.
If an expression denotes a variable, and a value is required for use in further evaluation, then the value of that variable is used. In this context, if the expression denotes a variable or a value, we may speak simply of the value of the expression.
If an expression denotes a variable or a value, then the expression has a type known at compile time. The rules for determining the type of an expression are explained separately below for each kind of expression.
The value of an expression is always assignment compatible (3.2) with the type of the expression, just as the value stored in a variable is always compatible with the type of the variable. In other words, the value of an expression whose type is T is always suitable for assignment to a variable of type T.
If the type of an expression is a reference type, then the class of the referenced object, or even whether the value is a reference to an object rather than null, is not necessarily known at compile time. There are a few places in the maTe programming language where the actual class of a referenced object affects program execution in a manner that cannot be deduced from the type of the expression. They are as follows:
The particular method used for an invocation o.m(...) is chosen based on the methods that are part of the class that is the type of o. The class of the object referenced by the run-time value of o participates because a subclass may override a specific method already declared in a parent class so that this overriding method is invoked. (The overriding method may or may not choose to further invoke the original overridden m method.)
The first of the cases just listed ought never to result in detecting a type error, as it is compile-time constrained to be valid. Thus, a run-time type error can occur only when the actual class of the object referenced by the value to be assigned (either implicitly or explicitly) is not compatible with the actual run-time reference variable. In these cases, the program terminates with a run-time error (7.18).
The maTe programming language guarantees that the operands of operators appear to be evaluated in a specific evaluation order, namely, from left to right.
It is recommended that code not rely crucially on this specification. Code is usually clearer when each expression contains at most one side effect, as its outermost operation.
The left-hand operand of a binary operator appears to be fully evaluated before any part of the right-hand operand is evaluated. For example, if the left-hand operand contains an assignment to a variable and the right-hand operand contains a reference to that same variable, then the value produced by the reference will reflect the fact that the assignment occurred first.
The maTe programming language also guarantees that every operand of an operator appears to be fully evaluated before any part of the operation itself is performed.
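The two ordering guarantees can be illustrated with a short trace. The sketch below uses Python, whose own left-to-right evaluation of call arguments mirrors the rule; the helper names operand, add and assign are hypothetical, and the expression modelled is (i = 5) + i.

```python
trace = []  # records the order in which things happen


def operand(label, value):
    """Log that an operand has been fully evaluated, then yield it."""
    trace.append(label)
    return value


def add(left, right):
    """The operation itself, performed only after both operands exist."""
    trace.append("operation")
    return left + right


variables = {"i": 1}


def assign(name, value):
    """Model an embedded assignment: store the value, then yield it."""
    variables[name] = value
    return value


# Models (i = 5) + i: the assignment in the left-hand operand happens
# first, so the reference to i in the right-hand operand sees 5.
result = add(operand("left", assign("i", 5)), operand("right", variables["i"]))

print(result)  # 10
print(trace)   # ['left', 'right', 'operation']
```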
maTe programming language implementations must respect the order of evaluation as indicated explicitly by parentheses and implicitly by operator precedence. An implementation may not take advantage of algebraic identities such as the associative law to rewrite expressions into a more convenient computational order unless it can be proven that the replacement expression is equivalent in value and in its observable side effects for all possible computational values that might be involved.
Note that Integer addition and multiplication are provably associative in the maTe programming language.
For example, a+b+c, where a, b, and c are local variables, will always produce the same answer whether evaluated as (a+b)+c or a+(b+c); if the expression b+c occurs nearby in the code, a smart compiler may be able to use this common subexpression.
In a method or constructor invocation or class instance creation expression, argument expressions may appear within the parentheses, separated by commas. Each argument expression appears to be fully evaluated before any part of any argument expression to its right.
Primary expressions include most of the simplest kinds of expressions, from which all others are constructed: literals, field accesses, method invocations and names. A parenthesized expression is also treated syntactically as a primary expression.
Primary:
Identifier
ParenExpression
this
FieldAccess
MethodInvocation
OperatorInvocation
ClassInstanceCreationExpression
Literal
A literal denotes a fixed, unchanging value.
The following production from (1.7) is repeated here for convenience:
Literal:
IntegerLiteral
NullLiteral
StringLiteral
The type of a literal is determined as follows:
The type of an integer literal is Integer.
The type of a string literal is String.
The type of the null literal null is the null type; its value is the null reference.
this
The keyword this may be used only in the body of a method, operator or constructor.
When used as a primary expression, the keyword this denotes a value that is a reference to the object for which the method was invoked, or to the object being constructed. The type of this is the class C within which the keyword this occurs. At run time, the class of the actual object referred to may be the class C or any subclass of C.
A parenthesized expression is a primary expression whose type is the type of the contained expression and whose value at run time is the value of the contained expression. If the contained expression denotes a variable then the parenthesized expression also denotes that variable.
ParenExpression:
( Expression )
A class instance creation expression is used to create new objects that are instances of classes.
ClassInstanceCreationExpression:
new ClassType Arguments
Arguments:
( ArgumentList )
( )
ArgumentList:
ArgumentList , Expression
Expression
We say that a class is instantiated when an instance of the class is created by a class instance creation expression. Class instantiation involves determining what class is to be instantiated, what constructor should be invoked to create the new instance and what arguments should be passed to that constructor.
The class being instantiated is the class denoted by ClassType.
The type of the class instance creation expression is the class type being instantiated.
Let C be the class type being instantiated. To create an instance of C, i, a constructor of C is chosen at compile-time by the following rules:
At run time, a class instance creation expression requires memory space to be allocated for the new class instance. If there is insufficient space to allocate the object, the program terminates with a run-time error.
The new object contains new instances of all the fields declared in the specified class type and all its superclasses. As each new field instance is created, it is initialized to its default value.
Next, the actual arguments to the constructor are evaluated, left-to-right.
Next, the selected constructor of the specified class type is invoked. This results in invoking at least one constructor for each superclass of the class type.
The value of a class instance creation expression is a reference to the newly created object of the specified class. Every time the expression is evaluated, a fresh object is created.
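The run-time steps for class instance creation can be sketched as follows. This Python model is a simplification under stated assumptions: ClassInfo, instantiate, the Object and Point classes are hypothetical, the default field value is assumed to be the null reference (None), and neither the out-of-memory check nor the chaining to superclass constructors is modelled.

```python
class ClassInfo:
    """Hypothetical compile-time description of a class."""

    def __init__(self, name, fields, superclass=None, constructor=None):
        self.name = name
        self.fields = fields            # names of fields declared here
        self.superclass = superclass    # ClassInfo or None
        self.constructor = constructor  # callable(obj, *args) or None


def instantiate(cls, argument_thunks):
    # Allocate the object: one slot per field declared in the class and
    # in each superclass, each set to its default value (assumed None).
    obj = {"class": cls.name}
    current = cls
    while current is not None:
        for field in current.fields:
            obj[(current.name, field)] = None
        current = current.superclass
    # Evaluate the actual arguments, left to right.
    arguments = [thunk() for thunk in argument_thunks]
    # Invoke the selected constructor.
    if cls.constructor is not None:
        cls.constructor(obj, *arguments)
    return obj


def point_constructor(obj, x, y):
    obj[("Point", "x")] = x
    obj[("Point", "y")] = y


OBJECT = ClassInfo("Object", [])
POINT = ClassInfo("Point", ["x", "y"], OBJECT, point_constructor)

first = instantiate(POINT, [lambda: 1, lambda: 2])
second = instantiate(POINT, [lambda: 1, lambda: 2])
print(first is second)  # False
```

Keying each slot by the declaring class as well as the field name keeps a field in a subclass distinct from a hidden field of the same name in a superclass.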
A field access expression may access a field of an object, a reference to which is the value of either an expression or the special keyword super.
FieldAccess:
Primary . Identifier
super . Identifier
The type of the Primary must be a reference type T, or a compile-time error occurs. The meaning of the field access expression is determined as follows:
If the value of the Primary is null, then the program terminates with a run-time error.
Note, specifically, that only the type of the Primary expression, not the class of the actual object referred to at run time, is used in determining which field to use.
The special forms using the keyword super are valid only in an instance method, operator or constructor of a class; these are exactly the same situations in which the keyword this may be used.
Suppose that a field access expression super.name appears within class C, and the immediate superclass of C is class S. Then super.name refers to the field named name of the current object, but with the current object viewed as an instance of the superclass. Thus it can access the field named name that is visible in class S, even if that field is hidden by a declaration of a field named name in class C.
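The rule that field selection depends on the compile-time type, not on the run-time class, can be sketched in Python. The model below is an assumption-laden illustration: ClassInfo, declaring_class and field_access are hypothetical names, and objects are modelled as dictionaries keyed by (declaring class, field name).

```python
class MateRuntimeError(Exception):
    """Hypothetical stand-in for a maTe run-time error (7.18)."""


class ClassInfo:
    """Hypothetical description of a class and its declared fields."""

    def __init__(self, name, fields, superclass=None):
        self.name = name
        self.fields = fields
        self.superclass = superclass


def declaring_class(static_type, field_name):
    """Find, at compile time, the class whose field the name refers to."""
    current = static_type
    while current is not None:
        if field_name in current.fields:
            return current
        current = current.superclass
    raise LookupError(f"no field named {field_name}")  # compile-time error


def field_access(obj, static_type, field_name):
    owner = declaring_class(static_type, field_name)
    if obj is None:
        raise MateRuntimeError("null reference in field access")
    return obj[(owner.name, field_name)]


S = ClassInfo("S", ["name"])
C = ClassInfo("C", ["name"], S)
current_object = {("S", "name"): "from S", ("C", "name"): "from C"}

print(field_access(current_object, C, "name"))             # from C
print(field_access(current_object, C.superclass, "name"))  # from S
```

The second call models super.name inside class C: the same object is viewed as an instance of S, so the hidden field becomes reachable.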
A method invocation expression is used to invoke a class or instance method.
MethodInvocation:
Identifier ( ArgumentListopt )
Primary . Identifier ( ArgumentListopt )
super . Identifier ( ArgumentListopt )
The first step in processing a method invocation at compile time is to figure out the name of the method to be invoked and which class to check for definitions of methods of that name. There are several cases to consider, depending on the form that precedes the left parenthesis, as follows:
If the form is super . Identifier, then the name of the method is the Identifier and the class to be searched is the immediate superclass of the class whose declaration contains the method invocation.
The second step searches the class determined in the previous step for method declarations. This step uses the name of the method and the types of the argument expressions to locate method declarations that are applicable, that is, declarations that can be correctly invoked on the given arguments. There may be more than one such method declaration, in which case the most specific one is chosen. The descriptor (signature plus return type) of the most specific method declaration is the one used at run time to do the method dispatch.
A method declaration is applicable to a method invocation if and only if both of the following are true:
The class determined by the process described in 7.9.1 is searched for all method declarations applicable to this method invocation; method definitions inherited from superclasses are included in this search.
If the class has no method declaration that is applicable, then a compile-time error occurs.
If more than one method declaration is applicable to a method invocation, it is necessary to choose one to provide the descriptor for the run-time method dispatch. In this case the most specific method is chosen.
The informal intuition is that one method declaration is more specific than another if any invocation handled by the first method could be passed on to the other one without a compile-time type error.
The precise definition is as follows:
A method is said to be maximally specific for a method invocation if it is applicable and there is no other applicable method that is more specific.
If there is exactly one maximally specific method, then it is in fact the most specific method; it is necessarily more specific than any other method that is applicable.
It is possible that no method is the most specific, because there are two or more maximally specific methods. In this case a compile-time error occurs.
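The selection of the most specific method can be sketched in Python. This is a hedged model only: it assumes that a declaration is applicable when the argument count matches and each argument type is the same as, or a subclass of, the corresponding parameter type, and that one declaration is more specific than another when each of its parameter types is the same as, or a subclass of, the other's. That reading follows the informal intuition above rather than a quoted definition. The class hierarchy and all function names are hypothetical.

```python
class CompileError(Exception):
    """Hypothetical stand-in for a maTe compile-time error."""


# Hypothetical class hierarchy: class name -> direct superclass.
SUPERCLASS = {"Object": None, "Integer": "Object", "String": "Object"}


def is_subclass_or_same(sub, sup):
    while sub is not None:
        if sub == sup:
            return True
        sub = SUPERCLASS[sub]
    return False


def applicable(parameters, argument_types):
    return len(parameters) == len(argument_types) and all(
        is_subclass_or_same(arg, par)
        for arg, par in zip(argument_types, parameters)
    )


def more_specific(first, second):
    # Assumed reading: anything accepted by `first` is accepted by `second`.
    return all(is_subclass_or_same(a, b) for a, b in zip(first, second))


def most_specific(signatures, argument_types):
    candidates = [s for s in signatures if applicable(s, argument_types)]
    if not candidates:
        raise CompileError("no applicable method")
    # A candidate is maximally specific if no other candidate beats it.
    maximal = [
        s for s in candidates
        if not any(o != s and more_specific(o, s) for o in candidates)
    ]
    if len(maximal) != 1:
        raise CompileError("ambiguous method invocation")
    return maximal[0]


print(most_specific([("Object",), ("Integer",)], ("Integer",)))  # ('Integer',)
```

With the signatures (Integer, Object) and (Object, Integer) and two Integer arguments, both candidates are maximally specific, so the sketch reports the ambiguity as a compile-time error.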
The type of the method invocation expression is the result type specified in the compile-time declaration of the most specific method.
At run time, method invocation requires four steps. First, a target reference may be computed. Second, the argument expressions are evaluated. Third, the actual code for the method to be executed is located. Fourth, a new activation frame is created and control is transferred to the method code.
There are several cases to consider, depending on which of the three productions for MethodInvocation (7.9) is involved:
this
.
The argument expressions are evaluated in order, from left to right.
If the target reference is null, a run-time error occurs and the program terminates. Otherwise, the target reference is said to refer to a target object and will be used as the value of the keyword this in the invoked method.
A dynamic method lookup is used. The dynamic lookup process starts from a class S, determined as follows:
If the method invocation has the form super . Identifier, then S is initially the superclass of the class that contains the method invocation.
The dynamic method lookup uses the following procedure to search class S, and then the superclasses of class S, as necessary, for method m.
We note that the dynamic lookup process, while described here explicitly, will often be implemented implicitly, for example as a side-effect of the construction and use of per-class method dispatch tables, or the construction of other per-class structures used for efficient dispatch.
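Both the explicit search and the table-based alternative can be sketched in Python. The procedure assumed here is the usual one: look for the descriptor in S, and if it is absent repeat the search in the direct superclass of S. ClassInfo, dynamic_lookup, build_dispatch_table and the Animal and Dog classes are hypothetical names used only for illustration.

```python
class ClassInfo:
    """Hypothetical run-time description of a class."""

    def __init__(self, name, superclass=None, methods=None):
        self.name = name
        self.superclass = superclass
        self.methods = methods or {}  # descriptor -> implementation


def dynamic_lookup(start, descriptor):
    """Search class `start`, then its superclasses, for the method."""
    current = start
    while current is not None:
        if descriptor in current.methods:
            return current.methods[descriptor]
        current = current.superclass
    raise LookupError(descriptor)  # ruled out by the compile-time checks


def build_dispatch_table(cls):
    """Precompute what dynamic_lookup would find for every descriptor."""
    table = {} if cls.superclass is None else build_dispatch_table(cls.superclass)
    table.update(cls.methods)  # overriding methods replace inherited ones
    return table


ANIMAL = ClassInfo("Animal", methods={"speak()": lambda: "...", "legs()": lambda: 4})
DOG = ClassInfo("Dog", ANIMAL, {"speak()": lambda: "Woof"})

print(dynamic_lookup(DOG, "speak()")())             # Woof
print(dynamic_lookup(DOG.superclass, "speak()")())  # ...
```

The second call models an invocation through super from within Dog: the search starts at the superclass, so the overridden method is found. A per-class table built once by build_dispatch_table gives the same answers with a single dictionary access per call.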
A method m in some class S has been identified as the one to be invoked.
Now a new activation frame is created, containing the target reference and the argument values (if any), as well as enough space for the stack for the method to be invoked and any other bookkeeping information that may be required by the implementation (stack pointer, program counter, reference to previous activation frame, and the like). If there is not sufficient memory available to create such an activation frame, a run-time error occurs and the program terminates.
The newly created activation frame becomes the current activation frame. The effect of this is to assign the argument values to corresponding freshly created parameter variables of the method, and to make the target reference available as this. Before each argument value is assigned to its corresponding parameter variable, it is subjected to method invocation conversion (3.3).
The unary operators include -, ! and cast operators. Expressions with unary operators group right-to-left, so that -!x means the same as -(!x).
UnaryExpression:
- UnaryExpression
! UnaryExpression
CastExpression
CastExpression:
ParenExpression CastExpression
( ReferenceType ) CastExpression
Primary
Conceptually, the grammar for cast expressions is:
CastExpression:
( ReferenceType ) CastExpression
Primary
However, for technical reasons (to make the grammar LALR(1)), the grammar was rewritten to parse a simple class name as an Expression. This eliminates an ambiguity that exists with one-token lookahead, where a parenthesized name cannot be distinguished from a cast. (See Section 19.1.5 of the first edition of the Java Language Specification for a discussion of this same problem in Java.)
The type of a cast expression is the type whose name appears within the parentheses. (The parentheses and the type they contain are sometimes called the cast operator.) The result of a cast expression is not a variable, but a value, even if the result of the operand expression is a variable.
At compile time, the type of the operand expression must be convertible by casting conversion (3.4) to the type of the cast operator.
A run-time error (7.18) occurs if the type of the cast operator is a reference type and the run-time type of the cast operand is not assignable to that type. That is, for reference types, the run-time type of the right-hand operand must be the same type as the left-hand type, or it must be a subclass of that type.
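The run-time check for a cast can be sketched in Python as a walk up the superclass chain. The class hierarchy, the dictionary representation of objects and the names checked_cast and is_same_or_subclass are hypothetical; pairing this check with the "ERROR: Invalid cast." message from 7.18 is an assumption, and a null operand is not modelled.

```python
class MateRuntimeError(Exception):
    """Hypothetical stand-in for a maTe run-time error (7.18)."""


# Hypothetical class hierarchy: class name -> direct superclass.
SUPERCLASS = {"Object": None, "Shape": "Object", "Circle": "Shape", "Square": "Shape"}


def is_same_or_subclass(run_time_class, target):
    current = run_time_class
    while current is not None:
        if current == target:
            return True
        current = SUPERCLASS[current]
    return False


def checked_cast(obj, target):
    """Return the same reference if the cast is valid, else fail."""
    if not is_same_or_subclass(obj["class"], target):
        raise MateRuntimeError("ERROR: Invalid cast.")
    return obj


circle = {"class": "Circle"}
print(checked_cast(circle, "Shape") is circle)  # True
```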
The grammar for an instanceof expression is:
InstanceOfExpression:
InstanceOfExpression instanceof ReferenceType
RelationalExpression
The type of the instanceof expression is Integer. The value of the Integer reference is either 0 or 1. It is 1 if and only if the type of InstanceOfExpression is convertible by casting conversion (3.4) to the type ReferenceType. If the value of an instanceof expression is 1, a cast to the same type is guaranteed to succeed.
The operators * and / are called the multiplicative operators. They have the same precedence and are syntactically left-associative (they group left-to-right).
MultiplicativeExpression:
UnaryExpression
MultiplicativeExpression * UnaryExpression
MultiplicativeExpression / UnaryExpression
The operators + and - are called the additive operators. They have the same precedence and are syntactically left-associative (they group left-to-right).
AdditiveExpression:
MultiplicativeExpression
AdditiveExpression + MultiplicativeExpression
AdditiveExpression - MultiplicativeExpression
The relational operators are syntactically left-associative (they group left-to-right).
RelationalExpression:
AdditiveExpression
RelationalExpression < AdditiveExpression
RelationalExpression > AdditiveExpression
The equality operator is syntactically left-associative (it groups left-to-right).
EqualityExpression:
InstanceOfExpression
EqualityExpression == InstanceOfExpression
The equality operator may be used to compare two operands for object equality.
At run time, the result of == is an Integer object with value of either 0 or 1. The value will be 1 if the two operands denote the same object, and 0 otherwise.
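Because == compares object identity rather than contents, two distinct objects with equal contents compare as 0. A minimal Python sketch, using the identity test `is` as the model and the hypothetical names Box and mate_equals:

```python
class Box:
    """Hypothetical class used only to create distinct objects."""

    def __init__(self, value):
        self.value = value


def mate_equals(left, right):
    """Model of ==: 1 if both operands denote the same object, else 0."""
    return 1 if left is right else 0


a = Box(7)
b = Box(7)   # same contents, different object
c = a        # second reference to the same object

print(mate_equals(a, c))  # 1
print(mate_equals(a, b))  # 0
```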
The assignment operator is syntactically right-associative (groups right-to-left). Thus, a=b=c means a=(b=c), which assigns the value of c to b and then assigns the value of b to a.
Assignment:
LeftHandSide = AssignmentExpression
LeftHandSide:
Identifier
FieldAccess
AssignmentExpression:
EqualityExpression
Assignment
The result of the first operand of an assignment operator must be a variable, or a compile-time error occurs. This operand may be a named variable, such as a field of the current object or class, or it may be a computed variable, as can result from a field access. The type of the assignment expression is the type of the variable.
At run time, the result of the assignment expression is the value of the variable after the assignment has occurred. The result of an assignment expression is not itself a variable.
A compile-time error occurs if the type of the right-hand operand cannot be converted to the type of the variable by assignment conversion.
At run time, three steps are required:
The in operator is a rudimentary mechanism for accepting strings as user input and it provides the only way to accept input in the maTe programming language.
InputOperator:
in
The in operator reads a string of input from stdin. The in operator skips any leading whitespace characters (CR, LR, LF, tab) in the input, and stops when it encounters the first non-leading whitespace character (or EOF). If there are no non-whitespace characters prior to EOF, the in operator returns a null reference. Otherwise, a new String object is created, whose characters are those read in by the operator (except for leading whitespace characters). The value of the in expression is a null reference if no non-whitespace characters were found prior to EOF, or a reference to the String object that was created. The type of the in expression is String.
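The reading behaviour of the in operator can be sketched in Python against any text stream. This is an illustrative model, not the maTe runtime: the name mate_in is hypothetical, the set of whitespace characters is assumed to be the one defined in 1.3 (space, horizontal tab, form feed, CR and LF), and the sketch consumes the whitespace character that ends the token, which the text above does not specify.

```python
import io

# Assumed whitespace set, following the definition of white space in 1.3.
WHITESPACE = " \t\f\r\n"


def mate_in(stream):
    """Return the next whitespace-delimited string, or None at EOF."""
    ch = stream.read(1)
    # Skip leading whitespace; read(1) returns "" at end of input.
    while ch != "" and ch in WHITESPACE:
        ch = stream.read(1)
    if ch == "":
        return None  # models the null reference
    chars = []
    # Collect characters up to the next whitespace character or EOF.
    while ch != "" and ch not in WHITESPACE:
        chars.append(ch)
        ch = stream.read(1)
    return "".join(chars)


source = io.StringIO("  \r\n\thello world")
print(mate_in(source))  # hello
print(mate_in(source))  # world
print(mate_in(source))  # None
```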
An operator invocation expression is used to invoke a class or instance operator.
OperatorInvocation:
Expression BinaryOperator Expression
UnaryOperator Expression
MinusOperator Expression
Expression MinusOperator Expression
The first step in processing an operator invocation at compile time is to figure out the operator to be invoked and which class to check for definitions of operators with the specified operator. There are several cases to consider, depending on the form, as follows:
The second step searches the class determined in the previous step for operator declarations. This step uses the operator and class determined in the previous step to locate operator declarations that are applicable, that is, declarations that can be correctly invoked on the given number of arguments. There may be more than one such operator declaration, in which case the most specific one is chosen. The descriptor (signature plus return type) of the most specific operator declaration is the one used at run time to do the operator dispatch.
Finding applicable operator declarations is entirely similar to finding applicable method declarations (7.9.2.1).
Finding the most specific operator declaration is entirely similar to finding the most specific method declaration (7.9.2.2).
At run-time, operator invocation is entirely similar to run-time evaluation of a method invocation (7.9.3).
An Expression is any assignment expression:
Expression:
AssignmentExpression
There is no exception "handling" in the maTe programming language. Run-time errors will cause the abrupt termination of the program with an associated error message.
ERROR: Out of memory.
ERROR: Null reference.
null
.
null
.
ERROR: Divide by zero.
An Integer division (2.3.3) operation is attempted where the value of the denominator is zero.
ERROR: Invalid cast.
ERROR: Index out of bounds.
The substr method defined by class String was invoked such that at least one of the following conditions was true (see 2.3.4):
The String reference which invoked the method contained 0 characters.
ERROR: Number format exception.
The toInteger method defined by class String was invoked such that at least one of the following conditions was true (see 2.3.4):
One or more characters in the String reference that invoked the method were other than the ASCII characters '0', '1', '2', '3', '4', '5', '6', '7', '8', or '9'. An ASCII minus sign '-' is the exception, but only if it is the first character (index 0).
The value represented by the String reference that invoked the method was less than -2147483648 or greater than 2147483647.
ERROR: Concurrent modification exception.
The put or remove method defined by class Table was invoked after the Table reference that invoked the method's iterator had been initialized but before it had reached the end of the Table.
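The two conditions listed under "ERROR: Number format exception." can be sketched as a validation routine in Python. This is a hedged model of the toInteger checks only: the names to_integer and MateRuntimeError are hypothetical, and rejecting an empty string or a lone minus sign is an assumption that the text above does not state.

```python
class MateRuntimeError(Exception):
    """Hypothetical stand-in for a maTe run-time error (7.18)."""


MIN_INTEGER = -2147483648
MAX_INTEGER = 2147483647


def to_integer(text):
    """Model of String.toInteger under the assumptions stated above."""
    # A minus sign is allowed only as the first character (index 0).
    digits = text[1:] if text.startswith("-") else text
    # Assumption: at least one digit is required.
    if digits == "" or any(ch not in "0123456789" for ch in digits):
        raise MateRuntimeError("ERROR: Number format exception.")
    value = int(text)
    if value < MIN_INTEGER or value > MAX_INTEGER:
        raise MateRuntimeError("ERROR: Number format exception.")
    return value


print(to_integer("-42"))         # -42
print(to_integer("2147483647"))  # 2147483647
```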