The maTe Language Specification, Version 1

Table of Contents

1 Lexical Structure

1.1 Line Terminators
1.2 Input Elements and Tokens
1.3 White Space
1.4 Comments
1.5 Identifiers
1.6 Keywords
1.7 Literals
1.7.1 Integer Literals
1.7.2 The Null Literal
1.7.3 The String Literal
1.8 Separators
1.9 Operators

2 Types, Values, and Variables

2.1 The Kinds of Types and Values
2.2 Predefined Types and Values
2.3 Reference Types and Values
2.3.1 Objects
2.3.2 The Class Object
2.3.3 The Class Integer
2.3.4 The Class String
2.3.5 The Class Table
2.3.6 When Reference Types Are the Same
2.4 Where Types Are Used
2.5 Variables
2.5.1 Variables of Reference Type
2.5.2 Kinds of Variables
2.5.3 Initial Values of Variables
2.5.4 Types and Classes

3 Conversions

3.1 Kinds of Conversion
3.1.1 Identity Conversions
3.1.2 Widening Reference Conversions
3.1.3 Narrowing Reference Conversions
3.1.4 Forbidden Conversions
3.2 Assignment Conversion
3.3 Method Invocation Conversion
3.4 Casting Conversion

4 Names

4.1 Declarations
4.2 Names and Identifiers
4.3 Scope of a Declaration
4.3.1 Shadowing Declarations
4.4 Members and Inheritance
4.4.1 The Members of a Class Type
4.5 Determining the Meaning of a Name
4.5.1 Syntactic Classification of a Name According to Context
4.5.2 Meaning of Type Names
4.5.3 Meaning of Expression Names
4.5.3.1 Simple Expression Names
4.5.3.2 Qualified Expression Names
4.5.4 Meaning of Method Names
4.5.4.1 Simple Method Names
4.5.4.2 Qualified Method Names

5 Classes

5.1 Class Declaration
5.1.1 Superclasses and Subclasses
5.1.2 Class Body and Member Declarations
5.2 Class Members
5.3 Field Declarations
5.4 Method Declarations
5.4.1 Formal Parameters
5.4.2 Method Signature
5.4.3 Method Body
5.4.4 Inheritance and Overriding
5.4.5 Overloading
5.5 Constructor Declarations
5.5.1 Formal Parameters
5.5.2 Constructor Signature
5.5.3 Constructor Body
5.5.3.1 Explicit Constructor Invocations
5.5.4 Constructor Overloading
5.5.5 Default Constructor
5.6 Operator Declarations
5.6.1 Formal Parameters
5.6.2 Operator Signature
5.6.3 Operator Body
5.6.4 Inheritance and Overriding
5.6.5 Overloading

6 Blocks and Statements

6.1 Main Block
6.2 Statements
6.3 Blocks
6.4 The Empty Statement
6.5 Expression Statements
6.6 The if-then-else Statement
6.7 The if-then Statement
6.8 The while Statement
6.9 The return Statement
6.10 The out Statement
6.11 The break Statement
6.12 The continue Statement
6.13 The Local Variable Declaration Statement

7 Expressions

7.1 Evaluation, Denotation, and Result
7.2 Variables as Values
7.3 Type of an Expression
7.4 Expressions and Run-Time Checks
7.5 Evaluation Order
7.5.1 Evaluate Left-Hand Operand First
7.5.2 Evaluate Operands before Operation
7.5.3 Evaluation Respects Parentheses and Precedence
7.5.4 Argument Lists are Evaluated Left-to-Right
7.6 Primary Expressions
7.6.1 Lexical Literals
7.6.2 this
7.6.3 Parenthesized Expressions
7.6.4 Expression Names
7.7 Class Instance Creation Expressions
7.7.1 Determining the Class being Instantiated
7.7.2 Choosing the Constructor and its Arguments
7.7.3 Run-time Evaluation of Class Instance Creation Expressions
7.8 Field Access Expressions
7.8.1 Field Access Using a Primary
7.8.2 Accessing Superclass Members Using super
7.9 Method Invocation Expressions
7.9.1 Compile-Time Step 1: Determine Class to Search
7.9.2 Compile-Time Step 2: Determine Method Signature
7.9.2.1 Find Methods that are Applicable
7.9.2.2 Choose the Most Specific Method
7.9.3 Runtime Evaluation of Method Invocation
7.9.3.1 Compute Target Reference (If Necessary)
7.9.3.2 Evaluate Arguments
7.9.3.3 Locate Method to Invoke
7.9.3.4 Create Frame
7.10 Unary Operators
7.10.1 Cast Operator
7.10.2 instanceof Operator
7.11 Arithmetic Operators
7.11.1 Multiplicative Operators
7.11.2 Additive Operators
7.12 Relational Operators
7.13 Equality Operator
7.14 Assignment Operator
7.15 Input Operator
7.16 Operator Invocation Expressions
7.17 Expression
7.18 Run-Time Errors

8 LALR(1) Grammar (yacc file)


CHAPTER 1

Lexical Structure


This chapter specifies the lexical structure of the language.

Programs are written in ASCII characters. Line terminators are defined (1.1) to support the different conventions of existing host systems while maintaining consistent line numbers.

The ASCII characters are reduced to a sequence of input elements (1.2), which are white space (1.3), comments (1.4), and tokens. The tokens are the identifiers (1.5), keywords (1.6), literals (1.7), separators (1.8), and operators (1.9) of the syntactic grammar.

1.1 Line Terminators

An implementation divides the sequence of ASCII characters into lines by recognizing line terminators. This definition of lines determines the line numbers that an implementation reports. It also specifies the termination of the // form of a comment (1.4).




LineTerminator:

    the ASCII LF character, also known as "newline"

    the ASCII CR character, also known as "return"

    the ASCII CR character followed by the ASCII LF character

	

InputCharacter:

    ASCIICharacter but not CR or LF



Lines are terminated by the ASCII characters CR, or LF, or CR LF. The two characters CR immediately followed by LF are counted as one line terminator, not two.

The result is a sequence of line terminators and input characters, which are the terminal symbols for the next step in the tokenization process (1.2).
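The line-terminator rule can be illustrated with a short sketch. It is written in Python purely for illustration (Python is not part of maTe), and the function name split_lines is hypothetical. It treats CR, LF, and CR LF each as a single terminator, so that line numbers stay consistent across host conventions.

```python
def split_lines(text: str) -> list[str]:
    """Split ASCII text into lines, treating CR, LF, and CR LF as terminators."""
    lines = []
    current = []
    i = 0
    while i < len(text):
        ch = text[i]
        if ch == "\r":
            # CR immediately followed by LF counts as one terminator, not two.
            if i + 1 < len(text) and text[i + 1] == "\n":
                i += 1
            lines.append("".join(current))
            current = []
        elif ch == "\n":
            lines.append("".join(current))
            current = []
        else:
            current.append(ch)
        i += 1
    # Characters after the last terminator form a final, unterminated line.
    if current:
        lines.append("".join(current))
    return lines
```

With this sketch, the input "a" CR LF "b" CR "c" LF "d" yields four lines, regardless of which terminator convention each line uses.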

1.2 Input Elements and Tokens

The input characters and line terminators are reduced to a sequence of input elements. Those input elements that are not white space (1.3) or comments (1.4) are tokens.

This process is specified by the following productions:




Input:

    InputElementsopt



InputElements:

    InputElement InputElements

    InputElement



InputElement:

    WhiteSpace

    Comment

    Token



Token:

    Identifier

    Keyword

    Literal

    Separator

    Operator
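The reduction of input characters to tokens can be sketched as below. This is an illustrative Python model, not the reference scanner: the names TOKEN_RE, KEYWORDS, and tokenize are hypothetical, and the token classes are taken from sections 1.3 through 1.9. The sketch reports new as a keyword (it is also listed as an operator in 1.9), and it lets a comment end at the end of input as well as at a line terminator; both choices are assumptions.

```python
import re

KEYWORDS = {
    "break", "class", "continue", "else", "extends", "if", "in", "instanceof",
    "main", "new", "newline", "null", "out", "operator", "return", "super",
    "tab", "this", "while",
}

# Order matters: comments are tried before the "/" operator, "==" before "=".
TOKEN_RE = re.compile(r'''
    (?P<WhiteSpace>[ \t\f\r\n]+)
  | (?P<Comment>//[^\r\n]*)
  | (?P<Word>[A-Za-z_][A-Za-z0-9_]*)
  | (?P<IntegerLiteral>[0-9]+)
  | (?P<StringLiteral>"[^"\r\n\t]*")
  | (?P<Separator>[(){};,.])
  | (?P<Operator>==|[=!+*/<>-])
''', re.VERBOSE)


def tokenize(source):
    """Return a list of (kind, text) pairs, discarding white space and comments."""
    tokens = []
    pos = 0
    while pos < len(source):
        m = TOKEN_RE.match(source, pos)
        if m is None:
            raise ValueError(f"unexpected character {source[pos]!r} at offset {pos}")
        kind, text = m.lastgroup, m.group()
        pos = m.end()
        if kind in ("WhiteSpace", "Comment"):
            continue  # input elements that are not tokens
        if kind == "Word":
            # Words are classified after matching, so keywords never become identifiers.
            if text == "null":
                kind = "NullLiteral"
            elif text in ("newline", "tab"):
                kind = "StringLiteral"
            elif text in KEYWORDS:
                kind = "Keyword"
            else:
                kind = "Identifier"
        tokens.append((kind, text))
    return tokens
```

Classifying words after matching, rather than listing each keyword in the pattern, keeps the keyword set in one place and mirrors the rule that an identifier is any IdentifierChars sequence that is not a keyword.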



1.3 White Space

White space is defined as the ASCII space, horizontal tab, and form feed characters, as well as line terminators.




WhiteSpace:

    the ASCII SP character, also known as "space"

    the ASCII HT character, also known as "horizontal tab"

    the ASCII FF character, also known as "form feed"

    LineTerminator



1.4 Comments

A comment has the following form:




// text



All text from the ASCII characters // up to the next LineTerminator is ignored.




EndOfLineComment:

    / / CharactersInLineopt LineTerminator



CharactersInLine:

    InputCharacter

    CharactersInLine InputCharacter



1.5 Identifiers

An identifier is an unlimited-length sequence of letters and digits, the first of which must be a letter. An identifier cannot have the same spelling (ASCII character sequence) as a keyword (1.6), the null literal (1.7.2), the tab string literal, or the newline string literal (1.7.3).




Identifier:

    IdentifierChars but not a Keyword or NullLiteral



IdentifierChars:

    Letter

    IdentifierChars LetterOrDigit



Letter:

    any ASCII character that is a letter (see below)



LetterOrDigit:

    any ASCII character that is a letter or digit (see below)



The letters include uppercase and lowercase ASCII Latin letters A-Z (0x41-0x5a), and a-z (0x61-0x7a), and the ASCII underscore (_, or 0x5f). The digits include the ASCII digits 0-9 (0x30-0x39).

Two identifiers are the same only if they are identical, that is, have the same ASCII character for each letter or digit.

1.6 Keywords

The following character sequences, formed from ASCII letters, are reserved for use as keywords and cannot be used as identifiers.




Keyword: one of

    break

    class 

    continue

    else

    extends

    if

    in

    instanceof

    main

    new

    newline
 
    null

    out

    operator

    return

    super

    tab

    this

    while
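The identifier rule (1.5) combined with the keyword list above can be checked with a small sketch. It is written in Python for illustration only; the names KEYWORDS, IDENT_RE, and is_identifier are hypothetical.

```python
import re

KEYWORDS = {
    "break", "class", "continue", "else", "extends", "if", "in", "instanceof",
    "main", "new", "newline", "null", "out", "operator", "return", "super",
    "tab", "this", "while",
}

# A letter (A-Z, a-z, or underscore) followed by any number of letters or digits.
IDENT_RE = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def is_identifier(text: str) -> bool:
    """True if text is spelled like an identifier and is not a reserved word."""
    return IDENT_RE.fullmatch(text) is not None and text not in KEYWORDS
```

Because identifiers are compared character by character, While is accepted by this check even though while is reserved.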





1.7 Literals

A literal is the source code representation of a value of type Integer, of type String, or of the null type.




Literal: 

    IntegerLiteral

    NullLiteral

    StringLiteral



1.7.1 Integer Literals

An integer literal must be expressed in decimal (base 10).




IntegerLiteral:

    DecimalNumeral



A decimal numeral consists of an ASCII digit from 0 to 9, optionally followed by one or more ASCII digits from 0 to 9, representing a non-negative integer.




DecimalNumeral:

    Digits



Digits:

    Digit

    Digits Digit



Digit: one of

    0 1 2 3 4 5 6 7 8 9



An integer literal is of type Integer (2.3.3).

The largest decimal literal is 2147483648 (2^31). All decimal literals from 0 to 2147483647 may appear anywhere an integer literal may appear, but the literal 2147483648 may appear only as the operand of the unary negation operator "-".

A compile-time error occurs if a decimal literal is larger than 2147483648 (2^31), or if the literal 2147483648 appears anywhere other than as the operand of the unary "-" operator.
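The range rule for decimal literals can be sketched as a compile-time check. The Python below is illustrative only; the function name check_integer_literal and its negated parameter are hypothetical, and a compile-time error is modelled as a raised ValueError.

```python
MAX_LITERAL = 2**31  # 2147483648, the largest decimal literal


def check_integer_literal(digits: str, negated: bool) -> int:
    """Return the value of a decimal literal, applying unary "-" when negated.

    `negated` is True when the literal is the operand of the unary "-" operator.
    Raises ValueError where the text above requires a compile-time error.
    """
    if not digits or any(ch not in "0123456789" for ch in digits):
        raise ValueError("not a decimal numeral")
    value = int(digits)
    if value > MAX_LITERAL:
        raise ValueError("integer literal too large")
    if value == MAX_LITERAL and not negated:
        raise ValueError("2147483648 may appear only as the operand of unary '-'")
    return -value if negated else value
```

The special case exists because -2147483648 is a legal Integer value (2.3.3) while +2147483648 is not.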

An integer literal is mapped by the compiler to a class instance creation expression which generates a new Integer object whose value is the value of the integer literal parsed as a signed decimal integer (base 10). The value of the class instance creation expression is a reference to the Integer instance which was created.

1.7.2 The Null Literal

The null type has one value, the null reference, represented by the literal null, which is formed from ASCII characters. A null literal has null type (2.1).




NullLiteral:

    null



1.7.3 The String Literal

A string literal is of type String (2.3.4).



StringLiteral:

    " StringCharacters "
    WhitespaceStringLiteral

WhitespaceStringLiteral:
    newline
    tab

StringCharacters:

    StringCharacter
    StringCharacters StringCharacter

StringCharacter:

    ASCIICharacter but not " (double quote), CR, LF or tab


A string literal is mapped by the compiler to a class instance creation expression which generates a new String object. For the quote-delimited string literal, the characters of the new String object will be all characters contained within the double quotes. For the newline string literal, the characters will be a single newline character. For the tab string literal, the characters will be a single tab character. The value of the class instance creation expression is a reference to the String instance which was created.

1.8 Separators

The following seven ASCII characters are the separators (punctuators):




Separator: one of

    (    )    {    }    ;    ,    .    



1.9 Operators

The following tokens are the operators, formed from ASCII characters:




Operator:

    =

    ==

    new

    UserDefinedOperator

UserDefinedOperator:

    UnaryOperator
    BinaryOperator
    MinusOperator

UnaryOperator:

    !

BinaryOperator:

    +
    *
    /
    >
    <

MinusOperator:

    -



Only UserDefinedOperator operators may be used in class operator declarations (5.6).


CHAPTER 2

Types, Values, and Variables


The maTe programming language is a strongly typed language, which means that every variable and every expression has a type that is known at compile time. Types limit the values that a variable can hold or that an expression can produce, limit the operations supported on those values, and determine the meaning of the operations. Strong typing helps detect errors at compile time.

The maTe programming language is a pure object-oriented language. All types are classes. Class types are divided into two categories: predefined types and user-defined types. There are four predefined types: the object type Object, the integer type Integer, the table type Table and the string type String. User-defined class types are subclasses of one of the predefined types. There is also a special null type. An object is a dynamically created instance of a class type. The values of a class type are references to objects. All objects support the methods of class Object. Names of types are used in declarations, class instance creation expressions, and cast operators.

A variable is a storage location. A variable of a class type T can hold a null reference or a reference to an instance of class T or of any class that is a subclass of T. A variable of type Object can hold a null reference or a reference to any object.

2.1 The Kinds of Types and Values

There are two kinds of types in the maTe programming language: predefined types and user-defined types. maTe being a pure object-oriented programming language, there is only one kind of data value that can be stored in variables, passed as arguments, returned by methods, and operated on: class object references.



Type:

    ReferenceType



There is also a special null type, the type of the expression null, which has no name. Because the null type has no name, it is impossible to declare a variable of the null type. The null reference is the only possible value of an expression of null type. The null reference can always be converted to any reference type. In practice, the programmer can ignore the null type and just pretend that null is merely a special literal that can be of any reference type.

2.2 Predefined Types and Values

A predefined type is predefined by the maTe programming language and named by its reserved keyword. There are four of these:

PredefinedType:

    Object

    Integer

    Table

    String

    

User-defined types are subclasses of at least one of these types.

2.3 Reference Types and Values

There is one kind of reference type: class types.



ReferenceType:

    ClassType

ClassType:

    TypeName



The sample code:

class Point { Integer metrics; }

declares a class type Point and uses the type Integer to declare the field metrics of the class Point.

2.3.1 Objects

An object is a class instance.

The reference values (often just called references) are pointers to these objects, plus a special null reference, which refers to no object.

A class instance is explicitly created by a class instance creation expression.

The operators on references to objects are:

There may be many references to the same object. Most objects have a state, stored in the fields of objects that are instances of classes. If two variables contain references to the same object, the state of the object can be modified using one variable's reference to the object, and then the altered state can be observed through the reference in the other variable.
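The aliasing behaviour described above can be illustrated as follows. The sketch is in Python, whose object references behave the same way in this respect; the class Point and its field metrics follow the sample in 2.3, while the variable names are hypothetical.

```python
class Point:
    def __init__(self):
        self.metrics = 0


a = Point()
b = a                 # b holds a reference to the same object, not a copy
b.metrics = 7         # modify the object's state through b
observed = a.metrics  # the altered state is visible through a
```

After these statements, observed is 7, because a and b refer to one object with one set of fields.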

2.3.2 The Class Object

The class Object is a superclass of all other classes. A variable of type Object can hold a reference to any object. All classes inherit the methods of class Object, which are summarized here:


Integer equals(Object obj) { . . . }
Integer hashCode() { . . . }
String  toString() { . . . }

The equals method defined by class Object returns a new Integer object whose value is 1 if the two references refer to the same object and 0 otherwise. This method can be overridden in subclasses to define a notion of object equality, which is based on value, not reference, comparison.
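The difference between the default equals and an overriding, value-based equals can be sketched as below. This is an illustrative Python model: MateObject is a hypothetical stand-in for the class Object, and the methods return plain 0 or 1 rather than new Integer objects.

```python
class MateObject:
    """Hypothetical stand-in for the predefined class Object."""

    def equals(self, obj):
        # Default behaviour: reference comparison, reported as 1 or 0.
        return 1 if self is obj else 0


class Point(MateObject):
    def __init__(self, metrics):
        self.metrics = metrics

    def equals(self, obj):
        # Overriding definition: comparison by value instead of by reference.
        return 1 if isinstance(obj, Point) and obj.metrics == self.metrics else 0
```

Two distinct MateObject instances compare as 0 under the default method, while two distinct Point instances with the same metrics compare as 1 under the override.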

The hashCode method defined by class Object returns a new Integer object whose value is computed from the Object reference which invoked the method. This method can be overridden in subclasses to define a sensible hashing algorithm for the class.

The toString method defined by class Object returns a new String object whose characters will be "Object". This method can be overridden in subclasses to define an output format for the class.

The Object class defines a constructor:


Object() { }

Note that the body of the constructor is empty. Since Object is the primordial class and has no superclass, there is no call, either implicit or explicit, to the superclass constructor.

2.3.3 The Class Integer

The class Integer is a subclass of the class Object. A variable of type Integer can hold a reference to any Integer object or a reference to any object whose class type is a descendant of Integer.

An Integer object holds a single integer value. The smallest legal value is -2147483648 and the largest is 2147483647. Overflow is not viewed as an error for methods defined by class Integer.
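The text above states only that overflow is not an error; it does not say which value an overflowing operation produces. The Python sketch below assumes two's-complement wraparound into the legal range, which is an assumption made for illustration and not something this specification confirms. The names wrap32 and add are hypothetical.

```python
INT_MIN = -2147483648
INT_MAX = 2147483647


def wrap32(value: int) -> int:
    """Reduce an unbounded Python int to the signed 32-bit range by wrapping.

    ASSUMPTION: wraparound is one possible overflow behaviour; the
    specification itself only says that overflow is not an error.
    """
    return (value - INT_MIN) % 2**32 + INT_MIN


def add(a: int, b: int) -> int:
    # Addition that never fails, mirroring "overflow is not an error".
    return wrap32(a + b)
```

Under this assumption, adding 1 to the largest legal value yields the smallest legal value instead of terminating the program.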

All classes which extend Integer inherit all of the methods of Integer, which are summarized here:


Integer equals(Object obj) { . . . }
Integer hashCode() { . . . }
String  toString() { . . . }
Integer add(Integer i) { . . . }
Integer subtract(Integer i) { . . . }
Integer multiply(Integer i) { . . . }
Integer divide(Integer i) { . . . }
Integer greaterThan(Integer i) { . . . }
Integer lessThan(Integer i) { . . . }
Integer not() { . . . }
Integer minus() { . . . }
Integer operator + (Integer i) { . . . }
Integer operator - (Integer i) { . . . }
Integer operator * (Integer i) { . . . }
Integer operator / (Integer i) { . . . }
Integer operator > (Integer i) { . . . }
Integer operator < (Integer i) { . . . }
Integer operator ! () { . . . }
Integer operator - () { . . . }

The equals method defined by class Integer returns a new Integer object whose value is either 0 or 1. The value is 1 if and only if obj is convertible by casting conversion (3.4) to Integer and the value of the Integer reference that invoked the method is equal to the value of obj.

The hashCode method defined by class Integer returns a new Integer object whose value is the value of the Integer reference which invoked the method.

The toString method defined by class Integer returns a new String object whose characters are the signed decimal (base 10) representation of the value of the Integer reference which invoked the method.

The add method defined by class Integer returns a new Integer object whose value is the value of i added to the value of the Integer reference which invoked the method. The value of the Integer reference which invoked the method will be unchanged.

The operator + method defined by class Integer is completely similar to the method add. If s and t are Integer references, an invocation of s + t will have all the same effects as an invocation of s.add(t).

The subtract method defined by class Integer returns a new Integer object whose value is the value of i subtracted from the value of the Integer reference which invoked the method. The value of the Integer reference which invoked the method will be unchanged.

The operator - method defined by class Integer is completely similar to the method subtract. If s and t are Integer references, an invocation of s - t will have all the same effects as an invocation of s.subtract(t).

The multiply method defined by class Integer returns a new Integer object whose value is the value of i multiplied by the value of the Integer reference which invoked the method. The value of the Integer reference which invoked the method will be unchanged.

The operator * method defined by class Integer is completely similar to the method multiply. If s and t are Integer references, an invocation of s * t will have all the same effects as an invocation of s.multiply(t).

The divide method defined by class Integer returns a new Integer object whose value is the value of the Integer reference which invoked the method divided by the value of i. The value of the Integer reference which invoked the method will be unchanged.

The following run-time errors must be detected for the divide method:

For all detected run-time errors, the action is to terminate the program with an error message.

The operator / method defined by class Integer is completely similar to the method divide. If s and t are Integer references, an invocation of s / t will have all the same effects as an invocation of s.divide(t).

The lessThan method defined by class Integer returns a new Integer object whose value is either 0 or 1. The value is 1 if and only if the value of the Integer reference that invoked the method is less than the value of i.

The operator < method defined by class Integer is completely similar to the method lessThan. If s and t are Integer references, an invocation of s < t will have all the same effects as an invocation of s.lessThan(t).

The greaterThan method defined by class Integer returns a new Integer object whose value is either 0 or 1. The value is 1 if and only if the value of the Integer reference that invoked the method is greater than the value of i.

The operator > method defined by class Integer is completely similar to the method greaterThan. If s and t are Integer references, an invocation of s > t will have all the same effects as an invocation of s.greaterThan(t).

The not method defined by class Integer returns a new Integer object whose value is either 0 or 1. The value is 1 if and only if the value of the Integer reference that invoked the method is 0. The value of the Integer reference which invoked the method will be unchanged.

The operator ! method defined by class Integer is completely similar to the method not. If s is an Integer reference, an invocation of !s will have all the same effects as an invocation of s.not().

The minus method defined by class Integer returns a new Integer object whose value is the arithmetic negation of the value of the Integer reference which invoked the method. The value of the Integer reference which invoked the method will be unchanged.

The operator - method (invoked with no arguments) defined by class Integer is completely similar to the method minus. If s is an Integer reference, an invocation of -s will have all the same effects as an invocation of s.minus().

The Integer class defines the following constructors:


Integer()

This constructor defined by class Integer creates a new Integer object whose value is 0.


Integer(Integer i)

This constructor defined by class Integer creates a new Integer object whose value is the same as the value of i.

2.3.4 The Class String

The class String is a subclass of the class Object. A variable of type String can hold a reference to any String object or a reference to any object whose class type is a descendant of String.

A String object contains an array of ASCII characters. String objects are immutable: their characters cannot be modified once the object has been created, nor can their size change (characters cannot be added or removed).

All classes which extend String inherit all of the methods of String, which are summarized here:


Integer equals(Object obj) { . . . }
Integer hashCode() { . . . }
String  toString() { . . . }
Integer length() { . . . }
String  substr(Integer beg, Integer end) { . . . }
String  concat(String s) { . . . }
Integer toInteger() { . . . }
String  operator + (String s) { . . . }
Integer operator > (String s) { . . . }
Integer operator < (String s) { . . . }

The equals method defined by class String returns a new Integer object whose value will be either 0 or 1. The value will be 1 if and only if obj is convertible by casting conversion (3.4) to String and the characters of the String reference that invoked the method are lexicographically equal to the characters of the String reference obj.

The hashCode method defined by class String returns a new Integer object whose value is the summation of the ASCII values of all characters in the String reference which invoked the method.
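The String hashing rule is simple enough to state as a one-line sketch. The Python below is illustrative; the function name string_hash_code is hypothetical and it returns a plain integer rather than a new Integer object.

```python
def string_hash_code(chars: str) -> int:
    """Sum of the ASCII values of all characters, as described for String.hashCode."""
    return sum(ord(ch) for ch in chars)
```

One consequence of summing character values is that strings containing the same characters in a different order, such as "ab" and "ba", receive the same hash value and therefore land in the same Table bucket.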

The toString method defined by class String returns a new String object whose characters will be the same characters as the String reference which invoked the method.

The length method defined by class String returns a new Integer object whose value is the number of characters in the String reference which invoked the method.

The substr method defined by class String returns a new String object whose characters are those at indices beg through end, inclusive, of the String reference which invoked the method. Indices of a String object are assigned as follows: the first character is assigned index 0, the second character index 1, the third character index 2, and so on. The characters of the new String object will be in the same order as those in the String reference which invoked the method. The minimum legal index for a String reference is 0. The maximum legal index is the number of characters it contains minus 1. Thus, legal indices for a String reference containing 5 characters range from 0 to 4. A String reference containing 0 characters has no legal indices.

The following run-time errors must be detected for the substr method:

For all detected run-time errors, the action is to terminate the program with an error message.
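The inclusive indexing of substr can be sketched as follows. The Python is illustrative only and the function name is hypothetical. The sketch assumes that an index outside the legal range, or beg greater than end, is a run-time error; it raises IndexError where an implementation would terminate the program with an error message.

```python
def substr(chars: str, beg: int, end: int) -> str:
    """Return the characters at indices beg..end, both ends inclusive."""
    last = len(chars) - 1  # maximum legal index; -1 means no legal indices
    # ASSUMPTION: illegal indices and beg > end are treated as run-time errors.
    if not (0 <= beg <= last and 0 <= end <= last):
        raise IndexError("index outside the legal range")
    if beg > end:
        raise IndexError("beg is greater than end")
    # Python slices exclude the upper bound, so add 1 to include index end.
    return chars[beg:end + 1]
```

For a 5-character string, substr with beg 0 and end 4 returns the whole string, and beg equal to end returns a single character.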

The concat method defined by class String returns a new String object whose value is the characters of String s appended to the characters in the String reference which invoked the method.

The toInteger method defined by class String returns a new Integer object whose value is the decimal value obtained by parsing the characters of the String reference which invoked the method as a signed decimal integer.

The following run-time errors must be detected for the toInteger method:

For all detected run-time errors, the action is to terminate the program with an error message.

The operator + method defined by class String is completely similar to the method concat. If s and t are String references, an invocation of s + t will have all the same effects as an invocation of s.concat(t).

The operator > method defined by class String returns a new Integer object whose value will be either 0 or 1. The value will be 1 if and only if the characters of the String reference s lexicographically precede the characters of the String reference that invoked the method.

The operator < method defined by class String returns a new Integer object whose value will be either 0 or 1. The value will be 1 if and only if the characters of the String reference that invoked the method lexicographically precede the characters of the String reference s.

The String class defines the following constructor:


String(String s)

This constructor defined by class String creates a new String object whose characters are the same as the characters in s.

2.3.5 The Class Table

The class Table is a subclass of the class Object. A variable of type Table can hold a reference to any Table object or a reference to any object whose class type is a descendant of Table.

A Table object is an implementation of a hash map. The number of buckets in a Table object is known as its capacity. A Table object's initial capacity is set at the time it is constructed. A Table object's load factor is a measure of how full the Table is allowed to get before its capacity is automatically increased. When the number of entries in a Table exceeds the product of the load factor and the current capacity, the capacity is roughly doubled and the hashCode method is called on the keys of all entries in the Table to determine their new bucket placement with the new capacity.

Each entry in a bucket is a key/value pair. Thus each entry in a Table object contains a key reference and a value reference. Each bucket of a Table object can hold any number of entries. The first bucket is known as bucket 0, the second bucket is bucket 1, the third bucket is bucket 2, and so on. Bucket n is the bucket an entry will be placed in if its key's hash value is n. The hash value for a key is computed by invoking the hashCode method on the key and then taking that number modulo c, where c is the current capacity of the Table reference.
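Bucket placement and capacity growth can be sketched as below. The Python is illustrative and every function name is hypothetical. Keys are modelled as plain integers whose hashCode is their own value, as for class Integer (2.3.3). The load factor is passed in as a parameter because its value is not given here, and "roughly doubled" is modelled as exact doubling; both are assumptions. Python's % always yields a non-negative result for a positive capacity; how negative hash codes are handled is not stated in the text.

```python
def hash_code(key: int) -> int:
    # For an Integer key, hashCode returns the key's own value.
    return key


def bucket_index(key: int, capacity: int) -> int:
    """Hash value of a key: its hashCode modulo the current capacity."""
    return hash_code(key) % capacity


def needs_growth(entry_count: int, capacity: int, load_factor: float) -> bool:
    """True when the entry count exceeds load factor times capacity."""
    return entry_count > load_factor * capacity


def rehash(buckets, new_capacity):
    """Redistribute all (key, value) entries into new_capacity buckets."""
    new_buckets = [[] for _ in range(new_capacity)]
    for bucket in buckets:
        for key, value in bucket:
            new_buckets[bucket_index(key, new_capacity)].append((key, value))
    return new_buckets
```

For example, keys 19 and 35 share bucket 3 at capacity 16, but after growth to capacity 32 key 19 moves to bucket 19 while key 35 stays in bucket 3. This is why every key must be hashed again when the capacity changes.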

A Table reference can be iterated over using the firstKey and nextKey methods. The Table iterator is fail-fast. This means that if an entry is added to or removed from the Table after the iterator has been initialized with a call to firstKey but before a call to nextKey causes it to reach the end of the Table, a run-time error occurs.

All classes which extend Table inherit all of the methods of Table, which are summarized here:


Object  get(Object key) { . . . }
Object  put(Object key, Object value) { . . . }
Object  remove(Object key) { . . . }
Integer firstKey() { . . . }
Object  nextKey() { . . . }

The get method defined by class Table returns the value reference of the Table entry whose key reference is equal to key, or a null reference if no such entry exists. To find this entry, inspect bucket n, where n is the hash value for key (computed using the hashCode method and the Table's current capacity, see above). Then, run the following search algorithm:

The put method defined by class Table inserts a new entry into the Table reference which invoked the method, and returns the old value reference for key or a null reference if there was none. To add a new entry, the hash value for key is first computed (using the hashCode method and the Table's current capacity, see above). Then, bucket n is inspected, where n is the computed hash value, using the following insertion algorithm:

The following run-time errors must be detected for the put method:

For all detected run-time errors, the action is to terminate the program with an error message.

The remove method defined by class Table removes the entry whose key reference is equal to key in the Table reference which invoked the method. If an entry with a matching key reference is found, the value reference of the removed entry is returned. Otherwise, the remove method returns a null reference. To find the entry to remove, inspect bucket n, where n is the hash value for key (computed using the hashCode method and the Table's current capacity, see above). Then, run the following search algorithm:

The following run-time errors must be detected for the remove method:

For all detected run-time errors, the action is to terminate the program with an error message.

The firstKey method defined by class Table initializes an iterator over the Table reference which invoked the method. After invoking this method on a Table reference, the next invocation of nextKey will return the key reference of the first entry in the first non-empty bucket of that Table reference, or a null reference if none exist. If the first invocation of nextKey would return a null reference, firstKey returns a new Integer object with a value of 0, otherwise firstKey will return a new Integer object with a value of 1.

The nextKey method defined by class Table advances the iterator and returns the key reference of the entry previously pointed to by the iterator of the Table reference which invoked the method. Advancing the iterator is done as follows:

All further calls to nextKey on a Table reference whose iterator has reached the end of the Table will return a null reference until the firstKey method is invoked again.
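The firstKey and nextKey protocol, including the fail-fast rule, can be sketched as below. This Python model is illustrative only: the class IterableTable, its add method, and the mod_count counter are hypothetical, keys are plain integers that hash to themselves, and entries are visited bucket by bucket in insertion order within a bucket. The sketch also assumes that the fail-fast error is raised by the next call to next_key after a modification, and that next_key returns None when first_key has never been called.

```python
class IterableTable:
    def __init__(self, capacity=16):
        self.buckets = [[] for _ in range(capacity)]
        self.mod_count = 0   # hypothetical counter of additions and removals
        self._cursor = None  # (bucket index, entry index), or None at the end
        self._seen_mod = 0   # value of mod_count when first_key was called

    def add(self, key, value):
        # Simplified insertion for the sketch; no duplicate-key handling.
        self.buckets[key % len(self.buckets)].append((key, value))
        self.mod_count += 1

    def _advance(self, b, e):
        # Find the first entry at or after position (b, e), bucket by bucket.
        while b < len(self.buckets):
            if e < len(self.buckets[b]):
                return (b, e)
            b, e = b + 1, 0
        return None

    def first_key(self):
        # Point the iterator at the first entry of the first non-empty bucket.
        self._cursor = self._advance(0, 0)
        self._seen_mod = self.mod_count
        return 0 if self._cursor is None else 1

    def next_key(self):
        if self._cursor is None:
            return None  # at the end until first_key is invoked again
        if self.mod_count != self._seen_mod:
            raise RuntimeError("Table modified during iteration")
        b, e = self._cursor
        key = self.buckets[b][e][0]
        self._cursor = self._advance(b, e + 1)
        return key
```

A typical loop calls first_key once and then calls next_key repeatedly until it returns None. Comparing a modification counter is one common way to detect concurrent changes cheaply; it is used here as a design choice of the sketch.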

The Table class defines the following constructors:


Table()

This constructor defined by class Table creates a new Table object whose initial capacity is 16.


Table(Integer n)

This constructor defined by class Table creates a new Table object whose initial capacity is the value of the Integer reference n.

2.3.6 When Reference Types Are the Same

Two reference types are the same type if:

2.4 Where Types Are Used

Types are used in declarations, in class instance creation expressions, and in cast operator expressions.

2.5 Variables

A variable is a storage location and has an associated type, sometimes called its compile-time type, which is always a reference type. A variable's value can be changed by an assignment, and a variable may only be assigned a value that is assignment compatible with its type.

Compatibility of the value of a variable with its type is guaranteed by the design of the maTe programming language: default values are compatible, and all variable assignments are checked for assignment compatibility at compile time, at run time, or both.

2.5.1 Variables of Reference Type

A variable of reference type can hold either of the following:

2.5.2 Kinds of Variables

There are four kinds of variables:

  1. An instance variable is a field declared within a class declaration. If a class T has a field a that is an instance variable, then a new instance variable a is created and initialized to a default value as part of each newly created object of class T or of any class that is a subclass of T. The instance variable effectively ceases to exist when the object of which it is a field has been destroyed (deleted).
  2. Method parameters name argument values passed to a method. For every parameter declared in a method declaration, a new parameter variable is created each time that method is invoked. The new variable is initialized with the corresponding argument value from the method invocation. The method parameter effectively ceases to exist when the execution of the body of the method is complete.
  3. Constructor parameters name argument values passed to a constructor. For every parameter declared in a constructor declaration, a new parameter variable is created each time a class instance creation expression or explicit constructor invocation invokes that constructor. The new variable is initialized with the corresponding argument value from the creation expression or constructor invocation. The constructor parameter effectively ceases to exist when the execution of the body of the constructor is complete.
  4. Local variables are declared by variable declaration statements within the main block or the body of a class constructor, method or operator. Declaration statements are supported at all block levels. A local variable becomes visible when the flow of control reaches its declaration. A local variable effectively ceases to exist when execution leaves the block in which it is declared.
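
The kinds of variables listed above can be seen together in one small class. This is a sketch only: the class Counter and its members are hypothetical names, and the // comment style is assumed.

```
class Counter {
    Integer count;                  // instance variable: one per Counter object

    Counter(Integer start) {        // start is a constructor parameter
        count = start;
    }

    Integer set(Integer value) {    // value is a method parameter
        Integer old;                // old is a local variable, visible from here on
        old = count;
        count = value;
        return old;                 // old and value cease to exist after the return
    }
}
```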

2.5.3 Initial Values of Variables

Every variable in a program must have a value before its value is used:

2.5.4 Types and Classes

In the maTe programming language, every variable and every expression has a type that can be determined at compile time. Reference types are introduced by type declarations, which include class declarations.

Every object belongs to some particular class: the class that was mentioned in the creation expression that produced the object. This class is called the class of the object. An object is said to be an instance of its class and of all superclasses of its class.

Sometimes a variable or expression is said to have a "run-time type". This refers to the class of the object referred to by the value of the variable or expression at run time, assuming that the value is not null.

The compile-time type of a variable is always declared, and the compile-time type of an expression can be deduced at compile time. The compile-time type limits the possible values that the variable can hold or the expression can produce at run time. If a run-time value is a reference that is not null, it refers to an object that has a class, and that class will necessarily be compatible with the compile-time type.
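
The distinction between compile-time type and run-time type can be sketched as follows. Shape and Circle are hypothetical classes; the new expression form (7.7) and // comments are assumed.

```
class Shape { }
class Circle extends Shape { }

Integer main() {
    Shape s;            // the compile-time type of s is Shape
    s = new Circle();   // the run-time type of s is now Circle, a subclass of Shape
    s = null;           // s holds the null reference, so no run-time type applies
    return 0;
}
```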


CHAPTER 3

Conversions


Every expression written in the maTe programming language has a type that can be deduced from the structure of the expression and the types of the literals, variables, and methods mentioned in the expression. It is possible, however, to write an expression in a context where the type of the expression is not appropriate. In some cases, this leads to an error at compile time.

A specific conversion from type S to type T allows an expression of type S to be treated at compile time as if it had type T instead. In some cases this will require a corresponding action at run time to check the validity of the conversion.

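For convenience of description, the specific conversions that are possible in the maTe programming language are grouped into three broad categories: identity conversions (3.1.1), widening reference conversions (3.1.2), and narrowing reference conversions (3.1.3).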

There are three conversion contexts in which conversion of expressions may occur. The term "conversion" is also used to describe the process of choosing a specific conversion for such a context. For example, we say that an expression that is an actual argument in a method invocation is subject to "method invocation conversion," meaning that a specific conversion will be implicitly chosen for that expression according to the rules for the method invocation argument context.

This chapter first describes the three categories of conversions (3.1). Then the three conversion contexts are described: assignment conversion (3.2), method invocation conversion (3.3), and casting conversion (3.4).

3.1 Kinds of Conversion

Specific type conversions in the maTe programming language are divided into the following categories.

3.1.1 Identity Conversions

A conversion from a type to that same type is permitted for any type.

This may seem trivial, but it does have practical consequences. It is always permitted for an expression to have the desired type to begin with, thus allowing the simply stated rule that every expression is subject to conversion, if only a trivial identity conversion.

3.1.2 Widening Reference Conversions

The following conversions are called the widening reference conversions:

Such conversions never require a special action at run time. They consist simply in regarding a reference as having some other type in a manner that can be proved correct at compile time.

See 5 for the detailed specifications for classes.

3.1.3 Narrowing Reference Conversions

The following conversions are called the narrowing reference conversions:

Such conversions require a test at run time to find out whether the actual reference value is a legitimate value of the new type.

3.1.4 Forbidden Conversions

3.2 Assignment Conversion

Assignment conversion occurs when the value of an expression is assigned (7.14) to a variable: the type of the expression must be converted to the type of the variable. Assignment contexts allow the use of an identity conversion (3.1.1) or a widening reference conversion (3.1.2).

If the type of the expression cannot be converted to the type of the variable by a conversion permitted in an assignment context, then a compile-time error occurs.

If the type of an expression can be converted to the type of a variable by assignment conversion, we say the expression (or its value) is assignable to the variable or, equivalently, that the type of the expression is assignment compatible with the type of the variable.

A value of the null type (the null reference is the only such value) may be assigned to any reference type, resulting in a null reference of that type.

Assignment of a value of compile-time reference type S (source) to a variable of compile-time reference type T (target) is checked as follows:

See 5 for the specification of classes.
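
A minimal sketch of assignment conversion follows. It assumes that a conversion from a subclass to its superclass is a widening reference conversion and that the reverse direction is a narrowing one; Animal and Dog are hypothetical classes, and the new expression form and // comments are assumed.

```
class Animal { }
class Dog extends Animal { }

Integer main() {
    Animal a;
    Dog d;

    d = new Dog();   // identity conversion: Dog to Dog
    a = d;           // permitted: widening reference conversion from Dog to Animal
    a = null;        // permitted: null may be assigned to any reference type
    // d = a;        // compile-time error: narrowing from Animal to Dog is not
    //               // allowed in an assignment context
    return 0;
}
```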

3.3 Method Invocation Conversion

Method invocation conversion is applied to each argument value in a method or constructor invocation (7.7, 7.9): the type of the argument expression must be converted to the type of the corresponding parameter. Method invocation contexts allow the use of an identity conversion (3.1.1) or a widening reference conversion (3.1.2).
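
The same rules apply to arguments, as this sketch suggests. Kennel, Animal and Dog are hypothetical classes; the invocation form k.admit(d) follows the qualified method names of 4.5.4.2, and the new expression form is assumed.

```
class Animal { }
class Dog extends Animal { }

class Kennel {
    Animal resident;

    Animal admit(Animal a) {   // the parameter type is Animal
        resident = a;
        return a;
    }
}

Integer main() {
    Kennel k;
    Dog d;

    k = new Kennel();
    d = new Dog();
    k.admit(d);   // the Dog argument is widened to the Animal parameter type
    return 0;
}
```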

3.4 Casting Conversion

Casting conversion is applied to the operand of the cast operator: the type of the operand expression must be converted to the type explicitly named by the cast operator. Casting conversion allows the use of an identity conversion (3.1.1), a widening reference conversion (3.1.2), or a narrowing reference conversion (3.1.3). A cast that uses a narrowing reference conversion (3.1.3) requires a run-time check to see if the cast is valid. If the cast is not valid, the program terminates with an appropriate error message.
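
The following sketch shows a narrowing cast that passes its run-time check and one that does not. The cast syntax (T) expr is an assumption here, since the cast operator is specified in 7.10.1; Animal, Dog and Cat are hypothetical classes.

```
class Animal { }
class Dog extends Animal { }
class Cat extends Animal { }

Integer main() {
    Animal a;
    Dog d;
    Cat c;

    a = new Dog();
    d = (Dog) a;   // narrowing: the run-time check succeeds, a refers to a Dog
    c = (Cat) a;   // narrowing: accepted at compile time, but the run-time check
                   // fails and the program terminates with an error message
    return 0;
}
```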


CHAPTER 4

Names


Names are used to refer to entities declared in a program (4.1). A declared entity is a class type, a member (field or method) of a reference type, a parameter (to a method or constructor) or a local variable.

Names in maTe programs are simple, consisting of a single identifier (4.2).

Every declaration that introduces a name has a scope (4.3), which is the part of the program text within which the declared entity can be referred to by a name.

Reference types (that is, class types) have members (4.4). A member can be referred to using a qualified name N.x, where N is a variable of a reference type (or this or super) and x is an identifier that names a member of that type, which is either a field or a method.

In determining the meaning of a name (4.5), the context of the occurrence is used to disambiguate among types, variables, and methods with the same name.

The name of a field, parameter, or local variable may be used as an expression (7.2). The name of a method may appear in an expression only as part of a method invocation expression (7.9). The name of a class type may appear in an expression only as part of a class instance creation expression (7.7), a cast operator (7.10.1) or an instanceof operator (7.10.2).

4.1 Declarations

A declaration introduces an entity into a program and includes an identifier that can be used as a name to refer to this entity. A declared entity is one of the following:

Constructors are also introduced by declarations, but use a name based upon the name of the class in which they are declared rather than introducing a new name.

4.2 Names and Identifiers

A name is used to refer to an entity declared in a program. All names are simple names: a single identifier.

4.3 Scope of a Declaration

The scope of a declaration is the region of the program within which the entity declared by the declaration can be referred to using a name (provided it is visible). A declaration is said to be in scope at a particular point in a program if and only if the declaration's scope includes that point.

These rules imply that declarations of class types need not appear before uses of the types.

4.3.1 Shadowing Declarations

Some declarations may be shadowed in part of their scope by another declaration of the same name, in which case a simple name cannot be used to refer to the shadowed entity.

A declaration d of a method parameter or constructor parameter named n shadows, throughout the scope of d, the declarations of any fields named n that are in scope at the point where d occurs.

Similarly, a local variable in a method or constructor body shadows, throughout its scope, a parameter or a field with the same name. Likewise, an inner declaration of a local variable shadows, throughout its scope, an outer declaration of a local variable of the same name that is in scope.

A declaration d is said to be visible at point p in a program if the scope of d includes p, and d is not shadowed by any other declaration at p.

Note that shadowing is distinct from hiding. Hiding applies only to members which would otherwise be inherited but are not because of a declaration in a subclass.
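
A parameter shadowing a field might look like the sketch below. Point is a hypothetical class; the qualified form this.x follows the description of qualified names in chapter 4, and // comments are assumed.

```
class Point {
    Integer x;

    Point(Integer x) {   // the parameter x shadows the field x in this body
        this.x = x;      // this.x names the field; the simple name x names the parameter
    }

    Integer getX() {
        return x;        // nothing shadows the field here, so x names the field
    }
}
```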

4.4 Members and Inheritance

Reference types have members.

This section provides an overview of the members of reference types, as background for the discussion of the determination of the meaning of names.

4.4.1 The Members of a Class Type

The members of a class type are fields and methods. Members are either declared in the type, or inherited because they are members of a superclass which are not overridden.

The members of a class type are all of the following:

Constructors are not members.

There is no restriction against a field and a method of a class type having the same name.

A class type may have two or more methods with the same name if the methods have different signatures, that is, if they have different numbers of parameters or different parameter types in at least one parameter position. Such a method member name is said to be overloaded.

A class type may contain a declaration for a method with the same name and the same signature as a method that would otherwise be inherited from a superclass. In this case, the method of the superclass is not inherited. The new declaration is said to override it.

4.5 Determining the Meaning of a Name

The meaning of a name depends on the context in which it is used. The determination of the meaning of a name requires two steps. First, context causes a name syntactically to fall into one of three categories: TypeName, ExpressionName or MethodName. Second, the resulting category then dictates the final determination of the meaning of the name (or a compilation error if the name has no meaning).



TypeName:

    Identifier



ExpressionName:

    Identifier



MethodName:

    Identifier



4.5.1 Syntactic Classification of a Name According to Context

A name is syntactically classified as a TypeName in these contexts:

A name is syntactically classified as an ExpressionName in these contexts:

A name is syntactically classified as a MethodName in this context:

4.5.2 Meaning of Type Names

A type name consists of a single Identifier. The identifier must occur in the scope of a declaration of a type with this name, or a compile-time error occurs.

4.5.3 Meaning of Expression Names

The meaning of a name classified as an ExpressionName is determined as follows.

4.5.3.1 Simple Expression Names

If an expression name consists of a single Identifier, then:

4.5.3.2 Qualified Expression Names

If an expression name is of the form Q.Id, then Q has already been classified as an expression name. Let T be the type of Q:

4.5.4 Meaning of Method Names

A MethodName can appear only in a method invocation expression. The meaning of a name classified as a MethodName is determined as follows.

4.5.4.1 Simple Method Names

If a method name consists of a single Identifier, then Identifier is the method name to be used for method invocation. The Identifier must name at least one method of a class within whose declaration the Identifier appears.

4.5.4.2 Qualified Method Names

If a method name is of the form Q.Id, then Q has already been classified as an expression name. Id is the method name to be used for method invocation. Let T be the type of the expression Q; Id must name at least one method of the type T.


CHAPTER 5

Classes


Class declarations define new reference types and describe how they are implemented (5.1).

Each class except Object is an extension of (that is, a subclass of) a single existing class (5.1.1).

The body of a class declares members (fields, methods and operators) and constructors (5.1.2). The scope (4.3) of a member (5.2) is the entire declaration of the class to which the member belongs. The members of a class include both declared and inherited members (5.2). Newly declared fields can hide fields declared in a superclass. Newly declared methods can override methods declared in a superclass.

Field declarations (5.3) describe instance variables, which are freshly incarnated for each instance of the class.

Method declarations (5.4) describe code that may be invoked by method invocation expressions (7.9). A method is invoked with respect to some particular object that is an instance of the class type.

Operator declarations (5.6) describe code that may be invoked by operator invocation expressions (7.16). An operator is invoked with respect to some particular object that is an instance of the class type.

Method names may be overloaded (5.4.5).

Constructors (5.5) are similar to methods, but cannot be invoked directly by a method call; they are used to initialize new class instances. Like methods, they may be overloaded (5.5.4).

5.1 Class Declaration

A class declaration specifies a new reference type:




ClassDeclaration:

    class Identifier Superopt ClassBody



The Identifier in a class declaration specifies the name of the class. A compile-time error occurs if a class has the same name as any other class in the program.

5.1.1 Superclasses and Subclasses

The optional extends clause in a class declaration specifies the direct superclass of the current class. A class is said to be a direct subclass of the class it extends. The direct superclass is the class from whose implementation the implementation of the current class is derived. If the class declaration for any class has no extends clause, then the class has the class Object as its implicit direct superclass.




Super:

    extends ClassType



The following is repeated from 2.3 to make the presentation here clearer:




ClassType:

	TypeName



The ClassType must name a class type, or a compile-time error occurs.

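The subclass relationship is the transitive closure of the direct subclass relationship. A class A is a subclass of class C if either of the following is true: A is a direct subclass of C, or there exists a class B such that A is a subclass of B and B is a subclass of C, applying this definition recursively.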

Class C is said to be a superclass of class A whenever A is a subclass of C.

A class C directly depends on a type T if T is mentioned in the extends clause of C. A class C depends on a reference type T if any of the following conditions hold:

It is a compile-time error if a class depends on itself.

For example:


class Point extends ColoredPoint { Integer x, y; }
class ColoredPoint extends Point { Integer color; }

causes a compile-time error.

5.1.2 Class Body and Member Declarations

A class body may contain declarations of members of the class, that is, fields (5.3), methods (5.4) and operators (5.6). A class body may also contain declarations of constructors (5.5) for the class.




ClassBody:

    { ClassBodyDeclarationsopt }





ClassBodyDeclarations:

    ClassBodyDeclaration

    ClassBodyDeclarations ClassBodyDeclaration





ClassBodyDeclaration:

    ClassMemberDeclaration

    ConstructorDeclaration





ClassMemberDeclaration:

    FieldDeclaration

    MethodDeclaration

    OperatorDeclaration



The scope of a declaration of a member m declared in or inherited by a class type C is the entire body of C.

5.2 Class Members

The members of a class type are all of the following:

Constructors are not members and therefore are not inherited.

5.3 Field Declarations

The variables of a class type are introduced by field declarations:




FieldDeclaration:

    Type VariableDeclarators ;



VariableDeclarators:

    VariableDeclarator

    VariableDeclarators , VariableDeclarator



VariableDeclarator:

    Identifier



The Identifier in a VariableDeclarator may be used in a name to refer to the field. Fields are members; the scope (4.3) of a field declaration is specified in 5.1.2. More than one field may be declared in a single field declaration by using more than one declarator; the Type applies to all the declarators in the declaration.

It is a compile-time error for the body of a class declaration to declare two fields with the same name.

Methods, types, and fields may have the same name, since they are used in different contexts and are disambiguated by different lookup procedures (4.5).

If the class declares a field with a certain name, then the declaration of that field is said to hide any declarations of fields with the same name in superclasses of the class.

If a field declaration hides the declaration of another field, the two fields need not have the same type.

A class inherits from its direct superclass all the fields of the superclass that are not hidden by a declaration in the class.

It is not possible for a class to inherit more than one field with the same name.

A hidden field can be accessed by using a field access expression (7.8) that contains the keyword super.
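
Field hiding and access through super can be sketched as follows. Base and Derived are hypothetical classes, the form super.tag is assumed from the description of field access expressions (7.8), and // comments are assumed.

```
class Base {
    Integer tag;
}

class Derived extends Base {
    String tag;              // hides the tag field of Base; the types need not match

    Integer baseTag() {
        return super.tag;    // reaches the hidden Integer field declared in Base
    }

    String ownTag() {
        return tag;          // the simple name refers to the String field of Derived
    }
}
```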

5.4 Method Declarations

A method declares executable code that can be invoked, passing a fixed number of values as arguments.




MethodDeclaration:

    MethodHeader MethodBody



MethodHeader:

    ResultType MethodDeclarator



ResultType:

    Type



MethodDeclarator:

    Identifier ( FormalParameterListopt )



A method declaration specifies the type of value that the method returns.

The Identifier in a MethodDeclarator may be used in a name to refer to the method. A class can declare a method with the same name as the class or a field of the class.

It is a compile-time error for the body of a class to have as members two methods with the same signature (5.4.2) (name, number of parameters, and types of any parameters). Methods and fields may have the same name, since they are used in different contexts and are disambiguated by the different lookup procedures (4.5).

5.4.1 Formal Parameters

The formal parameters of a method or constructor, if any, are specified by a list of comma-separated parameter specifiers. Each parameter specifier consists of a type and an identifier that specifies the name of the parameter:




FormalParameterList:

	FormalParameter

	FormalParameterList , FormalParameter



FormalParameter:

	Type Identifier



If a method, operator or constructor has no parameters, only an empty pair of parentheses appears in the declaration of the method, operator or constructor.

If two formal parameters of the same method, operator or constructor are declared to have the same name (that is, their declarations mention the same Identifier), then a compile-time error occurs.

When the method, operator or constructor is invoked (7.9), the values of the actual argument expressions initialize newly created parameter variables, each of the declared Type, before execution of the body of the method, operator or constructor. The Identifier that appears in the FormalParameter may be used as a simple name in the body of the method, operator or constructor to refer to the formal parameter.

The scope of a parameter of a method, operator or constructor is the entire body of the method, operator or constructor.

5.4.2 Method Signature

The signature of a method consists of the name of the method and the number and types of formal parameters to the method.

A class may not declare two methods with the same signature, or a compile-time error occurs.

5.4.3 Method Body

A method body is a block of code that implements the method.




MethodBody:

	Block 



If an implementation requires no executable code, the method body should be written as a block that contains no statements: "{ }".

Since a method always has a return type, every return statement (6.9) in its body must have an Expression.

A method may return explicitly only by means of a return statement that provides a return value. Alternatively, the method may "drop off" the end of its body by executing an implicit return at the very end of its method body; the value of the expression for this implicit return is the default value for the return type of the method (i.e. null for reference types).
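
The two ways of returning from a method can be illustrated with a short sketch. Box is a hypothetical class and // comments are assumed.

```
class Box {
    Object item;

    Object put(Object o) {
        item = o;
        return o;        // explicit return: the Expression is required
    }

    Object clear() {
        item = null;
        // no return statement: control drops off the end of the body and the
        // method implicitly returns the default value for Object, which is null
    }

    Object nothing() { } // empty body; it also returns null implicitly
}
```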

5.4.4 Inheritance and Overriding

A class inherits from its direct superclass all the methods of the superclass that are not overridden by a declaration in the class.

A method declared in a class C overrides another method with the same signature declared in class A if C is a subclass of A.

A compile-time error occurs if a method has a different return type than the method it overrides.

An overridden method can be accessed by using a method invocation expression (7.9) that contains the keyword super.
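
A sketch of overriding, with hypothetical classes Shape and Triangle. The form super.sides() is assumed from the description of method invocation expressions (7.9).

```
class Shape {
    Integer sides() { return 0; }
}

class Triangle extends Shape {
    Integer sides() { return 3; }        // overrides sides() of Shape: same
                                         // signature and same return type

    Integer shapeSides() {
        return super.sides();            // invokes the overridden method, giving 0
    }
}
```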

5.4.5 Overloading

If two methods of a class (whether both declared in the same class, or both inherited by a class, or one declared and one inherited) have the same name but different signatures, then the method name is said to be overloaded. This fact causes no difficulty and never of itself results in a compile-time error.

There is no required relationship between the return types of two methods with the same name but different signatures.

Methods are overridden on a signature-by-signature basis.

If, for example, a class declares two methods with the same name, and a subclass overrides one of them, the subclass still inherits the other method.

When a method is invoked (7.9), the number of actual arguments and the compile-time types of the arguments are used, at compile time, to determine the signature of the method that will be invoked (7.9.2). The actual method to be invoked will be determined at run time, using dynamic method lookup (7.9.3).
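
The interplay of overloading and overriding might look as follows. Describer and TerseDescriber are hypothetical classes; the new expression form and // comments are assumed, and the Table() constructor is the one described in 2.3.5.

```
class Describer {
    Integer describe(Integer i) { return 1; }
    Integer describe(Table t)   { return 2; }   // overloads describe
}

class TerseDescriber extends Describer {
    Integer describe(Table t) { return 0; }     // overrides describe(Table) only;
                                                // describe(Integer) is still inherited
}

Integer main() {
    Describer d;
    Integer r;

    d = new TerseDescriber();
    r = d.describe(7);             // signature describe(Integer) is chosen at compile
                                   // time; the inherited method runs and yields 1
    r = d.describe(new Table());   // signature describe(Table) is chosen at compile
                                   // time; the override in TerseDescriber yields 0
    return r;
}
```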

5.5 Constructor Declarations

A constructor is used in the creation of an object that is an instance of a class:




ConstructorDeclaration:

    ConstructorDeclarator ConstructorBody



ConstructorDeclarator:

    TypeName ( FormalParameterListopt )



The TypeName in the ConstructorDeclarator must be the name of the class that contains the constructor declaration; otherwise a compile-time error occurs. In all other respects, the constructor declaration looks just like a method declaration that has no result type.

Constructors are invoked by class instance creation expressions, and by explicit constructor invocations from other constructors (5.5.3.1). Constructors are never invoked by method invocation expressions.

Constructors are not members. They are never inherited and therefore are not subject to hiding or overriding.

5.5.1 Formal Parameters

The formal parameters of a constructor are identical in structure and behavior to the formal parameters of a method.

5.5.2 Constructor Signature

The signature of a constructor consists of the number and types of formal parameters to the constructor. A class may not declare two constructors with the same signature, or a compile-time error occurs.

5.5.3 Constructor Body

The first statement of a constructor body may be an explicit invocation of another constructor of the same class or of the direct superclass (5.5.3.1).




ConstructorBody:

    { ExplicitConstructorInvocationopt BlockStatementsopt }



It is a compile-time error for a constructor to directly or indirectly invoke itself through a series of one or more explicit constructor invocations involving this.

If a constructor body does not begin with an explicit constructor invocation, then the constructor body is implicitly assumed by the compiler to begin with a superclass constructor invocation "super();", an invocation of the constructor of its direct superclass that takes no arguments. A compile-time error occurs if an implicit superclass constructor invocation is assumed by the compiler but the superclass does not have a constructor that takes no arguments.

5.5.3.1 Explicit Constructor Invocations




ExplicitConstructorInvocation:

	this ( ArgumentListopt ) ;

	super ( ArgumentListopt ) ;



Explicit constructor invocation statements can be divided into two kinds:

An explicit constructor invocation statement in a constructor body may not refer to any variables or methods declared or inherited in the object being constructed, or use this or super in any expression; otherwise, a compile-time error occurs.
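
Both forms of explicit constructor invocation, and the implicit super() of 5.5.3, appear in the sketch below. Point and ColoredPoint are hypothetical classes; the sketch assumes that class Object has a constructor that takes no arguments and that // comments are available.

```
class Point {
    Integer x;
    Integer y;

    Point(Integer x, Integer y) {
        // no explicit invocation here, so the compiler assumes super();
        this.x = x;
        this.y = y;
    }

    Point() {
        this(0, 0);      // invokes the two-parameter constructor of Point
    }
}

class ColoredPoint extends Point {
    Integer color;

    ColoredPoint(Integer x, Integer y, Integer color) {
        super(x, y);     // invokes a constructor of the direct superclass;
                         // x and y are parameters, not fields of this object
        this.color = color;
    }
}
```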

The evaluation of an explicit constructor invocation proceeds in several steps:

5.5.4 Constructor Overloading

Overloading of constructors is identical in behavior to overloading of methods. The overloading is resolved at compile time by each class instance creation expression.

5.5.5 Default Constructor

If a class contains no constructor declarations, then a default constructor that takes no parameters is automatically provided.

The default constructor simply invokes the superclass constructor with no arguments. A compile-time error occurs if a default constructor is provided by the compiler but the superclass does not have a constructor that takes no arguments.
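
The sketch below shows one class that receives a usable default constructor and one that would not compile. Base, Fine and Broken are hypothetical classes; the sketch assumes that class Object has a constructor that takes no arguments and that // comments are available.

```
class Base {
    Base(Integer n) { }            // the only constructor of Base takes one argument
}

class Fine { }                     // no constructor declared: a default constructor
                                   // is provided and invokes the Object constructor

// class Broken extends Base { }   // compile-time error: the default constructor
//                                 // would invoke a constructor of Base with no
//                                 // arguments, and Base does not declare one
```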

5.6 Operator Declarations

An operator declares executable code that can be invoked, passing a fixed number of values as arguments.




OperatorDeclaration:

    OperatorHeader OperatorBody



OperatorHeader:

    ResultType OperatorDeclarator



ResultType:

    Type



OperatorDeclarator:

    operator UnaryOperator ( )

    operator BinaryOperator ( FormalParameter )

    operator MinusOperator ( )

    operator MinusOperator ( FormalParameter )



An operator declaration specifies the type of value that the operator returns.

The token immediately after operator in an OperatorDeclarator (UnaryOperator, BinaryOperator or MinusOperator) may be used in a properly formed operator invocation expression (7.16) as a name for the operator.

It is a compile-time error for the body of a class to have as members two operators with the same signature (5.6.2) (operator name, number of parameters, and types of any parameters).

5.6.1 Formal Parameters

The formal parameter of an operator, if any, is specified by a single parameter specifier. The parameter specifier consists of a type and an identifier that specifies the name of the parameter:




FormalParameter:

	Type Identifier



If an operator has no parameter, only an empty pair of parentheses appears in the declaration of the operator.

When the operator is invoked (7.16), the values of the actual argument expressions initialize newly created parameter variables, each of the declared Type, before execution of the body of the operator. The Identifier may be used as a simple name in the body of the operator to refer to the formal parameter.

The scope of a parameter of an operator is the entire body of the operator.
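
As an illustration, the sketch below declares both forms of the minus operator permitted by the OperatorDeclarator production. Vec is a hypothetical class, and the sketch assumes that the MinusOperator token is - and that class Integer (2.3.3) itself provides unary and binary minus operators.

```
class Vec {
    Integer x;
    Integer y;

    Vec(Integer x, Integer y) {
        this.x = x;
        this.y = y;
    }

    Vec operator - () {               // unary form: no formal parameter
        return new Vec(-x, -y);
    }

    Vec operator - (Vec other) {      // binary form: exactly one formal parameter
        return new Vec(x - other.x, y - other.y);
    }
}
```

The two declarations have the same operator token but different numbers of parameters, so their signatures differ and the operator is overloaded. How each form is selected by an operator invocation expression is specified in 7.16.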

5.6.2 Operator Signature

The signature of an operator consists of the operator (1.9) and the number and types of formal parameters to the operator.

A class may not declare two operators with the same signature, or a compile-time error occurs.

5.6.3 Operator Body

An operator body is a block of code that implements the operator.




OperatorBody:

	Block 



If an implementation requires no executable code, the operator body should be written as a block that contains no statements: "{ }".

Since an operator always has a return type, every return statement (6.9) in its body must have an Expression.

An operator may return explicitly only by means of a return statement that provides a return value. Alternatively, the operator may "drop off" the end of its body by executing an implicit return at the very end of its operator body; the value of the expression for this implicit return is the default value for the return type of the operator.

5.6.4 Inheritance and Overriding

A class inherits from its direct superclass all the operators of the superclass that are not overridden by a declaration in the class.

An operator declared in a class C overrides another operator with the same signature declared in class A if C is a subclass of A.

A compile-time error occurs if an operator has a different return type than the operator it overrides.

5.6.5 Overloading

If two operators of a class (whether both declared in the same class, or both inherited by a class, or one declared and one inherited) have the same operator token but different signatures, then the operator is said to be overloaded. This fact causes no difficulty and never of itself results in a compile-time error.

There is no required relationship between the return types of two operators with the same name but different signatures.

Operators are overridden on a signature-by-signature basis.

If, for example, a class declares two operators with the same name, and a subclass overrides one of them, the subclass still inherits the other operator.

When an operator is invoked (7.16), the number of actual arguments and the compile-time types of the arguments are used, at compile time, to determine the signature of the operator that will be invoked. The actual operator to be invoked will be determined at run time, using dynamic method lookup.


CHAPTER 6

Blocks and Statements


The sequence of execution of a maTe program is controlled by a sequence of statements, which are executed for their effect and do not have values.

Some statements contain other statements as part of their structure; such other statements are substatements of the statement. In the same manner, some statements contain expressions (7) as part of their structure.

Sequences of statements are organized into blocks. Two kinds of blocks are defined for a maTe program: the main block, which is explained in 6.1, and statement blocks, which are explained in 6.3.

Statements that will be familiar to C and C++ programmers are the block (6.3), empty (6.4), expression (6.5), if (6.6, 6.7), while (6.8), return (6.9), break (6.11), continue (6.12), and local variable declaration (6.13) statements.

6.1 Main Block

A program shall contain a global construct called main (the main block), which is the designated start of the execution of a program. Exactly one main block must exist for every maTe program. The block is entered when program execution starts. Program execution continues until a return statement in the main block is executed or the end of the main block is reached (or a run-time error is encountered). If the end of the main block is reached, then the result is as if the program executed a return of 0. The declaration of a main block includes the definition of a return type, which must be Integer. The return value exists to provide a way to pass a single status value to the surrounding environment.

All implementations of the main block will have the following definition:




MainFunctionDeclaration:

    Integer main() { MainBlockStatementsopt }

MainBlockStatements:

    MainBlockStatements MainBlockStatement

    MainBlockStatement

MainBlockStatement:

    BlockStatement
	
BlockStatement:

    Statement



maTe does not allow for arguments to be passed into the main block.

The main block can exist at the beginning of the source file, at the end of the source file or in between any two class definitions within the source file.

Local variable declaration statements (6.13) can be provided at any level of the main block. The scope of a local variable declared in the main block is from its declaration point to the end of the block in which it is declared. It is a compile-time error to declare a local variable with the same name as a previously declared local variable which is still visible from the new local variable's declaration point.
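
As an illustration, a minimal program consisting only of a main block might look as follows. This is a sketch rather than normative text: the // comment form is assumed from (1.4), the double-quoted string form is assumed from (1.7.3), and the variable name status is arbitrary.

```mate
Integer main() {
    Integer status;     // local variable declaration (6.13)
    status = 0;
    out "hello";        // the out statement (6.10)
    return status;      // status value passed to the surrounding environment
}
```

Omitting the return statement should have the same effect here, since reaching the end of the main block behaves as a return of 0.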

6.2 Statements

There are many kinds of statements in the maTe programming language. Most correspond to statements in the C and C++ languages. Statements are given by the following grammar:




Statement:

    Block (6.3)

    EmptyStatement (6.4)

    ExpressionStatement (6.5)

    IfThenElseStatement (6.6)

    IfThenStatement (6.7)

    WhileStatement (6.8)

    ReturnStatement (6.9)

    OutputStatement (6.10)

    BreakStatement (6.11)

    ContinueStatement (6.12)

    LocalVariableDeclarationStatement (6.13)


6.3 Blocks

A block is a sequence of statements within braces.




Block:

    { BlockStatementsopt }



BlockStatements:

    BlockStatement

    BlockStatements BlockStatement



The following production from (6.1) is repeated here for convenience:





BlockStatement:

    Statement



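A block is executed by executing each of the statements in order from first to last (left to right). It is possible for a block to terminate early through a return statement (6.9), or through a break (6.11) or continue (6.12) statement when the block is enclosed by a while statement.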

6.4 The Empty Statement

An empty statement does nothing.




EmptyStatement:

    ;



6.5 Expression Statements

Certain kinds of expressions may be used as statements by following them with semicolons:




ExpressionStatement:

    StatementExpression ;



StatementExpression:

    Assignment

    MethodInvocation



An expression statement is executed by evaluating the expression; if the expression has a value, the value is discarded.

6.6 The if-then-else Statement

The if-then-else statement allows a conditional choice of two statements, executing one or the other but not both.




IfThenElseStatement:

    if ( Expression ) Statement else Statement



The Expression must have type Integer, or a compile-time error occurs.

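An if-then-else statement is executed by first evaluating the Expression. Execution continues by making a choice based on the resulting value:
  • If the value is nonzero, then the first contained Statement (the one before the else keyword) is executed.
  • If the value is 0, then the second contained Statement (the one after the else keyword) is executed.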

An else is associated with the lexically immediately preceding else-less if that is in the same block (but not in an enclosed block).
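
The association rule for else can be illustrated with a short sketch. It assumes that a nonzero Integer selects the first statement and 0 selects the second, in line with the treatment of 0 in the while statement (6.8); the // comment form is assumed from (1.4).

```mate
Integer main() {
    Integer a, b;
    a = 1;
    b = 0;
    if (a)
        if (b)
            out "both";
        else                // pairs with if (b), the nearest else-less if
            out "a only";
    return 0;
}
```

Under those assumptions the program would print a only. To attach the else to the outer if instead, the inner if-then statement would have to be enclosed in its own block.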

6.7 The if-then Statement

The if-then statement allows a conditional choice of one statement, executing the statement or not executing it.




IfThenStatement:

    if ( Expression ) Statement



The Expression must have type Integer, or a compile-time error occurs.

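An if-then statement is executed by first evaluating the Expression. Execution continues by making a choice based on the resulting value:
  • If the value is nonzero, then the contained Statement is executed.
  • If the value is 0, no further action is taken and the if-then statement completes.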

6.8 The while Statement

The while statement repeatedly evaluates an Expression and executes a Statement until the value of the Expression is 0.




WhileStatement:

    while ( Expression ) Statement



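The Expression must have type Integer, or a compile-time error occurs. A while statement is executed by first evaluating the Expression. Execution continues by making a choice based on the resulting value:
  • If the value is nonzero, then the contained Statement is executed, after which the entire while statement is executed again, beginning by re-evaluating the Expression.
  • If the value is 0, no further action is taken and the while statement completes.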

If the value of the Expression is 0 the first time it is evaluated, then the Statement is not executed.
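
A countdown loop gives a small sketch of these rules. It assumes that class Integer (2.3.3) supplies the binary - operator, and that the // comment form of (1.4) is available.

```mate
Integer main() {
    Integer i;
    i = 3;
    while (i) {         // loop ends when i reaches 0
        out i;          // i.toString() is printed (6.10)
        i = i - 1;
    }
    return 0;
}
```

With these assumptions the loop body runs three times, printing 3, 2 and 1 with no separators, because out adds none.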

6.9 The return Statement

A return statement returns control to the invoker of a method (5.4, 7.9), returns control to the invoker of a constructor (5.5, 7.7), returns control to the invoker of an operator (5.6, 7.16), or terminates the main block (6.1), and is given by the following grammar:




ReturnStatement:

    return Expression ;

    return ;



A return without an expression shall be used only within a constructor. A return with an expression in a constructor is a compile-time error.

If a return statement is contained within a method or operator, the value of the Expression becomes the value of the method or operator invocation. More precisely, execution of such a return statement first evaluates the Expression. The value produced by the Expression is communicated to the invoker. A return statement with no Expression is not allowed in this context and will result in a compile-time error.

If a return statement is contained within the main block, the value of the Expression becomes the value of the program. More precisely, execution of such a return statement first evaluates the Expression. The value produced by the Expression is communicated to the surrounding execution environment. A return statement with no Expression is not allowed in this context and will result in a compile-time error.

It is possible to return from the middle of a while or if-then-else block.

A compile-time error occurs if the type of the return expression is not convertible by assignment conversion to the return type of the enclosing method or operator, or to Integer if the return statement is in the main block.

6.10 The out Statement

The out statement is a rudimentary mechanism for printing strings and it provides the only way to generate output in the maTe programming language. Its syntax is described by the following grammar:




OutputStatement:

    out Expression ;



If the Expression has type String then the out statement will print to stdout all characters in the String object. If the Expression is not of type String then the toString method will first be invoked on Expression, and the resulting String object will be output in the manner described above.

If the Expression evaluates to null, then a run-time error occurs and the program terminates (7.18).
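
The two cases above, together with the null case, might be exercised as in the following sketch. The variable names are arbitrary and the // comment form is assumed from (1.4).

```mate
Integer main() {
    String s;
    Integer n;
    n = 42;
    out "n is ";    // a String: its characters are printed directly
    out n;          // not a String: n.toString() is invoked first
    s = null;
    out s;          // run-time error: Null reference (7.18)
    return 0;
}
```

The final return is never reached in this sketch, because the program terminates at the third out statement.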

6.11 The break Statement

The break statement transfers control out of an enclosing while statement. Its syntax is described by the following grammar:




BreakStatement:

    break ;



A break statement transfers control to the innermost enclosing while statement of the enclosing method or main block; this statement, which is called the break target, then immediately exits. If no while statement encloses the break statement, a compile-time error occurs.

6.12 The continue Statement

The continue statement transfers control to the loop-continuation point of an enclosing while statement. Its syntax is described by the following grammar:




ContinueStatement:

    continue ;



A continue statement transfers control to the innermost enclosing while statement of the enclosing method or main block; this statement, which is called the continue target, then immediately ends the current iteration and begins a new one. If no while statement encloses the continue statement, a compile-time error occurs.
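
The following sketch combines break (6.11) and continue in one loop. It assumes that class Integer (2.3.3) supplies +, < and > operators whose relational results are Integer values usable as conditions; the // comment form is assumed from (1.4).

```mate
Integer main() {
    Integer i;
    i = 0;
    while (1) {
        i = i + 1;
        if (i > 5)
            break;          // exits the while statement
        if (i < 3)
            continue;       // skips the rest of this iteration
        out i;
    }
    return 0;
}
```

Under those assumptions only the values 3, 4 and 5 reach the out statement.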

6.13 The Local Variable Declaration Statement

Local variable declaration statements may be provided at any level of a main block or of a constructor, method or operator body. The scope of a local variable is from its declaration point to the end of the enclosing block in which it was declared. Two variables have the same scope if and only if their scopes terminate at the same point.

It is a compile-time error to declare two local variables with the same name in the same scope. If an outer declaration of a variable with the same name exists, it is hidden until the end of the scope of the inner variable, after which the outer variable becomes visible again. It is a compile-time error to declare a local variable in a constructor, method or operator body with the same name as a parameter declared in the signature of the enclosing constructor, method or operator.

Local variable declaration statements are given by the following grammar:


LocalVariableDeclarationStatement:

    Type VariableDeclarators ;

VariableDeclarators:

    VariableDeclarator

    VariableDeclarators , VariableDeclarator

VariableDeclarator:

    Identifier
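
As a sketch of the grammar above: a declaration may list several identifiers, and since a VariableDeclarator is a bare Identifier, each variable is given a value by a separate assignment. The variable names are arbitrary, the // comment form is assumed from (1.4), and the Integer operators used are assumed from (2.3.3).

```mate
Integer main() {
    Integer total, i;       // two declarators, no initializers
    total = 0;
    i = 1;
    while (i < 4) {
        Integer square;     // scope runs from here to the closing brace
        square = i * i;
        total = total + square;
        i = i + 1;
    }
    // square is no longer in scope at this point
    return total;           // 1 + 4 + 9 = 14
}
```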


CHAPTER 7

Expressions


Much of the work in a program is done by evaluating expressions, either for their side effects, such as assignments to variables, or for their values, which can be used as arguments or operands in larger expressions, or to affect the execution sequence in statements, or both.

This chapter specifies the meanings of expressions and the rules for their evaluation.

7.1 Evaluation, Denotation, and Result

When an expression in a program is evaluated (executed), the result denotes one of two things:
  • A variable
  • A value

Evaluation of an expression can also produce side effects, because expressions may contain embedded assignments and method invocations.

Each expression occurs either in the main block (6.1) or in the declaration of some class type that is being declared. In a class declaration the expression might occur in a constructor declaration, or in the code for an operator or method.

7.2 Variables as Values

If an expression denotes a variable, and a value is required for use in further evaluation, then the value of that variable is used. In this context, if the expression denotes a variable or a value, we may speak simply of the value of the expression.

7.3 Type of an Expression

If an expression denotes a variable or a value, then the expression has a type known at compile time. The rules for determining the type of an expression are explained separately below for each kind of expression.

The value of an expression is always assignment compatible (3.2) with the type of the expression, just as the value stored in a variable is always compatible with the type of the variable. In other words, the value of an expression whose type is T is always suitable for assignment to a variable of type T.

7.4 Expressions and Run-Time Checks

If the type of an expression is a reference type, then the class of the referenced object, or even whether the value is a reference to an object rather than null, is not necessarily known at compile time. There are a few places in the maTe programming language where the actual class of a referenced object affects program execution in a manner that cannot be deduced from the type of the expression. They are as follows:

The first of the cases just listed ought never to result in detecting a type error, as it is compile-time constrained to be valid. Thus, a run-time type error can occur only when the actual class of the object referenced by the value to be assigned (either implicitly or explicitly) is not compatible with the actual run-time reference variable. In these cases, the program terminates with a Run-Time error (7.18).

7.5 Evaluation Order

The maTe programming language guarantees that the operands of operators appear to be evaluated in a specific evaluation order, namely, from left to right.

It is recommended that code not rely crucially on this specification. Code is usually clearer when each expression contains at most one side effect, as its outermost operation.

7.5.1 Evaluate Left-Hand Operand First

The left-hand operand of a binary operator appears to be fully evaluated before any part of the right-hand operand is evaluated. For example, if the left-hand operand contains an assignment to a variable and the right-hand operand contains a reference to that same variable, then the value produced by the reference will reflect the fact that the assignment occurred first.
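
A sketch of the example just described, assuming Integer addition from (2.3.3) and the // comment form of (1.4):

```mate
Integer main() {
    Integer a, r;
    a = 1;
    r = (a = 5) + a;    // left operand assigns 5 to a before the right operand reads a
    out r;              // 5 + 5
    return 0;
}
```

If the right-hand operand were evaluated first, r would instead be 6; the left-to-right guarantee rules that out.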

7.5.2 Evaluate Operands before Operation

The maTe programming language also guarantees that every operand of an operator appears to be fully evaluated before any part of the operation itself is performed.

7.5.3 Evaluation Respects Parentheses and Precedence

maTe programming language implementations must respect the order of evaluation as indicated explicitly by parentheses and implicitly by operator precedence. An implementation may not take advantage of algebraic identities such as the associative law to rewrite expressions into a more convenient computational order unless it can be proven that the replacement expression is equivalent in value and in its observable side effects for all possible computational values that might be involved.

Note that Integer addition and multiplication are provably associative in the maTe programming language.

For example, a+b+c, where a, b, and c are local variables of type Integer, will always produce the same answer whether evaluated as (a+b)+c or a+(b+c); if the expression b+c occurs nearby in the code, a smart compiler may be able to use this common subexpression.

7.5.4 Argument Lists are Evaluated Left-to-Right

In a method or constructor invocation or class instance creation expression, argument expressions may appear within the parentheses, separated by commas. Each argument expression appears to be fully evaluated before any part of any argument expression to its right.

7.6 Primary Expressions

Primary expressions include most of the simplest kinds of expressions, from which all others are constructed: literals, field accesses, method invocations and names. A parenthesized expression is also treated syntactically as a primary expression.




Primary:

    Identifier

    ParenExpression

    this

    FieldAccess

    MethodInvocation

    OperatorInvocation

    ClassInstanceCreationExpression

    Literal



7.6.1 Lexical Literals

A literal denotes a fixed, unchanging value.

The following production from (1.7) is repeated here for convenience:




Literal:

    IntegerLiteral

    NullLiteral

    StringLiteral



The type of a literal is determined as follows:

7.6.2 this

The keyword this may be used only in the body of a method, operator or constructor.

When used as a primary expression, the keyword this denotes a value that is a reference to the object for which the method or operator was invoked, or to the object being constructed. The type of this is the class C within which the keyword this occurs. At run time, the class of the actual object referred to may be the class C or any subclass of C.

7.6.3 Parenthesized Expressions

A parenthesized expression is a primary expression whose type is the type of the contained expression and whose value at run time is the value of the contained expression. If the contained expression denotes a variable then the parenthesized expression also denotes that variable.




ParenExpression:


    ( Expression ) 



7.6.4 Expression Names

The rules for evaluating expression names are given in 4.5.3.

7.7 Class Instance Creation Expressions

A class instance creation expression is used to create new objects that are instances of classes.




ClassInstanceCreationExpression:

    new ClassType Arguments 



Arguments:

    ( ArgumentList )

    ( )



ArgumentList:

    ArgumentList , Expression

    Expression



We say that a class is instantiated when an instance of the class is created by a class instance creation expression. Class instantiation involves determining what class is to be instantiated, what constructor should be invoked to create the new instance and what arguments should be passed to that constructor.

7.7.1 Determining the Class being Instantiated

The class being instantiated is the class denoted by ClassType.

The type of the class instance creation expression is the class type being instantiated.

7.7.2 Choosing the Constructor and its Arguments

Let C be the class type being instantiated. To create an instance of C, i, a constructor of C is chosen at compile-time by the following rules:

7.7.3 Run-time Evaluation of Class Instance Creation Expressions

At run time, a class instance creation expression requires memory space to be allocated for the new class instance. If there is insufficient space to allocate the object, the program terminates with a run-time error.

The new object contains new instances of all the fields declared in the specified class type and all its superclasses. As each new field instance is created, it is initialized to its default value.

Next, the actual arguments to the constructor are evaluated, left-to-right.

Next, the selected constructor of the specified class type is invoked. This results in invoking at least one constructor for each superclass of the class type.

The value of a class instance creation expression is a reference to the newly created object of the specified class. Every time the expression is evaluated, a fresh object is created.
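
The following sketch shows two evaluations of the same creation expression. The class Point is hypothetical, and its declaration assumes the Java-like class, field and constructor syntax of chapter 5; the // comment form is assumed from (1.4).

```mate
class Point {
    Integer x, y;
    Point(Integer px, Integer py) {
        x = px;
        y = py;
    }
}

Integer main() {
    Point p, q;
    p = new Point(1, 2);    // arguments evaluated left-to-right, then constructor invoked
    q = new Point(1, 2);    // a second, distinct object
    out p == q;             // expected 0: == compares object identity (7.13)
    return 0;
}
```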

7.8 Field Access Expressions

A field access expression may access a field of an object, a reference to which is the value of either an expression or the special keyword super.




FieldAccess: 

    Primary . Identifier

    super . Identifier



7.8.1 Field Access Using a Primary

The type of the Primary must be a reference type T, or a compile-time error occurs. The meaning of the field access expression is determined as follows:

Note, specifically, that only the type of the Primary expression, not the class of the actual object referred to at run time, is used in determining which field to use.

7.8.2 Accessing Superclass Members Using super

The special forms using the keyword super are valid only in an instance method, operator or constructor of a class; these are exactly the same situations in which the keyword this may be used.

Suppose that a field access expression super.name appears within class C, and the immediate superclass of C is class S. Then super.name refers to the field named name of the current object, but with the current object viewed as an instance of the superclass. Thus it can access the field named name that is visible in class S, even if that field is hidden by a declaration of a field named name in class C.
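
Field selection by compile-time type, and access through super, might be sketched as follows. The classes S and C mirror the names used above; the method both is hypothetical. The sketch assumes the Java-like class syntax of chapter 5, that the constructor of S runs as part of constructing a C (7.7.3), and that fields may be accessed from the main block.

```mate
class S {
    Integer name;
    S() { name = 1; }
}

class C extends S {
    Integer name;                   // hides the field name declared in S
    C() { name = 2; }
    Integer both() {
        return super.name + name;   // S's field plus C's field
    }
}

Integer main() {
    C c;
    S s;
    c = new C();
    s = c;              // same object, viewed through type S
    out s.name;         // type of s is S, so S's field is used: expected 1
    out c.name;         // type of c is C, so C's field is used: expected 2
    out c.both();       // expected 3
    return 0;
}
```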

7.9 Method Invocation Expressions

A method invocation expression is used to invoke a class or instance method.




MethodInvocation:

    Identifier ( ArgumentListopt )

    Primary . Identifier ( ArgumentListopt )

    super . Identifier ( ArgumentListopt )



7.9.1 Compile-Time Step 1: Determine Class to Search

The first step in processing a method invocation at compile time is to figure out the name of the method to be invoked and which class to check for definitions of methods of that name. There are several cases to consider, depending on the form that precedes the left parenthesis, as follows:

7.9.2 Compile-Time Step 2: Determine Method Signature

The second step searches the class determined in the previous step for method declarations. This step uses the name of the method and the types of the argument expressions to locate method declarations that are applicable, that is, declarations that can be correctly invoked on the given arguments. There may be more than one such method declaration, in which case the most specific one is chosen. The descriptor (signature plus return type) of the most specific method declaration is the one used at run time to do the method dispatch.

7.9.2.1 Find Methods that are Applicable

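A method declaration is applicable to a method invocation if and only if both of the following are true:
  • The number of parameters in the method declaration equals the number of argument expressions in the method invocation.
  • The type of each actual argument can be converted by method invocation conversion (3.3) to the type of the corresponding parameter.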

The class determined by the process described in 7.9.1 is searched for all method declarations applicable to this method invocation; method definitions inherited from superclasses are included in this search.

If the class has no method declaration that is applicable, then a compile-time error occurs.

7.9.2.2 Choose the Most Specific Method

If more than one method declaration is applicable to a method invocation, it is necessary to choose one to provide the descriptor for the run-time method dispatch. In this case the most specific method is chosen.

The informal intuition is that one method declaration is more specific than another if any invocation handled by the first method could be passed on to the other one without a compile-time type error.

The precise definition is as follows:

A method is said to be maximally specific for a method invocation if it is applicable and there is no other applicable method that is more specific.

If there is exactly one maximally specific method, then it is in fact the most specific method; it is necessarily more specific than any other method that is applicable.

It is possible that no method is the most specific, because there are two or more maximally specific methods. In this case a compile-time error occurs.

The type of the method invocation expression is the result type specified in the compile-time declaration of the most specific method.
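
The selection of a most specific method among overloads can be sketched as follows. The class Printer and its show methods are hypothetical; the sketch assumes the Java-like class syntax of chapter 5, a default constructor (5.5.5), and that String is a subclass of Object.

```mate
class Printer {
    Integer show(Object o) { out "Object"; return 0; }
    Integer show(String s) { out "String"; return 0; }
}

Integer main() {
    Printer p;
    Object o;
    p = new Printer();
    o = "text";
    p.show("text");     // both declarations apply; show(String) is more specific
    p.show(o);          // compile-time type is Object: only show(Object) applies
    return 0;
}
```

The second call selects show(Object) even though o refers to a String at run time, because the signature is fixed from compile-time argument types.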

7.9.3 Runtime Evaluation of Method Invocation

At run time, method invocation requires four steps. First, a target reference may be computed. Second, the argument expressions are evaluated. Third, the actual code for the method to be executed is located. Fourth, a new activation frame is created and control is transferred to the method code.

7.9.3.1 Compute Target Reference (If Necessary)

There are several cases to consider, depending on which of the three productions for MethodInvocation (7.9) is involved:

7.9.3.2 Evaluate Arguments

The argument expressions are evaluated in order, from left to right.

7.9.3.3 Locate Method to Invoke

If the target reference is null, a run-time error occurs and the program terminates. Otherwise, the target reference is said to refer to a target object and will be used as the value of the keyword this in the invoked method.

A dynamic method lookup is used. The dynamic lookup process starts from a class S, determined as follows:

The dynamic method lookup uses the following procedure to search class S, and then the superclasses of class S, as necessary, for method m.

We note that the dynamic lookup process, while described here explicitly, will often be implemented implicitly, for example as a side-effect of the construction and use of per-class method dispatch tables, or the construction of other per-class structures used for efficient dispatch.
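
A sketch of dynamic lookup with an overriding method follows. The classes Animal and Dog are hypothetical, and the sketch assumes the Java-like class syntax of chapter 5 and a default constructor (5.5.5).

```mate
class Animal {
    String name() { return "animal"; }
}

class Dog extends Animal {
    String name() { return "dog"; }     // overrides Animal's name()
}

Integer main() {
    Animal a;
    a = new Dog();
    out a.name();       // lookup starts at the run-time class Dog: expected dog
    return 0;
}
```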

7.9.3.4 Create Frame

A method m in some class S has been identified as the one to be invoked.

Now a new activation frame is created, containing the target reference and the argument values (if any), as well as enough space for the stack for the method to be invoked and any other bookkeeping information that may be required by the implementation (stack pointer, program counter, reference to previous activation frame, and the like). If there is not sufficient memory available to create such an activation frame, a run-time error occurs and the program terminates.

The newly created activation frame becomes the current activation frame. The effect of this is to assign the argument values to corresponding freshly created parameter variables of the method, and to make the target reference available as this. Before each argument value is assigned to its corresponding parameter variable, it is subjected to method invocation conversion (3.3).

7.10 Unary Operators

The unary operators include -, ! and cast operators. Expressions with unary operators group right-to-left, so that -!x means the same as -(!x).




UnaryExpression:

    - UnaryExpression

    ! UnaryExpression

    CastExpression

CastExpression:

    ParenExpression CastExpression

    ( ReferenceType ) CastExpression

    Primary


7.10.1 Cast Operator

Conceptually, the grammar for cast expressions is:




CastExpression:

    ( ReferenceType ) CastExpression

    Primary

However, for technical reasons (to make the grammar LALR(1)), the grammar was rewritten to parse a simple class name as an Expression. This eliminates an ambiguity that exists with one-token lookahead, where a parenthesized name cannot be distinguished from a cast. (See Section 19.1.5 of the first edition of the Java Language Specification for a discussion of this same problem in Java.)

The type of a cast expression is the type whose name appears within the parentheses. (The parentheses and the type they contain are sometimes called the cast operator.) The result of a cast expression is not a variable, but a value, even if the result of the operand expression is a variable.

At compile time, the type of the operand expression must be convertible by casting conversion (3.4) to the type of the cast operator.

A run-time error (7.18) occurs if the type of the cast operator is a reference type and the run-time type of the cast operand is not assignable to that type. That is, for reference types, the run-time type of the right-hand operand must be the same type as the left-hand type, or it must be a subclass of that type.

7.10.2 instanceof Operator

The grammar for an instanceof expression is:




InstanceOfExpression:

     InstanceOfExpression instanceof ReferenceType

     RelationalExpression


The type of the instanceof expression is Integer. The value of the Integer reference is either 0 or 1. It is 1 if and only if the type of InstanceOfExpression is convertible by casting conversion (3.4) to the type ReferenceType. If the value of an instanceof expression is 1, a cast to the same type is guaranteed to succeed.
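
The interplay of instanceof and the cast operator (7.10.1) might look as follows in practice. The sketch assumes that String and Integer are subclasses of Object, so that both assignments to o are widening reference conversions; the // comment form is assumed from (1.4).

```mate
Integer main() {
    Object o;
    String s;
    o = "hello";
    if (o instanceof String) {
        s = (String) o;     // guarded by the test above, so the cast succeeds
        out s;
    }
    o = 5;
    s = (String) o;         // compiles, but fails at run time: Invalid cast (7.18)
    return 0;
}
```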

7.11 Arithmetic Operators

7.11.1 Multiplicative Operators

The operators * and / are called the multiplicative operators. They have the same precedence and are syntactically left-associative (they group left-to-right).




MultiplicativeExpression:

    UnaryExpression

    MultiplicativeExpression * UnaryExpression

    MultiplicativeExpression / UnaryExpression



7.11.2 Additive Operators

The operators + and - are called the additive operators. They have the same precedence and are syntactically left-associative (they group left-to-right).




AdditiveExpression:

    MultiplicativeExpression

    AdditiveExpression + MultiplicativeExpression

    AdditiveExpression - MultiplicativeExpression



7.12 Relational Operators

The relational operators are syntactically left-associative (they group left-to-right).




RelationalExpression:

    AdditiveExpression

    RelationalExpression < AdditiveExpression

    RelationalExpression > AdditiveExpression



7.13 Equality Operator

The equality operator is syntactically left-associative (it groups left-to-right).




EqualityExpression:

    InstanceOfExpression

    EqualityExpression == InstanceOfExpression



The equality operator may be used to compare two operands for object equality.

At run time, the result of == is an Integer object with value of either 0 or 1. The value will be 1 if the two operands denote the same object, and 0 otherwise.

7.14 Assignment Operator

The assignment operator is syntactically right-associative (groups right-to-left). Thus, a=b=c means a=(b=c), which assigns the value of c to b and then assigns the value of b to a.




Assignment:

    LeftHandSide = AssignmentExpression



LeftHandSide:

    Identifier

    FieldAccess



AssignmentExpression:

    EqualityExpression

    Assignment



The result of the first operand of an assignment operator must be a variable, or a compile-time error occurs. This operand may be a named variable, such as a field of the current object or class, or it may be a computed variable, as can result from a field access. The type of the assignment expression is the type of the variable.

At run time, the result of the assignment expression is the value of the variable after the assignment has occurred. The result of an assignment expression is not itself a variable.

A compile-time error occurs if the type of the right-hand operand cannot be converted to the type of the variable by assignment conversion.
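
A brief sketch of the right-associative grouping, with arbitrary variable names and the // comment form assumed from (1.4):

```mate
Integer main() {
    Integer a, b, c;
    c = 7;
    a = b = c;      // grouped as a = (b = c)
    out a;          // 7
    out b;          // 7
    return 0;
}
```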

At run time, three steps are required:

7.15 Input Operator

The in operator is a rudimentary mechanism for accepting strings as user input and it provides the only way to accept input in the maTe programming language.




InputOperator:

    in



The in operator reads a string of input from stdin. The in operator skips any leading whitespace characters (CR, LR, LF, tab) in the input, and stops when it encounters the first non-leading whitespace character (or EOF). If there are no non-whitespace characters prior to EOF, the in operator returns a null reference. Otherwise, a new String object is created, whose characters are those read in by the operator (except for leading whitespace characters). The value of the in expression is a null reference if no non-whitespace characters were found prior to EOF, or a reference to the String object that was created. The type of the in expression is String.
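
As a sketch, a program that echoes every whitespace-delimited word of its input until EOF could be written as follows. The variable name word is arbitrary and the // comment form is assumed from (1.4).

```mate
Integer main() {
    String word;
    while (1) {
        word = in;              // next word, or null at EOF
        if (word == null)
            break;
        out word;               // words are printed with no separators
    }
    return 0;
}
```

The null test is needed because passing a null reference to out is a run-time error (6.10).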

7.16 Operator Invocation Expressions

An operator invocation expression is used to invoke a class or instance operator.


OperatorInvocation:

    Expression BinaryOperator Expression

    UnaryOperator Expression

    MinusOperator Expression

    Expression MinusOperator Expression

The first step in processing an operator invocation at compile time is to figure out which operator is to be invoked and which class to search for declarations of that operator. There are several cases to consider, depending on the form, as follows:

The second step searches the class determined in the previous step for operator declarations. This step uses the operator and class determined in the previous step to locate operator declarations that are applicable, that is, declarations that can be correctly invoked on the given number of arguments. There may be more than one such operator declaration, in which case the most specific one is chosen. The descriptor (signature plus return type) of the most specific operator declaration is the one used at run time to do the operator dispatch.

Finding applicable operator declarations is entirely similar to finding applicable method declarations (7.9.2.1).

Finding the most specific operator declaration is entirely similar to finding the most specific method declaration (7.9.2.2).

At run time, operator invocation is entirely similar to run-time evaluation of a method invocation (7.9.3).

7.17 Expression

An Expression is any assignment expression:




Expression:

    AssignmentExpression



7.18 Run-Time Errors

There is no exception "handling" in the maTe programming language. Run-Time errors will cause the abrupt termination of the program with an associated error message.

Run-Time Error conditions and messages are as follows:


ERROR: Out of memory.
A class instance creation expression (7.7) fails due to insufficient memory available.


ERROR: Null reference.
A field access (7.8) is attempted when the value of the object reference expression is null.

A method or operator invocation expression (7.9, 7.16) that invokes an instance method is attempted when the target reference is null.

A null reference is passed to the out statement (6.10).


ERROR: Divide by zero.
An Integer division (2.3.3) operation is attempted where the value of the denominator is zero.


ERROR: Invalid cast.
A cast (7.10.1) to a reference type is attempted where the actual type of the operand expression is incompatible with (not a subclass of) the reference type to which it is being cast.


ERROR: Index out of bounds.
The substr method defined by class String was invoked such that at least one of the following conditions was true (see 2.3.4):
  • The String reference which invoked the method contained 0 characters.
  • The beg or end indices were not legal indices.
  • The end index was smaller than beg.


ERROR: Number format exception.
The toInteger method defined by class String was invoked such that at least one of the following conditions was true (see 2.3.4):
  • The characters of the String reference that invoked the method were other than the ASCII characters '0', '1', '2', '3', '4', '5', '6', '7', '8', or '9'. An ASCII minus sign '-' is the exception, but only if it is the first character (index 0).
  • The parsed integer value of the String reference that invoked the method was less than -2147483648 or greater than 2147483647.


ERROR: Concurrent modification exception.
The put or remove method defined by class Table was invoked on a Table after that Table's iterator had been initialized but before the iterator had reached the end of the Table.
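

As an illustration of one of the conditions listed above, the following sketch would be expected to trigger the Divide by zero error. The variable names are arbitrary and the // comment form is assumed from (1.4).

```mate
Integer main() {
    Integer n, d;
    n = 10;
    d = 0;
    out n / d;      // terminates with ERROR: Divide by zero.
    return 0;       // never reached
}
```

Because maTe has no exception handling, the program cannot recover from this condition; the only defense is to test the denominator before dividing.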