Orthogonal Type Extensions and Reductions Jukka Paakki, Anssi Karhinen, Tomi Silander Nokia Research Center P.O.Box 156 SF. O21O1Espoo Finland
Abstract In this paper we present a generaliTation of Oberon's record type extensions. Our extension mechanLcm is orthogonally applicable to all the conventional data types found in Pascal-like languages. In order to balance the type system, we present an inverse concept, type reductions, that also can be applied to all the programmer-defined data types. These concepts provide flexible and powerful means for expressing universal subtyping and supertyping in a statically-typed programming language, as well as many possibilities for reuse of data types via a sound mathematical characterization of assignment statements. The presented ideas are under implementation in Albefich, a programming language that can be considered as a natural descendant of Oberon.
I. Introduction Extensible data types were introduced in [Wir88a] as an elegant concept for expressing extensible
systems. The basic idea is to embed the central inheritance mechanism of object-oriented languages in modern modularized procedural languages. This can be achieved in a simple way by focusing the inheritance mechanism on data, and by this indirectly also on actions in the form of polymorphic procedures. The practical significance is most evident in connection with modules: extending an imported data type in a client module does not cause the imported module to be recompiled. In such a way a system can be decomposed systematically and efficiently into subsystems that are related, yet individual as well. The original notion was restricted to record types (and pointer types) and it has been implemented in the programming language Oberon [Wir88b, Wir88c]. In this paper we introduce an orthogonally extensible type concept where extensibility is defined not only for record and pointer types, but for other conventional data types as well. In order to make the scheme balanced we also present an inverse concept, type reducibility, where a type domain can be soundly restricted. These dual operations on types span subtyping hierarchies that support system structuring, customizability, as well as reliability through static type checking. Another solution for reaching a compact and powerful type system is to ,nlfy the conventional array and record types into a single framework, structured types. These integrated concepts form the basis of a type system which will be realized in the programming language Alberich currently under implementation. We present the guidelines of the design of Alberich's type system in Chapters 2 - 5: type extensions, type reductions, type hierarchies, and structured types. The programming language Alberich is briefly introduced in Chapter 6, and a simple example is given as an Appendix. 2. Generalized type extensions Types are a central concept in advanced programming languages. The active research in the area of sophisticated type systems has generated varied notions and terminology (for a review, see e.g. [CAW85]). One of the most important typing facilities from a software methodology point of view is subtyping; as noted in [Car89], "subtypes are essential for the ordered extension of large software systems". Subtyping is a relation between types: if A is a subtype of type B, then any object x of type A is also an object of type B, and x has all the properties of an object of type B (now B is called the supertype of A). In object-oriented languages this mechanism is known as inheritance; in [Wir88a] "Alberich is a heroic figure in ancient German mythology;Oberon is the name of the corresponding figure in English mytholo~.
28
SIGPLAN NOTICES V25, #9, July 199~
it is expressed by type extensions that introduce a systematic way to extend the domain of record types. Record types Consider for examplethe Oberon declarations TYPE TO = RECORDx,y:BOOLEANEND; VAR VO : TO;
Now TO represents the Cartesian product of its two component types. Thus the domain of TO, dora(T0), i.e. the set of all the possible values that can be assigned to V0, is the set of functions f : (x,y} -> {TRUE,FALSE} such that f(x)E {TRUE,FALSE}, and f(y)~: {TRUE,FALSE}.
Note that in the Oberon declaration the index set of the Cartesian product is rather implicit; it is here represented by the set of field names {x,y}. In order to make the formulation easier to grasp, we make use of the conventional notation where each function f:{il,i2,...,i }->S in a Cartesian product is expressed by the sequence (f(il),f(£z),...,f(in)). In this case t~e notation is actually slightly mislead£ng since it may give the impression that the values in the set {il,i2,...,in} have a fixed order i1 < i2 < ... < in. This is not the case when considering record types: on the conceptual level the fields of a record do not have an order (although within each implementation they probably have a hidden order in the memory of a computer). Thus in our example the domain of TO can be characterized by the set dom(rO) = { ( TRUE,TRUE), (TRUE,FALSE), ( FALSE,TRUE), ( FALSE,FALSE) }
where each pair (a,b) represents a function f such that f(x) = a and f(y) = b. Suppose that we extend the record type TO: TYPE T1 = RECORD (TO) z:BOOLEANEND; VAR V1 : T1;
Now dora(T1) is the Cartesian product of dom(T0) (which itself is a Cartesian product) and dom(BOOLEAN).In our notation this yields dom(rl)
{((TRUE,TRUE),TRUE),((TRUE,TRUE),FALSE),((TRUE,FALSE),TRUE), ((TRUE,FALSE),FALSE),((FALSE,TRUE),TRUE),((FALSE,TRUE),FALSE), ((FALSE,FALSE),TRUE),((FALSE,FALSE),FALSE)}
where each value ((a,b),c) in dora(T1) represents a function g such that g ( z ) = c and g(x&y) = (f(x),f(y)), i.e. f(x) = a and f(y) = b, where f is as described above. The assignment compatibility rules of Oberon state that the assignment V0: = Vl is legal since the type of Vl is an extension of the type of V0. As pointed out by Wirth, V0: = Vl should not be considered as an "assignment" but rather a "projection from dora(T1) onto dora(T0)". Thus here the "assignment" is actually a function PTO : dom(T1) -> dom(TO) such that Pro((a,b),c) = (a,b). For instance PTo((FALSE,TRUE),FALSE) = (FALSE,TRUE)which means that the "assignment" VO:= Vl where V1.x=FALSE,V1.y=TRUE,V1.z=FALSEresults in VO.x=FALSE,VO.y=TRUE.
We can characterize extension of record types more precisely as follows. Suppose that we can declare record types using the general notation TYPE T = < d e f i n i t i o n of T>
(1)
and extend them using the general notation TYPE T' = e x t e n s i o n o f T w i t h < d e f i n i t i o n of extension>
Now each declaration of the form (1) will unambiguously define (i) the domain of T, dora(T), and (ii) the assignment function assign : dora(T) -> dora(T).
29
(2)
Each declaration of the form (2) will unambiguously define (i) T' to be a subtype ofT, and T the supertype ofT', (ii) the domain of T', dom(T' ), (iii) the assignment function assign : dom(T') -> dom(T'), and (iv) the assignment function assign : dora(subtype) -> dom(supertype), i.e. assign : dom(T') -> dom(T). An assignment v: = e now results in v obtaining the value assign(value of e). The functions (ii) in (1) and (iii) in (2) are identity functions. The function assign : dora(subtype) -> dom(supertype) is a projection, as discussed above. Inherently this function is a surjection, i.e. it covers all the values in dom(supertype), and it may yield the same value in dom(supertype) for different values in dom(subtype). Array types Array types are rather similar to record types: both define a mapping from a set of names (record fields, array indices) to a set of values. The natural interpretation of array type extension is thus derived from record type extension by now a.pplying the extension to the index type. For instance, TYPE T = ARRAY[I..10] OF INTEGER; TYPE T'= extension of T with 10; would define type T' as ARRAY[1..20] OF INTEGER.The formalizations (1) and (2) given above apply to array types in the same way as to record types; assign : dom(T') -~. dora(T) is again a projection. Set types Projection has an obvious counterpart in the world of sets: taking a subset. Hence extension of a set type can be given an interpretation which is mathematically justified. For instance, TYPE T = SET OF (a,b); TYPE T'= extension of T wtth (c);
means that the enumerated element type of T is extended in the definition of T' with literal c. Now dom(T)={ ~ ,{a},{b},{a,b}}, dom(T')={ ~ ,{a},{b},{a,b},{c},{a,c},{b,c},{a,b,c}}, and assign : dom(T') -> dom(T) means to produce the maximal subset of a value in dom(T') such that is in dom(T). For instance, assign({b, c}.)={b}, and assign({c})=#. Scalar types Above we characterized extensions of constructed types, i.e. types that have an internal structure. The same can obviously be done for scalar ("basic") types as well. However, extending a standard type, such as Integer or Real, would probably cause problems in the implementation which usually provides the maximal, non-extendible domain for each standard type. That is why the only conventional scalar types appropriate for extension are enumeration types. Extending an enumeration type obviously reverses the subtype/supertype relation, and thus definition (2) must be revised as follows: For (i) (iv) For
scalar types, each declaration of the form (2) will unambiguously define T to be a subtype of T', and T' the supertype of T. produces in this case the function assign : dora(T) -> dom(T'). instance, TYPE T = (a,b); TYPE T'= extension of T with (c); yields dom(T)={a,b}, dom(T')={a,b,c}, and assign : dom(T) -> dom(T') is an identity function. Pointer types Obviously extension of a pointer type and its mathematical characterization is reduced to extension of its pointer base type, as done also in Oberon. Extending a base type will thus implicitly extend the corresponding pointer type as well, and assignments between pointers are interpreted according to the rule for the base types. Definitions Given a type extension declaration of the form TYPE T' = extension of T with <definition of extension>
30
T' is a direct subtype of T and T the direct supenype of T' in case T is a constructed type; in case T is a scalar type, T' is a direct supertype of T and T the direct subtype of T'. We write < direct subtype > -- < direct supertype >. A type T" is a subOppe of type T if T" = T or T" is a direct subtype of a subtype of T. In that case T is a supenype of T' '. We write < subtype > =- < supertype >. Note that for constructed types the (direct) subtype and (direct) supertype definitions correspond to (direct) extensions and (direct) base types of Oberon [Wir88a], respectively. When denoting the type of object o with type(o), an assignment v: = x is legal if type(x) =type(v). In case type(v) aJad type(x) are pointer types, this assignment compatibility rule is applied to the base types of type(v) and type(x). Example As noted before, type extensions can be applied in object-oriented progrzmmlng for modelling inheritance between classes and objects. Another related paradigm where extensions provide substantial benefits is modular progrzrnrnlng where a system is built out of a number of separate modules that communicate via explicitly specified interfaces. Now a module M can be reused (without recompilation) not only when directly making use of its exported data types and data structures, but also when introducing new data structures that are compatible with the exported structures of M through type extensions. For example, suppose that we are producing a multi-pass compiler such that each pass is implemented as a module. Now we can present the compiler's symbol table in the syntax analyzer module by: TYPE SynEntry = record name: String; I i ne, co ] umn: Integer; kind: SymboIKind; type: Ref end;
VAR SynTab]e : array [1..Max] of SynEntry;
This data structure can be reused in the module for semantic analysis by taking into account the additional information needed for static semantic checking (e.g. the length information for checking type coercions): TYPE SemEntry = extension of SynEntry with (length: Integer); VAR SemTable : array [1..Max] of SemEntry;
Now we can initizliTe SemTable from SynTable and, since the elements are assignment compatible (SynTable[ i ] := ScruTable[i ] is legal), also reuse operations of the syntax analyzer module (e.g. for printing values of symbol attributes) without recompilation. 3. Type reductions Type extensions provide a rigorous method for deriving a subtype T' of an existing type T, and they define most notably the assignment ffanctions assign : dom(T') -> dora(T) (assign : dora(T) -> dom(T') in the case of scalar types). In other words a systematic way to convert the values of the new type T' to the values of the existing type T is implicitly defined (vice versa for scalar types). Often a reverse mecbzni~m is needed: we want to be able to convert the values of an existing constructed type to the values of the new type. In our scheme this can be done by composing the values of the new type as a reduced domain of the existing type. Besides balancing and advancing an extensible type system, this operation also supports such fundamental programming concepts as explicit supertypes (superclasses) and abstraction: the new type can be considered as a supertype or an abstraction of the old type. The principle can be illustrated with a simple example: TYPE RO : record x,y:Boolean end; TYPE RI : reduction of RO with (y);
31
The declaration of R1 means that R1 will be a record type that is derived from type R0 by removing field y. Now we get dom(RO) = {(TRUE,TRUE),(TRUE,FALSE),(FALSE,TRUE),(FALSE.FALSE)}, dom(R1) = {TRUE,FALSE}.
"Assignments" of values in dom(RO) to variables of type R1 are allowed; the effect is similar to that in the extension mechanism. As in extension, type reduction is not applicable to record types only but instead to any conventional data type. Thus a type reduction of the form I'YPE T' = reductlon of T with <definition of reduction>
will unambiguously define (i) T' to be a supertype of T and T the subtype of T' for constructed types T and T', and T' to be a subtype of T and T the supertype of T' for scalar types, (ii) the domain of T', dom(T' ), (iii) the assignment function assign : dom(r' ) -> dom(T' ), and (iv) the assignment function assign : dom(subtype) -> dom(supertype). The characterization of the assign functions is the same as for type extensions: each assign : dom(T') -> dom(T') is an identity function; assign : dom(subtype) -> dom(supertype) is an identity function for scalar types, a generation of a subset for set types, and a projection for record and array types. Definitions Given a type reduction declaration of the form TYPE T' = reduction of T wlth <definition of reduction> T' is a direct supertype of T and T the direct subtype of T' in case T is a constructed type; in case T is a scalar type, T' is a direct subtype of T and T the direct supertype of T'. Subtypes, supertypes, and assignment compatibility are defined in the same manner as for type extensions (see Chapter 2) in terms of direct subtypes and direct supertypes. Examples Reduced scalar types support module reuse (without recompilation) due to the way assignment compatibility is def'med. For instance, days can be modelled as follows: TYPE Day = (Sun,Mon,Tue,Wed,Thu,Fri,Sat); TYPE WorkingDay = reduction of Day with (Sun,Sat);
In case these data types are declared in separate modules, the operations on objects of type Day can be reused for processing objects of type Worki ngDay as well.
The reduction mechanism is also convenient when we want to reuse only a part of existing structured data. For instance, if we are going to write a code generator module for the multi-pass compiler used as an example in Chapter 2, we can reuse the symbol table as follows: TYPE CodeEntry : reduction of SemEntry with (name.line.column); VAR CodeTable : away [1..Max] of CodeEntry;
Now we are interested merely in the symbol information that is relevant for generating code; thus the entries in CodeTable have just the attributes kind, type, and length. CodeTable can be initialized from SemTable (CodeTab1e[ i ]: = SemTab1e[ i ] is legal). In case we need to associate some additional information with symbols, we can do it by extending CodeEntry: TYPE FinalEntry : extension of CodeEntry with (address: Integer);
4. Type hierarchy Extensions and reductions of a type generate a hierarchy of related types. In Chapters 2 and 3 assignment compatibility has been defined between subtypes and supertypes, induced by type extensions and reductions. This can be illustrated by drawing a directed acyelic graph where the
32
nodes represent the types and edges represent the subtype and supertype relations between types (a relation T1 -> T2 generates a directed edge from T1 to T2 in the graph). For example, the type declarations TYPE E1 TYPE E2 TYPE E3 TYPE E4
= = = =
(a,b);
extension of E1 with (c); extension of El with (d,e); reduction of E1 with (a);
would generate the following type hierarchy graph El:(a.b} ~ E2:{a,b,c}
E4:{b)
E3:la,b,d,e}
The type nodes that are located in a simple path are assignment compatible in such a way that values of type T1 can be assigned to a variable of type T2 when the node for T2 is a descendant of the node for T1 in the path. For instance, from the graph above we can conclude that values of type E4 can be assigned to variables of type E3, but values of type E2 cannot. Powersets as domains of set types provide an appenling compatibility between a set subtype T2 and its supertype T1. It is possible to interpret an "assignment" of a value in dora(T1) to a variable of type T2 as an identity function, and an "assignment" of a value x in dora(T2) to a variable of type T1 as taking a subset of x consisting of all the elements in x that are in the domain of the element type of T1. Since an "assignment" can in this case be defined in a mathematically sound way to both directions, we can give a general defimtion for assignment compatibility between set types: an assignment v: = x, where type(v) and type(x) are set types, is legal if type(v) and type(x) are located in a simple path in a graph of related set types (i.e. type(x)=>type(v), or type(v)=>type(x)). 5. Uniform structured types Traditionally structured objects have been expressed in programming languages either as arrays or as records. When designing the type system of Alberich we felt that providing two separate concepts for this purpose would only make the language bigger and harder to adopt without any substantial benefits: an array can be considered as a homogeneous record with some special accessing properties (e.g. its components may be accessed in an ascending or a descending order). That is why conventional array and record types have been unified in Alberich within a single concept, a structured type. This unified view on structures provides, besides uniformity of the type system, also some useful properties of structured objects that are absent in traditional languages: for instance it is possible to access sequentially the components of a "record", and to denote a "record field" using a dynamic expression (this feature, however, contradicts static typing and it has been thus excluded from Alberich; dynamic "indexing of arrays" is permitted as usual). At the program level it is actually misleading to talk about "arrays" or "records"; the distinction has now been hidden on the implementation level. The Cartesian characterization of record types (see Chapter 2) makes use of an index set representing the component names of the record. This index set is conventionally given implicitly in programming languages; we feel, however, that having the index set more explicit clarifies the mathematical characterization of structured types. That is why the component "names" are given in our scheme by an index type associated with the declaration of a structured type. The index type can be any ordered type, such as Integer or an enumeration type. This implies that extending/reducing a structured type can only be made by extending/reducing the index type as well. Since in practice standard types cannot be extended, it is sensible to define extension only for those structured types whose index type is enumerated. In Alberich this principle is applied uniformly to all the programmer-defined data types: only types that are built on top of enumeration types can be extended (reduction is defined also for types that are built on top of standard ordinal
types).
33
Example In Alberich we could express the symbol table used as an example in Chapters 2 and 3 as follows: TYPE TableSize = INTEGERRANGE1..Max; SynAttributes : (name,line,column,kind,type); SynEntry : STRUCT (SynAttributes) WITH name: String; line,column: INTEGER; kind: SymbolKind; type: Ref END STRUCT; SynType = STRUCT (Tab]eSize) WITH ALL: SynEntry END STRUCT; VAR SynTable : SynType; TYPE SemAttributes = EXTENDED(SynAttributes) WITH (length); SemEntry = EXTENDEDSTRUCTSynEntry (SemAttributes) WITH length: INTEGEREND STRUCT; SemType = STRUCT (Tab]eSize) WITH ALL: SemEntry END STRUCT; VAR SemTable : SemType; TYPE CodeAttributes = REDUCEDSemAttributes WITHOUT (name,line,column); CodeEntry = REDUCEDSTRUCTSemEntry (CodeAttributes) END STRUCT; CodeType = STRUCT (TableSize) WITH ALL: CodeEntry END STRUCT; VAR CodeTable : CodeType; TYPE Fina]Attributes = EXTENDEDCodeAttributes WITH (address); FinaIEntry = EXTENDEDSTRUCTCodeEntry (FinalAttributes) WITH address: INTEGEREND STRUCT;
Note that all types must be named in Alberich. The index sets in this example are the following: SynEntry SynType SemEntry SemType CodeEntry CodeType FinalEntry
: dom(SynAttributes) = {name,line,column,kind,type} : dom(TableSize) = (1,2 . . . . . Max} : dom(SemAttributes) = {name,line,co]umn,kind,type,]ength} : dom(TableSize) = (1,2 . . . . . Max} : dom(CodeAttributes) = (kind,type,length} : dom(TableSize) = {1,2 . . . . . Max} : dom(Fina]Attributes) : {kind,type, length,address}
Components are denoted in an array-like fashion: SynTable[1], SynTabl e[ 1] [ name], etc. 6. Alberich In this chapter we briefly introduce Alberich, a programming language where the ideas given in Chapters 2 - 5 are rezllzed; the concepts introduced above have been the main characters in the design of Alberich. All its data types (standard, enumeration, structured, set, and pointer types) follow the presented principles. Enumeration types Enumeration types have been excluded from Oberon because of the conceptual and implementational problems in connection with modules [Wir88b]. We feel, however, that the advantages of enumeration types in data abstraction and type discipline far outweight these disadvantages. That is why enumeration types are included in Alberich and even in a most central role: they form the basis for the extension/reduction mecbz-i~m of the language. By this we reach a significant amount of both conceptual clarity and uniformity: now extensions as well as reductions can be defined within a ,nified framework of extending/reducing enumeration types, instead of having different definitions for different types. Traditionally an order relation is implicitly defined for each enumeration type. For some cases implicit ordering is natural, but in a case like type Scandinavian = (Danish,Norwegian,Finnish,Swedish)
34
it produces such intuitively absurd conclusions as "Finnish < Swedish". Thus we have conceptual reasons to divide enumeration types into two different classes: those with the order relation ("ordered enumeration types"), and those without it ("unordered enumeration types"). We believe that this distinction will aid in modelling the real world in a more natural way. Unordered enumeration types are included in Alberich, as well as in some other type systems (e.g. as "enumeration variant types" in Quest [Car89]). Structured types The structured types of Alberich have been designed according to the principles discussed in Chapter 5 where also some examples are given. Set types In Alberich one can express sets of literal values (values of an ordinal or an unordered enumeration type) using set types. Besides extension and reduction, a number of conventional set operations are predefined. Set constants are denoted by S(a,b} where S is a set type name, and a and b literals of the element type of S. (} is a shorthand notation for S{} for each set type S. A set type can be extended/reduced via an extension/reduction of its element type. A set type T' declared as an extension/reduction of a set type T will have as its domain the powerset of the domain of the extended/reduced element type of T. Note that since the o~y element types that can be extended are enumeration types, only sets of enumeration literals can be extended. Pointer types Alberich provides pointers to values of any type (except a pointer type). Extensions and reductions of pointer types are made indirectly by extending/reducing the pointer base type. Two pointer types are assignment compatible if their pointer base types are. Heterogeneous data structures can be constructed in the same way as in Oberon (with the generalization to reduced types). Type tests "v IS T" and type guards "v(T)" are available, as in Oberon. Implementation issues Enumeration types are implemented conventionally by assigning integer values (which may be negative, due to extensions) to enumeration literals. Our solution will assign the same internal values for the corresponding literals in compatible enumeration types. For implementing set and structured types we need a mechanism for coercing the internal representation of the literals always to start from 0. This is achieved by associating with each enumeration type T compiler functions FirstOrd(T) and LastOrd(T) giving the internal value of the first, resp. the last literal in T. Set types are implemented in the standard way as bit vectors with LastOrd(T) - FirstOrd(T) + 1 bits where T is the element type of the set. The bits are numbered from 0 which implies that a coercion from the internal representation of enumeration literals to bit locations is needed: a literal with internal value t is in the set if bit (t-FirstOrd(T)) is on. Assignments of sets (e.g. subsetting) are implemented by initizliTing the left-hand side variable always to an empty set. Structured types in their general form make it necessary to postpone part of the memory management from compilation time to runtime. One central feature that prohibits compile-time solving of storage allocation is the public projection of structures [Wir88a]. For example, consider the following declaration in a definition part of module M: TYPE Field = (a,b); Object = STRUCT (Field) WITH a:T1; b:T2
END STRUCT;
Suppose that we import M into some other module N where we declare a variable V of type Object. In our implementation the declaration of V allocates runtime space for an incomplete descriptor of V. Address and size information must be left undefined since, in case the declaration of Object is a public projection, the actual size of V may be unknown; the descriptor is, however, complete enough for generating code for the statements in N using indirect referencing. The descriptor is completed when initjzliTing module N at runtime. Note that at that time the implementation of module M must be available. A call is made for a generated initizliTing
35
procedure INIT(V) in M that will complete the descriptor for V by fixing its address and size information (descriptor fields Offset and Size, respectively). This process may be recursive in case M itself has imported some modules. Finally space is allocated for V on the basis of the available size information:
a
>
b
'"~
Offset
Storage
ISi~
V/IlIA pr i rate
fields
This solution is more elegant than the use of a compiler hint [Wir88a] since it requires no implementation-dependent size in.formation to be expressed through special commands. The drawback, however, is some loss of space and time efficiency of the progrzrn.~. Since a number of facilities in Alberich (modularization, standard types, statements) are more or less similar to those in Oberon, the implementation of Alberich will be based on an Oberon compiler written in Modula-2 [KaS89]. 7. Related work Conceptually type extensions and reductions of Alberich are analogous to well-established subtyping and supertyping concepts in type-oriented progrzmmlng languages. Subtypes and supertypes, especially with respect to record types, have been rigorously discussed e.g. in [Mit84], [Car84], [Wan87], [Sta88], [BrL88], and [Rtm89]. A more orthogonal subtyping concept is presented in [Car88]. Two well-known languages with similar ideas as Alberich are Oberon [Wir88b, Wir88c] and Modula-3 [CDJ89]. All these three belong to the Algol - Pascal fzmily of progrzmming languages and provide means to explicitly specify subtype and supertype relations between data types. The concept of type extensions in Albefich is much more general than in Oberon where they are applicable to record and pointer types only. Modula-3 provides a uniform subtyping concept that is founded on structural equivalence: two types are identical if their definitions are the same when expanded. This is in contrast to Alberich where name equivalence of types is applied. Another significant difference between the type systems of Modula-3 and Alberich is that in Alberich the subtype and supertype relations have to be explicitly expressed using type extensions and reductions, i.e. different types can be made compatible only via declarations of the form TYPE T' = e x t e n s i o n o f T w i t h < . . > and TYPE T " = reduction of T with ,
whereas in Modula-3 this is possible just for object ("record") types using a notation that corresponds to type extension of Alberich; for other types of Modula-3 subtyping and supertyping are rather implicit. Some more significant differences are that in Modula-3 scalar types, set types, or array types cznnot be "extended", and that subset operations on sets or projections of arrays are not available in Modula-3. On the other hand, the type system of Modula-3 provides a number of facilities that are missing from Albefich, such as division of reference ("pointer") types into traced and untraced ones, procedure types, and general object types ("record types with methods", i.e. classes). 8. Discussion In our opinion, disciplined and rigorous concepts should be guiding principles in designing progrnmming languages. Having a mathematical characterization underlying the design process lead to a better structured, less ambiguous, and more orthogonal language. Albefich is a progrnmming language whose type system is based on well-estabfished subtype and supertype concepts in the form of generally applicable type extensions and reductions, and where assignments are characterized as simple and well-known mathematical functions: identity, projection, and subset generation. In a sense Alberich can be seen as a natural descendant of Oberon because many of the features of Oberon have a direct counterpart in Albefich. However, the most significant concept in Alberich, the type system, is much more orthogonal than that in Oberon. Record extensions of Oberon have in Alberich been generalized to cover all the progrnmmer-defmed data types, and an inverse concept, type reduction, is provided as well. Orthogonality of the type system has been reached by founding the extension/reduction mechnni~m on enumerated types that also are missing
36
from Oberon. Recently Oberon has been an inspiration for other refinements as well, e.g. an object-oriented Oberon [MOT89]. Type extensions and reductions in an Alberich program specify classes of related (compatible) types. These classes must always be constituted explicitly by giving in each type extension and reduction declaration both the subtype and its supertype. This, associated with name equivalence of types, induces a significant degree of methodological safety into type hierarchies: the programmer is forced to clearly analyze which types are somehow related. This principle is in contrast to corresponding type systems (such as in Modula-3) that typically employ implicit relation of types and structural equivalence, both of which may make two unrelated types compatible "by accident". Having a powerful and flexible type system in a programming language at best supports many different programming paradigms, such as modular, object-oriented, imperative, and algebraic programming. The orthogonally extensible and reducible type system of Alberich contributes to a uniform progrzmmlng style where the emphasis is on widespread use of types and automatic typechecking. This paradigm has been characterized by Cardelli as typefulprogramrning[Car89]. Acknowledgements. We sincerely thank Prof. Kai Koskimles for his comments on an earlier draft. The work on this subject has started when producing an Oberon compiler at the Department of Computer Science, University of Helsinki. References [BrL88] Bruce K.B., Longo G.: A Modest Model of Records, Inheritance and Bounded Quantification. In: Proc. of the 3rd Annual Symposium on Logic in Computer Science, Edinburgh, 1988. IEEE, 1988, 38-50. [Car84] Cardelli L.: A Semantics of Multiple Inheritance. In: Proc. of the Int. Symposium on Semantics of Data Types, Sophia-Antipolis, 1984. LNCS 173 (G.Kahn, D.B.MacQueen, G.Plotkin, eds.), Springer-Verlag, 1984, 51-67. [Car88] Cardelli L: Structural Subtyping and the Notion of Power Type. In: Proc. of the 15th Annual ACM Symposium on Principles of Programming Languages, San Diego, California, 1988, 70-79. [Car89] Cardelli L.: Typeful Programming. Report 45, Systems Research Center, DEC, 1989. [CAW85] CardeUi L., Wegner P.: On UnderstandingTypes, Data Abstraction and Polymorphism. Computing Surveys 17, 4, 1985, 471-522. [CDJ89] Cardelli L., Donahue J., Jordan M., Kalsow B., Nelson G.: The Modula-3 Type System. In: Proe. of the 16th Annual ACM Symposium on Principles of Programming Languages, Austin, Texas, 1989, 202-213. [KaS89] Karhinen A., Silander T.: An Oberon Compiler (in Finnish). Report C-1989-53, Department of Computer Science, University of Helsinki, 1989. [Mit84] Mitchell J.C.: Coercion and Type Inference (Summary). In: Proc. of the llth Annual ACM Symposium on Principles of Programming Languages, Salt Lake City, Utah, 1984, 175-185. [MrT89] MrssenbOck H., Tempi J.: Object Oberon - A Modest Object-Oriented Language. Structured Programming 10, 4, 1989, 199-207. [Rrm89] Rrmy D.: Typecheeking Records and Variants in a Natural Extension of ML. In: Proe. of the 16th Annual ACM Symposium on Pnneiples of Programming Languages, Austin, Texas, 1989, 77-88. [StaB8] Stansifer R.: Type Inference with Subtypes. In: Proc. of the 15th Annual ACM Symposium on Principles of Programming Languages, San Diego, California, 1988, 88-97. [Wan87] Wand M.: Complete Type Inference for Simple Objects. In: Proc. of the 2rid Annual Symposium on Logic in Computer Science, Ithaca, New York, 1987. IEEE, 1987, 37-44. [Wir88a] Wirth N.: Type Extensions. ACM TOPLAS 10, 2, 1988, 204-214. [Wir88b] Wirth N.: From Modula to Oberon. Software - Practice and Experience 18, 7, 1988, 661..670. [Wir88c] Wirth N.: The Programming Language Oberon. Software - Practice and Experience 18, 7, 1988, 671-690.
37
Appendix Suppose that we want to express the construction of cars. This can be done using a heterogeneous structured ("record") type Car:
TYPE &IrModel - (Mercedes,Audt,Saab,Mlzda,Lada); Cal~tCrtbuta • (Mode1,kllul|,Ft.ozlt.Seats,BacL~ts,Radlo); Car = S'I'RUCT(Cm-AtCribute) WITH
Model : Cat'Model;
k~mi1s = INT~ERI F~rontSests,BackS~s = IN11~BI Radio : BI) STRUCT; YAR M)Cat. : CarI We can express the construction of our own car by=
RyCar[Model]:= Audt| I~R,ar[k~mela]:- 4; I~¢zr[FrodcSeat=]=- 2; RyCat.[BackSeat=]=- 3; I~yCat.[~dlo]:- TRUE Suppose now that now regulatlons arise wlth respect to the legal construction of cars. This imposes no problems for our implementation since we can reuse our existing solution:
TYPE ~ r A t ~ r t b u C e
- EXTENOEDCarAtCrlbuCe WITH (Catalyutm');
Neu('..at. = EXTENOE]]STRUCTCat" (~CarAttt.tbute) wrn.l Catalysat.ot. : BOOLEANEND STRUCT;
YM MyNe~Cat. = NeKat.; We can assign the fields of MyCar to MyNowCarand, after Installlng a catalysator In the car, set
I ~ C a r [ C a ~ l y M t . o r ] := TRUE. The structured
types can be reused for other purposes as well. Suppose e.g.
that we want to express the special
construction of sport cars. Thls can be done now easily:
TYPE SportCarAttt.lbute - R~UC~ NeKarAttHbute wTrNoErr (BackSeats); Spot.CCat. • REDUCEDSllUICT Net~t. (5pm'l:ClrAtCt.tbute)
E~ STRUCT;
VM l~,SportCat. : Sport:Cat.; After transforming our ordinary car into a sport car by removing its back seats we can express its construction simply by assigning RySpol'CCat. := MyNe~(at.. Now MySportCar has the components Model, Wheels, FrontSeats, Radio, and Catalysatoc. Suppose that we also want to select some properties of our car to be printed. The properties may be selected a r b i t r a r i l y , depending on our current needs. This can be expressed as follows:
TYPE Prlnted - SET OF $pot.CC41r&ttrtbute; VAR Selected , PHnted; Attribute : SportCarAtt'rtbute; ,go
Selected== I~,lnted{Cal:a ly~rr,m', Rad10}; oao
~ I L E Selected e O DO EXI~UST(~ lected ,Attt.tbul:e ) | ~ t ~ ( R y ~ ' t C a r , A t t H bul:m)
EXHAUST(S,e) is a predeflned procedure that returns in • a value In set S, and updates S by removing the selected value from i t . We assume the existence of the procedure Print for printing the value of the given property of our car.
38