Extending ODMG for Federated Database Systems

Elke Radeke

C-LAB, Fürstenallee 11, 33102 Paderborn, Germany
[email protected], http://www.c-lab.de/elke

Abstract

A federated database system (FDBS) allows uniform and transparent access to the data of multiple heterogeneous DBS. To this end, the federation layer converts the heterogeneous data into a canonical data model. Object-oriented data models have been shown to be most appropriate as canonical data model. Since a new standard has arisen in the object-oriented database area, ODMG-93 (ODMG for short), this paper extends the interfaces of ODMG (with C++ binding) for an FDBS. We extend the syntax of the ODMG object definition language (ODL) to a schema integration language. In contrast to most other approaches, it uses object-oriented mechanisms like inheritance and user-defined methods for deriving federated schemata from existing DBS schemata. The syntax of the object manipulation language (OML) and the object query language (OQL) does not have to be extended for an FDBS; only their semantics changes. These languages then allow transparent data access on multiple heterogeneous DBS.

1 Introduction

Federated database systems (FDBS) [16] have become very important for controlling the sharing of data among heterogeneous, autonomous database systems. A global interface on top of all database systems allows uniform, transparent access to the data of all coupled DBS. At the same time, an FDBS saves huge investments because all existing database applications continue to run locally on their DBS.

Schema integration is an important step in building and extending an FDBS. It maps the schemata of the component database systems (CDBS) to some federation schema(ta) [1]. Sheth and Larson [16] defined a reference architecture with five schema levels (Fig. 1): the CDBS schemata are first mapped into the canonical data model (component schemata), then data not relevant for the federation are filtered out (export schemata), and multiple export schemata are combined into federated schema(ta), on top of which users may define application-specific views (external schemata).

* R&D institute of University Paderborn & Siemens Nixdorf Informationssysteme

[Figure 1 shows the schema levels stacked from top to bottom: External Schemata, Federated Schema(ta), Export Schemata, Component Schemata, and Local Schemata, on top of DBS1 ... DBSn.]

Figure 1: Five level reference schema architecture

The analysis in [13] shows that functional and object-oriented data models are most appropriate as a canonical data model. Their expressive power covers other data models and eases the mapping. [4] identifies object-oriented data models as the best canonical data model because of their extensible nature and their good support of data abstraction and encapsulation, which help to overcome heterogeneity through the provision of abstract data types. Unfortunately, no standard for an object-oriented data model existed for many years, and several different variations were created. Multiple FDBS prototypes have also been developed with different object-oriented data models as canonical data model. A common disadvantage of non-standard models and interfaces is that they cannot be used by the large set of standard applications but require data-model-specific implementations. Moreover, advanced concepts and functionality of one FDBS cannot be adopted directly by another but need to be converted to its different data model.

Finally, in 1993, an industry standard was developed by important vendors of object-oriented database systems, the Object Database Management Group (ODMG). It is called ODMG-93 and includes languages for object definition (ODL), object manipulation (OML), and object query (OQL) [3]. In this paper, we extend the ODMG ODL with C++ binding to a schema integration language named ODL+. It enables the administrator to map the CDBS schemata to some federation schemata during FDBS installation, e.g. according to the five level schema architecture. Moreover, (s)he can dynamically extend the schema integration information in the running federated database system. In this way, new DBS can be coupled step by step to the FDBS, schema extensions of some CDBS can be taken into account, and new requirements of global applications can be reflected. Our approach also provides constructs to solve schema integration conflicts such as synonyms/homonyms, precision, scaling, and granularity of data types [15, 8]. Moreover, we illustrate the impact of ODL+ on object level, i.e. the object transformation along the schema levels when objects are accessed by ODMG OML or OQL.

In contrast to many other FDBS approaches, our schema integration language is not query based but a data-definition-like language, as suggested in [5]. The main benefits are that object-oriented mechanisms like inheritance and encapsulation can be used, which are not naturally expressible by queries, and that schema extensions are easier to define. Therefore we extend the ODMG ODL to ODL+. An extension of ODMG for FDBS was already specified in the IRO-DB project [2], but there the query language OQL was extended for schema integration purposes.

Moreover, unlike many other approaches, our global interface for accessing the data of all DBS is not restricted to global queries because of the view update problem. Instead, we allow data to be created and updated in the many cases that do not raise this problem. An important aspect for this is the use of an object-oriented data model, which provides object identity [14]. However, some FDBS use an object-oriented data model as canonical data model and are still restricted to queries, e.g. ECRC MDBS [6]. UniSQL/M [7] restricts creations and updates to those types that were mapped 1:1 from local CDBS types and does not allow the creation of globally new data such as inter-DBS relationships. Our approach allows creations and updates in many other cases and even allows globally new data to be created. ODL+ allows reversible schema mappings to be defined, in particular by defining methods together with their corresponding reverse methods. Any global type that has a reversible mapping to CDBS types can be used for creations and updates. If updates or creations of further types are desired, ODL+ allows the create/update operator to be overridden for these types. Moreover, ODL+ allows the definition of globally new data such as inter-DBS relationships, whose instances are stored in the FDBS internal database.

The rest of this paper shows our syntax extensions for ODL and the semantics extensions for OML/OQL.

2 Extending ODMG ODL to a Schema Integration Language

Our ODMG ODL extension ODL+ allows the heterogeneous CDBS schemata to be defined in a uniform language, i.e. the specification of the component schemata, and further schemata to be derived from them, i.e. the export, federated, and external schemata. While the current ODMG ODL does not support multiple schemata, we introduce a new definition scope, the schema, and define schema derivations that combine, extend, filter, or transform schemata. Moreover, we allow the definition of a type generalization, where a new type inherits all common properties from multiple base types. The following subsections start with schema harmonization, where CDBS schemata are defined in our canonical data model. Afterwards, the various facilities to derive a new schema from one or multiple others are presented. Our ODL extension is illustrated by an ongoing example.

2.1 Schema Harmonization

The first important step in schema integration is the mapping of the CDBS schemata into the canonical data model. In terms of the five level reference schema architecture, it is the transformation from the local schemata to the corresponding component schemata. (Note that this paper does not concentrate on the semi-automatic transformation of local schemata in an arbitrary data model to component schemata defined in ODL+.)

Example 2.1

In the following, we illustrate the definition of the component schemata of our ongoing example from Computer Aided Concurrent Engineering (CACE), where the database systems of a design, a production, and a sales department are federated into an FDBS. Their schemata are defined in our C++ ODL extension as shown in the code below. Each schema starts with the new ODL+ keyword schema followed by a schema name (see code below, line 1). Within such a schema scope, arbitrary ODL definitions are allowed. Structured types are defined like C++ structs (2) and object types like classes (3). Within a type definition, the C++ access levels public/private (4) can be used to classify the type members, i.e. the attribute definitions (5), relationship definitions (6), or method definitions (7). Object types can also be derived from other object types (8).

    (1) schema DesignCompSchema {
    (2)   struct Status {Date   change_date;
                         String change_comment;};
    (3)   class Product
    (4)   {public:
    (5)     int no; String name; String prod_line;
            Status last_change; };
          class Method
          {public:
            String name;
    (6)     Set< Ref > used_for;
          };
    (8)   class Simulation : public Method
          {public:
    (7)     int check(); };
          class Synthesis : public Method
          {public:
            void compose(); };
        };
        schema ProductionCompSchema {
          class Material
          {public:
            int no; String name;
            Ref is_used_by inverse uses; };
          class Product
          {public:
            int no; String name;};
        };
        schema Sales1CompSchema {
          class Article
          {public:
            int id; String kind;
            Set< Ref > stored_in inverse stores;
            Set< Ref > bought_by inverse buys;};
          class Depot
          {public:
            int id; int capacity; char town;
            Set< Ref > stores inverse stored_in;};
          class Customer
          {public:
            int id; String name; String street; int zip; String town;
            Set< Ref > buys inverse bought_by;};
        };

Now that the component schemata of the different CDBS are defined, the following subsections show how to derive further schemata from them by a 1:1 mapping, filtering, composition of existing information, or augmentation with new information.

2.2 Direct Schema Derivation

A transitive 1:1 mapping between two schemata, where one schema inherits all meta data (types, attributes, relationships, methods) from its base schema, is analogous to type inheritance in ODMG ODL.

Example 2.2

This kind of schema transformation is used in the CACE FDBS to map the component schema of ProductionDBS 1:1 to its export schema, because all local data shall be visible and accessible in the federation.

    schema ProdExportSchema:ProductionCompSchema {};

2.3 Filtering

Any meta data can be filtered during schema derivation. Inherited types can be eliminated completely in the derived schema, or some of their attributes, relationships, or methods can be dropped. For this purpose, ODL is extended by a new keyword drop. For filtering a type completely, only its name is required, while for attributes, relationships, and methods the name of the owning type is also necessary because of the name scope.

Example 2.3

For DesignDBS and SalesDBS1, not all data shall be visible in the federation; they are therefore filtered in the corresponding export schemata. The types Method (1) and Simulation (2) and the attribute Product::last_change (3) of DesignDBS are filtered in its export schema. The same holds for the attribute Depot::capacity of SalesDBS1, which is dropped from the Sales1ExportSchema (4). Moreover, the relationships between Article and Customer (5) as well as between Article and Depot (6) are only required as uni-directional relationships in the export schema, so one direction of each is filtered.

        schema DesignExportSchema:DesignCompSchema
    (1) { drop Method;
    (2)   drop Simulation;
    (3)   drop Product::last_change;
        };
        schema Sales1ExportSchema:Sales1CompSchema
    (4) { drop Depot::capacity;
    (5)   drop Article::bought_by;
    (6)   drop Article::stored_in;
        };

A specific case which needs further investigation is the filtering of a type ($T$). The impact on its sub- and supertypes as well as on attributes/relationships/methods using the filtered type is as follows (based on [10]):

1. All subtypes keep their attributes, relationships, and methods. Those which were inherited from $T$ now become explicit properties of the subtype.
2. Each direct subtype of $T$ is attached to each direct supertype of $T$.
3. For attributes, relationships, or methods using $T$ in their definition or implementation the following holds: If $T$ has only one supertype or one subtype ($T_1$) after the filterings, then $T_1$ is used in their definitions. If $T$ has multiple subtypes $T_2, \ldots, T_n$, then the corresponding $T_i$ ($2 \le i \le n$) is used in the definition of $T_i$ and recursively in its subtypes. For all other cases the attribute, relationship, or method definition is deleted.
4. A filtering of attributes, relationships, or methods may affect other methods of this or other types (the latter for public members) which use the dropped items. These methods have to be modified in the derived schema according to rule 3.

In the example above, we filtered type Method of DesignCompSchema as well as its subtype Simulation. Subtype Synthesis is not filtered and therefore gets the previously inherited attribute name and relationship used_for as explicit properties of its own.
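Rules 1 and 2 for filtering a type can be illustrated with a small C++ sketch. The `TypeDef` and `Schema` structures and the function `dropType` are hypothetical and only model type names, direct supertypes, and explicitly declared members; they are not part of ODMG or ODL+, and rule 3 (rewriting members that use the dropped type) is deliberately left out.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>

// Hypothetical in-memory model of a type; not part of ODMG or ODL+.
struct TypeDef {
    std::set<std::string> supertypes;  // direct supertypes
    std::set<std::string> members;     // explicitly declared members
};

using Schema = std::map<std::string, TypeDef>;

// Drop type `t` from `schema`, applying rules 1 and 2 only:
//  1. members declared by t become explicit members of each direct subtype;
//  2. each direct subtype of t is attached to each direct supertype of t.
void dropType(Schema& schema, const std::string& t) {
    auto it = schema.find(t);
    if (it == schema.end()) return;        // unknown type: nothing to drop
    const TypeDef dropped = it->second;     // copy before erasing
    schema.erase(it);
    for (auto& entry : schema) {
        TypeDef& def = entry.second;
        if (def.supertypes.erase(t) == 0) continue;  // not a direct subtype
        def.members.insert(dropped.members.begin(), dropped.members.end());
        def.supertypes.insert(dropped.supertypes.begin(),
                              dropped.supertypes.end());
    }
}

// Method hierarchy of DesignCompSchema with the two type drops of
// DesignExportSchema applied.
Schema designExportTypes() {
    Schema s;
    s["Method"]     = TypeDef{{}, {"name", "used_for"}};
    s["Simulation"] = TypeDef{{"Method"}, {"check"}};
    s["Synthesis"]  = TypeDef{{"Method"}, {"compose"}};
    dropType(s, "Method");
    dropType(s, "Simulation");
    assert(s.count("Synthesis") == 1);      // Synthesis is not filtered
    return s;
}
```

Under these assumptions, `designExportTypes()` leaves only Synthesis, which now declares name, used_for, and compose itself and has no supertype.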

2.4 Composition

In order to combine multiple schemata into a single one, e.g. several export schemata into a federated schema in the reference architecture, we provide a more general form of schema derivation. This schema composition extends the direct schema derivation (see above) by allowing multiple base schemata. We provide facilities for generalization and specialization. If a schema $S$ is generalized from schemata $S_1, \ldots, S_n$, it inherits all their common meta data. If it is specialized, it inherits the sum of all their meta data. On object level, all objects of the inherited types are visible in the new schema.
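The difference between the two composition modes can be sketched in C++ by modelling a schema as a map from type names to property names. This model, the names `Schema`, `Properties`, `specialize`, and `generalize`, and the reading of "common meta data" as a set intersection are assumptions made for illustration; property signatures and object-level semantics are not modelled.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical model: a schema maps each type name to its property names.
using Properties = std::set<std::string>;
using Schema = std::map<std::string, Properties>;

// Specialization: the derived schema inherits the sum (union) of all meta
// data. Same-named types are merged into one type with the union of their
// properties.
Schema specialize(const std::vector<Schema>& bases) {
    Schema result;
    for (const Schema& base : bases)
        for (const auto& [type, props] : base)
            result[type].insert(props.begin(), props.end());
    return result;
}

// Generalization: the derived schema inherits only the common meta data,
// read here as the types present in every base schema, restricted to the
// properties they share.
Schema generalize(const std::vector<Schema>& bases) {
    Schema result;
    if (bases.empty()) return result;
    result = bases.front();
    for (std::size_t i = 1; i < bases.size(); ++i) {
        Schema common;
        for (const auto& [type, props] : result) {
            auto other = bases[i].find(type);
            if (other == bases[i].end()) continue;  // type missing in base i
            Properties shared;
            for (const std::string& p : props)
                if (other->second.count(p) > 0) shared.insert(p);
            common[type] = shared;
        }
        result = common;
    }
    assert(result.size() <= bases.front().size());  // never adds types
    return result;
}
```

With a design schema containing Product(no, name, prod_line) and a production schema containing Product(no, name) and Material(no, name), `specialize` yields both types with Product carrying all three attributes, while `generalize` yields only Product(no, name).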

Example 2.4

To combine the three export schemata of the CACE FDBS into a federated schema containing the sum of their meta data, the following has to be specified in ODL+:

    schema EnterpriseSchema : DesignExportSchema,
                              ProdExportSchema,
                              Sales1ExportSchema
    { // restructurings and augmentations
    };

When types of different base schemata have the same name, they are mapped to a single representation in the new schema. The new type inherits the sum of their attributes, relationships, and methods. Type properties with the same name must have syntactically identical definitions. On object level, the new type presents the sum of the objects from the base types (corresponding to an outer join of relational DBS; see footnote 1).

In the CACE FDBS example, DesignDBS and ProductionDBS both have a type Product. Therefore they are mapped to a single type in the federated schema by default. Nevertheless, the administrator can redirect this default mapping, e.g. to rename synonyms or homonyms. We will assume that the enterprise desires to map the Product types of DesignDBS and ProductionDBS and additionally the type Article of SalesDBS1 to a single type in the federated schema, because all of them represent product data for the enterprise, just in different stages (design, production, sales). Therefore the administrator has to rename the type Article to Product and must resolve syntactic and semantic discrepancies between their type properties. Apart from restructurings, the federated schema can also be extended with new meta data by augmentation. The ODL+ facilities for restructuring and augmentation are presented in the following paragraphs. All restructurings and augmentations have to be specified in the body of the new schema, i.e. between the braces { and }.

2.5 Restructuring

For solving schema integration conflicts or for changing/enriching the semantics of inherited meta data, various restructuring facilities are offered. For example, they allow the name or type of an item to be modified and its values to be converted. A renaming changes the name of a type, attribute, relationship, or method. It is essential for solving synonym and homonym conflicts. The new name becomes valid in the derived schema for all dependent definitions. For example, renaming a type also changes the type definition of all its subtypes in the derived schema.

Example 2.5

In the following, we rename the synonymous type Article in SalesDBS1 and its attribute id, a synonym of no. As a result of the renamings, the types Product of DesignDBS and ProductionDBS as well as Article of SalesDBS1 are mapped to a single type in the federated schema with name Product and the sum of their attributes, i.e. no, name, and prod_line. (The syntactic conflict between the attributes no is eliminated below by an additional restructuring.)

    rename DesignExportSchema::Product::kind name;
    rename Sales1ExportSchema::Article::id no;
    rename Sales1ExportSchema::Article Product;

Footnote 1: This specific type mapping becomes important for object migration [11].

The renaming is reversible, i.e. the mapping is defined in both directions, from the base schema(ta) to the derived schema and vice versa. Many other restructurings can be realized by methods. For this, the administrator has to specify a method with the old item as input and the restructured data as output. If the method is also associated with its reverse method, the mapping is reversible. We therefore extended the declaration of methods in ODL by an INVERSE clause (similar to that for relationships). The implementation of the method can be written in standard ODL, i.e. by realizing an arbitrary algorithm in the chosen programming language (here C++).

Example 2.6

A method converting a string to an integer can be declared in ODL+ with C++ binding as:

    int String2Int (String s) inverse Int2String;
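As a minimal sketch of what such a method pair might look like in plain C++, the following block implements String2Int and Int2String with the standard library and bundles them in a hypothetical `ReversibleMapping` wrapper. The wrapper and the function `productNoMapping` are assumptions for illustration; the paper does not show how ODL+ represents the inverse association internally.

```cpp
#include <cassert>
#include <functional>
#include <string>

// A restructuring method bundled with its reverse method, mirroring the
// declaration `int String2Int (String s) inverse Int2String;`.
// The wrapper itself is hypothetical.
template <typename From, typename To>
struct ReversibleMapping {
    std::function<To(const From&)> forward;   // base schema -> derived schema
    std::function<From(const To&)> backward;  // derived schema -> base schema
};

// Throws std::invalid_argument if s does not start with a decimal number.
int String2Int(const std::string& s) {
    assert(!s.empty());
    return std::stoi(s);
}

std::string Int2String(const int& i) { return std::to_string(i); }

// Mapping for the attribute no of type Product (string in one base schema,
// integer in the derived schema).
ReversibleMapping<std::string, int> productNoMapping() {
    return {String2Int, Int2String};
}
```

Note that the round trip is exact only for canonical decimal strings: a stored value such as "007" would come back as "7", so whether the pair is a true inverse depends on the data.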

In the following, we show for known conflict cases [15, 8] how methods can be used to solve the conflict by restructuring. A frequent case occurs when attributes use different scalings, e.g. Fahrenheit and Celsius as thermal unit or miles and kilometers for distance. In order to change the scaling of an attribute, a method is required that recalculates the value. For changing an attribute from miles to kilometers, for instance, the method has to return the input value multiplied by about 1.6. Besides such linear calculations, the method may also differentiate between input values, e.g. to map the old German ZIP codes to the new 5-digit ones (precision conflicts), or define any other mapping that can be realized by a C++ program.

While re-scaling does not change the attribute type but only its value, sometimes a re-typing is required as well. For example, an attribute employee-identifier may be represented by the data type integer in one base schema and by string in another, although both represent semantically the same data and therefore need to be mapped to a single attribute in the derived schema. Then the data type of one of these attributes needs to be changed by specifying a corresponding cast operation. Since methods can be realized in ODL by arbitrary algorithms, more than simple type conversions can be covered in this way. Some programming languages like C++ already include predefined cast operations for the base data types, which can be used as well.

Example 2.7

While attribute no of type Product is defined as an integer in DesignExportSchema and Sales1ExportSchema, it is represented as a string in ProdExportSchema. The previously declared operation String2Int can solve this conflict:

    Product::no = String2Int (ProdExportSchema::Product::no);
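The scaling conflict discussed before Example 2.7 can be handled by the same pattern of a method with a reverse method. The sketch below assumes the international mile (1 mile = 1.609344 km); the function names `Miles2Km`, `Km2Miles`, and `roundTrips` are hypothetical and do not appear in ODL+.

```cpp
#include <cassert>
#include <cmath>

// Assumed conversion factor: 1 mile = 1.609344 km (international mile).
constexpr double kKmPerMile = 1.609344;

// Forward method: the base schema stores miles, the derived schema exposes
// kilometers.
double Miles2Km(double miles) { return miles * kKmPerMile; }

// Reverse method, needed when values are written back to the base schema.
double Km2Miles(double km) { return km / kKmPerMile; }

// True if applying the forward and then the reverse method returns the
// input value up to floating-point rounding.
bool roundTrips(double miles) {
    assert(std::isfinite(miles));
    return std::fabs(Km2Miles(Miles2Km(miles)) - miles) < 1e-9;
}
```

In ODL+ notation such a pair would presumably be declared with an inverse clause in the same way as String2Int; because the calculation is linear, the mapping is reversible apart from rounding.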

A conversion of multiple attributes to a single complex one is achieved by defining a method with all old attributes as input parameters and the new attribute type as result type. This allows, for example, the simple attributes street, no, zip, and town of an inherited type to be converted to a complex attribute addr.

Example 2.8

Though the customer address must be flattened in the relational model, we are able to map it to a single structured attribute in our canonical data model. The corresponding restructuring statement is:

    Customer::addr = CombineAddress(Sales1ExportSchema::Customer::street,
                                    Sales1ExportSchema::Customer::zip,
                                    Sales1ExportSchema::Customer::town);

    struct Addr {String street; int zip; String town;};
    Addr& CombineAddress (String inStreet, int inZip, String inTown)
    { Addr *a = new Addr;   // allocated on the heap, returned by reference
      a->street = inStreet;
      a->zip = inZip;
      a->town = inTown;
      return (*a);
    }

This definition introduces a new attribute name (addr) and filters the existing ones (street, zip, town). Note that this mapping is definable in one direction only, since the inverse mapping is not precise and cannot be realized as a single method.
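For comparison, a value-returning variant of CombineAddress in standard C++ might look as follows. It is a sketch under the assumption that `String` corresponds to `std::string`; the three accessor functions are hypothetical and only illustrate one reading of why the reverse direction does not fit a single inverse clause: it yields three separate values.

```cpp
#include <cassert>
#include <string>

struct Addr {
    std::string street;
    int zip;
    std::string town;
};

// Value-returning variant: no heap allocation, so the caller has nothing
// to free.
Addr CombineAddress(const std::string& inStreet, int inZip,
                    const std::string& inTown) {
    assert(inZip >= 0);  // ZIP codes are non-negative
    return Addr{inStreet, inZip, inTown};
}

// Hypothetical reverse direction: one method per base attribute instead of
// a single inverse method.
std::string AddrStreet(const Addr& a) { return a.street; }
int AddrZip(const Addr& a) { return a.zip; }
std::string AddrTown(const Addr& a) { return a.town; }
```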

To solve generalization conflicts, where inherited types of different CDBS define objects at different levels of abstraction, the types can be related by sub-/supertyping during composition. This creates an inclusion relationship between the type and its generalized supertype. For example, a type Student and the more general type Person from different CDBS can be connected in this way.

The presented restructurings refine the mapping between export schemata and federated schema. However, we have seen that some mappings cannot be formulated bijectively, e.g. the mapping of multiple address attributes to a single attribute with a complex data type. In Section 3, we will see that this may restrict the access facilities for objects of those types: in general, they cannot be created or updated globally. In order to overcome this restriction, ODL+ provides further mechanisms. They allow an administrator to override the OML operations for any schema type, e.g. to implement the create operation by corresponding CDBS operations. Though this technique is very powerful, it is not advisable to realize the complete schema transformations this way, i.e. to define the OML for every type, as it requires much implementation effort. We therefore recommend it only for cases which cannot be solved by other ODL+ constructs. One such severe conflict is the type/attribute/value conflict of [9]. However, such a transformation is implemented only once and can be used by any application using that schema.

2.6 Augmentation

The inverse capability to filtering is augmentation. It allows meta data to be added during schema derivation. The derived schema can obtain new types or extend inherited types with new attributes, relationships, or methods. New types can either be defined as globally new types, which are not mapped to any CDBS but to the internal FDBS database, or be derived as sub- or supertypes from inherited or new types. Derived subtypes can define additional attributes, methods, and relationships (e.g. inter-DBS relationships).

The augmentation during schema derivation is specified analogously to that of type derivation. The new meta data are defined in the body of the derived schema in terms of ODL+. If types are augmented by additional members, i.e. attributes, relationships, and/or methods, these new properties are inherited by all their subtypes. This also holds for those subtypes which were inherited from the base schema and did not have these augmentations there. New definitions may override inherited ones if they have the same name and scope, e.g. a new type Person overrides the inherited type with the same name. Since this incorporates an implicit filtering of the overridden definition, dependent definitions like relationships to the overridden type are managed as specified for filtering (see above).

Example 2.9

In the following, we present the federated schema of the CACE FDBS comprehensively, with all restructurings and augmentations. The federated schema is built by specialization from the export schemata of DesignDBS, ProductionDBS, and SalesDBS1, i.e. it inherits the sum of all their definitions (1). Then the restructurings rename some inherited types or type properties (2), convert an attribute's type (3), and combine multiple attributes into a single complex one (4). As augmentations, we define two subtypes of Product, named InterimProduct and EndProduct, and a subtype of Material, named OwnMaterial, which possesses an inter-DBS relationship to Depot (5). Additionally, a new global type EnterpriseInfo is introduced with general information on the enterprise which is not stored in any CDBS (6).

    (1) schema EnterpriseSchema:DesignExportSchema,
               ProdExportSchema, Sales1ExportSchema {
          //data types and help functions
          struct Addr {String street; int zip; String town;};
          Addr& CombineAddress (String inStreet, int inZip, String inTown);
          int String2Int (String s) inverse Int2String;
          String Int2String (int i);
    (2)   //restructuring: renaming of synonyms/homonyms
          rename DesignExportSchema::Product::kind name;
          rename Sales1ExportSchema::Article::id no;
          rename Sales1ExportSchema::Article Product;
    (3)   //restructuring: casting of an attribute type
          Product::no = String2Int (ProdExportSchema::Product::no);
    (4)   //restructuring: combination of multiple attributes into one
          Customer::addr = CombineAddress(
                 Sales1ExportSchema::Customer::street,
                 Sales1ExportSchema::Customer::zip,
                 Sales1ExportSchema::Customer::town);
    (5)   //augmentation by new subtypes
          class EndProduct : Product
          {public: int quality; sell_to_consumer (Customer c);};
          class InterimProduct : Product
          {public: sell_to_end_producer (Customer c);};
          class OwnMaterial : Material
          {public: Set< Ref > stored_in;};
    (6)   //augmentation by a globally new type
          class EnterpriseInfo
          {public: int employee_no; int turnover; String attitude;
                   void print(); };
        };

An external schema can be derived from this federated schema by applying any schema transformation except composition. The following external schema lets an application (set) see only a subset of the information in the federated schema. It is restricted to the classes EndProduct, Customer, and EnterpriseInfo (except attribute turnover).

    schema ApplicationSchema:EnterpriseSchema
    { drop Synthesis, Material, Product;
      drop InterimProduct, Depot;
      drop EnterpriseInfo::turnover;
    };

This completes our exemplary schema integration using the five level schema architecture.

The presented schema transformations (direct schema derivation, filtering, composition, restructuring, and augmentation) can be used by an administrator during the installation of an FDBS. As illustrated for the CACE FDBS example, the administrator can initially define the five level schema architecture for the database systems to be federated. While these transformations may also serve for other architectures with more or fewer schema levels or with shortcuts between them, we restrict our description to the five level schema architecture for clarity and simplicity.

ODL+ also allows an existing schema integration to be extended dynamically; only the existing specification has to be extended. For example, to extend the CACE FDBS with SalesDBS2 of a new sales district, its component schema has to be defined in terms of ODL+, its export schema is derived by filtering meta data which are not relevant for the federation layer, and the federated schema has to be extended by a new base schema, the export schema of the new DBS. [11] describes the schema extension in ODL+ and its use for object migration in more detail.

3 Using ODMG OQL/OML as Global Interface

Since the FDBS preserves the autonomy of the CDBS, objects can still be accessed through the CDBS interfaces. These interfaces provide the data of a single database system in the CDBS-specific data model. In addition, the global interface of the FDBS allows the data of multiple CDBS to be accessed uniformly in the canonical data model (Fig. 2). While many FDBS approaches are restricted to global queries, we also allow data to be created, deleted, and updated globally. It is even possible to create data which are not mapped to any CDBS but stored in the internal FDBS database, e.g. those of globally new types.

As global interface, our approach uses ODMG OQL/OML. The syntax of these languages does not have to be extended for an FDBS but can directly serve as a global interface. Thereby all standard applications using the ODMG standard can be used as global applications of the FDBS, even standard applications which were developed or used for a homogeneous DBS. Only the semantics needs to be extended, in the sense that all operations allow transparent data access on multiple heterogeneous DBS and are not restricted to a single DBS.

[Figure 2 elements: Global Appl., Global Interface, Federation layer (FDBS Kernel), Local Appl., DBS1-Interface, DBS2-Interface, ..., DBSn-Interface, DBS1, DBS2, ..., DBSn. Legend: Module, Application, Association 'uses'.]

Figure 2: ODMG OQL/OML as Global Interface

As specified in the ODMG-93 standard, each ODMG OQL/OML application refers to an ODL schema (by the include and open statements). In our FDBS approach, an application may refer to any federated or external schema which was defined during schema integration (previous section). The specified schema and its transformations down to the various CDBS schemata determine the mapping to the various CDBS. They define how data accessed by a global application using OQL/OML are mapped to CDBS data units. This is the main change in the semantics of OQL/OML when they become FDBS languages. Therefore we discuss it in more detail in the following.

For simplicity, we assume that a logical global object is created in the federation layer for every schema level and that mappings exist among these objects according to the schema transformation. As we have already discussed the transformations of schema integration in depth, we show the corresponding transformations on object level by example only. Afterwards we discuss the two directions of object mapping (top-down, bottom-up) and their impact on the global operations.

Example 3.1

For our CACE FDBS, we consider the object transformation of an object with global type Product (Fig. 3). The schema harmonization from local to component schemata maps the CDBS data into a logical object of the canonical data model. It transforms a Product object of the object-oriented DesignDBS to a Product object in its component schema, a Product segment occurrence of the hierarchical ProductionDBS to a Product object in the production component schema, and an Article relation row to an Article object in the sales component schema.

[Figure 3 traces a product object through the schema levels. Local schemata: DesignDBS (object-oriented), type Product with attributes no, kind, last_change, prod_line; ProductionDBS (hierarchical), segment type Product with fields no, name, prod_line; SalesDBS1 (relational), relation Article with columns id, name, bought_by, stored_in. Component to export schema: drop last_change (DesignDBS); drop bought_by, drop stored_in (SalesDBS1). Export to federated schema: rename kind name (DesignDBS); String2Int(no) (ProductionDBS); rename id no, rename Article Product (SalesDBS1). Federated and external schema: Product with no, name, prod_line.]

Figure 3: Object transformation for a product object in the CACE example

The direct schema derivation was used for deriving the export schema of ProductionDBS from its component schema. Therefore the product object is mapped 1:1 between these schema levels. For the type Article of SalesDBS1, a partial filtering was performed by dropping the relationships bought_by and stored_in in the export schema. Hence, the Article object of the component schema is mapped to a corresponding Article object in the export schema where all type properties but the filtered relationships are visible.

Type Product of the federated schema was derived by schema composition and restructuring from the export schema types Product and Article. The restructuring of type Article renames its attribute id to no and its type name to Product. Thereby an Article object of the SalesDBS1 export schema is mapped 1:1 to an object in the federated schema with type Product, and the id data are accessible via the attribute name no. An analogous object transformation exists from the Product object of DesignDBS to the federated schema, where only the attribute kind was renamed. The schema mapping from the ProductionDBS export schema contains a retyping of the attribute no from data type string to int and therefore changes the data representation of its attribute value in the federated schema. All three objects of the export schemata are mapped to a single object in the federated schema. As the external schema was derived from the federated schema by filtering some types but without affecting type Product, the Product object of the federated schema is mapped 1:1 to a Product object of the external schema.
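The object transformation of Example 3.1 from the export schemata to the federated schema can be sketched in C++ as follows. The struct and function names are hypothetical, the attribute lists follow Figure 3, and identifying the three export objects as the same global object by equal no values is an assumption of this sketch; the paper does not state how corresponding objects are matched.

```cpp
#include <cassert>
#include <optional>
#include <string>

// Hypothetical export-schema objects of the three CDBS (after filtering).
struct DesignProduct     { int no; std::string kind; std::string prod_line; };
struct ProductionProduct { std::string no; std::string name; std::string prod_line; };
struct SalesArticle      { int id; std::string name; };

// Global Product object of the federated schema. Attributes without a
// counterpart in the contributing local data unit stay unset.
struct Product {
    int no;
    std::optional<std::string> name;
    std::optional<std::string> prod_line;
};

// rename kind name
Product fromDesign(const DesignProduct& d) {
    return {d.no, d.kind, d.prod_line};
}

// Product::no = String2Int(no): retyping from string to int
Product fromProduction(const ProductionProduct& p) {
    return {std::stoi(p.no), p.name, p.prod_line};
}

// rename id no; rename Article Product
Product fromSales(const SalesArticle& a) {
    return {a.id, a.name, std::nullopt};
}

// Merge two views of the same global object; assumes both carry the same
// no. Attributes set in `a` take precedence over those set in `b`.
Product merge(const Product& a, const Product& b) {
    assert(a.no == b.no);
    return {a.no,
            a.name ? a.name : b.name,
            a.prod_line ? a.prod_line : b.prod_line};
}
```

A product that exists only in SalesDBS1 thus has no prod_line value in the federated schema until a matching object from DesignDBS or ProductionDBS contributes one.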

In order to support any data access on global objects, their object transformation needs to be reversible:

• The top-down direction from global to local data units is required to perform a global create or update operation. In this case the new global data are transformed from the federation schema of the application to the conceptual schemata of the CDBS and then written as local data units into the CDBS. Type properties which were filtered during schema integration are set to default values on global creation (NULL for basic C++ data types). For the CACE FDBS, a top-down mapping is defined for all global types except Customer, because we did not define the mapping of its attribute addr reversibly. Therefore objects of all global types except Customer can be created and updated via the global interface. When an object of type Depot is globally created, the filtered attribute capacity is set to the default value NULL. (Had we overloaded the create/update operation for type Customer, creation and update would have been possible.)

• The bottom-up direction from local to global objects is relevant for a global query. Local data units of some CDBS are then transformed from conceptual schemata to a federation schema, and the application receives a global object. As we define the schema integrations and schema extensions bottom-up in ODL+, such a bottom-up mapping is possible for all objects. Therefore global FDBS queries can be formulated on any type of a federated or external schema in our approach. Type properties of the global type that have no correspondence in the associated local data units are set to default values (NULL for relationships and for attributes of basic C++ data types or collections, and an empty or exception-raising function for methods). In our CACE FDBS, any data can be queried according to a federated or external schema.
When a product object is selected which was locally created in SalesDBS1, its attribute prod_line is returned as the default value NULL because the attribute was not set in SalesDBS1. If attributes or relationships that have no correspondence in a local data unit are updated, they are stored in the FDBS internal database. Therefore the export schema of the FDBS internal database has a type duplicate for every type of the federated and external schemata (prefixed by the schema name to prevent naming conflicts, and with implicit type mappings to their global type).

In conclusion, the ODMG OQL/OML serve as a powerful global interface for an FDBS, allowing data to be created, modified, selected, and deleted across multiple heterogeneous DBS. Though our work allows many more cases of updates than other approaches, we also identified the conditions under which an update is not possible. However, this paper does not focus on specific mapping issues like transactions or query optimization.
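The two mapping directions can be sketched in C++ as follows. This is only an illustration under stated assumptions, not the Efendi implementation: TypeMapping, the record layouts, the city attribute of Depot, and the street/city concatenation used to make addr non-reversible are all hypothetical; only the behaviour (capacity defaults to NULL on global create, Customer cannot be created globally) follows the text.

```cpp
#include <functional>
#include <optional>
#include <stdexcept>
#include <string>

// A mapping between a local type L and a global type G (hypothetical helper).
template <typename L, typename G>
struct TypeMapping {
    std::function<G(const L&)> bottomUp;  // always defined: used by global queries
    std::function<L(const G&)> topDown;   // may be empty: the mapping is then not reversible

    // Bottom-up: turn a local data unit into a global object.
    G query(const L& local) const { return bottomUp(local); }

    // Top-down: turn new global data into a local data unit, if possible.
    L create(const G& global) const {
        if (!topDown) throw std::logic_error("mapping is not reversible");
        return topDown(global);
    }
};

// Depot: `capacity` is filtered during schema integration (layout assumed).
struct LocalDepot  { int id; std::string city; std::optional<int> capacity; };
struct GlobalDepot { int id; std::string city; };

inline TypeMapping<LocalDepot, GlobalDepot> depotMapping() {
    return {
        [](const LocalDepot& l) { return GlobalDepot{l.id, l.city}; },
        // The filtered attribute gets its default (an empty optional models NULL).
        [](const GlobalDepot& g) { return LocalDepot{g.id, g.city, std::nullopt}; }
    };
}

// Customer: `addr` is derived in a way that has no defined inverse (assumed
// here to be a concatenation of two local attributes).
struct LocalCustomer  { int id; std::string street; std::string city; };
struct GlobalCustomer { int id; std::string addr; };

inline TypeMapping<LocalCustomer, GlobalCustomer> customerMapping() {
    return {
        [](const LocalCustomer& l) { return GlobalCustomer{l.id, l.street + ", " + l.city}; },
        nullptr  // no top-down mapping: global create/update is rejected
    };
}
```

With these definitions, a Depot can be queried and created through the global type, while a Customer can be queried but an attempt to create one raises an exception. Supplying a topDown function for Customer would correspond to overloading its create/update operation.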

4 Conclusion

This paper extended the ODMG interfaces to FDBS languages. The most important extensions are facilities for schema integration in the object definition language ODL. The current ODMG ODL does not support multiple schemata and therefore has no capabilities to derive one schema from another. But schema derivation mechanisms are essential for an FDBS to combine the various DBS schemata into one or several FDBS schemata, e.g. according to the five-level reference schema architecture. Therefore we developed new language constructs for the required schema transformations. They allow one schema to be derived from one or multiple other schemata, whereby information can be filtered, added, renamed, or arbitrarily transformed by methods. These constructs are analogous to those for class definition and inheritance in ODL, which simplifies learning the extensions. The semantics of the ODMG OML/OQL are extended to provide transparent data access on multiple DBS as if they formed a single logical database. This enables the user to specify multidatabase queries in the current OQL and data manipulations in OML. In contrast to many other FDBS approaches, we allow data to be updated from the global interface in many cases. This was made possible by the powerful schema transformation facilities of ODL+. Nevertheless, we also identified the conditions under which an update is not possible.

The presented ODMG extensions have been realized within Cadlab's federated database system Efendi. Efendi will soon become a product, as part of OpenDM (open database middleware). It has already been validated in pilot projects with industrial organizations. A further important feature of Efendi is its data migration capability. It allows data to be moved, copied, and replicated among the component database systems at various granularities while preserving the global object identity of the migrated data [12]. The future development of Efendi will be influenced on the one hand by requirements of our project partners and customers and on the other hand by enhancements such as performance tuning or more sophisticated transaction mechanisms.

References

[1] BATINI, C., LENZERINI, M., NAVATHE, S. A comparative analysis of methodologies for database schema integration. ACM Computing Surveys 18, pages 323-364, 1986.
[2] BUSSE, R., FANKHAUSER, P., NEUHOLD, E.J. Federated schemata in ODMG. In Proc. Int'l East/West Database Workshop, 1994.
[3] CATTELL, R.G.G. The object database standard: ODMG-93. Morgan Kaufmann, 1994.
[4] CONNORS, T., LYNGBAEK, P. Providing uniform access to heterogeneous information bases. In Proc. Int'l Conf. on Extended Data Base Technology (EDBT), pages 162-173, 1988.
[5] HEIJENGA, W. View definition in OODBS without queries: a concept to support schema-like views. In Doct. Cons. 2nd Intl. Baltic Wg on Databases and Information Systems, Tallinn (Estonia), 1996.
[6] JONKER, W., SCHUETZ, H. The ECRC multidatabase system. In Proc. Int'l Conf. on Management of Data (SIGMOD), San Jose, USA, 1995.
[7] KELLEY, W., GALA, S., REYES, T., GRAHAM, B. Schema architecture of the UniSQL/M multidatabase system. In The object model, interoperability, and beyond (Ed. Kim, W.). ACM Press, 1995.
[8] KIM, W., CHOI, I., GALA, S., SCHEEVEL, M. On resolving schematic heterogeneity in multidatabase systems. Distributed and Parallel Databases 1, pages 251-279, 1993.
[9] KRISHNAMURTHY, R., LITWIN, W., KENT, W. Interoperability of heterogeneous databases with semantic discrepancies. In Proc. 1st Int'l Workshop on Interoperability in Multidatabase Systems (RIDE-IMS), pages 144-151, Kyoto, Japan, 1991.
[10] LITWIN, W. O*SQL: a language for multidatabase interoperability. In Proc. IFIP-DS5 Semantics of Interoperable Database Systems, pages 114-133, Lorne, Australia, 1992.
[11] RADEKE, E. When and how to extend schema integration for data migration in FDBS. Submitted for publication, 1996.
[12] RADEKE, E., SCHOLL, M.H. Functionality for object migration among distributed, heterogeneous, autonomous database systems. In Proc. Int'l WS Research Issues on Data Engineering - Distributed Object Management (RIDE-DOM), pages 58-66, Taipei, Taiwan, 1995.
[13] SALTOR, F., CASTELLANOS, M., GARCIA-SOLACO, M. Suitability of data models as canonical models for federated databases. SIGMOD Record 20(4), pages 44-48, 1991.
[14] SCHOLL, M.H., LAASCH, C., TRESCH, M. Updatable views in object-oriented databases. In Proc. 2nd Conf. on Deductive and Object-Oriented Databases (Munich, Germany), 1991.
[15] SHETH, A., KASHYAP, V. So far (schematically) yet so near (semantically). In Proc. IFIP Workshop on Data Semantics, DS-5, pages 272-301, Lorne, Australia, 1992.
[16] SHETH, A., LARSON, J. Federated database systems for managing distributed, heterogeneous, and autonomous databases. ACM Computing Surveys 22(3), pages 183-236, 1990.