On completeness of historical relational query languages

JAMES CLIFFORD
New York University

ALBERT CROKER
City University of New York

and

ALEXANDER TUZHILIN
New York University

Numerous proposals for extending the relational data model to incorporate the temporal dimension of data have appeared in the past several years. These proposals have differed considerably in the way that the temporal dimension has been incorporated both into the structure of the extended relations of these temporal models and into the extended relational algebra or calculus that they define. Because of these differences, it has been difficult to compare the proposed models and to make judgments as to which of them might in some sense be equivalent or even better. In this paper we define temporally grouped and temporally ungrouped historical data models and propose two notions of historical relational completeness, analogous to Codd's notion of relational completeness, one for each type of model. We show that the temporally ungrouped models are less expressive than the grouped models, but demonstrate a technique for extending the ungrouped models with a grouping mechanism to capture the additional semantic power of temporal grouping. For the ungrouped models, we define three different languages, a logic with explicit reference to time, a temporal logic, and a temporal algebra, and motivate our choice for the first of these as the basis for completeness for these models. For the grouped models, we define a many-sorted logic with variables over ordinary values, historical values, and times. Finally, we demonstrate the equivalence of this grouped calculus and the ungrouped calculus extended with a grouping mechanism. We believe the classification of historical data models into grouped and ungrouped models provides a useful framework for the comparison of models in the literature, and furthermore, the exposition of equivalent languages for each type provides reasonable standards for common, and minimal, notions of historical relational completeness.

Categories and Subject Descriptors: H.2.1 [Database Management]: Logical Design - data models; H.2.3 [Database Management]: Languages - query languages

Authors' addresses: J. Clifford and A. Tuzhilin, Information Systems Department, Stern School of Business, New York University, New York, NY 10012; A. Croker, Statistics and Computer Information Systems, Baruch College, City University of New York, New York, NY 10010.

Permission to copy without fee all or part of this material is granted provided that the copies are not made or distributed for direct commercial advantage, the ACM copyright notice and the title of the publication and its date appear, and notice is given that copying is by permission of the Association for Computing Machinery. To copy otherwise, or to republish, requires a fee and/or specific permission. © 1994 ACM 0362-5915/94/0300-0064 $03.50

ACM Transactions on Database Systems, Vol. 19, No. 1, March 1994, Pages 64-116.


General Terms: Languages, Theory

Additional Key Words and Phrases: Completeness, historical databases, query languages, relational model, temporal databases, temporal grouping, temporal logic

1. INTRODUCTION

Over the course of the past decade, various historical relational data models have been proposed, including Ariav [1986], Ben-Zvi [1982], Clifford and Croker [1987], Clifford and Warren [1983], Gadia [1988], Jones and Mason [1980], Lorentzos and Johnson [1987], Navathe and Ahmed [1989], Sarda [1990], Snodgrass [1987], and Tansel [1986].¹ These data models are intended for those situations where there is a need for managing data as they change over time. Generally, these data models extend the standard relational data model by including a temporal component. This incorporation of the temporal dimension has taken a number of different forms. Chief among these has been the addition of an attribute, such as TIME, to a relation (the equivalence of time stamping) [Snodgrass 1987], or the inclusion of time as a more intrinsic part of the structure of a relation [Clifford and Croker 1987; Gadia 1986]. The latter approach results in what have been called non-first-normal-form relations.

Although the structures of the historical relations defined in each of the proposed historical relational data models differ from each other to varying degrees, whether they have the same modeling capabilities has remained a subject for debate. Moreover, because the query languages defined in these data models differ from each other in their formulations, it has remained unclear whether they provide the same capabilities for extracting various subsets of a database. In fact, so many different languages have appeared in the literature (e.g., McKenzie and Snodgrass [1991a] refers to no fewer than 12 algebras alone) that it is crucial to have some standard measure against which to compare them.

In this paper we address the issue of completeness of historical relational data models and languages. A metric of historical relational completeness can provide a basis for determining the expressive power of the query languages that have been defined as part of proposed historical relational data models.

As such, the notion of historical relational completeness can serve a role similar to that of the original notion of relational completeness first proposed by Codd [1972] and later justified as being reasonable by Bancilhon [1978] and by Chandra and Harel [1980].

In Section 2 we first address the issue of the modeling capability of the various historical data models that have been proposed. In particular, we explicate the different modeling capabilities achieved by incorporating the temporal dimension at the tuple level (by time-stamping each tuple) or at the attribute-value level (by including time as part of each value). We introduce the terms temporally ungrouped and temporally grouped to distinguish between these two approaches, respectively, and discuss the relative power of the two approaches. We then propose two canonical models to serve as the basis for our analysis of the power of query languages for these two approaches. The distinction between these two different types of models, temporally ungrouped (TU) and temporally grouped (TG), serves to structure the remainder of the paper.

In Section 3 we introduce weak and strong completeness for comparing the query languages of different data models. Then in Section 4 we apply these concepts of completeness separately to the temporally ungrouped and temporally grouped models. For the temporally ungrouped models, we define three different languages: a temporal logic, a logic with explicit reference to time, and a temporal algebra. We propose the logic with explicit reference to time as a standard for strong completeness, which we call TU-Completeness, for temporally ungrouped models. In Section 5 we examine the temporally grouped models and define a historical relational calculus for them; this calculus is a many-sorted logic with variables over ordinary values, historical values, and times. We propose this calculus as a standard of strong completeness, which we call TG-Completeness, for models of this type. In Section 6 we show that the ungrouped historical data models are only weakly complete with respect to the grouped historical models. However, we then show how the ungrouped models can be extended to incorporate the power of grouping in their representation and semantics, and show that this extended ungrouped model is strongly complete with respect to the grouped historical model. Finally, in Section 7 we examine the completeness of several historical relational languages that have been proposed in the literature with respect to these metrics.

It is worth pointing out that there are a number of additional issues that might reasonably be said to be related to the question of completeness of query languages, but that are necessarily outside of the scope of this paper. We are limiting our attention to models that incorporate a single dimension of time (historical, as opposed to temporal, models, in the terminology of Snodgrass and Ahn [1985]), but we believe that these results could be extended to handle additional time dimensions. Furthermore, in the spirit of most of the work on completeness for standard relational languages, we do not address the issue of temporal aggregates (as, e.g., in Snodgrass et al. [1989]). Work in the spirit of Klug [1982] could extend the results herein in that direction if so desired. Finally, we limit our attention to temporally homogeneous relations [Gadia 1988], that is, relations whose tuples have attributes all defined over the same period of time, and do not incorporate schema evolution over time (as in Clifford and Croker [1987]) because treatment of these additional issues would significantly lengthen the paper and because they have not been included in most of the proposed historical data models. In all of these decisions of what to incorporate in our notion of "reasonable" queries, we have been motivated by the desire to choose the greatest common denominator of the various models proposed. In this way we have been able to apply our metrics of completeness fairly against several models in the paper.

We conclude in Section 8 with a summary of our results and some directions for future research.

¹ This list is not exhaustive. For an overview of the area of time and databases, see Ariav and Clifford [1986], Snodgrass [1990], and Tansel et al. [1993]; for an ongoing bibliography on the subject, see McKenzie [1986], Stam and Snodgrass [1988], and Soo [1991].

2. TEMPORALLY GROUPED AND TEMPORALLY UNGROUPED DATA MODELS

Two different strategies for incorporating a temporal dimension into the relational model have appeared in the literature. In one, the schema of the relation is expanded to include one or more distinguished temporal attributes (e.g., TIME) to represent the period of time over which the fact represented by the tuple is to be considered valid. This approach has been referred to in the literature as tuple time stamping or as a first-normal-form (1NF) model. In the other approach, referred to as attribute time stamping or as a non-first-normal-form (N1NF) model, instead of adding additional attributes to the schema, the domain of each attribute is extended from simple values to complex values (e.g., functions) that incorporate the temporal dimension. Both Clifford and Croker [1987] and Snodgrass [1990] contrast these two approaches.

Consider, for example, a relation intended to record the changing department and salary histories of employees in an organization.² Tables I and II show typical representations of these two approaches. Although both relations appear to have the same information content, that is, the same data about three different employees over the same period of time, the models represent this information in quite different ways. In the 1NF approach (Table I and models such as Ariav [1986], Lorentzos and Johnson [1987], Navathe and Ahmed [1989], Snodgrass [1987], and Tuzhilin and Clifford [1990]), each moment of time relevant to each employee is represented by a separate tuple, which carries the time stamp. In the N1NF approach (Table II and models such as Clifford and Croker [1987], Gadia [1988], and Tansel [1986]), each employee's entire history is represented within a single tuple, within which the time stamps are embedded as components of the values of each attribute.

Also note, with respect to the N1NF models, that while, in general, a key field like NAME would typically be constant over time, there is no requirement that this be the case. For example, in the EMPLOYEE relation in Table II the employee Tom changes his name to Thomas at time 3. There are many applications where the value of a key need not be constant over time, but merely unique in the relation at any given time.

Whereas N1NF models inherently group related facts into a single tuple, 1NF models, whether historical or temporal (using the distinction in Snodgrass and Ahn [1985] for models with one or two time dimensions, respectively), are problematic in this regard. Such models provide no inherent

² Similar examples have appeared in Clifford and Warren [1983], Gadia [1986], and Snodgrass [1987].

Table I. Prototypical 1NF Historical Employee Relation EMPLOYEE. Columns: NAME, DEPT, SALARY, time.
into "temporal objects":

(1) A group-id and a time determine the rest of the tuple, no matter in which relation they appear: if $R(o, x_1, \ldots, x_n, t)$ and $Q(o, y_1, \ldots, y_m, t)$ are true, then $m = n$ (i.e., the relations must be union-compatible) and $x_i = y_i$ for $i = 1, \ldots, n$. In other words, $OT$ functionally determines all of the attributes in all of the relations in which $O$ and $T$ appear.

(2) A group-id uniquely determines the group of the tuples independently of which relation they belong to; that is, if $o$ appears in relations $R$ and $Q$, meaning that both $(\exists u_1) \cdots (\exists u_n)(\exists t')R(o, u_1, \ldots, u_n, t')$ and $(\exists y_1) \cdots (\exists y_n)(\exists t'')Q(o, y_1, \ldots, y_n, t'')$ hold, then, for all $x_1, \ldots, x_n, t$, if $R(o, x_1, \ldots, x_n, t)$ is true, then $Q(o, x_1, \ldots, x_n, t)$ is also true, and vice versa.

(3) A group of tuples uniquely determines the group-id; that is, there cannot be two identical groups of tuples with different group-ids. Formally, if there are $R$, $Q$, $o$, and $o'$ such that for all $x_1, \ldots, x_n, t$, $R(o, x_1, \ldots, x_n, t)$ implies $Q(o', x_1, \ldots, x_n, t)$ and $Q(o', x_1, \ldots, x_n, t)$ implies $R(o, x_1, \ldots, x_n, t)$, then $o = o'$.
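The three grouping axioms can be checked mechanically on a small database. The following is a minimal sketch, not the authors' formulation: it assumes a hypothetical encoding in which each relation is a Python set of tuples `(o, x1, ..., xn, t)` with the group-id first and the time stamp last, and the function name `violations` is an illustrative choice.

```python
from collections import defaultdict


def violations(db):
    """Return (axiom number, detail) pairs for db = {relation name: set of tuples}.

    Assumed encoding: every tuple is (o, x1, ..., xn, t), group-id first, time last.
    """
    found = []

    # Axiom 1: the pair (o, t) determines the remaining attributes in all relations.
    seen = {}
    for rel in db.values():
        for row in rel:
            key, values = (row[0], row[-1]), row[1:-1]
            if key in seen and seen[key] != values:
                found.append((1, key))
            seen.setdefault(key, values)

    # The group attached to a group-id: all (x1, ..., xn, t) rows carrying it.
    groups = {name: defaultdict(set) for name in db}
    for name, rel in db.items():
        for row in rel:
            groups[name][row[0]].add(row[1:])

    # Axiom 2: a group-id denotes the same group in every relation where it appears.
    names = list(db)
    for i, r in enumerate(names):
        for q in names[i + 1:]:
            for o in groups[r].keys() & groups[q].keys():
                if groups[r][o] != groups[q][o]:
                    found.append((2, o))

    # Axiom 3: two identical groups cannot carry different group-ids.
    owner = {}
    for name in names:
        for o, g in groups[name].items():
            first = owner.setdefault(frozenset(g), o)
            if first != o:
                found.append((3, (first, o)))
    return found
```

An empty result means the database satisfies all three axioms under this encoding; each reported pair names the axiom that failed and the group-id (or group-id and time) involved.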

Table XI. Relation EMPLOYEE in the Grouped $TC_g$ Model. Columns: Group-ID, NAME, DEPT, SALARY.
xl

where

then

),...,

~ is a safe TC~ formula

rU~(Q)

formulas.

on the set of safe TC~

queries

If Q is a TCK

(on~n)j~)

This

mapping

query

lo}>}>

of the form

is

where historic variables $e_i$ correspond to the group-id variables $o_i$ appearing in predicates $R_i$ in $\psi$, and attributes $A_j$ correspond to variables $x_j$ in these predicates.

Examples illustrating the mapping $\Gamma_{UG}$ follow. In these examples we assume that the schemata of $TU_g$ relations $R$ and $Q$ are $R(O, A, T)$ and $Q(O, A, T)$, respectively, where $O$ is a group-id, $A$ is an attribute, and $T$ is a temporal attribute:

Example 7. The $TC_g$ query

$\{(o, x), (o', x'), t \mid R(o, x, t) \wedge Q(o', x', t)\}$

is mapped into

$[e.A, e'.A : t](\exists x)(\exists x')((R(e) \wedge t \in e.l \wedge e.A(t) = x) \wedge (Q(e') \wedge t \in e'.l \wedge e'.A(t) = x')) \wedge R(e) \wedge t \in e.l \wedge Q(e') \wedge t \in e'.l.$

This expression for $\Gamma_{UG}(Q)$ could be simplified (using standard techniques of logical transformation) to

$[e.A, e'.A : t]\, R(e) \wedge t \in e.l \wedge Q(e') \wedge t \in e'.l.$¹³

¹³ Actually, there is no need to add expressions $e.A_i(t) = x_i$ for all $i = 1, \ldots, n$, but only for those $x_i$'s that appear in other expressions, as some examples will show. However, it is acceptable to do it for all terms, as it simplifies the presentation and the transformation is still correct.

However, this simplification is not always possible, as the following example shows:

Example 8. The $TC_g$ query

$\{(o, x), t \mid R(o, x, t) \wedge (\exists o')Q(o', x, t)\}$

is mapped into

$[e.A : t](\exists x)(R(e) \wedge t \in e.l \wedge e.A(t) = x \wedge (\exists e')(Q(e') \wedge t \in e'.l \wedge e'.A(t) = x)).$

Note that in this case the variable $x$ serves to equate the terms $e.A(t)$ and $e'.A(t)$ via transitivity. Also note that the quantified group-id variable $(\exists o')$ was replaced with the historic variable $(\exists e')$ in the $L_h$ formula.

Example 9. The $TC_g$ query

$\{(o, x), t \mid R(o, x, t) \wedge (\exists x')(\exists t')(R(o, x, t) \wedge Q(o, x', t') \wedge x = x')\}$

is replaced with

$[e.A : t](\exists x)((R(e) \wedge t \in e.l \wedge e.A(t) = x) \wedge (\exists x')(\exists t')((R(e) \wedge t \in e.l \wedge e.A(t) = x) \wedge (Q(e) \wedge t' \in e.l \wedge e.A(t') = x') \wedge x = x')) \wedge R(e) \wedge t \in e.l.$

This expression can be simplified to

$[e.A : t](\exists x)(\exists x')(\exists t')((R(e) \wedge t \in e.l \wedge e.A(t) = x) \wedge (Q(e) \wedge t' \in e.l \wedge e.A(t') = x') \wedge x = x') \wedge R(e) \wedge t \in e.l.$

Note that the equality $x = x'$ did not change in the conversion process. However, it follows from the facts $e.A(t) = x$, $e.A(t') = x'$, and $x = x'$ that the terms $e.A(t)$ and $e.A(t')$ are equal. Also note that the domain variable $x'$ in the $TC_g$ formula remained unchanged in the $L_h$ formula.

Example 10. The $TC_g$ formula

$\{(o, x), t \mid R(o, x, t) \wedge \neg Q(o, x, t)\}$

is converted to

$[e.A : t](\exists x)(R(e) \wedge t \in e.l \wedge e.A(t) = x \wedge \neg(Q(e) \wedge t \in e.l \wedge e.A(t) = x)) \wedge R(e) \wedge t \in e.l.$

Note that, in the previous examples, $\Gamma_{UG}$ maps safe $TC_g$ formulas into safe $L_h$ formulas. We generalize these observations in the following proposition:

PROPOSITION 4. $\Gamma_{UG}$ maps safe $TC_g$ formulas into safe $L_h$ formulas.

SKETCH OF PROOF. Let $\psi$ be a safe $TC_g$ formula. We will prove that $\Gamma_{UG}(\psi)$ is safe by verifying all of the conditions in the definition of safety for $L_h$ formulas. First, $\Gamma_{UG}(\psi)$ does not have universal quantifiers since $\psi$ does not have them.

Second, the range expression $V_i = (\exists x_{ij_1}) \cdots (\exists x_{ij_k}) R_i(o_i, x_{i1}, \ldots, x_{in_i}, t)$ is mapped into the expression $(\exists x_{ij_1}) \cdots (\exists x_{ij_k})(R_i(e_i) \wedge t \in e_i.l \wedge e_i.A_{i1}(t) = x_{i1} \wedge \cdots \wedge e_i.A_{in_i}(t) = x_{in_i})$, and also the expression $R_i(e_i) \wedge t \in e_i.l$ is added at the "outermost" level of $\Gamma_{UG}(\psi)$ because of condition (1) in the definition of the mapping $\Gamma_{UG}$. Clearly, the two expressions are semantically equivalent. But the second condition was added to make $\Gamma_{UG}(\psi)$ syntactically safe. Since $\Gamma_{UG}(\psi)$ has the formula $R_i(e_i) \wedge t \in e_i.l$ for each range expression at the outermost level, the second condition of safety for $L_h$ formulas is satisfied.

Third, subformula $F_1 \vee F_2$ in $\psi$ is mapped into $\Gamma_{UG}(F_1) \vee \Gamma_{UG}(F_2)$ so that $\Gamma_{UG}(F_1)$ and $\Gamma_{UG}(F_2)$ have the same set of atoms $t_j \in e_i.l$, because $F_1$ and $F_2$ have the same set of pairs $(o_i, t_j)$ and because the mapping $\Gamma_{UG}$ translates them into expressions $t_j \in e_i.l$.

Finally, the mapping $\Gamma_{UG}$ is defined so that all three items in the definition of safety related to maximal conjuncts of the formulas are satisfied. □

THEOREM 5. $M_{TG} = (TG, L_h)$ is strongly complete with respect to $M_{TU_g} = (TU_g, TC_g)$.

SKETCH OF PROOF. First, $\Omega_{UG}$ is clearly a correspondence mapping and one-to-one, because of our grouping axioms. Second, the mapping $\Gamma_{UG}$ satisfies the second condition in the definition of strong completeness for the following reasons: Intuitively, the predicate $R(o, x_1, \ldots, x_n, t)$ in $TC_g$ is mapped into the expression $R(e) \wedge t \in e.l$, so that the historical variable $e$ corresponds to the group-id $o$ and so that $t$ is in the lifespan of $e$. Furthermore, group-ids are defined so that the variables $x_1, \ldots, x_n$ are uniquely determined by values of $o$ and $t$ and are irrelevant in the translation process. Also, the expressions $R(o, x_1, \ldots, x_n, t)$ and $R(e) \wedge t \in e.l$ are equivalent. In addition, the mapping $\Gamma_{UG}$ preserves the structure of the formula $\psi$; that is, it leaves conjunctions, disjunctions, and negations of $\psi$ in their places in $\Gamma_{UG}(\psi)$.



6.2.3 Mapping $L_h$ Formulas to $TC_g$. In this section we define the mapping $\Gamma_{GU}$ that maps safe $L_h$ formulas into equivalent safe $TC_g$ formulas. Let $\phi$ be a safe $L_h$ formula. As for the $\Gamma_{UG}$ mapping, the formula $\Gamma_{GU}(\phi)$ is obtained from $\phi$ by replacing all of the atomic formulas in $\phi$ together with quantified variables and leaving the structure of $\phi$ intact (operators $\wedge$, $\vee$, and $\neg$ remain unchanged). The replacement of atomic formulas and quantified variables is done in the following manner:

(1) Replace quantified variables in $L_h$ as follows:

(a) Do not change any quantified domain and temporal variables; that is, $(\exists x)$ and $(\exists t)$ in $L_h$ will remain in $\Gamma_{GU}(\phi)$.

(b) Replace quantified historic variables $(\exists e_i)$ with $(\exists o_i)$, where $o_i$ is a unique group-id variable.

(c) Consider all pairs of historic and temporal variables $e$ and $t$ such that $\phi$ contains an expression $t \in e.l$. Depending on the relationship between the scopes of these variables, we add the expression $(\exists x_1) \cdots (\exists x_n)$ to $\Gamma_{GU}(\phi)$, where $x_i$ is a domain variable associated with historic variable $e$ of arity $n$, as follows:

(i) if $t$ is a free variable and $e$ is a bound variable, then place the expression $(\exists x_1) \cdots (\exists x_n)$ before the expression $(\exists o)$ obtained in step (1b);

(ii) if $t$ and $e$ are bound variables, and if the scope of $e$ is contained within the scope of $t$, then also place $(\exists x_1) \cdots (\exists x_n)$ before $(\exists o)$;

(iii) if $t$ and $e$ are bound, and if the scope of $t$ is contained within the scope of $e$, then place expression $(\exists x_1) \cdots (\exists x_n)$ before $(\exists t)$;

(iv) in all other cases, do not add anything to the formula.

(2) Replace each occurrence of predicate $R(e)$ with $(\exists x_1) \cdots (\exists x_n)(\exists t)R(o, x_1, \ldots, x_n, t)$. If $e$ is a bound variable in $\phi$, then group-id variable $o$ is the same as the one that replaced $e$ in expression $(\exists e)$ in step (1b). If $e$ is free, then all of the occurrences of $e$ are replaced with the same group-id variable $o$.

(3) Replace each occurrence of expression $t \in e.l$ with $R(o, x_1, \ldots, x_n, t)$, where $R$ is one of the predicates containing $e$ occurring positively in the maximal conjunct containing $t \in e.l$.¹⁴ If $e$ is a bound variable in $\phi$, then the group-id variable $o$ is the same as the one that replaced $e$ in the expression $(\exists e)$ in step (1), and the domain variables $x_1, \ldots, x_n$ are the same as the quantified variables introduced in step (1) for the combination of $(\exists e)$ and $(\exists t)$ expressions. If $e$ is a free variable in $\phi$, then the group-id variable $o$ and the domain variables $x_1, \ldots, x_n$ are free and are different from all other variables in $\Gamma_{GU}(\phi)$.

(4) Replace each term $e.A_i(t)$ in $\phi$ with $x_i$, where $x_i$ is defined as follows: Since $\phi$ is safe, the maximal conjunct containing $e.A_i(t)$ must also contain expressions $R(e)$ and $t \in e.l$ (for some $R$). In step (3), $t \in e.l$ is replaced with expression $R(o, x_1, \ldots, x_n, t)$. Then $x_i$ corresponds to the variable in this expression that corresponds to attribute $A_i$ in $R$.¹⁵

Examples illustrating the mapping $\Gamma_{GU}$ follow. In these examples we assume that the schemata of relations $R$ and $Q$ from $L_h$ are $R(A, B)$ and $Q(A)$, respectively:

Example 11. The $L_h$ query

$[e.* : t]\, R(e) \wedge t \in e.l \wedge e.B(t) = 5$

is mapped into the $TC_g$ query as follows: $R(e)$ would be replaced with $(\exists x')(\exists y')(\exists t')R(o, x', y', t')$, $t \in e.l$ with $R(o, x, y, t)$, and $e.B(t) = 5$ with $y = 5$. Putting the pieces together, we get the following answer:

$\{(o, x), (o, y), t \mid (\exists x')(\exists y')(\exists t')R(o, x', y', t') \wedge R(o, x, y, t) \wedge y = 5\}.$

Since $(\exists x')(\exists y')(\exists t')R(o, x', y', t') \wedge R(o, x, y, t)$ is equivalent to $R(o, x, y, t)$, we can rewrite the previous query as

$\{(o, x), (o, y), t \mid R(o, x, y, t) \wedge y = 5\}.$

¹⁴ It follows from the grouping axioms in Section 6.1 that it does not matter which positively occurring predicate $R$ is selected. Any selected predicate produces the same results. In fact, all of the qualifying predicates can be selected as well, for a longer but logically equivalent formula.

¹⁵ The remark in footnote 14 is also applicable here.

Example 12. The $L_h$ query

A t’ Gel

AR(o,

x, y, t) is equivalent

flR(o,

[e. *:t]R(e) (R(e)

answer:

A t Gel

A (Ele’)(Q(e’)

(=t’)

A

A t’ Ee’.l

A~(e)

A t =e.1

~e.B(t)

= e’. A(t))) is mapped

into

the TC~ {(o,.r),

(~(o,

x’’,

Note the that

(o,

the part

domain of the

the variables

t’.In

general,

group-id

y),

Example

is converted

variables

Example

14.

with

in

next

in

o’. Also

temporal

the

same

t are quantified

t. The

A Q(e))

is translated

x’),

The

(o’,

A ~Q(o,

shows L~

[e. + :t]~(e)

note

variable

predicate

as

together

with

the

shows

how

r~c,

example

A~(e’)

A t

● e’.l)

A~(e’)

A t Ge’./

that

y’),

x,t)

r~u

t)l(~o)(ax) A~(o’,

x’,

y’,

does not affect

A~(o’,

t))

domain

x’, y’, t)}.

variables

in

~.

query

A t Gel

A (~z)(~(e)

A t Gel

Ae.

A(t)

=z)

into

{((o,

x),

PROPOSITION

(o, y), 6.

~GW

OF PROOF.

f)l~(o,

~,.y, safe

maPs

The

proof

t)

Lh

A

AX ‘z)}.

(~z)(~(o,~,y,f)

formulas

proceeds

in tO safe along

the

TC~ lines

fornz of

u~as

the

proof

of ❑

4.

THEOREM

7.

M~CT~ = (TUg,

TC~ ) is strongly

complete

with

respect

to M~~

L~).

SKETCH one-to-one,

is quantified

variable

together

variable

=e.1

,t”)t”)

example

= (TG,

example

‘X’))}.

to

The next

SRETCH

t) Ay

query

L~

A -(t

)(3t’’)Q(o,

Proposition

X,y,

group-id

appearing

o and

{((o’, ((3x’’

A~(O,

as the

y“ are quantified

domain

The

[e’. *:tl(3e)(Q(e)

X’, t’)

x’ in the previous

o and temporal

13.

A (~ X’’)(~y’’)(~)’)

t)

(Q(o’,

variable

scope of variables negations.

innermost

handles

x,y,

r~u( ~) formula

x“ and the

variable

tl~(o,

A (30’) (3x’)

y’’,)’)

that

same

query

OF PROOF. because

ACM Transactions

First, of our

QGC, grouping

is

clearly axioms.

a correspondence Second,

on Database Systems, Vol 19, No 1, March 1994

the

mapping

mapping rGU

and satis-

Completeness fies the second show the

condition

by induction L~

formula

in the definition

on maximal

formula

@(el,...

t e e.1 uniquely

on, ~1, ~m,yl,,

along

the lines

The following

theorem

THEOREM 8. with

in

~ in

completeness, At

L~.

Yl,

tl,...

as we shall

any inductive

, t~) is mapped

,l,...,

,tk),

into

group-ids

The grouped

TC~

Y1,...,

where

Y1

I?(e)

A

yl are o ~, . . . ,

and do the proof ❑

5.

immediately model

il!l~u ~ = (TU~,

step,

the

these variables are “superfluous” With this observation in mind,

of Theorem

103

.

variables introduced in the translation process (i.e., when becomes R(o, yl, . . . , y,, t)). Notice that variables yl, ..., determined (i.e., functionally depend) by values of variables

0 ~,xl>. ... xm, tl,...,tk.Therefore, not affect the translation process. proceeds

of strong

conjuncts

, e., xl, . . . . x~, tl,...

r~u(~)(ol,...,

are extra

of Historical Relational Query Languages

follows ikl~~

TC~)

are

from

= (TG, strong

Theorems

L~ ) and

5 and 7:

the

ungrouped

model

equivalent.
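The correspondence between the two kinds of structures can be illustrated on a toy relation. The following is a minimal sketch under stated assumptions, not the paper's formal mapping: ungrouped rows are assumed to be tuples `(o, NAME, DEPT, SALARY, t)` carrying a group-id `o`, a grouped tuple is assumed to be a dictionary holding a lifespan and one time-indexed function per attribute, and the names `group`, `ungroup`, and `ATTRS` are hypothetical.

```python
from collections import defaultdict

ATTRS = ("NAME", "DEPT", "SALARY")  # assumed schema for the illustration


def group(rows):
    """Ungrouped rows (o, NAME, DEPT, SALARY, t) -> one grouped tuple per group-id."""
    grouped = defaultdict(lambda: {"lifespan": set(), **{a: {} for a in ATTRS}})
    for o, *values, t in rows:
        e = grouped[o]
        e["lifespan"].add(t)
        for a, v in zip(ATTRS, values):
            e[a][t] = v  # each attribute becomes a function from times to values
    return dict(grouped)


def ungroup(grouped):
    """Grouped tuples -> ungrouped rows, one per group-id and time in the lifespan."""
    return {
        (o, *(e[a][t] for a in ATTRS), t)
        for o, e in grouped.items()
        for t in e["lifespan"]
    }
```

With rows in which the employee with group-id 1 is named Tom at times 1 and 2 and Thomas at time 3, `group` collects all three rows into one historical tuple, and `ungroup(group(rows))` returns the original rows. Without the group-id column nothing in the rows would tie the Tom and Thomas facts together, which is the point made about the need for a grouping mechanism.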

Theorems 3 and 8 establish the connections between grouped and ungrouped historical data models. The power of temporal grouping that is inherent in grouped models can only be achieved in an ungrouped model by the addition of some mechanism, analogous to our group-ids, for simulating the grouping.

7. HISTORICAL MODELS AND COMPLETENESS

All of the historical relational data models and languages that have been proposed differ from one another in the set of query operators that they provide. In addition, they often differ in the structure of the historical relations they specify, that is, the way in which the temporal component is incorporated into the structure. Space obviously precludes an analysis of all of these models with respect to our two notions of completeness. Since we have two orthogonal characteristics to describe these models and their languages (grouped or ungrouped, algebra or calculus), we decided to discuss four models, each covering one of the four possibilities. Two of the data models we discuss are ungrouped, one with an algebra [Lorentzos and Johnson 1987] and the other with a calculus [Snodgrass 1987]; we therefore investigate whether or not they are TU-Complete. The other two data models discussed are grouped, one with an algebra [Clifford and Croker 1987] and the other with both an algebra and a calculus [Gadia 1988], so we investigate whether or not they are TG-Complete.

We have earlier motivated our choice of $L_h$ and $TC$ as appropriate languages to use for our notions of completeness. Therefore, in this section a data model is said to be complete with respect to $M_{TG} = (TG, L_h)$ (or $M_{TU} = (TU, TC)$) if it is strongly complete with respect to $M_{TG} = (TG, L_h)$ (or $M_{TU} = (TU, TC)$). Although by our definitions we should, strictly speaking, refer to completeness with respect to the data models, we will generally speak more specifically about their languages and apply the term loosely to them. For each of the historical query languages discussed in the following, therefore, we consider first its completeness with respect to either of $L_h$ and $TC$ and vice versa. We shall see that $L_h$ and $TC$ are complete with respect to all of the languages we consider, a fact that lends further support to their use as the standards for TG-Completeness and TU-Completeness.

We begin with a discussion of the completeness of the historical relational algebra specified by the historical relational data model HRDM [Clifford and Croker 1987]. We discuss this language first both because the TG model defined in Section 2 is derived directly from the structure of the historical relations in HRDM and because the set of operators specified by this model was initially intended to provide all of the functionality thought useful and desirable.

7.1 HRDM

The historical relational data model HRDM presented by Clifford and Croker [1987] is a temporally grouped historical data model with an algebraic query language that is presented as an extension to the standard relational algebra. We can categorize the operators of HRDM as follows:

Set-Theoretic. These operators are defined in terms of the set characteristics of relations and include the standard set operators union ($\cup$), intersection ($\cap$), set difference ($-$), and Cartesian product ($\times$). Because these set operators do not exploit the historical aspects of HRDM relations, the standard mappings from these operators in relational algebra to their counterparts in relational calculus also apply to these operators here. For example,

$r \cup s = \{x \mid x \in r \vee x \in s\}$

$[e.* : t]\, r(e) \wedge t \in e.l \vee s(e) \wedge t \in e.l.$

Attribute Based. This category includes those operators that are defined in terms of the attributes (or their values) of a relation. Some of these operators are derived from similar operators that exist in the standard relational algebra. As shown below, as suggested by their names, the original definition of these operators has often been modified to exploit the temporal component of the historical model. For each of these operators, we give both its set-theoretic definition and then an equivalent $L_h$-based expression.

(1) Project ($\pi$). This operator is equivalent in definition to its standard relational counterpart and has the effect of reducing the set of attributes over which each of the tuples $x$ in its operand, a relation $r$, is defined, to those attributes contained in a set of attributes $X$.

$\pi_X(r) = \{x(X) \mid x \in r\}$

$[e.X : t]\, r(e) \wedge t \in e.l$

(2) Select-if ($\sigma$-IF). This variant of the select operator selects from a relation r those tuples x, each of which for some period within its lifespan has a value for a specified attribute A that satisfies a specified selection criterion. The period of time within the lifespan is specified by a lifespan parameter L. The selection criterion is specified as $A\,\theta\,a$, where $\theta$ is a comparator and a is a constant. (It is also possible to compare one attribute with another in the same tuple.) A parameter, Q, of the select-if operator is used to denote a quantifier that specifies whether the selection criterion must be satisfied for all ($\forall$) times in the specified subset of the tuple's lifespan, or whether there exists ($\exists$) at least one such time.

$$ \sigma\text{-IF}_{(A\,\theta\,a,\; Q,\; L)}(r) = \{x \in r \mid Q(t \in (L \cap x.l))\,[x.A(t)\,\theta\,a]\} $$
$$ \equiv [e.* : t]\; r(e) \land t \in e.l \land \exists t_1 (t_1 \in L \land t_1 \in e.l \land e.A(t_1)\,\theta\,a) \quad (\text{if } Q \text{ is } \exists) $$
$$ \equiv [e.* : t]\; r(e) \land t \in e.l \land \lnot\exists t_1 (t_1 \in L \land t_1 \in e.l \land \lnot\, e.A(t_1)\,\theta\,a) \quad (\text{if } Q \text{ is } \forall) $$

ACM Transactions on Database Systems, Vol. 19, No. 1, March 1994.

(3) Select-when ($\sigma$-WHEN). This operator is similar to the $\exists$-quantified select-if operator. However, the lifespan of each selected tuple is restricted to those times when the selection criterion is satisfied.16

$$ \sigma\text{-WHEN}_{A\,\theta\,a}(r) = \{x \mid \exists x' \in r\,[x.l = \{t \mid x'.A(t)\,\theta\,a\} \land x.v = x'.v|_{x.l}]\} $$
$$ \equiv [e.* : t]\; r(e) \land t \in e.l \land e.A(t)\,\theta\,a $$

16 The notation $f|_l$ in this definition, used in HRDM, is the standard notation for denoting the restriction of the domain of the function f to the set l.

(4) $\theta$-Join. Like its counterpart in the standard relational data model, this operator combines tuples from its two operand relations. With $\theta$-join two tuples are combined when two attributes, one from each tuple, have values at some time in the intersection of the tuples' lifespans that stand in a $\theta$ relationship with each other. The lifespan of the resulting tuple is exactly those times when this relationship is satisfied. Let $r_1$ and $r_2$ be relations on schemes $R_1$ and $R_2$, respectively, where $A \in R_1$ and $B \in R_2$ are attributes.

$$ r_1[A\,\theta\,B]r_2 = \{e \mid \exists e_{r_1} \in r_1, \exists e_{r_2} \in r_2\,[e.l = \{t \mid e_{r_1}(A)(t)\,\theta\,e_{r_2}(B)(t)\} \land e.v(R_1) = e_{r_1}.v(R_1)|_{e.l} \land e.v(R_2) = e_{r_2}.v(R_2)|_{e.l}]\} $$
$$ \equiv [e_1.*, e_2.* : t]\; r_1(e_1) \land r_2(e_2) \land t \in e_1.l \land t \in e_2.l \land e_1.A(t)\,\theta\,e_2.B(t) $$

(5) Static time-slice ($T_{@L}$). This operator reduces a historical relation in the temporal dimension by restricting the lifespan of each tuple e of the operand relation r to those times in the set of times L.

$$ T_{@L}(r) = \{e \mid \exists e' \in r\,[l = L \cap e'.l \land e.l = l \land e.v = e'.v|_{l}]\} $$
$$ \equiv [e.* : t]\; r(e) \land t \in e.l \land t \in L $$
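The attribute-based operators above can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the HRDM definition itself: time is modeled as integers, a tuple is a dictionary with a lifespan `l` and per-attribute histories `v`, every attribute is assumed to be defined over the whole lifespan, and tuples whose lifespan becomes empty are dropped. The helper names (`make_tuple`, `restrict`, `select_if`, `select_when`, `time_slice`) are hypothetical.

```python
import operator

def make_tuple(lifespan, **histories):
    """Build a grouped tuple: lifespan 'l' plus attribute -> {time: value} in 'v'.
    Assumption: each history is defined at every time in the lifespan."""
    l = frozenset(lifespan)
    return {"l": l, "v": {a: {t: h[t] for t in l} for a, h in histories.items()}}

def restrict(x, times):
    """Restrict the lifespan and every attribute history of x to `times`."""
    l = x["l"] & frozenset(times)
    return {"l": l, "v": {a: {t: h[t] for t in l} for a, h in x["v"].items()}}

def select_if(r, attr, theta, a, quantifier, L):
    """Keep whole tuples whose attr satisfies theta(value, a) at some ('exists')
    or all ('forall') times in L ∩ x.l. An empty window satisfies 'forall'
    vacuously, matching the negated-existential form of the calculus."""
    q = any if quantifier == "exists" else all
    return [x for x in r
            if q(theta(x["v"][attr][t], a) for t in frozenset(L) & x["l"])]

def select_when(r, attr, theta, a):
    """Like existential select-if, but restrict each tuple to the times
    at which the criterion holds (empty results are dropped: an assumption)."""
    out = []
    for x in r:
        times = {t for t in x["l"] if theta(x["v"][attr][t], a)}
        if times:
            out.append(restrict(x, times))
    return out

def time_slice(r, L):
    """Static time-slice: restrict every tuple's lifespan to L."""
    return [y for y in (restrict(x, L) for x in r) if y["l"]]

# Illustrative data (not from the paper): a salary that rises at time 3.
tom = make_tuple(range(5),
                 NAME={t: "Tom" for t in range(5)},
                 SAL={0: 20, 1: 20, 2: 20, 3: 30, 4: 30})
raised = select_when([tom], "SAL", operator.ge, 30)
```

Note that `select_if` returns tuples unchanged, whereas `select_when` and `time_slice` shrink lifespans; in every case values are only ever compared at a single time point.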

7.1.1 Other Operators. In addition to the above categories of operators, the HRDM algebra includes several grouping operators that are used to restructure a relation without changing the information content of that relation. These operators, union-merge ($\cup_o$), intersection-merge ($\cap_o$), and difference-merge ($-_o$), first compute the set-theoretic union, intersection, and difference, respectively, and then regroup the tuples in the resulting relation.

The HRDM algebra also includes the operators WHEN and dynamic time-slice. We categorize the WHEN operator as an extrarelational operator in that it computes a result that is not a relation. Applied to a historical relation, this operator returns a value defined as the union of the lifespans of the tuples in that relation. This operator can be viewed as a type of temporal-based aggregate operator. The dynamic time-slice is only applicable to relations that include in their scheme an attribute A whose domain consists of partial functions from the set TIMES into itself. We do not treat such attributes in this paper, since most of the models considered distinguish between ordinary values and the times at which they hold and do not allow comparisons between them. Therefore, it would be unfair to include such an operator in our comparison.

We omit the remaining operators from our discussion of the completeness of HRDM and of the other languages that we will examine. The grouping operators are not treated because they are not intended for querying, and the aggregate operators, because they are outside of the scope of standard relational-based notions of completeness.

The translations that we have provided for each of the relation-defining operators of the HRDM algebra show that $L_h$ is complete with respect to this algebra. However, this algebra is not TG-Complete in that there are queries that are expressible in $L_h$ for which no equivalent algebraic expression (i.e., sequence of algebraic operations) exists. One example is the query on the database in Table VII for the name and department of each employee that has at some time received a cut in salary, expressible in $L_h$ as

$$ [e.\mathit{NAME}, e.\mathit{DEPT} : t]\; \mathit{EMPLOYEE}(e) \land t \in e.l \land \exists t_1 \exists t_2 (\mathit{EMPLOYEE}(e) \land t_1 \in e.l \land t_2 \in e.l \land (t_1 < t_2) \land e.\mathit{SAL}(t_1) > e.\mathit{SAL}(t_2)). $$

The lack of an equivalent algebraic expression is due to the specification of those operators in HRDM that include the comparison of two attribute values as part of their definition: the join and the various select operators. In each case only values that occur at the same point in time can be compared. (This ability seems to be what is meant by the property of supporting "a 3-D conceptual view of a historical relation" that has been cited as an intuitively necessary component of a good temporal database model (e.g., in Ariav [1986], Clifford and Tansel [1985], and McKenzie and Snodgrass [1991a]).) Thus, as required by the above query, it is not possible to compare the salary of an employee at some time $t_1$ with that employee's salary at some other point in time, $t_2$.
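To make the pay-cut query concrete, the following Python sketch evaluates it directly over temporally grouped tuples. It is an illustration under assumptions, not part of HRDM or $L_h$: each employee is a dictionary whose `SAL`, `NAME`, and `DEPT` entries are histories (`{time: value}`) over integer times, and the function names are hypothetical. The point it shows is that the test must relate salary values at two different times $t_1 < t_2$ of the same history.

```python
def had_pay_cut(history):
    """True iff there are times t1 < t2 with history[t1] > history[t2]."""
    highest_so_far = None
    for t in sorted(history):
        if highest_so_far is not None and history[t] < highest_so_far:
            return True
        if highest_so_far is None or history[t] > highest_so_far:
            highest_so_far = history[t]
    return False

def name_and_dept_of_pay_cut(employees):
    """Return the NAME and DEPT histories (over the whole lifespan) of every
    employee whose salary history contains a cut."""
    return [{"l": e["l"], "NAME": e["NAME"], "DEPT": e["DEPT"]}
            for e in employees if had_pay_cut(e["SAL"])]

# Illustrative data (not taken from Table VII).
times = frozenset({0, 1, 2})
ann = {"l": times,
       "NAME": {t: "Ann" for t in times},
       "DEPT": {0: "Toys", 1: "Toys", 2: "Shoes"},
       "SAL": {0: 30, 1: 25, 2: 40}}
bob = {"l": times,
       "NAME": {t: "Bob" for t in times},
       "DEPT": {t: "Toys" for t in times},
       "SAL": {0: 20, 1: 20, 2: 25}}
answer = name_and_dept_of_pay_cut([ann, bob])
```

An operator that can only compare `SAL` values at one and the same time point, as in the select and join operators discussed above, has no way to express the test inside `had_pay_cut`.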

7.2 The Historical Homogeneous Model of Gadia

The next historical model that we discuss is one that was proposed by Gadia [1988]; it is a model that includes a query language and an algebra. This data model, which we call TDMG (for temporal data model of Gadia), is temporally grouped. In TDMG the value of a tuple attribute is a function from a set of times to the value domain of the attribute, and the lifespan is the same for all of the attributes (Gadia's homogeneity assumption). Therefore, his data model is the same as the canonical temporally grouped historical data model TG of Section 2, and thus is the same as that of HRDM.

In addition to the historical data model, Gadia defines a historical algebra and calculus. Although the TDMG model is temporally grouped, the semantics of the historical algebra is defined in terms of the ungrouped model obtained by ungrouping temporal relations. Gadia calls this a snapshot interpretation semantics. The semantics of the historical algebra is defined by ungrouping temporal relations, because Gadia considers grouped and ungrouped models "weakly equal" and does not distinguish between them when he proves the equivalence of his algebra and calculus. In terms of our discussion on completeness in Section 3, Gadia's mapping from his grouped model to his ungrouped model is not a one-to-one mapping; unlike our mapping into $TC_g$, Gadia's $\Omega$ mapping ignores grouping.

Gadia's (ungrouped) algebra is defined as follows: He starts with the five standard relational operators (selection, projection, difference, Cartesian product, and union), as TA does. He also defines derived temporal operators, such as join, intersection, negation, and renaming. In addition, he defines temporal expressions for the temporal domain. Finally, he combines relational and temporal expressions by considering relational expressions of the form e(u), where e and u are relational and temporal expressions, respectively.

TC is complete with respect to Gadia's algebra for the following reasons: The five standard temporal operators are defined as for TA and, therefore, can be expressed in TC. Temporal expressions are defined as a closure of time intervals over the operations of union, intersection, difference, and negation. Each of these operators can be expressed in the first-order logic with explicit references to time. For example, the expression $\mathit{tdom}(r(A, B)) \cup \mathit{tdom}(s(A, B))$ in TDMG can be defined in TC as

$$ \{t \mid (\exists x)(\exists y)(r(x, y, t) \lor s(x, y, t))\}. $$

This means that every query in TDMG can be expressed in TC.

Gadia also defines a historical calculus and shows its equivalence to the algebra (modulo temporal grouping). This calculus is expressible in $L_h$ for the same reasons that the ungrouped algebra is expressible in TC. A lifespan of a temporal tuple x in TDMG can be captured with the expression $t \in x.l$ in $L_h$. Also, the operators of union, intersection, difference, and negation for temporal expressions can be expressed in $L_h$ with the same methods that are used to express algebraic expressions in TC, since $L_h$ explicitly supports time.

The temporally grouped language $L_h$ has strictly more expressive power than Gadia's calculus; that is, this calculus is not TG-Complete. Also, the temporally ungrouped language TC is strictly more powerful than Gadia's algebra; that is, the algebra is not TU-Complete. The reason for this lack of completeness is the same as for HRDM: It is not possible to compare the value of one attribute at time $t_1$ with the value of another or the same attribute at some other time $t_2$. For example, the query of the previous section, asking for the name and department of each employee that has at some time received a cut in salary, cannot be expressed in TDMG.

7.3 TQuel

TQuel

is the query language component of a historical relational data model proposed by Snodgrass [1987]. We shall call this model TRDM.

TRDM provides for two types of historical relations. One, called an interval relation, is derived from a standard relation through the addition of two temporal attributes, valid-from and valid-to, both of whose domains are the set of times T. (An example of such a relation has already been given in Table III.) As before, we ignore the two TRANS-TIME temporal attributes since we are only considering historical data models. Thus, we will view TRDM as a temporally ungrouped historical data model. The values of the nontemporal attributes of a tuple in such a relation are considered to be valid during the interval of time starting at the beginning of the valid-from value and ending at, but not including, the valid-to value. (This interval thus denotes the lifespan of the tuple.)

The second type of relation, an event relation, is defined by extending a standard relation by a single temporal attribute valid-at. Since both interval and event relations are derived from 1NF relations through the addition of attributes whose values are atomic, they are also in 1NF.

The query language TQuel is an extended relational calculus derived from and defined as a superset of Quel, the query language of the Ingres relational database management system [Stonebraker et al. 1976]. TQuel extends Quel by adding temporal-based clauses that accommodate the valid-from and valid-to attributes. (These attributes are not visible to the existing components of the Quel language.) A WHEN clause is added to define an additional temporal-based selection constraint that must be satisfied in conjunction with the constraint specified by the TQuel (and Quel) WHERE clause. This constraint, defined as a temporal predicate over a set of tuple valid-from to valid-to intervals (lifespans), defines a restricted set of relationships that must hold among them. A VALID clause is used to define, in terms of temporal expressions, valid-from and valid-to values for tuples in the relation resulting from the TQuel statement. Both temporal predicates and temporal expressions have a semantics that is expressible in terms of the standard tuple calculus [Snodgrass 1987].17

17 This specification also includes the use of several auxiliary functions that are used to compare times, in order to determine which of two times occurs first or last. Strictly speaking, the set of these functions described in Snodgrass [1987] is not complete, but it is easily extendible to a complete set.

TQuel is complete with respect to TC, and vice versa, since the semantics of TQuel, like that of Quel [Ullman 1988], can be expressed in terms of the standard relational calculus, with which TC is clearly strongly equivalent. In particular, Snodgrass shows how any TQuel query can be expressed as a formula of the form $\Omega \land \Gamma \land \Phi$, where $\Omega$, $\Gamma$, and $\Phi$ are the calculus formulas for the underlying Quel statement, the TQuel WHEN clause, and VALID clause, respectively, and where $\Gamma$ and $\Phi$ contain no quantifiers. Additionally, $\Gamma$ and $\Phi$ are defined only over the temporal attributes valid-from and valid-to, neither of which may be included in $\Omega$. The structure of this formula means that, as with Quel, not all algebraic expressions can be expressed as a single TQuel statement (e.g., expressions containing a union algebraic operator). If none of the nontemporal attributes defined over the relations in a TRDM database has a domain whose values are comparable to those in the set of times T, then in no algebraic expression over the relations in this database can such an attribute be compared to either valid-from or valid-to. For such a database, TQuel statements, as represented by a defining tuple calculus formula, are no more restrictive than Quel statements. Therefore (as with Quel), a sequence of TQuel statements can express any algebraic expression, perhaps by creating temporal relations and by using statements such as APPEND and DELETE.

Although interval relations and event relations are distinguished by TQuel, they are standard 1NF relations that provide a fixed way of encoding temporal data using the temporal attributes. TQuel differs from Quel only in the distinction accorded these attributes. Thus, like Quel, with the addition of such statements as APPEND, it is complete in the sense defined by Codd. By extension, it is TU-Complete, but like all ungrouped models, as a result of the use of the temporal attributes, it does not exhibit temporal value integrity.

Note that the query on the database in Table VII for the name and department of each employee that has at some time received a cut in salary, expressible in $L_h$ as

$$ [e.\mathit{NAME}, e.\mathit{DEPT} : t]\; \mathit{EMPLOYEE}(e) \land t \in e.l \land \exists t_1 \exists t_2 (\mathit{EMPLOYEE}(e) \land t_1 \in e.l \land t_2 \in e.l \land (t_1 < t_2) \land e.\mathit{SAL}(t_1) > e.\mathit{SAL}(t_2)), $$

is also expressible (again, ignoring transaction times) in TRDM as follows:

    range of e1 is EMPLOYEE
    range of e2 is EMPLOYEE
    retrieve into SalChange(e1.NAME, e1.DEPT)
    valid from begin of e1 to end of e1
    where e1.NAME = e2.NAME and e2.SAL < e1.SAL
    when (end of e1) precede (begin of e2)

Further note that an algebra has been proposed that provides a procedural equivalent to the TRDM calculus [McKenzie and Snodgrass 1991b]. Although it employs a different data model from that in TRDM (in fact, its model is N1NF), it is not a grouped model and does not support grouping.
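As a rough illustration of how such a query reads over an interval relation, the following Python sketch mimics the self-join performed by the statement above. It is not TQuel and makes two labeled assumptions: tuple intervals are half-open `[valid_from, valid_to)` over integer times, and `precede` is read as "the end of e1 is no later than the beginning of e2". The names `Emp` and `sal_change` are hypothetical.

```python
from typing import NamedTuple

class Emp(NamedTuple):
    name: str
    dept: str
    sal: int
    valid_from: int
    valid_to: int  # exclusive end of the validity interval (assumption)

def sal_change(employee):
    """Pairs (e1, e2) of the same employee where e2 is a later, lower-paid
    period; the result carries e1's name, department, and validity interval."""
    result = set()
    for e1 in employee:
        for e2 in employee:
            same_person = e1.name == e2.name
            lower_pay = e2.sal < e1.sal
            e1_precedes_e2 = e1.valid_to <= e2.valid_from  # assumed reading of "precede"
            if same_person and lower_pay and e1_precedes_e2:
                result.add((e1.name, e1.dept, e1.valid_from, e1.valid_to))
    return result

# Illustrative data (not from the paper's tables).
employee = [
    Emp("Ann", "Toys", 30, 0, 3),
    Emp("Ann", "Shoes", 25, 3, 6),
    Emp("Bob", "Toys", 20, 0, 6),
]
cuts = sal_change(employee)
```

Because each salary period is a separate tuple, the comparison across times becomes an ordinary join condition between two tuple variables, which is what makes the query expressible in an ungrouped model.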

7.4 The Temporal Relational Algebra of Lorentzos and Johnson

The final historical data model that we discuss is one that was proposed by Lorentzos and Johnson [1987]. This data model, called TRA, is essentially the same as that in Snodgrass [1987], except that as a historical model it is restricted to only one temporal dimension. Two of the stated goals of TRA are that "no new elementary relational algebra operations are introduced and first normal form is maintained" [Lorentzos and Johnson 1987, p. 99]. Typical relations in this model appear basically as in Table III (with the columns valid-from and valid-to called Sfrom and Sto, respectively).

Although the structures of relations in this model are essentially the same as in the historical version of TRDM, we discuss this model here because, unlike that in Snodgrass [1987], the language that it proposes is an algebra rather than a calculus. It is difficult to discuss the algebra of TRA formally because it is not specified formally. Rather, it is presented via a series of example queries and discussions. Nevertheless, enough of a picture of the algebra emerges clearly through these examples to make a discussion possible.

Two new operators, FOLD and UNFOLD, are defined. These operators essentially convert between the time-interval representation (as in Table III) and a time-point representation (as in Table I). FOLD and UNFOLD are clearly expressible in terms of operators in the standard relational algebra, as Lorentzos and Johnson [1987] point out.

The previous sections have demonstrated that two other algebras, that of HRDM and TDMG, were incomplete because they were not able to compare the value of one attribute at a time $t_1$ with the value of another (or the same) attribute at some other time $t_2$. In TRA such comparisons are possible. Again consider the query that finds the name and department of each employee that has at some time received a cut in salary:

$$ [e.\mathit{NAME}, e.\mathit{DEPT} : t]\; \mathit{EMPLOYEE}(e) \land t \in e.l \land \exists t_1 \exists t_2 (\mathit{EMPLOYEE}(e) \land t_1 < t_2 \land e.\mathit{SAL}(t_1) > e.\mathit{SAL}(t_2)). $$

This query can be expressed in TRA as follows: First, UNFOLD the interval relation EMPLOYEE into all of its time points:

$$ \mathit{EMPLOYEE}_{U1} = \mathrm{UNFOLD}[\mathit{Time}, \mathit{Start}, \mathit{Stop}](\mathit{TIME}, \mathit{EMPL}). $$

Then, $\theta$-join this relation with itself, joining tuples with the same name and with a pay cut, and then project just the names of the employees from the result (here, NAME1 and NAME2, etc., refer to the NAME attributes in the first and second operands to the join):

$$ \mathit{TEMP1} = \mathit{Employee}_{U1}\,[\mathit{NAME1} = \mathit{NAME2},\; \mathit{TIME1} < \mathit{TIME2},\; \mathit{SAL1} > \mathit{SAL2}]\,\mathit{Employee}_{U1}, $$
$$ \mathit{TEMP2} = \pi_{\mathit{NAME1}}(\mathit{TEMP1}). $$

Finally, join the result with the original relation, and project onto the desired fields.

Because TRA is equivalent to standard relational algebra, the question of its TU-Completeness, as in the case of TRDM, is reduced to the question of the completeness of relational algebra. Therefore, we conclude that TRA is TU-Complete, but like all ungrouped languages, it does not exhibit temporal value integrity.
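The UNFOLD, join, and project steps described in this section can be sketched as follows. This is a minimal Python illustration under assumptions, not the TRA operators as defined by Lorentzos and Johnson: rows are plain tuples `(name, dept, sal, start, stop)` with a half-open interval over integer times, and the function names as well as the form of the final join and projection are my own reading of the informal description.

```python
def unfold(rows):
    """Interval rows (name, dept, sal, start, stop) -> time-point rows
    (name, dept, sal, time); stop is treated as exclusive (assumption)."""
    return {(n, d, s, t) for (n, d, s, start, stop) in rows
            for t in range(start, stop)}

def fold(points):
    """Inverse direction: coalesce consecutive time points that carry the same
    data values into maximal half-open intervals."""
    by_value = {}
    for (*vals, t) in points:
        by_value.setdefault(tuple(vals), []).append(t)
    out = set()
    for vals, times in by_value.items():
        times.sort()
        start = prev = times[0]
        for t in times[1:]:
            if t != prev + 1:          # gap: close the current interval
                out.add(vals + (start, prev + 1))
                start = t
            prev = t
        out.add(vals + (start, prev + 1))
    return out

def pay_cut_names(rows):
    """Self-join of the unfolded relation on equal names, earlier time,
    and higher earlier salary; then project the name."""
    u = unfold(rows)
    return {n1 for (n1, _, s1, t1) in u for (n2, _, s2, t2) in u
            if n1 == n2 and t1 < t2 and s1 > s2}

def pay_cut_query(rows):
    """Join the names back with the original relation and keep name,
    department, and interval (assumed choice of 'desired fields')."""
    names = pay_cut_names(rows)
    return {(n, d, start, stop) for (n, d, _, start, stop) in rows if n in names}

# Illustrative data (not from the paper's tables).
rows = [("Ann", "Toys", 30, 0, 3), ("Ann", "Shoes", 25, 3, 6), ("Bob", "Toys", 20, 0, 6)]
result = pay_cut_query(rows)
```

Every step uses only set comprehension, selection, and projection over flat tuples, which mirrors the observation that TRA stays within standard relational algebra.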

The results of our explorations into the completeness of these five languages are summarized in Table XII.

8. SUMMARY AND CONCLUSIONS

In this paper we have explored the question of completeness of languages for historical database models. In this exploration we were led to characterize such models as being of one of two different types, either temporally grouped or temporally ungrouped. We first discussed these notions informally by means of example databases and queries, and showed that the two models were not equivalent. The difference between the two models is that, in temporally grouped models, historical values (like salary histories) are treated as first-class objects that can be referred to directly in the query language. In the temporally ungrouped models, no such direct reference is permitted. We have characterized this property of the grouped models as temporal value integrity.

We then proceeded to define the two concepts of weak completeness and strong completeness between two data models with different representation paradigms and different query languages. In the case of weak completeness, there is a correspondence mapping from the relations of the reference model to the comparison model, and a mapping on the query language that preserves the meaning of a query. The problem with weak equivalence is that different relations in the reference model can be mapped to the same relation in the comparison model, and so information, for example, grouping, can be lost. In the case of strong completeness, the correspondence mapping must be one-to-one, and hence, there is no loss of information.

For the ungrouped models, we have defined three different languages, TL, TC, and TA (a temporal logic, a logic with explicit reference to time, and a temporal algebra), and have motivated our choice for TC as the basis for TU-Completeness. Any one of the three can serve as the basis for TU-Completeness. An ungrouped model is said to be TU-Complete if it is strongly complete with respect to $M_{TU} = (TU, TC)$. For the grouped models, we have defined the calculus $L_h$, a many-sorted

logic with variables over ordinary values, historical values, and times. We have proposed $L_h$ as the basis for TG-Completeness. A grouped model is said to be TG-Complete if it is strongly complete with respect to $M_{TG} = (TG, L_h)$.

We then proceeded to explore more formally the relationship between ungrouped and grouped models. We have demonstrated a technique for extending the ungrouped model with a grouping mechanism, a group identifier. With this mechanism we have shown how the ungrouped model TU and the language TC could be extended to $TU_g$ and $TC_g$ in such a way as to make the resulting model equivalent in power to TG with $L_h$. In this way we have demonstrated that the grouped and ungrouped models differ only with respect to the grouping capability. More precisely, we have proved that the model $M_{TU} = (TU, TC)$ is weakly equivalent and the model $M_{TU_g} = (TU_g, TC_g)$ is strongly equivalent to the model $M_{TG} = (TG, L_h)$.

Table XII. Summary of Completeness Results

Language        Reference                      Type        Completeness
L_h             Section 5                      grouped     Basis for TG-Completeness
TC              Section 4                      ungrouped   Basis for TU-Completeness
HRDM algebra    [Clifford and Croker 1987]     grouped     Not TG-Complete
TDMG calculus   [Gadia 1988]                   grouped     Not TG-Complete
TDMG algebra    [Gadia 1988]                   ungrouped   Not TU-Complete
TQuel           [Snodgrass 1987]               ungrouped   TU-Complete
TRA             [Lorentzos and Johnson 1987]   ungrouped   TU-Complete

Finally, we have examined several historical relational proposals to see whether they were TU-Complete or TG-Complete. We looked at four historical models, two grouped and two ungrouped, offering five different languages. In the ungrouped models, we have found both an algebra (from TRA) and a calculus (TQuel from TRDM) that are TU-Complete, whereas in the grouped models, we found, apart from our metric, the complete calculus $L_h$, two languages that are not TG-Complete: an algebra (from HRDM) and a calculus (from TDMG), as well as an algebra (from TDMG) (which operates on ungrouped versions of grouped relations) that is not TU-Complete. We believe that this classification scheme and our examination of the completeness of several historical models should help to explicate the differences and the commonalities between the various models proposed in the literature. As with the relational model, a baseline notion of completeness of query languages, although imperfect (e.g., relationally complete languages do not allow for transitive closure queries or support aggregates), nonetheless provides a minimum and reasonable metric with which to compare a variety of different

languages.

One point bears emphasizing. It has on occasion been said that the issue of adding time to relational databases is an uninteresting one, since the user can always add whatever extra attributes are desired (e.g., Start-Time and End-Time) and then use standard SQL (or relational algebra) as the query language. In our discussion of the completeness of the ungrouped temporal languages, we, to some extent, have relied on the underlying point of this argument. For example, this point underlies our argument that TRA (which is equivalent to standard relational algebra) is TU-Complete. Two points need to be made in reply to this comment. First, there is a difference between the formal notion of completeness and the informal, but no less important, notion of ease of use. Even though the programming language C is formally equivalent to a Turing Machine, it is a lot more convenient to use C if one is writing an operating system because of its high-level features. The built-in temporal features of the temporal and historical data models make them easier to use for managing temporal data; without these built-in features a greater burden is placed upon the user. Second, this paper has shown that the grouped models and languages are more expressive than their corresponding ungrouped models, unless these models add a surrogate grouping mechanism. This grouping mechanism, itself, is a higher-level construct that is implicit in the grouped systems (and this, we argue, makes them more convenient), but needs to be made explicit in the ungrouped systems for them to be equivalent in expressive power.

There are a few interesting areas for future research that this work has clarified. The first relates to our grouping axioms (in Section 6). It might seem that they are rather strong, perhaps stronger than necessary for simulating temporal grouping in a temporally ungrouped model like TU. Clearly, in order to have an isomorphism between two structural models, the $\Omega$ mapping and the $\Gamma$ mapping on queries must go hand in hand. It is an area for additional research whether our $\Omega_{TU}$ could be simplified, most likely at the expense of complicating the mapping on queries.

Another area of interest arises when it is noted that we did not find here, nor are we aware of any work on, a complete algebra for grouped historical data models. Such an algebra is clearly needed.

Another area in which there continues to be interest is in the support of evolving schemata. Our decision not to treat this interesting area here was based largely on the fact that hardly any of the models in the literature incorporate this feature, and we wanted to choose the common denominator of all the models in order to make our comparisons fairly. The model in Clifford and Croker [1987] addressed this issue, and other work (e.g., Banerjee et al. [1987] and McKenzie and Snodgrass [1990]) continues to be done in this area.

Finally, we would like to address the question of completeness for temporal as opposed to historical relational models (in the terminology of Snodgrass and Ahn [1985]). We believe that our results on grouped and ungrouped historical relational completeness can be extended in a straightforward way to temporal data models and languages. The extension would involve the addition of another sort (for transaction times). In ungrouped temporal models, relations would be extended with an additional column to stamp every tuple with its transaction time, and the language would have constants, as well as variables, and quantification for this sort. In grouped temporal models, values would be extended to be doubly indexed; they would most likely be better modeled as functions from a transaction time into functions from a data time to a scalar value, but the order of the two temporal indices could be reversed. Preliminary work that we have done on Indexical Databases [Clifford 1992] holds promise for a unified treatment, not only of these two temporal dimensions, but of spatial, or other, dimensions as well.
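The remark that the order of the two temporal indices could be reversed can be illustrated with a small Python sketch. It is only an illustration under assumptions: a doubly indexed value is modeled as a nested dictionary from transaction time to a history over data time, both over integers, and the function name `swap_indices` is hypothetical.

```python
def swap_indices(value):
    """Turn {outer_time: {inner_time: v}} into {inner_time: {outer_time: v}}.
    No information is lost, so applying it twice restores the input
    (provided no inner history is empty)."""
    swapped = {}
    for outer, history in value.items():
        for inner, v in history.items():
            swapped.setdefault(inner, {})[outer] = v
    return swapped

# Illustrative salary value: at transaction time 10 the database recorded a
# salary of 20 for data times 1 and 2; at transaction time 11 the salary for
# data time 2 was corrected to 25.
by_transaction = {10: {1: 20, 2: 20}, 11: {1: 20, 2: 25}}
by_data_time = swap_indices(by_transaction)
```

Either nesting carries the same content; the choice only determines which dimension is grouped first.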

ACKNOWLEDGMENTS

The authors would like to thank the reviewers, and also Jan Chomiki and Fabio Grandi, for their valuable comments, which have helped to improve the contents and presentation of this paper. We would also like to thank Rick Snodgrass for ongoing and fruitful discussions that have helped to clarify many of the ideas presented here.

REFERENCES

AHO, A. V., AND ULLMAN, J. D. 1979. Universality of data retrieval languages. In ACM Symposium on Principles of Programming Languages. ACM, New York, pp. 111-120.
ARIAV, G. 1986. A temporally oriented data model. ACM Trans. Database Syst. 11, 4 (Dec.), 499-527.
ARIAV, G., AND CLIFFORD, J. 1986. Temporal data management: Models and systems. In New Directions for Database Systems, G. Ariav and J. Clifford, Eds. Ablex, Norwood, N.J., pp. 168-185.
BANCILHON, F. 1978. On the completeness of query languages for relational databases. In Proceedings of the 7th Symposium on Mathematical Foundations of Computing. Springer-Verlag, New York, pp. 112-123.
BANERJEE, J., KIM, W., KIM, H.-J., AND KORTH, H. F. 1987. Semantics and implementation of schema evolution in object-oriented databases. In Proceedings of ACM SIGMOD Conference (San Francisco, Calif.). ACM, New York, pp. 311-322.
BEN-ZVI, J. 1982. The time relational model. Ph.D. thesis, Computer Science Dept., Univ. of California at Los Angeles.
CHANDRA, A. K., AND HAREL, D. 1980. Computable queries for relational data bases. J. Comput. Syst. Sci. 21, 2 (Oct.), 156-178.
CLIFFORD, J. 1982. A model for historical databases. In Proceedings of Workshop on Logical Bases for Data Bases (Toulouse, France, Dec.). ONERA-CERT, Toulouse, France.
CLIFFORD, J. 1992. Indexical databases. In Proceedings of Workshop on Current Issues in Database Systems (Newark, N.J., Oct.). Rutgers Univ., New Brunswick, N.J.
CLIFFORD, J., AND CROKER, A. 1987. The historical relational data model HRDM and algebra based on lifespans. In Proceedings of the 3rd IEEE International Conference on Data Engineering (Los Angeles, Calif.). IEEE, New York, pp. 528-537.
CLIFFORD, J., AND TANSEL, A. U. 1985. On an algebra for historical relational databases: Two views. In Proceedings of ACM SIGMOD Conference (Austin, Tex., May). ACM, New York, pp. 247-265.
CLIFFORD, J., AND WARREN, D. S. 1983. Formal semantics for time in databases. ACM Trans. Database Syst. 8, 2 (June), 214-254.
CODD, E. F. 1972. Relational completeness of data base sublanguages. In Data Base Systems, R. Rustin, Ed. Prentice-Hall, Englewood Cliffs, N.J.
DATE, C. J. 1983. An Introduction to Database Systems, Vol. II. Addison-Wesley, Reading, Mass.
ENDERTON, H. B. 1972. A Mathematical Introduction to Logic. Academic Press, New York.
FISCHER, P. C., AND VAN GUCHT, D. 1985. Determining when a structure is a nested relation. In International Conference on Very Large Databases. pp. 171-180.
GABBAY, D. 1989. The declarative past and imperative future: Executable temporal logic for interactive systems. In Proceedings of Colloquium on Temporal Logic in Specification, B. Banieqbal, H. Barringer, and A. Pnueli, Eds. Lecture Notes in Computer Science, vol. 398, Springer-Verlag, New York, pp. 402-450.
GADIA, S. K. 1986. Toward a multihomogeneous model for a temporal database. In Proceedings of the 2nd IEEE International Conference on Data Engineering (Los Angeles, Calif., Feb.). IEEE, New York.
GADIA, S. K. 1988. A homogeneous relational model and query languages for temporal databases. ACM Trans. Database Syst. 13, 4, 418-448.
HALL, P., OWLETT, J., AND TODD, S. J. P. 1976. Relations and entities. In Modelling in Data Base Management Systems, G. M. Nijssen, Ed. North-Holland, Amsterdam.

HALMOS, P. 1960. Nazue Set Theory. Van Nostrand, Princeton, N.J. JONES, S., AND MASON, P. J. 1980. Handling the time dimension in a data base. In Proceecilngs of the International Conference of Data Bases (Heyden, U. K., July). British Computer Society, London, pp. 65-83. KABANZA, F., STEVENNE, J.-M., AND WOLPER, P. 1990. Handling infinite temporal data. In Proceedings

of the

9th

ACM

Symposium

York, pp. 392-403. KAMP,H. 1971. Formal properties of ‘now’.

on Principles Z’heorza

of Database

Systems,

ACM, New

37, 3, 227-273.

KAMP, H. 1968. On the tense logic and the theory of order. Ph.D. thesis, Philosophy Dept., Univ. of California at Los Angeles. KLUG, A. 1982. Equivalence of relational algebra and relational calculus query languages having aggregate functions. J. ACM 29, 3 (July), 699-717. Logic of Programs. EATCS Monographs on Theoretical Computer KROGER, F. 1987. Temporal Science, vol. 8, Springer-Verlag, New York. LORENTZOS,R. G., AND JOHNSON, N. A. 1987. TRA: A model for a temporal relational algebra. of the Conference on Temporal Aspects zn Information Systems, AJ?CET, pp. In Proceedings 99-112. Databases. Computer Science Press, Rockville, Md. MAIER, D. 1983. The Theory of Relational Rec. 15, 4 (Dec.), MCKRNZIE, E. 1986. Bibliography: Temporal databases. ACM SIGMOD 40-52. MCKENZIE, E., AND SNODGRASS,R. 1990. Schema evolution and the relational algebra, Inf, Syst. 15, 2 (June), 207-232. MCKENZIE, E., AND SNODGRASS,R. 1991b, Supporting valid time in an historical relational algebra: Proofs and extensions. Tech. Rep. TR-9 1-15, Dept. of Computer Science, Univ. of Arizona, Tucson, Aug. M(;KENZIE, E., AND SNODGRASS,R. 1991a, An evaluation of relational algebras incorporating Sure. 23, 4 (Dec.), 501-543. the time dimension in databases. ACM Comput. NAVATHE, S. B., AND AHMED, R. 1989. A temporal relational model and a query language. Inf. Sci. 49, 2, 147-175. Press, Cambridge, QuINti, W. V. O. 1953. From a Logzcal Point of Vzeu. Harvard University Mass. Logzc. Springer-Verlag, New York. RESCH~R, N., AND URQuHARr, A. 1971. Temporal ROTH, M. A., KORTH, H., AND SILBERSCHATZ,A. 1988. Extended algebra and calculus for nested Syst, 13, 4 (Dee), 388–417. relational databases. ACM Trans. Database J. 33, 1 SARI)A, N. L. 1990. Algebra and query language for a historical data model, Comput, (Feb.) 11-18. SEGEV, A., AND SHOSHANI, A. 1987. Logical modeling of temporal data. In Proceed~ngs of ACM SIGMOD Conference (San Francisco, Calif., May). ACM, New York, pp. 454-466. Syst 12, 2 SNODGRASS,R. 
1987. The temporal query language TQuel, ACM Trans. Database (June), 247-298. Rec. SNODGRASS,R. 1990. Temporal databases: Status and research directions. ACM SIGMOD 19, 4 (Dec.), 83-89. of ACM SNODGRASS,R., ANDAHN, I. 1985. A taxonomy of time in databases. In Proceedings SIGMOD Conference. ACM, New York, pp. 236-246. SNODGRASS,R., GOM~Z, S., AND MCKENZIE, E. 1989. Aggregates in the temporal query language tquel. Tech. Rep TR-89-26, Dept of Computer Science, Univ. of Arizona, Tucson, Nov. Rec. 20, 1 (Mar.), SOO, M. D. 1991. Bibliography on temporal databases. ACM SZGMOD 14-23. Eng. 7, 4 STAM, R., AND SNODGRASS,R. 1988. A blbhography on temporal databases. Database (Dec.),

231-239.

STONEBRARER,M., WONG, E., KREFS, P., mm HELD, G. 1976. The design and implementation of Syst. 1, 3 (Sept.), 189-222, Ingres. ACM Trans. Database TANSEL, A., AND GARNE’rT, L. 1992. On Roth, Korth, and Silberschatz’s extended algebra and S.yst. 17, 2 (June), 374–383. calculus for nested relational databases. ACM Trans. Database TANSEL, A., CLIFFORD, J., GADIA, S., JA,JODIA, S., SEGEV, A., AND SNODGRASS,R. ErIs. 1993. Temporal Databases. Benjamin/Cummings, Menlo Park, Calif. ACM Transactions

on Database Systems, Vol.

19, No.

1, March

1994.

116

.

J. Clifford et al.

Received January 1989; revised December 1991 and December 1992; accepted December 1992

ACM Transactions on Database Systems, Vol. 19, No. 1, March 1994.