On Completeness of Historical Relational Query Languages

JAMES CLIFFORD
New York University
ALBERT CROKER
City University of New York
and
ALEXANDER TUZHILIN
New York University
Numerous proposals for extending the relational data model to incorporate the temporal dimension of data have appeared in the past several years. These proposals have differed considerably in the way that the temporal dimension has been incorporated both into the structure of the extended relations of these temporal models and into the extended relational algebra or calculus that they define. Because of these differences, it has been difficult to compare the proposed models and to make judgments as to which of them might in some sense be equivalent or even better. In this paper we define the notions of temporally grouped and temporally ungrouped historical data models and propose two notions of historical relational completeness, analogous to Codd's notion of relational completeness, one for each type of model. We show that the temporally ungrouped models are less expressive than the grouped models, but demonstrate a technique for extending the ungrouped models with a grouping mechanism to capture the additional semantic power of temporal grouping. For the ungrouped models, we define three different languages, a logic with explicit reference to time, a temporal logic, and a temporal algebra, and motivate our choice for the first of these as the basis for completeness for these models. For the grouped models, we define a many-sorted logic with variables over ordinary values, historical values, and times. Finally, we demonstrate the equivalence of this grouped calculus and the ungrouped calculus extended with a grouping mechanism. We believe the classification of historical data models into grouped and ungrouped models provides a useful framework for the comparison of models in the literature, and furthermore, the exposition of equivalent languages for each type provides reasonable standards for common, and minimal, notions of historical relational completeness.

Categories and Subject Descriptors: H.2.1 [Database Management]: Logical Design-data models; H.2.3 [Database Management]: Languages-query languages
Authors' addresses: J. Clifford and A. Tuzhilin, Information Systems Department, Stern School of Business, New York University, New York, NY 10012; A. Croker, Statistics and Computer Information Systems, Baruch College, City University of New York, New York, NY 10010.
Permission to copy without fee all or part of this material is granted provided that the copies are not made or distributed for direct commercial advantage, the ACM copyright notice and the title of the publication and its date appear, and notice is given that copying is by permission of the Association for Computing Machinery. To copy otherwise, or to republish, requires a fee and/or specific permission.
© 1994 ACM 0362-5915/94/0300-0064 $03.50
ACM Transactions on Database Systems, Vol. 19, No. 1, March 1994, Pages 64-116.
General Terms: Languages, Theory
Additional Key Words and Phrases: Completeness, historical databases, query languages, relational model, temporal databases, temporal grouping, temporal logic
1. INTRODUCTION

Over the course of the past decade, various historical relational data models have been proposed, including Ariav [1986], Ben-Zvi [1982], Clifford and Croker [1987], Clifford and Warren [1983], Gadia [1988], Jones and Mason [1980], Lorentzos and Johnson [1987], Navathe and Ahmed [1989], Sarda [1990], Snodgrass [1987], and Tansel [1986].1 These data models are intended for those situations where there is a need for managing data as they change over time. Generally, these data models extend the standard relational data model by including a temporal component. This incorporation of the temporal dimension has taken a number of different forms. Chief among these has been the addition of an attribute, such as TIME, to a relation (the equivalent of time stamping) [Snodgrass 1987], or the inclusion of time as a more intrinsic part of the structure of a relation [Clifford and Croker 1987; Gadia 1986]. The latter approach results in what have been called non-first-normal-form relations.

Although the structures of the historical relations defined in these proposed historical relational data models differ from each other to varying degrees, whether they have the same modeling capabilities has remained a subject for debate. Moreover, because the query languages defined in these data models differ from each other in their formulations, it has remained unclear whether they provide the same capabilities for extracting various subsets of a database. In fact, so many different languages have appeared in the literature (e.g., McKenzie and Snodgrass [1991a] refers to no fewer than 12 algebras alone) that it is crucial to have some standard measure against which to compare them.

In this paper we address the issue of completeness of historical relational data models and languages. A metric for historical relational completeness can provide a basis for determining the expressive power of the query languages that have been defined as part of proposed historical relational data models.
As such, the notion of historical relational completeness can serve a role similar to that of the original notion of relational completeness first proposed by Codd [1972] and later justified as being reasonable by Bancilhon [1978] and by Chandra and Harel [1980].

In Section 2 we first address the issue of the modeling capability of the various historical data models that have been proposed. In particular, we explicate the different modeling capabilities achieved by incorporating the temporal dimension at the tuple level (by time-stamping each tuple) or at the attribute-value level (by including time as part of each value). We introduce the terms temporally ungrouped and temporally grouped to distinguish between these two approaches, respectively, and discuss the relative power of the two approaches. We then propose two canonical models to serve as the basis for our analysis of the power of query languages for these two approaches. The distinction between these two different types of models, temporally ungrouped (TU) and temporally grouped (TG), serves to structure the remainder of the paper.

1 This list is not exhaustive. For an overview of the area of time and databases, see Ariav and Clifford [1986], Snodgrass [1990], and Tansel et al. [1993]; for an ongoing bibliography on the subject, see McKenzie [1986], Stam and Snodgrass [1988], and Soo [1991].

In Section 3 we introduce weak and strong completeness for comparing the query languages of different data models. Then in Section 4 we apply these concepts of completeness separately to the temporally ungrouped and temporally grouped models. For the temporally ungrouped models, we define three different languages: a logic with explicit reference to time, a temporal logic, and a temporal algebra. We propose the logic with explicit reference to time as a standard for strong completeness, which we call TU-Completeness, for temporally ungrouped models. In Section 5 we examine the temporally grouped models and define a historical relational calculus for them; this calculus is a many-sorted logic with variables over ordinary values, historical values, and times. We propose this calculus as a standard of strong completeness, which we call TG-Completeness, for models of this type. In Section 6 we show that the ungrouped historical data models are only weakly complete with respect to the grouped historical models. However, we then show how the ungrouped models can be extended to incorporate the grouping semantics and representation power of the grouped models, and show that this extended ungrouped model is strongly complete with respect to the grouped model. Finally, in Section 7 we examine the completeness of several historical relational languages that have been proposed in the literature with respect to these metrics.

It is worth pointing out that there are a number of additional issues that might reasonably be said to be related to the question of completeness of historical query languages, but that are necessarily outside of the scope of this paper. We are limiting our attention to models that incorporate a single dimension of time (historical, as opposed to temporal, models, in the terminology of Snodgrass and Ahn [1985]), but we believe that these results could be extended to handle additional time dimensions. Furthermore, in the spirit of most of the work on completeness for standard relational languages, we do not address the issue of temporal aggregates (as, e.g., in Snodgrass et al. [1989]). Work in the spirit of Klug [1982] could extend the results herein in that direction if so desired. Finally, we limit our attention to temporally homogeneous relations [Gadia 1988], that is, relations whose tuples have attributes all defined over the same period of time, and do not incorporate schema evolution over time (as in Clifford and Croker [1987]) because treatment of these additional issues would significantly lengthen the paper and because they have not been included in most of the proposed historical data models. In all of these decisions of what to incorporate in our notion of "reasonable"
queries, we have been motivated by the desire to choose the greatest common denominator of the various models proposed. In this way we have been able to apply our metrics of completeness fairly against several models in the paper. We conclude in Section 8 with a summary of our results and some directions for future research.

2. TEMPORALLY GROUPED AND TEMPORALLY UNGROUPED DATA MODELS

Two different strategies for incorporating a temporal dimension into the relational model have appeared in the literature. In one, the schema of the relation is expanded to include one or more distinguished temporal attributes (e.g., TIME) to represent the period of time over which the fact represented by the tuple is to be considered valid. This approach has been referred to in the literature as tuple time stamping or as a first-normal-form (1NF) model. In the other approach, referred to as attribute time stamping or as a non-first-normal-form (N1NF) model, instead of adding additional attributes to the schema, the domain of each attribute is extended from simple values to complex values (e.g., functions) that incorporate the temporal dimension. Both Clifford and Croker [1987] and Snodgrass [1990] contrast these two approaches.

Consider, for example, a relation intended to record the changing department and salary histories of employees in an organization.2 Tables I and II show typical representations of these two approaches. Although both relations appear to have the same information content, that is, the same data about three different employees over the same period of time, the models represent this information in quite different ways. In the 1NF approach (Table I and models such as Ariav [1986], Lorentzos and Johnson [1987], Navathe and Ahmed [1989], Snodgrass [1987], and Tuzhilin and Clifford [1990]), each moment of time relevant to each employee is represented by a separate tuple, which carries the time stamp. In the N1NF approach (Table II and models such as Clifford and Croker [1987], Gadia [1988], and Tansel [1986]), each employee's entire history is represented within a single tuple, within which the time stamps are embedded as components of the values of each attribute. Also note, with respect to the N1NF models, that while, in general, a key field like NAME would typically be constant over time, there is no requirement that this be the case. For example, in the EMPLOYEE relation in Table II the employee Tom changes his name to Thomas at time 3. There are many applications where the value of a key need not be constant over time, but merely unique in the relation at any given time.

2 Similar examples have appeared in Clifford and Warren [1983], Gadia [1986], and Snodgrass [1987].

Whereas N1NF models inherently group related facts into a single tuple, 1NF models, whether historical or temporal (using the distinction in Snodgrass and Ahn [1985] for models with one or two time dimensions, respectively), are problematic in this regard. Such models provide no inherent
Table I. Prototypical 1NF Historical Employee Relation EMPLOYEE (attributes NAME, DEPT, SALARY, time)
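The contrast between the two representations can be made concrete with a small sketch. The data below are hypothetical (they are not the contents of Tables I and II), and the dictionary-based encoding of time-varying values is an assumption made only for illustration. The sketch flattens a grouped (N1NF) relation into tuple-time-stamped (1NF) rows and then shows that regrouping by the key NAME does not recover the original histories once a key value changes over time.

```python
from collections import defaultdict

# Hypothetical grouped (N1NF) relation: one tuple per employee, and every
# attribute value is a function from times to simple values (a dict here).
GROUPED = [
    {"NAME": {0: "Tom", 1: "Tom", 2: "Tom", 3: "Thomas"},
     "DEPT": {0: "Sales", 1: "Sales", 2: "Finance", 3: "Finance"},
     "SALARY": {0: 20, 1: 20, 2: 27, 3: 27}},
    {"NAME": {1: "Ann", 2: "Ann"},
     "DEPT": {1: "MIS", 2: "MIS"},
     "SALARY": {1: 30, 2: 34}},
]


def ungroup(grouped):
    """Flatten to the tuple-time-stamped (1NF) form: one row per moment."""
    rows = set()
    for tup in grouped:
        for t in tup["NAME"].keys():  # lifespan of the (homogeneous) tuple
            rows.add((tup["NAME"][t], tup["DEPT"][t], tup["SALARY"][t], t))
    return rows


def regroup_by_name(rows):
    """Naive attempt to recover histories using NAME as the grouping key."""
    histories = defaultdict(dict)
    for name, dept, salary, t in rows:
        histories[name][t] = (dept, salary)
    return dict(histories)


flat = ungroup(GROUPED)
histories = regroup_by_name(flat)
# Two employees went in, but three "histories" come out, because the
# employee who changed his name at time 3 is split into two groups.
print(len(GROUPED), len(histories))
```

The flattened rows carry no indication that the rows for Tom and the row for Thomas describe the same object; this is the grouping information that the ungrouped representation lacks.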
... into "temporal objects":

(1) A group-id and a time uniquely determine the rest of the tuple, no matter which relation the tuple belongs to; that is, if $R(o, x_1, \ldots, x_m, t)$ and $Q(o, y_1, \ldots, y_n, t)$ are true, then $m = n$ (i.e., the relations must be union-compatible) and $x_i = y_i$ for $i = 1, \ldots, n$. In other words, $OT$ functionally determines all of the attributes in all of the relations in which $O$ and $T$ appear.

(2) A group-id uniquely determines the group of the tuples independently of which relation they belong to; that is, if $o$ appears in relations $R$ and $Q$, meaning that both $(\exists u_1) \cdots (\exists u_n)(\exists t') R(o, u_1, \ldots, u_n, t')$ and $(\exists y_1) \cdots (\exists y_n)(\exists t'') Q(o, y_1, \ldots, y_n, t'')$ hold, then, for all $x_1, \ldots, x_n, t$, if $R(o, x_1, \ldots, x_n, t)$ is true, then $Q(o, x_1, \ldots, x_n, t)$ is also true, and vice versa.

(3) A group of tuples uniquely determines the group-id; that is, there cannot be two identical groups of tuples with different group-ids. Formally, if there are $R$, $Q$, $o$, and $o'$ such that for all $x_1, \ldots, x_n, t$, $R(o, x_1, \ldots, x_n, t)$ implies $Q(o', x_1, \ldots, x_n, t)$ and $Q(o', x_1, \ldots, x_n, t)$ implies $R(o, x_1, \ldots, x_n, t)$, then $o = o'$.
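The three axioms can be checked mechanically on a small database instance. The following is a minimal sketch, assuming (hypothetically) that a database is a dictionary from relation names to sets of rows of the form `(o, x_1, ..., x_n, t)`; the function names are illustrative and do not come from the paper.

```python
def group_of(rows, o):
    """All (x_1, ..., x_n, t) parts of the rows that carry group-id o."""
    return frozenset(r[1:] for r in rows if r[0] == o)


def check_axiom_1(db):
    """(o, t) determines the remaining attributes in every relation."""
    seen = {}
    for rows in db.values():
        for row in rows:
            o, values, t = row[0], row[1:-1], row[-1]
            # Differing arity or differing values both violate the axiom.
            if seen.setdefault((o, t), values) != values:
                return False
    return True


def check_axiom_2(db):
    """A group-id denotes the same group in every relation it appears in."""
    for r_rows in db.values():
        for q_rows in db.values():
            shared = {r[0] for r in r_rows} & {q[0] for q in q_rows}
            if any(group_of(r_rows, o) != group_of(q_rows, o) for o in shared):
                return False
    return True


def check_axiom_3(db):
    """Identical groups of tuples cannot have different group-ids."""
    owner = {}
    for rows in db.values():
        for o in {r[0] for r in rows}:
            if owner.setdefault(group_of(rows, o), o) != o:
                return False
    return True


# Hypothetical instances over schema (O, A, T).
good = {"R": {(1, "a", 0), (1, "a", 1), (2, "b", 0)}}
bad_1 = {"R": {(1, "a", 0), (1, "b", 0)}}                 # two values at (o, t)
bad_2 = {"R": {(1, "a", 0)}, "Q": {(1, "a", 0), (1, "a", 1)}}  # groups differ
bad_3 = {"R": {(1, "a", 0), (2, "a", 0)}}                 # same group, two ids
```

Each of the three "bad" instances is built to violate exactly one axiom, which makes the role of each condition easier to see in isolation.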
Table XI. Relation EMPLOYEE in the Grouped $TC_g$ Model (attributes Group-ID, NAME, DEPT, SALARY)
xl
where
then
),...,
~ is a safe TC~ formula
rU~(Q)
formulas.
on the set of safe TC~
queries
If Q is a TCK
(on~n)j~)
This
mapping
query
lo}>}>
of the form
is
where historic variables $e_i$ correspond to the group-id variables $o_i$ appearing in predicates $R_i$ in $\phi$, and attributes $A_i$ correspond to variables $x_i$ in these predicates.

Examples illustrating the mapping $\Gamma_{UG}$ follow. In these examples we assume that the schemata of $TU_g$ relations $R$ and $Q$ are $R(O, A, T)$ and $Q(O, A, T)$, respectively, where $O$ is a group-id, $A$ is an attribute, and $T$ is a temporal attribute:

Example 7. The $TC_g$ query

$\{((o, x), (o', x'), t) \mid R(o, x, t) \land Q(o', x', t)\}$

is mapped into

$[e.A, e'.A : t](\exists x)(\exists x')((R(e) \land t \in e.l \land e.A(t) = x) \land (Q(e') \land t \in e'.l \land e'.A(t) = x')) \land R(e) \land t \in e.l \land Q(e') \land t \in e'.l.$13

This expression for $\Gamma_{UG}(Q)$ could be simplified (using standard techniques of logical transformation) to

$[e.A, e'.A : t]\, R(e) \land t \in e.l \land Q(e') \land t \in e'.l.$

13 Actually, there is no need to add expressions $e.A_i(t) = x_i$ for all $i = 1, \ldots, n$, as some examples will show, but only for those $x_i$'s that appear in other expressions. However, it is acceptable to do it for all terms, as it simplifies the presentation and the transformation is still correct.
However, this simplification is not always possible, as the following example shows:

Example 8. The $TC_g$ query

$\{((o, x), t) \mid R(o, x, t) \land (\exists o') Q(o', x, t)\}$

is mapped into

$[e.A : t](\exists x)((R(e) \land t \in e.l \land e.A(t) = x) \land (\exists e')(Q(e') \land t \in e'.l \land e'.A(t) = x)) \land R(e) \land t \in e.l.$

Note that in this case the variable $x$ serves to equate the terms $e.A(t)$ and $e'.A(t)$ via transitivity. Also note that the quantified group-id variable $(\exists o')$ was replaced with the historic variable $(\exists e')$ in the $L_h$ formula.

Example 9. The $TC_g$ query

$\{((o, x), t) \mid R(o, x, t) \land (\exists x')(\exists t')(R(o, x, t) \land Q(o, x', t') \land x = x')\}$

is replaced with

$[e.A : t](\exists x)((R(e) \land t \in e.l \land e.A(t) = x) \land (\exists x')(\exists t')((R(e) \land t \in e.l \land e.A(t) = x) \land (Q(e) \land t' \in e.l \land e.A(t') = x') \land x = x')) \land R(e) \land t \in e.l.$

This expression can be simplified to

$[e.A : t](\exists x)(\exists x')(\exists t')((R(e) \land t \in e.l \land e.A(t) = x) \land (Q(e) \land t' \in e.l \land e.A(t') = x') \land x = x') \land R(e) \land t \in e.l.$

Note that the equality $x = x'$ did not change in the conversion process. However, it follows from the facts $e.A(t) = x$, $e.A(t') = x'$, and $x = x'$ that the terms $e.A(t)$ and $e.A(t')$ are equal. Also note that the domain variable $x'$ in the $TC_g$ formula remained unchanged in the $L_h$ formula.

Example 10. The $TC_g$ formula

$\{((o, x), t) \mid R(o, x, t) \land \neg Q(o, x, t)\}$

is converted to the $L_h$ formula

$[e.A : t](\exists x)((R(e) \land t \in e.l \land e.A(t) = x) \land \neg(Q(e) \land t \in e.l \land e.A(t) = x)) \land R(e) \land t \in e.l.$

Note that, in the previous examples, $\Gamma_{UG}$ maps safe $TC_g$ formulas into safe $L_h$ formulas. We generalize these observations in the following proposition:

PROPOSITION 4. $\Gamma_{UG}$ maps safe $TC_g$ formulas into safe $L_h$ formulas.
SKETCH OF PROOF. Let $\phi$ be a safe $TC_g$ formula. We will prove that $\Gamma_{UG}(\phi)$ is safe by verifying all of the conditions in the definition of safety for $L_h$ formulas. First, $\Gamma_{UG}(\phi)$ does not have universal quantifiers since $\phi$ does not have them.

Second, the range expression $V_i = (\exists x_{i1}) \cdots (\exists x_{ij_i}) R_i(o_i, x_{i1}, \ldots, x_{ij_i}, t)$ is mapped into the expression $(\exists x_{i1}) \cdots (\exists x_{ij_i})(R_i(e_i) \land t \in e_i.l \land e_i.A_{i1}(t) = x_{i1} \land \cdots \land e_i.A_{ij_i}(t) = x_{ij_i})$, and also the expression $R_i(e_i) \land t \in e_i.l$ is added at the "outermost" level of $\Gamma_{UG}(\phi)$ because of condition (1) in the definition of the mapping $\Gamma_{UG}$. Clearly, the two expressions are semantically equivalent. But the second condition was added to make $\Gamma_{UG}(\phi)$ syntactically safe. Since $\Gamma_{UG}(\phi)$ has the formula $R_i(e_i) \land t \in e_i.l$ for each range expression at the outermost level, the second condition of safety for $L_h$ formulas is satisfied.

Third, subformula $F_1 \lor F_2$ in $\phi$ is mapped into $\Gamma_{UG}(F_1) \lor \Gamma_{UG}(F_2)$ so that $\Gamma_{UG}(F_1)$ and $\Gamma_{UG}(F_2)$ have the same set of atoms $t_i \in e_i.l$, because $F_1$ and $F_2$ have the same set of pairs $(o_i, t_i)$ and because the mapping $\Gamma_{UG}$ translates them into expressions $t_i \in e_i.l$.

Finally, the three items in the definition of safety related to maximal conjuncts are satisfied because of the way in which the mapping $\Gamma_{UG}$ is defined. ❑

THEOREM 5. $M_{TG} = (TG, L_h)$ is strongly complete with respect to $M_{TU_g} = (TU_g, TC_g)$.

SKETCH OF PROOF. First, the mapping $\Omega_{UG}$ is clearly a one-to-one correspondence because of our grouping axioms. Second, the mapping $\Gamma_{UG}$ satisfies the second condition in the definition of strong completeness for the following reasons: Intuitively, the predicate $R(o, x_1, \ldots, x_n, t)$ in $TC_g$ is mapped into the expression $R(e) \land t \in e.l$, so that the historical variable $e$ corresponds to the group-id $o$ and so that $t$ is in the lifespan of $e$. Furthermore, group-ids are defined so that the variables $x_1, \ldots, x_n$ are uniquely determined by values of $o$ and $t$ and are irrelevant in the translation process. Also, the expressions $R(o, x_1, \ldots, x_n, t)$ and $R(e) \land t \in e.l$ are equivalent. In addition, the mapping $\Gamma_{UG}$ preserves the structure of the formula $\phi$; that is, it leaves conjunctions, disjunctions, and negations of $\phi$ in their places in $\Gamma_{UG}(\phi)$. ❑
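The correspondence between a $TC_g$ query and its $L_h$ image can be illustrated on the negation query of Example 10. The sketch below is only an illustration under stated assumptions: relations over schema $(O, A, T)$ are encoded as Python sets of `(o, x, t)` rows, a historical tuple is encoded as a frozenset of `(time, value)` pairs, and the helper names (`to_grouped`, `tc_g_answer`, `l_h_answer`) and the data are hypothetical.

```python
def to_grouped(rows):
    """Collect each group-id's rows into one historical tuple, represented
    as a frozenset of (time, value) pairs (the graph of e.A over e.l)."""
    groups = {}
    for o, x, t in rows:
        groups.setdefault(o, set()).add((t, x))
    return {o: frozenset(g) for o, g in groups.items()}


def tc_g_answer(R, Q):
    """{((o, x), t) | R(o, x, t) and not Q(o, x, t)} over ungrouped rows."""
    return {(o, x, t) for (o, x, t) in R if (o, x, t) not in Q}


def l_h_answer(R_g, Q_g):
    """[e.A : t](exists x)((R(e) and t in e.l and e.A(t) = x)
    and not (Q(e) and t in e.l and e.A(t) = x)) and R(e) and t in e.l."""
    out = set()
    for e in R_g:                       # R(e)
        A = dict(e)                     # e.A as a function; its domain is e.l
        for t, x in A.items():          # t in e.l and e.A(t) = x
            # Inside this loop the last two conjuncts of the negated part
            # already hold, so the negation reduces to "e is not in Q".
            if e not in Q_g:
                out.add((e, x, t))
    return out


# Hypothetical instance that satisfies the grouping axioms.
R_rows = {(1, "a", 0), (1, "b", 1), (2, "c", 0)}
Q_rows = {(2, "c", 0), (3, "d", 5)}

R_ids, Q_ids = to_grouped(R_rows), to_grouped(Q_rows)
R_g, Q_g = set(R_ids.values()), set(Q_ids.values())

tc = tc_g_answer(R_rows, Q_rows)
lh = l_h_answer(R_g, Q_g)
# Replace each group-id by the historical tuple it denotes and compare.
translated = {(R_ids[o], x, t) for (o, x, t) in tc}
print(translated == lh)
```

Both evaluations select the same facts once group-ids are replaced by the historical tuples they denote, which is the informal content of the equivalence of $R(o, x_1, \ldots, x_n, t)$ and $R(e) \land t \in e.l$ used in the proof sketch.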
6.2.3 Mapping $L_h$ Formulas to $TC_g$. In this section we define the mapping $\Gamma_{GU}$ that maps safe $L_h$ formulas into equivalent safe $TC_g$ formulas. Let $\phi$ be a safe $L_h$ formula. As for the $\Gamma_{UG}$ mapping, the formula $\Gamma_{GU}(\phi)$ is obtained from $\phi$ by replacing all of the atomic formulas in $\phi$ together with quantified variables and leaving the structure of $\phi$ intact (operators $\land$, $\lor$, and $\neg$ remain unchanged). The replacement of atomic formulas and quantified variables is done in the following manner:

(1) Replace quantified variables in $L_h$ as follows:

(a) Do not change any quantified domain and temporal variables; that is, $(\exists x)$ and $(\exists t)$ in $L_h$ will remain in $\Gamma_{GU}(\phi)$.

(b) Replace quantified historic variables $(\exists e_i)$ with $(\exists o_i)$, where $o_i$ is a unique group-id variable.

(c) Consider all pairs of historic and temporal variables $e$ and $t$ such that $\phi$ contains an expression $t \in e.l$. Depending on the relationship between the scopes of these variables, we add the expression $(\exists x_1) \cdots (\exists x_n)$ to $\Gamma_{GU}(\phi)$, where $x_i$ is a domain variable associated with historic variable $e$ of arity $n$, as follows:

(i) if $t$ is a free variable and $e$ is a bound variable, then place the expression $(\exists x_1) \cdots (\exists x_n)$ before the expression $(\exists o)$ obtained in step (1b);

(ii) if $t$ and $e$ are bound variables, and if the scope of $e$ is contained within the scope of $t$, then also place $(\exists x_1) \cdots (\exists x_n)$ before $(\exists o)$;

(iii) if $t$ and $e$ are bound, and if the scope of $t$ is contained within the scope of $e$, then place $(\exists x_1) \cdots (\exists x_n)$ before $(\exists t)$;

(iv) in all other cases, do not add anything to the formula.

(2) Replace each occurrence of the expression $R(e)$ with $(\exists x_1) \cdots (\exists x_n)(\exists t) R(o, x_1, \ldots, x_n, t)$. If $e$ is a bound variable in $\phi$, then group-id variable $o$ is the same as the one that replaced $e$ in step (1b). If $e$ is free, then all of the occurrences of $e$ are replaced with the same group-id variable $o$.

(3) Replace each occurrence of expression $t \in e.l$ with $R(o, x_1, \ldots, x_n, t)$, where $R$ is the predicate of one of the positively occurring $L_h$ predicates $R(e)$ within the maximal conjunct containing $t \in e.l$.14 If $e$ is a bound variable in $\phi$, then the group-id variable $o$ is the same as the one that replaced $e$ in the expression $(\exists e)$ in step (1), and the domain variables $x_1, \ldots, x_n$ are the same as the quantified variables introduced in step (1) for the combination of $(\exists e)$ and $(\exists t)$ expressions. If $e$ is a free variable in $\phi$, then the group-id variable $o$ and the domain variables $x_1, \ldots, x_n$ are free and are different from all other variables in $\Gamma_{GU}(\phi)$.

(4) Replace each term $e.A_i(t)$ in $\phi$ with $x_i$, where $x_i$ is defined as follows: Since $\phi$ is safe, the maximal conjunct containing $e.A_i(t)$ must also contain expressions $t \in e.l$ and $R(e)$ (for some $R$). In step (3), $t \in e.l$ is replaced with expression $R(o, x_1, \ldots, x_n, t)$. Then $x_i$ corresponds to the variable in this expression that is mapped to the attribute $A_i$ in $R$.15

14 It follows from the grouping axioms in Section 6.1 that it does not matter which positively occurring predicate $R$ is selected. Any selected predicate produces the same results. In fact, all of the qualifying predicates can be selected as well, for a longer but logically equivalent formula.
15 The remark in footnote 14 is also applicable here.

Examples illustrating the mapping $\Gamma_{GU}$ follow. In these examples we assume that the schemata of relations $R$ and $Q$ from $L_h$ are $R(A, B)$ and $Q(A)$, respectively:

Example 11. The $L_h$ query

$[e.* : t]\, R(e) \land t \in e.l \land e.B(t) = 5$

is mapped into the $TC_g$ query as follows: $R(e)$ would be replaced with $(\exists x')(\exists y')(\exists t') R(o, x', y', t')$, $t \in e.l$ with $R(o, x, y, t)$, and $e.B(t) = 5$ with $y = 5$. Putting the pieces together, we get the following answer:

$\{((o, x), (o, y), t) \mid (\exists x')(\exists y')(\exists t') R(o, x', y', t') \land R(o, x, y, t) \land y = 5\}.$

Since $(\exists x')(\exists y')(\exists t') R(o, x', y', t') \land R(o, x, y, t)$ is equivalent to $R(o, x, y, t)$, we can rewrite the previous query as

$\{((o, x), (o, y), t) \mid R(o, x, y, t) \land y = 5\}.$

Example 12. The $L_h$ query
$[e.* : t]\, R(e) \land t \in e.l \land (\exists e')(Q(e') \land (\exists t')(R(e) \land t' \in e.l \land t' \in e'.l \land R(e) \land t \in e.l \land e.B(t) = e'.A(t')))$

is mapped into the $TC_g$ query

$\{((o, x), (o, y), t) \mid R(o, x, y, t) \land (\exists o')(\exists x')(\exists x'')(\exists y'')(\exists t')(Q(o', x', t') \land R(o, x'', y'', t') \land R(o, x, y, t) \land y = x')\}.$

Note that the variable $x'$ in the previous example is quantified together with the group-id variable $o'$. Also note that the domain variables $x''$ and $y''$ are quantified together with the temporal variable $t'$. In general, domain variables are quantified together with the group-id variable $o$ and temporal variable $t$ appearing in the same predicate, as part of the innermost scope of variables $o$ and $t$ in the $\Gamma_{GU}(\phi)$ formula. The next example shows how $\Gamma_{GU}$ handles negations.

Example 13. The $L_h$ query

$[e'.* : t](\exists e)(Q(e) \land \neg(t \in e.l) \land R(e') \land t \in e'.l) \land R(e') \land t \in e'.l$

is converted to

$\{((o', x'), (o', y'), t) \mid (\exists o)(\exists x)((\exists x'')(\exists t'') Q(o, x'', t'') \land \neg Q(o, x, t) \land R(o', x', y', t)) \land R(o', x', y', t)\}.$

The next example shows that $\Gamma_{GU}$ does not affect domain variables in $\phi$.

Example 14. The $L_h$ query

$[e.* : t]\, R(e) \land t \in e.l \land (\exists z)(R(e) \land t \in e.l \land e.A(t) = z)$

is translated into

$\{((o, x), (o, y), t) \mid R(o, x, y, t) \land (\exists z)(R(o, x, y, t) \land x = z)\}.$

PROPOSITION 6. $\Gamma_{GU}$ maps safe $L_h$ formulas into safe $TC_g$ formulas.

SKETCH OF PROOF. The proof proceeds along the lines of the proof of Proposition 4. ❑

THEOREM 7. $M_{TU_g} = (TU_g, TC_g)$ is strongly complete with respect to $M_{TG} = (TG, L_h)$.

SKETCH OF PROOF. First, $\Omega_{GU}$ is clearly a one-to-one correspondence because of our grouping axioms. Second, the mapping $\Gamma_{GU}$ satisfies the second condition in the definition of strong completeness, as we shall show by induction on maximal conjuncts in $L_h$. At any inductive step, the $L_h$ formula $\phi(e_1, \ldots, e_n, x_1, \ldots, x_m, t_1, \ldots, t_k)$ is mapped into the $TC_g$ formula $\Gamma_{GU}(\phi)(o_1, \ldots, o_n, x_1, \ldots, x_m, y_1, \ldots, y_l, t_1, \ldots, t_k)$, where $y_1, \ldots, y_l$ are extra variables introduced in the translation process (i.e., when $R(e) \land t \in e.l$ becomes $R(o, y_1, \ldots, y_r, t)$). Notice that variables $y_1, \ldots, y_l$ are uniquely determined (i.e., functionally depend) by values of variables $o_1, \ldots, o_n, x_1, \ldots, x_m, t_1, \ldots, t_k$. Therefore, these variables are "superfluous" and do not affect the translation process. With this observation in mind, the proof proceeds along the lines of Theorem 5. ❑

The following theorem immediately follows from Theorems 5 and 7:

THEOREM 8. The grouped model $M_{TG} = (TG, L_h)$ and the ungrouped model $M_{TU_g} = (TU_g, TC_g)$ are strongly equivalent.
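The one-to-one correspondence between grouped relations and ungrouped relations with group-ids, and the agreement of the two forms of Example 11, can be sketched as follows. This is an illustration under assumptions, not the paper's construction: a grouped tuple over schema $R(A, B)$ is encoded as a dictionary of attribute functions, group-ids are generated by enumeration, and the function names and data are hypothetical.

```python
def to_ungrouped(grouped):
    """Assign a fresh group-id to each historical tuple and emit one
    (o, x, y, t) row per moment of its lifespan (schema R(A, B))."""
    rows = set()
    for o, e in enumerate(grouped):
        for t in e["A"]:
            rows.add((o, e["A"][t], e["B"][t], t))
    return rows


def to_grouped(rows):
    """Inverse direction: collect the rows of each group-id into one tuple."""
    acc = {}
    for o, x, y, t in rows:
        e = acc.setdefault(o, {"A": {}, "B": {}})
        e["A"][t] = x
        e["B"][t] = y
    return [acc[o] for o in sorted(acc)]


# Hypothetical grouped relation with two historical tuples.
R = [
    {"A": {0: "p", 1: "q"}, "B": {0: 5, 1: 7}},
    {"A": {1: "r"}, "B": {1: 5}},
]
rows = to_ungrouped(R)

# Round trip: ungrouping and regrouping recovers the same relation.
round_trip_ok = to_grouped(rows) == R

# Example 11 in both languages, reduced to (object, time) pairs:
# L_h:  [e.* : t] R(e) and t in e.l and e.B(t) = 5
lh = {(i, t) for i, e in enumerate(R) for t in e["B"] if e["B"][t] == 5}
# TC_g: {((o, x), (o, y), t) | R(o, x, y, t) and y = 5}
tc = {(o, t) for (o, x, y, t) in rows if y == 5}
print(round_trip_ok, lh == tc)
```

The group-id is what makes the round trip lossless; dropping the first component of each row would reproduce the information loss discussed for the plain ungrouped models.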
Theorems 3 and 8 establish the connections between grouped and ungrouped historical data models. The power of temporal grouping that is inherent in grouped models can only be achieved in an ungrouped model by the addition of some mechanism, analogous to our group-ids, for simulating the grouping.

7. HISTORICAL MODELS AND COMPLETENESS
All of the historical relational data models and languages that have been proposed differ from one another in the set of query operators that they provide. In addition, they often differ in the structure of the historical relations they specify, that is, the way in which the temporal component is incorporated into the structure. Space obviously precludes an analysis of all of these models with respect to our two notions of completeness. Since we have two orthogonal characteristics to describe these models and their languages (grouped or ungrouped, algebra or calculus), we decided to discuss four models, each covering one of the four possibilities. Two of the data models we discuss are ungrouped, one with an algebra [Lorentzos and Johnson 1987] and the other with a calculus [Snodgrass 1987]; we therefore investigate whether or not they are TU-Complete. The other two data models discussed are grouped, one with an algebra [Clifford and Croker 1987] and the other with both an algebra and a calculus [Gadia 1988], and so we investigate whether or not they are TG-Complete.

We have earlier motivated our choice of $L_h$ and $TC$ as appropriate languages to use for our notions of completeness. Therefore, in this section a data model is said to be complete with respect to $M_{TG} = (TG, L_h)$ (or $M_{TU} = (TU, TC)$) if it is strongly complete with respect to $M_{TG} = (TG, L_h)$ (or $M_{TU} = (TU, TC)$). Although by our definitions we should, strictly speaking, refer to completeness with respect to the data models, we will generally speak more specifically about their languages and apply the term loosely to them.

For each of the historical query languages discussed in the following, therefore, we consider first its completeness with respect to either of $L_h$ and $TC$ and vice versa. We shall see that $L_h$ and $TC$ are complete with respect to all of the languages we consider, a fact that lends further support to their use as the standards for TG-Completeness and TU-Completeness.

We begin with a discussion of the completeness of the historical relational algebra specified by the historical relational data model HRDM [Clifford and Croker 1987]. We discuss this language first both because the TG model defined in Section 2 is derived directly from the structure of the historical relations in HRDM and because the set of operators specified by this model was intended initially to provide all of the functionality thought useful and desirable.
7.1 HRDM The historical [1987]
relational
data
is a temporally
model
grouped
HRDM
historical
presented data
by Clifford
model
with
language that is presented as an extension to the standard We can categorize the operators of HRDM as follows: These
Set-Theoretic.
tics
of relations
tion
( n ), set
operators standard terpart ple,
operators
and include difference
do not mappings
( –),
exploit from
in relational
calculus
Attribute
terms
Based.
of the
product
relational
union
operators,
as suggested in
the
operators also applies
Because
to these
operators
these
relations, to their here.
For
the counexam-
Cs}
A t Gel
[e. *lt]r(e)
by their
standard
( U), intersec-
( X).
aspects of HRDM in relational algebra
historical
query algebra.
of the set characteris-
Vs(e)
A t ● e.1.
This category includes those operators attributes (or their values) of a relation.
that
exist
in terms
set operators
Cartesian
SrVx
-
in
standard and
the these
r(.Js={.x\x
are defined
the
and Croker
an algebraic
names,
relational
are derived algebra.
As
that are defined Some of these
from
similar
operators
shown
below,
often
the
original definition of these operators has been modified to exploit the temporal component of the historical model. For each of these operators, we give both (1)
its set-theoretic This counterpart
Project(u).
tional
definition
and then
an equivalent
expression.
operator is equivalent in definition to its standard and has the effect of reducing the set of attributes
which each of the tuples x in its operand, attributes contained in a set of attributes nx(r)
= {x( X)1.X -
(2)
Lk -based
a relation X.
r, is defined,
relaover
to those
~ r}
A t Gel
[e. X:t]r(e)
(o-IF). This variant of the select operator selects from a relation tuples x, each of which for some period within its lifespan has a A that satisfies a specified selection value for a specified attribute criterion. The period of time within the lifespan is specified by a lifespan Select-if
r those
parameter ACM TransactIons
L.
The
selection
on Database
Systems.
criterion Vol
19,
No
is specified 1, March
1994
as A 9a,
where
9 is a
Completeness comparator
and
attribute
with
operator
is used
a is
of Historical Relational Query Languages
a constant.
another
to denote
tion criterion must the tuple’s lifespan, ‘“~F(Atia,
(if
(It
is
also
in the same tuple.) a quantifier
be satisfied or whether
to
that
105
compare
one
Q, of the select-if
specifies
whether
the
selec-
for all (’d) times in the specified subset there exists (3) at least one such time. ● (L
dQ(t
Q,L) (T-)
=
{X
Qis
=
[e. *:tlr(e)
3)
possible
A parameter,
.
G
of
n x.l))[x.A(t)oa]}, ● e.1 A
At
~tl(tlE L A tl E e.1 A e. A(tl)6cz) (if
Qis
V)-
At
[e. *:t]r(e)
~3tl(tl (3)
(3) Select-when ($\sigma$-WHEN). This operator is similar to the $\exists$-quantified select-if operator. However, the lifespan of each selected tuple is restricted to those times when the selection criterion is satisfied.16

$\sigma\text{-WHEN}_{(A \theta a)}(r) = \{\, x \mid \exists x' \in r\,[\, x.l = \{ t \mid x'.A(t)\ \theta\ a \} \wedge x.v = x'.v|_{x.l} \,] \,\}$

$[e.* : t]\; r(e) \wedge t \in e.l \wedge e.A(t)\ \theta\ a$

16 The notation $f|_l$ in this definition, used in HRDM, is the standard notation for denoting the restriction of the domain of the function f to the set l.

(4) $\theta$-Join. Like its counterpart in the standard relational data model, this operator combines tuples from its two operand relations. With $\theta$-join two tuples are combined when two of the tuples' attributes, one from each tuple, have values at some time in the intersection of the two lifespans that stand in a $\theta$ relationship with each other. The lifespan of the resulting tuple is exactly those times when this relationship is satisfied. Let $r_1$ and $r_2$ be relations on schemes $R_1$ and $R_2$, respectively, where $A \in R_1$ and $B \in R_2$ are attributes.

$r_1[A\ \theta\ B]r_2 = \{\, e \mid \exists e_{r_1} \in r_1, \exists e_{r_2} \in r_2\,[\, e.l = \{ t \mid e_{r_1}(A)(t)\ \theta\ e_{r_2}(B)(t) \} \wedge e.v(R_1) = e_{r_1}.v(R_1)|_{e.l} \wedge e.v(R_2) = e_{r_2}.v(R_2)|_{e.l} \,] \,\}$

$[e_1.*, e_2.* : t]\; r_1(e_1) \wedge r_2(e_2) \wedge t \in e_1.l \wedge t \in e_2.l \wedge e_1.A(t)\ \theta\ e_2.B(t)$

(5) Static time-slice ($T_{@L}$). This operator reduces a historical relation in the temporal dimension by restricting the lifespan of each tuple e of the operand relation r to those times in the set of times L.

$T_{@L}(r) = \{\, e \mid \exists e' \in r\,[\, l = L \cap e'.l \wedge e.l = l \wedge e.v = e'.v|_{l} \,] \,\}$

$[e.* : t]\; r(e) \wedge t \in e.l \wedge t \in L$
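The three lifespan-restricting operators above (select-when, $\theta$-join, and static time-slice) can be sketched together, because all of them rely on the restriction $f|_l$. The tuple representation (a dictionary with lifespan `l` and attribute functions `v`), the helper name `restrict`, the sample data, and the decision to drop tuples whose restricted lifespan is empty are all assumptions of this sketch, not definitions from HRDM. The join additionally assumes that the two schemes have disjoint attribute names.

```python
import operator


def restrict(v, times):
    """f|_l: build a tuple with lifespan `times` and each attribute function cut down to it."""
    return {"l": set(times),
            "v": {a: {t: f[t] for t in times} for a, f in v.items()}}


def select_when(r, attr, theta, a):
    """Keep each tuple only at the times when its attr value satisfies theta a."""
    out = []
    for x in r:
        l = {t for t in x["l"] if theta(x["v"][attr][t], a)}
        if l:  # sketch choice: tuples with an empty lifespan are dropped
            out.append(restrict(x["v"], l))
    return out


def time_slice(r, L):
    """Static time-slice: restrict every lifespan to the given set of times L."""
    return [restrict(x["v"], x["l"] & L) for x in r if x["l"] & L]


def theta_join(r1, r2, A, B, theta):
    """Combine tuples at exactly the common times when e1.A(t) theta e2.B(t) holds."""
    out = []
    for e1 in r1:
        for e2 in r2:
            l = {t for t in e1["l"] & e2["l"]
                 if theta(e1["v"][A][t], e2["v"][B][t])}
            if l:
                out.append(restrict({**e1["v"], **e2["v"]}, l))
    return out


emp = [
    {"l": {1, 2, 3}, "v": {"NAME": {1: "Tom", 2: "Tom", 3: "Tom"},
                           "SAL": {1: 20, 2: 20, 3: 30}}},
    {"l": {2, 3}, "v": {"NAME": {2: "Ann", 3: "Ann"},
                        "SAL": {2: 25, 3: 25}}},
]
cap = [{"l": {1, 2, 3}, "v": {"CAP": {1: 25, 2: 22, 3: 22}}}]

well_paid = select_when(emp, "SAL", operator.gt, 22)
early = time_slice(emp, {1, 2})
over_cap = theta_join(emp, cap, "SAL", "CAP", operator.gt)
```

Note that `theta_join` evaluates both attributes at the same time t, which is the property the later completeness argument turns on.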
7.1.1 Other Operators. In addition to the above categories of operators, the HRDM algebra includes several grouping operators that are used to restructure a relation without changing the information content of that relation. These operators, union-merge ($\cup_o$), intersection-merge ($\cap_o$), and difference-merge ($-_o$), first compute the set-theoretic union, intersection, and difference, respectively, and then regroup the tuples in the resulting relation. The HRDM algebra also includes the operators WHEN and dynamic time-slice. We categorize the WHEN operator as an extrarelational operator in that it computes a result that is not contained in a database relation, but is a value that can be given to the time-slice operator or used as a constant. Applied to a historical relation, this operator returns a value defined as the union of the lifespans of the tuples in that relation. This operator can be viewed as a type of temporal-based aggregate operator. The operator dynamic time-slice is only applicable to relations that include in their scheme an attribute A whose domain consists of partial functions from the set TIMES into itself. We do not treat such attributes in this paper, since most of the other models considered do not distinguish between ordinary values and the times at which they hold, and do not allow comparisons between them. Therefore, it would be unfair to include such an operator in our comparison.

We omit the remaining operators of HRDM from our discussion of completeness. The grouping operators are not treated because they are not intended for querying, and the aggregate operators, because they are outside of the scope of standard relational-based notions of completeness.

The translations that we have provided for each of the relation-defining operators of the HRDM algebra show that $L_h$ is complete with respect to this algebra. However, this algebra is not TG-Complete, in that there are queries that are expressible in $L_h$ for which no equivalent algebraic expression (i.e., sequence of algebraic operations) exists. One example is the query on the database in Table VII for the name and department of each employee that has at some time received a cut in salary, expressible in $L_h$ as

$[e.\mathrm{NAME}, e.\mathrm{DEPT} : t]\; \mathrm{EMPLOYEE}(e) \wedge t \in e.l \wedge \exists t_1 \exists t_2 (\mathrm{EMPLOYEE}(e) \wedge t_1 \in e.l \wedge t_2 \in e.l \wedge (t_1 < t_2) \wedge e.\mathrm{SAL}(t_1) > e.\mathrm{SAL}(t_2))$.

The lack of an equivalent algebraic expression is due to the specification of those operators in HRDM that include the comparison of two values as part of their definition: the join and the various select operators. In each case only values that occur at the same point in time can be compared. (This ability seems to be what is meant by the property of supporting "a 3-D conceptual view of a historical relation" that has been cited as an intuitively necessary component of a good temporal database model (e.g., in Ariav [1986], Clifford and Tansel [1985], and McKenzie and Snodgrass [1991a]).) Thus, as required by the above query, it is not possible to compare the salary of an employee at some time $t_1$ with that employee's salary at some other point in time, $t_2$.
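The salary-cut query can be written down directly once two distinct times of the same salary history may be compared. The sketch below is only an illustration of the $L_h$ expression: the tuple representation (lifespan `l`, attribute functions `v`) and the sample employees are hypothetical and are not the contents of Table VII.

```python
def salary_cut(employees):
    """Name and department histories of employees whose salary ever decreased.

    Mirrors the existential part of the query: there are times t1 < t2 in the
    tuple's lifespan with SAL(t1) > SAL(t2).
    """
    out = []
    for e in employees:
        sal = e["v"]["SAL"]
        cut = any(sal[t1] > sal[t2]
                  for t1 in e["l"] for t2 in e["l"] if t1 < t2)
        if cut:
            # The target list [e.NAME, e.DEPT : t] keeps the whole lifespan.
            out.append({"l": e["l"],
                        "v": {a: e["v"][a] for a in ("NAME", "DEPT")}})
    return out


employees = [
    {"l": {1, 2, 3},
     "v": {"NAME": {1: "Tom", 2: "Tom", 3: "Tom"},
           "DEPT": {1: "Shoes", 2: "Shoes", 3: "Shoes"},
           "SAL": {1: 20, 2: 20, 3: 30}}},
    {"l": {1, 2},
     "v": {"NAME": {1: "Ann", 2: "Ann"},
           "DEPT": {1: "Toys", 2: "Toys"},
           "SAL": {1: 30, 2: 25}}},
]

cut_employees = salary_cut(employees)
```

The inner test reads `sal[t1]` and `sal[t2]` for two different times, which is exactly the comparison that the select and join operators of the algebra do not provide.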
7.2 The Historical Homogeneous Model of Gadia

The next historical model that we discuss is one that was proposed by Gadia [1988]; it is a model that includes a query language and an algebra. This data model, which we call TDMG (for temporal data model of Gadia), is the same as the canonical historical data model TG defined in Section 2. In TDMG the value of a tuple attribute is, like that of HRDM, a function from a set of times to the value domain of the attribute, and the lifespan is the same for all of the attributes of a tuple (Gadia's homogeneity assumption). Therefore, his data model is temporally grouped.

In addition to the data model, Gadia defines a historical algebra and calculus. Although his data model is temporally grouped, the semantics of the TDMG algebra is defined in terms of the ungrouped model obtained by ungrouping temporal relations. Gadia calls this a snapshot interpretation semantics. The semantics of the historical algebra is defined by ungrouping temporal relations, because Gadia considers grouped and ungrouped models "weakly equal" and does not distinguish between them when he proves the equivalence of his algebra and calculus. In terms of our discussion on completeness in Section 3, Gadia's mapping from his grouped model to his ungrouped model is not a one-to-one mapping; unlike our mapping into $TC_g$, Gadia's $\Omega$ mapping ignores grouping.

Gadia's (ungrouped) algebra is defined as follows: He starts with the five standard relational operators (selection, projection, difference, Cartesian product, and union), as TA does. He also defines derived temporal operators, such as join, intersection, negation, and renaming. In addition, he defines temporal expressions for the temporal domain. Finally, he combines relational and temporal expressions by considering relational expressions of the form $e(\nu)$, where e and $\nu$ are relational and temporal expressions, respectively.

TC is complete with respect to Gadia's algebra for the following reasons: The five standard temporal operators are defined as for TA and, therefore, can be expressed in TC. Temporal expressions are defined as a closure of time intervals over the operations of union, intersection, difference, and negation. Each of these operators can be expressed in the first-order logic with explicit references to time. For example, the expression $tdom(r(A, B)) \cup tdom(s(A, B))$ in TDMG can be defined in TC as $\{\, t \mid (\exists x)(\exists y)(r(x, y, t) \vee s(x, y, t)) \,\}$. This means that every query in TDMG can be expressed in TC.

Gadia also defines a historical calculus and shows its equivalence to the algebra (modulo temporal grouping). This calculus is expressible in $L_h$ for the same reasons that the ungrouped algebra is expressible in TC. A lifespan of a temporal tuple x in TDMG can be captured with the expression $t \in x.l$ in $L_h$. Also, the operators of union, intersection, difference, and negation for temporal expressions can be expressed in $L_h$ with the same methods that are used to express algebraic expressions in TC, since $L_h$ explicitly supports time.

The temporally grouped language $L_h$ has strictly more expressive power than Gadia's calculus; that is, this calculus is not TG-Complete. Also, the temporally ungrouped language TC is strictly more powerful than Gadia's algebra; that is, the algebra is not TU-Complete. The reason for this lack of completeness is the same as for HRDM: It is not possible to compare the value of one attribute at time $t_1$ with the value of another or the same attribute at some other time $t_2$. For example, the query of the previous section, asking for the name and department of each employee that has at some time received a cut in salary, cannot be expressed in TDMG.

7.3 TQuel

TQuel is the query language component of a historical relational data model proposed by Snodgrass [1987]. We shall call this model TRDM.
TRDM provides for two types of historical relations. One, called an interval relation, is derived from a standard relation through the addition of two temporal attributes, valid-from and valid-to, both of whose domains are the set of times T. (An example of such a relation has already been given in Table III.) As before, we ignore the two TRANS-TIME temporal attributes since we are only considering historical data models. Thus, we will view TRDM as a temporally ungrouped historical data model. The values of the nontemporal attributes of a tuple in such a relation are considered to be valid during the interval of time starting at the valid-from value and ending at, but not including, the valid-to value. (This interval thus denotes the lifespan of the tuple.) The second type of relation, an event relation, is defined by extending a standard relation by a single temporal attribute valid-at. Since both interval and event relations are derived from 1NF relations through the addition of attributes whose values are atomic, they are also in 1NF.

The query language TQuel is an extended relational calculus derived from and defined as a superset of Quel, the query language of the Ingres relational database management system [Stonebraker et al. 1976]. TQuel extends Quel by adding temporal-based clauses that accommodate the valid-from and valid-to attributes. (These attributes are not visible to the existing components of the Quel language.) A WHEN clause is added to define an additional temporal-based selection constraint that must be satisfied in conjunction with the constraint specified by the TQuel (and Quel) WHERE clause. This constraint, defined as a temporal predicate over a set of tuple valid-from to valid-to intervals (lifespans), defines a restricted set of relationships that must hold among them. A VALID clause is used to define, in terms of temporal expressions, valid-from and valid-to values for tuples in the relation resulting from the TQuel statement. Both temporal predicates and temporal expressions have a semantics that is expressible in terms of the standard tuple calculus [Snodgrass 1987].17

17 This specification also includes the use of several auxiliary functions that are used to compare times, in order to determine which of two times occurs first or last. Strictly speaking, the set of these functions described in Snodgrass [1987] is not complete, but it is easily extendible to a complete set.
TQuel is complete with respect to TC, and vice versa, since the semantics of TQuel, like that of Quel [Ullman 1988], can be expressed in terms of the standard relational calculus, with which TC is clearly strongly equivalent. In particular, Snodgrass shows how any TQuel query can be expressed as a formula of the form $Q \wedge \Gamma \wedge \Phi$, where Q, $\Gamma$, and $\Phi$ are the calculus formulas for the underlying Quel statement, the TQuel WHEN clause, and VALID clause, respectively, and where $\Gamma$ and $\Phi$ contain no quantifiers. Additionally, $\Gamma$ and $\Phi$ are defined only over the temporal attributes valid-from and valid-to, neither of which may be included in Q. The structure of this formula means that, as with Quel, not all algebraic expressions can be expressed as a single TQuel statement (e.g., expressions containing a union algebraic operator). If none of the nontemporal attributes over which a TRDM database is defined has a domain whose values are comparable to those in the set of times T, then in no algebraic expression over the relations in this database can such an attribute be compared to either valid-from or valid-to. For such a database, TQuel statements, as represented by a defining tuple calculus formula, are no more restrictive than Quel statements. Therefore (as with Quel), a sequence of TQuel statements can express any algebraic expression, perhaps by creating temporal relations and by using statements such as APPEND and DELETE.

Although interval relations and event relations are distinguished by TQuel, they are standard 1NF relations that provide a fixed way of encoding temporal data using the temporal attributes. TQuel differs from Quel only in the distinction accorded these attributes. Thus, like Quel, with the addition of such statements as APPEND, it is complete in the sense defined by Codd. By extension, as a result of the use of the temporal attributes, it is TU-Complete, but like all ungrouped models, it does not exhibit temporal value integrity.

Note that the query on the database in Table VII for the name and department of each employee that has at some time received a cut in salary, expressible in $L_h$ as

$[e.\mathrm{NAME}, e.\mathrm{DEPT} : t]\; \mathrm{EMPLOYEE}(e) \wedge t \in e.l \wedge \exists t_1 \exists t_2 (\mathrm{EMPLOYEE}(e) \wedge t_1 \in e.l \wedge t_2 \in e.l \wedge (t_1 < t_2) \wedge e.\mathrm{SAL}(t_1) > e.\mathrm{SAL}(t_2))$,

is also expressible in TRDM (again, ignoring transaction times) as follows:

    range of e1 is EMPLOYEE
    range of e2 is EMPLOYEE
    retrieve into SalChange(e1.NAME, e1.DEPT)
    valid from begin of e1 to end of e1
    where e1.NAME = e2.NAME and e2.SAL < e1.SAL
    when (end of e1) precede (begin of e2)

Further note that an algebra has been proposed that provides a procedural equivalent to the TRDM calculus [McKenzie and Snodgrass 1991b]. Although it employs a different data model from that in TRDM (in fact, its model is N1NF), it is not a grouped model and does not support grouping.
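To make the reading of the TQuel retrieve statement for the salary-cut query concrete, the following sketch evaluates it over an interval relation held as plain rows. Several things here are assumptions of the sketch rather than TQuel definitions: rows are tuples `(NAME, DEPT, SAL, valid_from, valid_to)` with `valid_to` excluded from the interval, the data is hypothetical, and `(end of e1) precede (begin of e2)` is read as `valid_to` of e1 being less than or equal to `valid_from` of e2.

```python
def salary_cut_tquel(employee):
    """Evaluate the salary-cut retrieve statement by nested iteration."""
    result = set()
    for (n1, d1, s1, from1, to1) in employee:        # range of e1 is EMPLOYEE
        for (n2, _d2, s2, from2, _to2) in employee:  # range of e2 is EMPLOYEE
            where = n1 == n2 and s2 < s1
            when = to1 <= from2  # assumed reading of the precede predicate
            if where and when:
                # valid from begin of e1 to end of e1
                result.add((n1, d1, from1, to1))
    return result


# Hypothetical interval relation.
EMPLOYEE = [
    ("Ann", "Toys", 30, 1, 3),
    ("Ann", "Toys", 25, 3, 5),
    ("Tom", "Shoes", 20, 1, 4),
    ("Tom", "Shoes", 30, 4, 6),
]

sal_change = salary_cut_tquel(EMPLOYEE)
```

The two range variables stand for two different time-stamped rows about the same employee, which is how an ungrouped language reaches values at two different times.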
7.4 The Temporal Relational Algebra of Lorentzos and Johnson

The final historical data model that we discuss is one that was proposed by Lorentzos and Johnson [1987]. This data model, called TRA, is essentially the same as that in Snodgrass [1987], except that as a historical model it is restricted to only one temporal dimension. Two of the stated goals of TRA are that "no new elementary relational algebra operations are introduced and first normal form is maintained" [Lorentzos and Johnson 1987, p. 99]. Typical relations in this model appear basically as in Table III (with the columns valid-from and valid-to called Sfrom and Sto, respectively). Although the structures of relations in this model are essentially the same as in the historical version of TRDM, we discuss this model here because, unlike that in Snodgrass [1987], the language that it proposes is an algebra rather than a calculus.

It is difficult to discuss the algebra of TRA formally because it is not specified formally. Rather, it is presented via a series of example queries and discussions. Nevertheless, enough of a picture of the algebra emerges clearly through these examples to make a discussion possible. Two new operators, FOLD and UNFOLD, are defined. These operators essentially convert between the time-interval representation (as in Table III) and a time-point representation (as in Table I). FOLD and UNFOLD are clearly expressible in terms of operators in the standard relational algebra, as Lorentzos and Johnson [1987] point out.

The previous sections have demonstrated that two other algebras, that of HRDM and TDMG, were incomplete because they were not able to compare the value of one attribute at a time $t_1$ with the value of another (or the same) attribute at some other time $t_2$. In TRA such comparisons are possible. Again consider the query that finds the name and department of each employee that has at some time received a cut in salary:

$[e.\mathrm{NAME}, e.\mathrm{DEPT} : t]\; \mathrm{EMPLOYEE}(e) \wedge t \in e.l \wedge \exists t_1 \exists t_2 (\mathrm{EMPLOYEE}(e) \wedge t_1 < t_2 \wedge e.\mathrm{SAL}(t_1) > e.\mathrm{SAL}(t_2))$.

This query can be expressed in TRA as follows: First, UNFOLD the interval relation EMPLOYEE into all of its time points:

$\mathrm{EMPLOYEE}_{U1} = \mathrm{UNFOLD}[\mathrm{Time}, \mathrm{Start}, \mathrm{Stop}](\mathrm{TIME}, \mathrm{EMPL})$.

Then, $\theta$-join this relation with itself, joining tuples with the same name and with a pay cut, and then project just the names of the employees from the result (here, NAME1 and NAME2, etc., refer to the NAME attributes in the first and second operands to the join):

$\mathrm{TEMP1} = \mathrm{EMPLOYEE}_{U1}\,[\mathrm{NAME1} = \mathrm{NAME2},\ \mathrm{TIME1} < \mathrm{TIME2},\ \mathrm{SAL1} > \mathrm{SAL2}]\,\mathrm{EMPLOYEE}_{U1}$,

$\mathrm{TEMP2} = \pi_{\mathrm{NAME1}}(\mathrm{TEMP1})$.

Finally, join the result with the original relation, and project onto the desired fields.

Because TRA is equivalent to standard relational algebra, as in the case of TRDM, the question of its TU-Completeness is reduced to the question of the completeness of relational algebra. Therefore, we conclude that TRA is TU-Complete, but like all ungrouped languages, it does not exhibit temporal value integrity.
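The three TRA steps for the salary-cut query (UNFOLD, self-join, and the final join with projection) can be sketched over plain rows. This is an illustration under stated assumptions, not the TRA definition of FOLD and UNFOLD: rows are tuples `(NAME, DEPT, SAL, start, stop)`, times are integers, the `stop` value is taken to be excluded from the interval, and the data is hypothetical.

```python
def unfold(rows):
    """Interval rows -> one row per time point (stop is excluded by assumption)."""
    return {(n, d, s, t)
            for (n, d, s, start, stop) in rows
            for t in range(start, stop)}


def fold(points):
    """Time-point rows -> maximal intervals of consecutive times per value combination."""
    by_value = {}
    for (*vals, t) in points:
        by_value.setdefault(tuple(vals), []).append(t)
    out = set()
    for vals, times in by_value.items():
        times.sort()
        start = prev = times[0]
        for t in times[1:]:
            if t != prev + 1:  # gap: close the current interval
                out.add(vals + (start, prev + 1))
                start = t
            prev = t
        out.add(vals + (start, prev + 1))
    return out


def pay_cut_departments(employee):
    """UNFOLD, self-join on name with an earlier higher salary, then join back."""
    u = unfold(employee)
    temp2 = {n1
             for (n1, _, s1, t1) in u
             for (n2, _, s2, t2) in u
             if n1 == n2 and t1 < t2 and s1 > s2}
    return {(n, d) for (n, d, _s, _a, _b) in employee if n in temp2}


EMPLOYEE = [
    ("Ann", "Toys", 30, 1, 3),
    ("Ann", "Toys", 25, 3, 5),
    ("Tom", "Shoes", 20, 1, 4),
    ("Tom", "Shoes", 30, 4, 6),
]

answer = pay_cut_departments(EMPLOYEE)
```

Everything used here is an ordinary set comprehension over flat rows, which reflects the point that no operation beyond those of the standard relational algebra is needed.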
The results of our explorations into the completeness of these five languages are summarized in Table XII.

8. SUMMARY AND CONCLUSIONS
In this paper we have explored the question of completeness of languages for historical database models. In this exploration we were led to characterize such models as being of one of two different types, either temporally grouped or temporally ungrouped. We first discussed these notions informally by means of example databases and queries, and showed that the two models were not equivalent. The difference between the two models is that, in temporally grouped models, historical values (like salary histories) are treated as first-class objects that can be referred to directly in the query language. In the temporally ungrouped models, no such direct reference is permitted. We have characterized this property of the grouped models as temporal value integrity.

We then proceeded to define the two concepts of weak completeness and strong completeness between two data models with different representation paradigms and different query languages. In the case of weak completeness, there is a correspondence mapping from the relations of the reference model to the comparison model, and a mapping on the query language that preserves the meaning of a query. The problem with weak equivalence is that different relations in the reference model can be mapped to the same relation in the comparison model, and so information, for example, grouping, can be lost. In the case of strong completeness, the correspondence mapping must be one-to-one, and hence, there is no loss of information.

For the ungrouped models, we have defined three different languages: TL, a temporal logic; TC, a logic with explicit reference to time; and TA, a temporal algebra. We have motivated our choice for TC as the basis for TU-Completeness. Any one of the three can serve as the basis for TU-Completeness. An ungrouped model is said to be TU-Complete if it is strongly complete with respect to $M_{TU} = (TU, TC)$. For the grouped models, we have defined the calculus $L_h$, a many-sorted logic with variables over ordinary values, historical values, and times. We have proposed $L_h$ as the basis for TG-Completeness. A grouped model is said to be TG-Complete if it is strongly complete with respect to $M_{TG} = (TG, L_h)$.

We then proceeded to explore more formally the relationship between ungrouped and grouped models. We have demonstrated a technique for extending the ungrouped model with a grouping mechanism, a group identifier. With this mechanism we have shown how the ungrouped model TU and the language TC could be extended to $TU_g$ and $TC_g$ in such a way as to make the resulting model equivalent in power to TG with $L_h$. In this way we have demonstrated that the grouped and ungrouped models differ only with respect to the grouping capability. More precisely, we have proved that the model $M_{TU} = (TU, TC)$ is weakly equivalent and the model $M_{TU_g} = (TU_g, TC_g)$ is strongly equivalent to the model $M_{TG} = (TG, L_h)$.

Table XII. Summary of Completeness Results

    Language         Reference                      Type        Completeness
    L_h              Section 5                      grouped     Basis for TG-Completeness
    TC               Section 4                      ungrouped   Basis for TU-Completeness
    HRDM algebra     [Clifford and Croker 1987]     grouped     Not TG-Complete
    TDMG calculus    [Gadia 1988]                   grouped     Not TG-Complete
    TDMG algebra     [Gadia 1988]                   ungrouped   Not TU-Complete
    TQuel            [Snodgrass 1987]               ungrouped   TU-Complete
    TRA              [Lorentzos and Johnson 1987]   ungrouped   TU-Complete

Finally, we have examined several historical relational proposals to see whether they were TU-Complete or TG-Complete. We looked at four historical models, two grouped and two ungrouped, offering five different languages. In the ungrouped models, we have found both an algebra (from TRA) and a calculus (TQuel from TRDM) that are TU-Complete, whereas in the grouped models, we found, apart from our metric, the complete calculus $L_h$, two languages that are not TG-Complete: an algebra (from HRDM) and a calculus (from TDMG), as well as an algebra (from TDMG) (which operates on ungrouped versions of grouped relations) that is not TU-Complete. We believe that this classification scheme and our examination of the completeness of several historical models should help to explicate the differences and the commonalities between the various models proposed in the literature. As with the relational model, a baseline notion of completeness of query languages, although imperfect (e.g., relationally complete languages do not allow for transitive closure queries or support aggregates), nonetheless provides a minimum and reasonable metric with which to compare a variety of different languages.

One point bears emphasizing. It has on occasion been said that the issue of adding time to relational databases is an uninteresting one, since the user can always add whatever extra attributes are desired (e.g., Start-Time and End-Time) and then use standard SQL (or relational algebra) as the query language. In our discussion of the completeness of the ungrouped temporal languages, we, to some extent, have relied on the underlying point of this argument. For example, this point underlies our argument that TRA (which is equivalent to standard relational algebra) is TU-Complete. Two points need to be made in reply to this comment. First, there is a difference between the formal notion of completeness and the informal, but no less important, notion of ease of use. Even though the programming language C is formally equivalent to a Turing Machine, it is a lot more convenient to use C if one is writing an operating system because of its high-level features. The built-in temporal features of the temporal and historical data models make them easier to use for managing temporal data; without these built-in features a greater burden is placed upon the user. Second, this paper has shown that the grouped models and languages are more expressive than their corresponding ungrouped models, unless these models add a surrogate grouping mechanism. This grouping mechanism, itself, is a higher-level construct that is implicit in the grouped systems (and this, we argue, makes them more convenient), but needs to be made explicit in the ungrouped systems for them to be equivalent in expressive power.

There are a few interesting areas for future research that this work has clarified. The first relates to our grouping axioms (in Section 6). It might seem that they are rather strong, perhaps stronger than necessary for simulating temporal grouping in a temporally ungrouped model like TU. Clearly, in order to have an isomorphism between two such models, the structural $\Omega$ mapping and the $\Gamma$ mapping on queries must work hand in hand. It is an area for additional research whether our $\Omega_{TU}$ could be simplified, most likely at the expense of complicating the mapping on queries. Another area of interest arises when it is noted that we did not find here, nor are we aware of, any complete algebra for grouped historical data models. Such an algebra is clearly needed. Another area in which there continues to be interest is in the support of evolving schemata. Our decision not to treat this interesting area here was based largely on the fact that hardly any of the models in the literature incorporate this feature, and we wanted to choose the common denominator of all the models in order to make our comparisons fairly. The model in Clifford and Croker [1987] addressed this issue, and other work (e.g., Banerjee et al. [1987] and McKenzie and Snodgrass [1990]) continues to be done in this area.

Finally, we would like to address the question of completeness for temporal as opposed to historical relational models (in the terminology of Snodgrass and Ahn [1985]). We believe that our results on grouped and ungrouped historical relational completeness can be extended in a straightforward way to temporal data models and languages. The extension would involve the addition of another sort (for transaction times). In ungrouped temporal models, relations would be extended with an additional column to stamp every tuple with its transaction time, and the language would have constants, as well as variables, and quantification for this sort. In grouped temporal models, values would be extended to be doubly indexed; they would most likely be better modeled as functions from a transaction time into functions from a data time to a scalar value, but the order of the two temporal indices could be reversed. Preliminary work that we have done on Indexical Databases [Clifford 1992] holds promise for a unified treatment, not only of these two temporal dimensions, but of spatial, or other, dimensions as well.
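The doubly indexed values mentioned for grouped temporal models can be pictured with a small sketch. The encoding is an assumption made for illustration only: a value is a dictionary from transaction time to a dictionary from data time to a scalar, and the function name `swap_indices` and the sample salary history are hypothetical.

```python
def swap_indices(value):
    """Turn transaction_time -> (data_time -> scalar) into data_time -> (transaction_time -> scalar)."""
    swapped = {}
    for tt, history in value.items():
        for dt, scalar in history.items():
            swapped.setdefault(dt, {})[tt] = scalar
    return swapped


# Hypothetical salary value: at transaction time 10 the database recorded a
# salary of 20 for data times 1 and 2; at transaction time 11 the record for
# data time 2 was corrected to 25.
salary = {
    10: {1: 20, 2: 20},
    11: {1: 20, 2: 25},
}

by_data_time = swap_indices(salary)
```

Applying `swap_indices` twice returns the original value whenever every inner history is nonempty, which is one way to see that the order of the two temporal indices carries no extra information.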
ACKNOWLEDGMENTS

The authors would like to thank the reviewers, and also Jan Chomicki and Fabio Grandi, for their valuable comments, which have helped to improve the contents and presentation of this paper. We would also like to thank Rick Snodgrass for ongoing and fruitful discussions that have helped to clarify many of the ideas presented here.
REFERENCES AHO, A
V , AND ULLMAN, J
Syrnposzum
D.
1979.
Umversahty
on 1%-mczples ofl%-ogrammtrzg
of data retrieval languages In ACM ACM, New York, pp. 111-120 Svst 11, 4 (Dec.), ACM Trans. Database
Languages
ARI~V, G. 1986. A temporally oriented data model. 499-527 1986 Temporal data management. Models and systems In Neu ARIAV, G, ANDCLIFFORD, J Dzrectzons for Database Systems, G Anav and J. Chfford, Eds. Ablex, Norwood, N.J , pp 168-185, BANCTIHON, F. 1978 On the completeness of query languages for relational databases In Proceedings of the 7th Sympmzum on Mathematical Foundatzon~ of Computmg. SpringerVerlag, New York, pp. 112-123. BANI?RJR~, J., KIM, W,, KIM, H.-J., AND KORTH, H. F. 1987. Semantm and Implementation of Conference schema evolutlon m object-oriented databases In Proceedl ngs of ACM SIGMOD (San Francisco, B~N-ZVL J 1982. Cahforma CHANIXtA,
A.
Cahf ). ACM, New York, pp. 311-322. The
time
relational
model,
Ph,D,
thesis,
Computer
Science
Dept.,
Umv.
of
at Los Angeles K,
ANn
HAM?L,
D
1980.
Computable
queries
Syst.
ACM Transactmns
Sc~ 21, 2 (Ott
for
relational
data
bases
J
), 156-178. of Workshop on Logzcal CLIIWOEU>,J. 1982. A model for historical databases In Proceedings Bases for Data Bases (Toulouse, France, Dec ), ONERA-CERT, Toulouse, France. of Workshop on Current Issues in 1992 Indexical databases In Proceedings CLIFFORII, J Database Systems (Newark, N J , Ott ) Rutgers LTnlv , New Brunswick, N J CMFFORD, J., AND CRORRR, A. 1987 The historical relational data model HRDM and algebra of the 3rd IEEE International Conference on Data based on hfespans. In Proceedings Engzneermg (Los Angeles, Cahf ) IEEE, New York, pp. 528-537 CLIFFORD, J , AND TANSEL, A U. 1985. On an algebra for historical relational databases: Two of ACM SIGMOD Conference (Austin, Tex., May). ACM, New York, wews. In Proceechngs pp 247-265 CLIFFORD, J., AND WARREN, D. S 1983. Formal semantics for time in databases AC’M Trans Database Syst 6, 2 (June), 214–254. CODD, E F. 1972 Relational completeness of data base sublanguages. In Data Base Systems, R Rustin, Ed Prentice-Hall, Englewood Cliffs, N J Introdaetzon to Database Systems, Vol. II, Addison-Wesley, Reading, Mass, DATE, C J 1983. A Mathematzcul Introduction to Logic. Academic Press, New York. EN~ERTON, H, B. 1972, FISCHER, P C., mm VAN GUCHT, D 1985 Determmmg when a structure m a nested relatlon. Conference on Very Large Databases. pp. 171-180. In In fernatzonul GABBAY,D 1989. The declarative past and Imperative future: Executable temporal lo~c for of Colloquz am on Temporal Logzc zn SpecLflcatLon, B. mteractlve systems. In Proceechngs Bameqbal, H. Barringer, and A Pnueh, Eds. Lecture Notes m Computer Science, vol. 398, Spr]nger-Verlag, New York, pp 402-450. GAIXA, S K 1986 Toward a multihomogeneous model for a temporal database In Proceedings of the 2nd IEEE In ternatzonal Conference on Data Engmeermg (Los Angeles, Cahf., Feb.). IEEE, New York. GADIA, S. K. 1988. A homogeneous relational model and query languages for temporal Syst. 13, 4, 418-448. databases. ACM Trans. 
Database HALL, P., OWLETT, J., AND TODD, S, J, P, 1976. Relatlons and entitles In MOd@?zzL?Lg zn Data Base Manageme?Lt Systems, G. M. NiJssen, Ed. North-Holland, Amsterdam, Comput
on Database Systems, Vol
19,
No
1, March
1994
Completeness
of Historical
Relational
Query
Languages
.
115
HALMOS, P. 1960. Nazue Set Theory. Van Nostrand, Princeton, N.J. JONES, S., AND MASON, P. J. 1980. Handling the time dimension in a data base. In Proceecilngs of the International Conference of Data Bases (Heyden, U. K., July). British Computer Society, London, pp. 65-83. KABANZA, F., STEVENNE, J.-M., AND WOLPER, P. 1990. Handling infinite temporal data. In Proceedings
of the
9th
ACM
Symposium
York, pp. 392-403. KAMP,H. 1971. Formal properties of ‘now’.
on Principles Z’heorza
of Database
Systems,
ACM, New
37, 3, 227-273.
KAMP, H. 1968. On the tense logic and the theory of order. Ph.D. thesis, Philosophy Dept., Univ. of California at Los Angeles.
KLUG, A. 1982. Equivalence of relational algebra and relational calculus query languages having aggregate functions. J. ACM 29, 3 (July), 699-717.
KROGER, F. 1987. Temporal Logic of Programs. EATCS Monographs on Theoretical Computer Science, vol. 8, Springer-Verlag, New York.
LORENTZOS, N. A., AND JOHNSON, R. G. 1987. TRA: A model for a temporal relational algebra. In Proceedings of the Conference on Temporal Aspects in Information Systems, AFCET, pp. 99-112.
MAIER, D. 1983. The Theory of Relational Databases. Computer Science Press, Rockville, Md.
MCKENZIE, E. 1986. Bibliography: Temporal databases. ACM SIGMOD Rec. 15, 4 (Dec.), 40-52.
MCKENZIE, E., AND SNODGRASS, R. 1990. Schema evolution and the relational algebra. Inf. Syst. 15, 2 (June), 207-232.
MCKENZIE, E., AND SNODGRASS, R. 1991b. Supporting valid time in an historical relational algebra: Proofs and extensions. Tech. Rep. TR-91-15, Dept. of Computer Science, Univ. of Arizona, Tucson, Aug.
MCKENZIE, E., AND SNODGRASS, R. 1991a. An evaluation of relational algebras incorporating the time dimension in databases. ACM Comput. Surv. 23, 4 (Dec.), 501-543.
NAVATHE, S. B., AND AHMED, R. 1989. A temporal relational model and a query language. Inf. Sci. 49, 2, 147-175.
QUINE, W. V. O. 1953. From a Logical Point of View. Harvard University Press, Cambridge, Mass.
RESCHER, N., AND URQUHART, A. 1971. Temporal Logic. Springer-Verlag, New York.
ROTH, M. A., KORTH, H., AND SILBERSCHATZ, A. 1988. Extended algebra and calculus for nested relational databases. ACM Trans. Database Syst. 13, 4 (Dec.), 388-417.
SARDA, N. L. 1990. Algebra and query language for a historical data model. Comput. J. 33, 1 (Feb.), 11-18.
SEGEV, A., AND SHOSHANI, A. 1987. Logical modeling of temporal data. In Proceedings of ACM SIGMOD Conference (San Francisco, Calif., May). ACM, New York, pp. 454-466.
SNODGRASS, R. 1987. The temporal query language TQuel. ACM Trans. Database Syst. 12, 2 (June), 247-298.
SNODGRASS, R. 1990. Temporal databases: Status and research directions. ACM SIGMOD Rec. 19, 4 (Dec.), 83-89.
SNODGRASS, R., AND AHN, I. 1985. A taxonomy of time in databases. In Proceedings of ACM SIGMOD Conference. ACM, New York, pp. 236-246.
SNODGRASS, R., GOMEZ, S., AND MCKENZIE, E. 1989. Aggregates in the temporal query language TQuel. Tech. Rep. TR-89-26, Dept. of Computer Science, Univ. of Arizona, Tucson, Nov.
SOO, M. D. 1991. Bibliography on temporal databases. ACM SIGMOD Rec. 20, 1 (Mar.), 14-23.
STAM, R., AND SNODGRASS, R. 1988. A bibliography on temporal databases. Database Eng. 7, 4 (Dec.), 231-239.
STONEBRAKER, M., WONG, E., KREPS, P., AND HELD, G. 1976. The design and implementation of Ingres. ACM Trans. Database Syst. 1, 3 (Sept.), 189-222.
TANSEL, A., AND GARNETT, L. 1992. On Roth, Korth, and Silberschatz's extended algebra and calculus for nested relational databases. ACM Trans. Database Syst. 17, 2 (June), 374-383.
TANSEL, A., CLIFFORD, J., GADIA, S., JAJODIA, S., SEGEV, A., AND SNODGRASS, R., EDS. 1993. Temporal Databases. Benjamin/Cummings, Menlo Park, Calif.
TANSEL, A. U. 1986. Adding time dimension to relational model and extending relational algebra. Inf. Syst. 11, 4, 343-355.
TUZHILIN, A. 1989. Using relational discrete event systems and models for prediction of future behavior of databases. Ph.D. thesis, Computer Science Dept., New York Univ., New York, Oct.
TUZHILIN, A., AND CLIFFORD, J. 1990. A temporal relational algebra as a basis for temporal relational completeness. In International Conference on Very Large Databases. pp. 13-23.
ULLMAN, J. 1988. Principles of Database and Knowledge-Base Systems. Vol. 1. Computer Science Press, Rockville, Md.
VAN BENTHEM, J. F. A. K. 1983. The Logic of Time. Reidel, Hingham, Mass.

Received January 1989; revised December 1991 and December 1992; accepted December 1992