Type-Theoretical Interpretation and Generalization of Phrase Structure Grammar AARNE RANTA, University of Helsinki. E-mail:
[email protected].
Abstract
In this paper, we shall present a generalization of phrase structure grammar, in which all functional categories (such as verbs and adjectives) have type restrictions, that is, their argument types are specific domains. In ordinary phrase structure grammar, there is just one universal domain of individuals. The grammar does not make a distinction between verbs and adjectives in terms of domains of applicability. Consequently, it fails to distinguish between sentences like every line intersects every line, which is well typed, and every line intersects every point, which is ill typed. Our generalization relates to ordinary phrase structure grammar in the same way as the higher-level constructive type theory of Martin-Löf (see Nordström et al. 1990, part III, or Ranta 1994, chapter 8) relates to the simple type theory of Church (Church 1940). Simple type theory has been used in linguistics and related with phrase structure grammar, especially in the tradition based on the work of Montague (1974). Our definition of the grammar will be more formal than Montague's in the sense that we shall use a formal metalanguage, that of constructive type theory, for defining both the object language and its interpretation. The grammar, both syntax and semantics, is thus readily implementable in the type-theoretical proof system ALF (see Magnusson and Nordström 1994 for ALF). Inside type theory, a distinction can be made between the object language and the model, in other words, between syntactic and semantic types. It will turn out that the object language cannot be defined independently of the model, as in ordinary Tarski semantics. This is a direct consequence of introducing the typing restrictions. The grammar presented here can be seen as a formal linguistic elaboration of the work presented in Ranta 1991 and 1994.
1 Correspondence between rule notations
Bull. of the IGPL, Vol. 3 No. 2,3, pp. 319–342, 1995

Almost all contemporary grammatical theories assume some form of phrase structure grammar, consisting of rules like S → NP VP. This rule says that a sentence (S) may consist of a noun phrase (NP) combined with a verb phrase (VP). Type-theoretically, it is an introduction rule for the category S, saying that given an expression of category NP and an expression of category VP, you may form an expression of category S. In addition to the categories S, NP, and VP, we can make the expressions explicit:
given an expression a of category NP and an expression b of category VP, you may form the expression a b of category S. (We write a b for the concatenation of the expressions a and b.) In the rule form, this is written

    a : NP   b : VP
    ---------------
       a b : S

where we write c : C to say that c is a C. The original phrase structure formulation shows the categories only, suppressing the expressions. The closest corresponding inference rule is thus the natural deduction rule

    NP   VP
    -------
       S

The relation of this rule to the one showing the expressions a, b, and a b is the same as the relation of predicate calculus to type theory: only types are shown, objects are suppressed. In the rules of logic, the types are called propositions and their objects proofs. For instance, the conjunction introduction rule
    A   B
    -----
    A & B

which looks like a rule for combining the propositions A and B into the proposition A & B, is reformulated as

    a : A   b : B
    --------------
    (a, b) : A & B

which is explicitly a rule for combining given proofs into complex proofs. (See Martin-Löf 1984 for the type-theoretical interpretation and generalization of predicate calculus.) By restoring the objects, we may hope to enrich phrase structure grammar in the same way as type theory enriches predicate calculus by restoring the proofs.
2 Reformulation of phrase structure grammar
Consider the following very small grammar of the language of geometry.

    S → NP V1,
    V1 → V2 NP,
    NP → every CN,
    CN → line,
    V2 → intersects.

The grammar has the categories S of sentences, CN of common nouns, NP of noun phrases, V1 of one-place, or intransitive, verb phrases, and V2 of two-place, or transitive, verb phrases. It generates just one sentence, every line intersects every line. We shall first give a type-theoretical reformulation of the grammar. The rewrite rules of phrase structure grammar are formulated as introduction rules of the sets S, CN, NP, V1, and V2.

    Q : NP   F : V1          F : V2   Q : NP
    ---------------          ---------------
    SUBJ(Q, F) : S           OBJ(F, Q) : V1
       A : CN
    -------------        line : CN,        intersect : V2.
    every(A) : NP

Observe that the expression of S formed by the first rule is not mere concatenation of the noun phrase and the verb, but a combination of them by the operator SUBJ. In the second rule, the transitive verb is combined with the noun phrase by the operator OBJ. If we just expressed the rules in terms of concatenation,

    Q : NP   F : V1          F : V2   Q : NP
    ---------------          ---------------
       Q F : S                  F Q : V1

we would overload the concatenation symbol: it would be ambiguous between an operator forming an element of S from elements of NP and V1, and an operator forming an element of V1 from elements of V2 and NP. To get unambiguous constructors of the sets S and V1, we thus introduce the distinct operators SUBJ and OBJ. Now that we have replaced the overloaded concatenation symbol by unambiguous constructors, our grammar does not quite have the effect of the original phrase structure grammar. It does not generate the sentence every line intersects every line but the functional term

    SUBJ(every(line), OBJ(intersect, every(line)))

of type S. In general, what type-theoretical rules generate are functional terms, whereas phrase structure rules generate strings of words. We shall use the Roman type for primitive functional expressions, and the Italic type for English words. To get from functional terms to strings, we have to add sugaring rules, which tell how functional terms are transformed into strings. These sugaring rules are compositional, in the sense that there is a sugaring rule for each constructor, and the sugaring of a complex term is a string composed from the sugarings of its constituents. Sugaring rules will be formulated as equations of the form
    F = E

read "F is sugared into E".

    SUBJ(Q, F) = Q F,
    OBJ(F, Q) = F Q,
    every(A) = every A,
    line = line,
    intersect = intersects.

(Later, in Section 11, we shall give sugaring rules formally in type theory.) Now we can compute

    SUBJ(every(line), OBJ(intersect, every(line)))
    = every(line) OBJ(intersect, every(line))
    = every line intersect every(line)
    = every line intersects every line.
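The constructor-based grammar and its compositional sugaring can be sketched in a few lines of Python. The constructor names SUBJ, OBJ, every, line, and intersect follow the text; the tuple encoding of functional terms is our own illustration, not part of the paper's formalism.

```python
# A minimal sketch of the functional terms of Section 2 and their sugaring.
# Terms are encoded as tuples whose first component names the constructor.

def SUBJ(Q, F): return ("SUBJ", Q, F)       # NP x V1 -> S
def OBJ(F, Q):  return ("OBJ", F, Q)        # V2 x NP -> V1
def every(A):   return ("every", A)         # CN -> NP

line = ("line",)            # line : CN
intersect = ("intersect",)  # intersect : V2

# One sugaring equation per constructor: the sugaring of a complex term
# is a string composed from the sugarings of its constituents.
def sugar(t):
    op = t[0]
    if op == "SUBJ":      return sugar(t[1]) + " " + sugar(t[2])
    if op == "OBJ":       return sugar(t[1]) + " " + sugar(t[2])
    if op == "every":     return "every " + sugar(t[1])
    if op == "line":      return "line"
    if op == "intersect": return "intersects"
    raise ValueError(op)

term = SUBJ(every(line), OBJ(intersect, every(line)))
print(sugar(term))  # every line intersects every line
```

Note that sugaring is defined by recursion on the constructors, so the tree structure of the term fully determines the string, exactly as in the equational rules above.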
3 The reformulation in tree notation
The reformulation we have given of the phrase structure grammar can be made to look more familiar by using tree forms instead of new operation symbols as constructors. The category S is then really the set of S trees, V1 is the set of V1 trees, etc. The type-theoretical rules are rules for building complex trees from given trees.
Writing [C X Y] for the tree with root label C and daughters X, Y, the rules read

    Q : NP   F : V1          F : V2   Q : NP
    ---------------          ---------------
     [S Q F] : S             [V1 F Q] : V1

        A : CN
    -----------------    [CN line] : CN,    [V2 intersects] : V2.
    [NP every A] : NP
The sugaring rules are now rules that take trees into strings, deleting the tree structure.

    [S Q F] = Q F,
    [V1 F Q] = F Q,
    [NP every A] = every A,
    [CN line] = line,
    [V2 intersects] = intersects.
It is easy to see how the tree

    [S [NP every [CN line]] [V1 [V2 intersects] [NP every [CN line]]]]

sugars step by step into the string every line intersects every line.
4 Correspondence with Montague grammar
The type-theoretical reformulation we have given to phrase structure grammar is similar to the grammar of analysis trees in Montague's PTQ (The proper treatment of quantification in ordinary English, chapter 8 in Montague 1974). The operators SUBJ and OBJ correspond to Montague's syntactic operators F4 and F5, respectively. The operator every corresponds to Montague's F0. But there is one important difference. In Montague's syntactic rules (pp. 251–253 in op. cit.), syntactic operators like F0, F4, and F5 are not really introduced as constructors of analysis trees, but as non-canonical string-forming operators. This is seen from the explanations telling how they are computed, corresponding to our sugaring rules. For instance, the rule S4 (p. 251 in op. cit.) reads: if α ∈ P_{t/IV} and δ ∈ P_{IV}, then F4(α, δ) ∈ P_t, where F4(α, δ) = αδ′ and δ′ is the result of replacing the first verb (i.e., member of B_{IV}, B_{TV}, B_{IV/t}, or B_{IV//IV}) in δ by its third person singular present. The sugaring rule is expressed by the equation F4(α, δ) = αδ′, whose right hand side is quite obviously a string. That the operators F0–F15 are not constructors of trees is also seen from the fact that they are overloaded: F5, for instance, functions both as combining transitive verbs with objects to form intransitive verbs, and as combining prepositions with noun phrases to form adverbials. Thus F5 cannot properly be a constructor of the type of intransitive verbs, nor of the type of adverbials. But it is all right to see it as a non-canonical operator taking strings into strings. However, the syntactic operators F0–F15 play quite a different role in the translation rules (pp. 261–262 in op. cit.), which define an interpretation of analysis trees in intensional logic. These rules are formulated inductively along the structure imposed by the operators F0–F15. In order for such a formulation to be valid, the operators F0–F15 must be understood as canonical forms of analysis trees.
Thus there is a tension between the two uses of the operators F0–F15 in PTQ. If we choose the first use, as non-canonical string operators, to be the correct one,
we lose the possibility of defining the interpretation inductively on the structure of analysis trees. The only canonical form will then be the concatenation of a word with a string, and this structure hardly admits of inductively defined interpretation. But if we choose the second use to be normative, and treat F0–F15 as constructors of analysis trees, we can still get the strings by defining sugaring inductively along the tree structure. (We must then, of course, replace the overloaded operators by unambiguous ones.) This may even be a better procedure for defining what strings are grammatical, since it does not destroy the tree structures of constituents before combination, but only after combination. Thus the sugaring of the combination may appeal to the tree structures of the constituents, which is not possible in Montague's original formulation. (For example, to sugar F4(α, δ), one must there run through the string δ to find the first verb in it, and this can only be a single word that is classified as a verb. It cannot be a complex verb phrase, like walk and run, and thus the analysis tree F4(John, F8(walk, run)) sugars into the ungrammatical string John walks and run.)
5 Interpretation of phrase structure grammar
In PTQ, Montague defined a model-theoretical interpretation of analysis trees via a translation into intensional logic. In EFL (English as a formal language, Montague 1974, chapter 6), the interpretation was defined directly. We shall use type-theoretical terms to give the interpretation. As these terms are expressions of a formal language, our interpretation will resemble the translation part of PTQ. But as they can be semantically explained directly, just like the informal metalanguage in which Montague formulates his model theory, the interpretation is really a formal version of EFL. (Notice, for instance, that the definitions of the domains of individuals are made formally in type theory, and need not be added by model-theoretical semantics. Cf. Ranta 1994, chapter 2, for the syntax and semantics of type theory.) To give the interpretation, we assume a set D to be defined,

    D : set,

as the domain of individuals. We also assume the type of propositions,

    prop : type,

but we need not decide whether its objects are to be truth values, as in classical logic, or sets, as in intuitionistic logic. In simple type theory, we can, for any given types α and β, form the complex type (α)β of functions from α to β. (Montague's notation for (α)β is ⟨α, β⟩.) Thus, for instance, (prop)prop is the type of one-place propositional connectives, such as negation, and (D)prop is the type of one-place propositional functions over the domain D; we assume that all sets are types as well. Now we can assign, to each syntactic category, its interpretation, in Montague's words, its domain of possible denotations. We write C* for the interpretation of C. The interpretations of the five categories we have introduced are the following.
    S* = prop,
    NP* = ((D)prop)prop,
    V1* = (D)prop,
    V2* = (D)(D)prop,
    CN* = (D)prop.

In other words, sentences are interpreted as propositions, noun phrases are interpreted as quantifiers over D, intransitive verbs as one-place propositional functions, transitive verbs as two-place propositional functions (that is, as functions from D to one-place propositional functions), and common nouns as one-place propositional functions. These assignments of domains to syntactic categories are in accordance with Montague, with the exception that possible worlds are ignored. It remains to interpret the language itself, that is, to assign to each expression of each category a denotation in the type that interprets the category. We write c* = d to say that the expression c has the interpretation d. The reader can check that the following interpretations have appropriate types.

    SUBJ(Q, F)* = Q*(F*),
    OBJ(F, Q)* = (x)Q*((y)F*(x, y)),
    every(A)* = (Y)∀((x)⊃(A*(x), Y(x))).

In these definitions, we have used abstraction to form functions

    (x)b : (α)β

where

    b : β under the hypothesis x : α.

An application of such an abstract is computed in accordance with the conversion rule

    ((x)b)(a) = b(a/x),

that is, by substituting the argument a for the free occurrences of the variable x in b. Moreover, we have assumed the logical constants for implication and universal quantification,

    ⊃ : (prop)(prop)prop,
    ∀ : ((D)prop)prop.

We have not given the definitions of line* and intersect*. Now we can easily compute

    SUBJ(every(line), OBJ(intersect, every(line)))*
    = every(line)*(OBJ(intersect, every(line))*)
    = ((Y)∀((x)⊃(line*(x), Y(x))))((x)((Y)∀((z)⊃(line*(z), Y(z))))((y)intersect*(x, y)))
    = ∀((x)⊃(line*(x), ∀((y)⊃(line*(y), intersect*(x, y))))).

(Bound variables may have to be changed, as usual, to avoid clashes.) This establishes the interpretation of the sentence every line intersects every line,
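The interpretation clauses above can be mimicked directly with Python lambdas. In this sketch we read prop classically as a Python bool, and we invent a tiny two-element domain and trivially true denotations for line* and intersect*, purely for illustration; these choices are ours, not the paper's.

```python
# Sketch of the interpretation of Section 5 over a toy finite model.

D = ["l1", "l2"]                      # an invented two-element domain

def forall(P):                        # the quantifier  ∀ : ((D)prop)prop
    return all(P(x) for x in D)

def implies(p, q):                    # the connective  ⊃ : (prop)(prop)prop
    return (not p) or q

line_star = lambda x: True            # line* : (D)prop       (toy denotation)
intersect_star = lambda x, y: True    # intersect* : (D)(D)prop (toy denotation)

# every(A)* = (Y) ∀ ((x) ⊃(A*(x), Y(x)))
def every_star(A_star):
    return lambda Y: forall(lambda x: implies(A_star(x), Y(x)))

# SUBJ(Q, F)* = Q*(F*);   OBJ(F, Q)* = (x) Q*((y) F*(x, y))
def SUBJ_star(Q, F): return Q(F)
def OBJ_star(F, Q):  return lambda x: Q(lambda y: F(x, y))

# The denotation of: every line intersects every line.
s = SUBJ_star(every_star(line_star),
              OBJ_star(intersect_star, every_star(line_star)))
print(s)  # True
```

Evaluating s traces exactly the computation displayed above: the quantifier of the subject is applied to the propositional function obtained from the verb and its object quantifier.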
which is obtained from the tree SUBJ(every(line), OBJ(intersect, every(line))) by sugaring. If the interpretation is given formally in type theory, a separate interpretation function must be used for each syntactic category. Then we have the functions

    INTS : (S)prop,
    INTCN : (CN)(D)prop,
    INTNP : (NP)((D)prop)prop,
    INTV1 : (V1)(D)prop,
    INTV2 : (V2)(D)(D)prop.

The interpretations of the forms of expression can be reformulated accordingly. The interpretations of categories are indicated implicitly, as the value types of these functions. As it is easy to reformulate the informal star notation along these lines, we shall continue to use it in this paper.
6 Generalization of phrase structure grammar
In constructive type theory, the one and only domain of individuals D is replaced by a whole type of sets,

    set : type,

such that any set may act as a domain. Thus we have the type (A)prop of propositional functions over A, for any set A. We also have the type (A)(B)prop of two-place propositional functions, with the argument types A and B possibly different, and the type ((A)prop)prop of quantifiers over A. We must now introduce a more general universal quantifier Π, which forms propositions

    Π(A, B), where A : set and B : (A)prop.

The operator Π cannot be typed in simple type theory in a way that forces the type of the second argument B to depend on the first argument A. For this, we need the generalized function type

    (x : α)β, where α : type, and β : type under the hypothesis x : α.

The application of a function f : (x : α)β to an argument a : α yields a value in the instance of β corresponding to a,

    f(a) : β(a/x).

Now we can type the universal quantifier,

    Π : (X : set)((X)prop)prop.

(Notice that the ordinary function type (α)β is the special case of (x : α)β where β : type is independent of x : α.) Given a domain A : set, we can form the universal quantifier over A by applying Π once,

    Π(A) : ((A)prop)prop.

The universal quantifier ∀, which we introduced above in simple type theory, is thus definable as Π(D). There is an obvious generalization of phrase structure grammar analogous to the generalization of simple type theory into constructive type theory: all categories interpreted in terms of function types over the domain D are relativized into categories depending on arbitrary domains. Then we have the following system of categories.

    category    of                                                    interpreted as
    S           sentences                                             prop
    CN          common nouns                                          set
    NP(A)       noun phrases of type A                                ((A)prop)prop
    V1(A)       intransitive verbs of subject type A                  (A)prop
    V2(A, B)    transitive verbs of subject type A and object type B  (A)(B)prop

(Notice that common nouns are now, quite naturally, interpreted as sets.) The rules forming expressions are generalized,

    A : set   Q : NP(A)   F : V1(A)
    -------------------------------
           SUBJ(A, Q, F) : S

    A : set   B : set   F : V2(A, B)   Q : NP(B)
    --------------------------------------------
              OBJ(A, B, F, Q) : V1(A)

         A : CN
    -----------------    line : CN,    intersect : V2(line*, line*),
    every(A) : NP(A*)

and interpreted,

    SUBJ(A, Q, F)* = Q*(F*),
    OBJ(A, B, F, Q)* = (x)Q*((y)F*(x, y)),
    every(A)* = Π(A*).

The type-independent SUBJ and OBJ operators are obtained as special cases by putting A = B = D. Sugaring proceeds in the same way as in ordinary phrase structure grammar: type information plays no role.

    SUBJ(A, Q, F) = Q F,
    OBJ(A, B, F, Q) = F Q,
    every(A) = every A,
    line = line,
    intersect = intersects.
Phrase structure notation can be extended to the new grammar for some of the rules, for instance,

    S → NP(A) V1(A),
    V1(A) → V2(A, B) NP(B),

but the rule

    NP → every CN

does not seem to have such a generalization expressible in phrase structure notation. The new grammar is no longer context-free, but its parsing problem can be solved along the following lines. The grammar of terms without type restrictions is context-free: we only check whether the terms are built by operators with correct numbers of arguments. Thus, for example, the operator SUBJ is just treated as a three-place operator whose arguments may be any terms. The rest of parsing is type checking, in the way it is done in type theory and its implementations. To get from an English sequence of words, say,

    every line intersects every line

to a functional term without type information,

    SUBJ(_, every(line), OBJ(_, _, intersect, every(line))),

is similar to parsing in ordinary phrase structure grammar. From this information, the type checker can already infer how the blanks are to be filled:

    SUBJ(line*, every(line), OBJ(line*, line*, intersect, every(line))).

On the other hand, the sentence

    every line intersects every point

only passes the first stage of parsing (assuming we have extended the grammar with the word point : CN), but at the second stage, the type checker detects a type mismatch between the verb and the object (more accurately, a failure of the constraint that point* = line* : set).
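The second, type-checking stage of parsing can be sketched as follows. Domains are represented here simply by their names ("line*", "point*"), and the lexicon tables are our own toy encoding of the typings line : CN, point : CN, and intersect : V2(line*, line*) from the text.

```python
# Sketch of type checking for the generalized grammar of Section 6.

CN_DEFS = {"line": "line*", "point": "point*"}   # A : CN with interpretation A* : set
V2_DEFS = {"intersect": ("line*", "line*")}      # verb -> (subject type, object type)

def np_type(np):
    # every(A) : NP(A*): the type of the noun phrase is the set A*.
    op, A = np
    assert op == "every"
    return CN_DEFS[A]

def check_sentence(subj_np, verb, obj_np):
    # Checking OBJ imposes the constraint that the object NP's type
    # equals the verb's object type; checking SUBJ does the same for
    # the subject.  A failed constraint means the term is ill typed.
    subj_t, obj_t = V2_DEFS[verb]
    if np_type(obj_np) != obj_t:
        return False
    if np_type(subj_np) != subj_t:
        return False
    return True

# every line intersects every line  -- well typed
print(check_sentence(("every", "line"), "intersect", ("every", "line")))   # True
# every line intersects every point -- constraint point* = line* fails
print(check_sentence(("every", "line"), "intersect", ("every", "point")))  # False
```

The first stage, building the term SUBJ(_, every(line), OBJ(_, _, intersect, every(line))) from the word string, is ordinary context-free parsing and is not shown; the check above corresponds to filling in and verifying the blanks.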
7 Object language and metalanguage
As it stands, our system of syntactic categories is just a generalization of (a part of) Montague's system to multiple domains of individuals. As his simple categories, like NP, are replaced by dependent categories, like NP(A), the categorial grammar must use dependent types, which are not provided by simple type theory. But if we look carefully at the grammar, we find another important difference. Compare the old and the new categorizations of SUBJ.

    SUBJ : (Q : NP)(F : V1)S,
    SUBJ : (A : set)(Q : NP(A))(F : V1(A))S.
(Notice that these categorizations are just alternative expressions for the inference rules stated above in Sections 2 and 6, respectively.) In the former, type-independent version of SUBJ, both arguments are expressions of the object language, that is, belong to syntactic categories. In other words, both constituents of the syntactic object SUBJ(Q, F) are syntactic objects themselves. But in the latter, type-dependent version, SUBJ(A, Q, F), only the last two arguments are syntactic: the first argument, A, is categorized as a set, which is not a syntactic category. Thus not all of the constituents of the expression SUBJ(A, Q, F), which belongs to the object language, to the language that we are defining, are expressions of the object language themselves. One of them belongs to the type-theoretical universe, in which the object language is interpreted. Several parallel terminologies are in use here. We have the distinctions object language vs. metalanguage, expression vs. object, syntactic vs. semantic, linguistic vs. ontological, language vs. interpretation. All these distinctions can be made inside type theory, which we are using as the metalanguage for speaking about both the object language and its interpretation. Thus we say that the types S, CN, NP(A), V1(A), and V2(A, B) are categories of the object language, or categories of expressions, or syntactic categories. Their objects are linguistic, belong to the language. On the other hand, the types set, prop, each set itself, and the function types are categories of the metalanguage, or categories of objects, or semantic categories. Their objects are ontological, belong to the interpretation. It is an immediate consequence of the introduction of these type restrictions that expression-forming operators take arguments that are not expressions themselves: the set arguments. We shall call these arguments the type information, and, more generally, the semantic information.
We take it as a fundamental principle of our grammar that all and only semantic arguments are deleted in sugaring. That is, as the phrase structure trees contain both semantic and syntactic constituents, and sugaring deletes the tree structure leaving a string of the basic expressions, it also deletes those basic expressions that belong to the metalanguage, but does not delete any expressions belonging to the object language. We shall see later that this principle has consequences in many decisions about grammatical rules. An immediate consequence of the principle is that every constituent expression of a complex expression is visible in the complex, even in the sugared form. If an expression has a "hidden constituent", it cannot be an expression but it must be semantic information. Syntactic constituents are just combined in grammar, not deleted. On the other hand, all semantic information must be deleted, since sugaring operations apply only to expressions of syntactic categories. It is easy to see that the grammar of Section 6 obeys this fundamental principle. Semantic information there is just type information. The difference between the new grammar and the old, type-independent grammar is just that the old grammar has no semantic information to be hidden. Indeed, the old grammar could be seen as a version of the new grammar obtained by omitting semantic information. The phrase structure trees of the new grammar are like trees of the old grammar with additional branches containing semantic information. In Montague grammar, and in the type-independent phrase structure grammar of Section 2, no semantic information is included in the syntactic rules. This now appears to us as a somewhat accidental consequence of there being just one domain of individuals: a grammar with a formalized metalanguage could well employ semantical information in syntactic rules. Montague's adherence to purely syntactic constructions gets a little strange in his mechanism of variable binding, where he does not use the ordinary variables x, y, z, … of logic, but introduces an analogous series of variables he0, he1, … that belong to the syntactic category NP. These variables never occur in the English strings produced, but are replaced by quantifier phrases and pronouns in sugaring. We could thus raise the objection that Montague's variable binding mechanism violates the principle that no syntactic constituents are deleted in sugaring. As syntactic categories are sets themselves, one can, with the present rules, form categories such as V1(S). This is, however, a source of impredicativity. To avoid it, one can restrict the domains of individuals to some fixed universe of sets, for instance, to a universe of small sets in the sense of Martin-Löf 1984.
8 The nature of type information
It took us some time to realize that type information is given in semantic arguments and not in syntactic constituents. The fashion in which we have introduced the generalized syntactic categories here makes it natural: just replace the hidden constant D : set with an explicit and arbitrary A : set. But if one starts with type theory, beginning to introduce syntactic categories alongside ordinary types, a natural choice is to make the dependent categories depend on A : CN,

    NP(A) : set for A : CN,
    NP(A)* = ((A*)prop)prop.

Then we could have the constructor

    SUBJ : (A : CN)(Q : NP(A))(F : V1(A))S,

which takes syntactic arguments only. It is clear from the sugaring rule of SUBJ that if type information is given by a syntactic argument, the principle that only semantic information is deleted is lost. But there are even more direct considerations that lead to the treatment of type information as semantical. These considerations are perhaps easiest to present for a category that we have not yet presented, and which Montague did not have, but
which is a very important category in logic and in the language of mathematics: the category

    PN(A), where A : set,

of proper names of type A, or, by another name, of singular terms of type A. The expressions of category PN(A) are interpreted as elements of A,

    PN(A)* = A.

Notice that we say, naturally, proper names of type A or proper names of individuals of type A, where it is clear that A stands for a type, not for a type term, that is, for a common noun. Thus we read

    Cadillac : PN(car)

as Cadillac is a name of a car, where "car" functions as a name of a set. Similarly, we read the type V1(A) as intransitive verbs of subject type A or, in logical terms, one-place predicates over the domain A. In these locutions, again, A is understood as a type and not as a common noun. We do not even assume that the object language has a common noun standing for the set A. (In the grammar of predicate calculus, one hardly ever assumes that the object language has a name for the domain.) In the case of English, it may happen, for instance, that A is only expressible by a sentence. If we have B : S, then, by the propositions as types principle of intuitionistic logic, B* : set. Thus we can form the category PN(B*). There need not be any C : CN such that C* = B*. An example of such a situation is provided by anaphoric reference to events. In the text

    John broke every bottle. Bill saw it.

the pronoun it can be analyzed as a singular term referring to an event of a type expressed by the sentence John broke every bottle, but this type has a common noun expression only if the sentence can be nominalized. The anaphoric reference functions independently of whether there is such a nominalization. (Cf. Barwise 1981 for anaphoric reference to events, and Ranta 1994, section 3.7, for a type-theoretical reconstruction.) For one more consideration, suppose that car and automobile are synonymous common nouns, that is,
    car : CN,
    automobile : CN,
    car* = automobile* : set.

The synonymy of two common nouns, the equality of their interpretations, does not mean that they are equal as common nouns,

    car = automobile : CN.

On the contrary, car and automobile have different combinatorial properties in English and must thus be treated as distinct syntactic objects. For instance, the indefinite phrase is formed by prefixing a to car but an to automobile. Yet we want to infer from Cadillac is a name of a car that Cadillac is a name of an automobile and, more generally,
    PN(car*) = PN(automobile*) : set.

The general rule should be that if A is the same domain as B, then all names of elements of A are also names of elements of B and vice versa,

    A = B : set
    -------------------
    PN(A) = PN(B) : set

This equality follows from the rule

    A : set
    -----------
    PN(A) : set

by extensionality. If we had, instead,

    A : CN
    -----------
    PN(A) : set

the corresponding rule

    A* = B* : set
    -------------------
    PN(A) = PN(B) : set

would not follow.
9 Anaphoric expressions
The pronominalization rule that we present in Ranta 1994 says that pronouns are identity mappings of sets, taking a given element of a given set into itself,

    Pron = (A)(a)a : (A : set)(a : A)A.
In sugaring, both the set argument and the element argument are deleted,

    Pron(A, a) = he / she / it,

depending on A. Now that we distinguish between syntactic and semantic categories, between CN and set, and between PN(A) and A, we have eight alternative versions of the type of Pron,

    (A : set)(a : A)A,
    (A : CN)(a : A*)A*,
    (A : set)(a : PN(A))A,
    (A : CN)(a : PN(A*))A*,
    (A : set)(a : A)PN(A),
    (A : CN)(a : A*)PN(A*),
    (A : set)(a : PN(A))PN(A),
    (A : CN)(a : PN(A*))PN(A*).

The first four alternatives are excluded, since we want Pron to form expressions of the syntactic category of singular terms, no matter what its arguments are. The last and the third last alternatives have the domain information of type CN, which we cannot assume, since we want pronominalization also to apply to sets that are not expressible by common nouns. Of the remaining alternatives, the seventh has the syntactic category PN(A) as its second argument, and is hence ruled out by the principle that syntactic arguments are not deleted in sugaring. (Notice, furthermore, that a pronoun can serve as a singular term for an individual even if there are no singular terms available for that individual in the object language. Such is the case when the individual is given by a variable.) The fifth alternative remains,

    Pron : (A : set)(a : A)PN(A),

and it does not violate the principle that all and only semantic information is deleted in sugaring. The definite article, in Ranta 1994, is categorized as an identity mapping as well,

    Def = (A)(a)a : (A : set)(a : A)A.

(In Ranta 1994, the name the is used instead of Def.) The sugaring rule deletes the second argument but preserves the first one, prefixing the word the to it,

    Def(A, a) = the A.

Of the eight alternative categorizations of the definite article, the right one is

    Def : (A : CN)(a : A*)PN(A*).

The second argument must be semantic, since it is deleted in sugaring.
But the first argument is preserved, and it must thus be syntactic, since sugaring is not defined for semantic arguments. Or reason in this way: if the first argument were A : set, sugaring would have to find a common noun B : CN such that B* = A : set, to be able to sugar the definite phrase. This is not an effective task, since there is no general method taking a set into a common noun.
10 Grammatical and natural gender
There is a piece of linguistic evidence giving some surprising confirmation to the categorizations we have found for Pron and Def. Some languages, like German, distinguish between grammatical and natural gender. Grammatical gender is a property of common nouns, and it is either the masculine, the feminine, or the neuter. Natural gender is a property of domains of individuals. Thus men are masculine, women are feminine, and countries are neuter in natural gender. Usually there is no conflict: the common noun Mann for man is masculine, Frau for woman is feminine. But the common noun Weib for woman is neuter in grammatical gender.

To formalize the gender system of German, assume the three-element set

   Gender = {M, F, N} : set.

Grammatical gender is assigned to common nouns,

   GG : (CN)Gender,

and natural gender to domains of individuals,

   NG : (set)Gender.

Assume, in German, that Frau and Weib are synonymous common nouns that differ in grammatical gender:

   Frau : CN,
   Weib : CN,
   Frau* = Weib* : set,
   GG(Frau) = F,
   GG(Weib) = N,
   NG(Frau*) = F,
   NG(Weib*) = F.

An important use of grammatical gender is in the choice of the definite article:

   DA : (Gender)Article,
   DA(M) = der,
   DA(F) = die,
   DA(N) = das.

The sugaring rule for definite phrases in German is

   Def(A; a) = DA(GG(A)) A.

Thus we have the definite phrases

   Def(Frau; a) = die Frau,
   Def(Weib; a) = das Weib.

But when pronominal reference is made, the feminine pronoun is used, irrespective of whether the woman has been given as a Frau or as a Weib,
   Pron(Frau*; a) = sie,
   Pron(Weib*; a) = sie.

The sugaring rule for pronouns in German is

   Pron(A; a) = er | sie | es,

depending on the natural gender of A. Given that the first argument of Pron is semantic, dependence on grammatical gender would not even be possible.
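The Frau/Weib pattern above can be made concrete in a small Python sketch. The dictionaries standing in for GG, NG, and the interpretation are assumptions of this illustration, not part of the paper's formalism.

```python
# A minimal model of the German gender system: the definite article follows
# the *grammatical* gender of the common noun, while the pronoun follows the
# *natural* gender of the domain. GG, NG, INTERP, and "frau_set" are
# illustrative names assumed here.

GG = {"Frau": "F", "Weib": "N"}       # grammatical gender of common nouns
NG = {"frau_set": "F"}                # natural gender of the domain Frau* = Weib*
INTERP = {"Frau": "frau_set", "Weib": "frau_set"}  # the interpretation A |-> A*

DA = {"M": "der", "F": "die", "N": "das"}          # definite article by gender
PRONOUN = {"M": "er", "F": "sie", "N": "es"}       # pronoun by gender

def sugar_def(cn: str) -> str:
    # Def(A; a) = DA(GG(A)) A : the article is chosen by grammatical gender.
    return f"{DA[GG[cn]]} {cn}"

def sugar_pron(domain: str) -> str:
    # Pron(A; a): the pronoun is chosen by the natural gender of the domain.
    return PRONOUN[NG[domain]]

print(sugar_def("Frau"))              # die Frau
print(sugar_def("Weib"))              # das Weib
print(sugar_pron(INTERP["Frau"]))     # sie
print(sugar_pron(INTERP["Weib"]))     # sie  (same domain, same pronoun)
```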
11 The place of morphological information
By now, we have introduced all English words as unanalyzed wholes. We have not distinguished between the singular and the plural number, nor between the nominative and the accusative case. But it is undoubtedly one of the tasks of a grammar to make such distinctions. And it is of great interest to see what the place of such morphological information is in relation to syntactic and semantic information. This is also a practical necessity in a grammar of a larger fragment, and in even a small grammar of a language with rich morphology, like German.

To make some use of morphological information, we extend the grammar of Section 6 by the plural noun phrase constructor all, interpreted as a universal quantifier word,

   all : (A : CN)NP(A*),
   all(A)* = ∀(A*).

The types of morphological information are traditionally called auxiliary categories. To them belong the category of number, the category of case, the category of gender, of person, of tense, and possibly some other categories. What auxiliary categories there are and what each of them includes depends on the language. For the fragment of English we shall consider, we need just two auxiliary categories: the category of number, which is a set of the two elements SG (the singular) and PL (the plural),

   Number = {SG, PL} : set,

and the category of case, which has the three elements NOM (the nominative), ACC (the accusative) and GEN (the genitive),

   Case = {NOM, ACC, GEN} : set.

In the traditional grammar of English, it is known that number can be assigned to both verbs and nouns, and case to nouns but not to verbs. These assignments are not limited to single words, but extend to verb and noun phrases. And in a sentence, the number and case assigned to a phrase may depend on the place that the phrase occupies, as well as on other phrases occurring in the sentence. The dependence on the place in the sentence is called rection. We shall formalize the following rules of rection:
   The common noun A in the noun phrase of the form every A is singular.
   The common noun A in the noun phrase of the form all A is plural.
   The subject noun phrase is nominative.
   The object noun phrase is accusative.

The dependence on other phrases is called agreement. We shall formalize the following rule of agreement:

   The verb receives the same number as the subject noun phrase.

In order for this agreement rule to be usable, we need to assign numbers to noun phrases:

   A noun phrase of the form every A is singular.
   A noun phrase of the form all A is plural.

There are two possible places for morphological information. Either it is included in the phrase structure trees, or it is introduced in the sugaring rules. In the former alternative, the main categories (of common nouns, noun phrases, verbs, etc.) will have dependencies on case and number variables. This involves the same kind of generalization of phrase structure grammar as suggested by Chomsky (1965) and developed by Gazdar et al. (1985). In the latter alternative, we can keep the main categories as they were defined above, as only depending on type information. We shall study this alternative first, and then make a comparison with the former alternative.

Sugaring is the procedure that takes phrase structure trees (which we represent by functional terms) into strings of English words. Phrase structure trees do not form one category, but there is a whole system of categories of them: S, CN, NP(A), V1(A), V2(A; B). To present sugaring formally in type theory, as functions from trees to strings, we thus do not manage with one function only, but a system of functions corresponding to the system of categories. We shall denote each such function by the name SUGX, where X is the name of the category. The set of strings of words will be called E. Sentences, in our fragment, have no dependencies on morphological information.
The sugaring function for sentences is thus simply

   SUGS : (S)E.

Common nouns have both singular and plural sugarings. The singular is needed in combination with every, the plural in combination with all. Different case forms are also needed, for the formation of noun phrases of different cases.

   SUGCN : (CN)(Number)(Case)E.

Noun phrases have case forms. And as the category of noun phrases depends on type information, SUGNP is really a family of functions depending on type information.

   SUGNP : (A : set)(NP(A))(Case)E.

But noun phrases do not have different number forms: a noun phrase already has a number. The agreement rule could not otherwise choose the number of the verb. We introduce the function

   NUMNP : (A : set)(NP(A))Number
to assign numbers to noun phrases. For the two forms of noun phrases we have, it is defined as follows:

   NUMNP(A*; every(A)) = SG,
   NUMNP(A*; all(A)) = PL.

Observe the difference between noun phrases and common nouns as regards number: a noun phrase has a determinate number, whereas a common noun occurs in different numbers. Verbs, both transitive and intransitive, have number forms:

   SUGV1 : (A : set)(V1(A))(Number)E,
   SUGV2 : (A : set)(B : set)(V2(A; B))(Number)E.

Now we can give the sugaring rules that formalize the rection and agreement principles informally stated above. These rules generalize the sugaring rules of Section 6 by having dependencies on morphological information, but the output for the small fragment of English presented there is the same.

   SUGS(SUBJ(A; Q; F)) = SUGNP(A; Q; NOM) SUGV1(A; F; NUMNP(A; Q)),
   SUGV1(A; OBJ(A; B; F; Q); n) = SUGV2(A; B; F; n) SUGNP(B; Q; ACC),
   SUGNP(A*; every(A); c) = every SUGCN(A; SG; c),
   SUGNP(A*; all(A); c) = all SUGCN(A; PL; c),
   SUGCN(line; SG; NOM) = line,
   SUGCN(line; SG; ACC) = line,
   SUGCN(line; SG; GEN) = line's,
   SUGCN(line; PL; NOM) = lines,
   SUGCN(line; PL; ACC) = lines,
   SUGCN(line; PL; GEN) = lines',
   SUGV2(line*; line*; intersect; SG) = intersects,
   SUGV2(line*; line*; intersect; PL) = intersect.

Now we can sugar, step by step,

   SUGS(SUBJ(line*; every(line); OBJ(line*; line*; intersect; every(line))))
   = SUGNP(line*; every(line); NOM) SUGV1(line*; OBJ(line*; line*; intersect; every(line)); SG)
   = every SUGCN(line; SG; NOM) SUGV2(line*; line*; intersect; SG) SUGNP(line*; every(line); ACC)
   = every line intersects every line.

We have now shown how morphological information can be included in the sugaring procedure. An alternative approach is to make the main categories depend on it. Thus we would not have just common nouns, but common nouns of a given number and case, etc.:

   S : set,
   CN : (Number)(Case)set,
   NP : (set)(Case)set,
   V1 : (set)(Number)set,
   V2 : (set)(set)(Number)set.

One advantage of this alternative is that we can now express the rection and agreement rules in categorial grammar. Morphological information belongs to phrase structure trees. The SUBJ and OBJ rules get the forms

   A : set   Q : NP(A; NOM)   F : V1(A; NUMNP(A; NOM; Q))
   ------------------------------------------------------
                    SUBJ(A; Q; F) : S

   A : set   B : set   n : Number   F : V2(A; B; n)   Q : NP(B; ACC)
   -----------------------------------------------------------------
                    OBJ(A; B; n; F; Q) : V1(A; n)

Sugaring is now simply ordering terminal symbols into strings.

   SUGS(SUBJ(A; Q; F)) = SUGNP(A; NOM; Q) SUGV1(A; NUMNP(A; Q); F),
   SUGV1(A; n; OBJ(A; B; n; F; Q)) = SUGV2(A; B; n; F) SUGNP(B; ACC; Q).

But the rules do not look simpler than in the former alternative, because the sugaring operators must still have the morphological arguments that the syntactical categories depend on.

What will the lexicon look like, that is, the categorizations of primitive expressions? The old categorization line : CN must be replaced by six ones, starting with

   line : CN(SG; NOM),
   line : CN(SG; ACC).

But since we are introducing canonical expressions, we must choose different symbols here, say, line for the nominative and line' for the accusative form. (In general, we would probably introduce some system for naming the variants.) The prime is then deleted in sugaring, and we do not get rid of changing primitive expressions in sugaring.

By including morphological information in phrase structure trees, one could hope to approach the simple sugaring procedure that consists of deleting the tree above the leaves. We have seen that this is not quite possible if the leaves are to be unambiguously typed. There is another phenomenon showing that the permutation of leaves cannot be avoided: the adjectival modification of common nouns. Type restrictions require that the domain of the adjective must be the same as the interpretation of the common noun:

   A : CN   G : A1(A*)
   -------------------
    ADMOD(A; G) : CN

Progressive typing of the arguments, that is, the dependence of the type of G on A, makes it necessary that A occurs to the left of G in the tree. But in sugaring, the order is changed: ADMOD(line; vertical) becomes
vertical line and not line vertical. (The situation is different in French. Interestingly, Bally (1944) presents under the name séquence progressive a general tendency of French, one of whose manifestations is the position of the adjective after the noun.)
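The sugaring rules of this section can be sketched as runnable code. The tuple encoding of trees, and the omission of the semantic set arguments (which sugaring deletes anyway), are assumptions of this illustration.

```python
# A runnable sketch of the sugaring functions SUGS, SUGV1, SUGNP, SUGCN,
# together with NUMNP, for the tiny fragment of this section. Trees are
# nested tuples: ("SUBJ", Q, F), ("OBJ", verb, Q), ("every"/"all", cn).

CN_FORMS = {("line", "SG", "NOM"): "line", ("line", "SG", "ACC"): "line",
            ("line", "SG", "GEN"): "line's",
            ("line", "PL", "NOM"): "lines", ("line", "PL", "ACC"): "lines",
            ("line", "PL", "GEN"): "lines'"}
V2_FORMS = {("intersect", "SG"): "intersects", ("intersect", "PL"): "intersect"}

def num_np(np):
    # NUMNP: every(A) is singular, all(A) is plural.
    return "SG" if np[0] == "every" else "PL"

def sug_np(np, case):
    det, cn = np
    return det + " " + CN_FORMS[(cn, num_np(np), case)]

def sug_v1(v1, number):
    # V1 trees have the form OBJ(verb, Q): the verb carries the inherited
    # number (agreement), the object noun phrase is accusative (rection).
    _, verb, obj_np = v1
    return V2_FORMS[(verb, number)] + " " + sug_np(obj_np, "ACC")

def sug_s(s):
    # S trees have the form SUBJ(Q, F): the subject is nominative and
    # passes its number on to the verb phrase.
    _, q, v1 = s
    return sug_np(q, "NOM") + " " + sug_v1(v1, num_np(q))

tree = ("SUBJ", ("every", "line"), ("OBJ", "intersect", ("every", "line")))
print(sug_s(tree))   # every line intersects every line
```

Note how the number argument threads through sug_v1 exactly as NUMNP threads through the formal rules above.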
12 Combinators and syntactic categories
The order in which we actually arrived at the type-theoretical interpretation and generalization of phrase structure grammar was opposite to the order in which we have presented the matters above. In Ranta 1991 and 1994, we have studied the task of relating type theory with English by means of a sugaring procedure, a task motivated in part by applications in a natural language interface to mathematical proof systems, and in part as a semantically motivated approach to generative grammar. In the standard notation of type theory, the formation of quantified propositions usually requires variable bindings, and the sugaring of a quantified proposition is performed by substituting a quantifier phrase for the variable. For instance,

   every(A; B) = B(every A).
This sugaring rule corresponds to the rule S14 of PTQ, which introduces a variable binding mechanism in analysis trees. The propositional function B can be assumed to have the form (x)C. The rule works well when there is precisely one free occurrence of x in C, but complications arise when there is no occurrence or several. (Notice, furthermore, that the right hand side of the rule is ill typed: the string every A is not of type A, and thus not a legitimate argument of the propositional function B. The problems with this sugaring rule are discussed in Ranta 1994, chapter 9.)

One way of coping with the problems is to delimit a sugarable fragment inside type theory, by stating conditions that an expression has to satisfy in order for the sugaring rules to apply correctly to it. For instance, the sugaring rule above is regulated by the condition that B must be of the form (x)C, and there must be exactly one free occurrence of x in C. Now, a propositional expression A might not fulfill the conditions of sugarability, but there may be another expression B for the same proposition which is sugarable. Such new expressions can be formed by using combinators, defined functional constants. Two combinators that we found quite soon were

   SUBJ = (A)(Q)(F)Q(F) : (A : set)(Q : ((A)prop)prop)(F : (A)prop)prop,
   OBJ = (A)(B)(F)(Q)(x)Q((y)F(x; y)) : (A : set)(B : set)(F : (A)(B)prop)(Q : ((B)prop)prop)(A)prop.

By means of such combinators, quantified propositions can be expressed without using variable bindings at all. Our goal was to define the sugarable fragment of type theory as consisting of expressions formed by means of a limited set of combinators, and excluding expressions formed by using variable bindings in a problematic way. A given proposition can have expressions both inside and outside the sugarable fragment. As an expressive completeness property of the grammar, one could state that
every type-theoretical proposition has a sugarable expression, which means that every type-theoretical proposition can be expressed in English. To have a grammar with this property is a nontrivial problem, which has not been solved. (The very idea of defining a language by means of combinators instead of abstraction was inspired by Steedman 1988.)

In the delimitation of the sugarable fragment, it soon turned out that the standard type structure of type theory was too coarse for adequate grammatical description. For instance, the type of propositional functions over the set of lines has objects like

   slope : (line*)prop,
   vertical : (line*)prop,
   (y)intersect(a; y) : (line*)prop.

The corresponding expressions have quite different combinatorial properties in natural language. The expression slope is an intransitive verb, which can be attached to a subject noun phrase in the way expressed in the sugaring rule for SUBJ above. The expression vertical is an adjective, which can only be attached to the subject by using a copula, like is. The propositional function (y)intersect(a; y) corresponds to the incomplete sentence

   a intersects

which can be attached to an object noun phrase of type line to form a sentence. We have thus found three different syntactic categories corresponding to the type of propositional functions: intransitive verbs, adjectives, and sentences missing objects. Now, for the last argument F of the combinator

   SUBJ : (A : set)(Q : ((A)prop)prop)(F : (A)prop)prop,

it is not enough to be a propositional function over the set A, but it must be an intransitive verb, in order for the sugaring to proceed correctly. (The same concerns the argument Q: it cannot be just any function of type ((A)prop)prop, but a noun phrase of type A. For the set argument A, there is no such restriction, since A is omitted in sugaring.)
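The combinators SUBJ and OBJ can be exercised in a toy semantic model, with propositions as booleans and quantifier phrases as higher-order functions. The two-element domain and the always-true relation are assumptions of this sketch.

```python
# SUBJ and OBJ as higher-order functions: quantified propositions are built
# without any variable binding in the tree. Propositions are modeled as
# booleans; the domain "lines" and the relation "intersect" are toy assumptions.

lines = ["a", "b"]

def every(domain):
    # The universal quantifier phrase over a domain: a function from
    # propositional functions to propositions.
    return lambda f: all(f(x) for x in domain)

def intersect(x, y):
    return True   # in this toy model every pair of lines intersects

# SUBJ = (A)(Q)(F) Q(F): apply the quantifier phrase to the propositional function.
def SUBJ(Q, F):
    return Q(F)

# OBJ = (A)(B)(F)(Q)(x) Q((y) F(x, y)): fill the object slot, leaving the subject open.
def OBJ(F, Q):
    return lambda x: Q(lambda y: F(x, y))

# "every line intersects every line", with no bound variables in the tree:
print(SUBJ(every(lines), OBJ(intersect, every(lines))))   # True
```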
In analogy with the definition of combinators providing alternative ways of expression, one could now define new categories, such as

   S = prop : type,
   NP(A) = ((A)prop)prop : type,
   V1(A) = (A)prop : type,
   A1(A) = (A)prop : type,

where A : set. The combinator SUBJ could now be typed

   SUBJ : (A : set)(Q : NP(A))(F : V1(A))S.

Thus it would be required that the Q argument really be a noun phrase and the F argument an intransitive verb. But this is not successful. From the definitions we have given to V1(A) and A1(A), it follows, by the symmetry and transitivity of equality, that

   A1(A) = V1(A) : type.
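The difference between defined and primitive categories has a direct analogue in programming, sketched here under Python's type machinery: type synonyms collapse, while distinct wrapper classes that merely share an interpretation do not.

```python
# If V1 and A1 are mere definitions (synonyms) of the same type, nothing
# separates a verb from an adjective; if they are distinct primitives that
# only *interpret* into the same semantic type, the distinction survives.
# The class names and the trivial semantics are assumptions of this sketch.

from typing import Callable

# Defined categories: V1(A) = A1(A) = (A)prop, so the two are the same type.
V1_defined = Callable[[str], bool]
A1_defined = Callable[[str], bool]
print(V1_defined == A1_defined)    # True: any adjective also counts as a verb

# Primitive categories: distinct wrappers with a common interpretation.
class V1:
    def __init__(self, f): self.interp = f   # V1(A)* = (A)prop

class A1:
    def __init__(self, f): self.interp = f   # A1(A)* = (A)prop

slope = V1(lambda x: True)
vertical = A1(lambda x: True)
print(isinstance(vertical, V1))    # False: an adjective is not a verb
```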
Any adjective thus also counts as a verb, and vice versa. Really to make distinctions between different syntactic categories that are semantically equal, we cannot define them as standard types, but we have to take them as primitive, and only interpret them in standard types. Thus, instead of the definition

   V1(A) = (A)prop : type,

we only have the interpretation

   V1(A)* = (A)prop : type,

as stated above, in Section 6. Now we cannot infer F : V1(A) from F : (A)prop.

The new treatment of syntactic categories means that what we are sugaring is no longer a fragment of type theory, but a distinct language, the language of phrase structure trees, which is just interpretable in type theory. This organization of the grammar is precisely the same as in Montague's PTQ, where the analysis trees are not a fragment of intensional logic, but a distinct language that is interpretable in intensional logic. In the PTQ paper, Montague makes a comparison with Ajdukiewicz (1935), who used the function types of simple type theory directly as syntactic categories:

   It was perhaps the failure to pursue the possibility of syntactically splitting categories originally conceived in semantic terms that accounts for the fact that Ajdukiewicz's proposals have not previously led to a successful syntax. (Montague 1974, p. 249, fn. 4.)

Our own approach in Ranta 1991 and 1994 is analogous to Ajdukiewicz's, just replacing simple by constructive type theory. The very idea of "syntactically splitting categories originally conceived in semantic terms" appears already in Lambek (1958), where a distinction is made between the prefix and postfix function types, / and \, respectively. But indeed, the distinction between prefix and postfix function types was already made by Peano in 1889:

   Let φ be a sign or an aggregate of signs such that, if x is an object of the class s, the expression φx denotes a new object . . .
   Then the sign φ is said to be a function presign in the class s, and we write φ F`s . . . If, x being any object of the class s, the expression xφ denotes a new object . . . then we say that φ is a function postsign in the class s, and we write φ F's (op. cit., §VI; van Heijenoort 1967, p. 91.)

(I owe the observation about Peano to Per Martin-Löf.)

Phrase structure grammar has a well-established system of syntactic categories dating back to school grammar and to the Greeks. If we consider the categories that correspond to two-place propositional functions, we find a great multiplicity: to give just a few examples, we have the categories

   V2, transitive verbs, like intersects,
   V4(with), two-place verbs with the preposition with, like converges,
   V4(on), two-place verbs with the preposition on, like lies,
   A2(to), two-place adjectives with the preposition to, like parallel.
What is missing from these categories are type dependencies, which we can restore in the way indicated above, in Section 6. Thus we categorize, in the language of geometry,

   intersect : V2(line*; line*),
   converge : V4(line*; with; line*),
   lie : V4(point*; on; line*),
   parallel : A2(line*; to; line*).

These type-dependent syntactic categories, so to say, unify the expressive powers of traditional categorizations and type theory. In designing them, we can make use of the results obtained in phrase structure grammar. At the same time, we can fulfill the requirement of type-theoretical interpretability, as we cannot combine nouns and verbs into sentences unless type restrictions are obeyed.
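A domain-checked lexicon of this kind can be sketched as follows; the dictionary encoding is an assumption, but the categorizations are those just listed.

```python
# Each two-place predicate carries its argument domains, and combination is
# rejected when the domains do not match. This reproduces the opening example
# of the paper: "every line intersects every line" is well typed, while
# "every line intersects every point" is not.

LEXICON = {
    # predicate : (syntactic category, argument domains)
    "intersect": ("V2", ("line", "line")),
    "converge":  ("V4(with)", ("line", "line")),
    "lie":       ("V4(on)", ("point", "line")),
    "parallel":  ("A2(to)", ("line", "line")),
}

def well_typed(pred: str, subj_domain: str, obj_domain: str) -> bool:
    # Type restriction: the subject and object domains must be exactly the
    # argument domains the predicate is categorized with.
    _, (dom1, dom2) = LEXICON[pred]
    return (subj_domain, obj_domain) == (dom1, dom2)

print(well_typed("intersect", "line", "line"))    # True
print(well_typed("intersect", "line", "point"))   # False
print(well_typed("lie", "point", "line"))         # True
```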
References
[1] Kazimierz Ajdukiewicz. Die syntaktische Konnexität. Studia Philosophica, 1:1-27, 1935.
[2] Charles Bally. Linguistique générale et linguistique française. A. Francke S.A., Berne, 2nd edition, 1944.
[3] Jon Barwise. Scenes and other situations. The Journal of Philosophy, 78:369-397, 1981.
[4] Noam Chomsky. Aspects of the Theory of Syntax. The M.I.T. Press, Cambridge, Ma., 1965.
[5] Alonzo Church. A formulation of the simple theory of types. Journal of Symbolic Logic, 5:56-68, 1940.
[6] Gerald Gazdar, Ewan Klein, Geoffrey Pullum, and Ivan Sag. Generalized Phrase Structure Grammar. Basil Blackwell, Oxford, 1985.
[7] Joachim Lambek. The mathematics of sentence structure. American Mathematical Monthly, 65:154-170, 1958.
[8] Lena Magnusson and Bengt Nordström. The ALF Proof Editor and Its Proof Engine. In H. Barendregt and T. Nipkow, editors, Types for Proofs and Programs, pages 213-237. Lecture Notes in Computer Science 806, Springer-Verlag, Heidelberg, 1994.
[9] Per Martin-Löf. Intuitionistic Type Theory. Bibliopolis, Naples, 1984.
[10] Richard Montague. Formal Philosophy. Yale University Press, New Haven, 1974. Collected papers edited by Richmond Thomason.
[11] Bengt Nordström, Kent Petersson, and Jan Smith. Programming in Martin-Löf's Type Theory. An Introduction. Clarendon Press, Oxford, 1990.
[12] Giuseppe Peano. Arithmetices principia, nova methodo exposita. Bocca, Turin, 1889. In English in Jean van Heijenoort, editor, From Frege to Gödel, pages 85-97. Harvard University Press, Cambridge, Ma., 1967.
[13] Aarne Ranta. Intuitionistic categorial grammar. Linguistics and Philosophy, 14:203-239, 1991.
[14] Aarne Ranta. Type-Theoretical Grammar. Oxford University Press, Oxford, 1994.
[15] Mark Steedman. Combinators and grammars. In R. Oehrle, E. Bach, and D. Wheeler, editors, Categorial Grammars and Natural Language Structures, pages 417-442. D. Reidel, Dordrecht, 1988.
Received 23 January 1994