Static Analysis of Communication for Asynchronous Concurrent Programming Languages

Naoki Kobayashi, Motoki Nakade and Akinori Yonezawa
Department of Information Science, University of Tokyo
7-3-1 Hongo, Bunkyo-ku, Tokyo, 113 Japan
fkoba, nakade, [email protected]

Abstract. We propose an effect-based static analysis technique on communication for asynchronous concurrent programming languages. Our analysis gives an upper bound on the number of enqueued messages and receivers for each communication channel, which can be used for compile-time optimizations in the implementation of message passing. The main targets of our analysis are concurrent object-oriented languages, for which no formal static analysis method has been established.

1 Introduction

Recent high-level concurrent programming languages, including concurrent object-oriented programming languages[1][19], concurrent logic programming languages, and Concurrent ML (CML)[12], are based on computational models where multiple processes (or concurrent objects) perform local computation, exchanging values via asynchronous or synchronous communication. Although such computational models provide potentially more powerful abstractions for programs than sequential languages such as functional programming languages and sequential object-oriented languages, extensive use of communication between processes often causes a large overhead due to the cost of manipulating messages and process queues. Recent work on the implementation of a concurrent object-oriented language[17] tried to reduce the gap between the cost of message passing and that of a function call. It reduced the cost of message passing very close to that of a function call when a receiver is ready to receive the message. However, in general, message passing is still more costly than a function call.

The aim of this paper is to reduce the cost of message passing by static analysis of communication. Based on effect analysis[15], we propose a static analysis method to infer the maximum number of enqueued messages and enqueued receivers for each communication channel during computation. By slightly modifying the typing/effect rules, we can also approximate how many messages are sent and received along each communication channel. If we know that at most one message is enqueued for the communication channel c, we need only allocate memory for saving a single message. Hence both the cost and the memory space for manipulating a message queue are reduced. If only a constant number of messages are enqueued, we need only allocate an array of a fixed size for storing messages, and no boundary check is required. The same argument also holds for receiver queues; several concurrent programming languages, including CML[12], PICT[11] and HACL[7], allow multiple processes to compete for the same communication channel, hence receivers must be maintained in a queue. Using the information obtained by our analysis, the cost of manipulating receiver queues is also reduced. Moreover, we can completely eliminate message passing under a stronger condition.

We use our linear logic-based concurrent calculus called Higher-Order ACL (HACL) for presenting our static analysis method. HACL is theoretically clear and provides powerful programming models such as higher-order concurrent programming. Since many concurrent programming languages, including concurrent object-oriented languages, can be naturally embedded in HACL[6], the techniques proposed here are easily applied to such languages. As far as the authors know, this is the first formal static analysis technique applicable to concurrent object-oriented languages.

The rest of this paper is organized as follows. Section 2 introduces the syntax and the basic type system of HACL. Section 3 addresses problems in applying effect analysis to concurrent languages and overviews our approach. In Section 4 and Section 5, we present a simple version of our static analysis method. Section 6 discusses optimizations based on our static analysis. Section 7 discusses several refinements of the simple analysis in Section 4 and Section 5. Section 8 discusses related work and Section 9 concludes this paper.

2 Overview of the Core-HACL

This section briefly introduces the core-HACL. For the concrete definition of HACL and the correspondence between linear logic and HACL, please refer to [7][5]. The core-HACL introduced here can be considered as a core-ML extended with primitives for channel creation, value passing along channels, guarded choice, and parallel composition. Channels are first-class values, that is, they can be passed along other channels as in the \(\pi\)-calculus.

2.1 Syntax

In HACL, computation is performed by multiple processes which communicate with each other via asynchronous message passing. A message is sent by specifying a communication channel. If m is a communication channel, m(v) is interpreted as a process which sends a value v along the channel m. On the other hand, m(x) => P is a process which waits for a message m(v) and then executes P[v/x]. Communication channels are created by the operator $. P1 | P2 is a concurrent composition of P1 and P2. (m(x) => P) & (n(x) => Q) receives a message along channel m or n and accordingly behaves like P or Q. For example, $m.$n.(m(2) | (m(x)=>P)&(n(x)=>Q)) reduces to $m.$n.(P[2/x]). The entire syntax of preterms of the core-HACL is given as follows (we omit tuples in this paper):

Definition 2.1 (preterms) A set of preterms, ranged over by e, is defined by the following syntax:

\[
\begin{array}{rcll}
e & ::= & x & \text{(variables)} \\
  & \mid & c & \text{(constants)} \\
  & \mid & e_1\, e_2 & \text{(function application, or message send)} \\
  & \mid & \lambda x.e & \text{(abstraction)} \\
  & \mid & \texttt{let val } x = e_1 \texttt{ in } e_2 \texttt{ end} & \text{(polymorphic definition)} \\
  & \mid & \texttt{rec } f(x) = e & \text{(recursive functions or processes)} \\
  & \mid & (e_1 \mid e_2) & \text{(parallel composition)} \\
  & \mid & r & \text{(message receiver)} \\
  & \mid & \$x.e & \text{(channel creation)} \\
r & ::= & x(y) \texttt{ => } e \;\mid\; r_1 \,\&\, r_2 &
\end{array}
\]

where x, y stand for variables and c for constants, including a special constant (inaction) and basic values such as integers. Note that e1 e2 is either a function application or a message send according to whether e1 is evaluated to a function or a communication channel. This overloading is useful for the optimizations described in Section 6. We often write proc p(x)=e (or fun p(x)=e) for val p=(rec p(x)=e).
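As a rough illustration of Definition 2.1, the preterm grammar can be encoded as a small algebraic data type. The sketch below uses Python dataclasses; every class and function name in it (Var, App, Recv, free_vars, and so on) is hypothetical and not part of HACL. It builds the example term from Section 2.1 and computes its free variables, treating $x.e, abstraction, and receivers as binders.

```python
from dataclasses import dataclass

# Hypothetical encoding of the preterm grammar; only the grammar is from the paper.

@dataclass(frozen=True)
class Var:
    name: str

@dataclass(frozen=True)
class Const:
    value: object

@dataclass(frozen=True)
class App:            # function application or message send: e1 e2
    fn: object
    arg: object

@dataclass(frozen=True)
class Lam:            # abstraction
    param: str
    body: object

@dataclass(frozen=True)
class Let:            # let val x = e1 in e2 end
    name: str
    bound: object
    body: object

@dataclass(frozen=True)
class Rec:            # rec f(x) = e
    name: str
    param: str
    body: object

@dataclass(frozen=True)
class Par:            # e1 | e2
    left: object
    right: object

@dataclass(frozen=True)
class Recv:           # x(y) => e
    chan: str
    param: str
    body: object

@dataclass(frozen=True)
class Choice:         # r1 & r2
    left: object
    right: object

@dataclass(frozen=True)
class New:            # $x.e
    chan: str
    body: object

def free_vars(t):
    """Free variables of a preterm."""
    if isinstance(t, Var):
        return {t.name}
    if isinstance(t, Const):
        return set()
    if isinstance(t, App):
        return free_vars(t.fn) | free_vars(t.arg)
    if isinstance(t, Lam):
        return free_vars(t.body) - {t.param}
    if isinstance(t, Let):
        return free_vars(t.bound) | (free_vars(t.body) - {t.name})
    if isinstance(t, Rec):
        return free_vars(t.body) - {t.name, t.param}
    if isinstance(t, (Par, Choice)):
        return free_vars(t.left) | free_vars(t.right)
    if isinstance(t, Recv):
        return {t.chan} | (free_vars(t.body) - {t.param})
    if isinstance(t, New):
        return free_vars(t.body) - {t.chan}
    raise TypeError(f"not a preterm: {t!r}")

# $m.$n.(m(2) | (m(x)=>P) & (n(x)=>Q)), with P and Q left as free variables
example = New("m", New("n", Par(
    App(Var("m"), Const(2)),
    Choice(Recv("m", "x", Var("P")), Recv("n", "x", Var("Q"))))))
```

Here free_vars(example) contains only P and Q, since m and n are bound by the two channel creations.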

2.2 Reduction Semantics

Following the standard semantics for concurrent calculi[13], the reduction semantics for the core-HACL, given in Figure 1, is defined via three relations: structural congruence, a reduction relation on \(\lambda\)-terms, and a reduction relation on processes. Throughout this paper, we identify terms up to renaming of bound variables. Later, correctness of our analysis will be stated with respect to this reduction semantics.

2.3 Type System

A term of the core-HACL is defined as a well-typed preterm.

Definition 2.2 (monotypes, type schemes) A set of monotypes, ranged over by \(\tau\), and a set of type schemes, ranged over by \(\sigma\), are given by the following syntax:

\[
\tau ::= \alpha \mid B \mid o \mid \tau \to \tau
\qquad
\sigma ::= \tau \mid \forall\alpha.\sigma
\]

where \(\alpha\) stands for type variables and B for base types, and o is a distinguished type for propositions (i.e., processes or messages).

Definition 2.3 A preterm e is a term of type \(\tau\) under a type assignment T (i.e., a map from variables to types) if there is a derivation for \(T \vdash e : \tau\) by the typing rules in Figure 2.

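Structural congruence \(\cong\):

\[
\begin{array}{ll}
\text{(Cong-1)} & p_1 \mid p_2 \cong p_2 \mid p_1 \\
\text{(Cong-2)} & (p_1 \mid p_2) \mid p_3 \cong p_1 \mid (p_2 \mid p_3) \\
\text{(Cong-3)} & p_1 \,\&\, p_2 \cong p_2 \,\&\, p_1 \\
\text{(Cong-4)} & (p_1 \,\&\, p_2) \,\&\, p_3 \cong p_1 \,\&\, (p_2 \,\&\, p_3) \\
\text{(Cong-5)} & \$x.(p_1 \mid p_2) \cong p_1 \mid \$x.p_2 \quad \text{if } x \text{ is not free in } p_1
\end{array}
\]

Reduction on \(\lambda\)-terms \(\longrightarrow_\lambda\):

\[
\begin{array}{ll}
\text{(R-Beta)} & (\lambda x.e)\,v \longrightarrow_\lambda e[v/x] \\[4pt]
\text{(R-App1)} & \dfrac{e_1 \longrightarrow_\lambda e_1'}{e_1\,e_2 \longrightarrow_\lambda e_1'\,e_2} \\[8pt]
\text{(R-App2)} & \dfrac{e \longrightarrow_\lambda e'}{v\,e \longrightarrow_\lambda v\,e'} \\[8pt]
\text{(R-Let1)} & \dfrac{e_1 \longrightarrow_\lambda e_1'}{\texttt{let val } x = e_1 \texttt{ in } e_2 \texttt{ end} \longrightarrow_\lambda \texttt{let val } x = e_1' \texttt{ in } e_2 \texttt{ end}} \\[8pt]
\text{(R-Let2)} & \texttt{let val } x = v \texttt{ in } e \texttt{ end} \longrightarrow_\lambda e[v/x] \\[4pt]
\text{(R-Rec)} & \texttt{rec } f(x) = e \longrightarrow_\lambda \lambda x.e[(\texttt{rec } f(x) = e)/f]
\end{array}
\]

where v ranges over a subset of preterms defined by \(v ::= c \mid x \mid \lambda x.e\).

Reduction relation \(\longrightarrow\):

\[
\begin{array}{ll}
\text{(R-Com)} & m(x) \texttt{ => } e \mid m(v) \longrightarrow e[v/x] \\[4pt]
\text{(R-Cong)} & \dfrac{e_1 \cong e_1' \quad e_1' \longrightarrow e_2' \quad e_2' \cong e_2}{e_1 \longrightarrow e_2} \\[8pt]
\text{(R-Par)} & \dfrac{e_1 \longrightarrow e_1'}{e_1 \mid e_2 \longrightarrow e_1' \mid e_2} \\[8pt]
\text{(R-New)} & \dfrac{e \longrightarrow e'}{\$x.e \longrightarrow \$x.e'} \\[8pt]
\text{(R-Choice)} & \dfrac{e_1 \mid e_3 \longrightarrow e'}{(e_1 \,\&\, e_2) \mid e_3 \longrightarrow e'} \\[8pt]
\text{(R-Lambda)} & \dfrac{e \longrightarrow_\lambda e'}{e \longrightarrow e'}
\end{array}
\]

The figure also contains a rule (R-Inaction) for the inaction constant.

Fig. 1. Reduction rules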

In Figure 2, Const is a set of pairs \((c, \sigma)\), where c is a constant symbol and \(\sigma\) is its type scheme. We assume Const contains at least the inaction constant, with type o. (T-par) indicates that e1 | e2 is a process only if both e1 and e2 are processes. In (T-new), x is mapped to a monotype by the context in the premise. This ensures that a channel x is consistently used in e for sending and receiving values of type \(\tau\). The premises of (T-com) ensure that the continuation \(\lambda x.e\) invoked after message reception takes an argument of the same type as the values carried by the channel m. We write \(T \vdash_{TR} e : \tau\) if \(T \vdash e : \tau\) is derivable by the rules in Figure 2.

Throughout this paper, we assume that e2 in e1 e2 and e1 in let val x = e1 in e2 end never have type o. Therefore, for example, the expression let val x = $m.m(1) in x end is not allowed, but let val x = \(\lambda\)().$m.m(1) in x() end is allowed. This restriction was introduced to get rid of imperative or weak type variables (note that no channel is created when e1 in let val x = e1 in e2 end is evaluated), and also to ensure that the behavior of a process let val x = e1 in e2 end is the same as that of e2[e1/x]. We use the basic type system just for simplicity, and ignore how to ensure the above property, since it is not important for the purpose of this paper. Interested readers can refer to [7] for a way to ensure the above property by refining the basic type system.¹

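\[
\begin{array}{ll}
\text{(T-const)} & T \vdash c : \tau \quad \text{if } (c, \sigma) \in \mathit{Const} \text{ and } \tau \text{ is a generic instance of } \sigma \\[4pt]
\text{(T-var)} & T \vdash x : \tau \quad \text{if } T(x) = \tau \\[4pt]
\text{(T-abs)} & \dfrac{T\{x \mapsto \tau_1\} \vdash e : \tau_2}{T \vdash \lambda x.e : \tau_1 \to \tau_2} \\[8pt]
\text{(T-app)} & \dfrac{T \vdash e_1 : \tau_1 \to \tau_2 \quad T \vdash e_2 : \tau_1}{T \vdash e_1\,e_2 : \tau_2} \\[8pt]
\text{(T-rec)} & \dfrac{T\{p \mapsto \tau_1 \to \tau_2\} \vdash \lambda x.e : \tau_1 \to \tau_2}{T \vdash \texttt{rec } p(x) = e : \tau_1 \to \tau_2} \\[8pt]
\text{(T-let)} & \dfrac{T \vdash e_1 : \tau_1 \quad T \vdash e_2[e_1/x] : \tau_2}{T \vdash \texttt{let val } x = e_1 \texttt{ in } e_2 \texttt{ end} : \tau_2} \\[8pt]
\text{(T-par)} & \dfrac{T \vdash e_1 : o \quad T \vdash e_2 : o}{T \vdash e_1 \mid e_2 : o} \\[8pt]
\text{(T-choice)} & \dfrac{T \vdash e_1 : o \quad T \vdash e_2 : o}{T \vdash e_1 \,\&\, e_2 : o} \\[8pt]
\text{(T-new)} & \dfrac{T\{x \mapsto \tau \to o\} \vdash e : o}{T \vdash \$x.e : o} \\[8pt]
\text{(T-com)} & \dfrac{T \vdash m : \tau \to o \quad T \vdash \lambda x.e : \tau \to o}{T \vdash m(x) \texttt{ => } e : o}
\end{array}
\]

Fig. 2. Typing rules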

Since the typing rules of the core-HACL are essentially the same as the ML type system, types can be automatically recovered from untyped terms. The following is a sample session of the HACL type inference system. (The core-HACL is extended with tuples in the examples below.)

    proc forwarder (m, n) = m x => n x;
    val forwarder = proc: ('a->o)*('a->o)->o

The process forwarder takes two communication channels m and n as arguments, receives a value x along the channel m, and sends it along the channel n. The inferred type for forwarder implies that it takes two channels as arguments, and that they carry values of the same type. The following is a concurrent object-oriented style definition of a point process:

    proc point (x, y) (getx,gety,set) =
        getx(reply) => (reply(x) | point(x,y)(getx,gety,set))
      & gety(reply) => (reply(y) | point(x,y)(getx,gety,set))
      & set(newx,newy,ack) => (ack() | point(newx,newy)(getx,gety,set));
    val point=proc: 'a*'b->(('a->o)->o)*(('b->o)->o)*('a*'b*(unit->o)->o)->o

The process point has three channels for receiving messages.² If it receives a message along the channel getx, it sends x along the received channel reply and repeats the same behavior. If it receives a message along the channel gety, it sends y along the received channel reply and repeats the same behavior. If it receives a message via the channel set, it sends an acknowledgement to ack and updates the values of x and y to newx and newy. Please keep in mind that concurrent objects are modelled by recursive calls of the same process (in the above case, recursive calls of point). This is one of the reasons why we cannot apply Nielson's techniques[9] to the analysis of concurrent object-oriented style programs. With their analyses, the number of input operations for getx and output operations for reply, etc. is always counted as \(\infty\), although this is actually the case only when external processes send an infinite number of getx messages. As shown in a later section, we can infer the number of enqueued receivers on the channel getx to be at most 1.

¹ An alternative way to ensure type soundness would be to use call-by-name semantics for values of type o instead of imposing syntactic restrictions.

² In [6], we introduce records and put the three channels into a record. The resulting definition is then much more similar to a definition of concurrent objects. However, this paper does not use records, for simplicity.

3 Overview of the Approach

This section summarizes the basic idea of how to apply an effect system to the analysis of communication. Roughly speaking, analysis of communication is very close to that of operations on stores. Sending a value via a channel corresponds to writing a value in a store, and a receive operation corresponds to a read operation. Therefore, it is easy to imagine that effect systems developed for the analysis of stores[14] would also be applicable to the analysis of communication for concurrent languages. In fact, Nielson[9] proposed a technique to analyze communication behaviors of CML programs. However, we claim that the following points should be taken into consideration in designing an effect system for concurrent languages.

- Unlike read operations in functional languages with store, a receiver process is blocked until a value is sent by another process. Therefore, if a value is not sent by anyone, the receiver process is not executed at all; we should ignore the effect of executing the receiver in that case. This problem is especially serious when we need to handle recursive processes as in concurrent object-oriented languages. For example, for the point process in the previous section, the number of input operations along the getx channel is counted as \(\infty\) in Nielson's method, where effects are accumulated in a manner similar to the traditional effect analysis of stores[14].
- The analyses of
  1. how many times each channel will be used, and
  2. how many processes will try to send or receive along each channel at the same time
  are important for efficient implementation of communication in concurrent languages. For example, the former can reduce the cost of runtime garbage collection, and the latter reduces the cost of manipulating sender and receiver queues. However, for both optimizations, the preciseness of the analyses is very important. Regions were introduced in [14] to represent sets of possibly aliased reference cells, and effects are counted for each region. For the analysis of communication, regions can be used to represent sets of possibly aliased channels. Unfortunately, the naive formulation of an effect system causes too many communication channels to be aliased to the same region. This seriously damages the preciseness of the analyses.

In order to overcome the first problem, in Section 4, we count the effect of executing a receiver process m(x)=>P as just a single input operation along the channel m. Instead, the effect of executing P is counted in the sender process m(v). Thus, the send operation m(v) is treated in a manner very similar to a function application (fn x=>P)v. This choice of effect system is effective for the second problem too. If a channel m carries channels as arguments, counting the effect of executing P in m(x)=>P causes all channels sent along m to be aliased to the same region. On the other hand, if we count the effect of P in a sender process m(v), the effect can be parameterized by the channel v. Of course, a certain problem is still left: since channels need to be assigned monomorphic types once they are created, all channels sent along m are still aliased to the same region. This problem may not be so serious in [15] because polymorphic functions can be defined using let-expressions, but it is serious in concurrent languages. We overcome this problem in Section 7 by introducing a restricted polymorphism; channels are treated in a polymorphic manner only with respect to regions.

4 Type/Effect System for Analyzing Communication

This section presents an effect system for obtaining the maximum number of enqueued messages and message receivers for each communication channel. Note that an effect system for analyzing the number of send and receive operations for each channel can also be obtained with a slight change. For clarity, we present a simple analysis which gives only rough information. We will discuss several refinements later in Section 7.

We use the term channel regions (or simply regions) for sets of possibly aliased communication channels and often use the symbol l for them. Effects are defined to be maps from a set of regions to \(\{0, 1, \ldots, M-1, \infty\} \times \{0, 1, \ldots, M-1, \infty\}\). We use the symbol b for an effect, and the special symbol \(\emptyset\) for the effect which maps all channel regions to (0, 0). \(\{l_1 : (I_1, O_1), \ldots, l_n : (I_n, O_n)\}\) represents an effect b such that \(b(l) = (I_i, O_i)\) if \(l = l_i\) for some i \((1 \le i \le n)\) and (0, 0) otherwise. Intuitively, \(b(l) = (I, O)\) means that for a communication channel in a region l, at most I processes can try to receive a value, and at most O processes can try to send a value at the same time. In other words, I is a maximum queue length for receiver processes, and O is a maximum queue length for messages. For example, $m.($n.m(n) | m(x)=>x()) has an effect \(\{l_1 : (1, 1), l_2 : (0, 1)\}\) provided that m and n are respectively in regions \(l_1\) and \(l_2\). $m.$n.(m(1) | m(x)=>n(x) | m(x)=>(n(x+1))) has an effect \(\{l_1 : (2, 1), l_2 : (0, 1)\}\) provided that m and n are respectively in regions \(l_1\) and \(l_2\). Please remember that each I, O ranges over the finite set of values \(\{0, 1, \ldots, M-1, \infty\}\). If M = 1, then \(I_i\) and \(O_i\) range only over the two-element set \(\{0, \infty\}\), hence our analysis corresponds to I/O mode analysis.

Before we present the type and effect rules, we need some preliminary definitions. We write \(I(b, l) = i\) and \(O(b, l) = o\) if \(b(l) = (i, o)\). The operations + (summation of two effects), \(\vee\) (maximum of two effects), and minus on effects are defined as follows:

\[
I(b_1 + b_2, l) = \begin{cases} I(b_1, l) + I(b_2, l) & \text{if } I(b_1, l) + I(b_2, l) < M \\ \infty & \text{otherwise} \end{cases}
\]
\[
O(b_1 + b_2, l) = \begin{cases} O(b_1, l) + O(b_2, l) & \text{if } O(b_1, l) + O(b_2, l) < M \\ \infty & \text{otherwise} \end{cases}
\]
\[
I(b_1 \vee b_2, l) = \max(I(b_1, l), I(b_2, l)), \qquad O(b_1 \vee b_2, l) = \max(O(b_1, l), O(b_2, l))
\]
\[
I(\mathit{minus}(b, L), l) = \begin{cases} I(b, l) - 1 & \text{if } l \in L \text{ and } 0 < I(b, l) < M \\ I(b, l) & \text{otherwise} \end{cases}
\qquad
O(\mathit{minus}(b, L), l) = O(b, l)
\]

For example, if e1 has an effect \(b_1\) and e2 has an effect \(b_2\), then (e1 | e2) has an effect \(b_1 + b_2\). An inequality \(b_1 \le b_2\) on effects is defined as:

\[
b_1 \le b_2 \iff \forall l.\,\big((I(b_1, l) \le I(b_2, l)) \wedge (O(b_1, l) \le O(b_2, l))\big)
\]

Now, types are extended as follows:

Definition 4.1 (types) Types, ranged over by \(\tau\), are defined as follows:

\[
\tau ::= \alpha \mid B \mid o^{l} \mid \tau_1 \xrightarrow{b} \tau_2
\]

where b ranges over effects and l ranges over a set of regions.

Intuitively, an expression of type \(\tau_1 \xrightarrow{b} \tau_2\) has an effect b when applied to an expression of type \(\tau_1\). The type o for a process or a message is labelled with a region constant, which is used to identify each communication channel. (When it is not important, we often omit the region and just write o.) \(\tau \xrightarrow{b} o\) is the type of a channel which may cause an effect b when a value of type \(\tau\) is sent along the channel. The new type judgement form is \(T; L \vdash e : (\tau, b)\), which should be read, "Under a type assignment T and a region assignment L, e has a type \(\tau\) and an effect b." T, a type assignment, is a mapping from a finite set of variables to types. We associate each occurrence of $ in a given term with a unique label to identify a program point, as in \(\$^1 m.\$^2 n.(m(x) \Rightarrow n(x))\). L is a map from each program point to a set of regions. It is used to remember which regions each lexical occurrence of channel creation corresponds to, and to pass this information to an optimizing compiler. Each program point may be mapped to multiple regions, because the same channel creation code in a let-declaration may be invoked multiple times in the body.

Extended typing rules are given in Figure 3. We write ETR for the set of the extended typing rules, and write \(T; L \vdash_{ETR} e : (\tau, b)\) if \(T; L \vdash e : (\tau, b)\) is derivable in ETR. Rules (ET-abs) and (ET-app) are fairly standard, but please note that (ET-app) is the rule for both function application and message send. In (ET-let), the effect of e1 is assumed to be empty, because we assumed e1 cannot have a process type o. In (ET-par), the effect of the concurrent composition of e1 and e2 is computed by adding the effects of the two sub-processes. In the rule (ET-new), the second premise indicates that if a message is sent along a channel, at least that message is enqueued to the channel. (ET-choice) corresponds to a combined form of (T-choice) and (T-com). \(\&_{i=1,n}(m_i(x) \Rightarrow e_i)\) represents \((m_1(x) \Rightarrow e_1) \,\&\, \cdots \,\&\, (m_n(x) \Rightarrow e_n)\), and i in the premise ranges over \(\{1, \ldots, n\}\).

An intuitive explanation of the (ET-com) rule is as follows. Since \(\&_{i=1,n}(m_i(x) \Rightarrow e_i)\) tries to receive a value along channels \(m_1, \ldots, m_n\), it has an effect \(\{l_1 : (1, 0)\} \vee \cdots \vee \{l_n : (1, 0)\}\), where \(l_i\) is the region of the channel \(m_i\). Moreover, if someone sends a message \(m_i(v)\), then \((\lambda x.e_i)v\) may be executed, which causes an effect \(b_i\). Therefore, the latent effect \(b_i'\) of \(m_i\) is constrained to be greater than or equal to \(\mathit{minus}(b_i, \{l_1, \ldots, l_n\})\). The operation minus accounts for the fact that a receiver \(m_j(x) \Rightarrow e_j\) \((1 \le j \le n)\) is dequeued from the channel \(m_j\) before \((\lambda x.e_i)v\) is executed. The reason \(b_i'\) may be greater is that there may be another process waiting for a message along the same channel \(m_i\) which has a larger effect.

An alternative way of counting effects would be to use the following rules instead of (ET-new) and (ET-com) (choice is omitted):

\[
\text{(ET-new-alternative)} \quad \dfrac{T\{x \mapsto \tau \xrightarrow{\{l : (0, 1)\}} o^{l}\}; L \vdash e : (o, b) \quad l \in L(\iota)}{T; L \vdash \$^{\iota} x.e : (o, b)}
\]
\[
\text{(ET-com-alternative)} \quad \dfrac{T; L \vdash \lambda x.e : (\tau \xrightarrow{b} o, \emptyset) \quad T; L \vdash m : (\tau \xrightarrow{b'} o^{l}, \emptyset)}{T; L \vdash m(x) \Rightarrow e : (o, \{l : (1, 0)\} \vee b)}
\]

The above two rules are close to those used in Nielson's analysis[9]. We prefer (ET-new) and (ET-com) for the reasons explained in Section 3. The rule (ET-weak) allows the obtained effect to be just an upper bound, rather than exact information.

In order to see how effects are accumulated in a derivation, consider the process \(\$^1 m.\$^2 n.(m(1) \mid m(x) \Rightarrow n(x))\). The receiver m(x) => n(x) implies that if a message is sent along m, a message is also sent along n. Therefore, it constrains the effect \(b_1\) of sending a message along m to be greater than or equal to the effect \(b_2\) of sending a message along n. Since there is no process that receives a message along n, \(b_2\) can be counted as just a single send operation along n, which implies that \(b_1\) can be counted as a send operation along m plus \(b_2\). Therefore, we can derive

\[
\emptyset; L \vdash \$^1 m.\$^2 n.(m(1) \mid m(x) \Rightarrow n(x)) : (o, \{1 : (1, 1), 2 : (0, 1)\})
\]

for \(L = \{1 \mapsto 1, 2 \mapsto 2\}\), as is shown in Figure 4. In the figure, \(T = \{m : \mathit{int} \xrightarrow{b_1} o^{1},\ n : \mathit{int} \xrightarrow{b_2} o^{2}\}\), \(b_1 = \{1 : (0, 1), 2 : (0, 1)\}\), and \(b_2 = \{2 : (0, 1)\}\). Several side conditions are omitted in the figure. Note that the effect of sending a message along n is accumulated as an effect of m(1).

The extended type system ETR is sound w.r.t. the basic type system TR in the following sense:

Proposition 4.1 (soundness of the extended type system) If \(T; L \vdash_{ETR} e : (\tau, b)\), then \(\mathit{Erase}(T) \vdash_{TR} \hat{e} : \mathit{Erase}(\tau)\)

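\[
\begin{array}{ll}
\text{(ET-const)} & T; L \vdash c : (\tau, \emptyset) \quad \text{if } (c, \sigma) \in \mathit{Const} \text{ and } \tau \text{ is a generic instance of } \sigma \\[4pt]
\text{(ET-var)} & T\{x \mapsto \tau\}; L \vdash x : (\tau, \emptyset) \\[4pt]
\text{(ET-abs)} & \dfrac{T\{x \mapsto \tau_1\}; L \vdash e : (\tau_2, b)}{T; L \vdash \lambda x.e : (\tau_1 \xrightarrow{b} \tau_2, \emptyset)} \\[8pt]
\text{(ET-app)} & \dfrac{T; L \vdash e_1 : (\tau_1 \xrightarrow{b} \tau_2, \emptyset) \quad T; L \vdash e_2 : (\tau_1, \emptyset)}{T; L \vdash e_1\,e_2 : (\tau_2, b)} \\[8pt]
\text{(ET-rec)} & \dfrac{T\{f \mapsto \tau_1 \xrightarrow{b} \tau_2\}; L \vdash \lambda x.e : (\tau_1 \xrightarrow{b} \tau_2, \emptyset)}{T; L \vdash \texttt{rec } f(x) = e : (\tau_1 \xrightarrow{b} \tau_2, \emptyset)} \\[8pt]
\text{(ET-let)} & \dfrac{T; L \vdash e_1 : (\tau_1, \emptyset) \quad T; L \vdash e_2[e_1/x] : (\tau_2, b)}{T; L \vdash \texttt{let val } x = e_1 \texttt{ in } e_2 \texttt{ end} : (\tau_2, b)} \\[8pt]
\text{(ET-par)} & \dfrac{T; L \vdash e_1 : (o, b_1) \quad T; L \vdash e_2 : (o, b_2)}{T; L \vdash e_1 \mid e_2 : (o, b_1 + b_2)} \\[8pt]
\text{(ET-new)} & \dfrac{T\{x \mapsto \tau \xrightarrow{b'} o^{l}\}; L \vdash e : (o, b) \quad b' \ge \{l : (0, 1)\} \quad l \in L(\iota)}{T; L \vdash \$^{\iota} x.e : (o, b)} \\[8pt]
\text{(ET-com)} & \dfrac{T; L \vdash \lambda x.e_i : (\tau_i \xrightarrow{b_i} o, \emptyset) \quad T; L \vdash m_i : (\tau_i \xrightarrow{b_i'} o^{l_i}, \emptyset) \quad b_i' \ge \mathit{minus}(b_i, \{l_1, \ldots, l_n\})}{T; L \vdash \&_{i=1,n}(m_i(x) \Rightarrow e_i) : (o, \{l_1 : (1, 0)\} \vee \cdots \vee \{l_n : (1, 0)\})} \\[8pt]
\text{(ET-weak)} & \dfrac{T; L \vdash e : (\tau, b) \quad b \le b'}{T; L \vdash e : (\tau, b')}
\end{array}
\]

Fig. 3. Extended typing rules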

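\[
\begin{array}{ll}
 & T; L \vdash m(1) : (o, \{1 : (0, 1), 2 : (0, 1)\}) \\[4pt]
\text{(ET-com)} & \dfrac{\cdots \quad b_1 \ge \mathit{minus}(b_2, \{1\}) \quad \cdots}{T; L \vdash m(x) \Rightarrow n(x) : (o, \{1 : (1, 0)\})} \\[8pt]
\text{(ET-par)} & T; L \vdash m(1) \mid m(x) \Rightarrow n(x) : (o, \{1 : (1, 1), 2 : (0, 1)\}) \\[4pt]
\text{(ET-new)} & \{m \mapsto \mathit{int} \xrightarrow{b_1} o^{1}\}; L \vdash \$^2 n.(m(1) \mid m(x) \Rightarrow n(x)) : (o, \{1 : (1, 1), 2 : (0, 1)\}) \\[4pt]
\text{(ET-new)} & \emptyset; L \vdash \$^1 m.\$^2 n.(m(1) \mid m(x) \Rightarrow n(x)) : (o, \{1 : (1, 1), 2 : (0, 1)\})
\end{array}
\]

Fig. 4. A sample derivation in the extended type system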

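holds, where \(\mathit{Erase}(\tau)\) is defined by:

\[
\mathit{Erase}(\tau_1 \xrightarrow{b} \tau_2) = \mathit{Erase}(\tau_1) \to \mathit{Erase}(\tau_2), \qquad \mathit{Erase}(\tau) = \tau \ \text{ if } \tau \text{ is not an arrow type.}
\]

The definition of Erase is extended pointwise to contexts as follows:

\[
\mathit{Erase}(\emptyset) = \emptyset, \qquad \mathit{Erase}(T \cup \{x : \tau\}) = \mathit{Erase}(T) \cup \{x : \mathit{Erase}(\tau)\}
\]

\(\hat{e}\) is the term obtained by dropping all labels for program points. The proof is straightforward by induction on the structure of the derivation for \(T; L \vdash_{ETR} e : (\tau, b)\).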
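The effect operations defined earlier in this section, and the way they combine in the sample derivation of Figure 4, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: effects are represented as dictionaries from region names to (I, O) pairs, the bound M is fixed at 3, and all function names (cap, plus, join, minus, leq) are hypothetical.

```python
import math

INF = math.inf
M = 3  # assumed bound: counts range over {0, 1, ..., M-1, INF}

def cap(n):
    """Saturate a count: anything not below M becomes INF."""
    return n if n < M else INF

def at(b, l):
    """b(l), which is (0, 0) for regions not mentioned in b."""
    return b.get(l, (0, 0))

def plus(b1, b2):
    """Summation of two effects (used for parallel composition)."""
    return {l: (cap(at(b1, l)[0] + at(b2, l)[0]),
                cap(at(b1, l)[1] + at(b2, l)[1]))
            for l in set(b1) | set(b2)}

def join(b1, b2):
    """Pointwise maximum of two effects."""
    return {l: (max(at(b1, l)[0], at(b2, l)[0]),
                max(at(b1, l)[1], at(b2, l)[1]))
            for l in set(b1) | set(b2)}

def minus(b, regions):
    """Dequeue one receiver in each listed region, if its count is finite and positive."""
    return {l: ((i - 1 if l in regions and 0 < i < M else i), o)
            for l, (i, o) in b.items()}

def leq(b1, b2):
    """The ordering b1 <= b2 on effects."""
    return all(at(b1, l)[0] <= at(b2, l)[0] and at(b1, l)[1] <= at(b2, l)[1]
               for l in set(b1) | set(b2))

# Sample derivation for $1 m.$2 n.(m(1) | m(x) => n(x)), with regions 1 and 2.
b2 = {2: (0, 1)}                           # latent effect of n: one enqueued message
b1 = join({1: (0, 1)}, minus(b2, {1}))     # least b1 allowed by (ET-new) and (ET-com)
receiver = {1: (1, 0)}                     # effect of m(x) => n(x)
total = plus(b1, receiver)                 # effect of m(1) | m(x) => n(x)
```

With these definitions, total is the effect {1: (1, 1), 2: (0, 1)} of Figure 4, and adding a second receiver on m with plus(total, receiver) raises the receiver count of region 1 to 2, as in the second example of this section.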

4.1 Correctness

Correctness of the extended typing rules with respect to the reduction semantics is given in Proposition 4.2 and Corollary 4.3 below.

Proposition 4.2 (subject reduction theorem) If \(T; L \vdash_{ETR} e : (o^{l}, b)\) and \(e \longrightarrow e'\), then \(T; L \vdash_{ETR} e' : (o^{l'}, b)\).³

The proof is done by induction on the structure of the proof for \(e \longrightarrow e'\) (see [4]). As an immediate corollary of the subject reduction theorem, we can prove that the result of the analysis really gives upper bounds on the lengths of message queues and receiver queues.

Corollary 4.3 If \(T; L \vdash_{ETR} e : (o, \{l_1 : (I_1, O_1), \ldots, l_n : (I_n, O_n), \ldots\})\) and \(L(\iota) = \{l_1, \ldots, l_n\}\), then during the evaluation of e, the number of enqueued messages sent to the communication channel created at the program point \(\iota\) is at most \(\max(\{O_1, \ldots, O_n\})\) and the number of waiting processes for the channel is at most \(\max(\{I_1, \ldots, I_n\})\).

Proof sketch of Corollary 4.3 Suppose

\[
T; L \vdash e : (o, \{l_1 : (I_1, O_1), \ldots, l_n : (I_n, O_n), \ldots\}), \qquad L(\iota) = \{l_1, \ldots, l_n\}
\]

and that at some moment the number of enqueued messages sent to a channel m created at the program point \(\iota\) is \(I'\) \((> \max(\{I_1, \ldots, I_n\}))\). Then, there is a transition sequence

\[
e \longrightarrow^{*} \$^{\iota} m.\$n_1.\cdots\$n_k.(m(v_1) \mid \cdots \mid m(v_{I'}) \mid e'')
\]

By the subject reduction theorem,

\[
T; L \vdash \$^{\iota} m.\$n_1.\cdots\$n_k.(m(v_1) \mid \cdots \mid m(v_{I'}) \mid e'') : (o, \{l_1 : (I_1, O_1), \ldots, l_n : (I_n, O_n), \ldots\})
\]

which contradicts the typing rules (ET-new) and (ET-par). □

5 Type/Effect Reconstruction

By Corollary 4.3, for each closed process expression e, we know the upper bounds on the lengths of the message queue and the receiver queue of each channel by computing L, b such that \(\emptyset; L \vdash e : (o, b)\) (\(\emptyset\) denotes an empty map). The procedure to compute L, b consists of the following two steps:

1. First, construct a syntax-directed version of the type/effect rules. We introduce effect variables. By reading the syntax-directed rules bottom-up, we obtain inequalities on effect expressions.
2. By solving the system of inequalities, we get the required result.

³ \(l\) and \(l'\) are not necessarily identical.

5.1 Step 1: Extraction of effect constraints

In order to reconstruct an effect from a given term, we introduce effect variables, ranged over by \(\xi\), and extend the syntax of effects as follows:

Definition 5.1 (effect expression) A set of effect expressions, ranged over by b, is given by the following syntax:

\[
b ::= \{l_1 : (I_1, O_1), \ldots, l_n : (I_n, O_n)\} \mid \xi \mid b_1 + b_2 \mid b_1 \vee b_2 \mid \mathit{minus}(b_1, l)
\]

A judgment form for the syntax-directed rules is \(EC; T; L \vdash e : (\tau, b)\). EC, which is a set of inequalities on effect expressions, restricts the range of effect variables. \(EC \Rightarrow b_1 \le b_2\) means that for any substitution S for effect variables in \(EC \cup \{b_1, b_2\}\), if \(S\,EC\) is satisfied, then \(S b_1 \le S b_2\) is also satisfied. The main part of the syntax-directed rules is given in Figure 5 (see [4] for the complete set of rules).

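\[
\begin{array}{ll}
\text{(SET-abs)} & \dfrac{EC; T\{x \mapsto \tau_1\}; L \vdash e : (\tau_2, b) \quad EC \Rightarrow b \le b'}{EC; T; L \vdash \lambda x.e : (\tau_1 \xrightarrow{b'} \tau_2, \emptyset)} \\[8pt]
\text{(SET-app)} & \dfrac{EC; T; L \vdash e_1 : (\tau_1 \xrightarrow{b} \tau_2, \emptyset) \quad EC; T; L \vdash e_2 : (\tau_1, \emptyset)}{EC; T; L \vdash e_1\,e_2 : (\tau_2, b)} \\[8pt]
\text{(SET-par)} & \dfrac{EC; T; L \vdash e_1 : (o, b_1) \quad EC; T; L \vdash e_2 : (o, b_2)}{EC; T; L \vdash e_1 \mid e_2 : (o, b_1 + b_2)} \\[8pt]
\text{(SET-com)} & \dfrac{EC; T; L \vdash \lambda x.e_i : (\tau_i \xrightarrow{b_i} o, \emptyset) \quad EC; T; L \vdash m_i : (\tau_i \xrightarrow{b_i'} o^{l_i}, \emptyset) \quad EC \Rightarrow b_i' \ge \mathit{minus}(b_i, \{l_1, \ldots, l_n\})}{EC; T; L \vdash \&_{i=1,n}(m_i(x) \Rightarrow e_i) : (o, \{l_1 : (1, 0)\} \vee \cdots \vee \{l_n : (1, 0)\})}
\end{array}
\]

Fig. 5. Syntax-directed rules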

We write SETR for the set of the resulting typing rules, and write \(EC; T; L \vdash_{SETR} e : (\tau, b)\) if \(EC; T; L \vdash e : (\tau, b)\) is derivable in SETR. Note that SETR is syntax-directed; that is, for each syntactic category, the conclusion of only one rule matches. The following proposition ensures that ETR and SETR are essentially equivalent.

Proposition 5.1 (soundness and completeness of syntax-directed rules)
1. If \(EC; T; L \vdash_{SETR} e : (\tau, b)\), then for any ground substitution S for effect variables in \(EC, T, \tau, b\) such that \(S\,EC\) holds, \(ST; L \vdash_{ETR} e : (S\tau, Sb)\).
2. If \(T; L \vdash_{ETR} e : (\tau, b)\), then there is a \(b'\) such that \(\emptyset; T; L \vdash_{SETR} e : (\tau, b')\) and \(b \ge b'\).

Since SETR is syntax-directed, a type/effect reconstruction algorithm is straightforwardly obtained by reading the rules of SETR in a bottom-up manner.

5.2 Step 2: Solving effect constraints

In the second step, in order to compute the minimum effect for a process expression (i.e., an expression of type o), we need to solve the effect constraint EC obtained in Step 1. From the rules of SETR, we know that all the inequalities output by the reconstruction algorithm are of the form \(\xi \ge b\), where \(\xi\) is an effect variable. Since the operations +, \(\vee\) and minus on effects are all monotonic, an effect expression b is monotonic w.r.t. its effect variables. Moreover, since the set of channel regions in EC is finite, constant effects range over a finite space. Therefore, we can prove that the minimum solution of EC always exists and is computable in a finite number of steps.

Definition 5.2 Let EC be a set of inequalities of the following form.

\[
\begin{array}{c}
\xi_1 \ge b_{1,EC}(\xi_1, \ldots, \xi_n) \\
\cdots \\
\xi_n \ge b_{n,EC}(\xi_1, \ldots, \xi_n)
\end{array}
\]

The above inequalities are abbreviated as \(\xi \ge b_{EC}(\xi)\). Then, we define \(\xi^{(k)}_{EC} = (\xi^{(k)}_{1,EC}, \ldots, \xi^{(k)}_{n,EC})\) \((k \ge 0)\) by:

\[
\xi^{(0)}_{EC} = (\emptyset, \ldots, \emptyset)
\]
\[
\xi^{(i+1)}_{EC} = b_{EC}(\xi^{(i)}_{EC}) = \big(b_{1,EC}(\xi^{(i)}_{1,EC}, \ldots, \xi^{(i)}_{n,EC}), \ldots, b_{n,EC}(\xi^{(i)}_{1,EC}, \ldots, \xi^{(i)}_{n,EC})\big)
\]

Proposition 5.2 (existence of minimum solution) For some K, \(\xi^{(K)}_{EC}\) is the least fixpoint of \(b_{EC}\), that is,
1. \(b_{EC}(\xi^{(K)}_{EC}) = \xi^{(K)}_{EC}\)
2. If \(b_{EC}(b') = b'\) for some effect \(b'\), then \(b' \ge \xi^{(K)}_{EC}\)

The above proposition ensures that \(\xi = \xi^{(K)}_{EC}\) is the minimum solution of the effect constraints EC. Then, if \(EC; \emptyset; L \vdash_{SETR} e : (o, b)\) holds, the minimum \(b'\) such that \(\emptyset; L \vdash_{ETR} e : (o, b')\) is given by \(b' = b[\xi^{(K)}_{EC}/\xi]\).⁴
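The iteration of Definition 5.2 can be sketched as a plain fixpoint loop. The code below is an illustration only, not the authors' analyzer: effects are dictionaries from region names to (I, O) pairs, the bound M is assumed to be 2, each right-hand side is given as a Python function of the current assignment, and the names least_solution, norm, plus and join are hypothetical. The sample constraints are made up to show how saturation at infinity guarantees termination for a recursive constraint.

```python
import math

INF = math.inf
M = 2  # assumed bound: counts range over {0, ..., M-1, INF}

def cap(n):
    """Saturate a count: anything not below M becomes INF."""
    return n if n < M else INF

def at(b, l):
    return b.get(l, (0, 0))

def norm(b):
    """Drop (0, 0) entries so equal effects compare equal as dictionaries."""
    return {l: io for l, io in b.items() if io != (0, 0)}

def plus(b1, b2):
    return {l: (cap(at(b1, l)[0] + at(b2, l)[0]),
                cap(at(b1, l)[1] + at(b2, l)[1]))
            for l in set(b1) | set(b2)}

def join(b1, b2):
    return {l: (max(at(b1, l)[0], at(b2, l)[0]),
                max(at(b1, l)[1], at(b2, l)[1]))
            for l in set(b1) | set(b2)}

def least_solution(rhs):
    """rhs maps each effect variable to a function computing its lower bound
    from the current assignment; iterate from the all-empty assignment."""
    xi = {v: {} for v in rhs}                            # step 0: every variable is empty
    while True:
        nxt = {v: norm(f(xi)) for v, f in rhs.items()}   # one application of b_EC
        if nxt == xi:                                    # fixpoint reached
            return xi
        xi = nxt

# Made-up constraints:  xi1 >= {l1:(0,1)} v xi2   and   xi2 >= {l2:(0,1)} + xi2
constraints = {
    "xi1": lambda xi: join({"l1": (0, 1)}, xi["xi2"]),
    "xi2": lambda xi: plus({"l2": (0, 1)}, xi["xi2"]),
}
solution = least_solution(constraints)
```

Because the right-hand sides are monotonic and the counts saturate at infinity, the loop stops after a few rounds; here the self-referential constraint on xi2 drives its message count for l2 to infinity, and that value propagates into xi1.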

Example The following expression creates a new point process (point has been defined in Section 2) with state variables x = 1.0, y = 0.0, and then sends getx messages twice to the point process.

    $getx.$gety.$set.(point(1.0,0.0)(getx,gety,set)  (* creates new point process *)
      | $reply.(getx(reply) |                         (* sends getx message *)
          reply(x)=>(getx(reply) | reply(x')=>_)))    (* receives reply, and then sends getx message again *)

Our analysis infers the effect of the above expression as

\[
\{l_1 : (1, 1),\ l_2 : (1, 0),\ l_3 : (1, 0),\ l_4 : (1, 1)\}
\]

where \(l_1, l_2, l_3, l_4\) are respectively the regions of channels getx, gety, set, and reply. Please note that the number of enqueued messages for channels getx and reply is counted as 1, although messages are sent twice along channels getx and reply.

⁴ There are more efficient algorithms for solving the effect constraint. In fact, our current experimental analyzer incorporates a symbolic simplification method instead of the above iterative method.

6 Examples of Optimization

This section gives two examples of optimizations based on information obtained by our analysis.

Reduction of costs for queue manipulation Since communication in HACL is based on one-to-one-of-many communication, there might be multiple senders and receivers for each communication channel. Therefore, two queues, one for messages and the other for message receivers, are required for each communication channel. A typical implementation of message sending is: (1) if the receiver queue is empty, just enqueue the argument of the message; (2) otherwise, pick up a receiver from the queue and process it. Conversely, message reception is implemented as: (1) if the message queue is empty, just enqueue the continuation; (2) otherwise, pick up a message from the queue and execute the continuation on it. Given upper bounds on the numbers of enqueued messages and receivers from our program analysis, we can represent each queue as an array of a fixed size, and no boundary check is required if the upper bound is finite.

Elimination of redundant message passing By replacing the typing rule (ET-com) by the following rule (ET-com') in Section 3, we can approximate the maximum number of messages that might be sent for each communication channel.

\[
\text{(ET-com')} \quad \dfrac{T; L \vdash \lambda x.e : (\tau \xrightarrow{b} o, \emptyset) \quad T; L \vdash m : (\tau \xrightarrow{b'} o^{l}, \emptyset) \quad b' \ge b}{T; L \vdash m(x) \Rightarrow e : (o, \{l : (1, 0)\})}
\]

Then, under a certain condition, we can completely eliminate message passing. Consider the following fragment of a program:

    $reply.(m(reply) | reply(x)=>e)

The above fragment creates a new communication channel reply, sends the message m(reply), and gets a value x via reply. If we know that a message can be sent to reply at most once and that it is received only by reply(x)=>e, we can transform the above fragment to:

    let fun k x = e in m(k) end

Note that this transformation is possible because message send and function application are overloaded in HACL.

Thus, creation of the channel reply and message passing on the channel are eliminated. Moreover, consider the case where e is just the message send n(x):

    $reply.(m(reply) | reply(x)=>n(x))

It can be transformed to:

    let fun k x = n x in m(k) end

Since k(x) = n(x), it is simplified to:

    m(n)
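The three forms above can be compared side by side in a short Python sketch. It is an illustration under stated assumptions, not HACL code: the Channel class is a hypothetical stand-in whose call operator plays the role of message send (mirroring the overloading of send and application), and m is a hypothetical server that replies exactly once, which is the condition under which the transformation is claimed to be valid.

```python
from collections import deque


class Channel:
    """Illustrative channel; calling it sends a message."""

    created = 0  # counts channel creations so the versions can be compared

    def __init__(self):
        Channel.created += 1
        self.messages = deque()
        self.receivers = deque()

    def __call__(self, value):
        if self.receivers:
            self.receivers.popleft()(value)
        else:
            self.messages.append(value)

    def receive(self, continuation):
        if self.messages:
            continuation(self.messages.popleft())
        else:
            self.receivers.append(continuation)


def m(reply):
    # Hypothetical server: answers exactly once on whatever it is given.
    reply(42)


def original(e):
    # $reply.(m(reply) | reply(x)=>e)
    reply = Channel()
    m(reply)
    reply.receive(e)


def transformed(e):
    # let fun k x = e in m(k) end
    def k(x):
        return e(x)
    m(k)


def forwarding(n):
    # $reply.(m(reply) | reply(x)=>n(x)), simplified to m(n)
    m(n)
```

All three functions deliver the same value to the continuation, but only original allocates a channel and goes through its queues.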

This kind of situation often occurs in concurrent object-oriented programming. For example, consider the situation where an object A delegates a message to another object B, receives a reply from B, and then forwards the reply to the sender of the original message. By the above optimization, the reply from B can be sent directly to the original sender. The last example looks quite similar to tail recursion elimination in continuation-passing-style compilation for functional languages [3]. Note, however, that the above optimization is possible only by analyzing communication behavior between concurrent processes.
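Returning to the first optimization of this section, the send and receive protocol over fixed-size queues can be sketched as follows. This is a minimal Python illustration, assuming the analysis has supplied finite bounds on enqueued receivers and messages for the channel; FixedQueue and BoundedChannel are hypothetical names. Python still checks list indices at run time, so the sketch only shows the data layout and the absence of an explicit overflow test, not the actual saving a compiler would obtain.

```python
class FixedQueue:
    """Ring buffer whose capacity is a statically inferred upper bound."""

    def __init__(self, bound):
        self.slots = [None] * bound  # allocated once, never resized
        self.head = 0
        self.count = 0

    def push(self, item):
        # No overflow test: the analysis is assumed to guarantee count < bound here.
        self.slots[(self.head + self.count) % len(self.slots)] = item
        self.count += 1

    def pop(self):
        item = self.slots[self.head]
        self.slots[self.head] = None
        self.head = (self.head + 1) % len(self.slots)
        self.count -= 1
        return item

    def __bool__(self):
        return self.count > 0


class BoundedChannel:
    def __init__(self, max_receivers, max_messages):
        self.receivers = FixedQueue(max_receivers)
        self.messages = FixedQueue(max_messages)

    def send(self, value):
        if self.receivers:
            self.receivers.pop()(value)   # (2) a receiver is waiting: run it
        else:
            self.messages.push(value)     # (1) no receiver: enqueue the argument

    def receive(self, continuation):
        if self.messages:
            continuation(self.messages.pop())  # (2) a message is waiting
        else:
            self.receivers.push(continuation)  # (1) no message: enqueue the continuation
```

For a channel whose inferred effect is (1, 1), such as getx or reply in the example of the previous section, BoundedChannel(max_receivers=1, max_messages=1) needs a single slot per queue.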

7 Limitations and Refinements of the Simple Analysis

Although even the simple version of our analysis gives more useful information than Nielson's [9] in many cases, there are still two major problems. Because of space restrictions, we only briefly discuss refinements that overcome those problems.

7.1 Polymorphism on Labels

The first problem is that too many channels are merged into the same channel region by unification during the type/effect reconstruction. For example, in the following expression,

    $getx.$gety.$set.(point(1.0,0.0)(getx,gety,set)
      | $reply.(getx(reply))
      | $reply'.(getx(reply')))

our analysis identifies channels reply and reply' and counts the number of messages sent to the channel region as 2, although they are separate channels and the number of messages sent for each channel is 1. In order to overcome this problem, we need to introduce polymorphism on channel regions. Types are extended as follows:

\[ \tau ::= \alpha \mid B \mid o \mid \tau_1 \xrightarrow[l]{\;b\;} \tau_2 \mid \forall l.\tau \]

where b ranges over effects and l ranges over channel regions. We often write \(\forall l_1 \cdots l_n.\tau\) for \(\forall l_1. \cdots \forall l_n.\tau\).
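The merging problem described at the start of this subsection can be illustrated with a toy unification model. The Python sketch below is not the reconstruction algorithm of the paper; it assumes, for illustration only, that getx has a single parameter region (called l_arg here) that is unified with the region of every channel passed to it, and that one reply is sent per getx message. All names are hypothetical.

```python
class Regions:
    """Union-find over region names, standing in for unification of regions."""

    def __init__(self):
        self.parent = {}

    def fresh(self, name):
        self.parent[name] = name
        return name

    def find(self, r):
        while self.parent[r] != r:
            self.parent[r] = self.parent[self.parent[r]]  # path halving
            r = self.parent[r]
        return r

    def union(self, a, b):
        ra, rb = self.find(a), self.find(b)
        if ra != rb:
            self.parent[rb] = ra


def messages_per_region(regions, sends):
    """Count one sent message per entry of `sends`, grouped by unified region."""
    counts = {}
    for region in sends:
        root = regions.find(region)
        counts[root] = counts.get(root, 0) + 1
    return counts


def monomorphic():
    # One shared parameter region: both calls to getx unify with it.
    r = Regions()
    arg = r.fresh("l_arg")
    reply, reply2 = r.fresh("l_reply"), r.fresh("l_reply'")
    r.union(arg, reply)
    r.union(arg, reply2)
    return messages_per_region(r, [reply, reply2])


def polymorphic():
    # The bound region is instantiated afresh at each call to getx.
    r = Regions()
    reply, reply2 = r.fresh("l_reply"), r.fresh("l_reply'")
    r.union(r.fresh("l_arg#1"), reply)
    r.union(r.fresh("l_arg#2"), reply2)
    return messages_per_region(r, [reply, reply2])
```

With a single shared region the two replies are attributed to one region and counted as 2; with a fresh instance per call each region is counted as 1, which is the effect that quantification over channel regions is meant to achieve.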

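We can accordingly refine the typing rules (ET-abs), (ET-app), (ET-rec), (ET-new) and (ET-com) in ETR as follows:

(ET2-abs)
\[ \frac{T\{x \mapsto \tau_1\}; L \vdash e : (\tau_2; b) \qquad \{l_1, \ldots, l_n\} \cap FL(T; L) = \emptyset}{T; L \vdash \lambda x.e : (\forall l_1 \cdots l_n.(\tau_1 \xrightarrow{\;b\;} \tau_2); \emptyset)} \]

(ET2-app)
\[ \frac{T; L \vdash e_1 : (\forall l_1 \cdots l_n.(\tau_1 \xrightarrow{\;b\;} \tau_2); \emptyset) \qquad T; L \vdash e_2 : (\theta\tau_1; \emptyset) \qquad \theta = [l_1'/l_1, \ldots, l_n'/l_n]}{T; L \vdash e_1\, e_2 : (\theta\tau_2; \theta b)} \]

(ET2-rec)
\[ \frac{T\{f \mapsto \forall l_1 \cdots l_n.(\tau \xrightarrow{\;b\;} o)\}; L \vdash \lambda x.e : (\forall l_1 \cdots l_n.(\tau \xrightarrow{\;b\;} o); \emptyset)}{T; L \vdash \mathrm{rec}\ f(x) = e : (\forall l_1 \cdots l_n.(\tau \xrightarrow{\;b\;} o); \emptyset)} \]

(ET2-new)
\[ \frac{T\{x \mapsto \forall l_1 \cdots l_n.(\tau \xrightarrow[l]{\;b'\;} o)\}; L \vdash e : (o; b) \qquad b' \geq \{l : (0, 1)\} \qquad l \in L(\pi)}{T; L \vdash \$^{\pi} x.e : (o; b)} \]

(ET2-com)
\[ \frac{T; L \vdash m_i : (\forall l_1 \cdots l_n.(\tau_i \xrightarrow[l_i]{\;b_i'\;} o); \emptyset) \qquad T; L \vdash \lambda x.e_i : (\forall l_1 \cdots l_n.(\tau_i \xrightarrow{\;b_i\;} o); \emptyset) \qquad b_i' \geq \mathrm{minus}(b_i, \{l_1, \ldots, l_n\})}{T; L \vdash \&_{i=1,n}(m_i(x) \Rightarrow e_i) : (o; \{l_1 : (1, 0)\} \vee \cdots \vee \{l_n : (1, 0)\})} \]

FL(T; L) is the set of free regions (i.e. regions that are not bound by \(\forall l\)) which appear in T; L. With this extension, it is not difficult to check that the subject reduction theorem still holds.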

7.2 Channel Creation in Recursion

The second problem is related to channel creation in recursive processes. With the rule (ET2-rec), if channels are recursively created in an expression rec f(x) = e, all effects for them are added together, although each channel is actually separate. Thus, we can obtain only rough information for channels created in recursion. A fundamental difficulty lies in this problem, because we must deal with infinitely many channel regions. In order to overcome it, we introduce a new kind of variables \(\rho_1, \rho_2, \ldots\) for representing sets of (possibly infinitely many) channel regions. Effect constants are extended as follows:

\[ \{\rho_1 : (I_1, O_1), \ldots, \rho_m : (I_m, O_m),\ l_1 : (I_1', O_1'), \ldots, l_n : (I_n', O_n')\} \]

Intuitively, \(\{\rho : (I_1, O_1)\}\) represents the constant effect \(\bigvee \{\{l : (I_1, O_1)\} \mid l \in \rho\}\). We also extend a region assignment to a map from each program point to a set consisting of channel regions and these new variables. We introduce the following new rule for recursion.

(ET3-rec) T 0 ff 7! 8l1 1 1 1 ln:( !b o)gL0 ` x:e : ([( [ L1)=](8l1 1 1 1 ln:( !b o)); ) T 0 ; L0  [( [ L1)=](T ; L) b L1 \ FL(T ff 7! 8l1 11 1 ln :( ! o)g; L) =  T ; L ` rec f (x) = e : (8l1 1 1 1 ln: !b o); )

where \(L_1, L_2\) denote finite sets of channel regions. Intuitively, a variable \(\rho\) in the above rule represents the infinitely many regions of recursively created channels at the program point \(\pi\), and \(L_1\) is the set of regions of channels created in each expansion of the recursive definition. Let ETR3 be the set of rules obtained from ETR2 by replacing (ET2-rec) with (ET3-rec). The obvious subject reduction theorem no longer holds (see [4]). However, the theorem corresponding to Corollary 4.3 still holds.

8 Related Work

Formal static analysis methods proposed so far for concurrent languages have been very limited. Although several studies have addressed I/O mode checking and inference [10][7][18], as far as the authors know, only Nielson's techniques [9][8] for Concurrent ML [12] can extract more useful information such as the number of input/output operations performed. However, Nielson's method for accumulating effects seems much closer to the traditional one for effect analysis of store [14], and cannot cope with the problems stated in Section 3. Of course, our method for accumulating effects also has a disadvantage: when there are multiple senders and a single receiver, extra effects are accumulated in our analysis. However, we believe that this problem can be overcome with a minor change to our analysis. Andreoli et al. [2] also proposed a static analysis method for asynchronous concurrent languages, but it cannot handle communication channels as first-class values.

9 Conclusion

We proposed a static analysis method for message passing in asynchronous concurrent programming languages, and discussed optimizations based on the proposed analysis. We have already implemented and tested the proposed static analysis algorithm, and are now trying to integrate it into a compiler for HACL. We have not formally investigated the computational complexity of our analysis. We expect that the complexity is exponential in the nesting depth of let-declarations in the worst case, but think that this is not serious in practice, as with ML type inference. Future work includes refinements of the analysis method and its application to compilers for concurrent object-oriented languages. As for the refinement of the analysis, the introduction of subtyping [16] to effect systems seems promising. We expect that subtyping can be introduced into our analysis by taking the I/O modes of channels into account.

References

1. Agha, G., Actors: A Model of Concurrent Computation in Distributed Systems. MIT Press, 1986.
2. Andreoli, J.-M., R. Pareschi, and T. Castagnetti, "Abstract Interpretation of Linear Logic Programming," in Proceedings of International Logic Programming Symposium, pp. 315-334, 1993.
3. Appel, A. W., Compiling with Continuations. Cambridge University Press, 1992.
4. Kobayashi, N., M. Nakade, and A. Yonezawa, "Static Analysis on Communication for Asynchronous Concurrent Programming Languages," Tech. Rep. 95-04, Department of Information Science, University of Tokyo, April 1995.
5. Kobayashi, N., and A. Yonezawa, "Asynchronous Communication Model Based on Linear Logic," to appear in Journal of Formal Aspects of Computing, Springer-Verlag.
6. Kobayashi, N., and A. Yonezawa, "Type-Theoretic Foundations for Concurrent Object-Oriented Programming," in Proceedings of ACM SIGPLAN Conference on Object-Oriented Programming Systems, Languages, and Applications (OOPSLA'94), pp. 31-45, 1994.
7. Kobayashi, N., and A. Yonezawa, "Higher-Order Concurrent Linear Logic Programming," in Theory and Practice of Parallel Programming, vol. 907 of Lecture Notes in Computer Science, pp. 137-166, Springer-Verlag, 1995.
8. Nielson, F., and H. R. Nielson, "Constraints for Polymorphic Behaviors of Concurrent ML," in Proceedings of CCL'94, vol. 845 of Lecture Notes in Computer Science, pp. 73-88, Springer-Verlag, 1994.
9. Nielson, H. R., and F. Nielson, "Higher-Order Concurrent Programs with Finite Communication Topology," in Proceedings of ACM SIGACT/SIGPLAN Symposium on Principles of Programming Languages, pp. 84-97, 1994.
10. Pierce, B., and D. Sangiorgi, "Typing and Subtyping for Mobile Processes," in Proceedings of IEEE Symposium on Logic in Computer Science, pp. 376-385, 1993.
11. Pierce, B. C., "Programming in the Pi-Calculus: An Experiment in Programming Language Design." Lecture notes for a course at the LFCS, University of Edinburgh, 1993.
12. Reppy, J. H., "CML: A Higher-order Concurrent Language," in Proceedings of the ACM SIGPLAN'91 Conference on Programming Language Design and Implementation, pp. 293-305, 1991.
13. Sangiorgi, D., Expressing Mobility in Process Algebras: First-Order and Higher-Order Paradigms. PhD thesis, University of Edinburgh, 1992.
14. Talpin, J.-P., and P. Jouvelot, "Polymorphic type, region and effect inference," Journal of Functional Programming, vol. 2, no. 3, pp. 245-271, 1992.
15. Talpin, J.-P., and P. Jouvelot, "The Type and Effect Discipline," in Proceedings of IEEE Symposium on Logic in Computer Science, pp. 162-173, 1992.
16. Tang, Y.-M., and P. Jouvelot, "Effect systems with subtyping," in ACM SIGPLAN Symposium on Partial Evaluation and Semantics-Based Program Manipulation (to appear), 1995.
17. Taura, K., S. Matsuoka, and A. Yonezawa, "An Efficient Implementation Scheme of Concurrent Object-Oriented Language on Stock Multicomputers," in Proc. ACM Conf. on Principles and Practice of Parallel Programming (PPOPP), 1993.
18. Ueda, K., and M. Morita, "Moded Flat GHC and Its Message-Oriented Implementation Technique," New Generation Computing, vol. 36, no. 3, pp. 3-43, 1994.
19. Yonezawa, A., and M. Tokoro, Object-Oriented Concurrent Programming. The MIT Press, 1987.
