Sparse Expander-like Real-valued Projection (SERP) matrices for compressed sensing
Abdolreza Abdolhosseini Moghadam and Hayder Radha
Department of Electrical and Computer Engineering, Michigan State University, East Lansing, MI, U.S.
Emails: {abdolhos, radha}@msu.edu

Abstract—Sparse binary projection matrices, corresponding to high-quality expander graphs, are arguably the most commonly used sensing matrices in combinatorial approaches to compressed sensing. In this paper, we are interested in properties of Sparse Expander-like Real-valued Projection (SERP) matrices, which are constructed by replacing the non-zero entries of expander-graph sparse binary projection matrices with Gaussian random variables. We prove that these sparse real-valued matrices have a "weak" form of the Restricted Isometry Property (RIP). We also show that this weak RIP enables this class of matrices to be utilized in optimal greedy and geometrical frameworks for compressed sensing. Furthermore, we introduce a combinatorial solver for the proposed class of sparse real-valued projection matrices. While maintaining low complexity, we prove that this combinatorial solver is (a) robust in the presence of noise and (b) optimal in terms of compressive-sample requirements. Our simulation results verify that these matrices can be robustly utilized in geometrical, greedy and combinatorial compressed sensing frameworks.

Keywords: Compressed sensing, combinatorial approaches.

I. INTRODUCTION

Consider the problem of recovering an unknown signal x ∈ ℝⁿ from a set of linear samples y = Φx ∈ ℝᵐ when the system is under-determined (i.e. m < n) and the signal x is known to be sufficiently sparse such that it is the unique sparsest solution. Given the samples y and the sensing operator Φ ∈ ℝ^{m×n}, the straightforward approach for finding the unique sparse signal x is to perform an exhaustive search. That is, by the assumptions of the problem,

x̂ = arg min ‖x‖₀  s.t.  y = Φx    (1)
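Problem (1) amounts to a combinatorial search over supports. The following minimal sketch (our own illustration with made-up dimensions; the helper name l0_exhaustive is not from the paper) solves (1) by brute force and makes the exponential cost of the exhaustive approach concrete.

    import itertools
    import numpy as np

    def l0_exhaustive(Phi, y, k, tol=1e-9):
        """Brute-force solver for problem (1): search all supports of size at most k."""
        m, n = Phi.shape
        for s in range(k + 1):                          # try sparser supports first
            if s == 0:
                if np.linalg.norm(y) <= tol:
                    return np.zeros(n)
                continue
            for support in itertools.combinations(range(n), s):
                sub = Phi[:, list(support)]             # m x s column submatrix
                coef, *_ = np.linalg.lstsq(sub, y, rcond=None)
                if np.linalg.norm(sub @ coef - y) <= tol:   # y = Phi x is satisfied
                    x = np.zeros(n)
                    x[list(support)] = coef
                    return x
        return None                                      # no solution with sparsity <= k

    rng = np.random.default_rng(0)
    Phi = rng.standard_normal((8, 20))
    x_true = np.zeros(20)
    x_true[[3, 11]] = [1.5, -2.0]
    print(l0_exhaustive(Phi, Phi @ x_true, k=2))

Even at these toy sizes the search visits every support of size up to k, which is what makes (1) intractable at scale.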

The problem of Compressed Sensing (CS) [1][2] deals with designing a solver Δ(Φ, y) and a sensing mechanism Φ such that problem (1) can be solved efficiently [12]. Clearly, the most desirable attributes for a CS framework are (a) demanding the minimum number of samples m while providing a good level of reconstruction (i.e. keeping ‖Δ(Φ, y) − x‖₂ small), and at the same time (b) keeping the complexity of the decoder Δ as low as possible. Roughly speaking, there are three general approaches to finding the solution of (1) [5]: convex relaxation (geometrical) methods [2][3][4], greedy algorithms [5] and combinatorial approaches [6][7][8][11]. To motivate the contribution and focus of this paper, we briefly outline the most salient attributes of these CS approaches. The decoder of convex relaxation methods finds the unique sparsest solution to y = Φx by solving the ℓ₁ convex relaxation of (1), i.e.

x̂ = arg min ‖x‖₁  s.t.  y = Φx    (2)
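Problem (2) is a linear program once x is split into non-negative parts. A minimal sketch using SciPy's linprog (the splitting variables u and v and the toy dimensions are our own choices, not part of the paper):

    import numpy as np
    from scipy.optimize import linprog

    def basis_pursuit(Phi, y):
        """Solve min ||x||_1 s.t. Phi x = y via the LP with x = u - v, u, v >= 0."""
        m, n = Phi.shape
        c = np.ones(2 * n)                         # objective: sum(u) + sum(v) = ||x||_1
        A_eq = np.hstack([Phi, -Phi])              # Phi u - Phi v = y
        res = linprog(c, A_eq=A_eq, b_eq=y, bounds=(0, None), method="highs")
        u, v = res.x[:n], res.x[n:]
        return u - v

    rng = np.random.default_rng(1)
    Phi = rng.standard_normal((40, 100)) / np.sqrt(40)
    x = np.zeros(100)
    x[rng.choice(100, 5, replace=False)] = rng.standard_normal(5)
    print(np.linalg.norm(basis_pursuit(Phi, Phi @ x) - x))   # close to 0 when recovery succeeds

For a sufficiently sparse x and enough samples, the recovered vector matches x up to numerical tolerance.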

It has been shown [1][2] that such relaxation methods, which map the ℓ₀ pseudo-norm in the objective function of (1) to the convex ℓ₁ norm, lead to exact recovery of the signal x if Φ approximately preserves the ℓ₂ norm of sparse vectors under projection (a property called the Restricted Isometry Property, or RIP), i.e. ‖Φx‖₂ ≈ ‖x‖₂. Most notably, dense Gaussian random matrices possess this property with overwhelming probability if m (the number of samples/rows of Φ) is sufficiently large. Although tractable and demanding the fewest number of samples/equations m to guarantee perfect recovery, solving such a linear program can in general be very complex (e.g. O(n³) for Basis Pursuit [3]). On the other extreme, combinatorial decoders [6][7][8][9][11] are much less complex than convex relaxation, but they require a higher number of compressive samples m to guarantee recovery. Greedy approaches, which also require incoherent sensing matrices with RIP, in general fall in between combinatorial and convex-relaxation based methods in terms of their complexity and sampling requirements [5][14]. Meanwhile, due to the group-testing nature of combinatorial solvers, the projection matrices applicable in such approaches have to be sparse, and thus the popular dense sensing matrices of greedy/convex relaxation frameworks cannot be utilized in the sensing stage of a combinatorial CS framework. In fact, from the encoder (sensing) perspective, a "good" projection or sensing matrix Φ in a combinatorial approach should correspond to the sparse binary adjacency matrix of an expander graph [8]. The authors of [8] showed that such sparse binary matrices approximately preserve the ℓ₁ norm of sparse vectors under projection (‖Φx‖₁ ≈ ‖x‖₁) and hence named this property RIP-1 [8]. However, it has been shown in [10] that such binary matrices with RIP-1 do not possess the original RIP. Despite the fact that the authors of [8] proved that convex relaxation CS solvers can work with sparse binary projection matrices in the sensing stage, the ℓ₁ norm preservation of sparse binary projections might lead to weaker guarantees of recovery (ℓ₁ < ℓ₁, to be explained in the next section) when compared to the stronger ones (ℓ₂ < ℓ₁) in the case of greedy/convex relaxation approaches [16]. In this paper, we are interested in finding a sensing operator Φ which can be utilized in all three types of CS approaches (combinatorial, convex relaxation, and greedy) and, at the same time, can provide recovery guarantees in the ℓ₂ norm sense.
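To make the RIP versus RIP-1 distinction tangible, the following sketch (purely illustrative; all parameter values are arbitrary) compares how a dense Gaussian matrix concentrates the ℓ₂ norm of random sparse vectors while a left d-regular binary matrix, rescaled by d, concentrates their ℓ₁ norm:

    import numpy as np

    rng = np.random.default_rng(2)
    m, n, k, d = 60, 200, 5, 8

    def sparse_x():
        x = np.zeros(n)
        x[rng.choice(n, k, replace=False)] = rng.standard_normal(k)
        return x

    gauss = rng.standard_normal((m, n)) / np.sqrt(m)      # dense Gaussian, entries N(0, 1/m)
    binary = np.zeros((m, n))
    for i in range(n):                                    # left d-regular binary matrix
        binary[rng.choice(m, d, replace=False), i] = 1.0

    def ratio_range(Phi, p, scale, trials=500):
        """Range of ||Phi x||_p / (scale * ||x||_p) over random k-sparse x."""
        r = []
        for _ in range(trials):
            x = sparse_x()
            r.append(np.linalg.norm(Phi @ x, p) / (scale * np.linalg.norm(x, p)))
        return min(r), max(r)

    print("l2 ratios, Gaussian:", ratio_range(gauss, 2, 1.0))   # clusters near 1 (RIP-style behavior)
    print("l1 ratios, binary:  ", ratio_range(binary, 1, d))    # clusters near 1 (RIP-1-style behavior)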

We show that by some modification of the adjacency matrices of expander graphs, one can construct a class of sensing matrices which exhibits a "weak" form of RIP. Let us clarify what we mean by weak here. As stated before, the formal definition of RIP demands that ‖Φx‖₂ be close to ‖x‖₂ (with high probability) for "all" possible k-sparse signals x. Although RIP is known to be a sufficient condition guaranteeing the success of geometrical approaches to solving CS, it has been reported in numerous papers (e.g. [17][18][19]) that the empirical behavior of linear programming solvers suggests there might be less constraining properties than RIP governing the success of linear programming in the CS context. This observation led to the investigation of alternative sufficient conditions, which in turn led to the introduction of different types of RIP, for instance statistical RIP [18] and weak RIP [20], to name a few. This paper is in line with those efforts. More specifically, we focus on the probability of recovering any arbitrary but fixed k-sparse signal, and not on the probability of failure of the decoder over all possible k-sparse signals; it should be clear that the first condition is weaker than the latter. This notion of fixing the signal support has been introduced and used before, for instance in [20]. Using basic arguments, we show that under this measure (the probability of failure for a fixed arbitrary signal), a simpler and weaker version of RIP, which we refer to as w-RIP (and which is very similar to the weak RIP in [20]), is sufficient to recover any arbitrary fixed signal with high probability under convex relaxation methods and under at least one optimal greedy algorithm, namely Cosamp [5]. The proposed w-RIP potentially broadens the admissibility of many sensing matrices that do not possess the original RIP to be considered "good" sensing mechanisms, at least in the sense of "any arbitrary but fixed k-sparse signal". We show that if a matrix Φ ∈ ℝ^{m×n} is constructed by replacing the non-zero entries of the sparse binary adjacency matrix of an expander graph with Gaussian random variables, then Φ benefits from w-RIP and hence can be utilized in geometrical and greedy frameworks. Also, we show that there exists at least one combinatorial algorithm (based on Sequential Sparse Matching Pursuit [16]) for the proposed Φ with the ℓ₂ < ℓ₁ guarantee of reconstruction. To the best of our knowledge, no other class of sensing projection matrices has been introduced in prior works that can provide an ℓ₂ < ℓ₁ guarantee of reconstruction under all three approaches to CS. This paper is organized as follows: in Section II, the proposed SERP and our definition of w-RIP are introduced. We prove that if one targets recovering an arbitrary but fixed signal, then w-RIP is a sufficient condition of success for at least one greedy and all geometrical CS frameworks. In the same section, we prove that SERP matrices adhere to this w-RIP and thus can be safely utilized in the aforementioned CS frameworks. In Section III, we show that for the proposed class of SERP matrices, there exists a fast combinatorial algorithm that can robustly recover a signal from its noisy samples. Simulation results are presented in Section IV, and Section V concludes the paper.
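The fixed-signal viewpoint behind w-RIP can be probed empirically: fix one sparse signal, redraw the sensing matrix many times, and estimate how often its norm is preserved within a tolerance δ. A minimal sketch of that experiment (a plain Gaussian matrix is used here as a stand-in; the SERP construction itself appears in Section II):

    import numpy as np

    def empirical_fixed_signal_rip(x, m, trials=1000, delta=0.1, rng=np.random.default_rng(3)):
        """Fraction of draws of Phi for which (1-delta) <= ||Phi x||^2 / ||x||^2 <= (1+delta)."""
        n = x.size
        ok = 0
        for _ in range(trials):
            Phi = rng.standard_normal((m, n)) / np.sqrt(m)      # stand-in sensing matrix
            ratio = np.linalg.norm(Phi @ x) ** 2 / np.linalg.norm(x) ** 2
            ok += (1 - delta) <= ratio <= (1 + delta)
        return ok / trials

    x = np.zeros(400)
    x[:10] = 1.0                                                # one fixed 10-sparse signal
    print(empirical_fixed_signal_rip(x, m=80))                  # close to 1 for this single fixed x

Nothing is claimed here about all sparse signals simultaneously; only the one fixed signal is tested, which is exactly the relaxation the w-RIP notion exploits.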

A. Notations and definitions

For a natural number n ∈ ℕ, we define [n] ≔ {1, 2, …, n}. For a matrix Φ ∈ ℝ^{m×n} and a row index j ∈ [m], we define N_j ≔ {i : Φ_{j,i} ≠ 0}. Hence, N_j is the set of column indices where the j-th row of Φ is non-zero. We generalize this notation for any set of rows R ⊆ [m] as N_R = ⋃_{j∈R} N_j. The set of row indices where the i-th column of Φ is non-zero is Ω_i, i.e. Ω_i = {j : Φ_{j,i} ≠ 0}. Similarly, one can generalize this notation for any set of columns S ⊆ [n] by Ω_S = ⋃_{i∈S} Ω_i. A zero vector of length n is denoted by 0_n. The support of a vector x is denoted by supp{x} = {i : x_i ≠ 0}. A Gaussian distribution with mean μ and variance σ² is represented by N(μ, σ²). By x_k = x[k], we mean the vector with the same length as x formed by keeping the k largest (in absolute value) coefficients of x and setting the rest of the coefficients to zero. Let x be any vector in ℝⁿ. Then we say that a pair of a projection matrix Φ and a decoder Δ(Φ, y) for problem (1) guarantees ℓ₂ < ℓ₁ recovery if ‖Δ(Φ, y) − x‖₂ < c‖x[k] − x‖₁ for some constant c = O(1). For a fixed decoder Δ and projection matrix Φ, the relative error of reconstruction from the samples y = Φx is ‖Δ(Φ, y) − x‖₂/‖x‖₂.

In combinatorial approaches, it is beneficial to model the projection matrix Φ using a bipartite graph G = (A, B, E), where A and B are the sets of left and right nodes/vertices and E is the set of edges connecting vertices in A to vertices in B. The i-th left node in A corresponds to the i-th signal coefficient (x_i) while the j-th right node corresponds to the j-th sample (y_j). An edge (i, j) exists in the bipartite graph G if and only if Φ_{j,i} ≠ 0 (i.e. y_j spans x_i). For a fixed set of vertices S, Γ(S) denotes the set of neighbors of S. A graph is left/right d-regular if and only if the degree of each left/right node is exactly d. We say that G is a (k, d, ε) expander if G is left d-regular and for any subset S ⊆ [n] with |S| ≤ k we have |Γ(S)| ≥ (1 − ε)d|S|. Note that ε ∈ [0, 1].
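The expansion condition can be spot-checked numerically from the matrix itself. The sketch below (random subsets only, so a sanity check rather than a certificate; all parameter values are illustrative) computes Γ(S) and tests |Γ(S)| ≥ (1 − ε)d|S| on sampled subsets:

    import numpy as np

    def neighbors(Phi, S):
        """Gamma(S): row indices (right nodes) touched by the columns (left nodes) in S."""
        return set(np.nonzero(Phi[:, list(S)])[0])

    def spot_check_expansion(Phi, k, d, eps, trials=1000, rng=np.random.default_rng(4)):
        n = Phi.shape[1]
        worst = 1.0
        for _ in range(trials):
            s = rng.integers(1, k + 1)
            S = rng.choice(n, s, replace=False)
            worst = min(worst, len(neighbors(Phi, S)) / (d * s))
        return worst >= (1 - eps), worst          # True if every sampled S expanded enough

    # left d-regular binary matrix, as in the expander discussion above
    m, n, d = 120, 400, 8
    rng = np.random.default_rng(5)
    Phi = np.zeros((m, n))
    for i in range(n):
        Phi[rng.choice(m, d, replace=False), i] = 1.0
    print(spot_check_expansion(Phi, k=10, d=d, eps=0.25))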

As in some prior related papers (e.g. [5]), we follow the assumption that the signal of interest is exactly sparse (i.e. most of the signal entries are zero). In the case of compressible signals, where most of the coefficients have very small magnitudes but are not exactly zero, one can resort to the technique used in [5]. More specifically, assume a signal x which is possibly non-zero in all indices, and let x[k] be the vector that keeps the k largest coefficients of x. Then the system of equations y = Φx + e is equivalent to the system y = Φx[k] + e′, where e′ = Φ(x − x[k]) + e can be thought of as a new noise that encodes both the sampling noise e and the tail of the signal (x − x[k]). The idea is that, if x is a compressible signal whose sorted coefficients decay rapidly, then the energy of the tail of the signal should be much smaller than that of the signal itself: ‖x − x[k]‖ ≪ ‖x‖. Hence, one can target finding the k-sparse signal x[k] (which is the best k-sparse approximation to the original signal x) instead of the dense original (compressible) signal x.
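The reduction to the best k-term approximation is straightforward to express in code. A short sketch of x[k] and of the equivalent noise e′ = Φ(x − x[k]) + e used above (the power-law test signal and the dimensions are our own choices for illustration):

    import numpy as np

    def best_k_term(x, k):
        """x[k]: keep the k largest-magnitude coefficients of x, zero the rest."""
        xk = np.zeros_like(x)
        idx = np.argsort(np.abs(x))[-k:]
        xk[idx] = x[idx]
        return xk

    rng = np.random.default_rng(6)
    n, m, k = 300, 90, 10
    mags = np.arange(1, n + 1) ** -1.5                            # power-law decay by rank
    x = rng.permutation(mags) * np.sign(rng.standard_normal(n))   # compressible signal
    Phi = rng.standard_normal((m, n)) / np.sqrt(m)
    e = 0.01 * rng.standard_normal(m)                             # sampling noise

    xk = best_k_term(x, k)
    e_prime = Phi @ (x - xk) + e                      # tail of the signal folded into the noise
    assert np.allclose(Phi @ x + e, Phi @ xk + e_prime)
    print(np.linalg.norm(x - xk) / np.linalg.norm(x))             # small when x is compressible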

II. THE PROPOSED PROJECTION MATRIX

Consider a matrix Φ = G(m, n, d, q, D), where the function G is defined in Algorithm 1, m, n, d, q are natural numbers with the condition that m is divisible by q, and D is a probability distribution. In words, Algorithm 1 generates the matrix Φ by the following operations: (i) for each column i ∈ [n], the algorithm selects d indices uniformly at random among [m/q] = {1, 2, …, m/q}; let us denote those indices by h. (ii) Then, for each of the d entries in h, for instance the j-th entry h_j, the algorithm generates another set of q indices {(h_j − 1)q + 1, (h_j − 1)q + 2, …, h_j q}. The collection of these dq indices forms the row indices where the i-th column is non-zero (i.e. Ω_i). (iii) Finally, each non-zero entry of the i-th column of Φ is a random variable with distribution D. The output of Algorithm 1 can also be viewed as a matrix (tensor) in which each column is non-zero in exactly d random block indices and each non-zero entry is a random vector (v_1, v_2, …, v_q) ∈ ℝ^q with v_l ~ D for l ∈ [q]. Note that for all rows j and j′ with ⌈j/q⌉ = ⌈j′/q⌉ we have N_j = N_{j′}, where, as explained above, N_j is the set of column indices where the j-th row of Φ is non-zero. It is straightforward to verify that, for sufficiently large m, if (with high probability) the output of G(m/q, n, d, 1, D) corresponds to a high-quality expander graph, then G(m, n, d, q, D) also corresponds (with high probability) to a high-quality expander. It should also be clear that permuting columns and rows of the generated projection matrix has no effect on our arguments nor on the quality of the matrix. A wide range of popular projection matrices (in the CS context) can be generated by Algorithm 1. For instance, if m = O(k log(n/k)) and D = N(0, 1/m), then G(m, n, 1, m, D) outputs a random dense Gaussian projection matrix, which has RIP with high probability [12][13]. On the other hand, for d = Θ(log n), q = 1 and a distribution D with a Dirac delta at one and zero otherwise (i.e. every non-zero entry is the constant one), the output of Algorithm 1 will (with high probability [7][16]) correspond to a bipartite expander graph and hence can be utilized in a combinatorial solver for CS.

Algorithm 1: The function G(m, n, d, q, D)
Inputs: m, n, d, q and D
Output: Φ ∈ ℝ^{m×n}
For i = 1 to n do
  Φ_i = 0_m
  Let h be a random subset of [m/q] of size d (|h| = d)
  Ω_i = ⋃_{j=1}^{d} {(h_j − 1)q + 1, (h_j − 1)q + 2, …, h_j q}
  ∀ j ∈ Ω_i : Φ_{j,i} ~ D
End
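A direct transcription of Algorithm 1 reads as follows; this is a sketch under the notation used in this text (the helper name generate_projection, the 0-based indexing, and the specific m = O(k log(n/k)) choice in the demo are ours), with the distribution D passed in as a sampling callable:

    import numpy as np

    def generate_projection(m, n, d, q, dist, rng=np.random.default_rng(7)):
        """Algorithm 1: each column is non-zero on d random blocks of q consecutive rows."""
        assert m % q == 0
        Phi = np.zeros((m, n))
        for i in range(n):
            h = rng.choice(m // q, size=d, replace=False)         # d block indices in [m/q]
            rows = (h[:, None] * q + np.arange(q)).ravel()        # 0-based version of {(h_j-1)q+1, ..., h_j q}
            Phi[rows, i] = dist(rng, rows.size)                   # draw the d*q non-zero entries
        return Phi

    # SERP-style instance: q = O(1), d = Theta(log n), Gaussian entries with variance 1/(dq)
    n, k, q = 1024, 20, 2
    d = int(np.ceil(np.log2(n)))
    m = q * int(np.ceil(4 * k * np.log2(n / k) / q))              # m = O(k log(n/k)), divisible by q
    serp = generate_projection(m, n, d, q,
                               lambda rng, s: rng.normal(0.0, 1.0 / np.sqrt(d * q), s))
    print(serp.shape, np.count_nonzero(serp[:, 0]))               # each column has exactly d*q non-zeros

Passing a callable for D makes the same routine cover the dense Gaussian and binary expander special cases discussed above.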

In this paper, we are interested in the output of Algorithm 1 when q = O(1) is a small natural number, d = Θ(log n), m = O(k log(n/k)) and D = N(0, 1/(dq)). We call an output instance of G(m, n, d, q, D) with those values a "Sparse Expander-like Real-valued Projection" or SERP. With some realistic approximations, we show that this class of matrices adheres to a property which we refer to as w-RIP (to be defined below) for an arbitrary signal x with a fixed support set supp{x}. Then, we show that w-RIP is in fact sufficient to guarantee the recovery of that fixed signal under convex relaxation and under Cosamp (an optimal greedy CS decoder), and there is no need for the stronger RIP condition. Also, using the same property, we prove that modifying only one line of the SSMP decoder [16] leads to a combinatorial algorithm with an ℓ₂ < ℓ₁ guarantee of recovery for that particular signal under SERP sensing matrices. To the best of our knowledge, this is the first class of matrices which can provide ℓ₂ < ℓ₁ recovery under all three approaches to the problem of compressed sensing, at least in the "any arbitrary but fixed signal" setting.

Throughout this paper, we assume that the parameters d, q and ε are chosen properly such that, if the non-zero entries of Φ = G(m, n, d, q, D) are replaced by ones, the resulting binary matrix corresponds to a ((c + 1)k, dq, ε) expander for small constants ε and c ≥ 1.

Lemma 2) Let x ∈ ℝⁿ be an arbitrary signal with the fixed support set supp{x} of cardinality k, and assume that the matrix Φ ∈ ℝ^{m×n} has w-RIP of order (4k, δ) with constant δ ≤ 0.1. If Δ(Φ, y) is the Cosamp algorithm, then Δ provides an ℓ₂ < ℓ₁ guarantee of recovery for x from y = Φx with at least the probability with which the w-RIP condition holds.

Proof of Lemma 2) For a k-sparse signal x, Cosamp requires at most 6(k + 1) ≤ 7k iterations (for k ≥ 6). Assume that at the start of the t-th iteration the algorithm's estimate is x̂{t}, denote the discrepancy of the estimation by d{t} = x − x̂{t}, and let r{t} = y − Φx̂{t} be the residue of the estimation in that iteration. Finally, let Ω{t} be the indices corresponding to the 2k largest values of |Φᵀr{t}|. Then, on the t-th iteration of Cosamp, one needs RIP inequalities only on the column indices in Ω{t} ∪ supp{x̂{t}} ∪ supp{x}; note that for all t, |Ω{t} ∪ supp{x̂{t}} ∪ supp{x}| ≤ 4k. Based on the proof of [5], Cosamp recovers a fixed sparse signal when the restricted isometry constants on the sets encountered over these iterations are all smaller than 0.1. Therefore, if one wants to recover any arbitrary but fixed signal with a target probability, the matrix Φ has to have a w-RIP of order (4k, δ) with the constant δ ≤ 0.1. ■
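The quantity Lemma 2 relies on can be estimated numerically: the restricted isometry constant of Φ on a fixed set T of at most 4k columns follows from the singular values of the submatrix Φ_T, and the probability of it staying below 0.1 can be estimated over matrix draws. A sketch of that estimate (a Gaussian submatrix is used as a stand-in for Φ_T, and the 0.1 threshold is the one quoted from the CoSaMP analysis [5]); it only illustrates the per-set probability, not the union over the sets produced across iterations:

    import numpy as np

    def isometry_constant(A):
        """Restricted isometry constant of a column submatrix A: max |sigma^2 - 1|."""
        s = np.linalg.svd(A, compute_uv=False)
        return max(abs(s.max() ** 2 - 1.0), abs(1.0 - s.min() ** 2))

    rng = np.random.default_rng(8)
    k, m, trials = 2, 4000, 100
    # Gaussian stand-in for the 4k columns of Phi indexed by one fixed set T;
    # the SERP columns generated in the earlier sketch could be substituted here.
    draw_cols = lambda: rng.standard_normal((m, 4 * k)) / np.sqrt(m)
    consts = np.array([isometry_constant(draw_cols()) for _ in range(trials)])
    print("mean constant:", consts.mean(), " Pr(constant <= 0.1):", (consts <= 0.1).mean())

Keeping the constant below 0.1 requires m to be considerably larger than 4k, which is consistent with the sample requirements discussed earlier.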

C. w-RIP for SERP matrices

The following two lemmas prove that SERP matrices have w-RIP and hence can be utilized in Cosamp and in convex relaxation methods in the "any arbitrary but fixed signal" scenario.

Lemma 3) Let Φ ∈ ℝ^{m×n} be an output instance of G(m, n, d, q, D) with d = Θ(log n), m = O(k log(n/k)), D = N(0, 1/(dq)) and q = O(1) a constant. Then for any arbitrary vector x ∈ ℝⁿ with a fixed support, there exist constants δ and c (with c a function of δ and q) such that

Pr(‖Φx‖₂² ≥ (1 + δ)‖x‖₂²) ≤ 1/n^c    (4)

Proof of Lemma 3) Without loss of generality, assume that x has unit norm. To bound ‖Φx‖₂² = ‖y‖₂², one can (similarly to [12]) resort to the moment generating function of the random variable ‖y‖₂²:

Pr(‖Φx‖₂² ≥ (1 + δ)‖x‖₂²) = Pr(Exp(s‖Φx‖₂²) ≥ Exp(s(1 + δ)‖x‖₂²)) ≤ E[Exp(s‖Φx‖₂²)] / Exp(s(1 + δ)‖x‖₂²)    (5)

for a positive value of s, where the last inequality is due to Markov's inequality. Clearly, E[Exp(s‖Φx‖₂²)] is the moment generating function of the random variable ‖Φx‖₂². Note that y_j ~ N(0,1)·‖x_{N_j}‖₂/√(dq) (see [13]), where N(0,1) is the standard normal distribution; consequently y_j² ~ χ²(1)·‖x_{N_j}‖₂²/(dq), where χ²(1) is a chi-squared distribution with one degree of freedom. Define w_j = ‖x_{N_j}‖₂²/(dq) for j ∈ [m]; then the random variable ‖y‖₂² ~ Σ_j w_j χ²(1) is a weighted sum of chi-squares. It is known, and easy to verify, that manipulating the moment generating function of a random variable distributed as a weighted sum of chi-squares does not lead to neat or usable formulas. Hence, a number of estimations have been suggested to approximate that random variable with another one which can be manipulated more easily (e.g. [23][24]). Here, we use the approximation in [23]. A commonly practiced approximation of a weighted chi-squared random variable is a Gamma distribution with the same mean and variance as the original random variable. In [23], the authors proved that this approximation leads to acceptable results and, more importantly, that the Gamma approximation tends to be an over-estimate of the true distribution of ‖Φx‖₂² when the variance of {w_j} is not small¹. Since the values of {w_j} in SERP are linearly proportional to the norms of random subsets of the signal, and the magnitudes of many signals of interest (e.g. DCT coefficients of images) almost follow a power law, at least for these classes of signals it is very unlikely for {w_j} to have small variance. Consequently, this approximation seems reasonable for those practical signals of interest. Define

a = (Σ_j w_j) / (2 Σ_j w_j²),    λ = (Σ_j w_j)² / (2 Σ_j w_j²)
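The moment-matched Gamma approximation can be checked by simulation: draw Σ_j w_j χ²(1) many times and compare its upper tail with Gamma(shape λ, scale 1/a). A sketch under the normalization Σ_j w_j = 1 used below (the weight profile is an arbitrary choice for illustration):

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(9)
    w = rng.uniform(size=60) ** 2
    w /= w.sum()                                   # normalize so that sum_j w_j = 1 (unit-norm x)

    a = w.sum() / (2 * (w ** 2).sum())             # a = lambda when sum_j w_j = 1
    lam = w.sum() ** 2 / (2 * (w ** 2).sum())

    samples = (w * rng.standard_normal((100000, w.size)) ** 2).sum(axis=1)   # sum_j w_j chi^2(1)
    for t in (1.2, 1.5, 2.0):                      # compare upper tails P(X >= t)
        print(t, (samples >= t).mean(), stats.gamma.sf(t, lam, scale=1.0 / a))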

Then, based on the approximation of [23], ‖Φx‖₂² can be approximated by a Gamma distribution with shape parameter λ and scale parameter 1/a. Note that Σ_j w_j = Σ_j ‖x_{N_j}‖₂²/(dq) = Σ_i dq·x_i²/(dq) = ‖x‖₂², due to the fact that all columns of Φ are non-zero in exactly dq row indices. Since x is unit norm, we get Σ_j w_j = 1 and thus a = λ. Now let us find upper and lower bounds for Σ_j w_j².

It can be proved by contradiction that, given the properties of {w_j}, the vector w = [w_1, …, w_m] has to be maximally sparse when the objective function Σ_j w_j² attains its maximum. More specifically, assume that a given sequence {w_j} leads to the maximum value of Σ_j w_j² but the vector w = [w_1, …, w_m] is not maximally sparse. Then one can find two indices j, j′ ∈ supp{w} such that w_j ≠ 0 and w_{j′} ≠ 0, and form a new sequence w′ = [w′_1, …, w′_m] where w′_l = w_l for l ≠ j, j′, w′_j = w_j + w_{j′} and w′_{j′} = 0. It is easy to verify that Σ_l w′_l = Σ_l w_l = 1 while Σ_l (w′_l)² > Σ_l w_l², contradicting the assumption that {w_j} attains the upper bound. However, the sparsity of the vector w is limited, due to two facts: (a) the matrix Φ corresponds to an expander and (b) each column of Φ is non-zero in dq indices. The maximally sparse case happens when the signal is non-zero in exactly one location. Since we have assumed that the signal is unit norm, this implies that the signal is one at one index and zero in the rest of the indices. It is straightforward to verify that in this case Σ_j w_j² equals 1/(dq). This concludes the upper bound for Σ_j w_j².

It is known, and easy to verify, that when the constraint Σ_j w_j = 1 is active, the minimum value of Σ_j w_j² occurs when all w_j are equal to 1/m. In that case, Σ_j w_j² equals 1/m as well.

Putting together the upper and lower bounds, we have dq/2 ≤ a = λ ≤ m/2.

¹ Here we are deriving the upper bound for equation (4). Hence, the probability derived from this approximation is higher than the actual one.

Based on the above arguments, the following holds:

Pr(‖Φx‖₂² ≥ (1 + δ)‖x‖₂²) ≤ E[Exp(s‖Φx‖₂²)] / Exp(s(1 + δ)‖x‖₂²) = (1 − s/a)^(−λ) Exp(−s(1 + δ))

for 0 < s < a. By setting the derivative of the right-hand side with respect to s to zero, we get s = a − λ/(1 + δ), i.e. s = aδ/(1 + δ) since a = λ.

‖Δ‖₀ ≤ 2k. Assume that, after re-ordering the entries of Δ, we have |Δ_i| ≥ |Δ_{i+1}| for all i ∈ [|Δ| − 1]. Consider G, the bipartite graph corresponding to the projection matrix Φ, and assume one enumerates the edges of the graph G in lexicographic order and forms the sets Ω′_i and Ω″_i by the following rule: if (i, j) is the first edge going to the vertex j, then j is included in Ω′_i, otherwise it goes to the set Ω″_i [16]. More formally, Ω′_i = {j : j ∈ Ω_i s.t. ∄ i′ < i, Φ_{j,i′} ≠ 0} and Ω″_i = Ω_i \ Ω′_i. Note that the sets Ω′_i are pair-wise disjoint. As before, for a ((c + 1)k, dq, ε) expander graph (where c ≥ 1), one has Σ_i |Ω″_i| ≤ ε dq |supp{Δ}|. Consequently, the resulting quantity is positive provided that δ (the w-RIP constant) is sufficiently small. For instance, with the two constants in the argument set to 2 and 4, it would be positive if δ ≤ 0.058. ■
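The edge enumeration used in this argument (borrowed from [16]) is mechanical to implement: walking the edges of G in lexicographic order, a right node is charged to Ω′_i for the first left node i that reaches it and to Ω″_i afterwards. A sketch of that bookkeeping (function and set names are ours) for a small random sparse matrix:

    import numpy as np

    def split_neighborhoods(Phi):
        """For each column i, split its row support Omega_i into first-touch and repeat row indices."""
        m, n = Phi.shape
        seen = set()                      # right nodes already claimed by an earlier column
        omega_prime, omega_dprime = [], []
        for i in range(n):                # lexicographic order: columns i, then rows j
            rows = np.nonzero(Phi[:, i])[0]
            first = [j for j in rows if j not in seen]
            omega_prime.append(set(first))
            omega_dprime.append(set(rows) - set(first))
            seen.update(first)
        return omega_prime, omega_dprime

    rng = np.random.default_rng(10)
    Phi = (rng.uniform(size=(12, 8)) < 0.25).astype(float)
    p, dp = split_neighborhoods(Phi)
    assert all(len(p[i] & p[j]) == 0 for i in range(8) for j in range(i))   # the first-touch sets are pairwise disjoint
    print([len(s) for s in p], [len(s) for s in dp])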

REFERENCES
[1] D. Donoho, "Compressed sensing", IEEE Trans. on Information Theory, 52(4), pp. 1289-1306, April 2006.
[2] E. Candès, J. Romberg, and T. Tao, "Robust uncertainty principles: Exact signal reconstruction from highly incomplete frequency information", IEEE Trans. on Information Theory, 52(2), pp. 489-509, February 2006.
[3] S. S. Chen, D. L. Donoho, and M. A. Saunders, "Atomic decomposition by Basis Pursuit", SIAM J. of Scientific Computing, pp. 33-61, 1998.
[4] M. A. T. Figueiredo, R. D. Nowak, and S. J. Wright, "Gradient projection for sparse reconstruction: Application to compressed sensing and other inverse problems", IEEE J. of Selected Topics in Signal Processing: Special Issue on Convex Optimization Methods for Signal Processing, 1(4), pp. 586-598, 2007.
[5] D. Needell and J. A. Tropp, "CoSaMP: Iterative signal recovery from incomplete and inaccurate samples", Applied and Computational Harmonic Analysis, vol. 26, no. 3, pp. 301-321, 2009.
[6] R. Berinde and P. Indyk, "Sparse recovery using sparse random matrices", preprint, 2008.
[7] S. Jafarpour, W. Xu, B. Hassibi, and R. Calderbank, "Efficient compressed sensing using high-quality expander graphs", IEEE Trans. on Information Theory, 2009.
[8] R. Berinde, A. C. Gilbert, P. Indyk, H. Karloff, and M. J. Strauss, "Combining geometry and combinatorics: A unified approach to sparse signal recovery", preprint, 2008.
[9] D. L. Donoho, A. Maleki, and A. Montanari, "Message passing algorithms for compressed sensing", Proc. Natl. Acad. Sci., 2009.
[10] V. Chandar, "A negative result concerning explicit matrices with the restricted isometry property", preprint, 2008.
[11] S. Sarvotham, D. Baron, and R. Baraniuk, "Sudocodes - Fast measurement and reconstruction of sparse signals", IEEE ISIT, 2006.
[12] R. Baraniuk, M. Davenport, R. DeVore, and M. Wakin, "A simple proof of the restricted isometry property for random matrices", Constructive Approximation, 28(3), pp. 253-263, December 2008.
[13] D. Achlioptas, "Database-friendly random projections", 20th Annual Symposium on Principles of Database Systems, pp. 274-281, 2001.
[14] M. Elad, "Optimized projections for compressed sensing", IEEE Trans. on Signal Processing, 55(12), pp. 5695-5702, December 2007.
[15] R. Berinde, P. Indyk, and M. Ruzic, "Practical near-optimal sparse recovery in the l1 norm", Allerton, 2008.
[16] R. Berinde and P. Indyk, "Sequential Sparse Matching Pursuit", Allerton, pp. 36-43, 2009.
[17] L. Gan, C. Ling, T. T. Do, and T. D. Tran, "Analysis of the statistical restricted isometry property for deterministic sensing matrices using Stein's method", preprint, 2009.
[18] R. Calderbank, S. Howard, and S. Jafarpour, "Construction of a large class of deterministic sensing matrices that satisfy a statistical isometry property", 2009. [Online]. Available: http://www.dsp.ece.rice.edu/files/cs/strip-final.pdf
[19] D. L. Donoho, A. Maleki, and A. Montanari, "The noise-sensitivity phase transition in compressed sensing", IEEE Trans. on Information Theory, 57(10), pp. 6920-6941, October 2011.
[20] E. Candès, "A probabilistic and RIPless theory of compressed sensing", IEEE Trans. on Information Theory, 57(11), pp. 7235-7254, November 2011.
[21] J. Haupt and R. Nowak, "A generalized restricted isometry property", preprint, 2007.
[22] E. Candès, "The restricted isometry property and its implications for compressed sensing", Comptes Rendus de l'Académie des Sciences, Paris, Série I, 346, pp. 589-592, 2008.
[23] H. Feiveson and F. C. Delaney, "The distribution and properties of a weighted sum of chi squares", NASA Technical Note TN D-4575, May 1968.
[24] S. Gabler and C. Wolff, "A quick and easy approximation to the distribution of a sum of weighted chi-square variables", Statistical Papers, vol. 28, pp. 317-325, 1987.