IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. 49, NO. 2, FEBRUARY 2003


Sequential Detection of Targets in Multichannel Systems Alexander G. Tartakovsky, Senior Member, IEEE, X. Rong Li, Senior Member, IEEE, and George Yaralov

Abstract—We consider a multichannel sensor system that performs sequential detection of a target. Sequential detection is done by implementing a generalized Wald sequential probability ratio test, which is based on the maximum-likelihood ratio statistic and allows one to fix the false-alarm rate and the rate of missed detections at specified levels. We present the asymptotic performance of this sequential detection procedure and show that it is asymptotically optimal in the sense of minimizing the expected sample size when the probabilities of erroneous decisions are small. We do not assume that the observations are independent and identically distributed (i.i.d.): the first-order asymptotic optimality result holds for general statistical models that are not confined to the restrictive i.i.d. assumption. However, for i.i.d. and quasi-i.i.d. cases, where the log-likelihood ratios can be represented as sums of random walks and slowly changing sequences, we obtain much stronger results. Specifically, using nonlinear renewal theory, we obtain both tight expressions for the error probabilities and higher order approximations for the average sample size up to a vanishing term. The performance of the multichannel sequential detection algorithm is illustrated by an example of detection of a deterministic signal in correlated (colored) Gaussian noise. In this example, we provide both the results of a theoretical analysis and the results of a Monte Carlo experiment. These results allow us to conclude that the use of the sequential detection algorithm substantially reduces the required resources of the system compared to the best nonsequential algorithm.

Index Terms—Asymptotically optimal procedures, generalized sequential probability ratio test, multichannel systems, nonlinear renewal theory, sequential detection.

I. INTRODUCTION

In most practical systems, sensor decisions are made in a sequential manner at random times, depending on the data that are received sequentially by the sensors. The problem of detecting a target in multichannel systems is a good example. It is therefore important to consider sensor decision rules that are sequential in nature.

Manuscript received July 19, 2001; revised April 18, 2002. The work of A. G. Tartakovsky and G. Yaralov was supported in part by the U.S. ONR under Grants N00014-99-1-0068 and N00014-95-1-0229 and by the U.S. ARO under Grant DAAG55-98-1-0418. The work of X. R. Li was supported in part by ONR under Grant N00014-00-1-0677, NSF under Grant ECS-9734285, and NASA/LEQSF under Grant (2001-4)-01. The material in this paper was presented in part at SPIE's AEROSENSE 2001 [25]. A. G. Tartakovsky and G. Yaralov are with the Center for Applied Mathematical Sciences, University of Southern California, DRB-155, Los Angeles, CA 90089-1113 USA (e-mail: [email protected]; [email protected]). X. R. Li is with the Department of Electrical Engineering, University of New Orleans, New Orleans, LA 70148 USA (e-mail: [email protected]). Communicated by A. Kavčić, Associate Editor for Detection and Estimation. Digital Object Identifier 10.1109/TIT.2002.807288

A sequential detection procedure includes a stopping time and a terminal decision to achieve a tradeoff between the observation time and the decision quality. Problems of sequential testing of two or more hypotheses under different conditions have been studied for decades (see, e.g., Armitage [1], Chernoff [3], Dragalin [5], Dragalin, Tartakovsky, and Veeravalli [6], [7], Lai [11], Lorden [12]–[14], Pavlov [16], Sosulin and Fishman [18], Tartakovsky [19]–[22]). Most of the results have been obtained for independent and identically distributed (i.i.d.) models for observations, when log-likelihood ratio processes are random walks. Generalizations for non-i.i.d. models can be found in [6], [11], [19]–[22], [24].

In the present paper, we consider the problem of detecting a target in multichannel sensor systems. A target may appear in one of the channels and should be detected as soon as possible under constraints on the rates of false alarms and missed detections. More precisely, we study the behavior of the sequential detection algorithm which is based on maximum-likelihood arguments. Decisions on target presence or absence are made sequentially by using a generalized Wald sequential probability ratio test (SPRT), which is based on the comparison of the maximum-likelihood ratio statistic (over all channels) with two thresholds. The thresholds are chosen in a way to guarantee specified rates of false alarms and missed detections. We do not assume that observations are i.i.d. In contrast, it is assumed that the observations can be correlated and nonstationary. The proposed sequential test turns out to be asymptotically optimal for very general statistical models when the false alarm rate and the rate of missed detections are low. This study is important in a variety of detection problems where observations are correlated and/or nonstationary.

The general results are illustrated by a particular example of detecting a deterministic signal in colored Gaussian noise. In this example, we not only confirm that asymptotic theory is applicable to engineering practice, but we also design the thresholds that guarantee with high accuracy the given levels of false detections and missed detections. This is accomplished by estimating overshoots of the thresholds by decision statistics using the results of the nonlinear renewal theory [17], [26], [27].

The paper is organized as follows. In Section II, we formulate the problem and provide basic definitions and notations. In Section III, we describe a generalized SPRT, present its asymptotic performance, and show that it is optimal in an asymptotic setting when the rates of false alarms and missed detections are low. In Section IV, we briefly outline another attractive multichannel detection algorithm which is based on the parallel implementation of the Wald SPRTs in multiple channels (an SPRT bank). In Section V, we present the results of an exhaustive analysis



(theoretical and Monte Carlo (MC)) of the generalized SPRT for a problem of detecting a deterministic signal in correlated Gaussian noise and show that sequential detection allows us to substantially reduce the overall time needed to reach the final decision compared to the case where the decisions are made nonsequentially. We conclude the paper in Section VI. Most mathematical details and proofs are given in the Appendix.

II. PROBLEM FORMULATION

We are interested in a binary decision problem of testing two hypotheses, related to the absence or presence of a target, on the basis of multidimensional data X_1, X_2, … observed sequentially in discrete time by a sensor, where X_n is the information available to the sensor at time moment n. In what follows, we suppose that the sensor represents a multichannel system with N channels (e.g., Doppler, angle, and range channels in radars or assumed velocity channels in infrared/optical (IR/EO) systems). In this case, X_n = (X_n(1), …, X_n(N)) is a vector of dimensionality N, where the ith component X_n(i) is the observation available in the ith channel at time n. Write X^n = (X_1, …, X_n) for the concatenation of observations up to time n.

It is assumed that a target either is present in one of the channels or is absent in all channels. The decision on target absence or presence must be made as soon as possible, while controlling the rates of false alarms and missed detections. Let p_0(X^n) be the probability density of X^n when the target is absent and p_i(X^n) the probability density when it is located in the ith channel. The problem of detecting the target can be formulated as the problem of testing two hypotheses, "H_0"—the target is absent—and "H_1"—the target is present in one of the channels (it does not matter in which one). Note that the hypothesis H_1 is composite even when the densities p_i are completely specified.

A sequential detection procedure (or, more generally, a sequential test of two hypotheses) is a pair δ = (T, d), where d is a terminal decision function taking the two values 0 and 1, and T is a Markov stopping time with respect to the sequence of observations. In other words, d = d(X^T) ∈ {0, 1}, and the event {T = n} depends only on X^n but not on X_{n+1}, X_{n+2}, …, for all n ≥ 1. Therefore, "d = 1" (accept H_1) and "d = 0" (accept H_0) are the decisions in favor of the hypotheses H_1 and H_0, respectively. They are made at the stopping time T, which is an extended random variable that depends on the observations.

In what follows, we use P_i, i = 0, 1, …, N, to denote the probability measures that correspond to the probability densities introduced above, and E_i will denote the operator of expectation with respect to the measure P_i. In other words, the measure P_0 corresponds to the distribution of observations when there is no target and P_i to the distribution of observations when the target is located in the ith channel. It is convenient to introduce a fictitious parameter θ that takes values in the set {0, 1, …, N} and to parameterize the probability density function as p_θ(X^n), θ = 0, 1, …, N. If θ = i, i = 1, …, N, then the target is located in the ith channel, while if θ = 0, then there is no target at all. Obviously, in terms of the parameter θ, the hypotheses to be tested are reformulated as "H_0: θ = 0" and "H_1: θ ≠ 0." The latter hypothesis can also be regarded as a union of the simple hypotheses H_i: θ = i, i = 1, …, N, where "H_i" is the hypothesis that the target is located in the ith channel.

III. A GENERALIZED SPRT AND ITS OPERATING CHARACTERISTICS

We begin by presenting a construction of the multichannel sequential detection procedure that will be studied in this paper.

A. Generalized Sequential Probability Ratio Test

Let

  λ_n^i = log [ p_i(X^n) / p_0(X^n) ],   i = 1, …, N,

denote the log-likelihood ratio between the hypotheses H_i and H_0 based on the data X^n observed up to the time moment n. Further, let a_0 and a_1 be two positive numbers (thresholds). The proposed sequential detection procedure δ* = (T*, d*) is defined as

  T* = min{ n ≥ 1 : max_{1≤i≤N} λ_n^i ∉ (−a_0, a_1) },
  d* = 1  if  max_{1≤i≤N} λ_{T*}^i ≥ a_1,
  d* = 0  if  max_{1≤i≤N} λ_{T*}^i ≤ −a_0.          (1)

Thus, the observation process is continued as long as the maximum log-likelihood ratio statistic max_i λ_n^i is between the thresholds −a_0 and a_1. We stop and make a final decision at the time T*, which is the first time the statistic max_i λ_n^i leaves the region (−a_0, a_1). By default, T* = ∞ if there is no such n. The decision d* = 1, indicating target presence, is made as soon as max_i λ_n^i exceeds the upper threshold a_1. The decision d* = 0, indicating target absence, is made as soon as it falls below the lower threshold −a_0. Note that in statistics the maximum log-likelihood ratio is usually called the generalized log-likelihood ratio. Correspondingly, the detection procedure of (1) will be called the generalized sequential probability ratio test (GSPRT).

B. Upper Bounds for the Probabilities of Errors and Choice of Thresholds

We are interested in detection procedures that confine the probability of a false alarm (FA probability) and the probabilities of missed detection (MS probabilities) to given levels α and β = (β_1, …, β_N), respectively. The class of such detection algorithms will be denoted by C(α, β). To be more specific,

  C(α, β) = { δ = (T, d) : α(δ) ≤ α and β_i(δ) ≤ β_i, i = 1, …, N },

where we use the notation

  α(δ) = P_0(d = 1)          (2)

for the FA-probability of the procedure δ = (T, d), and β_i(δ) = P_i(d = 0)


for the MS-probability of the procedure when a target is located in the ith channel. It is shown in Appendix I-A (see Lemma 1) that, regardless of the structure of the observed process, the following upper bounds for the probabilities of errors hold:

  α(δ*) ≤ N e^{−a_1}   and   β_i(δ*) ≤ e^{−a_0},   i = 1, …, N.          (3)

From (3), we immediately obtain that the threshold choices

  a_1 = log(N/α)   and   a_0 = log(1/β_min),   β_min = min_i β_i,          (4)

imply α(δ*) ≤ α and β_i(δ*) ≤ β_i for every i, i.e., δ* ∈ C(α, β).
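To make the definitions above concrete, here is a minimal sketch of the GSPRT of (1) together with the threshold choice (4). The function and variable names are ours, and the per-sample log-likelihood-ratio increments are assumed to be supplied by the application (computed from the channel densities):

```python
import numpy as np

def gsprt_thresholds(alpha, beta, num_channels):
    """Threshold choice suggested by the bounds (3)-(4):
    a1 = log(N/alpha) controls the false-alarm probability,
    a0 = log(1/beta) controls the missed-detection probability."""
    return np.log(1.0 / beta), np.log(num_channels / alpha)

def gsprt(llr_increments, a0, a1):
    """GSPRT of (1): llr_increments has shape (n_max, N); column i holds the
    per-sample increments of the log-likelihood ratio of channel i.
    Returns (stopping time, decision), decision 1 = 'target present'."""
    cum = np.zeros(llr_increments.shape[1])
    for n, inc in enumerate(llr_increments, start=1):
        cum += inc                      # lambda_n^i for each channel i
        stat = cum.max()                # generalized (max) LLR statistic
        if stat >= a1:
            return n, 1
        if stat <= -a0:
            return n, 0
    return llr_increments.shape[0], int(cum.max() >= a1)   # truncated run
```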

Formulas (3) and (4) will be refined in Sections III-D and III-E for quasi-i.i.d. and i.i.d. models.

C. Asymptotic Performance and Optimality of the GSPRT: General, Non-i.i.d. Case

In this subsection, we study the asymptotic properties of the sequential detection algorithm of (1) and (4) when the probabilities of errors α and β_i are small. In particular, we show that this simple detection algorithm asymptotically minimizes the expected sample sizes E_θ T for all θ = 0, 1, …, N among all detection procedures in the class C(α, β), under mild conditions that do not confine one to the i.i.d. assumption.

Note that so far we did not impose any constraints on the observation processes. In fact, the upper bounds (3) for the probabilities of errors hold true whenever the probability measures are mutually locally absolutely continuous. However, if we want to study the behavior of the expected sample size (or, more generally, positive moments of the stopping time), some conditions should be imposed.

Let X^n(i) denote the vector of the first n observations from the ith channel. For simplicity of presentation, in the rest of the paper, we assume that the vectors X^n(i) and X^n(j), i ≠ j, are statistically independent (the channels are mutually independent) conditioned on the hypotheses,¹ i.e., the joint densities factorize across the channels as in (5), in which case the log-likelihood ratio takes the form (6). Therefore, under this assumption, which holds in many applications, the log-likelihood ratio λ_n^i depends on the observation process only through the component X^n(i) and does not depend on the rest of the components. We emphasize that we do not assume independence (or i.i.d.) of the observations within the channels.

Further, assume that the log-likelihood ratio processes obey the Strong Law of Large Numbers, i.e., for i = 1, …, N,

  n^{-1} λ_n^i → I_i  (P_i-a.s.)   and   n^{-1} λ_n^i → −I_{0i}  (P_0-a.s.)   as n → ∞,          (7)

where, as usual, "P-a.s." stands for almost-sure convergence under the measure P, and where I_i and I_{0i} are positive finite numbers. Note right away that, in the i.i.d. case, the numbers I_i and I_{0i} are nothing but the Kullback–Leibler information distances between the corresponding densities.

A standard approach to obtaining asymptotics for the average sample number (ASN) is to first derive asymptotic lower bounds and then to show that these lower bounds are also upper bounds for the test considered. (See, e.g., [5]–[7], [11], [16], [18], [19]–[22], where this approach has been successfully applied.) In what follows, when deriving asymptotics, we will always assume without special emphasis that a_0 and a_1 go to infinity in such a way that neither of the threshold values goes to infinity exponentially faster than the other, i.e., the ratio a_1/a_0 is bounded away from zero and infinity. The same applies to the probabilities of errors: specifically, we assume that, as α and the β_i go to zero, the ratio log α / log β_i is bounded away from zero and infinity.

The following theorem establishes the asymptotic lower bounds for moments of the stopping time distribution. It is proved in Appendix I-B.

Theorem 1: Let r be a positive, not necessarily integer, number. Suppose that the almost-sure convergence conditions (7) hold. Then

i) as a_0, a_1 → ∞,

(8)

ii) as α, β_i → 0,

(9)

¹This assumption can be removed, and the results of the present section can be readily generalized to the case where there is cross correlation between the data in the channels.

where

.

, Theorem 1 gives the asymptotic lower Therefore, for bounds for the ASN of the GSPRT and for the ASN of an op. On the other hand, timal procedure in the class we show that, under certain conditions, the bounds (8) and (9) are also asymptotic upper bounds for the ASN of with procedure (1). where Observe that

(10) be the time at which Also, let first reaches the threshold or if there the statistic for any . is no such time. Obviously, and can be used to obtain Therefore, the Markov times


upper bounds for . To this end, however, we first have to strengthen the almost-sure convergence condition (7), since it . (See Targenerally does not even guarantee finiteness of takovsky [22] for the corresponding discussion and an example is infinite.) where the strong law (7) is valid but , define the random variables For

Theorem 2: Let the complete convergence conditions (12) be satisfied. i) Then, as for for (14) ii) Let the thresholds and be selected so that

which are the last times when is outside the region and is outside the region , respectively. In terms of the random variables and , the almost-sure convergence (7) can be and rewritten as follows: for all . Let us strengthen this condition by assuming that and

and as

Then, as

for all

(11) is said to converge completely to In this case, under and to under as (see [8]–[11], [22] for the definition of the complete convergence which is related to the -quick convergence). Therefore, the conventional strong law (7) is now strengthened into the complete version -

for (15) for Proof: Asymptotic equality (14) immediately follows ) and the upper bound from the lower bound (8) (with (13). To prove ii), we notice that by inequalities (3)

for -

(12)

and

Remark that conditions (11) are closely related to

(16) Since by conditions of the theorem

and and

for any . Therefore, these conditions are related to the convergence rate in the strong law of large numbers for (cf. [2], [4], [9]). Now, everything is prepared to derive the asymptotic performance of the detection procedure in question. Indeed, it is , the proved in Appendix I-C (see Lemma 3) that, as following upper bounds for the ASN of the GSPRT (1) hold: (13) if the condition (12) is satisfied. Combining Theorem 1 with the upper bounds (13), we arrive at the following theorem which is the main result of this submeans that section. Throughout the paper, the notation , i.e., , where . and In what follows, we also write for the FA-probability and the MS-probability of the detection procedure with the and . thresholds

as

, it follows from (13) and (16) that, as for for

Since by (9) the reverse inequalities hold for

asymptotic relations (15) follow in an obvious manner. Corollary 1: Let the thresholds be chosen as in (4), i.e., and . Then , and asymptotic relations (15) hold . for Relation (14) characterizes the asymptotic performance of the GSPRT for large values of thresholds regardless of error probability constraints, while relation (15) defines its performance . Moreover, it follows from Theorem in the class 2 ii) and Corollary 1 that, if the thresholds are chosen so that


and , in particular if they are given by (4), then the proposed GSPRT has the first-order (FO) asymptotic optimality property under very general conditions (11) in the sense that

We begin with the following two important assumptions on and : the processes is the log-likelihood ratio process for some i.i.d. (C1) multichannel problem with the probability density functions if and if . (C2)

(17) The complete convergence conditions (11) hold for general statistical models and do not require the i.i.d. assumption which is quite restrictive for a variety of applications. Remark 1: The assertions of Theorem 2 can be easily generalized to cover higher positive moments of the stopping time. However, this extension requires a strengthening of the complete convergence conditions (11). Specifically, suppose that and for every and some . Then, for every , as


is a slowly changing sequence such that almost surely (a.s.) under , .

Under conditions (C1) and (C2), the normalized log-likelihood ratio converges a.s. (under the target and no-target measures, respectively) to the famous Kullback–Leibler information numbers, which measure the "distances" between the densities:

and (20)
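As a quick numerical illustration of the Kullback–Leibler numbers in (20) and of the almost-sure convergence used throughout, consider the simplest i.i.d. Gaussian channel; all numerical values below are illustrative and are not taken from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)
mu, n = 1.0, 100_000

# Observations of one channel under "target present": N(mu, 1) versus N(0, 1).
x = rng.normal(mu, 1.0, size=n)

# Per-sample LLR increments log[f_1(x)/f_0(x)] for N(mu,1) against N(0,1).
inc = mu * x - 0.5 * mu**2
lam = np.cumsum(inc)

kl = 0.5 * mu**2                      # Kullback-Leibler number for this pair
print(lam[-1] / n, "~", kl)           # lambda_n / n converges a.s. to the K-L number
```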

for (18) for which is an extension of Theorem 2 ii) for higher moments. In other words, under the above conditions, the GSPRT minimizes all the moments of the stopping time up to the order . Asymptotic formulas (14) are generalized similarly. D. Asymptotic Performance of GSPRT in the Quasi-i.i.d. Case The formulas for the expected sample size and the probabilities of errors can be substantially improved when the observations are i.i.d., or more generally, when the log-likelihood ratio processes can be well approximated by random walks. To be can be repspecific, assume that the log-likelihood ratio resented as the sum of a random walk

The upper bounds for the probabilities of errors (3), and the FO asymptotic expansions for the ASN (14) and (15) obtained in the general case, can be refined in the quasi-i.i.d. case (19) by using nonlinear renewal theory [17], [26], [27]. The hypotheses and will be considered separately, since, in general, the and is different, especially in a symperformance for metric case (see later). In the rest of the paper, without special emphasis, we , will always assume that the log-likelihood ratios , are nonlattice (nonarithmetic).2 1) The Hypothesis —The Target is Present: In order to exploit relevant results from the nonlinear renewal theory, we defined in (10) in the form of a rewrite the Markov time random walk crossing a threshold plus a nonlinear term that is slowly changing in the sense defined above. Indeed, by adding and using representation (19), the Markov and subtracting can be written in the following form (for any time ): (21)

( , random sequence

are i.i.d.) and a slowly changing

where

(19) In other words, the log-likelihood ratio is split into two parts, and non-i.i.d. . An example where this reprei.i.d. sentation holds is considered in Section V. The case where the log-likelihood ratio processes can be represented in the form (19) will be referred to as the “quasi-i.i.d. case.” In what follows, we shall repeatedly use the notion of a slowly changing sequence (see Siegmund [17] and Woodroofe [26]). are said to be slowly The random variables , changing if the following two conditions hold: in probability as ; i) are uniformly continuous in probability, ii) , there exists some such that i.e., for every for all

We first show that provided that (C2) is satisfied. Write

is a slowly changing sequence are slowly changing and condition

and

Since

²A random variable X is called nonlattice (or nonarithmetic) if X/d is not integer valued with probability one for any nonzero constant d, i.e., for every d ≠ 0, P(X = dm, m = 0, ±1, ±2, …) < 1.


for all


and converges to

as

a.s. (see condition (C2)), the a.s. . Therefore, (22)

is slowly changing. Since the sum of which implies that slowly changing sequences is also a slowly changing sequence, is slowly changing under . , let For

Since , the upper bound in (23) gives a better apthan the upper bound in (3). Monte proximation to Carlo (MC) simulations show that (23) is considerably more accurate than (3). Also, although we do not have a rigorous proof in a general case, computations for several particular examples show that the upper bound in (23) is asymptotically accurate. We therefore conjecture that this bound is asymptotically accurate in the general case too, i.e., as (24)

be the time at which the random walk first reaches the if no such time exists. Also, let level or on denote the excess (“overshoot”) of at the stopping time , and let

over the boundary

be the limiting distribution of the overshoot. Similarly, define on by

for all However, in the example of Section V, where , simulations for moderate values of thresholds works show that the approximation than when at least for small . better when the Therefore, for moderate values of thresholds and small lower bound in (23) can be more accurate than the upper bound. Next, we improve the approximations to the expected sample . By (21) size on Taking expectations on both sides and using Wald’s identity , we obtain

Thus, is the excess of over the level at the time which is when it first crosses . linearly increases with , and Since are slowly changing, one can expect that the , when is large and the target is located behavior of in the th channel, will be approximately the same as the . Since and behavior of the one-sided stopping time are different only on the set , whose probability decreases exponentially fast, one can and are also close when . Thereexpect that , assuming fore, we expect that converges to a constant as . This is that elaborated in the following. First, we will use Woodroofe [26, Theorem 4.1] (see also [17, Theorem 9.12]) to refine the inequality (3) for the FA-probability by taking into account overshoots and using the fact that the stopping times , , and are close to each other when the target is present in the th channel. The key to this refinement is slowly changing under , is the fact that, since and have according to this theorem the overshoots the same limiting distributions

Define

as where as

. Now, since , we expect that

is the average limiting overshoot in the one-sided test Using (22) we may assume that

where is the mean of the limiting distribution of we expect that

.

. Thus,

as

(25)

, are Recall that, so far, we assumed only that with probability (see slowly changing, and , condition (C2)). To refine the asymptotics for the ASN we need some additional conditions. In the rest of this subsection, we assume that the following conditions hold for every : converges in

It is proved in Appendix I-D (see Lemma 4) that

in distribution , where

-distribution to a random variable with for some

(26) (27)

are uniformly integrable as

(23)

under

(28)


for some for

(29)


formulas become especially simple, since the values of ,

(30)

Approximation (25) follows from the Nonlinear Renewal Theorem (see [26, Theorem 4.5]). In order to apply this theorem, we have to show that conditions (26)–(28) are satisfied . Since a.s. as , for in distribution as the condition (26) implies . Since , the condition (27) for holds trivially. Next, it can be shown that are uniformly integrable when are are uniformly integrable. Hence, also uniformly integrable. Finally, it is shown in Appendix I-E (see Lemma 5) that as which implies that asymptotic approximation (25) is also valid for the expected value of . Note that asymptotic approximation (25) is true when (27) is replaced by a weaker condition

and

do not depend on . Therefore, as for

(32) (33)

—No Target: For the hypothesis , 2) The Hypothesis the performance changes dramatically depending on whether are different for different the distances channels (asymmetric case) or the same (symmetric case). Thus, these two cases must be considered separately. Asymmetric Case: First, we consider the asymmetric case for which attains where the number be its minimum is unique. Let over the threshold at the stopping the overshoot of and define time

for some In general, however, this last condition is not sufficient for (25) . The condition (27) can be perhaps relaxed to be true for into

for some

and

Also, let be the mean of the limiting distribution of under . (see (10)) can be written Observing that the Markov time in the form

and

We summarize the obtained results in the following theorem. , Theorem 3: Assume that the log-likelihood ratios , can be represented in the form (19) with a random walk and a slowly changing sequence. Then the following two assertions hold. are -nonlattice and i) If

where , and using exactly the same kind of argument as above, we obtain as (34)

for then the FA-probability satisfies the asymptotic inequalities (23). are -nonlattice and conditions ii) Suppose that as (26)–(30) are satisfied. Then, for (31) where distribution

of the overshoot

is the mean of the asymptotic

in the open-ended test . beIn the symmetric case where the distances and are the same for all (typical tween when the SNR does not change from channel to channel), the

(35) We omit all mathematical details and only remark that condireplaced by , by , by tions (26)–(30) with , by , and by guarantee (34). The inequalities are nonlattice, and . (35) hold whenever Note that the asymmetric situation is not typical for “multichannel applications.” We now consider a more realistic symmetric case. Symmetric Case: By contrast, suppose now that the distribuand do not depend on . tions This means, in particular, that the density functions in condition (C1) are the same for all . , , , and Then, obviously, do not depend on either. This opposite, symmetric situation is typical for many applications when the SNR is the


same for all channels. In this case, the asymptotic approximais comtion to the expected sample size under the hypothesis pletely different. Specifically, the second term of expansion for is not a constant as in the asymmetric case but is proportional to the square root of the threshold . Therefore, the second term is also growing with the threshold , and the FO approximation is usually very inaccurate for moderate values of the probabilities of errors. The reason is that, in the symmetric play the case, none of the log-likelihood ratio statistics is true. As a result, the dominating role when the hypothesis , , are not slowly changing. sequences To obtain the higher order (HO) asymptotic approximation , we will use the most general results of nonlinear refor newal theory developed by Zhang [27]. We first note that, in the in (19)), the asymptotic approximation for i.i.d. case ( is a special case of Dragalin, Tartakovsky, and the ASN Veeravalli [7, Theorem 3.3] (see Theorem 4 iii) in Section III-E). It turns out that in the quasi-i.i.d. case considered in this subsection similar results can be proved with suitable modifications. Reference [7, Theorem 3.3] is a key for deriving corresponding asymptotic approximations. We will need the following additional notation:

Note that, under , is a zero mean random walk and is a zero mean slowly changing sequence. It follows that on (36) is the excess of the process where over the boundary at time . In what follows, we , , are slowly changing and converge argue that with finite expectation. Let to a random variable . Replacing with in (36) and solving the resulting equation yields

where on both sides, we obtain that for large

. Taking expectations

(37) Now, we give some more details. To derive an accurate , we first have to strengthen asymptotic approximation for the second moment condition (30). Specifically, we suppose and the Cramér condition on the that characteristic function of (38) is satisfied. It is established in [7, Proof of Theorem 3.3] that, under these conditions

where, same as before and

are, respectively, standard normal density and distribution funcis the variance relative to the density . tions and where , where are Also, write i.i.d. Gaussian random variables with mean zero and variance . Note that is the expected value of the th-order statistic . from the standard normal distribution, and hence, The main idea of the approach to finding asymptotics is as can be written in the form follows. The Markov time

is a slowly changing process which converges in -distribution , and that to a random variable as , where with and defined above. Further, assume that conditions (26)–(29) are satisfied with replaced by , by , by , by , and by . Then, under , , are slowly changing sequences that converge in -distribution to random with mean zero, and hence, variables in distribution as . It follows that is the slowly changing sequence which converges in distribution . Since to the random variable when , by Anscombe’s theorem in distribution as and as by the uniform integrability properties. Also, in just the same way as in [7, Proof of Theorem 3.3], it can be is uniformly integrable, and shown that the overshoot as . Therefore, using (37) that (see Lemma 5 along with the fact that Appendix I-E), we obtain that, as

where

(39) The values of universal constants can be found in [7].
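The constant described above as the expected value of the Nth-order statistic from the standard normal distribution, i.e., E[max(z_1, …, z_N)] for i.i.d. standard normal z's, is easy to evaluate numerically; a minimal Monte Carlo sketch (the set-up is ours):

```python
import numpy as np

def expected_max_std_normal(N, reps=200_000, seed=0):
    """Monte Carlo estimate of E[max(z_1,...,z_N)] for i.i.d. standard normal
    z's -- the order-statistic constant entering the symmetric-case expansion."""
    rng = np.random.default_rng(seed)
    return rng.standard_normal((reps, N)).max(axis=1).mean()

for N in (2, 10):
    print(N, expected_max_std_normal(N))
# For comparison: the exact value for N = 2 is 1/sqrt(pi) ~ 0.5642,
# and for N = 10 it is approximately 1.5388.
```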



A rigorous proof of (39) is quite long and for this reason is omitted. We only remark that it is based on Zhang [27, Theorem 3 ] and runs along the lines of Dragalin, Tartakovsky, and Veeravalli [7, Proof of Theorem 3.3]. To summarize, we emphasize that the HO asymptotic approximation (39) holds true when the following set of conditions is fulfilled. a) For the random walk part: the third moment condition and Cramér’s condition (38) for the char. acteristic function of b) For the slowly changing sequences Markov time : conditions (26)–(29) with by , by , and by .

and the replaced by ,

We do not know how to improve the upper bounds (3) for the in the symmetric case. Note, MS-probabilities do not depend on however, that the probabilities , and (35) with works better than the upper bound (3) at least in the example considered in Section V. Remark 2: Earlier, we restricted our attention only to the two cases—asymmetric and symmetric. The latter is perhaps of major interest for applications. In some applications, however, it may be important to deal with an intermediate case, where are the same for several and different for the rest of the channels. A typical example is a radar system with range channels where the SNR changes from one cluster of channels to another but remains unchanged within clusters. Let


Experimentation indicates that (23), (31), (35), (39) with and determined in (40) and (41) give quite accurate approximations for the ASN and probabilities of errors. E. Asymptotic Performance of GSPRT in the i.i.d. Case in (19) which In this subsection, we assume that corresponds to the “purely” i.i.d. case. While this is a particular case of the more general model considered in Section III-D, some of the conditions postulated in that subsection can be relaxed. For this reason, we consider the i.i.d. case separately. The following theorem summarizes the results related to asymptotic approximations and asymptotic optimality in the i.i.d. case. The notation of Section III-D is used throughout. In and are the Kullback–Leibler information particular, . Recall that distances defined in (20), where is used to denote the observation available in the th channel at time . be i.i.d. with Theorem 4: Let the observations when there is no target and with the density the density when there is a target in the th channel. and . Then the i) Suppose that following three assertions hold. i.1) The FA-probability satisfies the asymptotic inequalities (23). , for any i.2) As for (42) for

be the ordered values of

,

. Assume that for some . Note that the latter condition includes the fully symmetric sitfor all when uation (assuming that ), and the asymmetric situation . In this more general situation, the asymptotic apwhen replaced by and by proximation (39) holds with . We note also that , and hence, the resulting expression for the ASN is consistent with (34) obtained in the asymmetric case. , , and are the subject of the The values of , renewal theory. They can be often computed either exactly or approximately (see, e.g., [17], [20], [23], [26], and Section V). The following formulas are particularly useful [17], [26]:

and

i.3) If the thresholds selected so that

then, as

, for any

for (43) for ii) Suppose that

. Then as for

where

and

.

are and as

(44)

(40)

, and the Cramér condition iii) Suppose that is satisfied. (38) for the characteristic function of Then, in the symmetric case, as

(41)

(45) A proof is given in Appendix I-F.
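To connect the i.i.d. results of Theorem 4 with the Monte Carlo experiments reported in Section V, the following self-contained sketch estimates the ASN and the error rates of the GSPRT in a symmetric i.i.d. Gaussian setting; all parameter values are illustrative and are not taken from the paper's tables:

```python
import numpy as np

def run_gsprt(increments, a0, a1):
    """GSPRT of (1): stop when the maximal cumulative LLR leaves (-a0, a1)."""
    cum = np.zeros(increments.shape[1])
    for n, inc in enumerate(increments, start=1):
        cum += inc
        m = cum.max()
        if m >= a1:
            return n, 1
        if m <= -a0:
            return n, 0
    return increments.shape[0], int(cum.max() >= a1)

def mc_gsprt_gaussian(N=2, mu=0.5, alpha=1e-3, beta=1e-2,
                      target_channel=None, reps=2000, seed=3, n_max=10_000):
    """Estimate the ASN and the error rate of the GSPRT when channel i observes
    i.i.d. N(mu,1) if the target is in channel i, and N(0,1) otherwise.
    Thresholds follow (4): a1 = log(N/alpha), a0 = log(1/beta)."""
    rng = np.random.default_rng(seed)
    a0, a1 = np.log(1.0 / beta), np.log(N / alpha)
    samples, errors = [], 0
    for _ in range(reps):
        x = rng.standard_normal((n_max, N))
        if target_channel is not None:
            x[:, target_channel] += mu
        inc = mu * x - 0.5 * mu**2      # per-sample LLR increments, per channel
        n, d = run_gsprt(inc, a0, a1)
        samples.append(n)
        errors += (d == 1) if target_channel is None else (d == 0)
    return np.mean(samples), errors / reps

print("no target:", mc_gsprt_gaussian(target_channel=None))
print("target in channel 0:", mc_gsprt_gaussian(target_channel=0))
```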


Remark 3: It is interesting that Theorem 4 i.3) implies that the GSPRT asymptotically minimizes all the positive moments of the stopping time distribution under the minor, first-moment , that holds whenever dencondition and are mutually absolutely continuous (i.e., sities , then also ), in which case if . A typical example where this condition is violated is testing for rectangular distributions. We stress that higher moment conditions are not required. In this connection, we also note that the conditions used in Remark 1 and can be relaxed in the i.i.d. case. Indeed, is a necessary and suffiin the i.i.d. case cient condition for the above conditions. These conditions are redundant, as shown in Theorem 4 i.3). Remark 4: The FO asymptotic optimality result of Theorem in (43)) can be perhaps strengthened 4 i.3) for the ASN ( into the “almost optimality.” To be specific, we conjecture that and in the symmetric situation if the thresholds are selected so that

then the corresponding asymptotic relations hold. In other words, we conjecture that in this case the difference between the expected sample sizes of the GSPRT and an optimal detection procedure vanishes. A proof of this fact is highly nontrivial and may be built on the basis of a Bayesian approach similar to that considered by Lorden in [12], [14]. We leave this interesting problem for future research.

IV. AN SPRT BANK

In this section, we briefly discuss an alternative multichannel sequential detection algorithm that is based on the parallel implementation of Wald's tests. This scheme will be referred to as an SPRT bank.

Let a_0 and a_1 be two positive thresholds. For each channel i = 1, …, N, define the one-sided stopping times at which the statistic λ_n^i first reaches the upper threshold a_1 and first falls to the lower threshold −a_0, respectively (see (46)). The multichannel Wald test (SPRT bank) is defined in terms of these times by (47). Thus, the observation process is continued as long as all the log-likelihood ratios λ_n^i are between the thresholds −a_0 and a_1. A decision in favor of target presence is made at the moment when at least one of the statistics λ_n^i exceeds the upper threshold a_1. A decision in favor of target absence is made at the moment when all the statistics λ_n^i have crossed the lower threshold −a_0. We stress that this procedure is nothing but the parallel implementation of Wald's SPRTs in multiple channels.

We denote by δ_i the Wald binary SPRT (48) that tests the hypothesis H_i against H_0, and by α_i and β̃_i its FA- and MS-probabilities. Since the observations in different channels are mutually independent, the statistics λ_n^i in different branches are independent (see (5) and (6)), and it is easily seen that the FA-probability and the MS-probabilities of the SPRT bank are given by (49). Using the well-known Wald upper bounds on α_i and β̃_i (see, e.g., [26]), we obtain the inequalities (50). Reverting the inequalities (50), we immediately obtain that the threshold choices in (51) guarantee that the SPRT bank belongs to the class C(α, β).

The results of Theorem 2 are valid for the SPRT bank. In other words, the SPRT bank is asymptotically optimal under the same, quite general conditions. Theorems 3 ii), 4 i), and 4 ii) also hold. In particular, under the same conditions and with the same notation as in Theorem 3, the asymptotic expansion of the expected sample size under H_i holds for the SPRT bank as well. The proofs are quite similar, and the details are omitted.

An asymptotic approximation for the ASN under the hypothesis H_0 is slightly different from (39) and (45). This problem deserves special consideration; the derivation of an asymptotically accurate expansion requires a fairly tedious argument that is outside the scope of this paper. Here we only point out that, under H_0, the stopping times of the SPRT bank and of the GSPRT are close to each other when the thresholds are large, so that their expected sample sizes agree to first order. Further, the SPRT bank evidently stops no later than the GSPRT, and hence the right-hand sides of (39) and (45) provide asymptotic upper bounds for its ASN in the quasi-i.i.d. and i.i.d. cases, respectively.

More importantly, the probabilities of errors can be estimated accurately in all cases where the partial error probabilities α_i and β̃_i can be evaluated.
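A minimal sketch of the SPRT bank follows. It implements one natural reading of (46)–(47), in which each channel runs its own Wald SPRT to completion and channels that have already accepted "no target" are frozen; whether this matches the paper's exact convention cannot be verified from this extraction:

```python
import numpy as np

def sprt_bank(llr_increments, a0, a1):
    """Parallel Wald SPRTs: declare 'target present' as soon as any channel
    statistic reaches a1; declare 'target absent' only when every channel's
    SPRT has stopped at or below -a0 (such channels are frozen)."""
    n_max, N = llr_increments.shape
    cum = np.zeros(N)
    rejected = np.zeros(N, dtype=bool)   # channels whose SPRT accepted 'no target'
    for n in range(n_max):
        active = ~rejected
        cum[active] += llr_increments[n, active]
        if np.any(cum[active] >= a1):
            return n + 1, 1              # some channel accepted 'target present'
        rejected |= cum <= -a0
        if rejected.all():
            return n + 1, 0              # every channel accepted 'no target'
    return n_max, 0                      # truncated run
```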


The latter probabilities can be evaluated with high precision in a variety of situations when the observations are either i.i.d. or quasi-i.i.d. (see Sections III-D and III-E). Indeed, it can be shown that, as the thresholds increase,


and

(52)

Therefore, ments

(53)

Assume that

,

is a Gaussian process with independent increand parameters

which along with (49) yields as

are defined in (40). In the i.i.d. where the constants and case, the asymptotic approximations (52) follow from the renewal theory [17], [26]. In the quasi-i.i.d. case, they can be derived based on the nonlinear renewal theory. The argument is similar to that used in Section III-D. V. AN EXAMPLE

TARGET DETECTION GAUSSIAN NOISE

OF

IN

COLORED

(54) where characterizes average SNR is finite, for all second moment of

. Since the

and -

Consider an example where, in spite of the fact that the observed data are dependent, the log-likelihood ratios are processes with independent increments. In some cases, they can even be approximated by sums of random walks and slowly changing sequences, i.e., the model (19) can be used for the analysis.

Consider the additive model in which the observation in the ith channel is X_n(i) = s_n(i) + ξ_n(i) if θ = i and X_n(i) = ξ_n(i) if θ = 0 or θ = j, j ≠ i. We will suppose that the signal s_n(i) is deterministic and that the sensor noise (or clutter) ξ_n(i) in the ith channel is modeled by a pth-order autoregressive Gaussian process which obeys the recursion

  ξ_n(i) = Σ_{k=1}^{p} ρ_k ξ_{n−k}(i) + ζ_n(i),

where the ζ_n(i) are i.i.d. Gaussian random variables with mean zero and variance σ² (ζ_n(i) and ζ_m(j) are independent for (n, i) ≠ (m, j)). We assume that the parameters ρ_1, …, ρ_p and σ² are known. In radar applications, X_n(i) usually represents the result of preprocessing (attenuation and matched filtering) of modulated pulses in the ith channel.
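Before turning to the analysis, the following sketch simulates one channel of this model for the first-order (AR(1)) case and computes the log-likelihood ratio by whitening, assuming the recursion is initialized at a zero state (the paper's treatment of the initial sample is not recoverable from this extraction); the numerical values are illustrative:

```python
import numpy as np

rng = np.random.default_rng(1)

def ar1_noise(n, rho, sigma):
    """AR(1) Gaussian noise xi_k = rho*xi_{k-1} + zeta_k with zero initial state."""
    zeta = rng.normal(0.0, sigma, size=n)
    xi, prev = np.empty(n), 0.0
    for k in range(n):
        prev = rho * prev + zeta[k]
        xi[k] = prev
    return xi

def llr_ar1_constant_signal(x, s, rho, sigma):
    """Cumulative LLR for a known constant signal s in AR(1) noise, computed by
    whitening: x_tilde_k = x_k - rho*x_{k-1}, s_tilde_k = s*(1-rho) for k >= 2
    and s_tilde_1 = s (zero initial state assumed)."""
    x_prev = np.concatenate(([0.0], x[:-1]))
    x_t = x - rho * x_prev                        # innovations
    s_t = np.full_like(x, s * (1.0 - rho))
    s_t[0] = s
    inc = s_t * (x_t - 0.5 * s_t) / sigma**2      # per-sample LLR increments
    return np.cumsum(inc)

s, rho, sigma, n = 1.0, 0.5, 1.0, 2000
x = s + ar1_noise(n, rho, sigma)                  # target present in this channel
lam = llr_ar1_constant_signal(x, s, rho, sigma)
print(lam[-1] / n)   # roughly q = s**2 * (1 - rho)**2 / (2 * sigma**2) = 0.125
```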

Extensive MC simulations have been performed for different numbers of channels, SNRs, and probabilities of errors. Sample results for two- and ten-channel systems are shown in Tables II–VI. We simulated a first-order (FO) autoregressive process, in which case ξ_n(i) = ρ ξ_{n−1}(i) + ζ_n(i), where ρ is the correlation coefficient (|ρ| < 1). In the latter case, the results of Sections III-D1 and III-D2 can be applied. To show this, we first note that the log-likelihood ratio can be written in the form

where the corresponding summands are i.i.d. Gaussian random variables with parameters determined by the signal, the correlation coefficient, and the noise variance. It can be shown that the increments of the log-likelihood ratio are independent normal random variables. Therefore, by adding and subtracting a random variable which has the same distribution as these increments, one can represent the log-likelihood ratio in the form (19), with the random walk and the slowly changing term specified below.


TABLE I: The values of the overshoot-related constants for different SNR q (used to assess the accuracy of the approximations (58)). [Table data not recovered in this extraction.]
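The constants reported in Table I can also be estimated by direct simulation. The sketch below estimates the limiting mean overshoot and the associated exponential moment for a Gaussian random walk; the parameterization used in the final line (per-step mean q and variance 2q, with q the per-sample SNR) is an assumption on our part:

```python
import numpy as np

def overshoot_constants(mu, var, a=50.0, reps=10_000, seed=2):
    """Monte Carlo estimates of the limiting mean overshoot E[R] and of
    E[exp(-R)] for a Gaussian random walk with per-step mean mu > 0 and
    variance var crossing a remote threshold a (illustrative only)."""
    rng = np.random.default_rng(seed)
    sd = np.sqrt(var)
    n_guess = int(3 * a / mu) + 10          # enough steps to cross a w.h.p.
    overshoots = np.empty(reps)
    for r in range(reps):
        s = np.cumsum(rng.normal(mu, sd, size=n_guess))
        while s[-1] < a:                    # rare: extend the walk
            s = np.concatenate([s, s[-1] + np.cumsum(rng.normal(mu, sd, size=n_guess))])
        k = int(np.argmax(s >= a))          # first crossing index
        overshoots[r] = s[k] - a
    return overshoots.mean(), np.exp(-overshoots).mean()

print(overshoot_constants(mu=0.1, var=0.2))
```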

For small , Brownian approximations can be applied (see [17, Lemma 10.27]) to obtain the following useful approximations for and :

being a Gaussian random walk with the parameters , , and

being a Gaussian random variable with , , where . It is easily checked that all the necessary conditions, particularly (26)–(30) and (38), are satisfied. and To guarantee the given probabilities of errors in simulations, we used the following threshold values obtained by reverting (24) and (35): and where the constants from the formula

and

(55)

, according to (40), are computed

(58) where

The remaining term in the approximations (58) is of the order for small , and therefore, their accuracy grows when SNR are given decreases. The values of and for in Table I. It is seen that the approximations (58) are quite accurate and can be used when is not very large. For the purpose of comparison, we also used the FO approximations for ASN (see (14))

(56) and To compute the ASN

(59)

and

(due to the symmetry, for all used the following HO approximations:

), we

(57) where, according to (41)

Formulas (57) follow from the HO asymptotics (32) and (39).

In Tables II–VI, we present the MC estimates of the ASN and of the error probabilities along with the theoretical values computed according to (57) and (59) for N = 2 (two-channel system) and N = 10 (ten-channel system). The abbreviations MCASN, FOASN, and HOASN are used for the ASN obtained by the MC experiment, by the FO approximations (59), and by the HO approximations (57), respectively. In these tables, we also show the MC estimates for the probabilities of errors. The results in Tables II and V correspond to the thresholds (55) that account for overshoots. (The threshold in (55) corresponds to the asymptotic upper bound in (33) for the FA-probability.) It is seen that these formulas allow one to obtain quite satisfactory approximations for the true error probabilities as long as the average sample size is not very small. FOASN (59) has a satisfactory accuracy for the hypothesis that the target is present but is poor for the no-target hypothesis. This could be expected, since the FO approximation neglects the second term, which increases at the rate of the square root of the threshold. HOASN gives fairly accurate estimates in all cases where the SNR is not very large, i.e., where the ASN is not very small. Tables III and VI contain similar results in the case when


TABLES II–V: Monte Carlo results for two-channel and ten-channel systems with thresholds of the form a_1 = log(N/P_FA) and a_0 = log(1/P_MS), correlation coefficient ρ = 0.5 (Tables II and V account for overshoots; see (55)). [Table data not recovered in this extraction.]

the upper threshold is replaced with the one defined by (56). Note that this threshold corresponds to the lower bound in (33). It is seen that, in some of the cases, the FA-probability is better approximated by the lower bound than by the upper bound (of course, with no guarantee of being below the prescribed level). However, this choice is not good enough in all cases: the FA-probability can exceed the given level by up to a factor of four. In addition, we performed MC experiments for the detection procedure that uses the thresholds (4), which were derived based on the general upper bounds (3). (These bounds ignore overshoots.)


Sample results are shown in Table IV. An analysis shows that for this last procedure the true probabilities of errors are substantially smaller than the allowed values. Comparison with Table II also shows that both error probabilities are smaller than in the detection algorithm with the thresholds (55) that account for overshoots. This leads to an increase of the true values of the average sample size, which is undesirable (the specific ASN values can be compared with the first row in Table II).


TABLE VI: Ten-channel system with the threshold defined by (56), ρ = 0.5. [Table data not recovered in this extraction.]

Furthermore, in the corresponding columns of the tables, we show the results of the comparison with the multichannel fixed-sample-size (FSS) detection algorithm, which is based on the maximum-likelihood statistic


Solving these equations, we obtain that the sample size should be chosen so that

if if where is a threshold. The efficiency of the sequential detection algorithm compared to the FSS algorithm is defined as

where is the sample size which is required to and in the FSS guarantee the probabilities of errors detection procedure. We now show that

(60) where normal distribution. Indeed, it is easily seen that

is the quantile of the standard

where , , and we obtain the following two equations for the threshold and sample size :

Since we are interested in small α and β (i.e., in large sample sizes), we can neglect the corresponding lower order term; this, along with the previous equality, yields (60). Evidently, the approximation (60) is accurate whenever the resulting sample size is large.

The data in Tables II–VI allow us to conclude that the sequential detection algorithm requires a much smaller ASN. In particular, for the parameter values reported in the tables, the sample size is 1.7–3.6 times smaller when there is no target and 1.4–2.8 times smaller when there is a target. We also remark that the equivalent SNR depends on the correlation coefficient ρ of the noise through the factor (1 − ρ)². Since the ASN grows approximately as the reciprocal of the equivalent SNR, it substantially increases when ρ increases. In particular, the ASNs under both hypotheses are about four times higher for ρ = 0.5 as compared to the case of noncorrelated noise.
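As an illustration of the kind of comparison behind (60), the following sketch computes the FSS sample size for the i.i.d. Gaussian case from a normal approximation with a union bound over channels; it is a stand-in for the paper's formula (60), whose exact form is not recoverable from this extraction, and the parameter values are illustrative:

```python
import numpy as np
from scipy.stats import norm

def fss_sample_size(N=2, q=0.125, alpha=1e-3, beta=1e-2):
    """Sample size of a fixed-sample-size max-statistic detector for the i.i.d.
    Gaussian case with per-sample SNR q.

    After n samples the per-channel LLR is N(-n*q, 2*n*q) under 'no target'
    and N(+n*q, 2*n*q) in the target channel, so requiring
      N * Phi_bar((h + n*q)/sqrt(2*n*q)) <= alpha   (false alarm, union bound)
      Phi((h - n*q)/sqrt(2*n*q))         <= beta    (missed detection)
    and adding the two quantile conditions gives the sample size below."""
    z_a = norm.ppf(1.0 - alpha / N)
    z_b = norm.ppf(1.0 - beta)
    return int(np.ceil((z_a + z_b) ** 2 / (2.0 * q)))

print(fss_sample_size())   # compare with the sequential ASN estimates
```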


VI. CONCLUSION AND REMARKS

For the problem of target detection in a multichannel sensor system, we have proposed simple, easily implementable multichannel sequential procedures, called the generalized SPRT and the SPRT bank. As a theoretical contribution, we have shown that, under certain regularity conditions, these tests are asymptotically optimal in the sense of minimizing the average sample size, or more generally positive moments of the stopping time, among all detection procedures with guaranteed low rates of false alarms and missed detections for general non-i.i.d. data models. As we showed explicitly, the regularity conditions involved can be relaxed considerably for quasi-i.i.d. and conventional i.i.d. data models. Further, using nonlinear renewal theory, we have derived explicit asymptotic expansions for the average sample size and upper/lower bounds for the probabilities of errors of the proposed tests. They not only provide useful insight into the performance of the tests, but are also the basis for the approximate design of the thresholds of the tests. As verified by simulation results, accounting for the excesses over the thresholds in the higher order asymptotic expansions leads to significantly better results. Simulation results also verified the substantial superiority of the proposed generalized SPRT to nonsequential tests. The usefulness of the asymptotic theory manifests itself once again in our study through the close match between the theoretical and simulation results for the practical example considered, in which the error probabilities are small but not infinitesimal.

Although we focused on discrete-time models, which are, perhaps, of the most interest for applications, many of the above results also hold for stochastic processes observed in continuous time. In particular, Theorems 1 and 2 are equally true for models with a continuous-time parameter, and the proofs are straightforward. However, a generalization of Theorems 3 and 4 is not a straightforward task, since there are no continuous-time analogs of the nonlinear renewal theory results that we used in the proofs.

Most of the results obtained in the paper also hold in the following "scalar," not multichannel, case. Let X_1, X_2, … be observations with a joint density that depends on an unknown parameter when there is a target and with a known density when there is no target. Further, let the unknown parameter take one of N distinct discrete values, and let the corresponding log-likelihood ratios between the hypotheses "the parameter equals the ith value" and "no target" play the role of the channel statistics. The GSPRT and the SPRT bank are then defined by (1) and (47) with these log-likelihood ratios in place of the per-channel ones. One particularly interesting problem that allows for this formulation is the detection of gravitational waves (almost harmonic signals with a slowly changing phase and an unknown frequency) in Gaussian noise [15]. The results of Sections III and IV can be used to obtain asymptotically accurate approximations for the probabilities of errors and the average sample size of the GSPRT and the SPRT bank.


where denotes an indicator of an event . The second upper bound follows from the following chain of equalities and inequalities: for any

where we used the following obvious relations:

The proof is complete. APPENDIX I B. Derivation of Asymptotic Lower Bounds (8) and (9) A. Derivation of Upper Bounds (3) for the Probabilities of Errors Lemma 1: Let The MS-probabilities

be an arbitrary stochastic process.

The following lemma is the key to obtaining the asymptotic lower bounds for moments of the stopping time distribution. Lemma 2: Let the almost sure convergence conditions (7) be , satisfied. Then, for every

and the FA-probability

for (A1)

of the GSPRT obey the following inequalities:

and

for Proof: To prove the inequality for , we obfor all on the set serve that . Therefore, by changing the measure we obtain

(A2)

Proof: Since for any two random events

and (A3)


where is a complement of , it follows that for any detection procedure

for

Consider the detection procedure . Our first obfor , which servation is that follows from the a.s. convergence (7). Indeed, since -a.s. and a.s., it follows and . that a.s. for all and . Next, using This implies inequalities (A6) and (A7) along with (3), we obtain

(A8)

whenever for . . Therefore, Note that for any procedure that terminates with probability

(A9) To prove the first equality in (A1), we put

for (A4)

and By changing the measure, we obtain that for any

and and use (A8) to obtain

(A10) To complete the proof, we only need to show that the third term for all on the right-hand side of (A10) tends to as . -a.s., By the conditions of the lemma in -probawhich implies that . Therefore, for every bility as

where the last inequality follow from (A3) if we set and It follows that

which along with (A10) proves that (A5) as Combining (A4) and (A5) yields

To prove the second equality in (A1) we note that, by the a.s., and similar to condition (7), the above

(A6) in order to show that

A similar argument applies to

for

(A7)

for every and Taking using (A9) and (A11), we obtain that for all , as every

which proves the second equality in (A1).

(A11) and and


Now, we prove (A2). First taking that for any procedure

and and using (A7), we obtain from the class


Proof of ii). Write for

(A12) Similar to (A13), for any The first two terms on the right-hand side go to for every as . The same kind of argument as above applies to show that the third term also goes to for all whenever the almost-sure convergence condition (7) for the right-hand side in (A12). holds. Write does not depend on , it follows that the Since inequality

where the right-hand side tends to This implies inequalities (9).

, any

, and every

as

by (A2).

C. Derivation of Asymptotic Upper Bounds (13) holds uniformly in as

, which along with the fact that proves the first equality in

(A2). Next, let

Lemma 3: Suppose that the complete convergence conditions (12) hold. Then, as for

and

for

.

Proof: Recall that

Then, using (A6), we obtain that and

for all and for any . The rest of the argument is essentially the same. This leads to the proof of the second equality in (A2). The proof of the lemma is complete. Proof of Theorem 1: To prove the asymptotic lower bounds (8) and (9), it suffices to use Lemma 1 and Chebyshev’s inequality. Proof of i). Write for

It follows from [6, Proof of Theorem 4.2] that, as

if the condition (12) holds. Now, since and , this implies that and for every . The required upper follow and the proof is complete. bounds for D. Derivation of Upper and Lower Bounds (23) for the False Alarm Probability

By the Chebyshev inequality, for any and every

, any

,

(A.13) where the latter limit follows from (A1). Since is an arbitrary , it follows that number in

Lemma 4: Suppose that the log-likelihood ratios can be represented in the form (19) where are random walks with finite expectations, , , and are slowly changing sequences, such that Then, as

where which imply inequalities (8).


Proof: Write

Then

and

as Changing the measure for all

to

and using (21), we obtain that

(A17) (A18)

as Proof: For any

where

for

, we have

. Since for all

(A19) (A14)

where we used the Schwarz inequality, the fact that (for any ), and the upper bound (3) for . Next, we show that under condition (A15) as

the following inequalities hold:

(A20)

in -probaSince by [17, Lemma 9.13] bility, in order to prove the convergence of the second moment is finite for all , (A20) we need to show that , , are uniformly integrable under and that for some . be as in (A15), and be a small positive Let . Let number such that and Now, since are slowly changing (see Section III-D1), [26, Theorem 4.1] applies in order to show that as in -distribution. Also, since -a.s., with probfor all . ability , and this implies Therefore, by (3) as

Then, for any , so that

, the value of

is smaller than

(A21) which along with previous inequalities proves the assertion of the lemma. E. Asymptotic Relations Between Stopping Times

Obviously

and

Lemma 5: Let in such a way that the ratio is bounded away from and . Assume that can are be represented in the form (19) where , random walks with finite second moments , and are slowly changing sequences such that

(A22) Now, the right-hand side of (A22) goes to zero as as By (A21)

for some (A15) for some

where and

(A16)

if


… do not depend on …. The values of …, …, are summable by condition (A15). The values of … are also summable whenever …, by the following argument. Write …. It follows from [4, Theorem 1] that there exists a universal constant … depending only on … such that, for every …,

…

converges to … …-a.s. as …. Therefore, the sequence … is slowly changing, and Lemma 4 applies to complete the proof of i.1).

i.2) We prove the asymptotic equalities (42) only for …; for …, the proof is essentially the same. To prove the convergence of moments as …, it is sufficient to check the validity of the following three conditions:

(C1) … for all … and …;
(C2) the family … is …-uniformly integrable; and
(C3) … in …-probability as ….

Since …, it is sufficient to verify conditions (C1) and (C2) for the stopping time …. Obviously, for any … and, hence, the following implication holds:

… for all ….


Since … is the log-likelihood ratio for some i.i.d. model with densities … and …, we have …. Therefore,

(A23)

which shows that … is bounded for any …. It followsies that … for all … and, in particular, that … and …. Thus, … for all … implies both …. Applying Markov's inequality to the last term, we obtain that, for any … and …,

…

… are summable, which … as … and … as …. In particular, … are uniformly integrable under …. The convergence of the second moment (A20) follows. Finally, using (A19) and (A20), we obtain, as …,

assuming that the ratio … is bounded away from … and …, where …. This implies the asymptotic equality (A17); the asymptotic equality (A18) is proved in exactly the same way.
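The effect of the slowly changing term on the stopping-time asymptotics can also be checked numerically. The sketch below compares, for a common sequence of Gaussian increments, the first passage time of a pure random walk over a level a with that of the same walk perturbed by an almost surely convergent sequence, together with the first-order approximation a/mu. The drift, the perturbation 1/sqrt(n), and the threshold values are illustrative assumptions and are not taken from the paper.

import numpy as np

rng = np.random.default_rng(3)
mu, sigma = 0.125, 0.5                   # drift and std of the random-walk part (assumed)

def passage_times(a, n_max=20000):
    """First passage over level a of S_n and of S_n + xi_n, with xi_n slowly changing.
    Assumes the level is crossed within n_max steps (true here with overwhelming probability)."""
    S = np.cumsum(rng.normal(mu, sigma, size=n_max))
    xi = 1.0 / np.sqrt(np.arange(1, n_max + 1))    # an a.s. convergent perturbation
    return int(np.argmax(S >= a)) + 1, int(np.argmax(S + xi >= a)) + 1

for a in (5.0, 20.0, 80.0):
    pairs = [passage_times(a) for _ in range(500)]
    t_walk = np.mean([p[0] for p in pairs])
    t_pert = np.mean([p[1] for p in pairs])
    print(f"a={a:5.1f}   E[walk]~{t_walk:8.1f}   E[perturbed]~{t_pert:8.1f}   a/mu={a/mu:8.1f}")

Both averages grow like a/mu while their difference stays bounded as the threshold increases, which mirrors the qualitative content of the asymptotic relations between the stopping times discussed here.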

By Jensen’s inequality, for any when and do not coincide almost everywhere. It and a follows that there is a finite positive constant , such that , number , . This implies for all positive and all finite positive . Therefore, (C1) holds. for any , and By (A23), condition (C2) follows from Gut [8, Theorem III.7.1]. To establish the validity of (C3), we observe that, according -a.s. as . Also, it to the strong law -a.s. as . Thus, was established above that as -a.s., and using the representation (A24), we obtain -

Proof: Proof of i). In the i.i.d. case, similarly to (21), the Markov time … can be written in the form (A24) with …. To prove i.1), we note that … as …, a.s. under …, whenever … and … are positive and finite. This implies that … converges, so that

…

almost surely under …. Since … as …, condition (C3) holds, which completes the proof of (42).

i.3) The same as the proof of Theorem 2 ii).


Proof of ii). We first use (A24) and [26, Theorem 4.5] to show that, as …,

(A25)

and …. Then …. However, to apply [26, Theorem 4.5], the following four conditions have to be verified:

… converges in distribution to a random variable … (A26)

… for some … (A27)

… are uniformly integrable (A28)

… for some … (A29)

where …. Since … …-a.s. as …, condition (A26) holds with …. Condition (A27) holds trivially, since …. Verification of the uniform integrability condition (A28) is a straightforward but tedious task (see, e.g., [7]). The only remaining condition to check is (A29). Changing the measure … to …, we obtain that, for any … and …,

…

Thus, to verify condition (A29), it suffices to show that, as …,

…

This last asymptotic relation follows from the Baum–Katz convergence rate in the strong law of large numbers [2], which states that … for all … whenever …. Hence (A29) follows. Therefore, all the conditions of [26, Theorem 4.5] are satisfied, and this theorem yields the asymptotic approximation (A25). To prove assertion ii), we have only to prove that … as …, which follows from Lemma 5. This yields (44), which completes the proof of ii).

Proof of iii). Asymptotic approximation (45) is a particular case of the approximation (32) with … in Dragalin, Tartakovsky, and Veeravalli [7, Theorem 3.3].
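The summability used in the verification of (A29) via the Baum–Katz rate can be illustrated numerically: for increments with a finite second moment, the probabilities P(|S_n/n - mu| > eps) decay fast enough in n to form a convergent series. The sketch below estimates these tail probabilities for Gaussian increments, in which case the partial sum can be sampled directly; the drift, variance, eps, and sample sizes are illustrative assumptions.

import numpy as np

rng = np.random.default_rng(4)
mu, sigma, eps, n_rep = 0.125, 0.5, 0.05, 10 ** 6   # assumed illustrative values

for n in (50, 100, 200, 400, 800, 1600):
    # For Gaussian increments, S_n has the N(n*mu, n*sigma^2) law, so sample it directly.
    S = rng.normal(n * mu, sigma * np.sqrt(n), size=n_rep)
    p_n = np.mean(np.abs(S / n - mu) > eps)
    print(f"n={n:5d}   P(|S_n/n - mu| > eps) ~ {p_n:.2e}")
# The rapid decay of these tail probabilities is what makes the series
# sum_n P(|S_n/n - mu| > eps) converge, as required by the Baum-Katz argument.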

where the last inequality follows from (A3) if we set … and …. It follows that, for …,

…

Next, … and, therefore, …. Let …

REFERENCES

[1] P. Armitage, "Sequential analysis with more than two alternative hypotheses, and its relation to discriminant function analysis," J. Roy. Statist. Soc. B, vol. 12, pp. 137–144, 1950.
[2] L. E. Baum and M. Katz, "Convergence rates in the law of large numbers," Trans. Amer. Math. Soc., vol. 120, pp. 108–123, 1965.
[3] H. Chernoff, "Sequential design of experiments," Ann. Math. Statist., vol. 30, pp. 755–770, 1959.
[4] Y. S. Chow and T. L. Lai, "Some one-sided theorems on the tail distribution of sample sums with applications to the last time and largest excess of boundary crossing," Trans. Amer. Math. Soc., vol. 208, pp. 51–72, 1975.
[5] V. P. Dragalin, "Asymptotic solution of a problem of detecting a signal from k channels," Russ. Math. Surv., vol. 42, pp. 213–214, 1987.
[6] V. P. Dragalin, A. G. Tartakovsky, and V. Veeravalli, "Multihypothesis sequential probability ratio tests, I: Asymptotic optimality," IEEE Trans. Inform. Theory, vol. 45, pp. 2448–2461, Nov. 1999.
[7] V. P. Dragalin, A. G. Tartakovsky, and V. Veeravalli, "Multihypothesis sequential probability ratio tests, II: Accurate asymptotic expansions for the expected sample size," IEEE Trans. Inform. Theory, vol. 46, pp. 1366–1383, July 2000.
[8] A. Gut, Stopped Random Walks: Limit Theorems and Applications. New York: Springer-Verlag, 1988.
[9] P. L. Hsu and H. Robbins, "Complete convergence and the law of large numbers," Proc. Nat. Acad. Sci. USA, vol. 33, pp. 25–31, 1947.
[10] T. L. Lai, "On r-quick convergence and a conjecture of Strassen," Ann. Probab., vol. 4, pp. 612–627, 1976.
[11] T. L. Lai, "Asymptotic optimality of invariant sequential probability ratio tests," Ann. Statist., vol. 9, pp. 318–333, 1981.
[12] G. Lorden, "Integrated risk of asymptotically Bayes sequential tests," Ann. Math. Statist., vol. 38, pp. 1399–1422, 1967.
[13] G. Lorden, "2-SPRT's and the modified Kiefer–Weiss problem of minimizing an expected sample size," Ann. Statist., vol. 4, pp. 281–291, 1976.
[14] G. Lorden, "Nearly-optimal sequential tests for finitely many parameter values," Ann. Statist., vol. 5, pp. 1–21, 1977.
[15] S. Marano, V. Matta, and P. Willett, "Sequential detection of almost-harmonic signals," IEEE Trans. Signal Processing, vol. 51, pp. 395–406, Feb. 2003.
[16] I. V. Pavlov, "Sequential procedure of testing composite hypotheses with applications to the Kiefer–Weiss problem," Theory Prob. Appl., vol. 35, pp. 280–292, 1990.
[17] D. Siegmund, Sequential Analysis: Tests and Confidence Intervals. New York: Springer-Verlag, 1985.
[18] Yu. G. Sosulin and M. M. Fishman, Theory of Sequential Decisions and Its Applications (in Russian). Moscow, U.S.S.R.: Radio i Svyaz', 1985.
[19] A. G. Tartakovskii, "Sequential testing of many simple hypotheses with dependent observations," Probl. Inform. Transm., vol. 24, pp. 299–309, 1988.
[20] A. G. Tartakovskii, Sequential Methods in the Theory of Information Systems (in Russian). Moscow, Russia: Radio i Svyaz', 1991.
[21] A. G. Tartakovskii, "Asymptotically optimal sequential tests for nonhomogeneous processes," Sequential Anal., vol. 17, pp. 33–62, 1998.
[22] A. G. Tartakovskii, "Asymptotic optimality of certain multihypothesis sequential tests: Non-i.i.d. case," Statist. Inference for Stochastic Processes, vol. 1, no. 3, pp. 265–295, 1998.
[23] A. G. Tartakovsky and I. A. Ivanova, "Approximations in sequential rules for testing composite hypotheses and their accuracy in the problem of signal detection from post-detector data," Probl. Inform. Transm., vol. 28, pp. 63–74, 1992.
[24] A. G. Tartakovsky and X. R. Li, "Sequential testing of multiple hypotheses in distributed systems," in Proc. 3rd Int. Conf. Information Fusion, vol. II, Paris, France, July 2000.
[25] A. Tartakovsky, X. R. Li, and G. Yaralov, "Sequential detection of targets in distributed systems," in Proc. SPIE: Signal Processing, Sensor Fusion, and Target Recognition (Aerosense 2001), vol. 4380, Orlando, FL, Apr. 16–20, 2001, pp. 229–243.
[26] M. Woodroofe, Nonlinear Renewal Theory in Sequential Analysis. Philadelphia, PA: SIAM, 1982.
[27] C. H. Zhang, "A nonlinear renewal theory," Ann. Probab., vol. 16, pp. 793–825, 1988.