1
Non-Bayesian Detection and Detectability of Anomalies From a Few Noisy Tomographic Projections Lionel Fillatre and Igor Nikiforov
Abstract— The detection of an anomaly from a few noisy tomographic projections is addressed from the statistical point of view. An unknown scene is composed of a background, considered as a deterministic nuisance parameter, with a possibly hidden anomaly. Because the full pixel-by-pixel reconstruction is impossible, a parametric non-Bayesian approach is proposed to fill up the gap in the missing data. An optimal statistical test which eliminates the background and detects the anomaly is designed. The potential advantage of such an approach is its capacity to detect an anomaly/target hidden in background designed by an adversary to mask the anomaly. A key issue in the non-Bayesian anomaly detection, i.e. the problem of anomaly detectability, is stated and solved in this paper. In the case of a bivariate polynomial background defined on an unknown rectangular support, the size of detectable anomaly reaches its maximum defined by the number of elementary cells of X-ray detector and degree of the polynomial function. Index Terms— Anomaly detection, Anomaly detectability, Computerized tomography, Non-bayesian parametric approach, Optimal hypotheses testing, Deterministic nuisance parameter.
I. I NTRODUCTION
AND THE STATE - OF - THE - ART ANOMALY DETECTION APPROACHES
Computerized tomography (CT) is a technique for examining the internal structure of an object, which is used in quantitative non-destructive testing (QNDT), object recognition, biomedical system monitoring and water vapor tomography by GPS, among others [1], [2], [3], [4]. In certain applications, like baggage X-ray scanning or welding defects detection, it is assumed that the full pixel-by-pixel reconstruction is an intractable, unattractive or expensive solution and, hence, it is desirable to detect an anomaly/target possibly hidden in background from a very limited number of noisy tomographic projections. First papers devoted to the problem of anomaly/target from noisy projections consider that the deterministic background is known [5], [6]. Consequently, this background is eliminated by simply subtracting its known value from radiographies and the initial problem is reduced to the signal in noise detection, which can be solved by using a maximum likelihood estimation [5], a binary [7], or multiple [6], hypotheses testing approach or an object recognition point of view [8], [9]. A more realistic anomaly detection problem statement is based on the assumption that Part of this work has been presented at the 16th IFAC World Congress, July 4-8, 2005, Prague. The authors are with the Universit´e de Technologie de Troyes, ISTIT, FRE CNRS 2732, 12 rue Marie Curie - BP 2060 - 10010, Troyes Cedex, France. Lionel Fillatre, e-mail :
[email protected], phone : 33 3 25 71 80 09. Igor Nikiforov (contact author), e-mail :
[email protected], phone: +33 3 25 71 56 78, fax : +33 3 25 71 56 99.
the imaged medium is composed of an (partially) unknown background with a possibly hidden anomaly/target. Existing methods of anomaly/target detection problem in radiographies [9] fall into three groups : 1) methods without a priori knowledge, 2) reference methods and 3) computerized tomography (CT) methods. The first group includes the methods of anomaly detection without prior knowledge of the imaged medium structure – in other words, the methods without (statistical) model. Such approaches typically use image processing tools (for example, field flattening to enhance the contrast) [10], [11], pattern recognition [12], expert systems [13] and artificial neural networks [14], [11]. The prerequisite for these methods is the existence of common properties which well define all kinds of anomalies and distinguish them from the character features of the non-anomalous imaged medium [9], [15], [16]. Often, these methods are noise sensitive since they do not include explicitly a random noise measurement in the model. In the second group, it is assumed that a reference radiography (model) [9] is available. If a significant difference is identified by comparing the tested image with a reference one [17], then the inspected piece is classified as defective. This approach is efficient to deal with a well-known inspected piece but it is heavily based on a reference model that is not always possible in practice. The third group (CT methods) consists of detecting anomalies from radiographies by reconstructing the imaged medium, which is composed of an (partially) unknown background with a possibly hidden anomaly. Since the number of projections and/or angles of view available for inspection is very limited and the pixel-by-pixel reconstruction is impossible [18], the introduction of prior information on the unknown background (i.e. on the nuisance parameter) is inevitable to fill up the gap in the missing data. Two methods of prior information introduction are available in the literature according to its nature - deterministic or statistical. A purely deterministic regularization [19] of this ill-posed problem [20] has some drawbacks including artifacts in the resulting image and noise sensitivity among others. The statistical CT approaches to anomaly detection can be divided into two groups : Bayesian and non-Bayesian. The dominant trend in the literature is the Bayesian statistical approach [19], [21], [2], [22], [6]. In the case of anomaly detection problem, it is assumed that : i) the considered hypotheses, H0 = {background is anomaly-free} and H1 = {the anomaly is superimposed on the background}, are random events with known prior probabilities; ii) the non-
2
anomalous background (or its structure) and the anomalies are random and the parameters of their (usually Gaussian) a priori distribution are known. In this paper, which continues our previous publication [23], another, non-Bayesian, statistical philosophy is adopted : it is assumed that the background is a deterministic nuisance parameter and, moreover, it is possibly designed by an adversary to mask an anomaly/target. Let us briefly discuss the relative advantages of the Bayesian and non-Bayesian approaches to anomaly detection with nuisance parameters in order to outline the appropriate applications for the non-Bayesian approach. The advantages of Bayesian approach to the problem of statistical anomaly detection are the following : it allows the natural incorporation of prior knowledge; the Bayesian decision rules are simple and efficient (lower probabilities of false alarm and nondetection). A typical application of the Bayesian approach is the biomedical X-ray examinations. Here, the statistical models of non-anomalous medium and anomalies are wellknown. The usage of Bayesian approach seems to be justified especially understanding that the biomedical X-ray examinations deal with low-dose imaging: the statistical decision procedure has to preserve its efficiency with low SNR signals. Let us turn now to other practical problems like QNDT, welding defects detection, or to a recently appeared problem of baggage/container X-ray scanning. In particular, in this last case, the detector cannot be based on an optimistic hypothesis that the Nature which plays against the Detector “kindly” generates the non-anomalous background or anomalies in accordance with a known a priori distribution. A more convenient working hypothesis about the structure of unknown background is the assumption that it is a deterministic nuisance parameter composed of known basis functions. The system of basis functions is defined by the physical nature of the imaged medium, and hence cannot be controlled by an adversary, but the unknown (for the Detector) coefficients of the basis functions superposition can be chosen by an adversary to mask an anomaly/target (a forbidden object in the case of baggage/container X-ray scanning). Hence, the Detector has to find a transformation to split the observation space into two subspaces : the first subspace is completely free from any nuisance and the second one is entirely corrupted by the nuisance. Next, the optimal decision rule is defined as a function of a sub-vector belonging to the first subspace. Let us finish this discussion of different anomaly detection methods by a brief comparison of measurement noise models. There are two main trends in the literature [24], [25], [26], [27] : i) the usage of a Poisson model when the mean number of X-ray photons is low (detection of weak signals in astrophysics, low-dose medical image processing, etc.); ii) the usage of a Gaussian approximation when the mean number of X-ray photons is relatively large (non-destructive testing, baggage/container X-ray scanning, etc.). The methods developed in this paper clearly belongs to the second group. In the frame of our study, the SNR of measurements with respect to the random noise is relatively large; for this reason a Gaussian approximation is applicable. The main difficulty of anomaly detection with nuisance parameters lies in a very low “SNR” of the anomaly with respect to the unknown
background, which is also called the anomaly-to-background ratio in [6]. II. P ROBLEM
STATEMENT AND CONTRIBUTION OF THE PAPER
This paper continues the previous work where a new nonBayesian approach based on the theory of optimal invariant tests has been proposed to solve the problem of anomaly detection [23]. The main assumption of this approach is that the background can be approximated by a (non-)linearly parameterized parsimonious decomposition. Such a parsimonious parametric model allows us to find an observation subspace where the impact of nuisance parameters (background) is absent or negligible. Some examples of such a background defined by deterministic functions can be found in [10], [11]. Because the rejection of the unknown deterministic background can potentially damage the anomaly, the problem of anomaly detectability is extremely important for such a nonBayesian approach. A rudimentary version of the background rejection technique applied in QNDT, i.e. the field-flattening operation [3], whose goal is to enhance the anomaly contrast in radiographies by eliminating the variations of contrast provoked by the background, sometimes leads to the situation where the anomaly becomes undetectable after the fieldflattening operation. The main drawbacks of the majority of the existing anomaly detection methods are the impossibility to calculate a priori the false alarm and non-detection rates [10] and/or their sensitivity to statistical prior information, especially when the estimation of the hyperparameters associated to the prior information is required [2]. The above mentioned non-Bayesian anomaly detection problem raises a variety of issues : 1) What is a lower bound for the probability of nondetection over the class of detectors with a given probability of false alarm and what is the anomaly detection method that attains this lower bound when dealing with an unknown anomaly and an unknown deterministic background ? 2) If the deterministic nuisance parameter (background) is chosen by an adversary to mask the anomaly, how to warrant the detectability of this anomaly ? 3) If the anomaly size is limited, how to bound the maximum size of its detectable projection in the single (resp. multiple) projection case ? 4) In the case of a bivariate polynomial background defined on a rectangular slice, what is the relation between the order of the polynomial function and the maximum size of detectable anomaly projection ? The contribution of this paper consists of responding to the above mentioned issues. First, the parsimonious parametric model of background is briefly described in section III. Second, in contrast to [23], an optimal non-Bayesian statistical test is designed in section IV for the case where the detectability of the anomaly, superimposed on an unknown background, is not a priori warranted. A key issue in the non-Bayesian anomaly detection, i.e. the problem of anomaly detectability, is stated and solved in section V. New results on anomaly
3
detectability are proposed to warrant the detection of limited size anomalies. The criterion of detectability is established, namely, the Kullback-Leibler Information (KLI) is used as a measure of “separability” between two hypotheses : H0 = {the background is anomaly-free} and H1 = {the anomaly is superimposed on the background}. A necessary and sufficient condition of detectability is proposed here. Particularly, it is shown that several projections improves the detectability of a limited size anomaly. A special attention is paid to the case of bivariate polynomial backgrounds defined on a rectangular support in section VI. It is shown that the thickness of the imaged object can be unknown and the size of detectable anomaly reaches its maximum defined by the number of elementary cells of X-ray detector and the degree of the polynomial function. Finally, some concluding remarks are presented in section VII.
X-source
aω (t) bω (t)
=
Z
s(t cos ω − l sin ω, t sin ω + l cos ω)dl,(1)
aω (t)
where the functions t 7→ aω (t) and t 7→ bω (t) define the boundary of the imaged object. In CT, projections are obtained from a linear numerical detector composed of n elementary cells counting the total amount of X-photons passing through the object (see Figure 1). To simplify the presentation, the parallel-beam geometry is used in the paper. The extension of Radon transform from parallel-beam to fan-beam geometry can be found in [28], [29], [30]. Hence, the discrete parallelbeam line integral projection Rω,τ (s), also called the discrete Radon transform of s for the set τ = {t1 , t2 , . . . , tn } of elementary cells is defined by : T
Rω,τ (s) = (Pω (t1 , s), Pω (t2 , s), . . . , Pω (tn , s)) .
(2)
bω
s(l) l = bω (tk ) y X-photon
l
Background (x, y)7→s(x, y)
l1 =
aω
t
D
(x, y)7→g(x, y) Anomaly
t l = aω (tk )
ω O
t, s) P ω(
x
X-photon
III. DATA ACQUISITION AND OBJECT MODELS First, the physical principles related to the data acquisition procedure are briefly introduced via the Radon transform. Next, the parametric model of non-anomalous background is introduced. Finally, the anomaly model is described and the measurement model is defined. A. Description of the imaging system To simplify the presentation, only the two-dimensional imaged objects are discussed. The extension from two- to three-dimensional case is trivial. Let us assume that the attenuation coefficient of the object (or the original scene) (x, y) 7→ s(x, y) is a real-valued square-integrable function of two variables in the x-y coordinate system defined on a compact support D ⊂ R2 , denoted s ∈ L2 (D) (see [22] for a practical justification). This situation is depicted in Figure 1. It is assumed that an X-ray, L(t, ω) say, is defined by two parameters, t and ω, where t is the distance from the origin to the ray and ω is the angle between the y-axis and the ray (see Figure 1). A cartesian equation of L(t, ω) is : x cos ω + y sin ω = t, 0 ≤ ω < π. Let us define the Radon transform Pω (t, s) of s [18] taken at the view angle ω and given by : Z bω (t) Pω (t, s) = s(l)dl
l2 =
tn
t
X-ray detector Elementary cell tk L(tk , ω)
t1
Fig. 1. Geometry of a two-dimensional object imaged with a tomographic system.
B. Parametric background model The background corresponds to the original scene without any anomaly. It is assumed that the attenuation coefficient of background h(x, y) is represented by a linear decomposition : h(x, y) = hµ (x, y) =
m X
µk hk (x, y), ∀(x, y) ∈ D,
k=1
where {h1 , . . . , hm } is a known family of basis functions in L2 (D) and µ = (µ1 , . . . , µm )T is a real-valued vector of unknown parameters. In other words, the unknown background h is assumed to be composed of a finite number of a priori known basis elements hk , representing X-ray attenuation coefficient distributions, with unknown scaling coefficients µk . Hence, the line integral projection is : Z bω (t) X m Pω (t, hµ ) = µk hk (l)dl =
aω (t) k=1 Z bω (t)
m X
k=1
µk
aω (t)
hk (l)dl =
m X
µk Pω (t, hk ). (3)
k=1
and the discrete Radon transform of a basis function is defined by Rω,τ (hk ) = (Pω (t1 , hk ), Pω (t2 , hk ), . . . , Pω (tn , hk ))T , (4) where k = 1, . . . , m. The choice of basis functions is a fundamental dilemma for any parametric approach (Bayesian or non-Bayesian). Two main criteria of this choice are : i) the approximation accuracy; ii) the statistical/numerical complexity of the chosen parametric model. This choice is usually problem-oriented. For this reason, it is difficult to
4
propose any general rules how to choose the basis functions. Some discussion of different parameterizations can be found in [31], [2], [22]. The impact of the parametric model accuracy on the test power function can be found in [23]. Bivariate polynomial background will be discussed in section VI. C. Anomaly model It is desirable to detect an anomaly without a priori information about it. For example,S its support d ⊂ D can be a non-connected domain d = N i=1 di such that di ⊂ D, T di dj = ∅ and N ∈ N∗ (see Figure 1). Let us define the function (x, y) 7→ f (x, y) ∈ L2 (D) representing a local variation of the attenuation coefficient in the original scene due to the presence of the anomaly : g(x, y)−hµ (x, y) if (x, y) ∈ d def. f (x, y) =fg,µ (x, y)= , (5) 0 if (x, y) ∈ D\d where (x, y) 7→ g(x, y), (x, y) ∈ d, is the anomaly attenuation coefficient. D. Measurement model Let us define the attenuation coefficient (x, y) 7→ s(x, y) corresponding to two possible situations : H0 = {the background is anomaly-free} and H1 = {the anomaly is superimposed on the background} : hµ (x, y) under H0 s(x, y) = , ∀(x, y) ∈ D. f (x, y) + hµ (x, y) under H1 (6) Due to the linearity of the operator Rω,τ and putting together equations (2) and (6), we obtain the following measurement model for a particular view angle ω : Hω µ + ξω under H0 Yω = , (7) Rω (f ) + Hω µ + ξω under H1 where Yω ∈ Rn (resp. ξω ∈ Rn ) is a vector of measurements (resp. a vector of noise), ξ ∼ N (0, σω2 In ), the element (0, 0, . . . , 0)T is denoted simply by 0, the variance σ 2 > 0 is assumed to be known, In is the unit matrix of order n, Rω (f ) = Rω,τ (f ) and Hω = (Hω1 , . . . , Hωm ) is an n × m matrix whose columns are defined by Hωk = Rω,τ (hk ) ∈ Rn for k = 1, . . . , m. It is supposed that m < n. If P projections are available, the nP -dimensional vector Y1,...,P , the function R1,...,P (f ) and the nP × m matrix H1,...,P can be easily built from “elementary” components Yωi , Rωi (f ) and Hωi by concatenation. In the rest of the paper, the subscript ω will be omitted to simplify the notations except when the notations can be confused. As it follows from equation (7), the noise ξω due to Xphotons counting is approximated by a gaussian law with the known variance (instead of a Poisson law [27]). Moreover, n the condition of non negativity Yω ∈ {R+ } is not applied. Nevertheless, from the practical point of view, the above mentioned model is relevant in the case of relatively high-dose X-ray examinations and when the variations of the function t 7→ Pω (t, s) are bounded, which is true for a great part of QNDT and baggage/container X-ray scanning. Consequently, the authors assume that equation (7) can be used as a platform
to present and discuss the main elements of the proposed algorithmic solution. IV. D ETECTION
OF AN ANOMALY
First, the hypotheses testing problem with nuisance parameter is stated. Next, a canonical model of observations is derived from the original measurement model (7) by a continuous oneto-one mapping. Finally, an optimal test between two linear composite hypotheses based on the theory of Wald is designed and discussed for canonical and original models. A. Hypotheses testing with nuisance parameters and canonical model The anomaly detection problem consists of deciding which hypothesis is the true one, H0 or H1 , while considering the parameterized background hµ as a nuisance parameter. Sometimes [32], [23], the a priori information on the anomaly can be included in equation (7) by assuming that Rω (f ) = M ψ, where ψ ∈ Rp and M is a full rank matrix of size n × p, describes the anomaly projection, i.e. it is assumed that Y = Hµ + M ψ + ξ and p + rank (H) < n. The details of the hypotheses testing problem H0 = {ψ = 0} against H1 = {ψ 6= 0} can be found in [32], [33], [23]. In particular, to avoid the problem of detectability, it is necessary to assume that RhHi ∩ RhM i = {0}, where RhHi is the column space of H, i.e. the anomaly Rω (f ) = M ψ cannot be hidden in the background. In contrast to [32], [33], [23], it is assumed now that the discrete anomaly projection Rω (f ) = θ is completely unknown in (7). Therefore, the following measurement model will be used in the rest of the paper : Y = Hµ+ξ, under H0 and Y = Hµ+θ +ξ, under H1 . (8) As it has been mentioned above, it is necessary to split the observations space into two subspaces: the first one, RhHi, is entirely corrupted by the nuisance µ and the second one, the orthogonal complement RhHi⊥ of RhHi, is completely free from any nuisance. The theory of invariant tests is usually used [34], [32], [33], [23] to solve this problem. Because now the matrix M is absent, it is simpler to introduce the canonical model to split the observations space into the above mentioned subspaces. Let us assume without loss of generality that H is a full column rank matrix, i.e. rank (H) = q = m. Due to the GramSchmidt orthogonalization [35], the following factorization of H is obtained : H = G U where G is an n × m matrix whose columns form an orthonormal basis for the column space of H, GT G = Im , (where GT is the transpose of G) and U is an n × n nonsingular upper triangular matrix. Hence, the measurement model in equation (8) under H1 can be rewritten as : Y = Gx+ θ + ξ with x = U µ. Since U is nonsingular and µ belongs to the whole space Rm , the matrix U establishes a continuous one-to-one mapping between x and µ. Let us define the n × (n − m) matrix W such that Q = (W G) is an n × n orthogonal matrix. It follows that : Y = QT Y = QT θ + QT Gx + QT ξ = θ + Jx + ζ,
5
where J = (0m,n−m Im )T , 0n,m is the n × m zero matrix and ζ ∼ N (0, σ 2 In ). This leads, under H1 , to the following canonical model : Φ1 ζ1 ζ1 θ1 Y = Φ+ζ = + = + , (9) Φ2 ζ2 ζ2 θ2 + x where Φ1 (resp. Φ2 ) is the sub-vector of Φ containing its n−m first terms (resp. its m last terms). As it follows from equation (9), the mean Φ2 = θ 2 + x of the sub-vector Y 2 extracted from Y is entirely corrupted by the unknown nuisance vector x ∈ Rm . Consequently, only the sub-vector Y 1 = W T Y extracted from Y brings up useful information on the presence/absence of the anomaly. For this reason the initial hypotheses testing problem in equation (8) should be rewritten in such a manner that the sub-vector Φ1 is considered as an informative parameter and the sub-vector Φ2 as a nuisance one. Hence, the anomaly detection problem consists of deciding between : H0 = {Y ∼ N (Φ, σ 2 In ); Φ1 = 0, Φ2 ∈ Rm } and H1 = {Y ∼ N (Φ, σ 2 In ); Φ1 = 6 0, Φ2 ∈ Rm }. (10) The matrix W = (w1 , . . . , wn−m ) satisfies the following conditions : def.
−1
⊥ W T H = 0n−m,m , W W T = PH = In − H(H T H) W T W = In−m ,
HT , (11)
where w1 , . . . , wn−m are the eigenvectors of the projection ⊥ matrix PH corresponding to eigenvalue 1. B. Optimal test for the canonical and original models The quality of a statistical test is defined with the probability of false alarm and the power of the test [36], [37]. The above mentioned hypotheses testing problem defined by equation (10) presents two main difficulties : i) the two hypotheses H0 and H1 are composite and ii) there is an unknown nuisance parameter Φ2 . There is no general way to test between two composite hypotheses especially with nuisance parameters [36], [37]. It is proposed to use here the theory developed by Wald in his fundamental paper [38]. Let us note Γ0 = {Φ | Φ1 = 0, Φ2 ∈ Rm } and Γ1 = {Rn \ Γ0 } = {Φ | Φ1 6= 0, Φ2 ∈ Rm }. Let Kα = {δ : supΦ∈Γ0 PrΦ (δ(Y ) = H1 ) ≤ α}, 0 < α < 1, be a class of tests with an upperbounded false alarm probability, where the probability PrΦ stands for the vector of observations Y being generated by the distribution N (Φ, σ 2 In ) and α is the prescribed upper bound for the probability of false alarm. The power function βδ (Φ) is defined by the probability of anomaly detection βδ (Φ) = PrΦ (δ(Y ) = H1 ), where Φ ∈ Γ1 . Obviously, it depends a priori on the nuisance parameter Φ2 , which is very undesirable. To solve this problem Wald [38] proposes to define a test δ¯ ∈ Kα having uniformly best constant power (UBCP) on the family of surfaces S = {Sc ; c ≥ 0} : i.e. 1) βδ¯(Φ′ ) = βδ¯(Φ′′ ), ∀Φ′ , Φ′′ ∈ Sc ; 2) βδ¯(Φ) ≥ βδ (Φ), ∀Φ ∈ Sc for any test δ ∈ Kα , which satisfies 1), and for any Sc . The solution of the hypotheses testing problem in (10) for the canonical model given by equation (9) is provided by Proposition VI of Wald in [38, p. 461]. Let us briefly recall
this result omitting the mathematical details : for any given ¯ ), defined as follows Φ2 , the test δ(Y ¯ 1 ) = 12 kY 1 k2 < λα H0 if Λ(Y 2 ¯ σ δ(Y ) = , (12) H1 else where the threshold λα is chosen to satisfy the definition of the ¯ 1 ) ≥ λα ) = PrΦ1 =0 (Λ(Y ¯ 1) ≥ class Kα : supΦ∈Γ0 PrΦ (Λ(Y λα ) = α, is UBCP in the class Kα over the family of surfaces S = {Sc ; c > 0} defined by 1 (13) Sc = Φ 2 kΦ1 k22 = c2 , Φ2 ∈ Rm . σ
¯ is distributed according to the χ2 law with The statistics Λ n − m degrees of freedom. This law is still central under H0 and non-central under H1 with the non-centrality parameter c2 = σ12 kΦ1 k22 . Hence, the power function of the test δ¯ is given by : Z +∞ 2 βδ¯(Φ) = βδ¯(Φ1 ) = βδ¯(c ) = pc2 (x)dx, ∀Φ2 ∈ Rm , λα
(14) where x 7→ pc2 (x) is the probability density function (pdf) of the non-central χ2 law with n − m degrees of freedom and the non-centrality parameter c2 . By using the definition of the vector Φ1 = W T θ, the hypotheses testing problem in (10) can be easily rewritten for e 0 = {θ | W T θ = 0} and the original model. Let us note Γ T e Γ1 = {θ | W θ 6= 0}. Hence, the anomaly detection problem in equation (10) consists of deciding between : e 0 , µ ∈ Rm } and H1 = {θ ∈ Γ e 1 , µ ∈ Rm }. (15) H0 = {θ ∈ Γ
It follows from Y 1 = W T Y and equation (11) that the family ⊥ of surfaces S is defined by S = {Sc : σ12 θT PH θ = c2 , c > ¯ 0} and the UBCP test δ over the family S can be rewritten as follows : ⊥ Y = σ12 kY 1 k22 < λα H0 if Λ(Y ) = σ12 Y T PH ¯ δ(Y ) = . H1 else (16) It is worth noting that the generalized likelihood ratio test (GLR) [36], [32] coincides with the the UBCP test given by equation (16). Indeed, it is easy to see that 2 log maxθ∈Rn pθ (Y ) − 2 log maxµ∈Rm pHµ (Y ) = Λ(Y ) wheren the gaussian pdf o is given by pϕ (Y ) = 2 1 1 exp − kY − ϕk with ϕ = θ or ϕ = Hµ. n 2 2σ2 (2π) 2 σn A brief study of the computational cost of the test δ¯ can be found in [23]. Example 1 (Detection of a disk): To illustrate the UBCP test given by equation (16), let us consider the following background hµ , a polynomial function of degree 2 : hµ (x, y) = 1−2 x− y +0.1 x y−x2 +y 2 , (x, y) ∈ D = {(x, y) ∈ R2 |− 10 cm ≤ x, y ≤ 10 cm}.(17) The function (x, y) 7→ hµ (x, y) varies from −120 cm−1 to +110 cm−1 on D. The local variation of the attenuation coefficient due to the presence of the anomaly defined by equation (5) is given by fd if (x − x0 )2 + (y − y0 )2 ≤ rd2 f (x, y) = , (18) 0 else
6
Projection at 50◦ : Y
Original scene
Λ(Y ) = 196.6 < 242.7(= λ0.01)
400
−10
30
Y 1 = WTY
−8
300 20
−6 200
−4
10 100
−2
y
zi
yi 0
0
2
0
-100 -10
4
-200
6
-20 -300
8 -400
10 −10
−8
−6
−4
−2
0
2
4
6
8
0
20
10
40
60
80
100
120
140
160
180
-30
200
x
(a) Original scene
20
40
60
80
100
140
160
180
200
(c) Λ(Y )=315.2 > 242.7(= λ0.01)
400
Anomaly
120
i
(b) Projection at 50◦ : Y
−10
−8
0
i : elementary cell number
60
Y 1 = WTY
50
300
Anomaly
40
−6 200
−4
−2
y
30
Anomaly
100
yi
zi 0
0
2
20
10
0
-100
-10
4 -200
-20
6 -300
-30
8 -400
10 −10
−8
−6
−4
−2
0
2
4
6
8
10
x
0
20
40
60
80
100
120
140
160
180
i : elementary cell number
(d)
(e)
200
-40
0
20
40
60
80
100
120
140
160
180
200
i
(f)
Fig. 2. Detection of a disk masked by a polynomial background : (a) Original scene without anomaly, (b) Projection of (a) at 50◦ , (c) Parity vector Y 1 = W T Y calculated from (b), (d) Original scene with anomaly, (e) Projection of (d) at 50◦ and (f) Parity vector Y 1 = W T Y calculated from (e).
with rd = 0.5 cm, fd = 50 cm−1 and (x0 , y0 ) = (−6.5 cm, 8.4 cm). The background without anomaly is depicted in Figure 2.(a) and with anomaly in Figure 2.(d). Obviously, the coefficients of hµ are unknown for the UBCP test. The attenuation coefficient of the disk does not dominate over the scene. The considered tomographic projections Y are taken at the angle of view ω = 50◦ (the linear X-ray detector orientation is shown by the black straight line plotted on Figures 2.(a) and 2.(d)) of the scene without (Figure 2.(b)) and with the anomaly (Figure 2.(e)). It is assumed that the projection is sampled on n = 200 elementary cells equally spaced between −15 cm and 15 cm with the constant sampling step of 0.15 cm. The projection is then deteriorated by a zero mean white Gaussian measurement noise with the standard deviation σ = 9.3. To reject the nuisance parameters hµ , the parity vector Y 1 = W T Y has been computed. Here, the peak corresponding to the anomaly is clearly visible (see Figures 2.(c) and 2.(f)). Its presence is confirmed by the augmentation of the test statistics Λ(Y ) which becomes greater than the threshold λ0.01 chosen for the probability of false alarm α = 0.01 (see Figure 2.(f)). V. D ETECTABILITY OF
ANOMALIES
The UBCP test proposed in the previous section detects an anomaly by completely eliminating the unknown background considered as a nuisance parameter. The rejection of the
background damages the projection of the anomaly and, hence, the ability of the test to detect the anomaly with a probability greater than α, which inevitably implies the problem of anomalies detectability.
A. Motivation and the general case of anomaly Intuitively, the problem of detectability appears when the statistical decision rule assimilates the anomaly with the background. From purely Bayesian point of view, the probability of such an assimilation is negligible and, logically, this event should not be taken into account. But if we assume that the background is deterministic and it is designed by an adversary to mask an anomaly, then the detectability problem becomes crucially important. Let us start with the general case of anomaly, namely θ ∈ Rn . To represent the phenomenon of detectability, a natural approach consists of studying the degree of “separability” between the hypotheses H0 and H1 . It is well-known that the Kullback-Leibler Information (KLI) is a good measure of the “separability” between two hypotheses [39]. For this reason the KLI is often called a “distance” between the hypotheses H0 and H1 . Moreover, as it follows from [38], in the case of a linear gaussian model, the KLI defines the power of the UBCP test. The KLI ̺(ϕ0 , ϕ1 ) between two parameterized pdf pϕ0
7
and pϕ1 is defined by the following equation [36] : Z pϕ (X) pϕ (Y ) dX = Eϕ0 log 0 . ̺(ϕ0 , ϕ1 ) = pϕ0 (X) log 0 pϕ1 (X) pϕ1 (Y ) Rn After a short algebra one gets : ̺(Hµ0 , θ + Hµ1 ) =
̺(pHµ0 , pθ+Hµ1 ) 1 2 = kθ + Hµ1 − Hµ0 k2 . 2σ 2 To examine the impact of the background rejection on the anomaly detectability, let us fix the vector θ and minimize the KLI with respect to the possible values µ0 and µ1 of the nuisance parameter : def.
̺⊥ H (θ) = =
= =
min
µ0 ∈Rm , µ1 ∈Rm
̺(Hµ0 , θ + Hµ1 )
1 2 kθ + H△µk2 2σ 2 1 T −1 θ In − H(H T H) H T θ 2 2σ 1 T ⊥ θ PH θ. 2σ 2 min
power βδ¯(c2 ). For this reason it seems to be impossible to define the size of anomaly which can be detected in the general case with a warranted power β for a given level of false alarm α and a given background H. To overcome this difficulty, the relation between the size k of detectable anomaly θ and the properties of the background H, i.e. the theoretical relation between k and H such as ̺⊥ H (θ) > 0, will be established in the rest of the paper. B. Definition of the detectability : an unlimited size anomaly Let us now define the notion of detectability in the case of a single tomographic projection and an unlimited size anomaly. ⊥ It follows from the properties of the matrices PH and W that the two following definitions are equivalent. Definition 1 (detectability): A given anomaly projection θ′ 6= 0 is detectable if and only if :
△µ∈Rm
def.
′ ̺⊥ H (θ ) =
(19)
ce ur so X-
t yi
Long anomaly
n oto ph X-
n oto ph X-
ti
ray X-
10
8
te de r cto
6
4
ω2 = 140◦
2
0
-2
n oto ph X-
-4
t -6
n oto ph X-
c2 = e1 (θ, σ 2 ) − e2 (θ, PH , σ 2 , p, k), (20) P p+k−1 where e1 (θ, σ 2 ) = σ12 i=p θi2 is the SNR of the anomaly Pp+k−1 Pp+k−1 θ, e2 (θ, PH , σ 2 , p, k) = σ12 i=p θi θj ph;i,j is the j=p SNR of the anomaly weighted by using the coefficients ph;i,j which reflect the masking effect of the unknown nuisance (deterministic background) and ph;i,j is the element (i, j) of the matrix PH . The discussion of the impact of p, k, anomaly SNR and PH on the probability of detection can be found in [23] (see pages 2220-2221, equation (20), Figures 3(a,b) and Figure 4). In particular it is shown that the translation p → p + i of the anomaly can imply a drastic variation of the second term e2 (θ, PH , σ 2 , p, k) and, consequently, of the
̺(Hµ0 , θ′ + Hµ1 ) > 0.
Definition 2 (detectability): A given anomaly projection ⊥ ′ θ′ 6= 0 is detectable if and only if PH θ 6= 0, or equivalently, T ′ W θ 6= 0. Hence, if the anomaly belongs to the null space of the matrix ⊥ PH then it is completely eliminated by the rejection of the background and the UBCP test δ¯ becomes “blind” to the anomaly. Example 2 (Detectable and undetectable anomalies): To illustrate the above mentioned definitions, let us discuss about the detectability of an anomaly superimposed on the polynomial background previously defined in Example 1. In
ce ur so X-
As it follows from equation (14), the power function c2 7→ βδ¯(c2 ) of the UBCP test δ¯ given by equation (16) is completely 2 defined by the non-centrality parameter 2̺⊥ H (θ) = c for a given probability α of false alarm. Because the rank of the ⊥ matrix PH is less than n, it is straightforward to verify that minθ∈Rn ̺⊥ H (θ) = 0. This minimum is reached for θ belonging ⊥ to the null space of the projection matrix PH , i.e. for θ such ⊥ T that PH θ = 0 (or W θ = 0). Therefore, if the anomaly vector θ entirely occupies the space Rn , the detectability cannot be warranted. This fact justifies a posteriori the choice of the null hypothesis H0 in (10) and (15). An interesting idea to consider the detectability as a relation between the smallest size of the “target” and the SNR for given probabilities of false alarm and non detection is proposed in [40]. Unfortunately, in the presence of nuisance parameters such a relation is difficult, if not impossible, to obtain. Let us consider an anomaly θ = T (0, . . . , 0, θp , . . . , θp+k−1 , 0, . . . , 0) of size k with its first position at p. As it follows from equation (19), the noncentrality parameter c2 (and the power βδ¯(c2 )) is a function of the anomaly SNR, the capacity of the background to mask −1 the anomaly expressed by the matrix PH = H(H T H) H T and the position/size of the anomaly given by p and k :
inf
µ0 ∈Rm , µ1 ∈Rm
-8
-10 -10
-8
-6
-4
-2
0
2
4
6
8
y ra X-
Short anomaly
ti
10
te de
r cto
yi
ω1 = 50◦
Fig. 3. An original scene is composed of a polynomial background and a long anomaly (solid line) or a short anomaly (dotted line). The tomographic projections are taken at the view angles ω1 = 50◦ and ω2 = 140◦ .
contrast to Example 1 the linear detector is composed now of 10 equally spaced elementary cells (see Figure 3). To simplify the presentation, it is assumed that the measurement noise ξ is equal to 0. Two different anomalies are considered (see Figure 3) : a long one, outlined by a solid line and a short one, hatched anomaly, outlined by a dotted line. The simulated
8
2
400
Long anomaly Short anomaly
300
200
yi
1.5
zi
1
100
0.5
0
0
-100
-0.5
replacements -1
-200
-300
-400
Difference between the two projections 1
2
3
4
Long anomaly Short anomaly
-1.5
5
6
7
8
9
10
-2
1
2
3
4
5
6
7
8
i : elementary cell number
i : elementary cell number
(a)
(b)
9
10
Fig. 4. Detectability of two different anomalies (long and short) depicted in Figure 3 : (a) tomographic projections Y at 50◦ of the two anomalies and (b) ⊥ Y calculated from (a) for the above mentioned anomalies. residual vectors Z = PH
tomographic projections Y at the view angle ω1 = 50◦ are depicted in Figure 4.(a). At the naked eye, it appears that these tomographic projections are quite similar : they slightly differ only for the 6-th elementary cell (Figure 4.(a)). ⊥ The residual vectors Z = PH Y (the background is rejected) are shown on Figure 4.(b). It is easy to see that the long anomaly is completely eliminated by the nuisance rejection ⊥ because it belongs to the null space of the matrix PH whereas the short one is still visible because it does not belong to the null space. In the case of noisy tomographic projections, the same reasoning can be applied to ̺⊥ H (θ). C. The single projection case : a limited size anomaly Let us consider now a limited size anomaly, i.e. the case where the number of non-zero elements of θ is upper bounded. Given n-tuple θ = (θ1 , . . . , θn )T ∈ Rn , we define the support of θ by supp (θ) = {i : 1 ≤ i ≤ n | θi 6= 0} and its cardinality, i.e. the number of non-zero elements of θ, by card (θ) = card (supp (θ)). For example, if θ = T (0 1 4 7 0 3) , then supp (θ) = {2, 3, 4, 6} and card (θ) = 4. Definition 3 (k-detectability): The anomaly projections of size k are detectable, or k-detectable, if and only if all projections θ, such that card (θ) ≤ k, are detectable. Sometimes, it is important to know the maximum size of anomaly projection which is detectable. Definition 4 (Order of detectability): Let us call the order of detectability of a tomographic projection the maximal cardinality of detectable anomaly projections θ. Let us comment on Definitions 1 - 4. Definitions 1 and 2 represent a property of a given anomaly projection θ′ . In contrast, Definition 3 represents a property of a set of projections θ, such that card (θ) ≤ k and, finally, Definition 4 represents a property of a given matrix H (or a nuisance subspace RhHi = {Y |Y = Hµ, µ ∈ Rm }). The following lemma establishes the relation between the kdetectability of limited size anomaly projections and the linear dependency of the rows of H.
Lemma 1: Let us denote by Jnk the set of all possible k-subsets 1 I = {i1 , . . . , ik }. Consider the sub-matrix HI obtained from H by deleting the rows i1 ,. . . ,ik . The anomaly projections are k-detectable if and only if rank (HI ) = rank (H) for all I ∈ Jnk . Proof: see Appendix I. It follows immediately from Lemma 1 that the order of detectability cannot be greater than n − rank (H). D. The several projections case : a limited size anomaly Let us assume now that P “elementary” projections Y1 , . . . , YP taken at the angles ω1 , . . . , ωP are available. We wish to establish the relation between the orders of detectability k1 , . . . , kP of P “elementary” projections and the order of detectability k1,...,P of the concatenated projection. Lemma 2: Let us assume that the concatenated nP × m matrix H1,...,P is composed of non-null n × m matrices H1 , . . . HP of the same rank q. The order of detectability k1,...,P for P tomographic projections verifies the following inequality : P (1 + min ki ) − 1 ≤ k1,...,P ≤ nP − q. 1≤i≤P
(21)
Proof: see Appendix II. It is worth noting that the advantage of several projections can be more significant than the lower bound warranted by Lemma 2. A view angle with a low order of detectability can be compensated by a view angle with a bigger order of detectability. Example 3 (Detectable and undetectable anomalies - contd.): To illustrate how several tomographic projections bring the improvement in detectability, let us consider again Example 2. Now the tomographic projections of the scene depicted in Figure 3 are taken at two different view angles ω1 = 50◦ and ω2 = 140◦ . Each projection is sampled on 10 equally spaced elementary cells. It is again assumed that the measurement noise ξ is equal to 0. The concatenated 1A
k-subset is a subset of a set on n elements containing exactly k element.
9
800
300
Long anomaly Short anomaly
Long anomaly Short anomaly
250
600 200
yi 400
zi
Differences between the two vectors
150
100
200
50 0 0
-200
-50
-100 -400
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
-150
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
i : elementary cell number (concatenated vector)
i : elementary cell number (concatenated vector)
(a)
(b)
20
Fig. 5. Detectability of two different anomalies (long and short) depicted in Figure 3 : (a) concatenated vectors Y obtained from two tomographic projections ⊥ Y calculated from (a) for the above taken at the view angles : ω1 = 50◦ and ω2 = 140◦ for the long and short anomalies and (b) residual vectors Z = PH mentioned anomalies.
vector of measurements Y = (Y1T Y2T )T is obtained from the “elementary” projections Y1 and Y2 . The vector Y is calculated twice : the first time with the long anomaly and the second time with the short one (see Figure 3). The results are presented in Figure 5.(a). The two vectors are identical except for the 6-th and 16-th cells. The background is eliminated by the rejection process and results are depicted in Figure 5.(b). The large anomaly which was undetectable in Example 2 with a single projection is now detectable with two projections. VI. A NOMALY
DETECTION IN THE CASE OF A POLYNOMIAL BACKGROUND
The goal of this section is to discuss the problem of anomaly detectability in the case of a polynomial background. In this particular case and under the simplifying working hypothesis that the background is defined on a rectangle Dr viewed as a region of interest, the redundancy between the rows of the matrix H is directly related to the physical properties of the background. The order of detectability is defined as a function of the degree of a polynomial background. Finally, a numerical simulation illustrates the theoretical results by studying the statistical properties of the UBCP test. A. Polynomial background defined on a rectangle An efficient method to eliminate an unknown background with large and low frequency variations and to enhance the contrast of radiographies consists of modeling this undesirable background by using polynomial functions. For example, such a polynomial fitting is used in QNDT [41], [11]. To simplify the presentation and without loss of generality, only bivariate polynomial backgrounds composed of binary forms will be considered in the rest of the paper : hµ (x, y) =
m X k X
k=0 j=0
µk−j,j xk−j y j , ∀(x, y) ∈ D
(22)
where m is the degree of the polynomial background and µ = -dimensional (µ0,0 , µ1,0 , µ0,1 , . . . , µ0,m )T is the (m+1)(m+2) 2 parameter vector. Let us return to our basic measurement model given by equation (8) with the above mentioned param< n. As it follows eter vector µ. It is assumed that (m+1)(m+2) 2 from the definition of the Radon transform of a basis function Rω,τ (hk ) given by equations (3) - (4), the n × (m+1)(m+2) 2 matrix H is directly related to the shape of the boundary of D via aω (t) and bω (t) (see Figure 1). It seems that the analysis of the linear dependency between the rows of the matrix H for an unspecified boundary of D is impossible. To simplify the problem, let D be a rectangle Dr = [t1 , tn ]×[l1 , l2 ], where l1 = aω and l2 = bω (aω < bω ) in a coordinate system formed by t- and l- axes (see Figure 1). From the practical point of view, the set Dr can be viewed as a region of interest which should be inspected carefully. B. Detectability in the case of a polynomial background defined on a rectangle As it follows from Lemma 1, to examine the k-detectability of a limited size anomaly it is required to compute the rank for n! Cnk = k!(n−k)! matrices. Because k is unknown, it is necessary to repeat this operation for different 1 ≤ k ≤ n − q. For this reason if n is large enough, this problem becomes infeasible. Fortunately, in the case of polynomial background (22) defined on a rectangle Dr , it is possible to study the structure of the matrix H. The subtlety of this particular case consists of the fact that the integral projection defined by equation (3) of the polynomial background (x, y) 7→ hµ (x, y) defined by equation (22) on the rectangle Dr is again a polynomial function of t. The degree of this new polynomial function is equal to m. The proofs and mathematical details can be found in Appendix III. Moreover, it is shown in Appendix III that matrix H the following factorization of the n × (m+1)(m+2) 2 holds : H = Vτ U (23)
10
where Vτ is the n × (m + 1) rectangular Vandermonde matrix defined for the set τ = {t1 , . . . , tn } and U is an (m + 1) × (m+1)(m+2) matrix in echelon form [35, p. 18]. It is well2 known that rank (Vτ ) = m + 1. It is shown in Appendix III that rank (H) = rank (U ) = m + 1. Due to this factorization, the measurement model can be rewritten as follows : Y = Hµ + θ + ξ = Vτ µ ˜ + θ + ξ or m X yi = µ ˜j tji + θi + ξi , i = 1, . . . , n,
(24)
j=0
where the m + 1 dimensional vector µ ˜ represents the coefficients of the univariate polynomial function, tomographic projection of the background, and the following lemma can be proved : Lemma 3: Let us define a set τ = {t1 , . . . , tn } of sampling points (ti is the position of the i-th elementary cell) such that ti 6= tj for any pair (ti , tj ) ∈ τ × τ . For a polynomial background of degree m defined on a rectangular support Dr , anomalies are (n − m + 1)-detectable. Proof: see Appendix IV. Therefore, in the case of a polynomial background defined on a rectangular support, the order of detectability reaches its maximum (n−m+1) defined by the degree of the polynomial function representing the integral projection of the polynomial background or by the rank of H. Exactly the same reasoning can be applied to some other types of boundaries aω (t) and bω (t). Example 4 (Impact of the shape of D on the detectability): Let us return to our basic example 1. The background is defined by equation (17). Now the oval shape anomaly is specially designed to be undetectable. The results are depicted in Figure 6.(a,b,c). Let us change the shape of the support from non-rectangular to rectangular (see Figure 7.(a)). The same anomaly becomes detectable. This situation is shown in Figure 7.(b,c). C. Discussion Let us finally comment on the above mentioned results. First, as it follows from Appendix III, the values l1 and l2 can be unknown. Indeed, their impact on the integral projection is strictly limited to the regression coefficients ak (ω, i, j) (see equation (27)) which are supposed to be unknown anyway. This fact leads to a more realistic experimental context when the thickness of the imaged object is unknown. In the general case (see [23]), the unknown geometry of the imaged object immediately leads to a non-linear parametric parsimonious model Y = H(µ) + θ + ξ instead of a linear one Y = Hµ + θ + ξ. It is worth noting that a non-linear parametric model is less tractable than a linear one because of serious mathematical problems with the design of an optimal test. This fact stresses the relevance of the bivariate polynomial background defined on a rectangle Dr which allows us to apply a less complicated and more efficient mathematical tool. Second, let us assume that the background (x, y) 7→ h(x, y) is approximated with a certain non-random error (x, y) 7→ ψ(x, y) by a parametric model, i.e. h(x, y) = hµ (x, y) +
ψ(x, y). Because of mismatches between hµ and h, the line integral projection of ψ(x, y) produces an additional term Rω (ψ) in equation (7). This leads to the following “approximated” measurement model Y = Hµ + θ + Rω (ψ) + ξ. As it follows from Definitions 1 and 2, the study of anomaly detectability concerns only the “regular” part Hµ of the background model. Some theoretical analysis of the impact of the “irregular” part Rω (ψ) on the power function βδ¯ can be found in [23]. VII. C ONCLUSION The detection of an anomaly hidden in the background from a few noisy tomographic projections has been considered in the paper as a non-Bayesian hypotheses testing problem with nuisance parameters. To fill up the gap in the missing data, a parametric linear parsimonious model has been used to describe the nuisance parameter (background). In the case of a few available projections, the full pixel-by-pixel reconstruction of the imaged medium is impossible and this new approach allows us to overcome this difficulty. It is assumed that the vector of nuisance parameter defining the attenuation coefficient of the background can be chosen by an adversary to mask the anomaly. An optimal non-Bayesian (UBCP) test has been designed to detect an anomaly superimposed on the unknown deterministic background, see equations (12) and (16). It is worth noting that this test coincides with the GLR test. A new problem of anomaly detectability, crucially important for non-Bayesian philosophy, has been stated (see Definitions 1 - 4) and solved (see Lemmas 1 and 2). These theoretical results provide the readers with an algorithm to calculate the size of anomaly whose detectability is warranted independently of the desire of adversary wishing to mask it. This theory is applicable in the single and multiple projection cases. The most advanced results have been achieved in the special case of bivariate polynomial background defined on a rectangular support. Here, a special re-parametrization (see equation (24)) permits us to reduce the initial measurement model to a univariate polynomial function and to prove that the size of detectable projection attains its maximum (see Lemma 3). Moreover the thickness of the imaged object can be unknown that leads to a more realistic experimental context and seriously simplify the theoretical difficulties by avoiding the usage of non-linear parsimonious model. A PPENDIX I P ROOF OF L EMMA 1 Without any lost of generality we can assume that H is a full column rank matrix, i.e. rank (H) = q = m. Let us fix a support (k-subset) I = {i1 , . . . , ik } ∈ Jnk . It is assumed that n ≥ k + m. Any anomaly projection (AP) θI ∈ Rn can be represented as follows : X θI = θi ei , θi ∈ R, i∈I
where ei = (δ1,i , . . . , δi,i , . . . , δn,i )T ∈ Rn with δk,i = 1 if k = i and 0 otherwise. The APs defined on the support I ⊥ are detectable if and only if PH θI 6= 0, or equivalently, θI ∈ / R (H). Let us denote by FI = span{ei1 , . . . , eik } the linear
11
Projection at 50◦ : Y
Original scene
Λ(Y ) = 212.9 < 242.7 (= λ0.01 ) 30
400
−10
Y 1 = WTY 300
−8
20
Anomaly −6
200 10
−4 100
yi
−2
zi 0
0
0 -100
2
-10 -200
4
-20
6
-300
8
10 −10
-400
−8
−6
−4
−2
0
2
4
6
8
0
20
40
60
80
100
120
140
160
180
-30
200
0
20
i : elementary cell number
10
(a)
40
60
80
100
120
140
160
180
200
i : elementary cell number
(b)
(c)
Fig. 6. Detectability of a limited size anomaly in the case of a non-rectangular support : (a) Original scene with a limited size anomaly, (b) Projection of (a) at 50◦ and (c) Parity vector calculated from (b) : the anomaly is undetectable.
Projection at 50◦ : Y
Original scene
Λ(Y ) = 423.6 > 242.7 (= λ0.01 ) 20
250
−10
Y 1 = WTY
−8 200
Anomaly
−6
15
Anomaly
10
Anomaly
150
−4
yi
−2
zi
5
100
0
0
2
50 -5
4 0
-10
6
8
10 −10
-50
−8
−6
−4
−2
0
(a)
2
4
6
8
10
0
20
40
60
80
100
120
140
160
i : elementary cell number
(b)
180
200
-15
0
20
40
60
80
100
120
140
160
180
200
i : elementary cell number
(c)
Fig. 7. Detectability of a limited size anomaly in the case of a rectangular support : (a) Original scene with a limited size anomaly, (b) Projection of (a) at 50◦ and (c) Parity vector calculated from (b) : the anomaly is detectable.
space spanned by vectors {ei1 , . . . , eik }. Hence, the APs with the support I are detectable if and only if FI ∩ R (H) = {0}, or equivalently the set of vectors {ei1 , . . . , eik , H 1 , . . . , H m }, where H 1 , . . . , H m are m columns of H spanning the column space R (H), is linearly independent, or equivalently, the rank e I = ei1 . . . ei H 1 . . . H m of the n×(k +m) matrix H k is equal to k + m. Therefore, to prove Lemma 1, it is necessary to show that e I = k + m if and only if rank (HI ) = rank (H) = rank H e I = k+m. Let us denote m. First, let us assume that rank H
e I\{i } a sub-matrix extracted from H e I by deleting the by H 1 i1 -th row and the first column (whose terms are zero after e I\{i } = deleting the i1 -th row). It is obvious that rank H 1 e I −1 = k+m−1. Let us repeat this operation for all rank H e rows indexed by i2 ,. . . ,ik . It follows that ,...,ik } = HI HI\{i1 e e and rank (HI ) = rank HI\{i1 ,i2 ,...,ik } = rank HI − k = m. Hence, rank (HI ) = rank (H) = m.
Second, let us consider an (n − k) × m full column rank matrix HI . Suppose that the set (h1 , . . . , hm ) of linearly independent vectors (rows of the matrix HI ) spans the row
space of the matrix HI . Let us denote by (e h1 , . . . , e hm , ai1 ) the set of m + 1-dimensional vectors, where e hi = (hTi , 0)T , i = 1, . . . , m, and ai1 = (hi1 ,1 , . . . , hi1 ,m , 1)T . It is obvious that the dimension of this linear subspace is equal to m+1. By repeating this operation for all rows indexed by i2 ,. . . ,ik , we get the subspace of dimension k + m (the vector aij is given by aij = (hij ,1 , . . . , hij ,m , 0, . . . , 0, 1)T for j = 2, . . . , k). e I are calculated as linear The missing rows of the matrix H combinations of the basis rows. Finally, by interchanging the e I . These operations rows (resp. columns) we get the matrix H e I = k + m. do not change the rank, for this reason rank H This reasoning can be applied to all possible supports I ∈ Jnk , which ends the proof. A PPENDIX II P ROOF OF L EMMA 2 An upper bound follows immediately from Lemma 1. The proof of a lower bound is based on the fact that the anomaly projections are k-detectable if and only if all projections θ : card (θ) = k are detectable (see Definition 3). For this reason, deleting r = P (1 + min1≤i≤P ki ) − 1 rows of the matrix H1,...,P should not lead to a matrix (H1,...,P )I of a
12
lowest rank, where I = {i1 , . . . , ir } is an arbitrary chosen r subset of the set of indexes JnP , i.e. rank ((H1,...,P )I ) = rank (H1,...,P ) = q should be warranted. Let us assume that ri : 0 ≤ i ≤ n rows are deleted from the sub-matrix Hi . A short calculation shows that P X
ri = nP − r = P ( max (n − ki ) − 1) + 1. 1≤i≤P
i=1
(25) H i,j =
We make a reasoning by absurd : let us assume that rj < max1≤i≤P (n − ki ) for all j : 0 ≤ j ≤ n. This leads to the PP following inequality : i=1 ri < P (max1≤i≤P (n−ki )−1)+1 which is in contradiction with equation (25). Therefore, ∃j0 : 1 ≤ j0 ≤ P such that rj0 > max1≤i≤P (n − ki ) − 1 or, that is equivalent, kj0 = n − rj0 ≤ min1≤i≤P ki . Consider the submatrix (Hj0 )Ij0 obtained from the sub-matrix Hj0 by deleting rj0 rows. Because kj0 ≤ min1≤i≤P ki , the rank of (Hj0 )Ij0 is equal to q. This completes the proof. A PPENDIX III P ROOF OF H = Vτ U Let us consider the bivariate polynomial background given by equation (22) and defined on a rectangle Dr . First, it is assumed that a view angle ω satisfies the following inequalities : ω 6= 0 and ω 6= π/2. Let us show that the integral projection t 7→ Pω (t, hi,j ) of the polynomial term hi,j (x, y) = xi y j , is a polynomial function of degree i + j defined for real t. By putting together equation (1) and the Newton binomial theorem, we establish the following result : Z bω i j (t cos ω − l sin ω) (t sin ω + l cos ω) dl Pω (t, hi,j ) = =
aω j i X X
Let us define the n × (m + 1) rectangular Vandermonde matrix Vτ = (T 0 T 1 . . . T m+1 ) where T k = T (tk1 tk2 . . . tkn ) and the (m + 1)-dimensional vector Ui,j = T (a0 (ω, i, j) a1 (ω, i, j) . . . ai+j (ω, i, j) 0 . . . 0) where the terms after the non-zero term ai+j (ω, i, j) are all zero. It follows immediately from (27) that :
γ(ω, i, j, u, v) tu+v ,
(26)
u=0 v=0
where (i−u)
γ(ω, i, j, u, v) = l(ω, i, j, u, v) Ciu Cjv (−1)
cosu ω·
· sinv ω sin(i−u) ω cos(j−v) ω
i+j X
ak (ω, i, j) T k = Vτ Ui,j ,
(28)
k=0
and it can be concluded that : H = Vτ U where U = (U0,0 U1,0 . . . U0,m ) is an (m + 1) × (m+1)(m+2) 2 matrix in echelon form [35, p. 18]. Because the terms ai+j (ω, i, j) are all non-zero, the rank of the matrix U is equal to m+1. Let us calculate the rank of the matrix H by using the theorem for the rank of the product of matrices [42, p. 398] : rank (H) = rank (Vτ U ) = rank (U ) − dim[RhU i ∩ N (Vτ )], where the null space of A is denoted by N (A). If the elementary cell positions {t1 , . . . , tn } are distinct, rank (Vτ ) = m+1. Hence, N (Vτ ) = {0} and dim[RhU i∩N (Vτ )] = 0. Therefore, rank (H) = m + 1. Exactly the same reasoning with some simplifications can be applied to the special cases ω = 0 and ω = π2 which completes the proof. A PPENDIX IV P ROOF OF L EMMA 3 The proof immediately follows from Lemma 1 and equation n−(m+1) (23). Let us fix I ∈ Jn . It results from equation (23) that HI = Vτ I U , where Vτ I is a square Vandermonde matrix of order m + 1. If the set τ of elementary cell positions is chosen such that ti 6= tj for i 6= j, then the square Vandermonde matrix Vτ I is non-singular. By the same reasoning as in Appendix III (theorem for the rank of the product of matrices), we get rank (HI ) = m + 1 = rank (H). n−(m+1) Applying this reasoning to all supports I ∈ Jn completes the proof.
and l(ω, i, j, u, v) =
ACKNOWLEDGMENT
bω i+j−(u+v)+1 − aω i+j−(u+v)+1 . i + j − (u + v) + 1
The authors are grateful to Dr. Florent Retraint for helpful discussions related to this work, as well as to the reviewers for their constructive comments.
Hence, we get : Pω (t, hi,j ) =
i+j X
ak (ω, i, j) tk ,
(27)
R EFERENCES
k=0
Pk where ak (ω, i, j) = u=0 γ(ω, i, j, u, k−u) and it is assumed by convention that γ(ω, i, j, u, v) = 0 when u > i or v > j. Since ai+j (ω, i, j) = (bω − aω ) cosi ω sinj ω 6= 0, t 7→ Pω (t, hi,j ) is a polynomial function of degree i + j defined for real t. Let us denote the n-dimensional discrete Radon transform of the term hi,j (x, y) = xi y j by H i,j = Rω,τ (hi,j ). columns The matrix H is composed of (m+1)(m+2) 2 H i,j , i.e. H = H 0,0 H 1,0 H 0,1 H 2,0 . . . H 0,m .
[1] Y. Bresler and A. Macovski, “Three-dimensional reconstruction from projections with incomplete and noisy data by object estimation,” IEEE Trans. Acoust., Speech, Signal Processing, vol. 35, no. 8, pp. 1139–1151, august 1987. [2] J. Idier (Ed.), Approche bay´esienne pour les probl`emes inverses, ser. Traitement du signal et de l’image. Herm`es, Lavoisier (Trait´e IC2), 2001. [3] L. E. Bryant, Nondestructive tesing handbook (second edition). American Society for Nondestructive Testing, 1985, vol. 3 : radiography & testing. [4] A. Rius, G. Ruffini, and L. Cucurull, “Improving the vertical resolution of ionospheric tomography with GPS occultations,” Geophys. Res. Lett., vol. 24, pp. 2291–2294, 1997.
13
[5] D. J. Rossi and A. S. Willsky, “Reconstruction from projections based on detection and estimation of objects - parts I and II : performance analysis and robustness analysis,” IEEE Trans. Acoust., Speech, Signal Processing, vol. 32, no. 4, pp. 886–906, august 1984. [6] A. Frakt, W. Karl, and A. Willsky, “A multiscale hypothesis testing approach to anomaly detection and localization from noisy tomographic data,” IEEE Trans. Image Processing, vol. 7, no. 6, pp. 825–837, june 1998. [7] E. Clarkson and H. H. Barrett, “Approximations to ideal-observer performance on signal-detection tasks,” Applied optics, vol. 39, no. 11, pp. 1783–1793, 2000. [8] M. D. Wheeler and I. Katsushi, “Sensor modeling, probabilistic hypothesis generation, and robust localization for object recognition,” IEEE Trans. Pattern Anal. Machine Intell., vol. 17, no. 3, pp. 252–265, march 1995. [9] D. Mery, T. Jaeger, and D. Filbert, “A review of methods for automated recognition of casting defects,” Journal of The British Institute of NonDestructive Testing, vol. 44, no. 7, pp. 428–436, 2002. [10] I. Kazantsev, I. Lemahieu, G. Salov, and R. Denys, “Statistical detection of defects in radiographic images in nondestructive testing,” Signal Processing, vol. 82, no. 5, pp. 791–801, 2002. [11] G. Wang and T. W. Liao, “Automatic identification of different types of welding defects in radiographic images,” NDT&E International, vol. 35, no. 8, pp. 519–528, 2002. [12] H. Boerner and H. Strecker, “Automated x-ray inspection of aluminium casting,” IEEE Trans. Pattern Anal. Machine Intell., vol. 10, no. 1, pp. 79–91, 1988. [13] A. Kehoe and G. Parker, “An intelligent knowledge based approach for the automated radiographic inspection of castings,” NDT&E International, vol. 25, no. 1, pp. 23–36, 1992. [14] S. Lawson and G. Parker, “Intelligent segmentation of industrial radiographic images using neural networks,” in Machine Vision Applications ans Systems Integration III, Proc. of SPIE, vol. 2347, 1994, pp. 245–255. [15] A. Gayer, A. Saya, and A. Shiloh, “Automatic recognition of welding defects in real-time radiography,” NDT International, vol. 23, no. 4, pp. 131–136, 1990. [16] D. Mery and D. Filbert, “Automated flaw detection in aluminum castings based on the tracking of potential defects in a radioscopic image sequence,” IEEE Trans. Robot. Automat., vol. 18, no. 6, pp. 890–901, 2002. [17] K. Castleman, Digital Image Processing, 2nd ed. Prentice-Hall, 1996. [18] F. Natterer, The Mathematics Of Computerized Tomography. John Wiley & Sons, 1986. [19] G. Demoment, “Image reconstruction and restoration : overview of common estimation structures and problems,” IEEE transactions on acoustics, speech and signal processing, vol. 37, no. 12, pp. 2024–2036, 1989. [20] M. Davidson, “The ill-conditionned nature of the limited angle tomography problem,” SIAM Journal of applied mathematics, vol. 42, no. 3, pp. 428–448, 1983. [21] E. Miller and A. Willsky, “Multiscale, statistical anomaly detection analysis and algorithms for linearized inverse scattering problems,” Multidimensional Systems and Signal Processing, vol. 8, pp. 151–184, 1997. [22] H. H. Barrett and K. J. Myers, Foundations of Image Science. John Wiley & Sons, 2004. [23] L. Fillatre and I. Nikiforov, “A statistical detection of an anomaly from a few noisy tomographic projections,” Journal of Applied Signal Processing, Special issue on advances in intelligent vision systems: methods and applications-Part II, no. 14, pp. 2215–2228, 2005. [24] G. R. Tillack, C. Nockemann, and C. Bellon, “X-ray modelling for industrial applications,” NDT&E International, vol. 33, no. 1, pp. 481– 488, 2000. [25] B. Chalmond, F. Coldefy, and B. Lavayssiere, “Tomographic reconstruction from non-calibrated noisy projections in non-destructive evaluation,” Inverse Problems, vol. 15, no. 2, pp. 399–411, 1999. [26] F. Retraint, “Contrˆole non destructif de pi`eces industrielles par radiographie : caract´erisation quantitative et reconstruction 3D a` partir d’un nombre limit´e de vues,” PhD dissertation, INSA-Lyon, 1998. [27] I. Elbakri and J. Fessler, “Statistical image for reconstruction for polyenergetic x-ray computed tomography,” IEEE Trans. Med. Imag., vol. 21, no. 2, pp. 89–99, 2002. [28] D. Parker, “Optimal short scan convolution reconstruction for ban-beam CT,” Med. Phys., vol. 9, pp. 254–257, 1982. [29] A. Naparstek, “Short-scan fan-beam algorithm for CT,” IEEE Trans. Nucl. Sci., vol. ns27, pp. 1112–1120, 1980.
[30] T. Peters and R. Lewitt, “Computed tomography with fan-beam geometry,” Journal of Computer Assisted Tomography, vol. 1, pp. 429–436, 1977. [31] K. M. Hanson and G. W. Wecksung, “Local basis-function approach to computed tomography,” Applied Optics, vol. 24, pp. 4028–4039, 1985. [32] L. Scharf and B. Friedlander, “Matched subspace detectors,” IEEE Trans. Signal Processing, vol. 42, no. 8, pp. 2146–2157, 1994. [33] M. Fouladirad and I. Nikiforov, “Optimal statistical fault detection with nuisance parameters,” Automatica, vol. 41, no. 7, pp. 1157 – 1171, jul 2005. [34] L. Scharf, Statistical Signal Processing. Detection, Estimation, and Time Series Analysis. Addison-Wesley, 1991. [35] C. Rao, Linear statistical inference and its applications (second edition). John Wiley & Sons, 1973. [36] A. A. Borovkov, Mathematical Statistics. Gordon and Breach Sciences Publishers, Amsterdam, 1998. [37] E. Lehman, Testing Statistical Hypotheses, Second Edition. Chapman & Hall, 1986. [38] A. Wald, “Tests of statistical hypotheses concerning several parameters when the number of observations is large,” Trans. Amer. Math. Soc., vol. 54, pp. 426–482, 1943. [39] M. Basseville and I. V. Nikiforov, Detection of abrupt changes : theory and application. Prentice Hall, 1993. [40] M. Shahram and P. Milanfar, “Imaging below the diffraction limit: A statistical analysis,” IEEE Transactions on Image Processing, vol. 13, no. 5, pp. 677–689, may 2004. [41] R. Palenichka, A. Alekseichuk, and U. Zscherpel, “Flaw detection in radiographic images by structure-adaptive binary segmentation,” in Computerized tomography for industrial applications and image processing in radiology. DGZfP Proceedings, march 1999, pp. 221–232. [42] D. A. Harville, Matrix algebra from a statistician’s perspective. Springer-Verlag, 1997.