IEEE TRANSACTIONS ON INSTRUMENTATION AND MEASUREMENT, VOL. 60, NO. 3, MARCH 2011
1061
Automatic Detection, Estimation, and Validation of Harmonic Components in Measured Power Spectra: All-in-One Approach Kurt Barb´e and Wendy Van Moer
Abstract—The detection of periodic components buried in noise is a general problem in various engineering fields. The amplitudes in the frequency domain of a disturbed signal follow Rice distribution, which is fully described by two parameters. Most methods are restricted to automatically detecting the harmonic components. In this paper, we extend the methodology to detect significant harmonics in measured spectra such that, aside from detection, the magnitude of the harmonic component is also estimated, together with the probability that the harmonic component was incorrectly detected. Index Terms—Discriminant analysis, harmonic distortions, power spectrum, rice distribution, signal processing/analysis, spectrum analyzer measurements.
I. I NTRODUCTION
I
N VARIOUS fields of applications, one deals with the problem of either detecting harmonic components or assessing the absence of harmonic components in observed signals. Indeed, in spectrum analyzer and radar measurements, one is interested in detecting significant harmonic components [1], [2]. In functional magnetic resonance imaging measurements, one is interested in detecting tumor tissue based on a harmonic analysis of the data [3]–[5]. In microwave measurements, it is interesting to detect only signal lines in order to avoid the need of calibrating all frequency lines [6], [7]. Time series analysis [8], like in modal analysis [9] or power systems [10], often operates under the assumption that no harmonic distortions are present in the data. It is, in general, important to assess the validity of this assumption to avoid biased models. The simplest approach is a visual analysis of the signal’s power spectrum. For digital signals or time series, the discrete Fourier transform can be applied to obtain the signal amplitude spectrum. For analog signals, the spectrum analyzer can be used to measure the signal’s power. If the signal-to-noise ratio (SNR) is large enough, the harmonic components leap out of the signal’s spectrum. Unfortunately, the SNR can be small such that the visual approach becomes infeasible and time consuming. Manuscript received March 15, 2010; revised June 27, 2010; accepted June 28, 2010. Date of publication November 9, 2010; date of current version February 9, 2011. This work was supported in part by a postdoctoral research grant of the Flemish research foundation (FWO), by the Methusalem Grant of the Flemish Government (METH-1), and by the Belgian Program on Interuniversity Poles of attraction initiated by the Belgian State, (IUAP VI/4). The Associate Editor coordinating the review process for this paper was Prof. Alessandro Ferrero. The authors are with the Department of Fundamental Electricity and Instrumentation (ELEC, M2 ESA), Vrije Universiteit Brussel, 1050 Brussels, Belgium (e-mail:
[email protected]). Digital Object Identifier 10.1109/TIM.2010.2062710
Harmonic detection can be performed in both the time and frequency domains. To the knowledge of the authors, the timedomain methods to detect harmonic components are generally based on a (multi)sine fitting approach [11], [12]. These parametric approaches have one drawback in common: obligatory user interaction. Most methods require at least the number of expected harmonics. The frequency-domain methods are often by virtue of the wide application of the fast Fourier transform (FFT) nonparametrically [13]. The advantage is that these require “almost” no user interaction. In addition to that, these methods require minimal knowledge on the probability distribution of the disturbing noise. Automatic detection criteria for harmonic components are often based on a statistical test [6], [7], [14]–[18]. These methods provide no information beyond the detection of the spectral lines. No information is available on neither the probability of a false detection nor an estimate of the magnitude of the harmonic component. In this paper, we offer a novel method based on a statistical test that automatically detects harmonics with three advantages. 1) No user interaction at all is needed, and minimal postulated noise assumptions are required. 2) Every detected frequency receives a probability that the frequency is falsely classified as being either noise or a signal line. 3) The lines detected as signal lines receive an estimate of the magnitude of the harmonic component. The detection algorithm can be implemented such that realtime applications are allowed. This paper is organized as follows: Section II provides an overview of the measurement setup and the assumptions needed to deduct the theory. Section III deals with the automatic detection of signal lines in the measured amplitude spectra. Section IV estimates the amplitude of both the signal components and noise. Section V validates the detected signal components. Section VI illustrates the theory on a few simulation examples. Section VII assesses the method on a measurement example. In the final section, we reach some conclusions. II. M EASUREMENT S ETUP AND S IGNAL A SSUMPTIONS A simple way to detect harmonic components in a measured signal is by inspecting the signal’s power spectrum or spectral density. In this paper, we consider two measurement methods to obtain the signal’s power spectrum, i.e., by using either an analog spectrum analyzer or a digital spectrum analyzer.
0018-9456/$26.00 © 2010 IEEE
1062
IEEE TRANSACTIONS ON INSTRUMENTATION AND MEASUREMENT, VOL. 60, NO. 3, MARCH 2011
where ω = ωL0 − ωIF . The second approximate equality holds since the RBW filters the higher frequencies. C. Digital Spectrum Analyzer
Fig. 1. Simplified spectrum analyzer block diagram.
Before going into some details of the direct and indirect approaches, the signal assumptions used are formulated throughout this paper. A. Signal Assumptions
In a digital spectrum analyzer, the signal x(t) to be measured is digitized with sampling period TS . The spectrum analyzer computes the periodogram of the digitized signal xd (n) = x(nTS ) with n = 0, . . . , N − 1 by means of the FFT. Formally, the output Ax (k) of the digital spectrum analyzer at frequency bin ωk = 2πk/N TS is given by N −1 1
−jωk nTs Ax (k) = √ xd (n)e = |Xd (k)| N n=0
Let us consider a continuous time signal x(t) such that the signal satisfies the following representation:
where Xd (k) denotes the discrete Fourier coefficient of the signal xd (n) at frequency bin k.
x(t) = g(t) + n(t) where g(t) is a multisine with K separate tones, and n(t) is a noise process such that its power spectral density Sn (jω) exists and satisfies two conditions. 1) Sn (jω) is uniformly bounded for ω ≤ |ωB |, with ωB being the noise bandwith. 2) limω→±∞ ωSn (jω) = 0. Conditions 1) and 2) imply that the variance σn2 of the noise n(t) exists. For simplicity, we assume that an integer number of periods of the signal g(t) are measured such that the Fourier spectrum G(jω) of the signal g(t) is discrete. B. Analog Spectrum Analyzer Measurements
12 2
−X (j(λ + ωLO ))| dλ ⎡ ≈
ALO ⎣1 2 π
∞ 0
Automatic detection in the measurement field is normally applied within a statistical test framework. One wants to test the hypothesis that no signal component is present at the frequency line under test. For the sake of notational simplicity, we derive the results for the digital spectrum analyzer. The results are identical for the analog spectrum analyzer. To use a statistical test, one needs knowledge on the distribution of the measurements Ax (jω) or Ax (k) provided by the spectrum analyzer. A. Distribution of the Measurements Ax (jω), Ax (k)
Analog spectrum analyzers are part of the basic equipment of each microwave laboratory. It allows getting an accurate view of the spectral content of microwave signals in a fast and easy way. An analog spectrum analyzer multiplies the input signal x(t) with a sine wave sωL0 (t) with frequency ωL0 and amplitude AL0 . The product passes through a resolution bandwidth (RBW) filter with center frequency ωIF and a power meter, which measures the amplitude in a frequency band around frequency ωIF based on an envelope detector (see Fig. 1 for a simplified block diagram of the spectrum analyzer). Let HRBW (jω) be the transfer function of the RBW and X(jω) be the Fourier spectrum of the signal x(t) at frequency ω. Then, the output Ax (jω) of the spectrum analyzer at frequency ω is given by ∞ A2LO |HRRW (jλ)|2 |X (j(λ − ωLO )) Ax (jω) = 8π −∞
III. AUTOMATIC D ETECTION OF H ARMONIC C OMPONENTS
⎤ 12 |HREW (jλ)|2 |X(j(ωLO −λ))|2 dλ⎦
To obtain the distribution of Ax (k), we need to analyze |Xd (k)|. Using the definition of the signal x(t), we obtain |Xd (k)| = |Gd (k) + Nd (k)| with Nd (k) and Gd (k) being the discrete Fourier coefficient at frequency bin k of the digitized noise signal n(t) and periodic signal g(t), respectively. It is well known that the distribution of Nd (k) is asymptotically (N → ∞) complex circular Gaussian distributed with zero mean and variance Sn (jωk ) [19]. Hence, we obtain 1 Δ |Xd (k)| = Rice |Gd (k)| , Sn (jωk ) (1) 2 Δ
where = indicates equality in distribution. More details on (1) are found in Appendix A. A random variable X is Rice(v, σ 2 ) distributed if (see [4], [20], and [21])
Δ Δ X = Rice(v, σ 2 ) = N (v cos(θ), σ 2 )2 +N (v sin(θ), σ 2 )2 (2) where θ is a given real number, and N (v cos(θ), σ 2 ) denotes a normal distribution with mean v cos(θ) and variance σ 2 . The two normal distributions in (2) are assumed independent.
BARBÉ AND VAN MOER: DETECTION, ESTIMATION, AND VALIDATION OF HARMONIC COMPONENT IN POWER SPECTRA
1063
B. Automatic Detection: Classical Approach Classically, references [6], [7], and [14]–[18] apply the same method as the visual approach: the harmonic components are found above the noise floor. Hence, the decision rule becomes Ax (k) > c ⇒ Ax (k) contains a signal componet Ax (k) ≤ c ⇒ Ax (k) contains noise only with c being a user-defined critical level. One does not wish to detect signal components that are actually not there. Hence, in practice, c is defined as the value such that the probability of falsely detecting signal lines is a user-defined percentage α (often α = 1% or 5%). In literature, α is known as the significance level. Formally P {Ax (k) > c|Gd (k) = 0} = α
(3)
where P(A|B) denotes the “conditional” probability that A occurs if B has occurred. If the user has chosen a value for α, the critical value c can be computed. Let FRice(v,σ2 ) (x) be the cumulative distribution function of the Rice distribution with parameters v and σ 2 evaluated at x. Applying the definition (3) of the significance level α in combination with (1), the critical value becomes −1 (1 − α). c = FRice (0, 12 Sn (jωk ))
(4)
The practical problem when computing the critical value (4) is that one needs to estimate Sn (jωk ) and choose α. Although there are some standard choices for α = 1%, 2.5%, 5%, the results may strongly depend on this choice. This makes it difficult to apply for a layman user. A second problem is that the critical value is computed under the null hypothesis that no signal is present in the data (G(k) = 0). Hence, under the additional assumption that the noise spectrum is white, one estimates Sn (jωk ) as [20] F 1
Sˆn (jωk ) = (Ax (k)) F
(5)
k=1
with F being the number of frequencies. Clearly, the estimator (5) is biased since the null hypothesis (G(k) = 0) is not valid, because there are signal lines that we want to detect. In the next section, we propose a novel approach where neither an estimate of the noise spectrum nor any user interaction is required. C. Automatic Detection: Novel Approach As long as the harmonic detection problem is solved by means of a statistical test, one will never be able to eliminate user interaction: the user will always need to choose a significance level α. Choosing α can be involved since a value α that is too small implies that many noise lines are detected as signal lines and a value α that is too large implies that no signal lines will be detected. The optimal choice in practice will depend on the SNR.
Fig. 2. Philosophy of discriminant analysis: (gray) spectrum analyzer measurements, (solid black line) discrimination height, (dashed black line) average of the groups, (solid black arrow) standard deviation within groups, and (dashed black arrow) distance between the groups.
The novel approach is based on discriminant analysis [22], [23]. The goal of discriminant analysis is to predict group membership. In this paper, the different groups are given as follows: the frequency lines containing noise only and the frequency lines containing signal harmonics. In discriminant analysis, the best separator to discriminate between the two groups is the linear function that maximizes the distance between the groups and minimizes the variance within the groups. 1) Detailed Philosophy of Discriminant Analysis: In Fig. 2, the philosophy of discriminant analysis is illustrated. The gray curve is the spectrum analyzer measurement; the full black curve is the discrimination height. The frequencies with measured amplitudes below the discrimination height are classified as being the group containing pure noisy lines; those above is the group of frequencies with signal harmonics. The dashed black curves are the averages of the measurements in every group. The solid black arrows denote the “distance” within one group, and the dashed arrow denotes the distance between the groups defined as the distance between the group means. The objective of discriminant analysis is to find the solid black curve such that the variance within the groups is minimized and the distance between the groups is maximized. 2) Foundation of Discriminant Analysis: In this section, we derive the minimax solution of discriminant analysis and discuss its properties. We derive the classical minimax solution in a different way than classical textbooks, where Gaussian disturbances are assumed [22]–[24]. Clearly, since the spectrum analyzer measurements follow a Rice distribution, the Gaussian assumption is not valid. Let I and J be the membership sets such that I = {k|Ax (k) contains a signal harmonic} J = {k|Ax (k) contains noise only} . Hence, I is the set of frequencies that are classified as containing signal harmonics, and J is the set of frequencies containing noise lines. A priori, we have no knowledge on the set membership.
1064
IEEE TRANSACTIONS ON INSTRUMENTATION AND MEASUREMENT, VOL. 60, NO. 3, MARCH 2011
Furthermore, we define the sample mean over the different groups 1
AˆIx = Ax (k) |I| k∈I 1
Ax (k) AˆJx = |J| k∈J
where |I| and |J| denote the size of the sets I and J, respectively. We know in advance that the data consists of two groups. Hence, it is natural that the sets I and J are chosen in such a way that the hypothesis AˆIx = AˆJx is statistically rejected as “strongly” as possible. The common approach to test the hypothesis AˆIx = AˆJx is the two-sample t-test [25]. The test statistic is given by T =
AˆIx − AˆJx σ ˆI2
(|I| − 1) +
σ ˆJ2
(|J| − 1)
|I| + |J|.
Squaring the test statistic (6) implies 2 AˆIx − AˆJx (|I| + |J| − 2) . T2 = 2 σ ˆI (|I| − 1) + σ ˆJ2 (|J| − 1)
(6)
binary grid search arrives at the correct discrimination height in maximally N steps. Hence, this can easily be implemented in spectrum analyzers and used in real time. IV. E STIMATION OF THE M AGNITUDES FOR THE S IGNAL AND N OISE P OWER The second objective, aside from the signal and noise classification, is providing the measurement engineer with an estimate of the magnitude of the signal and noise amplitudes. Due to (1), the expected value of the squared amplitude measurements per frequency is equal to [20] (8) E A2x (k) = |Gd (k)|2 + Sn (jωk ). The discriminant analysis reveals that the frequency lines k ∈ J are the noise-only lines. Under the additional assumption that the noise spectrum is white, the noise power can be estimated by applying (5) over the set J, i.e., 1
(Ax (k))2 . (9) Sˆn (jωk ) = |J| k∈J
(7)
In order to reject the hypothesis AˆIx = AˆJx as strongly as possible, we need to choose the set memberships I and J in such a way that the statistic (7) is maximized. The numerator of (7) is exactly the distance between the group means, and the denominator can be seen as the ‘distance’ within the groups computed as variances. Thus, maximizing (7) implies maximizing the numerator or the distance between groups and minimizing the denominator or the distance within groups. This was exactly the objective of the discriminant analysis. The solution (7) is known as Fisher’s quadratic discriminant [22], [23]. For Gaussian-distributed measurements, one can show that (7) is optimal in the sense that the probability of misclassification is minimal. Historically, Fisher was interested in the following problem: Let x and y be two 2-D Gaussian random vectors. Assume that we have N measurements x(n), n = 0, . . . , N − 1 for x and similar for y. The problem is that we do not know which measurement corresponds to x and to y. Indeed, we have 2N measurements of the vector z consisting of x and y. The strategy of Fisher was to find a discriminant line separating the measurements of x and y such that the probability of misclassification is minimized. If one assumes that z follows a Gaussian distribution and, hence, x and y are also Gaussian distributed, the optimal solution to minimize the misclassification probability is given by (7). Fortunately, it is well known that the Rice distribution tends to a Gaussian distribution if the SNR of the measurements improves. Hence, maximizing (7) is approximately optimal to discriminate between noise and signal lines without any need for user interaction. To maximize (7), we use a binary grid search to arrive at the correct discrimination height. It is well known that a binary grid search is fast since the search space (the discrimination height) is 1-D. Hence, if the measurement device is of N bits, the
Combining (8) and (9) results in an estimate of the signal component for k ∈ I ˆ (10) Gd (k) = A2x (k) − Sˆn (jωk ). Please note that (9) is the maximum-likelihood estimator for the noise power for the set of noise lines J. The maximumlikelihood estimator for the signal component Gd (k) is unavailable since we only have one amplitude measurement per frequency. Hence, we apply (8), where the expected value E[A2x (k)] is computed as A2x (k). V. P ROBABILISTIC VALIDATION OF THE D ETECTED H ARMONICS Since it is not assumed to have a set of frequencies where one knows in advance that signal harmonics are present, one cannot pursue a databased validation (training + validation set). A probabilistic validation of the discrimination between signal and noise lines is proposed. The main idea is to compute the probability that the observed amplitude measurement Ax (k) is wrongly classified. This is known as the probability of misclassification π(k). Let A∗ (k) be a new measurement at frequency k. We know from Section III-A that A∗ (k) follows a Rice distribution with parameters (|Gd (k)|, (1/2)Sn (jωk )) if k is a signal line and (0, (1/2)Sn (jωk )) if k is a noise line. Formally, the probability of misclassification is given by P (A∗ (k) > Ax (k)|k ∈ I) π(k) = (11) P (A∗ (k) < Ax (k)|k ∈ I) . The first probability of misclassification in (11) deals with frequency lines classified as signal lines. In this case, π(k) denotes the probability that values A∗ (k) larger than the measurement Ax (k) can be observed, even though frequency line k is a noise line (k ∈ I). The second probability of misclassification in (11) deals with frequency lines classified as noise lines. In this case, π(k) denotes the probability that values
BARBÉ AND VAN MOER: DETECTION, ESTIMATION, AND VALIDATION OF HARMONIC COMPONENT IN POWER SPECTRA
A∗ (k), which are smaller than the measurement Ax (k), can be observed, even though the frequency line k is a signal line (k ∈ I). Indeed, for the first probability, we are dealing with a frequency line k classified as a signal line. This means that the measurement Ax (k) is larger than the discrimination height issued by maximizing (7) (see the solid black line in Fig. 2). The probability of misclassification P(A∗ (k) > Ax (k)|k ∈ I) is the probability that even larger measurements A∗ (k) than the measurement Ax (k) can be observed while k is a noise line. If such larger observations can be observed, the line k will clearly be classified as a signal line, whereas k is a noise line resulting in a misclassification. This provides the user with some quality label/confidence on the discrimination. However, the probabilities in (11) cannot be computed since this requires knowledge on the true signal and noise amplitudes. Fortunately, the probabilities can be estimated by an application of (9) and (10). 1) An estimate of P(A∗ (k) > Ax (k)|k ∈ I). One needs to compute the probability that A∗ (k) > Ax (k), where Ax (k) is the measured power at frequency bin k, given that k is a noise line. Hence, we know from (1) that A∗ (k) follows a Rice(0, (1/2)Sn (jωk )) distribution. As a result P (A∗ (k) > Ax (k)|k ∈ I) = 1 − FRice(0, 1 Sn (jωk )) (Ax (k)) .
1065
Fig. 3. Automatic signal line detection for an SNR of 20 dB: (gray) amplitude measurements, (solid black) discrimination height, (dashed black) critical height of the statistical test, and (black arrow) signal lines.
The two different standard deviations for the noise process n(t) result in an SNR of 20 and 5 dB, respectively. For the classical method, which was described in Section III-B, we used a significance level α = 0.01.
2
Since the true noise power is unknown, we use the estimate (9) to compute FRice(0,(1/2)Sn (jωk )) (Ax (k)). 2) An estimate of P(A∗ (k) > Ax (k)|k ∈ I). We need to compute the probability that A∗ (k) < Ax (k), where Ax (k) is the measured power at frequency bin k, given that k is a signal line. Hence, we know from (1) that A∗ (k) follows a Rice(|Gd (k)|, (1/2)Sn (jωk )) distribution. Hence P (A∗ (k) < Ax (k)|k ∈ I) = FRice(|Gd (k)|, 1 Sn (jωk )) (Ax (k)) . 2
Since both the true noise power and the signal’s amplitude is unknown, the estimates (9) and (10) are used to compute FRice(|Gd (k)|,(1/2)Sn (jωk )) (Ax (k)). An estimate of the respective probabilities of misclassification is given by 1 − FRice(0, 1 Sˆn (jωk )) (Ax (k)) 2 (12) π ˆ (k) = FRice(Gˆ d (k), 1 Sˆn (jωk )) (Ax (k)) . 2
VI. N UMERICAL E XAMPLES In this section, the new method is illustrated based on a discriminant analysis and compared with the classical method based on a hypothesis test. In the simulation, we generated a time series y(t), t = 0, . . . , 2048. The signal y(t) satisfies the following equation: y(t) = g(t) + n(t) where n(t) is a zero-mean white Gaussian noise √ sequence with standard deviation σ1 = 1/10 and σ2 = 1/ 10. The signal g(t) is a multisine with unit amplitude spectrum and ten tones arbitrarily chosen.
A. SNR of 20 dB The signal detection is performed in Fig. 3. The black arrows indicate the true signal lines. The solid black line gives the discrimination height, following the novel approach of Section III-C. The dashed black line gives the discrimination height, following the classical approach of Section III-B. We see that the novel method is able to separate the signal and noise lines perfectly without any observed errors, whereas the classical method classifies nine noise lines as signal lines, which is unacceptable. As explained in Section III-B, the classical method needs user interaction. Indeed, to obtain the same discrimination height as the novel method, the user needs to choose the significance level α = 10−4 . The user is not able to determine this choice for α in advance. The choice of α implies that, if there are N measured lines, the classical method can result in N α incorrectly detected signal lines. Hence, if α is fixed and too large, the detection height will cut in the noise floor and falsely detect noise lines as signal lines. The new method does not desire this choice of α from the user and is more capable of finding a good position, based on the measurements, between the noise floor and the signal lines. Next, the estimators (9) and (10) are computed to estimate the variance power and the signal amplitude. These estimators ˆ d (k)| = 0.9959, which correreveal Sˆn (jωk ) = 0.0098 and |G spond well to the theoretical values of Sn (jωk ) = 0.01 and |Gd (k)| = 1. Finally, the validation step described in Section V was conducted. The misclassification probabilities are estimated with (12). The probability that the detected signal lines are falsely detected is 0 (= 1−FRice(0,(1/2)Sˆn (jωk )) (Ax (k))). In total, 1014 lines were classified as noise, in which 638 lines received the misclassification probability (= FRice(|Gˆ d (k)|,(1/2)Sˆn (jωk)) (Ax (k)))
1066
IEEE TRANSACTIONS ON INSTRUMENTATION AND MEASUREMENT, VOL. 60, NO. 3, MARCH 2011
Fig. 4. Automatic signal line detection for an SNR of 5 dB: (gray) amplitude measurements, (solid black) discrimination height, (dashed black) critical height of the statistical test, and (black arrow) signal lines.
around 0.001 and, for the other 376 lines, the misclassification probability ranged from 0.001 to 0.58. The discrimination rule successfully separated the signal lines from the noise since the signal lines have a misclassification probability of 0 and the average misclassification probability of the classified noise lines was 0.18. B. SNR of 5 dB In Fig. 4, the signal detection was performed. The same legend as in Fig. 3 holds. The novel method, which was described in Section III-C, can correctly find six signal lines, together with four noise lines. The classical approach, which was described in Section III-B, can correctly find six signal lines at the expense of 18 detected noise lines. Due to the very poor SNR, the other signal lines are buried in the noise. Similar to the previous simulation, we used a significance level α = 0.01 to obtain the discrimination height of the classical approach (dashed black line). To obtain the same discrimination height with the classical method as the novel approach, we have to use α = 0.002. This choice depends on the true SNR; thus, the user is not able to determine this choice for α in advance. Next, the estimators (9) and (10) are computed to estimate the variance power and the signal amplitude. These estimaˆ d (k)| = 1.4605, which tors reveal Sˆn (jωk ) = 0.3092 and |G correspond quite well, taking the poor SNR √ into account, to the theoretical values of Sn (jωk ) = 1/ 10 ≈ 0.3162 and ˆ d (k)| = 1. |G Finally, we performed a validation step. We estimated the probabilities of misclassification. The misclassification probabilities are estimated with (12). The probability that the detected signal lines are falsely detected is about 2.84 10−4 (= 1 − FRice(0,(1/2)Sˆn (jωk )) (Ax (k))). In total, 1006 lines were classified as noise, in which 615 lines received the misclassification probability (= FRice(|Gˆ d (k)|,(1/2)Sˆn (jωk )) (Ax (k))) around 2.84 10−4 and, for the other 391 lines, the misclassification probability ranged from 2.84 10−4 to 0.56. Even under the very poor SNR, the discrimination method performed relatively well without any user interaction.
Fig. 5. Fraction of the (left) misclassified signal lines and (right) noise lines as a function of the SNR: (solid curve) novel approach and (dashed curve) classical approach.
C. Misclassification as a Function of the SNR A simulation was performed, where the true signal was a multisine with 100 separate tones. In total, 2048 samples of the time series was 1000 times simulated under the different SNRs ranging from 5 to 30 dB. For the classical method, we used a significance level α = 0.05. The left plot of Fig. 5 shows the fraction of the detected lines that were incorrectly classified as signal lines. We see that, for very low SNRs, both methods can be expected to perform equally well in the sense that the number of falsely detected signal lines is of the same order. As the SNR improves, the false signal detection rapidly decreases to 0 from approximately 10 dB for the novel method, where the classical method or the hypothesis test always falsely classifies some noise as signal lines. Unfortunately, it is not immediately clear why the false signal detection fraction is not a monotonic curve for the classical method. The right plot shows that the novel method misses more signal lines than the statistical test. However, from approximately 20 dB, both methods perform equally well. In conclusion, the novel method clearly outperforms the statistical test for SNRs of 20 dB or higher. Indeed, even for larger SNRs, the classical method in this simulation has a probability of falsely detecting signal lines of approximately 4%. This means that 4% of the detected lines are actual noise
BARBÉ AND VAN MOER: DETECTION, ESTIMATION, AND VALIDATION OF HARMONIC COMPONENT IN POWER SPECTRA
Fig. 6. Automatic signal line detection for the PRBS of 0.97 MHz: (gray) amplitude measurements, (solid black) discrimination height, and (dashed black) critical height of the statistical test.
1067
Fig. 7. Probability of misclassification that the lines are wrongly classified as noise for the frequency band [0,20] MHz.
lines. For the measurement engineer, this means that 4% of the calibrated frequency lines were not necessary. VII. M EASUREMENT E XAMPLE : PRBS S IGNAL In the measurement example, two pseudorandom binary sequence (PRBS) signals are measured with an analog spectrum analyzer. A. PRBS Frequency of 0.97 MHz The signal detection is performed in Fig. 6. The solid black line gives the discrimination height, following the novel approach of Section III-C. The dashed black line, for a significance level α = 0.01, gives the discrimination height, following the classical approach of Section III-B. One sees that both methods perform equally good, except that the classical method (dashed line) selects eight additional lines, which do not seem to be harmonically related. It was impossible to find a significance level α such that the classical method obtains the same discrimination height as the novel method. This means that the actual α < 10−16 . Next, the estimators (9) and (10) are computed to estimate the variance power and the signal amplitude. These estimators ˆ = −58.1 dBm. reveal Sˆn (jω) = −142.92 dBm and |G(jω)| ˆ The amplitude estimate |G(jω)| = −58.1 dBm corresponds well to the visual observation, but the noise amplitude estimate Sˆn (jω) = −142.92 dBm seems to correspond to an overestimate. This slightly suggests for a closer examination. The suspicion is acknowledged by the validation step. First, we estimated the misclassification probabilities that the detected signal lines are wrongly classified. This probability was 0 (= 1 − FRice(0,(1/2)Sˆn (jωk )) (Ax (k))). Next, we computed the misclassification probabilities that the detected noise lines are falsely detected. All lines, except the lines with frequencies kf0 /4, with k = 4N and f0 = 0.97 MHz, receive a misclassification probability around 10−8 . Indeed, in Fig. 7, we computed the misclassification probabilities of the frequencies classified as noise. The vertical lines correspond to the frequencies classified as signal lines. Hence,
Fig. 8. Automatic signal line detection for the PRBS of 3.88 MHz: (gray) amplitude measurements, (solid black) discrimination height, and (dashed black) critical height of the statistical test.
one can see, between two vertical lines, three frequencies having a large misclassification probability. These frequencies correspond to kf0 /4, with k = 4N and f0 = 0.97 MHz. Thus, the validation implies that the frequencies kf0 /4, with k = 4N and f0 = 0.97 MHz, are signal lines. A second experiment should be conclusive. We generated a new PRBS signal at a frequency of 3.88 MHz, which is 4 × 0.97 MHz. Hence, we expect to see potential signal components at frequencies that are multiples of 0.97 MHz. B. PRBS Frequency of 3.88 MHz The signal detection is performed in Fig. 8. The solid black line gives the discrimination height, following the novel approach of Section III-C. The dashed black line, for a significance level α = 0.01, gives the discrimination height, following the classical approach of Section III-B. The classical method appears to cut into the noise where the novel method seems to discriminate better between noise and signal lines. Again, it was impossible to find a significance level α such that the classical method obtains the same discrimination height as the novel method. This means that the actual α < 10−16 .
1068
IEEE TRANSACTIONS ON INSTRUMENTATION AND MEASUREMENT, VOL. 60, NO. 3, MARCH 2011
Next, the estimators (9) and (10) are computed to estimate the variance power and the signal amplitude. These estimators ˆ = −58.3 dBm. reveal Sˆn (jω) = −188.4 dBm and |G(jω)| ˆ The amplitude estimate |G(jω)| = −58.3 dBm and the noise amplitude estimate Sˆn (jω) = −188.4 dBm correspond well to the visual observations. An interesting observation is that, between the harmonics of the PRBS frequency, the classification finds approximately three additional signal lines. A closer examination reveals that these frequencies are at multiples of 0.97 MHz, as expected. The reason the classification works better in the second measurement example is there are more noise lines than in the previous example. Indeed, the frequency band is, in both examples, the same, but the PRBS frequency changed. The signal lines are the more difficult lines to classify statistically due to the fact that its amplitude is not random. (It is the sum of a deterministic part and a stochastic part.) The more true the noise lines, the better the discrimination works. We estimated the misclassification probabilities that the detected signal lines are wrongly classified. This probability was 0 (= 1 − FRice(0,(1/2)Sˆn (jωk )) (Ax (k))) for all detected signal lines. The misclassification probability that the noise lines are wrongly detected is about 10−5 for all detected noise lines. This suggests that the discrimination was correctly performed. This indicates that the frequency lines corresponding to multiples of 0.97 MHz are signal lines.
A PPENDIX A P ROBABILITY D ISTRIBUTION OF |Xd (k)| It is explained in Section III-A that |Xd (k)| = |Gd (k) + Nd (k)|, with Nd (k) and Gd (k) being the discrete Fourier coefficient at frequency bin k of the digitized noise signal n(t) and periodic signal g(t), respectively. It is well known that the distribution of Nd (k) is asymptotically (N → ∞) complex circular Gaussian distributed with zero mean and variance Sn (jωk ), [19]. The following convenient notation is introduced: Xd (k) = Xd,r (k) + jXd,i (k), where j denotes the imaginary unit and Xd,r (k) and Xd,i (k) are the real and imaginary parts, respectively, of the Fourier coefficient Xd (k). Hence, one obtains Xd (k) = Xd,r (k) + jXd,i (k) = Gd,r (k) + Nd,r (k) + j (Gd,i (k) + Nd,i (k)) . Hence, we obtain |Xd (k)|=|Xd,r (k)+jXd,i (k)| 2 2 1 1 Δ = N Gd,r (k), Sn (jωk ) +N Gd,i (k), Sn (jωk ) 2 2 which is equal to the Rice distribution in (2). R EFERENCES
C. Confronting the Measurement Engineer The novel approach allows discovering signal lines that are present at multiples of 0.97 MHz. At first sight, one would say that this is a wrong conclusion since these components look like noise and have a much lower signal level than the ‘real’ signal lines. However, when taking a closer look to the schematic of the PRBS generator, it becomes obvious that these signal lines, which are located at multiples of 0.97 MHz, are related to the clock signal that is used to generate the PRBS signal. This allows concluding that the clock signal is not fully isolated from the output signal of the PRBS generator but “leaks” to the output. VIII. C ONCLUSION This paper has studied a method that allows detecting signal components in the measured amplitude spectra. The advantages over existing methods are given here. 1) It is fully automatic, without any need for user interaction. Hence, every kind of user can apply the method without the need to answer nasty questions. 2) The method provides the user with an estimate of the amplitude spectrum of signal and noise under the additional assumption that the spectra are flat. 3) The method provides the user with a simple and, again, user-friendly validation. The probabilities of misclassification can easily be computed with commercial software packages. It provides the user with either some warnings, like in the measurement example, or a quality label on the signal detection.
[1] A. Stelzer and M. Pichler, “Resolution enhancement with model-based frequency estimation algorithms in radar signal processing,” Subsurface Sens. Technol. Appl., vol. 4, no. 3, pp. 241–261, Jul. 2003. [2] A. Stelzer, M. Pichler, S. Scheiblhofer, and S. Schuster, “Identification of SAW ID-tags using an FSCW interrogation unit and model-based evaluation,” IEEE Trans. Ultrason., Ferroelectr., Freq. Control, vol. 51, no. 11, pp. 1412–1420, Nov. 2004. [3] A. J. den Dekker, D. H. J. Poot, R. Bos, and J. Sijbers, “Likelihood-based hypothesis tests for brain activation detection from MRI data disturbed by colored noise: A simulation study,” IEEE Trans. Med. Imag., vol. 28, no. 2, pp. 287–296, Feb. 2009. [4] J. Sijbers, A. J. den Dekker, P. Scheunders, and D. Van Dyck, “Maximumlikelihood estimation of Rician distribution parameters,” IEEE Trans. Med. Imag., vol. 17, no. 3, pp. 357–361, Jun. 1998. [5] L. K. Hansen, F. A. Nielsen, and J. Larsen, “Exploring fMRI data for periodic signal components,” Artif. Intell. Med., vol. 25, no. 1, pp. 35–44, May 2002. [6] D. Rabijns, G. Vandersteen, and W. Van Moer, “An automatic detection scheme for periodic signals based on spectrum analyzer measurements,” IEEE Trans. Instrum. Meas., vol. 53, no. 3, pp. 847–853, Jun. 2004. [7] W. Van Moer, Y. Rolain, and J. Schoukens, “An automatic harmonic selection scheme for measurements and calibration with the nonlinear vectorial network analyzer,” IEEE Trans. Instrum. Meas., vol. 51, no. 2, pp. 337–341, Apr. 2002. [8] P. Broersen, Automatic Autocorrelation and Spectral Analysis. Berlin, Germany: Springer-Verlag, 2006. [9] B. Cauberghe, P. Guillaume, P. Verboven, and E. Parloo, “Identification of modal parameters including unmeasured forces and transient effects,” J. Sound Vib., vol. 265, no. 3, pp. 609–625, Aug. 2003. [10] A. S. Yilmaz, A. Alkan, and M. H. Asyali, “Applications of parametric spectral estimation methods on detection of power system harmonics,” Electr. Power Syst. Res., vol. 78, no. 4, pp. 683–693, Apr. 2008. [11] T. H. Li and K. S. Song, “Estimation of the parameters of sinusoidal signals in non-Gaussian noise,” IEEE Trans. Signal Process., vol. 57, no. 1, pp. 62–72, Jan. 2009. [12] S. Schuster, S. Scheiblhofer, and A. Stelzer, “The influence of windowing on bias and variance of DFT-based frequency and phase estimation,” IEEE Trans. Instrum. Meas., vol. 58, no. 6, pp. 1975–1990, Jun. 2009. [13] B. Porat, Digital Processing of Random Signals. Englewood Cliffs, NJ: Prenctice-Hall, 1994.
BARBÉ AND VAN MOER: DETECTION, ESTIMATION, AND VALIDATION OF HARMONIC COMPONENT IN POWER SPECTRA
[14] M. Ahdesmaki, H. Lahdesmaki, and O. Yli-Harja, “Robust Fisher’s test for periodicity detection in noisy biological time series,” in Proc. IEEE Int. Workshop Genomic Signal Process. Statist., Tuusula, Finland, 2007, pp. 39–42. [15] O. C. de Jager, B. C. Raubenheimer, and J. W. H. Swanepoel, “A powerful test for weak periodic signals with unknown light-curve shape in sparse data,” Astron. Astrophys., vol. 221, no. 1, pp. 180–190, Aug. 1989. [16] M. J. Hinich and P. Wild, “Detecting finite bandwidth periodic signals in stationary noise using the signal coherence spectrum,” Signal Process., vol. 85, no. 8, pp. 1557–1562, Aug. 2005. [17] A. W. Liew, N. Law, X. Cao, and H. Yan, “Statistical power of Fisher test for the detection of short periodic gene expression profiles,” Pattern Recognit., vol. 42, no. 4, pp. 549–556, Apr. 2009. [18] S. Vaughan, “A simple test for periodic signals in red noise,” Astron. Astrophys., vol. 431, no. 1, pp. 391–403, Feb. 2005. [19] D. Brillinger, Time Series: Data Analysis and Theory. New York: McGraw-Hill, 1981. [20] C. Carobbi and M. Cati, “The absolute maximum of the likelihood function of the rice distribution: Existence and uniqueness,” IEEE Trans. Instrum. Meas., vol. 57, no. 4, pp. 682–689, Apr. 2008. [21] L. Lauwers, K. Barbe, W. Van Moer, and R. Pintelon, “Estimating the parameters of a Rice distribution: A Bayesian approach,” in Proc. IEEE Int. Instrum. Meas. Technol. Conf., Singapore, 2009, pp. 114–117. [22] W. R. Klecka, Discriminant Analysis. Beverly Hills, CA: Sage, 1980. [23] R. A. Fisher, “The use of multiple measurements in taxonomic problems,” Ann. Eugenics, vol. 7, pp. 179–188, 1936. [24] M. Kendall and A. Stuart, The Advanced Theory of Statistics: Design and Analysis, and Time Series, 4th ed. High Wycombe, U.K.: Griffin, 1983. [25] R. Walpole and R. Myers, Probability & Statistics for Engineers & Scientists, 4th ed. New York: Macmillan, 2004. [26] B. Picinbono, Random Signals and Systems. Englewood Cliffs, NJ: Prentice-Hall, 1993.
1069
Kurt Barbé was born in Aalst on April 25, 1983. He received the M.S. degree in mathematics (option Statistics) and the Ph.D. degree in electrical engineering from Vrije Universiteit Brussel (VUB), Brussels, Belgium, in 2005 and 2009, respectively. He is currently a Postdoctoral Researcher with the Department of Fundamental Electricity and Instrumentation (ELEC), VUB. His research interests are system identification, time series, and its finite sample properties. Dr. Barbé has been an Associate Editor for the IEEE T RANSACTIONS ON I NSTRUMENTATION AND M EASUREMENT since 2010.
Wendy Van Moer received the Engineer and Ph.D. degrees in applied sciences from Vrije Universiteit Brussel (VUB), Brussels, Belgium, in 1997 and 2001, respectively. She is currently an Associate Professor with the Department of Fundamental Electricity and Instrumentation (ELEC), VUB. Her research interests are nonlinear measurement and modeling techniques for medical and high-frequency applications. Dr. Van Moer has been an Associate Editor for the IEEE T RANSACTIONS ON I NSTRUMENTATION AND M EASUREMENT since 2007 and was an Associate Editor for the IEEE T RANSACTIONS ON M ICROWAVE T HEORY AND T ECHNIQUES in 2010. She was the recipient of the 2006 Outstanding Young Engineer Award from the IEEE Instrumentation and Measurement Society.