
J. R. Aguilar, M. Arias, R. Salinas, and M. A. Abidi, “Direct search of time delay in beamforming applications,” in Proc. of the III Encuentro Chileno de Acústica INGEACUS 2004, Valdivia, Chile, December 2004

DIRECT SEARCH OF TIME DELAY IN BEAMFORMING APPLICATIONS

Juan R. Aguilar †, Miguel Arias, Ph.D. †, Renato Salinas, Ph.D. †, and Mongi A. Abidi, Ph.D. ‡

†Graduate Program in Automation, Electrical Engineering Department, University of Santiago de Chile, Ave. Ecuador 3519, Santiago, CHILE.

‡Imaging, Robotics, and Intelligent Systems Laboratory, The University of Tennessee, 334 Ferris Hall, Knoxville, TN 37996-2100

Summary: This paper describes our attempts to use direct search optimization methods for time delay determination in beamforming applications. Since beamforming involves the search for a set of delays which, when applied to the microphone array signals, maximize the summed output of the array, direct search optimization methods become an interesting approach to this particular problem. Two kinds of heuristic direct search methods were tried here: pattern search and genetic search. The results indicate that the performance of the algorithms is satisfactory.

0. INTRODUCTION In this research work we explore the application of heuristic direct search methods to the determination of the time delay between two discrete time series. Since each time series can be thought of as an n-dimensional vector, the delay determination can be expressed in terms of the amount of delay which minimizes the Euclidean distance between these vectors. The paper starts with a brief review of the basic concepts of beamforming in the time domain. Next, we introduce our formulation of the beamforming problem as an optimization problem in terms of the Euclidean distance. To solve the optimization problem, genetic search and pattern search algorithms are presented. We conclude with the results of preliminary experiments performed using a two-microphone array (microphone pair). Our experiments were performed under reverberant conditions.

1. BEAMFORMING It is a well-known fact that the directional characteristic of an array of sensors can be altered by applying different amounts of delay to each sensor. The concept of beamforming arises from this premise. It consists of electronically modifying the directivity pattern of an array of microphones to steer its axis of maximum sensitivity in any desired direction, normally toward an emitter of particular interest. This is useful because it maximizes the amplitude of the received signal, improving the signal to noise ratio.

Figure 1. (Left) Directivity pattern of a microphone array, no delay applied. (Right) Beam steering effect that occurs by applying convenient amounts of delay to each microphone.

Beam steering, sometimes called “spatial filtering”, is produced by applying certain amounts of delay to the signals coming from the microphones, and then summing these delayed signals together to obtain the output of the beamformer. Figure 1 depicts the beam steering effect which occurs in beamforming. Figure 2 shows an example of the signal routing that occurs in beamforming for a plane wave: the signals arrive at the microphones with a small relative delay due to the inclination of the wavefront with respect to the axis of the array; the beamformer computes these delays and generates a set of new delays which align the array signals in the time domain, yielding the maximum output.

Figure 2. Illustration of signal routing in a beamforming system for plane wave incidence.
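The delay-and-sum operation described above can be illustrated with a short Python sketch. It is only an illustration under stated assumptions: the function name `delay_and_sum`, the restriction to integer sample delays, the zero padding at the record edges, and the synthetic unit pulses are not taken from the paper.

```python
import numpy as np

def delay_and_sum(signals, delays):
    """Sum the rows of `signals` after delaying row i by delays[i] samples.

    signals: array of shape (m, n), one microphone per row.
    delays:  integer delays in samples (positive means later in time).
    Samples shifted in from outside the record are treated as zero.
    """
    signals = np.asarray(signals, dtype=float)
    m, n = signals.shape
    out = np.zeros(n)
    for s, tau in zip(signals, delays):
        if abs(tau) >= n:          # shifted completely out of the record
            continue
        shifted = np.zeros(n)
        if tau >= 0:
            shifted[tau:] = s[:n - tau]
        else:
            shifted[:n + tau] = s[-tau:]
        out += shifted
    return out

# Hypothetical two-microphone example: the same pulse reaches
# microphone 2 three samples after microphone 1.
n = 100
mic1 = np.zeros(n)
mic1[50] = 1.0
mic2 = np.zeros(n)
mic2[53] = 1.0

unsteered = delay_and_sum([mic1, mic2], [0, 0])  # pulses do not overlap
steered = delay_and_sum([mic1, mic2], [3, 0])    # delay mic 1 to align them
```

With the aligning delay applied, the two pulses add coherently, so the peak of `steered` is twice the peak of `unsteered`.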

So, beamforming involves an optimization problem: to find those delays which, when applied to the microphone signals, will maximize the output signal of the beamformer. Its solution will be the set of delays $\tau_i$ which maximize the function

$$ z(t, \vec{q}) = \sum_{i=1}^{m} S_i(t - \tau_i) \tag{1} $$

$$ \max\{ z(t, \vec{q}) \} = \max\left\{ \sum_{i=1}^{m} S_i(t - \tau_i) \right\} \tag{2} $$

where $S_i$ represents the i-th microphone signal and m the number of microphones in the array [1].

2. DELAY ESTIMATION ALGORITHM Basically, our algorithm searches for the time delay between two discrete time series by matching a small time window of one time series with a time window of the same length in the other time series. An objective function can therefore be defined in terms of the Euclidean distance between the two time windows, Eq. 3, in which the independent variable x represents the number of samples of delay between the two time windows:

$$ f(x) = \left\| (p_a, p_{a+1}, \ldots, p_b) - (q_{a+x}, q_{a+1+x}, \ldots, q_{b+x}) \right\| \tag{3} $$

where x represents the delay between the time series, and a and b are the time-window bounds. The best match between the two time windows occurs when the Euclidean distance is minimum, that is

$$ \min f(x) = \min \left\| (p_a, p_{a+1}, \ldots, p_b) - (q_{a+x}, q_{a+1+x}, \ldots, q_{b+x}) \right\| \tag{4} $$

The algorithm chooses the time windows in the following manner: starting from one time series, the algorithm finds the global maximum of the whole series. The position (column) at which the global maximum occurs is then taken as the center of the window, and the window bounds are defined by a radius value entered by the user. The time window in the other series has the same length, but it can be shifted forward or backward in time by introducing positive or negative delay values. The constraints of the problem appear when we consider that only the delay values in the range

$$ -\frac{d \cdot f_s}{c} \le x \le \frac{d \cdot f_s}{c} \tag{5} $$

are valid. In Eq. 5, d is the separation distance between the microphones, c is the speed of sound, and $f_s$ the sampling rate. In practice, the shape of the objective function, Eq. 3, depends on the microphone waveforms and on the incidence angle.
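As an illustration of the window-matching idea, the Python sketch below evaluates the Euclidean-distance objective of Eq. 3 at every admissible lag of Eq. 5 and picks the minimizer of Eq. 4. Several things here are assumptions rather than the authors' implementation: the helper names (`max_lag`, `window_bounds`, `objective`), the index convention that lag x compares p[a..b] with q[a+x..b+x], a speed of sound of 343 m/s, and the synthetic test burst.

```python
import math
import numpy as np

def max_lag(d, fs, c=343.0):
    """Largest physically possible delay in whole samples (bound of Eq. 5)."""
    return math.floor(d * fs / c)

def window_bounds(p, radius):
    """Window centred on the global maximum of p, clipped to the record."""
    centre = int(np.argmax(p))
    return max(centre - radius, 0), min(centre + radius, len(p) - 1)

def objective(p, q, a, b, x):
    """Euclidean distance between p[a..b] and q[a+x..b+x] (Eq. 3)."""
    if a + x < 0 or b + x >= len(q):
        raise ValueError("shifted window falls outside the record")
    return float(np.linalg.norm(p[a:b + 1] - q[a + x:b + x + 1]))

# Synthetic burst (an assumption, not measured data); q is p delayed by 12 samples.
t = np.arange(2000)
p = np.exp(-((t - 1000) / 40.0) ** 2) * np.cos(0.3 * (t - 1000))
q = np.roll(p, 12)

a, b = window_bounds(p, radius=100)
L = max_lag(d=1.0, fs=10_000)          # 1 m spacing, 10 kS/s
lags = range(-L, L + 1)
costs = [objective(p, q, a, b, x) for x in lags]
best = lags[int(np.argmin(costs))]     # lag with the smallest distance
```

For a 1 m spacing sampled at 10 kS/s the bound evaluates to 29 samples, and the exhaustive scan recovers the 12-sample shift built into the synthetic pair. A direct search method replaces the exhaustive scan with a sequence of trial lags.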

3. DIRECT SEARCH MINIMIZATION METHODS In 1961 Hooke and Jeeves introduced direct search, providing the following description: “we use the phrase direct search to describe sequential examination of trial solutions involving comparison of each trial solution with the best obtained up to that time together with a strategy for determining (as function of earlier results) what the next trial will be.” [3] In historical context, Hooke and Jeeves' paper [3] appeared five years before the Armijo-Goldstein-Wolfe conditions were used to show how the method of steepest descent could ensure global convergence [4][5][6], and only two years after Davidon’s report [7] on using secant updates to derive quasi-Newton methods [2]. Whereas quasi-Newton methods are based on Taylor’s expansion of the objective function, direct search methods make no use of the derivatives of the objective function but evaluate it directly in each successive minimization step.

Direct search methods are reasonably straightforward to implement and can be applied almost immediately to many nonlinear optimization problems where quasi-Newton methods are difficult to implement. Direct search encompasses a wide variety of heuristic methods whose global convergence behaviour is analogous to that of globalized quasi-Newton techniques. In the following sections, we introduce two of these direct search methods, namely genetic algorithms and pattern search.

3.1 Genetic Algorithms Genetic search algorithms begin with an initial population of randomly selected initial solutions, a set of random strings representing the problem’s decision variables $\{x_i\}$. Thereafter, each of these initially picked strings is evaluated by using the fitness function $F_i$. If a satisfactory solution according to some acceptability or search stoppage criterion is already at hand, the search is stopped. If not, the initial population is subjected to genetic evolution to procreate the next generation of candidate solutions [8][9]. The genetic process of procreation uses the initial population as input. The members of the current population are processed by three types of operators: reproduction, crossover, and mutation. Reproduction selects a good parent solution or string, one that has above-average fitness. A popular reproduction operator uses fitness-proportionate selection, in which a parent solution is selected to move to the mating pool with a probability that is proportional to its own fitness:

$$ p_i = \frac{F_i}{\sum_j F_j} \tag{6} $$

The crossover operator, sometimes called recombination, attempts to produce new string solutions of superior fitness by effecting large changes in strings. A crossover probability is used to decide whether a given solution will be crossed. Mutation performs a local search around a current solution: it creates a new solution in the neighbourhood of a current solution by introducing a small change in some aspect of the current solution. The progenies are then evaluated and tested for termination. If the termination criterion is not met, the three operators iteratively operate upon the current population.

3.2 Pattern Search Algorithm The pattern search algorithm finds a sequence of points that approaches the optimal point. The value of the objective function decreases from each point in the sequence to the next. A pattern is a collection of vectors that the algorithm uses to determine which points to search at each iteration. At each step, the pattern search algorithm searches a set of points, called a mesh, and polls these points by computing their objective function values to find a point that improves the objective function. If this occurs, the poll is called successful and the point it finds becomes the current point at the next iteration. If the algorithm fails to find a point that improves the objective function, the poll is called unsuccessful and the current point stays the same at the next iteration. At each step, the algorithm forms the mesh by multiplying the pattern vectors by a scalar, called the mesh size, and adding the resulting vectors to the current point, that is, the point with the best objective function value found at the previous step [9].

4. EXPERIMENTAL PROCEDURES 4.1 Hardware Setup The basic features of the hardware setup are shown in the block diagram of Figure 3.

Figure 3. Block diagram of the hardware setup used in the experiments.

We implemented a two-element line microphone array with pre-polarized condenser transducers. The microphones have an omnidirectional directivity pattern and were separated by 1 m from each other, as shown in Picture 1. The output signals from the microphones were preamplified by using a conventional audio mixing console whose electronics were modified to obtain a separate output for each input channel. The signals were calibrated to 3.5 Vrms, measured with a true-rms Tektronix DMM912 voltmeter. The acoustic reference was generated by using a Larson Davis CAL 200 acoustic calibrator. A reference sound pressure level of 94 dB at 1 kHz was used. Then, the signals were fed to the analog input channels of a National Instruments AI-16-XE50 PCMCIA data acquisition card connected to a Pentium 4 2.6 GHz laptop computer, where the signal processing routines were implemented by using Matlab; see Picture 2. The AI-16-XE50 card has 16 non-referenced analog input channels and is capable of 16-bit sampling at up to 200 kS/s. The voltage amplitude of the incoming signals was near 10 Vpp, whereas the data acquisition card supports up to 20 Vpp.

Picture 1. Microphone array setup. Note the non-anechoic conditions of the room.

Figure 4. Microphone array geometry and source position used during the experiments. Diagram labels: Ch1 and Ch 2 separated by 1 m; source directions marked at 0°, 30°, and 90°.

4.2. Signal Pre-processing A data acquisition window of 2 seconds duration was used to acquire the microphone array data. The sample rate of each DSP channel was 10 kS/s; accordingly, the sampling interval was 10^-4 seconds. The internal arrangement of the acquired data in Matlab’s data acquisition toolbox produces a matrix of 20000 rows by 2 columns. Each column of this data matrix stores one microphone signal vector of 20000 samples. These vectors were pre-processed to have zero mean and unit variance, and anti-alias bandpass filtering was applied. Once the time windows are initialized following the algorithm described in Sec. 2 and the objective function is constructed, the direct search algorithms can be applied.
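The normalization step can be sketched in Python as follows. This is a hedged illustration only: the function name `standardize` is hypothetical, the random matrix merely stands in for a recorded 20000 by 2 data matrix, and the anti-alias bandpass filtering is omitted.

```python
import numpy as np

def standardize(data):
    """Scale each column (one microphone channel) to zero mean, unit variance."""
    data = np.asarray(data, dtype=float)
    return (data - data.mean(axis=0)) / data.std(axis=0)

fs = 10_000                       # samples per second per channel
duration = 2.0                    # acquisition window in seconds
n_samples = int(fs * duration)    # number of rows in the data matrix

# Stand-in for the acquired data (an assumption): offset, scaled noise.
rng = np.random.default_rng(0)
raw = 3.0 + 0.5 * rng.standard_normal((n_samples, 2))
clean = standardize(raw)
```

Normalizing both channels in this way keeps a gain or offset mismatch between the two preamplifier channels from dominating the Euclidean distance used as objective function.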

Picture 2. The laptop computer, with National Instruments AI-16-XE50 PCMCIA data acquisition card, and the microphone preamplifiers.

4.3. Description of the Experiments During the experiments, a male speaker pronouncing the word “hola” was used as the sound source. Because no anechoic chamber was available, we did our experiments under reverberant conditions. We tried several source positions, as depicted in Figure 4. Figures 5a and 5b show the plots of the two time series corresponding to microphones 1 and 2, for a talker located at 90°. Figure 6 shows the plot of the objective function corresponding to these two time series. In this case, the talker is located at 90°; correspondingly, the global minimum occurs at 29. Then, we use both pattern search and genetic algorithms to search for the delay which maximizes the summed output of the array.

Figure 5a. Time series plot corresponding to Channel 1.

Figure 5b. Time series plot corresponding to Channel 2.

Figure 6. Plot of the objective function as a function of the time delay between the two time series, with the source (talker) located at 90°. Note the global minimum at 29.

4.4 Pattern Search Algorithm Our first try was the pattern search algorithm. Preliminary attempts showed that the convergence of the algorithm was strongly dependent on the initial point and the initial mesh size. Taking into account that the valid delay values are in the range of -29 to 29 samples, the initial mesh size was set to 10, and the initial point was fixed at 0. The expansion factor was 2 and the contraction factor was 0.5. Figure 7 shows the convergence of the pattern search algorithm. The delay value found by the algorithm agrees exactly with the expected value. Figure 8 shows the mesh size at each iteration step.

Figure 7. Convergence of the objective function in pattern search.

Figure 8. Mesh size as a function of iteration number in the pattern search algorithm.
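A minimal Python sketch of a one-dimensional pattern search poll loop, using the settings quoted in this section (initial point 0, initial mesh size 10, expansion factor 2, contraction factor 0.5, delays limited to -29..29), is given below. It is not the Matlab toolbox routine used in the experiments: the function name `pattern_search`, the rounding of the mesh to whole samples, the stopping rule (mesh size below one sample), and the quadratic stand-in objective with its minimum placed at 29 are all assumptions made for illustration.

```python
def pattern_search(f, x0, mesh, lo, hi, expand=2.0, contract=0.5):
    """Integer pattern search with pattern vectors +1 and -1."""
    x, fx = x0, f(x0)
    history = [(x, fx, mesh)]
    while mesh >= 1:                      # stop once the mesh is below one sample
        step = int(round(mesh))
        # Mesh points: current point plus/minus the scaled pattern vectors,
        # keeping only those inside the admissible delay range.
        candidates = [c for c in (x + step, x - step) if lo <= c <= hi]
        best = min(candidates, key=f, default=None)
        if best is not None and f(best) < fx:
            x, fx = best, f(best)         # successful poll: move and expand
            mesh *= expand
        else:
            mesh *= contract              # unsuccessful poll: stay and contract
        history.append((x, fx, mesh))
    return x, fx, history

def stand_in_objective(x):
    """Hypothetical convex objective with its minimum at a delay of 29."""
    return (x - 29) ** 2

x_best, f_best, history = pattern_search(stand_in_objective, x0=0, mesh=10,
                                         lo=-29, hi=29)
```

On a measured objective with several local minima, the outcome of such a loop depends on the starting point and the initial mesh size, which is consistent with the sensitivity reported in this section.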

4.5 Genetic Search Algorithm In the genetic search case we use an initial population which contains all the possible solution values, from -29 up to 29. In this context, the algorithm needs only 2 generations to converge. The parameters of the genetic algorithm were: Reproduction Elite Count = 2; Crossover Fraction = 0.8; Crossover Function = Scattered; Mutation = Gaussian. Figure 9 shows the convergence of the genetic search algorithm.

Figure 9. Convergence of the genetic search algorithm. The algorithm converges in the second generation because the initial population contains all possible solutions.
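The following Python sketch shows one way a genetic search over integer delays could be organized with an elite count of 2 and a crossover fraction of 0.8. It is an assumption-laden illustration, not the Matlab toolbox implementation: the function name `genetic_search`, the fitness-proportionate selection with weights derived from the cost, the mutation spread `sigma`, the fixed number of generations, and the quadratic stand-in objective are hypothetical. With a single decision variable, scattered crossover reduces to copying the gene of one of the two parents.

```python
import random

def genetic_search(cost, lo, hi, generations=5, elite=2,
                   crossover_fraction=0.8, sigma=3.0, seed=0):
    rng = random.Random(seed)
    population = list(range(lo, hi + 1))   # every admissible delay
    size = len(population)
    best_per_generation = []
    for _ in range(generations):
        population.sort(key=cost)
        best_per_generation.append(population[0])
        costs = [cost(x) for x in population]
        worst = max(costs)
        # Selection weights must be larger-is-better, so invert the cost.
        weights = [worst - c + 1e-9 for c in costs]
        n_cross = round(crossover_fraction * (size - elite))
        children = population[:elite]      # elite individuals survive unchanged
        for _ in range(n_cross):
            pa, pb = rng.choices(population, weights=weights, k=2)
            children.append(rng.choice((pa, pb)))   # single-gene crossover
        while len(children) < size:        # remaining slots: Gaussian mutation
            parent = rng.choices(population, weights=weights, k=1)[0]
            mutant = round(parent + rng.gauss(0.0, sigma))
            children.append(min(max(mutant, lo), hi))
        population = children
    population.sort(key=cost)
    return population[0], best_per_generation

# Hypothetical stand-in objective with its minimum at a delay of 29.
best_delay, best_history = genetic_search(lambda x: (x - 29) ** 2, -29, 29)
```

Because the initial population already contains every admissible delay, the best individual is present from the start and the elite mechanism carries it through every later generation.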

The genealogy of the individuals can be observed in Figure 10. Red lines indicate mutation children, blue lines indicate crossover children, and black lines indicate elite individuals.

Figure 10. Plot of the genealogy of the individuals.

4.7 Results Finally, the results are evaluated by comparing the amplitude of the output signal of the microphone array (signal N°1 + signal N°2) before and after application of the time delay value found by the direct search algorithm. Figure 11a shows the plot of the summed output of the array with no delay applied, while Figure 11b shows the summed output of the array with added delay (beamformed output). Clearly, the beamformer increases the amplitude of the output signal, improving the signal to noise ratio.

Figure 11a. Summed output of the array. No delay applied to the signals.

Figure 11b. Summed output of the array. Beamforming delay applied. Note the signal amplitude enhancement compared with Fig 11a.

5. CONCLUSIONS AND FUTURE WORK We found that both pattern search and genetic algorithms performed satisfactorily. In spite of the moderate speed of the algorithms in this simple two-microphone array experiment, we think this technique could be useful in microphone arrays with a large number of transducers. Doubtlessly, the non-anechoic environment of the room where the experiments were performed constitutes a major source of error, and anechoic tests must be carried out in order to make a more accurate evaluation of the system, particularly in terms of angular resolution and signal to noise ratio enhancement. Another important source of error was the phase response mismatch between the microphones. Overcoming this deficiency requires high quality transducers, preferably matched pairs of microphones, and detailed phase response data. In the near future we will implement a four-element microphone array for source localization in the horizontal plane, and we will run tests and calibrations under anechoic conditions. Long-term performance evaluation tests must be conducted in order to evaluate the robustness of the technique under noisy and reverberant conditions.

6. REFERENCES

[1] M. Brandstein and D. Ward, Microphone Arrays. Springer Verlag, 2001.

[2] R. Lewis, V. Torczon, and M. Trosset, Direct search methods: Then and now. 2000. www.cs.wm.edu/~va/research/jcam.ps.gz

[3] R. Hooke and T. A. Jeeves, "Direct search" solution of numerical and statistical problems, Journal of the Association for Computing Machinery (ACM), 8 (1961).

[4] L. Armijo, Minimization of functions having Lipschitz continuous first partial derivatives, Pacific Journal of Mathematics, 16 (1966).

[5] A. Goldstein, Constructive Real Analysis, Harper & Row, New York, 1967.

[6] P. Wolfe, Convergence conditions for ascent methods, SIAM Review, 11 (1969).

[7] W. Davidon, Variable metric method for minimization, Argonne National Laboratory Research and Development Report 5990, May 1959.

[8] T. Bagchi, Multiobjective Scheduling by Genetic Algorithms. Kluwer Academic Publishers, 1999.

[9] Matlab Genetic Algorithm and Direct Search Toolbox User’s Guide. The MathWorks, 2004.