Robust Detection of Buildings in Digital Surface Models

Prabhu Krishnamoorthy, Dept. of Electrical Engineering, The Ohio State University, Columbus, Ohio

Kim L. Boyer, Dept. of Electrical Engineering, The Ohio State University, Columbus, Ohio

Patrick J. Flynn, Dept. of Computer Science, University of Notre Dame, Notre Dame, Indiana

Abstract

1. Introduction

The classification of land use, degree of development, and the location of major features, such as buildings and expanses of vegetation, are of considerable interest to planning bodies, relief agencies, and the military. Landscape characteristics are under continuous modification, especially in urban and suburban areas, and therefore the maps of these areas need regular updating. Derived databases, such as land use maps and 3D city models, are used in several applications including change detection and urban planning [7, 5]. In this paper, we use Digital Surface Models (DSM) with 1 m ground resolution to detect and discriminate buildings and vegetation. The DSM is a 2.5D matrix representation of relief generated from data obtained using scanning laser altimeters (LIDAR). Huising et al. [6] demonstrated the potential of laser data for the extraction of topographic information.

Several techniques have been proposed for detecting buildings in aerial imagery. However, strategies relying entirely on image data are often sensitive to brightness variations or require additional information. Being dense and independent of ambient lighting, DSM data are better suited to this purpose [10, 2]. Haala [5] located buildings by searching for local maxima, Eckstein [3] suggested a dual rank filter for extracting spatial objects, and Jaynes et al. [7] used both optical and range data. Weidner et al. [10] used morphological opening; Brunn and Weidner [1] added Bayesian nets. Kilian et al. [8] noted that the size of the structuring element used in opening [10, 1] to obtain an estimate of the topographic surface is critical. Furthermore, the approximation converges to the underlying topographic surface near the objects only when it is relatively smoother than the surface with the objects included. Buildings sited on coarse terrain are not identified during initial segmentation. Vosselman [9] proposed a technique to filter out spatial objects from laser data for generating Digital Elevation Models (DEM). Essentially, the complement of this approach can be adapted to our purpose. Since this method avoids the approximation step, there are fewer errors.

2. Extraction of Cue Regions

We base our approach on Vosselman's work [9], observing that large height differences between two nearby points are unlikely to be caused by terrain variations. We define a non-ground point $p_i = (x_i, y_i, z_i)$ as one for which there is a point $p_j = (x_j, y_j, z_j)$ such that the height difference between these points is larger than a specified maximum at horizontal distance $d(p_i, p_j)$. For range data of the form $z = z(x, y)$, the erosion of point $p_i$ within a discrete set of points $A$ with kernel $k(\Delta x, \Delta y)$ is

$$e_{p_i} = \min_{p_j \in A} \left[ z_j - k(\Delta x, \Delta y) \right] \qquad (1)$$

In particular, when the kernel is defined using a filter function such as $k(\Delta x, \Delta y) = -\Delta h_{\max}\left(\sqrt{\Delta x^2 + \Delta y^2}\right)$, the above equation becomes

$$e_{p_i} = \min_{p_j \in A} \left[ z_j + \Delta h_{\max}(d(p_i, p_j)) \right] \qquad (2)$$

When a nondecreasing function $\Delta h_{\max}(d)$ is specified, a point can be classified as "non-ground" if its height exceeds that of the eroded surface.

1051-4651/02 $17.00 (c) 2002 IEEE

The filter function incorporates prior knowledge about the terrain slope. For example, if we know that the slope of some area is not steeper than 10% and we allow for small perturbations $\epsilon$ in the data, the filter function could be:

$$\Delta h_{\max}(d) = 0.1\,d + \epsilon \qquad (3)$$

When it is difficult to specify a filter function explicitly, the terrain characteristics can be obtained from training samples containing only ground points [9]. The erosion kernel should be large enough not to be completely enclosed in a building. Fig. 1 illustrates the idea in 1D using a simple erosion kernel that specifies the allowed height difference as a function of separation. The erosion kernel shown permits a 10% terrain slope. We then consider a profile containing two buildings, a flat roof and a gable roof. The eroded heights of the data points are shown against the original heights. The non-ground points detected by comparing the original and eroded heights are also shown. We thus identify non-ground points using the erosion kernel by indirectly searching for a point within the support width that violates the slope criterion. After median filtering to remove the outliers, we obtain the initial segmentation of the DSM in two steps: erosion, followed by the test given below:

$$T = \{\, p_i \in A \mid z_i > e_{p_i} \,\} \qquad (4)$$
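The erosion and the non-ground test can be illustrated on a 1D profile. The following is a minimal sketch, assuming a linear filter function (slope times distance plus a tolerance) and a regularly sampled profile; the function names, the `half_width` parameter, and the sample profile are hypothetical and not taken from the paper.

```python
def erode_profile(heights, spacing=1.0, slope=0.10, eps=0.01, half_width=25):
    """Eroded height of every sample of a 1D profile, in the spirit of Eq. (2).

    heights    : elevations z_j along the profile (metres)
    spacing    : horizontal distance between neighbouring samples (metres)
    slope, eps : parameters of the assumed filter function slope * d + eps
    half_width : kernel support on each side, in samples (hypothetical default)
    """
    n = len(heights)
    eroded = []
    for i in range(n):
        lo = max(0, i - half_width)
        hi = min(n, i + half_width + 1)
        # Minimum over the kernel support of z_j + dh_max(d(p_i, p_j)).
        eroded.append(min(
            heights[j] + slope * abs(i - j) * spacing + eps
            for j in range(lo, hi)
        ))
    return eroded


def non_ground_indices(heights, **kernel):
    """Indices whose original height exceeds the eroded height (test of Eq. (4))."""
    eroded = erode_profile(heights, **kernel)
    return [i for i, (z, e) in enumerate(zip(heights, eroded)) if z > e]


# Ground rising at 5% with a 5 m flat-roofed building on samples 10 to 14.
profile = [0.05 * i + (5.0 if 10 <= i <= 14 else 0.0) for i in range(30)]
building = non_ground_indices(profile, slope=0.10, eps=0.01, half_width=10)
```

With a support of 10 samples every building sample is flagged, while the sloping ground is not. Shrinking the support to `half_width=1` leaves the kernel enclosed in the roof, and only the two edge samples are detected, which is why the kernel must be larger than the buildings of interest.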

This initial segmentation will contain tiny segments due to outliers and noise. Spurious segments are eliminated through morphological filtering, minimum area and minimum height criteria. We use a 3 x 3 structuring element to delete noise segments. Then, the bounding box (the tightest rectangle, regardless of orientation) of each connected component is computed. We discard segments whose bounding boxes reach the margin of the DSM; these may be cropped and unreliable. We now compute a local threshold for each segment from the average elevation of ground points within the bounding box and prior knowledge about the minimum height of a building. Final segmentation results are obtained by applying the local height threshold, and the connected components are labeled.
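The clean-up steps above can be sketched as follows. This is a simplified illustration under stated assumptions rather than the authors' implementation: it uses an axis-aligned bounding box instead of the tightest oriented rectangle, pads the box by one pixel so that ground samples are always available, omits the 3 x 3 morphological filtering, and uses a hypothetical minimum building height of 2.5 m. All function and parameter names are illustrative.

```python
from collections import deque


def connected_regions(mask):
    """Pixel lists of the 8-connected components of a binary grid."""
    rows, cols = len(mask), len(mask[0])
    seen = [[False] * cols for _ in range(rows)]
    regions = []
    for r in range(rows):
        for c in range(cols):
            if not mask[r][c] or seen[r][c]:
                continue
            seen[r][c] = True
            queue, pixels = deque([(r, c)]), []
            while queue:
                y, x = queue.popleft()
                pixels.append((y, x))
                for ny in range(max(0, y - 1), min(rows, y + 2)):
                    for nx in range(max(0, x - 1), min(cols, x + 2)):
                        if mask[ny][nx] and not seen[ny][nx]:
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            regions.append(pixels)
    return regions


def cue_regions(dsm, mask, min_area=35, min_height=2.5):
    """Filter candidate segments by margin contact, local height and area."""
    rows, cols = len(dsm), len(dsm[0])
    kept = []
    for pixels in connected_regions(mask):
        ys = [y for y, _ in pixels]
        xs = [x for _, x in pixels]
        y0, y1, x0, x1 = min(ys), max(ys), min(xs), max(xs)
        # Segments whose bounding box reaches the DSM margin may be cropped.
        if y0 == 0 or x0 == 0 or y1 == rows - 1 or x1 == cols - 1:
            continue
        # Ground samples inside the box, padded by one pixel (an assumption).
        ground = [dsm[y][x]
                  for y in range(y0 - 1, y1 + 2)
                  for x in range(x0 - 1, x1 + 2)
                  if not mask[y][x]]
        if not ground:
            continue
        # Local threshold: mean ground elevation plus minimum building height.
        threshold = sum(ground) / len(ground) + min_height
        tall = [(y, x) for y, x in pixels if dsm[y][x] > threshold]
        if len(tall) >= min_area:
            kept.append(tall)
    return kept
```

At 1 m resolution one pixel covers 1 m², so `min_area=35` corresponds to a 35 m² ground area.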

3. Classification

Let $R_1, R_2, \ldots, R_N$ denote the labeled regions in the segmented image $g(x, y)$. The extracted candidate regions represent both natural and manmade objects; the most important problem is to classify these regions into buildings and nonbuildings. We pose this as a three class problem:


rectangular structures, elongated structures, and nonbuildings. We then merge the first two classes.

3.1. Feature Extraction

High Curvature Points: Surface curvatures measure the rate of change of surface orientation at a point. Following Flynn and Jain [4], we use a bicubic surface fit to estimate the curvature at each point of interest. Buildings comprise homogeneous surface segments with high curvature values at their intersections. In contrast, vegetation areas allow laser energy to penetrate, producing high curvature values throughout. If we define high curvature points as those with maximum curvature values greater than a threshold and $h(x, y)$ as the density of high curvature points in $g(x, y)$, then the conditional probability of finding a high curvature point in region $R_k$ is:

$$\sum \sum_{(x, y) \in R_k} h(x, y) \qquad (5)$$

Roof Edges: Surface orientation discontinuities (roof edges) are characterized by the maximum angular difference between adjacent unit surface normals:

$$M_{\text{roof}}(x, y) = \max_{(u, v) \in N_8(x, y)} \cos^{-1}\big(\mathbf{n}(x, y) \cdot \mathbf{n}(u, v)\big) \qquad (6)$$

where $\mathbf{n}$ is the unit surface normal and $N_8(x, y)$ is the set of neighbors of $(x, y)$. When the local surface changes rapidly, the magnitudes of roof edges are high. Roof edge magnitude histograms of buildings exhibit a peak at low angles while those of vegetation peak at higher angles.

Roughness: The maximum difference in depth between a point and its eight neighbors gives the jump edge magnitude:

$$M_{\text{jump}}(x, y) = \max_{(u, v) \in N_8(x, y)} \left| z(x, y) - z(u, v) \right| \qquad (7)$$

The edge magnitudes are randomly distributed in vegetation areas but are more consistent in buildings. The entropy of the distribution of the magnitude of jump edges in a region is:

$$H = -\frac{1}{\log N_b} \sum_{k=1}^{N_b} p_k \log p_k \qquad (8)$$

where $N_b$ is the number of magnitude bins and $p_k$ is the probability of the $k$th bin. The normalized entropy measures the randomness in a region. High entropy values characterize vegetation and low entropy values are typical for buildings.

Curvature and Surface Normal Directions: Maximum curvature vectors typically point in a direction perpendicular to that of the edges. We compare the histogram of azimuth orientation of maximum curvature vectors at high curvature points within each region to model histograms. Similarly, the histogram of surface normal orientation at those points where the roof edge magnitude exceeds a certain threshold is also compared to models. Several model histograms can be used to identify complex structures. We use two models that we call the orthogonal edge model and the elongated structure model. The orthogonal edge model contains two peaks and is obtained by adding two Gaussian random variables $N_1(\theta_1, \sigma^2)$ and $N_2(\theta_2, \sigma^2)$ with $\theta_1$ and $\theta_2$ separated by 90° and $\sigma = 10°$. The elongated structure model contains only one peak and is generated from a Gaussian random variable $N(\theta, \sigma^2)$. Let $X(\theta)$ denote the model histogram and $Y(\theta \mid R_k)$ be the test histogram obtained from the DSM data. The correlation coefficient between $X(\theta)$ and $Y(\theta \mid R_k)$ measures the similarity between the two histograms. Because buildings can be oriented in any direction, the test histogram has to be compared with the model histogram over all possible orientations. This is achieved by a circular shift of the test histogram. The maximum normalized correlation coefficient between $X(\theta)$ and $Y(\theta \mid R_k)$ is calculated as follows:

$$\rho_k(\tau) = \frac{\mathrm{Cov}\{X(\theta, \tau),\, Y(\theta \mid R_k)\}}{\sqrt{\mathrm{Var}\{X(\theta)\}\,\mathrm{Var}\{Y(\theta \mid R_k)\}}} \qquad (9)$$

$$\rho_{\max}(R_k) = \max_{\tau}\{\rho_k(\tau)\} \qquad (10)$$

where $\tau$ denotes the shift parameter. A high $\rho_{\max}$ value suggests the presence of a structure that corresponds to the model. Trees and vegetation areas have randomly oriented curvature and normal vectors and therefore have low $\rho_{\max}$ values.

Area: When the resolution of the data is known, the area is a simple and useful measure to discriminate between manmade structures and vegetation. In obtaining the segmentation, although we rejected small regions, vegetation that spreads beyond the minimal area is not eliminated.

Elongatedness: The elongatedness of a region is the aspect ratio of its bounding box. Elongated structures can be easily distinguished from rectangular buildings using this feature.

3.2. k-Nearest Neighbor Classifier

The k-NN classifier works by computing the Euclidean distance between the feature vector of the observation $\mathbf{x} = [x_1, x_2, \ldots, x_n]^T$ and those of training samples drawn from all three classes. When computing the distance, the features are individually scaled to the same range to remove the bias due to large feature values:

$$d_n(\mathbf{x}, \mathbf{y}) = \sqrt{(y_1' - x_1')^2 + \cdots + (y_n' - x_n')^2} \qquad (11)$$

where $x_i'$ and $y_i'$ are the normalized features. An initial decision is made based on the $k$ nearest neighbors. In our experiments, $k = 5$. Finally, segments that are classified as rectangular structures or elongated structures are labeled as buildings and others are labeled nonbuildings.
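The classification step can be sketched in a few lines. This is a minimal illustration assuming min-max scaling of each feature over the training set and a simple majority vote among the k nearest neighbors; the paper states only that features are scaled to the same range and that the decision uses the k nearest neighbors, so both choices, as well as the function names and class labels, are assumptions.

```python
import math
from collections import Counter


def min_max_scaler(samples):
    """Return a function mapping each feature to [0, 1] over the training range."""
    lows = [min(col) for col in zip(*samples)]
    highs = [max(col) for col in zip(*samples)]

    def scale(x):
        return [(v - lo) / (hi - lo) if hi > lo else 0.0
                for v, lo, hi in zip(x, lows, highs)]

    return scale


def knn_label(x, train_x, train_y, k=5):
    """Label of x by majority vote among its k nearest scaled training samples."""
    scale = min_max_scaler(train_x)
    xs = scale(x)
    ranked = sorted(
        (math.sqrt(sum((a - b) ** 2 for a, b in zip(xs, scale(t)))), label)
        for t, label in zip(train_x, train_y)
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


def is_building(label):
    """Merge the two building classes into a single 'building' decision."""
    return label in ("rectangular", "elongated")
```

Scaling before measuring distance keeps a feature with large numeric values, such as area in square metres, from dominating features that lie between 0 and 1, such as normalized entropy.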

4. Experimental Results

We now present results on LIDAR data having a ground resolution of 1 m. Fig. 2 shows sections of the graycoded DSMs of Stockholm, Belgium and Baltimore after median filtering. Dark (bright) values correspond to low (high) elevation and terrain variations are not apparent in the graycoded DSMs. Therefore, simple thresholding and filtering techniques would not reliably detect non-ground points. The Stockholm DSM also shows a large number of dropouts from black tin roofs. Fig. 3 shows the eroded DSMs using a filter of size 51 x 51 with slope 20% and tolerance 0.01. The initial segmentation obtained using our method is shown in Fig. 4. The final segmentation results after eliminating regions having a ground area below 35 m² are shown in Fig. 5. Insufficient penetration of laser energy resulted in the fusion of some vegetation and building areas.

Figure 2. Graycoded DSM data

Figure 3. Eroded DSM data

With no other information available, we generated the ground truth segmentation manually. A region was labeled as a building if it contained any part of a building. In the Stockholm DSM, some segments were labeled as buildings even though they did not completely enclose one. The histograms were quantized into 18 bins in both similarity and roughness calculation. The k-NN classifier was trained using 302 samples including 134 rectangular structures, 83 elongated structures and 85 nonbuildings. Training samples were obtained from nonoverlapping DSMs of Baltimore and Belgium. Table 1 presents a quantitative assessment of the final classification results. In the Stockholm data, as expected, some fragments inside buildings were incorrectly classified as vegetation. Moreover, in Ravensburg we see a low classification rate by regions because these data were taken over a predominantly residential area. Houses are "small" and the few pixels on target in that case induce a higher error rate. Smooth surfaces (due to filtering) resulted in a few vegetation areas being classified as buildings (false alarms). This could be overcome with edge preserving smoothing. Curvature estimates, being second derivatives, are sensitive to noise. The quality of laser data and the preservation of local changes therefore play a vital role in the performance of this method. As shown in Table 2, the segments that were misclassified were primarily small houses for which the resolution of the data was insufficient.
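The orientation similarity feature of Section 3.1, computed on 18-bin histograms, can be sketched as below. This is an illustrative sketch only: the angular span covered by the 18 bins is not stated in the text, so a span of 180° (10° bins) is assumed, and the function names are hypothetical. The shift is applied to the test histogram, as described in Section 3.1.

```python
import math


def orientation_model(n_bins, peaks_deg, sigma_deg=10.0, span_deg=180.0):
    """Model histogram: sum of circular Gaussian bumps centred at peaks_deg."""
    width = span_deg / n_bins
    hist = []
    for b in range(n_bins):
        centre = (b + 0.5) * width
        total = 0.0
        for peak in peaks_deg:
            diff = abs(centre - peak) % span_deg
            diff = min(diff, span_deg - diff)  # wrap-around angular distance
            total += math.exp(-0.5 * (diff / sigma_deg) ** 2)
        hist.append(total)
    return hist


def correlation(x, y):
    """Correlation coefficient of two equal-length histograms (0 if one is flat)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / math.sqrt(vx * vy)


def max_circular_correlation(model, test):
    """Largest correlation over all circular shifts of the test histogram."""
    return max(correlation(model, test[s:] + test[:s]) for s in range(len(test)))


# Orthogonal edge model: two peaks 90 degrees apart; elongated model: one peak.
orthogonal = orientation_model(18, [45.0, 135.0])
elongated = orientation_model(18, [45.0])
```

A test histogram that is a rotated copy of the model reaches a maximum correlation of essentially 1 regardless of the building's orientation, whereas a flat histogram, as expected from randomly oriented vegetation, scores 0.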

Table 1. Classification Results (classification rate in %)

Table 2. Error Types by Region

Figure 4. Initial Segmentation

Figure 5. Extracted Cue Regions

5. Conclusions

We have developed a data driven strategy to detect buildings in elevation data. Previous techniques assume flat terrain for detecting buildings. More research is required for robust building detection in hilly regions and coarse landscapes, and in separating trees located close to buildings. This work is a significant first step in that direction. Future work will be directed towards more detailed classification of building types and identifying other landscape features. LIDAR systems, when collecting laser data, also obtain an intensity image. Brightness edges could be valuable in the classification stage. Future work will also include fusion of depth and hyperspectral data. When complementary, the features of multiple data sets can offer better classification. For example, additional information such as the degree of organization in an area measured from the hyperspectral data could be used in the classifier training phase.

References

[1] A. Brunn and U. Weidner. Hierarchical Bayesian nets for building extraction using dense Digital Surface Models. ISPRS, 53(6):296-307, 1998.

[2] C. Csatho, K. Boyer, and S. Filin. Segmentation of Laser Surfaces. IAPRS, XXXII(3/W14):73-81, 2000.

[3] W. Eckstein and O. Munkelt. Extracting objects from digital terrain models. In T. Schenk, editor, Remote Sensing and Reconstruction for Three-Dimensional Objects and Scenes, Proc. SPIE 2572, pages 222-231, 1995.

[4] P. J. Flynn and A. K. Jain. On Reliable Curvature Estimation. In IEEE Conf. Computer Vision and Pattern Recognition, pages 110-116, San Diego, June 1989.

[5] N. Haala. 3D Building Reconstruction using Linear Edge Segments. In Photogrammetric Week, Wichmann, Karlsruhe, pages 19-28, 1995.

[6] E. J. Huising and L. G. Pereira. Errors and accuracy estimates of laser data acquired by various laser scanning systems for topographic applications. ISPRS, 53(6):245-261, 1998.

[7] C. O. Jaynes, A. Hanson, E. Riseman, and H. Schultz. Building reconstruction from optical and range images. In IEEE Conf. Computer Vision and Pattern Recognition, pages 380-386, Puerto Rico, June 17-19 1997.

[8] J. Kilian, N. Haala, and M. Englich. Capture and evaluation of airborne laser scanner data. IAPRS, XXXI(B3):383-388, 1996.

[9] G. Vosselman. Slope based filtering of laser altimetry data. IAPRS, XXXIII(B3/2):935-942, 2000.

[10] U. Weidner and W. Förstner. Towards automatic building extraction from high resolution Digital Elevation Models. ISPRS, 50(4):38-49, 1995.