A Bayesian Local Binary Pattern Texture Descriptor

Chu He [1,2], Timo Ahonen [1] and Matti Pietikäinen [1]
[1] Machine Vision Group, University of Oulu, PL 4500, FI-90014 Oulun yliopisto, Finland
[2] School of Electronic Information, Wuhan University, Wuhan 430079, P.R. China

Abstract
In this paper, a Bayesian LBP (BLBP) operator is proposed. This operator is formulated in a novel Filtering, Labeling and Statistic (FLS) framework for texture descriptors. In the framework, the local labeling procedure, which is a part of many popular descriptors such as LBP, SIFT and VZ, can be modeled as a probability and optimization process. This enables the use of more reliable prior and likelihood information and reduces the sensitivity to noise. Given the filtered vector image, the BLBP operator pursues a label image by maximizing the joint probability of the two images under the MAP criterion. The proposed approach is evaluated on texture retrieval schemes using the entire Brodatz database. The results show that the BLBP operator performs well and that the FLS framework supports in-depth analysis of texture descriptors on a common background.
1. Introduction
1.1. Motivation
The Local Binary Pattern (LBP) operator [1] has gained increasing attention due to its simplicity and excellent performance in various texture and face image analysis tasks. Many variants of LBP have recently been proposed [2], including Local Ternary Patterns (LTP) [3] and multi-scale block LBP (MB-LBP) [4], with considerable success in various tasks. In this paper, a Filtering, Labeling and Statistic (FLS) framework is developed, which can be used for analyzing and comparing their performance in a unified way. Furthermore, the FLS framework shows that many of the most popular descriptors can be seen as a statistic of labels computed in the local pixel neighborhood through a filtering and labeling procedure. However, these methods usually disregard the stochastic nature of image formation, which leads to inaccuracy and sensitivity to illumination changes and noise. To address this, a novel Bayesian LBP operator is proposed. It models the acquisition of labels from filter responses as a stochastic process and embeds a Markov Random Field (MRF) in the label space. The labeling procedure is then treated as a joint optimization process under the maximum a posteriori (MAP) criterion. Finally, a histogram estimating the probability density of the labels is used as the descriptor.

978-1-4244-2175-6/08/$25.00 ©2008 IEEE
1.2. Previous work and contributions
Varma and Zisserman’s classifier (VZ) [5], LBP [1] and the Scale-Invariant Feature Transform (SIFT) [6] are among the most prevalent texture descriptors. The VZ classifier belongs to the so-called texton methods, which represent the filter response vector by a codebook of clusters learned from the training images. LBP can also be seen as a special filter-based operator followed by threshold quantization. The recently proposed filtering and vector quantization framework [7] treats the codebook and threshold approaches as two types of quantization of the filter responses and consequently allows for a systematic comparison of these two descriptors. The FLS framework is related to this work.

The contributions of this paper are: (1) developing an FLS framework with a unified explanation of LBP, its variants, SIFT and the VZ classifier; (2) proposing a new Bayesian LBP operator inspired by the framework, which regards the labeling procedure of descriptors as a probability and joint optimization process by embedding MRFs on the label images.
Figure 1. The FLS framework of image descriptors: the original image Y(s) passes through LBP difference filtering to give the vector image V(s), LBP quantization on the vector space gives the label image X(s), and a histogram gives the descriptor output H. For illustrational purposes, only 2-D space is considered here.
2. The texture description framework
2.1. The framework
The original LBP operator was introduced as a descriptor summarizing the local gray-level structure. The operator labels the pixels of an image by thresholding every 3 by 3 neighborhood with the value of the central pixel. The sum of the thresholded values weighted by powers of two is then used as the label for the center pixel. This operator can be extended to use neighborhoods of different sizes and sampling points. Another extension defines the so-called uniform patterns and rotation-invariant patterns [1].

The LBP operator can be rewritten under a novel framework, as illustrated in Fig. 1. First, an image $Y$ on a rectangular pixel lattice $S$, containing a set of pixels $Y(s)$, $s \in S$, is filtered with a filter bank $F_n$, $n = 1, 2, \ldots, N$. The filter bank responses at each pixel location, $V_n(s)$, can be gathered together to build up a vector image $V(s) = [V_1(s), V_2(s), \ldots, V_N(s)]$. For LBP, $N$ equals 8 and the filter bank consists of multi-direction difference filters [8]. Secondly, in the vector space, each $V(s)$ can be assigned a label $X(s)$ from a finite label set, and a label image $X(s)$ can be obtained. The LBP labeling procedure includes threshold quantization and mapping to a rotation-invariant and uniform pattern. The threshold quantization function can be expressed as Eq. (1):

$$X(s) = \sum_{n=1}^{N} 2^{\,n-1}\, t\bigl(V_n(s)\bigr), \qquad t(v) = \begin{cases} 1, & v \ge 0 \\ 0, & v < 0 \end{cases} \tag{1}$$

Finally, a statistic procedure is applied to the label image, usually through a histogram count, and the output descriptor $H$ is obtained.
The FLS framework is illustrated in Fig. 1, where only two components of the vector $V(s)$ are drawn, in a 2-D vector space. When different filters or labeling functions are considered, the framework provides a unified way of explaining many of the most popular texture descriptors, such as LBP and its variants, SIFT and the VZ classifier.
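The three FLS stages can be sketched for the basic 3 by 3 LBP as follows. This is a minimal illustration rather than the authors' implementation: the neighbour ordering in `OFFSETS`, the function names and the toy `patch` are assumptions made for the example, and the mapping to uniform or rotation-invariant patterns is omitted.

```python
import numpy as np

# Offsets (row, col) of the 8 neighbours in a 3x3 window. The ordering is an
# arbitrary choice for this sketch; it only fixes which bit each filter maps to.
OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]


def difference_filter_bank(image):
    """Filtering: V_n(s) = Y(s + offset_n) - Y(s), on interior pixels only."""
    y = np.asarray(image, dtype=np.float64)
    h, w = y.shape
    center = y[1:h - 1, 1:w - 1]
    responses = [y[1 + dr:h - 1 + dr, 1 + dc:w - 1 + dc] - center
                 for dr, dc in OFFSETS]
    return np.stack(responses, axis=-1)  # shape (h-2, w-2, 8)


def lbp_labels(v):
    """Labeling: threshold each response at zero, weight by powers of two."""
    bits = (v >= 0).astype(np.int64)
    weights = 2 ** np.arange(v.shape[-1])
    return (bits * weights).sum(axis=-1)


def label_histogram(labels, n_labels=256):
    """Statistic: normalised histogram of the labels, used as descriptor H."""
    counts = np.bincount(labels.ravel(), minlength=n_labels)
    return counts / counts.sum()


# Hypothetical 3x3 patch with center value 6.
patch = np.array([[5, 9, 1],
                  [4, 6, 7],
                  [2, 6, 3]])
v = difference_filter_bank(patch)   # 8 difference responses for the center
x = lbp_labels(v)                   # bits set for neighbours 9, 7 and 6
h = label_histogram(x)
print(int(x[0, 0]))                 # 2 + 8 + 32 = 42
```

Swapping `difference_filter_bank` or `lbp_labels` for another filter bank or labeling function, while keeping the histogram stage, is what lets the framework compare descriptors on a common footing.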
Figure 2. The labeling procedure of LBP, LTP and BLBP. An image patch Y is filtered into responses V; candidate label images X are shown for T = -5, T = 0 and T = 5, annotated with J(X, V, T) and P(V|X,T=-5) = 0.0030, P(V|X,T=0) = 0.7289 and P(V|X,T=5) = 0.2681.
2.2. Analysis of the labeling procedure
If the expected value $E[Y(s)]$ is independent of $s$, the responses $V_1(s), \ldots, V_N(s)$ of the multi-direction difference filter bank have zero mean, and their values accumulate around zero. Since the LBP operator uses zero as the threshold, even a small change in a response can flip the label from one value to another. This makes the description of the micro-structure sensitive to noise.
Among the variants of LBP, the LTP descriptor improves the robustness of the labeling procedure of LBP by using two thresholds. Inspired by estimation theory, we propose a novel labeling scheme, the Bayesian LBP (BLBP). Unlike the fixed process of obtaining $X$ from $V$ used by LBP, LTP and other texture descriptors, BLBP regards the labeling procedure as a probability and joint optimization process, as illustrated in Fig. 2. Based on the principle of maximum a posteriori estimation, more prior and likelihood information can be introduced to reduce the influence of noise.
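The sensitivity argument can be made concrete with a small numerical sketch. The response values, the perturbation and the LTP threshold `t = 5` below are hypothetical, and `ltp_codes` follows the usual three-valued LTP definition of [3] (+1 at or above `t`, -1 at or below `-t`, 0 in between); none of these numbers come from the experiments of this paper.

```python
import numpy as np


def lbp_bits(v):
    """LBP labeling: a single threshold at zero."""
    return (np.asarray(v) >= 0).astype(int)


def ltp_codes(v, t):
    """LTP labeling: two thresholds, -t and +t, giving codes -1, 0, +1."""
    v = np.asarray(v)
    return np.where(v >= t, 1, np.where(v <= -t, -1, 0))


# Hypothetical difference-filter responses; three of them lie close to zero.
v_clean = np.array([-3.0, -42.0, -41.0, 13.0, 1.0, -1.0, 20.0, -0.5])
# A small hypothetical perturbation applied to the near-zero responses.
noise = np.array([0.0, 0.0, 0.0, 0.0, -2.0, 2.0, 0.0, 1.0])
v_noisy = v_clean + noise

lbp_flips = int(np.sum(lbp_bits(v_clean) != lbp_bits(v_noisy)))
ltp_flips = int(np.sum(ltp_codes(v_clean, 5) != ltp_codes(v_noisy, 5)))
print(lbp_flips, ltp_flips)  # 3 0
```

In this toy case the perturbation changes three LBP bits, and therefore the label, while the LTP codes are unchanged because the affected responses stay inside the band between the two thresholds.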
3. Bayesian LBP
3.1. Bayesian LBP descriptor
As illustrated in Fig. 3, let $(X, V)$ denote a world state of the label image $X$ and the vector image $V$. The objective of the labeling process is to pursue $X$, given the vector image data $V$, by minimizing some criterion function $J(X, V)$. A MAP criterion is usually used, which formulates $J(X, V)$ as

$$J(X, V) = -\log P(V \mid X) - \log P(X) \tag{2}$$

where $P(X)$, also called the smoothing term, is a prior probability on $X$, and $\beta$ is the parameter for modeling $P(X)$. $P(V \mid X)$, called the likelihood term, is the conditional probability of $V$ given $X$. Hence,

$$\hat{X} = \arg\min_{X} J(X, V) \tag{3}$$

The remaining steps of BLBP are the same as in the original LBP, according to the FLS framework.

Figure 3. Label process of BLBP: the criterion J(X, V) combines the prior P(X), defined over label sites p and q of X, with the likelihood P(V|X).

3.2. Likelihood term, smoothing term and optimization
One of the advantages of BLBP is its capability for adopting different likelihood and smoothing terms. In this paper, the prior knowledge of the spatial consistency of the label space is conveniently described by an MRF with a Potts model as

$$P(X) \propto \exp\Bigl(-\beta \sum_{(p,q) \in C} \delta(X_p, X_q)\Bigr) \tag{4}$$

with $\delta(X_p, X_q) = 1$ for $X_p \neq X_q$, and $\delta(X_p, X_q) = 0$ otherwise. Here $p$ and $q$ stand for two label sites in the two-site clique set $C$ defined on the neighborhood system illustrated in Fig. 4.

Figure 4. The two-site clique set used in BLBP (label sites p and q).

The logsig function is used to model $P(V \mid X)$, which means that a larger distance from zero increases the likelihood of thresholding to 1 or 0. Assuming each item of $V$ is conditionally independent, the likelihood function can be written as

$$P(V \mid X) = \prod_{n=1}^{N} \frac{1}{1 + \exp\bigl(-\alpha\,(2x_n - 1)\,V_n\bigr)} \tag{5}$$

where $V_n$ is the $n$-th item of $V$, $x_n \in \{0, 1\}$ is the weight of the power of two $2^{\,n-1}$ in the label $X$, and $\alpha$ is a weight, assigned to 0.5 here. Finally, Graph Cuts [8], a global optimization algorithm, is used to minimize $J(X, V)$ and obtain the label image $\hat{X}$ in Eq. (3).

From the likelihood function described above, it follows that the original LBP can be derived from BLBP when using a maximum-likelihood estimate (MLE) formulation, i.e. ignoring the smoothing term.
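The labeling step can be sketched as follows, under several loudly stated assumptions that are not taken from the paper: the per-response likelihood is taken to be logsig(alpha * (2x - 1) * v) with bit x in {0, 1}; the Potts penalty `beta` is applied between 4-connected pixel sites within each bit plane; and Iterated Conditional Modes (ICM), a simple greedy local optimizer, is swapped in for Graph Cuts [8] to keep the example short. All function names and the random responses are hypothetical.

```python
import numpy as np

NEIGHBOURS = ((-1, 0), (1, 0), (0, -1), (0, 1))  # assumed 4-connected sites


def data_cost(v, x, alpha=0.5):
    """-log logsig(alpha * (2x - 1) * v), computed stably via logaddexp."""
    return np.logaddexp(0.0, -alpha * (2 * x - 1) * v)


def energy(v, x, alpha=0.5, beta=1.0):
    """J(X, V): data cost plus beta per disagreeing neighbour pair (Potts)."""
    disagreements = (x[1:] != x[:-1]).sum() + (x[:, 1:] != x[:, :-1]).sum()
    return data_cost(v, x, alpha).sum() + beta * disagreements


def mle_labels(v):
    """Maximum-likelihood bits: plain LBP thresholding at zero."""
    return (v >= 0).astype(np.int64)


def icm_labels(v, alpha=0.5, beta=1.0, n_sweeps=10):
    """Greedy site-by-site minimisation of J (ICM), a stand-in for Graph Cuts."""
    x = mle_labels(v)
    height, width, n_bits = v.shape
    for _ in range(n_sweeps):
        changed = False
        for r in range(height):
            for c in range(width):
                for k in range(n_bits):
                    cost = []
                    for candidate in (0, 1):
                        smooth = sum(
                            int(x[r + dr, c + dc, k] != candidate)
                            for dr, dc in NEIGHBOURS
                            if 0 <= r + dr < height and 0 <= c + dc < width
                        )
                        cost.append(data_cost(v[r, c, k], candidate, alpha)
                                    + beta * smooth)
                    best = 1 if cost[1] < cost[0] else 0
                    if best != x[r, c, k]:
                        x[r, c, k] = best
                        changed = True
        if not changed:
            break
    return x


rng = np.random.default_rng(0)
v = rng.normal(scale=3.0, size=(6, 6, 8))   # hypothetical filter responses
x_mle = mle_labels(v)                       # smoothing ignored: ordinary LBP bits
x_map = icm_labels(v, beta=1.0)             # smoothing included
labels = (x_map * 2 ** np.arange(8)).sum(axis=-1)  # bits -> label image
```

With `beta=0.0` the smoothing term vanishes and `icm_labels` returns exactly the thresholded bits of `mle_labels`, which mirrors the statement that the original LBP is the maximum-likelihood special case. ICM only finds a local minimum, so it should be read as an illustration of the energy being minimized, not as a replacement for the global optimizer used in the paper.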
4. Experimental results
4.1. Texture retrieval evaluation strategy
Following the FLS framework, an experimental comparison of LBP, its derivatives LTP and MB-LBP, and the proposed BLBP operator was performed using the entire Brodatz database (999 images; 111 classes, 9 samples per class) [9]. In order to verify the robustness of the BLBP operator to noise, Gaussian noise with parameters (0, 0.01) was added to the original images to construct a noisy dataset. The unsupervised texture retrieval evaluation scheme used in [10] was adopted, and the experiments were carried out on both the original dataset and the noisy one. The number of retrievals ranges from 8 to 50 for the Brodatz database.
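A retrieval-based recall measure of this kind can be sketched as below. The details are assumptions for illustration, not a transcription of the scheme in [10]: every image is used once as a query, the other images are ranked by a chi-square distance between descriptor histograms (the distance is an assumed choice), and recall is the fraction of the query's same-class images found among the first `n_retrievals` results, averaged over all queries. The toy descriptors are hypothetical.

```python
import numpy as np


def chi_square(h1, h2, eps=1e-12):
    """Chi-square distance between two histograms (assumed distance measure)."""
    return 0.5 * np.sum((h1 - h2) ** 2 / (h1 + h2 + eps))


def average_recall(descriptors, class_ids, n_retrievals):
    """Average recall over all queries; each class needs at least two samples."""
    descriptors = np.asarray(descriptors, dtype=np.float64)
    class_ids = np.asarray(class_ids)
    recalls = []
    for q in range(len(descriptors)):
        dists = np.array([chi_square(descriptors[q], d) for d in descriptors])
        dists[q] = np.inf                       # never retrieve the query itself
        retrieved = np.argsort(dists, kind="stable")[:n_retrievals]
        relevant = np.sum(class_ids == class_ids[q]) - 1
        hits = np.sum(class_ids[retrieved] == class_ids[q])
        recalls.append(hits / relevant)
    return float(np.mean(recalls))


# Hypothetical toy data: two classes with three samples each.
toy_descriptors = [[0.90, 0.10], [0.80, 0.20], [0.85, 0.15],
                   [0.10, 0.90], [0.20, 0.80], [0.15, 0.85]]
toy_classes = [0, 0, 0, 1, 1, 1]
print(average_recall(toy_descriptors, toy_classes, n_retrievals=2))  # 1.0
print(average_recall(toy_descriptors, toy_classes, n_retrievals=1))  # 0.5
```

Under this definition, a class with 9 samples has 8 relevant images per query, so the curve starts at 8 retrievals, the smallest number at which perfect recall is possible.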
4.2. Results and analysis
The retrieval results are illustrated in Fig. 5. Table 1 shows the Average Recall Rate (ARR) at the beginning of the retrieval curve (8 retrievals) and the average value of ARR over the whole curve. The best performance is obtained with the BLBP descriptor on both datasets. Its performance is also better than the results reported in [10]. To confirm these results, we also ran texture retrieval experiments on the KTH-TIPS database, and the results are consistent with those obtained with the Brodatz images.
Figure 5. Texture retrieval result curves (average recall versus number of retrievals) for LBP, LTP, MB-LBP and Bayesian LBP: (a) for the Brodatz dataset; (b) for the Brodatz dataset with Gauss noise (0, 0.01).

Table 1. Texture retrieval performance of BLBP, LBP and its other derivatives

(a) for the Brodatz dataset
                           LBP      LTP      MB-LBP   BLBP
ARR at 8 retrievals        0.7922   0.7709   0.7820   0.8062
aver. ARR of whole curve   0.8927   0.8771   0.8745   0.9077

(b) for the Brodatz dataset with Gauss noise (0, 0.01)
                           LBP      LTP      MB-LBP   BLBP
ARR at 8 retrievals        0.7226   0.6468   0.7255   0.7308
aver. ARR of whole curve   0.8494   0.8104   0.8420   0.8771

5. Conclusion
In this paper, a Filtering, Labeling and Statistic framework is proposed, leading to a unified explanation and comparison of LBP and its derivatives on a common background. The framework can also be used with other state-of-the-art texture descriptors such as SIFT and the VZ classifier. The framework allows us to model the labeling procedure as a probability and optimization process, which can introduce more prior and likelihood information and make the texture operator less sensitive to noise. Hence, a novel Bayesian LBP operator is proposed. The same strategy could also be extended to other prevalent texture descriptors. The proposed approach was evaluated in texture retrieval tasks using the entire Brodatz dataset. The results show that the BLBP operator gives the highest recall among the compared operators on both the original and the noisy dataset, and that the FLS framework can be used for in-depth analysis of texture descriptors on a common background. Future work includes research on alternative smoothing terms, aiming for even better results on noisy images. Methods for learning parameter values will also be considered.

Acknowledgements
The work is supported by the CIMO scholarship of Finland, the NSFC grant (NO. 60702041) and the Academy of Finland.
References
[1] T. Ojala, M. Pietikäinen, T. Mäenpää. Multiresolution gray-scale and rotation invariant texture classification with local binary patterns. IEEE T-PAMI, 24(7):971-987, 2002.
[2] The local binary pattern bibliography, http://www.ee.oulu.fi/mvg/page/lbp_bibliography
[3] X. Tan, B. Triggs. Enhanced local texture feature sets for face recognition under difficult lighting conditions. AMFG 2007.
[4] S. Liao, S. Li. Learning multi-scale block local binary patterns for face recognition. ICB 2007.
[5] M. Varma, A. Zisserman. Texture classification: are filter banks necessary? CVPR 2003, 2:691-698, 2003.
[6] D.G. Lowe. Object recognition from local scale-invariant features. ICCV 1999, 1150-1157, 1999.
[7] T. Ahonen, M. Pietikäinen. A framework for analyzing texture descriptors. VISAPP 2008, 1:507-512.
[8] Y. Boykov, V. Kolmogorov. Experimental comparison of min-cut/max-flow algorithms for energy minimization in computer vision. IEEE T-PAMI, 26(9):1124-1137, 2004.
[9] P. Brodatz. Textures: A Photographic Album for Artists and Designers. Dover, New York, 1966.
[10] S. Lazebnik, C. Schmid, J. Ponce. A sparse texture representation using local affine regions. IEEE T-PAMI, 27(8):1265-1278, 2005.