JOURNAL OF MULTIMEDIA, VOL. 9, NO. 5, MAY 2014

Image Retrieval via Relevance Vector Machine with Multiple Features

Zemin Liu
College of Mathematics and Computer Science, Panzhihua University, China

Wei Zong
Massachusetts Avenue, MA 01239, USA

Abstract—With the rapid development of computer network technology, a large amount of image information is produced every day. Researchers are paying increasing attention to the problem of how users can quickly retrieve and identify the images that interest them. Meanwhile, the rapid development of artificial intelligence and pattern recognition techniques offers new ideas for the study of complex image retrieval, a task on which traditional machine learning methods struggle to obtain ideal results. For this reason, this paper proposes a new approach to image retrieval based on multiple types of image features and the relevance vector machine (RVM). The proposed method, termed MF-RVM, integrates the informative cues of the features with the discriminative ability of RVM. The retrieval experiment is conducted on the COREL image library, which was collected from the internet. The experimental results show that the proposed method can significantly improve image retrieval performance, so MF-RVM is highly practical for image retrieval.

Index Terms—MF-RVM; Image Feature; Image Retrieval

I. INTRODUCTION

With the rapid development of computer network technology, a large amount of image data is produced every day, and how to organize and manage these images effectively has become an increasingly important research topic. If image data cannot be managed effectively, a large amount of information will be lost [1, 2]. The problem of how users can quickly retrieve and identify the image information they need is therefore receiving more and more attention from researchers. Retrieval technologies based on data mining have emerged in response, and how to retrieve images quickly and accurately has become the key point of recent work on image retrieval [1].

In recent years, the fast development of artificial intelligence and pattern recognition has offered new ideas for the study of complex image retrieval. Classification methods based on goal decomposition and spectral signatures, such as fuzzy sets, generalized multiple kernel learning (GMKL), Bayesian and neural network classifiers [2, 3], and methods based on ground material properties and statistical characteristics of the image data, such as spectral angle mapping, maximum likelihood and minimum distance, have begun to be widely applied to image retrieval. However, they face many difficulties when dealing with hyperspectral and multi-angle spectral data [2, 4].

These methods also have shortcomings of their own. For example, the membership degrees of the fuzzy set classification method must be given by experience or by experts, which is highly subjective, so the precision and intelligibility of fuzzy-system learning are the first problems to be solved [5, 6]. Moreover, when fuzzy techniques are applied to remote sensing images, the variables must first be selected according to the problem definition, the data type and the features. GMKL is well suited to data-driven classification problems, but the chosen kernels significantly affect the classification results. For naïve Bayesian classification, the conditional independence assumption is difficult to satisfy in large-scale classification problems, and the required evaluation function is difficult to select [3, 4]; its learning and training are also very complicated [7]. Neural network classifiers are prone to local minima and slow convergence. In addition, the traditional machine learning methods above [2] place very high requirements on data regularity and assume an infinite sample size. Remote sensing image data usually cannot meet these requirements, because they are characterized by high dimensionality and small sample sizes. That is to say, it is very difficult to obtain ideal classification results on such data with traditional machine learning methods.

Furthermore, spectrum-based classification methods cannot provide textural features, for example those presented by identical permutations. Many remote sensing images are distinguished precisely by their textural features, so image classification methods based on spectral feature extraction are unable to classify them accurately [5, 6].

© 2014 ACADEMY PUBLISHER doi:10.4304/jmm.9.5.676-681


How can images be identified effectively and accurately? If image data cannot be managed effectively, a large amount of information will be lost and cannot be retrieved in time when it is needed. For this reason, researchers have paid more and more attention to the problem of how users can identify and manage images effectively and quickly. To address the shortcomings of the above methods, this paper presents an image retrieval method based on multiple features and RVM [8].

Training an SVM is a convex optimization problem, and by the properties of convex optimization any local optimum is also the global optimum, a guarantee that other classification methods do not offer. In addition, SVM has shown its effectiveness in image retrieval research in terms of self-learning classification and automatic processing. However, the classification performance of SVM is highly dependent on its kernel parameters [9]. For this reason, this paper presents a method based on a multiple-feature representation and the relevance vector machine. The method can also be applied to solve nonlinear problems quickly, which gives RVM wide application prospects in classification and prediction problems. The retrieval experiment is conducted on the COREL image library, which was collected from the internet and contains about 68040 photo images belonging to various categories. The experimental results show that obtaining the parameters by cross-validation significantly improves retrieval accuracy, so MF-RVM is highly practical for image retrieval [10-11].

Here we briefly summarize the advantages of the relevance vector machine, which explain why we use it for multiple-feature image retrieval. First, the relevance vector machine introduces a sparseness-inducing prior over the weights. Consequently, it can perform feature selection and classification simultaneously. Second, because the proposed MF-RVM uses multiple image features and produces a very high-dimensional feature space, it is important to pick out the informative features and remove the noisy ones, for the sake of both computational efficiency and model performance [12]. By integrating RVM, MF-RVM benefits from multiple features while improving its efficiency and performance. Moreover, it is very flexible thanks to the adaptability of its probabilistic formulation.

The steps of the proposed MF-RVM are summarized as follows.
(1) Remove the noisy data so as to minimize ambiguity and highlight the useful information.
(2) Model the distribution of the image features and extract the features based on the model.
(3) Choose the relevance vector machine as the retrieval model, and learn it on the feature space or, equivalently, on the similarity defined on the feature space.
(4) Perform image retrieval using the learned MF-RVM and the features extracted from the test data.


The contributions of this paper are threefold. (1) The feature extraction method proposed in this paper highlights the informative components while reducing the data variance by means of the data distribution. (2) The relevance vector machine is used as the retrieval model, so MF-RVM benefits from its good generalization ability and its ability to adapt to the data. (3) The results of our comprehensive experiments validate the advantages of MF-RVM in image retrieval, i.e., it beats the other methods under different experimental conditions.

The remainder of the paper is organized as follows. We report the formulation details of the proposed MF-RVM method in Section 2. The experiments on MF-RVM and related algorithms are reported in Section 3. We draw a conclusion in Section 4.

II. THE PROPOSED SCHEMA

On the basis of the above discussion, in this part we propose an approach, MF-RVM, for image retrieval. Our approach is based on the relevance vector machine, which is able to overcome the defects of previous algorithms [13]. The developed algorithm is illustrated in Figure 1. There are three main steps. First, collect the data. Second, model the data and extract the features [14]. Third, train the model described in this section and test it.

Figure 1. The experiment framework for MF-RVM

A. Multiple Image Features

Image features are among the most important factors in image retrieval because they capture low-level and local cues such as gradient and texture [15]. However, it has been shown that different features have different abilities and capture different information. A single image feature is therefore not sufficient to represent an image. To overcome this limitation, we propose to use multiple features for image representation. Specifically, we use the scale-invariant feature transform (SIFT), HOG and HMAX. These features are combined and fed into the relevance vector machine described below for image retrieval.

B. Relevance Vector Machine

Many machine learning problems fall under the heading of supervised learning. Here we consider a set of input vectors $X = \{x_n\}_{n=1}^{N}$ together with corresponding target values $T = \{t_n\}_{n=1}^{N}$. The goal is to use this training data, together with any relevant prior knowledge, to predict $t$ for new values of $x$. We can distinguish two cases: regression, in which $t$ is a continuous variable, and classification, in which $t$ belongs to a discrete set. We consider models in which the prediction $y(x, w)$ is expressed as a linear combination of basis functions $\phi_m(x)$ of the form

$$y(x, w) = \sum_{m=0}^{M} w_m \phi_m(x) = w^{T}\phi \qquad (1)$$

where $\{w_m\}$ are the weight parameters of the model. The relevance vector machine (RVM) makes probabilistic predictions and yet retains the excellent predictive performance of the support vector machine. It also preserves the sparseness property of the SVM. In fact, for a wide variety of test problems it results in models that are dramatically sparser than the corresponding SVM, while sacrificing little if anything in prediction accuracy.

RVM models the conditional distribution of the target variable, given an input vector $x$, as a Gaussian distribution of the form

$$P(t \mid x, w, \tau) = \mathcal{N}\left(t \mid y(x, w), \tau^{-1}\right)$$

where $\mathcal{N}(z \mid m, S)$ denotes a multivariate Gaussian distribution over $z$ with mean $m$ and covariance $S$, and $y(x, w)$ is given by Equation (1). The parameters $w$ are given a Gaussian prior

$$P(w \mid \alpha) = \prod_{m=0}^{M} \mathcal{N}\left(w_m \mid 0, \alpha_m^{-1}\right)$$

where $\alpha = \{\alpha_m\}$ is a vector of hyper-parameters, with one hyper-parameter $\alpha_m$ assigned to each model parameter. These hyper-parameters can be estimated by type-II maximum likelihood, in which the marginal likelihood $P(T \mid X, \alpha, \tau)$ is maximized with respect to $\alpha$ and $\tau$. Evaluating this marginal likelihood requires integration over the model parameters:

$$P(T \mid X, \alpha, \tau) = \int P(T \mid X, w, \tau)\, P(w \mid \alpha)\, dw$$

During this optimization many of the $\alpha_m$ grow without bound, so that the posterior of the corresponding weights concentrates at zero. The corresponding terms can be omitted from the trained model of Equation (1), and the training vectors associated with the remaining kernel functions are called relevance vectors. A re-estimation step is applied to update all of the $\alpha_m$ simultaneously.

In the classification (identification) version of the relevance vector machine, the conditional distribution of the targets is given by

$$P(t \mid x, w) = \sigma(y)^{t}\,\left[1 - \sigma(y)\right]^{1 - t}$$

where $\sigma(y) = (1 + \exp(-y))^{-1}$ and $y(x, w)$ is given by Equation (1). Here we restrict attention to the case $t \in \{0, 1\}$. The examples are assumed to be independent and identically distributed. In this case the integration needed to compute the marginal likelihood can no longer be carried out analytically. Therefore, a local Gaussian approximation to the posterior distribution of the weights is used. The hyper-parameters can then be optimized with a re-estimation scheme, alternating with re-evaluation of the posterior mode until convergence.

C. Retrieval as Identification

Classification is, however, more complicated than the regression case, because we do not have a fully conjugate hierarchical structure. To address this problem, consider again the log marginal probability of the target data given the inputs, which can be written as

$$\ln P(T \mid X) = \ln \iint P(T \mid X, w)\, P(w \mid \alpha)\, P(\alpha)\, dw\, d\alpha$$

As before, we introduce a factorized variational posterior of the form $Q_w(w)\, Q_\alpha(\alpha)$ and obtain the following lower bound on the log marginal probability:

$$\ln P(T \mid X) \ge \iint Q_w(w)\, Q_\alpha(\alpha)\, \ln \left\{ \frac{P(T \mid X, w)\, P(w \mid \alpha)\, P(\alpha)}{Q_w(w)\, Q_\alpha(\alpha)} \right\} dw\, d\alpha$$
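To make the sparse Bayesian learning of Section II-B more concrete, the following is a minimal sketch of the type-II maximum likelihood re-estimation for the regression form of RVM. It is not the authors' implementation: the Gaussian kernel, its width, the pruning threshold `prune_at`, the initial values and the toy sinc data are all assumptions chosen for illustration, and the update rules are the standard ones from the sparse Bayesian learning literature [8], with `alpha` and `tau` denoting the weight and noise precisions.

```python
import numpy as np

def rbf_design_matrix(X, centers, width):
    """Bias column followed by one Gaussian kernel column per center."""
    sq_dist = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    kernels = np.exp(-sq_dist / (2.0 * width ** 2))
    return np.hstack([np.ones((X.shape[0], 1)), kernels])

def posterior(Phi, t, alpha, tau):
    """Gaussian posterior over the weights for fixed alpha and tau."""
    Sigma = np.linalg.inv(np.diag(alpha) + tau * Phi.T @ Phi)
    mu = tau * Sigma @ Phi.T @ t
    return mu, Sigma

def fit_rvm_regression(Phi, t, n_iter=200, prune_at=1e6):
    """Re-estimate alpha and tau, pruning basis functions whose alpha diverges."""
    n_samples, n_basis = Phi.shape
    keep = np.arange(n_basis)      # indices of basis functions still in the model
    alpha = np.ones(n_basis)       # one precision hyper-parameter per weight
    tau = 1.0 / np.var(t)          # heuristic initial noise precision (assumption)
    for _ in range(n_iter):
        P = Phi[:, keep]
        mu, Sigma = posterior(P, t, alpha, tau)
        # gamma_m measures how well the data determine weight m (between 0 and 1).
        gamma = np.clip(1.0 - alpha * np.diag(Sigma), 0.0, 1.0)
        alpha = gamma / np.maximum(mu ** 2, 1e-30)
        residual = t - P @ mu
        tau = (n_samples - gamma.sum()) / max(residual @ residual, 1e-30)
        active = alpha < prune_at
        if not active.any():
            break
        keep, alpha = keep[active], alpha[active]
    mu, _ = posterior(Phi[:, keep], t, alpha, tau)
    return keep, mu, alpha, tau

# Toy data (assumed): noisy samples of sin(x)/x.
rng = np.random.default_rng(0)
X = np.linspace(-10.0, 10.0, 80)[:, None]
t = np.sinc(X[:, 0] / np.pi) + 0.1 * rng.standard_normal(80)
Phi = rbf_design_matrix(X, X, width=2.0)
keep, mu, alpha, tau = fit_rvm_regression(Phi, t)
rmse = np.sqrt(np.mean((Phi[:, keep] @ mu - t) ** 2))
print(f"{keep.size} of {Phi.shape[1]} basis functions kept, training RMSE {rmse:.3f}")
```

The indices in `keep` that refer to kernel columns (every index except 0, the bias) identify the relevance vectors. The classification version would replace the closed-form posterior with the local Gaussian approximation described above.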

Predictions for new inputs can be obtained from the trained model by substituting the posterior mean weights, which gives a predictive distribution of the form $P(t \mid x, E[w])$. For a more accurate estimate, the uncertainty of the weights should be taken into account by marginalizing over the posterior distribution of the weights.

III. EXPERIMENTAL RESULTS

In this part, we evaluate the proposed MF-RVM approach for image retrieval [16]. The experimental steps of the MF-RVM algorithm are described in the above section and shown in Figure 2. They consist of the following procedure: (1) data collection; (2) feature extraction via the approach in Section 2; (3) training the model and evaluating its performance.

Figure 2. The flowchart of the proposed MF-RVM method

A. Experimental Database

The Corel image database used in our experiments was collected by Michael Ortega-Binderberger of the University of California at Irvine. The original image collection was obtained from Corel. There are 68040 photo images from various categories. Each set of features is stored in a separate file. In each file, a line corresponds to a single image: the first value in the line is the image ID and the subsequent values are the feature vector (e.g., color histogram) of the image. The same image has the same ID in all files, but the image ID is not the same as the image filename. The database includes 68040 samples with 89 attributes, obtained from four sets of features extracted from each image: Color Histogram, 32 dimensions (8 × 4 = H × S); Color Histogram Layout, 32 dimensions (4 × 2 × 4 = H × S × sub-images); Color Moments, 9 dimensions (3 × 3); and Co-occurrence Texture, 16 dimensions (4 × 4). The class distribution of the data is presented in Table 1.

TABLE I. CLASS DISTRIBUTION OF SAMPLES

Classes                    Number of samples
Car                        40000
Bridge                     5000
People                     18040
Scenery                    5000
Total number of samples    68040
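The file layout described above (one line per image, the image ID followed by the feature values, and the same ID across files) can be read and fused into one 89-dimensional vector per image roughly as follows. This is only a sketch: the feature-set names, the whitespace-separated format and the randomly generated stand-in lines are assumptions, since the actual file names and delimiters are not given here.

```python
import numpy as np

def parse_feature_lines(lines):
    """Parse lines of the form '<image_id> <v1> <v2> ...' into {image_id: vector}."""
    table = {}
    for line in lines:
        fields = line.split()
        if fields:
            table[fields[0]] = np.array([float(v) for v in fields[1:]])
    return table

def fuse_feature_tables(tables):
    """Concatenate, per image ID, the vectors from every table that contains it."""
    shared_ids = sorted(set.intersection(*(set(tbl) for tbl in tables)))
    rows = [np.concatenate([tbl[i] for tbl in tables]) for i in shared_ids]
    return shared_ids, np.vstack(rows)

# Dimensions of the four feature sets as stated in the text (32 + 32 + 9 + 16 = 89).
dims = {
    "color_histogram": 32,
    "color_histogram_layout": 32,
    "color_moments": 9,
    "cooccurrence_texture": 16,
}

# Stand-in for the real files: random values for three hypothetical image IDs.
rng = np.random.default_rng(0)

def fake_file(dim, image_ids):
    return [" ".join([i] + [f"{v:.4f}" for v in rng.random(dim)]) for i in image_ids]

tables = [parse_feature_lines(fake_file(d, ["1", "2", "3"])) for d in dims.values()]
ids, features = fuse_feature_tables(tables)
print(ids, features.shape)  # ['1', '2', '3'] (3, 89)
```

With real data, each `fake_file(...)` call would be replaced by the lines of the corresponding feature file.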

B. Assessment Standard

To assess the proposed MF-RVM approach and the compared algorithms for image retrieval, we select classification accuracy, recognition precision and recognition recall as the assessment criteria. The definitions are given in Table 2. TP (true positive) denotes items correctly labeled as belonging to the positive category; TN (true negative) denotes items correctly labeled as belonging to the negative category; FP (false positive) denotes items incorrectly labeled as belonging to the positive category; and FN (false negative) denotes items that were not labeled as belonging to the positive category but should have been. These criteria can be applied directly to two-class or multi-class classification problems in image retrieval.

TABLE II. THE ASSESSMENT CRITERION FOR IMAGE RETRIEVAL

Evaluation standard    Definition
Precision              TP / (TP + FP)
Recall                 TP / (FN + TP)

C. Main Results

In the first experiment, we assess the proposed MF-RVM method for image retrieval on the Corel image dataset. We use two criteria, Precision and Recall, for experimental verification. Identification accuracy and recall are two typical and popular measures of the correctness of an identification model. The experimental steps are summarized in the experiment part. The pre-processing and feature extraction procedures are important because they capture the discriminant information. MF-RVM is trained with the algorithm described above, and some of its parameters are obtained by cross-validation. We repeat the test for multiple trials, and in each trial we randomly divide the dataset into a training set and a test set. Precision and Recall are used as the evaluation standards. We run the experiment 20 times and present part of the results in Table 3 and Figure 3.

TABLE III. THE PERFORMANCE COMPARISON OF DIFFERENT ALGORITHMS

Experiment    Approach         Precision (%)    Recall (%)
Trial 1       SIFT-SVM         75.29            77.83
Trial 1       MF-RVM (ours)    79.81            82.39
Trial 2       SIFT-SVM         76.49            75.26
Trial 2       MF-RVM (ours)    83.43            82.85
Trial 3       SIFT-SVM         75.56            76.55
Trial 3       MF-RVM (ours)    80.55            80.26
Trial 4       SIFT-SVM         73.97            76.24
Trial 4       MF-RVM (ours)    81.48            80.81
Trial 5       SIFT-SVM         73.98            75.79
Trial 5       MF-RVM (ours)    79.60            81.40
Average       SIFT-SVM         75.12            75.81
Average       MF-RVM (ours)    80.38            81.29

As reported in Table 3, MF-RVM reaches its highest Precision of 83.43% and its highest Recall of 82.85%. Additionally, the average Precision of MF-RVM is 80.38%, which outperforms that of SIFT-SVM (75.12%). The potential reasons for these results are mainly threefold. First, MF-RVM is able to adapt to data with a complex distribution and handle it well, where the adaptability essentially comes from the flexibility of the model parameters. Second, the parameter selection method chooses the model parameters of MF-RVM according to the distribution of the input data, which gives MF-RVM better adaptability. Third, the data processing procedure removes noise and keeps useful information effectively, and the individual steps of our method work well together.

Figure 3. Comparison of the evaluated approaches under the two standards

This experiment is run on the Corel image dataset. The dataset, collected by Michael Ortega-Binderberger, includes 68040 samples with 89 attributes for each image. The experiment aims to validate the ability of the proposed MF-RVM, as well as its solution method, in the task of image retrieval. The experimental procedure is described in the above part of this paper, where the parameters of MF-RVM are found using variational inference and cross-validation. We also compare MF-RVM with other related algorithms. The verification standards used here are Precision and Recall; identification accuracy and recall are two typical and popular measures of the correctness of an identification model.
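The Precision and Recall definitions of Table II, TP / (TP + FP) and TP / (FN + TP), can be computed from predicted and true labels as in the sketch below. The function name, the one-versus-rest treatment of a multi-class problem, and the toy labels are illustrative assumptions rather than details taken from the experiments.

```python
def precision_recall(y_true, y_pred, positive):
    """Precision and recall of class `positive`, treating all other classes as negative."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if p == positive and t == positive)
    fp = sum(1 for t, p in pairs if p == positive and t != positive)
    fn = sum(1 for t, p in pairs if p != positive and t == positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

# Toy labels (assumed) using class names from Table I.
y_true = ["car", "car", "car", "bridge", "people", "car"]
y_pred = ["car", "car", "bridge", "car", "people", "car"]

# For "car": TP = 3, FP = 1, FN = 1.
precision, recall = precision_recall(y_true, y_pred, positive="car")
print(precision, recall)  # 0.75 0.75
```

For a multi-class problem, the same function can be called once per class and the results averaged.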


We conduct experiments on the Corel image dataset. The dataset, collected by Michael Ortega-Binderberger, includes 68040 samples with 89 attributes for each image. This experiment assesses the capability of MF-RVM in image retrieval and optimization. It employs the method shown in the above part to learn MF-RVM and the cross-validation approach to select the parameters. The assessment criteria are precision and recall; identification accuracy and recall are two typical and popular measures of the correctness of an identification model. The test was performed for 10 trials, and the overall results for varying experimental configurations are presented in Table 4 and Figure 4. As shown in Table 4 and Figure 4, the Precision of MF-RVM is 77.77% with 50% of the data used for training, and MF-RVM beats the compared approach PLSA in most configurations; the exceptions are Recall with 30% training data and Precision with 50% training data, where PLSA is slightly higher. These results are consistent with previous work, which indicates that Precision is a reliable measure for image retrieval and MF-RVM. The reasons lie in the following three aspects. First, MF-RVM can map nonlinear data from the low-dimensional space to a high-dimensional space by means of a kernel function, which makes the classification problem easier. Second, the parameter selection method chooses the model parameters of MF-RVM according to the distribution of the input data, which gives MF-RVM better adaptability. Third, the framework of the proposed MF-RVM method contains a group of comprehensive procedures which sequentially maximize the performance.

TABLE IV. THE IDENTIFICATION RESULTS OF IMAGE RETRIEVAL ADOPTING MF-RVM

Training data    Verification standard    MF-RVM (ours)    PLSA
30%              Precision                68.59            65.35
30%              Recall                   67.89            68.94
40%              Precision                75.30            73.14
40%              Recall                   79.23            74.99
50%              Precision                77.77            77.82
50%              Recall                   82.18            78.14
60%              Precision                81.59            81.33
60%              Recall                   85.06            81.28
70%              Precision                85.04            78.91
70%              Recall                   86.03            83.79
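The protocol behind Table IV, repeated random splits of the dataset at training fractions from 30% to 70%, can be sketched as follows. The helper names and the `evaluate` callback are hypothetical: in a real run the callback would train MF-RVM on the training indices and return Precision or Recall on the test indices, whereas the placeholder used here only returns the test-set share so that the sketch stays self-contained.

```python
import numpy as np

def random_split(n_samples, train_fraction, rng):
    """Randomly partition sample indices into a training part and a test part."""
    order = rng.permutation(n_samples)
    n_train = int(round(train_fraction * n_samples))
    return order[:n_train], order[n_train:]

def repeated_evaluation(n_samples, train_fraction, n_trials, evaluate, seed=0):
    """Average the score returned by `evaluate(train_idx, test_idx)` over random splits."""
    rng = np.random.default_rng(seed)
    scores = [
        evaluate(*random_split(n_samples, train_fraction, rng))
        for _ in range(n_trials)
    ]
    return float(np.mean(scores))

def dummy_score(train_idx, test_idx):
    # Placeholder (assumption): stands in for training a model and scoring it.
    return len(test_idx) / (len(train_idx) + len(test_idx))

# Training fractions of Table IV, 10 trials each as stated in the text.
for fraction in (0.3, 0.4, 0.5, 0.6, 0.7):
    mean_score = repeated_evaluation(100, fraction, n_trials=10, evaluate=dummy_score)
    print(f"train fraction {fraction:.0%}: mean score {mean_score:.2f}")
```

Fixing the seed makes the sequence of splits reproducible, which helps when several methods are to be compared on identical partitions.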

Figure 4. The classification results of image retrieval employing MF-RVM

In the third experiment, we verify the capability of the proposed MF-RVM method in image retrieval using a comparison experiment. Two popular criteria, Precision and Recall, are employed for evaluation. Identification accuracy and recall are two typical and popular measures of the correctness of an identification model. The dataset is the Corel image dataset. The experimental procedures are summarized in the above part. MF-RVM is learned with the approach in the above section, where some of its parameters are set to their defaults. The test is repeated for several rounds over random splits of the database.

TABLE V. IDENTIFICATION PERFORMANCE COMPARISON OF FOUR ALGORITHMS FOR IMAGE RETRIEVAL

Experiment trial    Approach         Precision (%)    Recall (%)
Trial 1             SIFT-SVM         75.29            77.83
Trial 1             PLSA             78.30            80.29
Trial 1             SC-SVM           77.45            81.03
Trial 1             MF-RVM (ours)    79.81            82.39
Trial 2             SIFT-SVM         76.49            75.26
Trial 2             PLSA             75.92            77.64
Trial 2             SC-SVM           80.92            81.19
Trial 2             MF-RVM (ours)    83.43            82.85
Trial 3             SIFT-SVM         75.56            76.55
Trial 3             PLSA             78.26            77.27
Trial 3             SC-SVM           77.34            78.55
Trial 3             MF-RVM (ours)    80.38            81.29

We extensively compare the developed MF-RVM approach for image retrieval with three algorithms: SIFT-SVM, PLSA and SC-SVM. The classification results are shown in Table 5 and Figure 5. These experimental results indicate that: (1) the proposed MF-RVM beats all three compared approaches in every reported trial and under both assessment criteria; (2) the proposed algorithm is robust across trials and assessment criteria, which suggests that it could be applied to a range of tasks. The reasons are threefold. (1) The MF-RVM method can be applied to datasets of large scale and high dimension that contain a large amount of heterogeneous information. (2) The parameter selection method chooses the model parameters of MF-RVM according to the distribution of the input data, which gives MF-RVM better adaptability. (3) The framework of the proposed MF-RVM method contains a group of comprehensive procedures which sequentially maximize the performance.

Figure 5. The performance comparison of the evaluated approaches for image retrieval


IV. DISCUSSION

This paper presents an image retrieval method based on multiple features and RVM. Training an SVM is a convex optimization problem, and by the properties of convex optimization any local optimum is also the global optimum, a guarantee that other classification methods do not offer. In addition, SVM has shown its effectiveness in image retrieval research in terms of self-learning classification and automatic processing [1]. However, its classification performance is highly dependent on its parameters. For this reason, this paper presents a parameter selection method based on cross-validation, as an extension of the typical RVM. The method can also be applied to solve nonlinear problems quickly, which gives RVM wide application prospects in recognition and retrieval problems. The retrieval experiment is conducted on the COREL library collected from the internet. The experimental results show that obtaining the parameters by cross-validation significantly improves retrieval accuracy, so MF-RVM is highly practical for image retrieval. Moreover, the MF-RVM learning method can be applied to large-scale sample data with complex dimensions and a large amount of heterogeneous information. It offers machine learning wide application prospects and rich design ideas in the fields of feature extraction, multi-class object detection and pattern recognition. Retrieval accuracy is highly dependent on selecting the best kernel width parameter γ and penalty parameter c, so the selection of kernel parameters remains the key point of future work.

REFERENCES

[1] Xie Cheng-Wang, Comparative study on various SVM algorithms, Mini-Micro Systems, Vol. 1, No. 1, pp. 56-58, 2008.
[2] Gong Zhi-Le, Zhang De-Xian, An improved text classification algorithm for SVM, Computer Simulation, Vol. 7, No. 26, pp. 89-90, 2009.
[3] Gao Jin, Image classification based on SVM, Master thesis, Northwest University, Vol. 2, No. 10, pp. 125-126, 2010.
[4] Zhang Shu-Ya, Zhao Yi-Ming, Algorithm and implementation for image classification based on SVM, Computer Engineering and Application, Vol. 43, No. 25, pp. 156-157, 2007.
[5] Zhang Bao-Hua, Application of decision rule classifier in network intrusion detection, Computer Engineering and Application, Vol. 48, No. 26, pp. 93-95, 2012.
[6] Haralick R. M., Statistical and Structural Approaches to Texture, Proceedings of the IEEE, Vol. 67, No. 5, 1979.
[7] Xiang Li, Xuan Zhan, A New EPMA Image Fusion Algorithm based on Contourlet-lifting Wavelet Transform and Regional Variance, Journal of Software, Vol. 5, No. 11, pp. 1200-1207, 2010.
[8] Tipping, Michael E., Sparse Bayesian learning and the relevance vector machine, The Journal of Machine Learning Research, No. 1, pp. 221-244, 2001.
[9] Li Jing, Chao Shao, Image Copy-Move Forgery Detecting Based on Local Invariant Feature, Journal of Multimedia, Vol. 7, No. 1, pp. 90-97, 2012.
[10] Bobo Wang, Hong Bao, Shan Yang, Haitao Lou, Crowd Density Estimation Based on Texture Feature Extraction, Journal of Multimedia, Vol. 8, No. 4, pp. 331-337, 2013.
[11] Yin Bo, Xia Jing-Bo, Fu Kai, The prediction research of the Network traffic based on the IPSO Chaos Support Vector Machine (SVM), Computer Application Research, Vol. 29, No. 11, pp. 4293-4299, 2011.
[12] Linhao Li, Content-Based Digital Image Retrieval based on Multi-Feature Amalgamation, Journal of Multimedia, Vol. 8, No. 6, pp. 739-746, 2013.
[13] Zemin Liu, Zong Wei, Image Classification Optimization Algorithm based on SVM, Journal of Multimedia, Vol. 8, No. 5, pp. 496-502, 2013.
[14] Shoujia Wang, Wenhui Li, Ying Wang, Yuanyuan Jiang, Shan Jiang, Ruilin Zhao, An Improved Difference of Gaussian Filter in Face Recognition, Journal of Multimedia, Vol. 7, No. 6, pp. 429-433, 2012.
[15] Xiao-wei Zhu, Research on Automatic Classification Technology of Flash Animations based on Content Analysis, Journal of Multimedia, Vol. 8, No. 6, pp. 693-698, 2013.
[16] Xiang Zhang, Changjiang Zhang, Satellite Cloud Image Registration by Combining Curvature Shape Representation with Particle Swarm Optimization, Journal of Software, Vol. 6, No. 3, pp. 483-489, 2011.

Zemin Liu is currently an associate professor in the College of Mathematics and Computer Science at Panzhihua University, P. R. China. He received a bachelor's degree in Engineering from Xi'an Jiaotong University in 1987 and a master's degree from the University of Electronic Science and Technology of China in 2007. He teaches compiler principles and operating systems, and he hosts and participates in a number of projects and research topics. His current research interests include machine intelligence, computer graphics and digital image processing.

Wei Zong received the B.S. degree in Electrical Engineering and Computer Science and the M.S. and Ph.D. degrees in Biomedical Engineering from Xi'an Jiaotong University (XJTU), Xi'an, P. R. China, in 1983, 1986 and 1993, respectively. From 1993 to 1997, he was an associate professor at the Institute of Biomedical Engineering, XJTU. Since 1997, he has been affiliated with the Harvard-MIT Division of Health Sciences and Technology, Massachusetts Institute of Technology (MIT). His research interests include signal and image processing, pattern recognition, machine learning, data mining and medical informatics.