Expert Systems with Applications 39 (2012) 3031–3036
Matrix representation in pattern classification

Loris Nanni a,*, Sheryl Brahnam b, Alessandra Lumini c

a Department of Information Engineering, University of Padua, Via Gradenigo 6, 35131 Padova, Italy
b Computer Information Systems, Missouri State University, 901 S. National, Springfield, MO 65804, USA
c Department of Electronic, Informatics and Systems (DEIS), Università di Bologna, Via Venezia 52, 47023 Cesena, Italy

* Corresponding author. Tel.: +39 0547 339121; fax: +39 0547 338890.
E-mail addresses: [email protected] (L. Nanni), [email protected] (S. Brahnam), [email protected] (A. Lumini).
doi:10.1016/j.eswa.2011.08.165
Keywords: Pattern classification; Texture descriptor; Local ternary patterns; Local phase quantization; Support vector machines
Abstract

Presented in this paper is a novel feature extraction technique based on texture descriptors. Starting from the standard feature vector representation, we study different methods for representing a pattern as a matrix; texture descriptors are then used to describe each pattern. We examine a variety of local ternary pattern and local phase quantization texture descriptors. Since these texture descriptors extract information using subwindows of the textures (i.e. sets of neighboring pixels), they handle the correlation among the original features (note that the pixels of the texture that describes a pattern are obtained from the original features). We believe that our new technique exploits a new source of information. Our best approach on several well-known benchmark datasets is obtained by coupling the continuous wavelet approach for transforming a vector into a matrix with a variant of local phase quantization based on a ternary coding for extracting the features from the matrix. Support vector machines are used both for the vector-based descriptors and for the texture descriptors. Our experiments show that the texture descriptors and the vector-based descriptors can be combined to improve overall classifier performance.

© 2011 Elsevier Ltd. All rights reserved.
1. Introduction

The general method for solving most machine pattern recognition problems is to transform raw sensor inputs so that a selection of relevant features can be extracted. These features are then transformed and passed along to a classifier system. Optimizing the feature processing step is a major focus of research in pattern recognition because the raw sensor data in many machine pattern recognition problems, for example image recognition, is a high-dimensional matrix of values. It is common practice to reshape these matrices as a vector before applying various feature transforms, such as principal component analysis (Beymer & Poggio, 1996). However, reshaping the matrix into a one-dimensional vector is not the only way to represent sensor values, nor is it necessarily the most desirable method. Local binary patterns (LBP) (Nanni & Lumini, 2008; Ojala, Pietikainen, & Maeenpaa, 2002) and Gabor filters (Eustice, Pizarro, Singh, & Howland, 2002), for instance, are able to extract texture descriptors directly from the matrix, and several two-dimensional feature transforms have recently been proposed. Yang, Zhang, Frangi, and Yang (2004), for example, have developed a two-dimensional method for performing principal component analysis, called 2DPCA,
[email protected] (L. Nanni),
[email protected] (S. Brahnam),
[email protected] (A. Lumini). 0957-4174/$ - see front matter Ó 2011 Elsevier Ltd. All rights reserved. doi:10.1016/j.eswa.2011.08.165
directly on the sensor matrix. Several benefits have been reported for directly applying a transform and extracting features from raw sensor matrices of data: such methods improve the performance of classical feature transforms and result in significant reductions in computation time (Wang, Chen, Liu, & Zhang, 2008). Studies have also been reported where researchers were able to improve performance by developing classifiers that directly handled 2-D patterns (see for example, Wang & Chen, 2008; Wang et al., 2008). A number of studies have explored matrix representations of features for generic classification problems (Chen, Zhu, Zhang, & Yang, 2005; Liu & Chen, 2006; Wang & Chen, 2008; Wang et al., 2008). In Fig. 1, we show an example of a matrix obtained by reshaping a pattern vector so that a two-dimensional feature transform can be applied. By using different reshaping methods, Wang and Chen (2008) and Wang et al. (2008), for instance, were able to diversify the design of the classifiers and develop an AdaBoost technique that exploited that diversity. In (Kim & Choi, 2007) a composite feature matrix representation was proposed. A composite feature consists of a number of primitive features, each of which corresponds to an input variable. The composite features are derived using discriminant analysis. In (Nanni, Brahnam, & Lumini, 2010) patterns were rearranged as matrices and then local ternary patterns (LTP) were used for extracting the features. The vector was reshaped as a matrix by random assignment, with 50 different random reshapings (see Fig. 1) performed. For each reshaping, a
different SVM was trained, and the results were then combined using the mean rule.

Fig. 1. Reshaping a vector into a matrix.

In (Nanni et al., 2010) the following two observations were made: (1) texture descriptors and vector-based descriptors can be combined to improve the overall performance of the system; and (2) linear SVMs always work well when the SVMs are trained on the texture descriptors.

In this paper, we expand (Nanni et al., 2010) in the following ways:

- We compare different methods for matrix representation in pattern classification and find that the best approach is based on continuous wavelets;
- We compare the novel local phase quantization and LTP for extracting the features, and find that our best approach is a variant of local phase quantization based on a ternary coding;
- We test our approach of representing the patterns as matrices for peptide and protein classification, obtaining in all the tested datasets a performance similar to or better than that obtained by the state-of-the-art.

The remainder of this paper is organized as follows. In Section 2 we provide an overview of the proposed feature extraction approach. In Section 3, we report experimental results. Finally, in Section 4, we provide a few concluding remarks and directions for future research.
2. Feature extraction approach

In this paper, we propose to reshape the vector representation of a feature pattern into a matrix using a continuous wavelet approach. A variant of local phase quantization is then used for extracting the features from the matrix obtained by the continuous wavelet.

The classifier used in our experiments is a random subspace ensemble (Ho, 1998) of 50 SVMs, each trained on a random subset containing 50% of the original features; these 50 classifiers are combined by sum rule. This ensemble is used both for the vector-based descriptors and for the texture descriptors. SVM is a bi-class classifier trained to find the equation of a hyperplane that divides the training set into two classes (leaving all the points of the same class on the same side) while maximizing the distance between the two classes and the hyperplane (Duda, Hart, & Stork, 2000). The feature values are linearly normalized to [0, 1] (using only the training data to estimate the normalization) before training the SVMs.

Using texture descriptors based on local neighborhoods, we investigate the correlation among the original features that belong to a given neighborhood (recall that the pixels of the texture that describes a pattern are obtained from the original features). To exploit this property, using several different neighborhoods, each built from different features, we extract several images from each pattern simply by randomly sorting the features of the original pattern before the matrix creation.
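As a concrete illustration of the classifier just described, the following is a minimal sketch of the random subspace ensemble (50 linear SVMs, each trained on a random 50% of the features, combined by the sum rule, with [0, 1] normalization estimated on the training data). scikit-learn is an assumed environment, and the function names and any parameters not given in the text are illustrative only.

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.preprocessing import MinMaxScaler

def train_random_subspace_svm(X_train, y_train, n_classifiers=50, subset_frac=0.5, seed=0):
    """Random subspace ensemble (Ho, 1998): each SVM sees a random 50% of the features."""
    rng = np.random.RandomState(seed)
    n_features = X_train.shape[1]
    k = max(1, int(round(subset_frac * n_features)))
    scaler = MinMaxScaler().fit(X_train)          # linear normalization to [0, 1] on the training data
    Xn = scaler.transform(X_train)
    ensemble = []
    for _ in range(n_classifiers):
        idx = rng.choice(n_features, size=k, replace=False)
        clf = SVC(kernel="linear", probability=True).fit(Xn[:, idx], y_train)
        ensemble.append((idx, clf))
    return scaler, ensemble

def predict_scores(scaler, ensemble, X_test):
    """Combine the ensemble members by the sum rule on their class-probability outputs."""
    Xn = np.clip(scaler.transform(X_test), 0.0, 1.0)
    return sum(clf.predict_proba(Xn[:, idx]) for idx, clf in ensemble)
```

The column of the returned score matrix corresponding to the positive class can be used directly as the classifier score, e.g. for the AUC-based evaluation of Section 3.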
In this paper, we test three different ways for rearranging a vector into a matrix:

- Continuous wavelet (CW);
- Random reshaping (RR);
- Feature values (FV).

As we noted in the introduction, rearranging a vector into a matrix potentially creates classifier diversification (Wang et al., 2008). In our study, we exploit this diversity in a very simple way:

- When facing a standard classification problem, we randomly sort the features and then reshape the vector as a matrix. For each pattern, a set of 50 images is created by simply randomly sorting the features of the original pattern before matrix creation. A different classifier is trained for each image, and the classifiers are then combined by mean rule.
- When facing a peptide/protein classification problem, we extract a different matrix from each peptide/protein using a different physicochemical property. First, the peptide/protein sequence is converted to a numerical sequence, substituting each amino acid with its value for a given physicochemical property. In our investigation, the physicochemical properties are obtained from the amino acid index database (Kawashima & Kanehisa, 2000), available at http://www.genome.jp/dbget/aaindex.html (we have not considered the properties where the amino acids have value 0 or 1). An amino acid index is a set of 20 numbers representing a given physicochemical property of the 20 amino acids. For each protein descriptor, a set of 50 physicochemical properties is used. A different classifier is trained for each physicochemical property, and the classifiers are then combined by mean rule.

Continuous wavelet. The Meyer continuous wavelet is applied to the above encodings, and the wavelet power spectrum is extracted considering 100 different decomposition scales (in Matlab, cwt(I,1:100,'meyr')). Wavelet transforms (WT) are a popular tool for spectral and temporal signal analysis. With the continuous wavelet transform (CWT), digital signals can be decomposed into many groups of coefficients at different scales; the coefficient vectors exhibit characteristics in both the temporal and the frequency domains. The wavelet power spectrum (WPS) is a representation of the variations as they accumulate at each scale in the decomposition of the data. The CWT and the WPS can be mathematically described as follows:
W_f(a, b) = \frac{1}{\sqrt{a}} \int f(t)\, \psi\!\left(\frac{t - b}{a}\right) dt \qquad \text{and} \qquad \mathrm{WPS}[j] = \sum_{k=1}^{K} C_{j,k}^2,
where a is the scale factor (a > 0), b is the shift factor, f(t) is the digital signal sequence, \psi(t) is the wavelet core, W_f(a, b) is the inner product between f(t) and the scaled and shifted wavelet, C_{j,k} are the coefficients at the different scales, j is the level of decomposition, k is the order of the decomposition, WPS[j] is the WPS at the jth level of decomposition, and all values are in R.

Random reshaping. To reshape a vector of dimension s as a matrix, we create a square matrix of dimension \sqrt{s} \times \sqrt{s}. We then randomly assign each entry of the matrix a value from the vector.

Feature values. The value of the element (i, j) of the image that describes a given pattern is the sum of the value of the feature in position i of the pattern and the value of the feature in position j.

We also combine, using the mean rule, the proposed method with the standard vector-based method. As shown in Section 3, this combination outperforms the standard vector-based approach.
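To make the three transformations concrete, here is a small sketch of one possible implementation. The paper computes the CW representation with the Meyer wavelet in Matlab (cwt(I,1:100,'meyr')); PyWavelets' continuous transform does not include the Meyer wavelet, so a Morlet wavelet is substituted here, the use of squared coefficients as the texture is our reading of the WPS description, and the zero-padding for non-square sizes in the random reshaping is our own assumption.

```python
import numpy as np
import pywt

def cw_matrix(vec, n_scales=100, wavelet="morl"):
    """Continuous wavelet (CW): scales x length matrix of squared CWT coefficients of the
    1-D feature (or physicochemically encoded peptide) sequence. 'morl' stands in for Meyer."""
    coeffs, _ = pywt.cwt(np.asarray(vec, dtype=float),
                         scales=np.arange(1, n_scales + 1), wavelet=wavelet)
    return coeffs ** 2

def rr_matrix(vec, rng):
    """Random reshaping (RR): randomly place the s feature values into a sqrt(s) x sqrt(s) matrix."""
    vec = np.asarray(vec, dtype=float)
    side = int(np.ceil(np.sqrt(vec.size)))
    out = np.zeros(side * side)
    pos = rng.choice(side * side, size=vec.size, replace=False)   # random assignment of entries
    out[pos] = vec                                                # unused cells (if any) stay 0
    return out.reshape(side, side)

def fv_matrix(vec):
    """Feature values (FV): element (i, j) is the sum of the features in positions i and j."""
    vec = np.asarray(vec, dtype=float)
    return vec[:, None] + vec[None, :]

rng = np.random.RandomState(0)
x = rng.rand(64)                      # e.g. a feature vector of a pattern
textures = [cw_matrix(x), rr_matrix(x, rng), fv_matrix(x)]
```

Each of the resulting matrices is then treated as a texture image and described with the LTP/LPQ descriptors of Sections 2.1 and 2.2.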
The remainder of this section describes the various texture descriptors used in our proposed ensemble method.
2.1. Local binary/ternary patterns
LBP has become a fairly popular local feature extractor for texture descriptors because of the following desirable properties (Ojala et al., 2002):
- Calculating LBP does not require high computational power;
- LBP is robust to illumination changes and rotations;
- LBP works well when applied to many classification problems, ranging from texture classification (Ojala et al., 2002) to automated cell phenotype image classification (Nanni & Lumini, 2008).

LBP examines the joint distribution of the gray-scale values of a circularly symmetric neighborhood of P pixels around a pixel x on a circle of radius R. The LBP histogram of dimension N (typically N = P + 2) is obtained by considering all the pixels of a given image. The difference, d, between x and a neighbor u is encoded using two values:
d = \begin{cases} 1, & u \ge x \\ 0, & \text{otherwise} \end{cases}
Distributions are called uniform patterns when they contain at most two bitwise transitions, 0–1 or 1–0, using circular binary codes. Examples of uniform patterns are 11111111, 00000110, and 10000111; all other patterns are called non-uniform patterns. When using uniform LBP, each uniform pattern is assigned its own bin in the histogram, while a single bin collects all the non-uniform patterns. Because of the circular sampling of the neighborhoods, it is straightforward to make LBP codes invariant with respect to rotation in the image domain: simply remove the effect of rotation by rotating each LBP code back to a reference position. This effectively makes all rotated versions of a binary code the same. When this approach is used, LBP is known as rotation invariant LBP.
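To make the uniform and rotation-invariant mappings concrete, here is a small illustrative sketch (not the Outex implementation referenced in the footnote below): it counts the circular bit transitions of a P-bit code and rotates a code to its minimal representative.

```python
def transitions(code, P=8):
    """Number of circular 0/1 transitions in a P-bit LBP code; 'uniform' codes have at most 2."""
    bits = [(code >> i) & 1 for i in range(P)]
    return sum(bits[i] != bits[(i + 1) % P] for i in range(P))

def rotation_invariant(code, P=8):
    """Map an LBP code to the minimum value over all circular bit rotations (rotation invariant LBP)."""
    return min(((code >> r) | (code << (P - r))) & ((1 << P) - 1) for r in range(P))

assert transitions(0b11111111) == 0 and transitions(0b00000110) == 2   # uniform examples from the text
assert transitions(0b01010101) == 8                                    # a non-uniform pattern
assert rotation_invariant(0b00000110) == 0b00000011
```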
A recent variant of the original local binary pattern is the local ternary pattern (LTP) (Tan & Triggs, 2007). LTP represents the gray-scale differences between a pixel and its neighborhood using a ternary value. LTP is defined as follows:

d = \begin{cases} 1, & u \ge x + s \\ 0, & x - s \le u < x + s \\ -1, & \text{otherwise} \end{cases}

In our investigations the value of the threshold s is fixed to 1/5 of the standard deviation of the values of the training images. A ternary value is less sensitive to noise: small variations smaller than s between two pixels are not considered. To obtain the final feature vector, the ternary pattern is split into two binary patterns by considering the positive and negative components; the LBP histograms computed from these two patterns are then concatenated. We test two approaches based on local ternary patterns:

- UNIF, where all the uniform bins are used for training a random subspace of SVMs. We concatenate the two histograms calculated with (P = 8, R = 1) and (P = 16, R = 2).
- NP, where, starting from all the rotation invariant bins, the selection method proposed in Nanni, Brahnam, and Lumini (2010a) is used. Also in this case we concatenate the two histograms calculated with (P = 8, R = 1) and (P = 16, R = 2).
3 The Matlab implementation is available at the Outex site (http://www.outex.oulu.fi).
4 To extract the histogram we use the following mapping: mapping=getmapping011(18,'u2');
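As a minimal illustration of the LTP coding just described (radius 1, eight neighbours), the sketch below ternarizes the neighbour differences with threshold s, splits the ternary pattern into its positive and negative binary patterns, and histograms each. The uniform/rotation-invariant mappings and the (P = 16, R = 2) configuration are omitted for brevity, and NumPy is an assumed environment.

```python
import numpy as np

def ltp_histograms(img, s):
    """Local ternary patterns at R=1, P=8: concatenated histograms of the 'positive' and
    'negative' binary patterns obtained by splitting the ternary code."""
    img = np.asarray(img, dtype=float)
    center = img[1:-1, 1:-1]
    # offsets of the 8 neighbours on the radius-1 ring
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]
    pos_code = np.zeros(center.shape, dtype=np.int32)
    neg_code = np.zeros(center.shape, dtype=np.int32)
    for bit, (dy, dx) in enumerate(offsets):
        neigh = img[1 + dy: img.shape[0] - 1 + dy, 1 + dx: img.shape[1] - 1 + dx]
        pos_code += (neigh >= center + s).astype(np.int32) << bit   # ternary value +1
        neg_code += (neigh <  center - s).astype(np.int32) << bit   # ternary value -1
    h_pos = np.bincount(pos_code.ravel(), minlength=256)
    h_neg = np.bincount(neg_code.ravel(), minlength=256)
    return np.concatenate([h_pos, h_neg])

# threshold fixed to 1/5 of the standard deviation of the training images, as in the text
train_imgs = [np.random.rand(32, 32) for _ in range(10)]
s = np.std(np.stack(train_imgs)) / 5.0
feat = ltp_histograms(train_imgs[0], s)
```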
2.2. Local phase quantization

The local phase quantization (LPQ) operator, first proposed as a texture descriptor by Ojansivu and Heikkila, is based on the blur invariance property of the Fourier phase spectrum. It uses the local phase information extracted from the 2-D short-term Fourier transform (STFT) computed over a rectangular neighborhood at each pixel position of the image. In the Fourier domain, the model for spatially invariant blurring of an image, g(x), is

G(u) = F(u) \cdot H(u),   (1)
where G(u), F(u), and H(u) are the discrete Fourier transforms (DFTs) of the blurred image g(x), the original image f(x), and the point spread function (PSF) h(x), respectively, and u is a vector of coordinates [u, v]^T. The magnitude and the phase in Eq. (1) can be separated. Thus,
|G(u)| = |F(u)| \cdot |H(u)| \qquad \text{and} \qquad \angle G(u) = \angle F(u) + \angle H(u).   (2)

In the case where the blur h(x) is centrally symmetric, its Fourier transform is always real-valued, and its phase is a two-valued function given by \angle H(u) = 0 if H(u) \ge 0 and \pi otherwise. LPQ is based on the blur invariance property noted above.
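The property behind Eq. (2) can be checked numerically: the DFT of a centrally symmetric PSF is real-valued, so its phase contributes only 0 or π. The Gaussian PSF and grid size in the following few lines are arbitrary choices for the demonstration.

```python
import numpy as np

n = 64
d = np.minimum(np.arange(n), n - np.arange(n))       # circular distance from the origin
dy, dx = np.meshgrid(d, d, indexing="ij")
h = np.exp(-(dx ** 2 + dy ** 2) / (2 * 2.0 ** 2))    # centrally symmetric (Gaussian) PSF
h /= h.sum()

H = np.fft.fft2(h)
print(np.abs(H.imag).max())   # ~1e-16: the transform is real-valued,
                              # so its phase can only take the two values 0 and pi
```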
It uses the local phase information extracted using the STFT computed over a rectangular neighborhood N_x of size M × M at each pixel position x of the image f(x):

F(u, x) = \sum_{y \in N_x} f(x - y)\, e^{-j 2\pi u^T y} = w_u^T f_x,   (3)
where w_u is the basis vector of the 2-D DFT at frequency u, and f_x is a vector containing all M^2 image samples from N_x. Only four complex coefficients are considered in LPQ. These correspond to the 2-D frequencies u_1 = [a, 0]^T, u_2 = [0, a]^T, u_3 = [a, a]^T, and u_4 = [a, -a]^T, where a is the first frequency below the first zero crossing of H(u), so that \angle G(u_i) = \angle F(u_i) holds whenever H(u_i) \ge 0. Letting
F_x^c = [F(u_1, x), F(u_2, x), F(u_3, x), F(u_4, x)] \qquad \text{and} \qquad F_x = [\mathrm{Re}\{F_x^c\}, \mathrm{Im}\{F_x^c\}]^T,   (4)

where Re{·} and Im{·} return the real and the imaginary parts of a complex number, respectively, the corresponding 8 × M^2 transform matrix is

W = [\mathrm{Re}\{w_{u_1}, w_{u_2}, w_{u_3}, w_{u_4}\}, \mathrm{Im}\{w_{u_1}, w_{u_2}, w_{u_3}, w_{u_4}\}]^T.   (5)
Thus,

F_x = W f_x.   (6)

Assume that for f_x the correlation coefficient between adjacent pixel values is \rho, the variance of each sample is \sigma^2 = 1, and the covariance between positions x_i and x_j is \sigma_{ij} = \rho^{\lVert x_i - x_j \rVert}, where \lVert \cdot \rVert denotes the L_2 norm. The covariance matrix of the transform coefficient vector F_x can then be obtained as D = W C W^T, where C is the covariance matrix of all M^2 samples in N_x. The coefficients need to be decorrelated before quantization. Assuming a Gaussian distribution, a whitening transform can achieve independence:
G_x = V^T F_x,   (7)

where V is an orthonormal matrix derived from the singular value decomposition (SVD) of the matrix D, and G_x is computed for all image positions. The resulting vectors are quantized using a scalar quantizer: q_j = 1 if g_j \ge 0 and 0 otherwise, where g_j is the jth component of G_x. The quantized coefficients are represented as integers between 0 and 255 using the binary coding b = \sum_{j=1}^{8} q_j 2^{j-1}. Finally, a histogram of these integer values is composed and used as a feature vector.
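Putting Eqs. (3)–(7) together, a compact sketch of the LPQ histogram computation might look as follows. The window size M, the frequency a = 1/M and ρ = 0.9 are common choices from the LPQ literature rather than values fixed in this paper, and NumPy/SciPy are an assumed environment; treat it as a sketch of the technique, not the authors' exact implementation.

```python
import numpy as np
from scipy.signal import convolve2d

def lpq_histogram(img, M=3, rho=0.9):
    """Sketch of LPQ: STFT over an M x M window at four frequencies, decorrelation with the
    rho-correlation model, sign quantization b = sum_j q_j 2^(j-1), 256-bin histogram."""
    img = np.asarray(img, dtype=float)
    r = (M - 1) // 2
    x = np.arange(-r, r + 1)
    a = 1.0 / M                                     # assumed choice of the low frequency a
    w0 = np.ones(M, dtype=complex)
    w1 = np.exp(-2j * np.pi * a * x)
    # the four frequencies u1=[a,0]^T, u2=[0,a]^T, u3=[a,a]^T, u4=[a,-a]^T (row, column)
    freqs = [(w1, w0), (w0, w1), (w1, w1), (w1, np.conj(w1))]
    kernels = [np.outer(wy, wx) for wy, wx in freqs]
    filters = [k.real for k in kernels] + [k.imag for k in kernels]      # Eq. (4): [Re, Im]
    W = np.stack([f.ravel() for f in filters])                           # 8 x M^2, Eq. (5)
    # covariance model sigma_ij = rho^||x_i - x_j||, D = W C W^T, whitening matrix V (Eq. (7))
    yy, xx = np.meshgrid(x, x, indexing="ij")
    pts = np.stack([yy.ravel(), xx.ravel()], axis=1).astype(float)
    C = rho ** np.linalg.norm(pts[:, None, :] - pts[None, :, :], axis=2)
    V = np.linalg.svd(W @ C @ W.T)[0]
    # STFT coefficients at every (valid) pixel position, as in Eq. (3), then decorrelate and quantize
    F = np.stack([convolve2d(img, f, mode="valid").ravel() for f in filters])   # 8 x N
    G = V.T @ F
    codes = ((G >= 0).astype(int) * (1 << np.arange(8))[:, None]).sum(axis=0)
    return np.bincount(codes, minlength=256)

# example: a hypothetical 32 x 32 texture obtained from a reshaped feature vector
hist = lpq_histogram(np.random.rand(32, 32), M=5)
```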
The following ternary coding scheme can be applied to improve performance even further:

Q_j = \begin{cases} 1, & g_j \ge s(j) \\ 0, & -s(j) \le g_j < s(j) \\ -1, & \text{otherwise} \end{cases}

where g_j is the jth component of G_x. The value of the threshold s(j) is set to half of the standard deviation of the jth component of G_x. The ternary coding is split into two binary patterns by considering the positive and negative components, and from each of these two codings a 256-dimensional feature vector is extracted as in standard LPQ. For both standard LPQ and our ternary coding (named TERN in Section 3), we concatenate the three histograms calculated with r (the filter size) = 1, 3, 5. Moreover, for LPQ and TERN we use the random subspace ensemble of SVMs as the classifier, since the feature vectors extracted by these methods are quite long (we thereby avoid the curse-of-dimensionality problem).
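The ternary variant only changes the quantization step. Given the decorrelated coefficients for every pixel (e.g. the 8 × N matrix G computed in the previous sketch), each component is compared against a per-component threshold equal to half of its standard deviation, and the resulting ternary code is split into two binary codings, each summarized by a 256-bin histogram. A minimal sketch under those assumptions:

```python
import numpy as np

def tern_histograms(G):
    """Ternary LPQ coding: G is an 8 x N matrix of decorrelated coefficients (one column per pixel)."""
    G = np.asarray(G, dtype=float)
    s = 0.5 * G.std(axis=1, keepdims=True)        # threshold s(j): half the std of the jth component
    weights = (1 << np.arange(G.shape[0]))[:, None]
    pos = ((G >= s) * weights).sum(axis=0)        # binary coding of the ternary value +1
    neg = ((G < -s) * weights).sum(axis=0)        # binary coding of the ternary value -1
    h_pos = np.bincount(pos.astype(int), minlength=256)
    h_neg = np.bincount(neg.astype(int), minlength=256)
    return np.concatenate([h_pos, h_neg])         # 512-dimensional descriptor for one filter size
```

As in the text, the descriptors obtained with the three filter sizes (r = 1, 3, 5) would then be concatenated to form the final TERN feature vector.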
3. Experimental results

3.1. Generic pattern recognition datasets

For comparing the proposed approaches with other state-of-the-art techniques, we report results obtained on eight benchmark datasets (several from the UCI Repository) and several real problems (see Table 1 and Sections 3.2 and 3.3 for the characteristics of the datasets). A detailed description of the datasets from the UCI Repository is available at http://archive.ics.uci.edu/ml/. As in many classification approaches, the features of these datasets have been linearly normalized between 0 and 1. The results on these datasets have been averaged over ten experiments; for each experiment we randomly resample the training and the testing sets (each containing half of the patterns). Since this work is focused on bi-class problems, the results are reported in terms of the error area under the ROC curve (Qin, 2006), i.e. one minus the AUC. The AUC is a scalar measure that can be interpreted as the probability that the classifier will assign a higher score to a randomly picked positive sample than to a randomly picked negative sample.

3.2. Protein datasets

3.2.1. Human protein–protein interaction (HUM)
The HUM dataset (Bock & Gough, 2003) examines human protein–protein interaction. HUM contains a total of 1882 human protein pairs; each pair of proteins is labeled as either an interacting pair or a non-interacting pair.

3.2.2. Helicobacter protein–protein interaction (HEL)
The HEL dataset (Bock & Gough, 2003) examines helicobacter protein–protein interaction. HEL contains a total of 2,916 helicobacter protein pairs; each pair of proteins is labeled as either an interacting pair or a non-interacting pair.

3.3. Peptide datasets

- Vaccine (VAC) (Bozic, Zhang, & Brusic, 2005), which contains peptides from five HLA-A2 molecules that bind/do not bind multiple HLA. The testing protocol suggested in Bozic et al. (2005) has been adopted: a 'five-molecule' cross-validation method, where all the peptides related to a given molecule are used as the testing set and all the peptides related to the other four molecules form the training set (see Table 2 for details).
- HIV-1 PR 1625 dataset (HIV) (Kontijevskis, Wikberg, & Komorowski, 2007), which contains 1625 octamer protein sequences: 374 HIV-1 protease cleavable sites and 1251 uncleavable sites. On this dataset, 10-fold cross-validation is used to assess the performance.

Table 2
Number of binders (B) and non-binders (NB) in the training and testing sets for HLA-A2.

HLA-A2 | Training set B | Training set NB | Testing set B | Testing set NB
0201 | 224 | 378 | 440 | 1999
0202 | 619 | 2361 | 45 | 25
0204 | 641 | 2162 | 23 | 224
0205 | 648 | 2346 | 16 | 40
0206 | 621 | 2349 | 43 | 37
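For concreteness, the sketch below mirrors the two evaluation protocols described in this section: ten random 50/50 training/testing resamplings with the error area under the ROC curve for the generic datasets, and the 'five-molecule' (leave-one-molecule-out) protocol for VAC. scikit-learn utilities are an assumed environment, and the stand-alone SVM here is only a placeholder for the ensembles described in Section 2.

```python
import numpy as np
from sklearn.model_selection import train_test_split, LeaveOneGroupOut
from sklearn.metrics import roc_auc_score
from sklearn.svm import SVC

def error_auc_resampling(X, y, n_runs=10):
    """Average error AUC (in %) over ten random half/half training-testing splits."""
    errors = []
    for run in range(n_runs):
        X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.5, stratify=y, random_state=run)
        score = SVC(kernel="linear", probability=True).fit(X_tr, y_tr).predict_proba(X_te)[:, 1]
        errors.append(100 * (1 - roc_auc_score(y_te, score)))
    return np.mean(errors)

def five_molecule_cv(X, y, molecule):
    """'Five-molecule' protocol: each HLA-A2 molecule in turn provides the testing set."""
    aucs = []
    for tr, te in LeaveOneGroupOut().split(X, y, groups=molecule):
        score = SVC(kernel="rbf", probability=True).fit(X[tr], y[tr]).predict_proba(X[te])[:, 1]
        aucs.append(roc_auc_score(y[te], score))
    return np.mean(aucs)
```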
Table 1
Characteristics of the datasets used in the experimentation: number of attributes (#A), number of examples (#E), brief description.

Dataset | #A | #E | Brief description
BREAST | 100 | 584 | To classify samples of benign and malignant tissues, for details see (Junior, Cardoso de Paiva, Silva, & Muniz de Oliveira, 2009). To extract the features from each image we extract the 100 rotation invariant LTP bins, with P = 16 and R = 2, with the highest variance (considering only the training data)
PAP | 100 | 917 | To classify each cell extracted from a pap test as normal or abnormal, for details see (Jantzen et al., 2005). To extract the features from each image we extract the 100 rotation invariant LTP bins, with P = 16 and R = 2, with the highest variance (considering only the training data)
WDBC | 31 | 569 | Breast tumor diagnosis (UCI Repository)
SONAR | 60 | 208 | To discriminate between sonar signals bounced off a metal cylinder and those bounced off a roughly cylindrical rock (UCI Repository)
IONOSPHERE (IONO) | 34 | 351 | Classification of radar returns from the ionosphere (UCI Repository)
German credit (CrG) | 20 | 1000 | To discriminate between the good credit group and the bad credit group (UCI Repository)
Tornado | 24 | 18951 | To develop a hybrid forecast system for the discrimination of tornadic from non-tornadic events, see (Trafalis, Ince, & Richman, 2003) for details (UCI Repository)
Spam | 57 | 4601 | Classifying email as spam or non-spam (UCI Repository)
Table 3
Performance (error area under the ROC curve) obtained in the generic pattern recognition problems. CW = continuous wavelet, FV = feature values, RR = random reshaping.

Dataset | SV | FU | UNIF-CW | UNIF-FV | UNIF-RR | NP-CW | NP-FV | NP-RR | TERN-CW | TERN-FV | TERN-RR | LPQ-CW | LPQ-FV | LPQ-RR
Iono | 1.9 | 1.6 | 2.1 | 2.0 | 1.9 | 2.2 | 1.9 | 1.5 | 1.7 | 1.9 | 1.5 | 1.5 | 1.9 | 1.5
Sonar | 4.8 | 4.5 | 8.7 | 11.5 | 9.7 | 9.5 | 10.6 | 9.9 | 6.7 | 7.9 | 5.9 | 6.9 | 9.0 | 6.8
Wdbc | 0.4 | 0.4 | 1.0 | 0.9 | 0.9 | 1.1 | 0.9 | 0.9 | 1.6 | 1.5 | 1.2 | 1.6 | 1.3 | 1.1
CrG | 19.9 | 19.2 | 21.1 | 22.3 | 21.1 | 21.9 | 21.6 | 21.6 | 20.8 | 20.9 | 21.3 | 21.1 | 20.7 | 20.7
PAP | 13.2 | 12.6 | 15.1 | 19.1 | 16.5 | 17.9 | 20.2 | 20.1 | 13.1 | 16.2 | 13.2 | 12.2 | 17.2 | 13.8
breast | 7.4 | 6.0 | 9.8 | 9.3 | 9.2 | 12.9 | 10.6 | 11.3 | 7.3 | 7.3 | 7.4 | 7.6 | 8.2 | 7.6
Spam | 2.3 | 2 | – | – | – | – | – | – | 2.8 | – | – | – | – | –
Tornado | 10.4 | 7.3 | – | – | – | – | – | – | 6.7 | – | – | – | – | –
Table 4
Performance (AUC) obtained in the peptide datasets.

Dataset | Descriptors: Nanni et al. (2010b) | Proposed (linear SVM) | Proposed (RBF SVM)
HIV | 0.973 | 0.969 | 0.982
VAC | 0.880 | 0.906 | 0.914

Table 5
Performance (AUC) obtained in the protein datasets.

Dataset | Descriptors: RC | Descriptors: DW + DL | Proposed (linear SVM) | Proposed (RBF SVM)
HUM | 0.717 | 0.690 | 0.590 | 0.680
HEL | 0.925 | 0.880 | 0.909 | 0.928
3.4. Experimental results

In Table 3 we report the performance of our approaches for rearranging a vector into a matrix, each coupled with the four texture descriptors. In this table we also report the performance of the following:

- SV, a stand-alone support vector machine trained with the original features (in each dataset the best kernel and the best parameters are chosen);
- FU, fusion by the mean rule between SV and our best approach based on texture descriptors.

In some datasets (Spam and Tornado) we use only our best approach (TERN and CW), due to the computational cost. Our previous work (Nanni et al., 2010) was based on UNIF and RR; it is clear from the table that this combination works better than TERN + CW in the Wdbc dataset only. Moreover, on average (considering all three methods for rearranging a vector into a matrix), TERN works slightly better than LPQ (an average error AUC of 8.7 for TERN versus 8.9 for LPQ). We have chosen CW instead of RR (they have similar performance) since CW allows the dimension of the texture created from a given feature vector to be chosen (by setting the number of decomposition scales); note that the number of decomposition scales of CW is set to 100 in all the reported tests. The most important conclusion is that the best approach is FU. Notice that for our texture-based approaches only the linear SVM is used, while in SV the best SVM (linear or radial basis function) is chosen and its parameters are optimized for that dataset (in each run of the experiments, only the training data is used for finding the parameters).

In Tables 4 and 5 we report the performance on the peptide and protein datasets. In these tests, again due to the computational cost, we only use our best approach (TERN and CW). In this case, using only the linear SVM does not obtain excellent performance; for our texture-based approach, the radial basis function (RBF) SVM is used (in each run of the experiments, only the training data is used for finding the parameters). In Table 4 we also compare our approach with the best texture-based approach for peptide classification proposed in Nanni, Brahnam, and Lumini (2011). In both datasets our proposed method based on the RBF SVM outperforms the best texture-based approach reported so far. In Table 5 we compare our approach with the best texture-based approach for protein classification (DW + DL) and with the best approach based on the amino-acid sequence, the quasi residue couple (RC), tested in the survey (Nanni, Brahnam, & Lumini, 2010). Also in this test, our approach based on the RBF SVM works very well, outperforming, in the HEL dataset, the high-performing RC method based on the amino-acid sequence.

4. Conclusion

In this paper, we conducted experiments using different texture descriptors for extracting features from a matrix obtained by reshaping the original feature vector that describes a given pattern in a pattern recognition problem. We expanded our previous work in this area in the following ways:
- We investigated different methods for matrix representation in pattern classification and found that an approach based on the continuous Meyer wavelet worked best;
- We compared the novel LPQ and LTP for extracting the features and found that a variant of LPQ based on a ternary coding was the best approach;
- We tested our methods for representing the patterns as matrices on peptide and protein classification, obtaining in all the tested datasets a performance similar to or better than that obtained by the state-of-the-art.

Since each pixel of the texture that describes a pattern is obtained from the original features, using texture descriptors based on local neighborhoods we also investigated the correlation among the original features that belong to a given neighborhood. For studying the correlation among different sets of features, we extracted several images from each pattern by simply randomly sorting the features of the original pattern before the matrix creation. As future work, we will study the fusion of different texture descriptors for its potential to improve performance.

References

Beymer, D., & Poggio, T. (1996). Image representations for visual learning. Science, 272, 1905–1909.
Bock, J., & Gough, D. (2003). Whole-proteome interaction mining. Bioinformatics, 19, 125–135.
Bozic, I., Zhang, G., & Brusic, V. (2005). Predictive vaccinology: Optimization of predictions using support vector machine classifiers. In Intelligent Data Engineering and Automated Learning, LNCS 3578 (pp. 375–381).
Chen, S. C., Zhu, Y. L., Zhang, D. Q., & Yang, J. (2005). Feature extraction approaches based on matrix pattern: MatPCA and MatFLDA. Pattern Recognition Letters, 26, 1157–1167.
Duda, R. O., Hart, P. E., & Stork, D. (2000). Pattern classification (2nd ed.). Wiley.
Eustice, R., Pizarro, O., Singh, H., & Howland, J. (2002). UWIT: Underwater Image Toolbox for optical image processing and mosaicking in MATLAB. In Proceedings of the 2002 International Symposium on Underwater Technology (pp. 141–145). Tokyo, Japan.
Ho, T. K. (1998). The random subspace method for constructing decision forests. IEEE Transactions on Pattern Analysis and Machine Intelligence, 20(8), 832–844.
Jantzen, J., Norup, J., Dounias, G., & Bjerregaard, B. (2005). Pap-smear benchmark data for pattern classification. In Nature inspired Smart Information Systems (NiSIS), Albufeira, Portugal (pp. 1–9).
Junior, G. B., Cardoso de Paiva, A., Silva, A. C., & Muniz de Oliveira, A. C. (2009). Classification of breast tissues using Moran's index and Geary's coefficient as texture signatures and SVM. Computers in Biology and Medicine, 39(12), 1063–1072.
Kawashima, S., & Kanehisa, M. (2000). AAindex: Amino acid index database. Nucleic Acids Research, 28, 374.
Kim, C., & Choi, C.-H. (2007). A discriminant analysis using composite features for classification problems. Pattern Recognition, 40(11), 2958–2966.
Kontijevskis, A., Wikberg, J. E. S., & Komorowski, J. (2007). Computational proteomics analysis of HIV-1 protease interactome. Proteins: Structure, Function, and Bioinformatics, 1, 305–312.
Liu, J., & Chen, S. C. (2006). Non-iterative generalized low rank approximation of matrices. Pattern Recognition Letters, 27(9), 1002–1008.
Nanni, L., Brahnam, S., & Lumini, A. (2010). High performance set of PseAAC and sequence based descriptors for protein classification. Journal of Theoretical Biology, 266(1), 1–10.
Nanni, L., Brahnam, S., & Lumini, A. (2010). Texture descriptors for generic pattern classification problems. Expert Systems with Applications, in review.
Nanni, L., Brahnam, S., & Lumini, A. (2010a). A study for selecting the best performing rotation invariant patterns in local binary/ternary patterns. In Proceedings of the International Conference on Image Processing, Computer Vision, and Pattern Recognition (IPCV'10).
Nanni, L., Brahnam, S., & Lumini, A. (2011). Artificial intelligence systems based on texture descriptors for vaccine development. Amino Acids, 40(2), 443–451.
Nanni, L., & Lumini, A. (2008). A reliable method for cell phenotype image classification. Artificial Intelligence in Medicine, 43(2), 87–97.
Ojala, T., Pietikainen, M., & Maeenpaa, T. (2002). Multiresolution gray-scale and rotation invariant texture classification with local binary patterns. IEEE Transactions on Pattern Analysis and Machine Intelligence, 24(7), 971–987.
Qin, Z. C. (2006). ROC analysis for predictions made by probabilistic classifiers. In Proceedings of the Fourth International Conference on Machine Learning and Cybernetics (Vol. 5, pp. 3119–312).
Tan, X., & Triggs, B. (2007). Enhanced local texture feature sets for face recognition under difficult lighting conditions. In Analysis and Modelling of Faces and Gestures, LNCS 4778 (pp. 168–182). Springer.
Trafalis, T. B., Ince, H., & Richman, M. B. (2003). Tornado detection with support vector machines. In P. M. Sloot et al. (Eds.), Computational Science – ICCS (pp. 202–211). Springer.
Wang, Z., & Chen, S. C. (2008). Matrix-pattern-oriented least squares support vector classifier with AdaBoost. Pattern Recognition Letters, 29, 745–753.
Wang, Z., Chen, S. C., Liu, J., & Zhang, D. Q. (2008). Pattern representation in feature extraction and classification – Matrix versus vector. IEEE Transactions on Neural Networks, 19, 758–769.
Yang, J., Zhang, D., Frangi, A. F., & Yang, J. U. (2004). Two-dimensional PCA: A new approach to appearance-based face representation and recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence, 26(1), 131–137.