Pattern Recognition 36 (2003) 91 – 101
www.elsevier.com/locate/patcog
Off-line signature verification by the tracking of feature and stroke positions
B. Fang (a), C.H. Leung (b,∗), Y.Y. Tang (c), K.W. Tse (b), P.C.K. Kwok (d), Y.K. Wong (e)
(a) Singapore-MIT Alliance, The National University of Singapore, Singapore
(b) Department of Electrical and Electronic Engineering, The University of Hong Kong, Pokfulam Road, Pokfulam, Hong Kong
(c) Department of Computer Science, Hong Kong Baptist University, Hong Kong
(d) School of Science and Technology, Open University of Hong Kong, Hong Kong
(e) Department of Electrical Engineering, The Hong Kong Polytechnic University, Hong Kong
Received 15 March 2001; accepted 28 February 2002
Abstract

There are inevitable variations in the signature patterns written by the same person. The variations can occur in the shape or in the relative positions of the characteristic features. In this paper, two methods are proposed to track the variations. Given the set of training signature samples, the first method measures the positional variations of the one-dimensional projection profiles of the signature patterns, and the second method determines the variations in relative stroke positions in the two-dimensional signature patterns. The statistics on these variations are determined from the training set. Given a signature to be verified, the positional displacements are determined and the authenticity is decided based on the statistics of the training samples. For the purpose of comparison, two existing methods proposed by other researchers were implemented and tested on the same database. Furthermore, two volunteers were recruited to perform the same verification task. Results show that the proposed system compares favorably with other methods and outperforms the volunteers. © 2002 Pattern Recognition Society. Published by Elsevier Science Ltd. All rights reserved.

Keywords: Signature verification; Feature tracking; Elastic matching; Off-line system; Handwriting recognition
∗ Corresponding author. Tel.: +852-2859-7097; fax: +852-2559-8738. E-mail address: [email protected] (C.H. Leung).

1. Introduction

A lot of work has been done in the field of automatic off-line signature verification [1–11]. While a large portion of the work is focused on random forgery detection, more efforts are still needed to address the problem of skilled forgery detection. Signature patterns drawn by humans are subject to the fundamental characteristics of handwriting: intra-personal variation and inter-personal difference. No two signatures by the same person are identical on a detailed scale. In the case of detecting skilled forgeries that are very similar to the authentic signatures on a global scale, such as aspect ratio and overall orientation, the extraction and comparison of local features are indispensable.

In the 1980s, Ammar [6] used the statistics of high grey-level pixels to identify pseudo-dynamical characteristics of signatures. Qi et al. [4] used local grid features and global geometric features to build multi-scale verification functions. The local measurement is based on a non-uniform grid. The feature vector at each grid position includes the boundary code and the total number of pixels in the grid cell. Sabourin et al. [7] used an extended shadow code as a feature vector to incorporate both local and global information into the verification decision. The extended shadow code allows general features of the signature to be extracted at a low resolution and other features from a specific area of the signature to be extracted at a high resolution. In another report [3], Sabourin used a 'pattern spectrum', which is derived from a successive application of morphological
0031-3203/02/$22.00 © 2002 Pattern Recognition Society. Published by Elsevier Science Ltd. All rights reserved. PII: S0031-3203(02)00061-4
B. Fang et al. / Pattern Recognition 36 (2003) 91 – 101
Fig. 1. (a) Samples of authentic signatures, (b) samples of forgeries.
operators, as a local shape factor. Guo et al. [5] used the local correspondence of stroke segments. A signature is segmented based on the edge information and local features are computed for each stroke segment. The quality of the correspondence is determined by matching the test signature with a model, and the result is used for detecting forgeries. A priori on-line knowledge is required for building the model, which includes a tracing of the reference signature. Justino et al. [8] used an HMM approach to model the intrapersonal and interpersonal variations in signature patterns. Mizukami et al. [9] proposed a method to extract an optimum displacement function between a pair of signatures. From this function, a dissimilarity measure is then derived and used for verification. Besides these, neural network approaches have also been used for signature verification [10,11].

The two-dimensional signature pattern can be conveniently converted into one-dimensional waveforms by taking the horizontal and vertical projections. Although some information is lost, the projections still contain a lot of useful global and local information for verification [2]. Besides directly using the projections as features, other features can also be extracted, for example, the horizontal or vertical center of gravity, peak values of the projections and the baseline of the signature.

Although there are inevitable variations in the signature patterns drawn by the same person, if the statistics of these variations can be extracted from the training samples, it is possible to compute a measure of the degree to which an input signature lies within the range of variations of the authentic signatures. In this paper, the idea of tracking the positional variations of the strokes of signature patterns for off-line signature verification is proposed and tested. Two methods are tried. The first method tracks the positional variations of the projection profiles of the signature patterns, while the second tracks the actual positional variations of individual strokes in the two-dimensional signature patterns. In both methods, the statistics on these variations are computed. Given an input test signature, the positional displacements are computed. Based on the statistics of the training set, the authenticity of the input signature is determined. The decision process involves the computation of a distance measure which takes the positional variations and the correlation between them into account. In order to obtain a better estimation of the covariance matrix, a matrix estimation technique is also employed.

The effectiveness of the proposed method is evaluated on a database of 1320 authentic signatures from 55 authors and 1320 forgeries from 12 forgers. Some samples are shown in Fig. 1. In order to compare the performance with other methods reported in the literature, the methods of global shape features [2] and extended shadow code [7] are implemented and tested on the same database. Furthermore, to check the practical usefulness of the proposed system, the performance was compared with that of two volunteers who performed the same verification task. Details of the method are discussed in the following sections.
2. The proposed methods

As explained in the introduction, there are inevitable variations in the signature patterns drawn by the same person. The present proposal is to track the positional variations of the features or strokes of the signature patterns and build up the statistics of these variations from the training set. Given an input test signature, it is then possible to compute a measure of the degree to which it lies within the range of variations of the authentic signatures. Two methods are proposed. The first method tracks the positional variations of the projection profiles of the signatures, and the second tracks the actual positional variations of individual strokes of the two-dimensional signature patterns.
Fig. 2. (a) The reference signature and its projection along the vertical direction, (b) the input signature and its vertical projection, (c) the superposition of the two projections before optimal matching and (d) the superposition of the two projections after optimal matching.
2.1. Method 1

The one-dimensional projection profiles of the signature patterns are optimally matched using dynamic warping. The positional variations are then derived from the resulting warping function. Fig. 2 shows the projections before and after warping, and Fig. 3 shows the resulting warping function. Let the two projections to be matched be called the reference projection and the input projection, and let them be denoted by $P_{rf}(i)$, $i = 1, \ldots, L_1$, and $P_{in}(j)$, $j = 1, \ldots, L_2$. The warping function $w(i)$, $i = 1, \ldots, L_1$, to be found is defined as the function which minimizes the overall distortion $D$:

$$D = \min_{w(\cdot)} \sum_{i=1}^{L_1} \left[ P_{rf}(i) - P_{in}(w(i)) \right]^2. \tag{1}$$
The warping function $w(i)$, $i = 1, \ldots, L_1$, gives the positional distortion of the input projection relative to the reference projection. With $P_{rf}(i)$ optimally matched to $P_{in}(w(i))$, point $i$ of the reference projection is shifted to point $w(i)$ in the input projection. Hence the positional distortion at point $i$ is equal to $[w(i) - i]$. The optimal matching problem given by Eq. (1) can be efficiently solved by using dynamic programming. Optimal matching by dynamic programming is a well-known method and has been applied to various problems, such as speech recognition. A description of the method can be found in Ref. [12].
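The dynamic-programming solution of Eq. (1) can be illustrated with a minimal sketch. The paper does not state the path constraints, so the ones used here are assumptions: indices are 0-based, the warp starts at $w(0) = 0$, the end point is left free, and $w$ may advance by at most `max_step` input positions per reference position. The function name `warp` and the parameter `max_step` are hypothetical.

```python
def warp(ref, inp, max_step=2):
    """Return (D, w) for Eq. (1): the minimal distortion D and a
    non-decreasing warping function w mapping ref index i to inp index w[i].

    Assumed constraints (not specified in the paper): w[0] = 0, free end
    point, and 0 <= w[i] - w[i-1] <= max_step.
    """
    inf = float("inf")
    l1, l2 = len(ref), len(inp)
    cost = [[inf] * l2 for _ in range(l1)]   # cost[i][j]: best sum with w[i] = j
    back = [[-1] * l2 for _ in range(l1)]    # back[i][j]: w[i-1] on that best path
    cost[0][0] = (ref[0] - inp[0]) ** 2
    for i in range(1, l1):
        for j in range(l2):
            best, arg = inf, -1
            for step in range(max_step + 1):
                p = j - step
                if p >= 0 and cost[i - 1][p] < best:
                    best, arg = cost[i - 1][p], p
            if arg >= 0:
                cost[i][j] = best + (ref[i] - inp[j]) ** 2
                back[i][j] = arg
    # Free end point: take the cheapest final position, then backtrack.
    j = min(range(l2), key=lambda c: cost[l1 - 1][c])
    total = cost[l1 - 1][j]
    w = [0] * l1
    for i in range(l1 - 1, -1, -1):
        w[i] = j
        j = back[i][j]
    return total, w


def positional_distortion(w):
    """Positional distortion [w(i) - i] at every reference position."""
    return [wi - i for i, wi in enumerate(w)]
```

For a reference profile with a peak at position 2 and an input profile with the same peak at position 3, the distortion at the peak is $w(2) - 2 = 1$, which is the quantity the method tracks.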
Non-linear dynamic time warping has been used for on-line signature verification [13–15]. There are differences between these previous methods and the present proposal. One minor difference is that the previous methods apply warping to on-line signature patterns while the present proposal is for off-line signatures. The major difference is that in the previous methods, the on-line test signature and the template signature to be matched are optimally aligned by warping the temporal axis, and then the distance $D$ between them as given in Eq. (1) is used for deciding the authenticity of the test signature. The temporal displacement $[w(i) - i]$ at time $i$ is not used in the decision. On the other hand, in the present method, the positional distortion $[w(i) - i]$ at each point $i$ of the projection of the off-line signature is incorporated into a distance measure, while the distance $D$ in Eq. (1) is ignored. The rationale for this new distance measure is that, for off-line signature patterns, the statistics of $[w(i) - i]$ are exactly the statistics on the positional variations of the signature strokes. Hence they should be useful for deciding whether a test signature is authentic or not. If the positional distortions of the test signature are within a predefined range derived from the statistics, the test signature could be accepted as authentic. Otherwise, it could be rejected. Moreover, the proposed overall distance is not a simple Euclidean distance; the Mahalanobis distance is used instead. A matrix estimation procedure is employed to give a better estimate of the covariance matrix, and the resulting distance thus takes the correlation between the positional distortions at different locations into account.

Fig. 3. Illustration of optimal matching between the input projection (on the vertical axis) and the reference projection (on the horizontal axis). The warping function maps the reference projection at position i to the input projection at position w(i). As an example, the reference peak at A is matched to the input peak at B. The displacement between these two peaks can be determined from w(i) as shown.

2.1.1. Training and verification protocol

During the training phase, the statistics on the positional variations are built up from the set of training samples. For each writer, a reference signature is first selected from the set of training samples written by that writer. The selection of the reference sample is not crucial because it is the statistics of the positional variations of all training signatures relative to the reference that are computed for building up the training model. Thus the signature with the average length, say $L_1$, is selected as the reference. The projection of this reference signature is matched with that of all the other training signatures. Hence if there are $N$ training signatures from a writer, $N - 1$ warping functions $\{w_k(i),\ i = 1, \ldots, L_1 \mid k = 1, \ldots, N - 1\}$ are obtained. The positional variations at point $i$ of the reference projection are $\{[w_k(i) - i],\ i = 1, \ldots, L_1 \mid k = 1, \ldots, N - 1\}$. These values are assembled into $N - 1$ positional variation column vectors $V_k$, where

$$V_k = [w_k(1) - 1,\ w_k(2) - 2,\ \ldots,\ w_k(L_1) - L_1]^T, \quad k = 1, \ldots, N - 1, \tag{2}$$

where $T$ stands for transpose. For each writer, the statistics of the positional variations can be represented by the mean $\mu$ and covariance matrix $\Sigma$ of these vectors, where

$$\mu = \frac{1}{N-1} \sum_{k=1}^{N-1} V_k, \tag{3}$$

$$\Sigma = \frac{1}{N-1} \sum_{k=1}^{N-1} (V_k - \mu)(V_k - \mu)^T. \tag{4}$$

During the test phase, the projection of the signature to be verified is optimally matched with that of the reference signature from the training set, and a warping function $w'(i)$, $i = 1, \ldots, L_1$, is obtained. The components of the positional variation vector $F$ are computed according to

$$F(i) = w'(i) - i, \quad i = 1, \ldots, L_1. \tag{5}$$

The Mahalanobis distance $d(F; \mu, \Sigma)$ is used to measure the dissimilarity between the input test signature and the set of training signatures according to

$$d = (F - \mu)^T \Sigma^{-1} (F - \mu). \tag{6}$$
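As an illustration of Eqs. (3), (4) and (6), the following minimal sketch computes the training statistics and the dissimilarity in the simplified case where only the variances are kept and the off-diagonal covariances are ignored, which is the variant used in the experiments of Section 4.1. The function names and the variance floor `eps`, which guards against division by zero, are assumptions made for this example.

```python
def train_statistics(vectors):
    """Mean and per-dimension variance of the displacement vectors V_k.

    `vectors` holds the N - 1 vectors of Eq. (2), so dividing by
    len(vectors) matches the 1 / (N - 1) factor of Eqs. (3) and (4).
    """
    n = len(vectors)
    dim = len(vectors[0])
    mean = [sum(v[d] for v in vectors) / n for d in range(dim)]
    var = [sum((v[d] - mean[d]) ** 2 for v in vectors) / n for d in range(dim)]
    return mean, var


def diagonal_mahalanobis(f, mean, var, eps=1e-6):
    """Eq. (6) with a diagonal covariance matrix (variances only).

    eps is a hypothetical floor for dimensions with zero variance.
    """
    return sum((x - m) ** 2 / max(s, eps) for x, m, s in zip(f, mean, var))
```

With a full covariance matrix, the same distance requires the inverse of $\Sigma$, which is where the estimation issue discussed in Section 3 arises.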
The input signature is accepted as authentic if the dissimilarity is less than a given threshold; otherwise, it is considered as a forgery. The threshold is selected to minimize the average error rate of the whole system. Two types of errors are defined: the Type I error rate, or false rejection rate (FRR), and the Type II error rate, or false acceptance rate (FAR). The average error rate is defined to be the average of FRR and FAR. Due to the limited number of training samples, the leave-one-out method was adopted to maximize the use of the available samples. A problem arises when the number of training samples is small compared with the feature vector dimension. The estimated covariance matrix as given by Eq. (4) is not reliable, and may even be singular so that the distance measure in Eq. (6) cannot be computed. A matrix estimation method [16,17] is employed to improve the performance of the system. The details will be discussed in Section 3.

2.2. Method 2

Instead of matching the projection profiles of the template and input test signatures, the individual stroke segments of the two-dimensional test signature are directly matched with those of the template signature using a two-dimensional elastic matching algorithm developed by the second author [18]. The positional variations of each stroke segment are then determined from the matching result. Let the two signature patterns to be matched be referred to as the 'template' and 'input' patterns. The two patterns are binarized and thinned. The strokes in the skeleton patterns are then approximated by fitting a set of short straight lines of approximately equal length, as illustrated in Figs. 4 and 5. Each of these short straight lines is referred to as an 'element'. Each element is represented by its slope and the position vector of its midpoint. Each signature pattern is in turn represented by its set of elements. Hence the matching
Fig. 4. (a) Image of the template signature, (b) skeleton of the signature and (c) result of approximating the skeleton by short straight lines of approximately equal length.
Fig. 5. (a) Image of the input signature, (b) skeleton of the signature and (c) result of approximating the skeleton by short straight lines of approximately equal length.
problem becomes that of matching two sets of elements. The number of elements in the two patterns need not be equal. The template and input patterns are elastically deformed in order to match with each other. The process consists of putting one pattern on top of the other, and the set of template elements and input elements are iteratively moved until the corresponding elements meet, as illustrated in Fig. 6. The objective is to achieve maximum similarity between the resulting patterns while minimizing the deformation. This is achieved through the minimization of a cost (or energy) function. The energy function $E_1$ used to guide the movements of the template elements towards the input elements is defined as follows [18]:

$$E_1 = -\alpha K_1^2 \sum_{i=1}^{N_I} \ln \sum_{j=1}^{N_T} \exp\!\left(-\frac{|T_j - I_i|^2}{2K_1^2}\right) f(\theta_{T_j, I_i}) + \beta \sum_{j=1}^{N_T} \sum_{k=1}^{N_T} w_{jk} \left(d_{T_j, T_k} - d^0_{T_j, T_k}\right)^2, \tag{7}$$

where $N_T$ is the number of template elements, $N_I$ the number of input elements, $T_j$ the position vector of the midpoint of the $j$th template element, $\theta_{T_j}$ the direction of the $j$th template element, $I_i$ the position vector of the midpoint of the $i$th input element, $\theta_{I_i}$ the direction of the $i$th input element, $\theta_{T_j, I_i}$ the angle between template element $T_j$ and input element $I_i$, restricted within 0°–90°, $f(\theta_{T_j, I_i}) = \max(\cos\theta_{T_j, I_i},\ 0.1)$, $d_{T_j, T_k}$ the current value of $|T_j - T_k|$, $d^0_{T_j, T_k}$ the initial value of $|T_j - T_k|$,

$$w_{jk} = \frac{\exp(-|T_j - T_k|^2 / 2K_2^2)}{\sum_{n=1}^{N_T} \exp(-|T_j - T_n|^2 / 2K_2^2)},$$

$K_1$ and $K_2$ are the size parameters of the Gaussian windows which establish neighborhoods of influence, and are decreased monotonically in successive iterations, and $\alpha$ and $\beta$ are the coefficients used to weigh the importance of the two terms. The first term of the energy function is a measure of the overall distance between elements of the two patterns.
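To make the two terms of the energy function concrete, the following minimal sketch evaluates $E_1$ for small sets of elements. It is an illustration under stated assumptions, not the implementation of Ref. [18]: each element is taken to be an `(x, y, theta)` tuple, the weighting coefficients are passed as `alpha` and `beta` (names chosen for this example), and `template0` holds the initial, undeformed template positions.

```python
import math


def fold_angle(a, b):
    """Angle between two undirected line directions, folded into [0, pi/2]."""
    d = abs(a - b) % math.pi
    return min(d, math.pi - d)


def sq_dist(p, q):
    """Squared distance between the midpoints of two elements."""
    return (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2


def energy_e1(template, template0, inputs, k1, k2, alpha=1.0, beta=1.0):
    """Evaluate the two-term energy for elements given as (x, y, theta)."""
    # First term: each input element is attracted by nearby template
    # elements of similar slope (Gaussian window of size k1).
    match = 0.0
    for inp in inputs:
        s = sum(
            math.exp(-sq_dist(t, inp) / (2 * k1 * k1))
            * max(math.cos(fold_angle(t[2], inp[2])), 0.1)
            for t in template
        )
        match += math.log(s)
    first = -alpha * k1 * k1 * match

    # Second term: weighted change in the distances between template
    # elements, relative to the undeformed template (window of size k2).
    second = 0.0
    for j, tj in enumerate(template):
        gauss = [math.exp(-sq_dist(tj, tn) / (2 * k2 * k2)) for tn in template]
        norm = sum(gauss)
        for k, tk in enumerate(template):
            d_now = math.sqrt(sq_dist(tj, tk))
            d_init = math.sqrt(sq_dist(template0[j], template0[k]))
            second += (gauss[k] / norm) * (d_now - d_init) ** 2
    return first + beta * second
```

For a single template element at distance $r$ from a single input element of the same slope, the first term reduces to $\alpha r^2 / 2$, and a rigid translation of the whole template leaves the second term at zero.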
For each element $I_i$ of the input pattern, the summation $\ln \sum_{j=1}^{N_T} \exp(-|T_j - I_i|^2 / 2K_1^2)\, f(\theta_{T_j, I_i})$ is dominated by the contribution from the nearest template element $T_j$ with a similar slope. The value of the factor $f(\theta_{T_j, I_i})$ is large for similar slopes and small for slopes perpendicular to each other. As the size $K_1$ of the Gaussian window decreases monotonically in successive iterations, in order for the energy $E_1$ to attain a minimum, each $I_i$ should have at least one $T_j$ attracted to it.

The second term is a weighted sum of all relative displacements between each template element and its neighbors within the Gaussian weighted neighborhood of size parameter $K_2$. Minimization of this term minimizes the structural distortion of the template pattern while each element is being moved. Each template element normally does not move towards its nearest input element but tends to follow the weighted mean movement of its neighbors in order to minimize the distortions within the neighborhood. $K_2$ is initially large so that the large-scale distortions are kept small and the template elements move collectively to align with the input pattern in a coarse or global manner. As $K_2$ is gradually decreased in successive iterations, finer and finer details of the two patterns are aligned.

$E_1$ is minimized by a gradient descent procedure. The movement $\Delta T_j$ applied to $T_j$ is equal to $-\partial E_1 / \partial T_j$ and is given by

$$\Delta T_j = \alpha \sum_{i=1}^{N_I} u_{ij} (I_i - T_j) + 2\beta \sum_{m=1}^{N_T} (w_{mj} + w_{jm}) \left[ (T_m - T_m^0) - (T_j - T_j^0) \right], \tag{8}$$

where

$$u_{ij} = \frac{\exp(-|I_i - T_j|^2 / 2K_1^2)\, f(\theta_{I_i, T_j})}{\sum_{n=1}^{N_T} \exp(-|I_i - T_n|^2 / 2K_1^2)\, f(\theta_{I_i, T_n})}$$

and $T_j^0$ is the initial value of $T_j$.

The above procedure moves the template elements towards the input elements. The minimization of energy $E_1$ tends to find a matching template element for each input element. However, it does not take an active role in finding a matching input element for each template element. To correct this shortcoming, the roles of the template and input patterns are swapped during each iteration, so that there are two passes in each iteration. In the first pass, the template elements are attracted to the input elements, and in the second pass, the input elements are attracted to the template elements. A second energy function $E_2$, similar to $E_1$, is defined for guiding the movement of input elements towards template elements.

After all the iterations, the template and input elements will have moved towards each other. Most of the corresponding elements are expected to overlap or at least to have moved closer to each other, as illustrated in Fig. 6. Based on the final positions of the elements, for each template element $T_j$, the nearest input element $I_{i(j)}$ is identified. Hence the positional displacement vector $D_{T_j}$ of $T_j$ can be obtained by referring to the original undistorted template and input patterns and computing the vector $(T_j - I_{i(j)})$. This is performed for each $T_j$, $j = 1, \ldots, N_T$. An overall positional displacement vector $G$ for the pair of signature patterns can be obtained
Fig. 6. (a) Overlapped images of the template pattern (thick lines) and input pattern (thin lines) before matching and (b) overlapped images of the patterns after matching.
by concatenating the displacement vectors of the individual template elements:

$$G = \left(D_{T_1}^T,\ D_{T_2}^T,\ \ldots,\ D_{T_{N_T}}^T\right)^T, \tag{9}$$
where the superscript $T$ stands for transpose.

2.2.1. Training and verification protocol

The training procedures are essentially the same as in Method 1. Let there be $N$ training signatures from a single writer. A signature with the average length is taken as the reference, and the remaining $N - 1$ signature samples are elastically matched with the reference. Let $G_k$ be the displacement vector obtained when the $k$th sample is matched with the reference, $k = 1, \ldots, N - 1$. Hence for this writer, the statistics of the positional variations can be represented by the mean $\bar{G}$ and covariance matrix $C$ of these vectors:

$$\bar{G} = \frac{1}{N-1} \sum_{k=1}^{N-1} G_k, \tag{10}$$

$$C = \frac{1}{N-1} \sum_{k=1}^{N-1} (G_k - \bar{G})(G_k - \bar{G})^T. \tag{11}$$
During the test phase, the test signature to be verified is elastically matched with the reference signature from the training set, and the displacement vector $H$ is obtained. The dissimilarity between the input test signature and the set of training signatures is computed according to

$$d = (H - \bar{G})^T C^{-1} (H - \bar{G}). \tag{12}$$
The input signature is accepted as authentic if the dissimilarity is less than a given threshold; otherwise, it is considered as a forgery.
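The acceptance rule and the threshold selection can be sketched as follows. This is a minimal illustration assuming that dissimilarity values have already been computed for a set of genuine and a set of forged test signatures; the function names and the choice of candidate thresholds (the observed values plus one value above them all) are assumptions made for this example.

```python
def error_rates(genuine, forged, threshold):
    """FRR, FAR and their average when a signature is accepted if d < threshold."""
    frr = sum(d >= threshold for d in genuine) / len(genuine)  # genuine rejected
    far = sum(d < threshold for d in forged) / len(forged)     # forgeries accepted
    return frr, far, (frr + far) / 2


def best_threshold(genuine, forged):
    """Threshold that minimizes the average of FRR and FAR."""
    candidates = sorted(set(genuine) | set(forged))
    candidates.append(candidates[-1] + 1.0)  # a threshold that accepts everything
    return min(candidates, key=lambda t: error_rates(genuine, forged, t)[2])
```

For example, with genuine dissimilarities `[1.0, 2.0, 3.0, 6.0]` and forgery dissimilarities `[4.0, 5.0, 7.0, 8.0]`, the selected threshold is 4.0: one genuine signature in four is rejected and no forgery is accepted.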
3. Performance improvement by matrix estimation techniques

The Mahalanobis distance measure is adopted in Eqs. (6) and (12) instead of a simple Euclidean distance. The rationale is that the positional displacements are not independent and the information on the correlation between them may be useful for the purpose of verification. However, this requires the covariance matrix to be estimated. The maximum likelihood estimation as given by Eqs. (4) and (11) is often used. Unfortunately, when the number of training samples is small, the estimation is unreliable. Furthermore, when the number of training samples is less than or equal to the feature vector dimension, the estimated covariance matrix is singular and its inverse as required in Eqs. (6) and (12) does not exist.

For signature verification in practice, it is very difficult to collect a large number of authentic signature samples from the same author. One reason is the confidential nature of personal signatures. Another reason is that handwritten signature patterns vary from time to time. To obtain a realistic estimation of the variations, the signatures should be collected in a number of sessions scheduled over a long period. However, it is very difficult to arrange such regular and numerous appointments with the volunteers. Hence the number of training samples from a writer is often inadequate for deriving a reliable estimate of the covariance matrix.

Some methods for alleviating the problem have been proposed [16,17]. The leave-one-out covariance (LOOC) method is adopted in this study [17]. For a pattern recognition problem with $N_C$ classes, instead of simply adopting the maximum likelihood estimate of the covariance matrix as given by Eqs. (4) and (11), the estimated covariance matrix $\hat{\Sigma}_i$ for class $i$ is a mixture of matrices given by

$$\hat{\Sigma}_i(\alpha_i) = \alpha_{i1}\,\mathrm{diag}(\Sigma_i) + \alpha_{i2}\,\Sigma_i + \alpha_{i3}\,S + \alpha_{i4}\,\mathrm{diag}(S), \tag{13}$$

where $\Sigma_i$ is the sample covariance matrix for class $i$, $\mathrm{diag}(\Sigma_i)$ is the matrix obtained from $\Sigma_i$ by setting the off-diagonal elements to zero, $\alpha_i = [\alpha_{i1}, \alpha_{i2}, \alpha_{i3}, \alpha_{i4}]^T$ is the set of mixing parameters and $S$ is the common covariance matrix defined as the average sample covariance matrix:

$$S = \frac{1}{N_C} \sum_{i=1}^{N_C} \Sigma_i. \tag{14}$$

The mixing parameter vector $\alpha_i$ is selected so that the corresponding mixture of covariance matrices maximizes the average leave-one-out log likelihood $\mathrm{LOOC}_i$ of each class $i$ for the training samples:

$$\mathrm{LOOC}_i = \frac{1}{N_i} \sum_{k=1}^{N_i} \ln\!\left[ f\!\left(x_{i,k} \mid m_{i \setminus k},\ \hat{\Sigma}_{i \setminus k}(\alpha_i)\right) \right], \tag{15}$$
where the notation $i \setminus k$ indicates that the quantity is computed without using sample $k$ from class $i$, $N_i$ is the number of training samples for class $i$ and $f(x \mid m, \Sigma)$ is the likelihood of the occurrence of vector $x$ given the mean vector $m$ and covariance matrix $\Sigma$. Once the optimal value of $\alpha_i$ has been estimated, the estimated covariance matrix is computed from Eq. (13). Details of the method can be found in Ref. [17]. In the present study, since the dimension of the displacement vector for one class (the set of training signature samples written by one author) is different from that of another class (samples written by another author), the covariance matrices for different classes have different dimensions. Hence there is no common covariance matrix, and the parameters $\alpha_{i3}$ and $\alpha_{i4}$ in Eq. (13) are set to zero for each class $i$.
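With $\alpha_{i3} = \alpha_{i4} = 0$, Eq. (13) reduces to a mixture of the sample covariance matrix and its diagonal. The following minimal sketch shows only this mixing step and why it helps when the sample covariance is singular; the selection of the mixing parameters by the leave-one-out likelihood of Eq. (15) is not shown. The function names and the example parameter values are assumptions made for illustration.

```python
def sample_covariance(vectors):
    """Sample covariance of the displacement vectors, normalized as in Eq. (4)."""
    n = len(vectors)
    dim = len(vectors[0])
    mean = [sum(v[d] for v in vectors) / n for d in range(dim)]
    return [
        [sum((v[r] - mean[r]) * (v[c] - mean[c]) for v in vectors) / n
         for c in range(dim)]
        for r in range(dim)
    ]


def mix_covariance(cov, a1, a2):
    """Eq. (13) without the common-covariance terms: a1 * diag(cov) + a2 * cov."""
    dim = len(cov)
    return [
        [(a1 + a2) * cov[r][c] if r == c else a2 * cov[r][c] for c in range(dim)]
        for r in range(dim)
    ]


def det2(m):
    """Determinant of a 2 x 2 matrix, used here only to check singularity."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]
```

For the two vectors `[0, 0]` and `[2, 2]`, the sample covariance is singular (all entries equal to 1), while the mixture with `a1 = a2 = 0.5` has determinant 0.75 and can be inverted.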
4. Experimental results

The proposed methods were evaluated on a database of 1320 authentic signatures from 55 authors and 1320 forgeries from 12 forgers. In the determination of the false rejection rate, due to the limited number of training samples, the leave-one-out method was adopted to maximize the use of the available authentic samples. In this method, the training and testing sessions were repeatedly performed. In the first round of training and testing, of the 24 authentic signature samples from each writer (55 writers in total), 23 were used for training and the remaining one was used for testing (i.e., the training data set and test data set were disjoint). In the second round of training and testing, the training and test sets for each writer also contained 23 samples and 1 sample respectively, but the test sample was different from the one used in the previous round. Altogether 24 rounds were performed. Hence the training and test sets were always disjoint and at the same time the utilization of the available samples for training and testing could be maximized. At the end of the 24 rounds of training and testing, the false rejection rate was determined.

4.1. Results of proposed method 1

In this method, the projections of the signature patterns were used for verification. Two projections of a signature pattern were available. The vertical projection was obtained when the signature pattern was projected vertically onto the horizontal axis, as illustrated in Fig. 2. The horizontal projection was obtained by projecting the pattern along the horizontal direction onto the vertical axis. The matrix estimation method discussed in Section 3 was not implemented in the experiments reported in this section (the method will be incorporated in the experiments in Section 4.2). The covariance matrix was computed according to Eq. (4). Because the off-diagonal elements were unreliable, they were set to zero. Hence only the variances of each vector dimension were used, and the covariances between different dimensions were ignored.

When the positional displacements in the vertical projections were used for verification, the false rejection rate (FRR) and false acceptance rate (FAR) were 22.1% and 23.5% respectively. The average error rate (average of FRR and FAR) was 22.8%. When the horizontal projections were used, the average error rate was 28.4%. It seems that the projection along the vertical direction preserves more discriminatory information of the 2-D signature image than the projection along the horizontal direction. When both the vertical and horizontal projections were used, the error rate (22.3%) was only slightly lower than that of using the vertical projection alone. The results are given in Table 1.

Table 1. Error rates obtained when the positional displacements at all positions of the projection of the binary signature image were used for verification

| Projection of binary signature image | False rejection rate (%) | False acceptance rate (%) | Average error rate (%) |
|---|---|---|---|
| Vertical projection | 22.1 | 23.5 | 22.8 |
| Horizontal projection | 29.7 | 27.1 | 28.4 |
| Both projections | 23.2 | 21.4 | 22.3 |

A simplified implementation was also tried. For the reference projection $\{P_{rf}(i),\ i = 1, \ldots, L_1\}$, instead of using the statistics of positional variations at all $L_1$ positions (which amounted to a few hundred points), only the statistics at a subset of the positions were used. This subset of positions corresponded to the positions of the local peaks in the reference projection. Since the local peaks of the projections were significant features, their positional variations should be useful features. Moreover, using only the positions of the local peaks could significantly reduce the vector dimension, resulting in more efficient computations and also opening the possibility of estimating and exploiting the correlation between the vector components. To facilitate the identification of local peaks and to discard the small noisy peaks, a small amount of smoothing by local averaging within a moving window of width five pixels was performed on the projection waveform before the optimal warping was done. Table 2 shows the results.
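The projection profiles and the peak-based simplification can be illustrated with a minimal sketch. It assumes a binary image stored as a list of rows with 1 for ink and 0 for background; the function names, the handling of the window at the ends of the profile, and the definition of a local peak as a sample higher than its left neighbor and not lower than its right neighbor are assumptions made for this example.

```python
def vertical_projection(image):
    """Project the pattern vertically onto the horizontal axis (column sums)."""
    return [sum(column) for column in zip(*image)]


def horizontal_projection(image):
    """Project the pattern horizontally onto the vertical axis (row sums)."""
    return [sum(row) for row in image]


def smooth(profile, width=5):
    """Local averaging within a moving window; the window shrinks at the ends."""
    half = width // 2
    out = []
    for i in range(len(profile)):
        window = profile[max(0, i - half): i + half + 1]
        out.append(sum(window) / len(window))
    return out


def local_peaks(profile):
    """Indices that rise above the left neighbor and do not fall below the right."""
    return [
        i for i in range(1, len(profile) - 1)
        if profile[i] > profile[i - 1] and profile[i] >= profile[i + 1]
    ]
```

The positional variation vector of the simplified implementation would then keep only the entries of $[w(i) - i]$ at the indices returned by `local_peaks` for the smoothed reference projection.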
The resulting average error rate was 24.5% for vertical projection matching, which represents an increase of 1.7% compared with the case of using the whole vertical projection waveform. When both the horizontal and vertical projections were used, the error rate was 23.8%. In another experiment, instead of using binary signature images, the original grey-level signature images were used. The rationale was that the grey-level pattern contains some information on the pen pressure (higher pressures correspond to darker pixels) and possibly on the speed (higher speeds correspond to lighter and thinner strokes). Experimental results show an improvement of about 3.7% for
98
B. Fang et al. / Pattern Recognition 36 (2003) 91 – 101
Table 2 Error rates obtained when the positional displacements at local peak positions of the projection of the binary signature image were used for veri!cation Projection of binary signature image
False rejection rate (%)
False acceptance rate (%)
Average error rate (%)
Vertical projection Horizontal projection Both projections
23.7 32.3 23.3
25.3 31.5 24.3
24.5 31.9 23.8
Table 3 Error rates obtained when the positional displacements at all positions of the projection of the grey-level signature image were used for veri!cation Projection of grey-level signature image
False rejection rate (%)
False acceptance rate (%)
Average error rate (%)
Vertical projection Horizontal projection Both projections
18.6 26.2 19.1
19.5 25.4 19.3
19.1 25.8 19.2
Table 4 Error rates obtained when the positional displacements at local peak positions of the projections of the grey-level signature image were used for veri!cation Projection of grey-level signature image
False rejection rate (%)
False acceptance rate (%)
Average error rate (%)
Vertical projection Horizontal projection Both projections
20.7 27.7 20.1
21.3 28.3 20.5
21.0 28.0 20.3
vertical projection matching. When the whole vertical projection waveform was used, the average error rate was 19.1%. When only the positional variations at the local peaks of the vertical projection were used, the average error rate was 21.0%. When both the whole vertical and horizontal projections were used for matching, the average error rate was 19.2%. When only the local peaks of both projections were used for matching, the average error rate was 20.3%. The results are summarized in Tables 3 and 4.

4.2. Results of incorporating a matrix estimation technique

When the positional displacements at all positions of the projection were used for verification, the dimension of the displacement vector was much larger than the number of available training samples. Even the matrix estimation technique discussed in Section 3 was not useful for estimating the full covariance matrix. Hence the off-diagonal elements of the covariance matrix in Eq. (4) were set to zero and need not be computed. When only the displacements at the local peaks of the projections were used for matching, the dimension of the displacement vector was much reduced, and the matrix estimation technique became appropriate for estimating the full covariance matrix.

The experiments using local peaks as reported above were repeated. The LOOC matrix estimation procedures explained in Section 3 were carried out to estimate the covariance matrices [17]. The resulting covariance matrices in Eq. (13) were adopted to replace those in Eq. (4). When the local peaks of the vertical projections of the binary signature images were used for matching, the average error rate was reduced from 24.5% to 20.8%. When grey-level signature images were used instead of binary ones, the average error rate was reduced from 21.0% to 18.1%. This shows that the correlation information provided by the full estimated covariance matrix is useful for improving the performance of the system. The results are given in Table 5.

Table 5
Improvement in performance by adopting the full estimated covariance matrix. The error rates correspond to the cases of using the positional displacements of the local peaks of the vertical projection

                              Average error rate (%)
                              Without full covariance matrix   With full covariance matrix
Binary signature images       24.5                             20.8
Grey-level signature images   21.0                             18.1

4.3. Results of proposed method 2

In the two-dimensional elastic matching experiment, each signature pattern was first scaled such that its aspect ratio was preserved and the resulting pattern fitted into a 250×250 frame. Since most signatures had their lengths greater than their heights, these resulting patterns had lengths equal to 250 pixels and heights less than 250 pixels. The patterns were thinned to contain only binary lines and curves. Each line or curve was approximated by fitting a sequence of short straight lines ('elements') about 6 pixels long. This length of 6 pixels was chosen by experience according to the size of the image. If each element was too short, there would be too many elements and the computational complexity of matching would be very high.
On the other hand, if each element was too long, the resolution of the representation would be very low and the matching of short strokes would be affected.
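The preprocessing described above can be illustrated with a minimal sketch. It assumes that a thinned stroke is already available as an ordered list of (x, y) pixel coordinates; the function names `scale_to_frame` and `split_into_elements` are hypothetical, and cutting an element whenever the chord from its start point reaches 6 pixels is only one simple way to obtain elements "about 6 pixels long", not necessarily the fitting procedure used in the experiments.

```python
import math


def scale_to_frame(points, frame=250):
    """Uniformly scale points so the longer side of their bounding box equals `frame`.

    A single scale factor is applied to both axes, so the aspect ratio is preserved.
    """
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    min_x, min_y = min(xs), min(ys)
    extent = max(max(xs) - min_x, max(ys) - min_y)
    if extent == 0:
        raise ValueError("pattern has zero extent and cannot be scaled")
    s = frame / extent
    return [((x - min_x) * s, (y - min_y) * s) for x, y in points]


def split_into_elements(stroke, element_len=6.0):
    """Approximate an ordered stroke by straight elements of about `element_len` pixels.

    An element is closed as soon as the straight-line distance from its start
    point reaches `element_len`; a shorter final element covers any remainder.
    """
    elements = []
    start = stroke[0]
    for point in stroke[1:]:
        if math.dist(start, point) >= element_len:
            elements.append((start, point))
            start = point
    if start != stroke[-1]:
        elements.append((start, stroke[-1]))  # leftover tail shorter than element_len
    return elements
```

With this rule, a smaller `element_len` yields more elements per stroke (higher matching cost), while a larger value yields fewer, coarser elements, which mirrors the trade-off discussed in the text.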
Table 6
Error rates for the two-dimensional elastic matching method

Method                  False rejection rate (%)   False acceptance rate (%)   Average error rate (%)
2-D elastic matching    23.5                       23.3                        23.4
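For the two-dimensional method, the covariance matrices were used with their off-diagonal elements set to zero. The sketch below illustrates what a decision based on such a diagonal covariance matrix could look like, assuming a Mahalanobis-type distance between the displacement vector of a test signature and the training statistics. The exact distance measure of the paper is not reproduced here; the function names, the variance floor and the acceptance threshold are all hypothetical.

```python
from statistics import mean, variance


def diagonal_statistics(training_vectors):
    """Per-dimension mean and sample variance of the training displacement vectors.

    Requires at least two training vectors, all of the same length.
    """
    columns = list(zip(*training_vectors))
    means = [mean(col) for col in columns]
    variances = [variance(col) for col in columns]
    return means, variances


def diagonal_distance(vector, means, variances, floor=1e-6):
    """Squared distance with a diagonal covariance matrix.

    Each squared deviation is divided by the variance of its own dimension;
    `floor` (an assumption) guards against division by a zero variance.
    """
    return sum(
        (v - m) ** 2 / max(s2, floor)
        for v, m, s2 in zip(vector, means, variances)
    )


def is_accepted(vector, means, variances, threshold):
    """Accept the test signature if its distance does not exceed the threshold."""
    return diagonal_distance(vector, means, variances) <= threshold
```

Because every dimension is treated independently, this form needs only one variance per dimension and remains usable when the vector dimension far exceeds the number of training samples, at the cost of ignoring correlations between displacements.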
The two weighting factors in Eq. (7) were set to 0.2 and 0.1, respectively. They were kept constant throughout the matching process. As explained in Section 2.2, the neighborhood size parameters K1 and K2 should be large initially to allow for more global alignment of the two patterns being matched, and the parameters should be successively decreased to allow finer and finer alignment. This procedure was adopted to alleviate the local minimum problem associated with gradient descent procedures. In the experiments, both K1 and K2 were set to 20 pixels to start with. After each set of 10 iterations, K1 and K2 were updated as follows:

\[ K_1 := K_1 - \max(0.4,\ 0.15\,K_1), \tag{16} \]

\[ K_2 := K_2 - \max(0.4,\ 0.10\,K_2). \tag{17} \]
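The schedule defined by Eqs. (16) and (17) can be traced numerically. The sketch below assumes the settings stated in this section (both parameters start at 20 pixels, one update after each set of 10 iterations, and the process ends once K1 falls below 1.5 pixels); the function name `neighborhood_schedule` and its argument names are hypothetical.

```python
def neighborhood_schedule(k1=20.0, k2=20.0, iters_per_stage=10, stop_below=1.5):
    """Trace K1 and K2 under the update rules of Eqs. (16) and (17).

    Returns a list of (iterations_completed, K1, K2) tuples, starting with the
    initial values and adding one entry after each update.
    """
    history = [(0, k1, k2)]
    iterations = 0
    while k1 >= stop_below:
        iterations += iters_per_stage   # one set of matching iterations
        k1 -= max(0.4, 0.15 * k1)       # Eq. (16)
        k2 -= max(0.4, 0.10 * k2)       # Eq. (17)
        history.append((iterations, k1, k2))
    return history


if __name__ == "__main__":
    for iterations, k1, k2 in neighborhood_schedule():
        print(f"{iterations:4d}  K1={k1:6.3f}  K2={k2:6.3f}")
```

Under these assumptions the loop ends after 16 updates, that is, 160 iterations, with K1 at about 1.22 pixels and K2 at about 3.71 pixels. The fixed step of 0.4 takes over from the proportional step only for K1, once K1 drops below roughly 2.67 pixels.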
The process stopped after 160 iterations, when K1 had been reduced to below 1.5 pixels. K2 was scheduled to decrease at a lower rate than K1 because it was found that if they decreased at the same rate, the distortion would be quite large during the last few iterations. The slower decrease of K2 lets more neighboring elements influence each other, which keeps the local structure less deformed.

For this method, since the dimension of the positional displacement vectors is much larger than the number of training samples, the matrix estimation technique in Section 3 was not implemented. The covariance matrices were computed according to Eq. (11) and the off-diagonal elements were set to zero. The false acceptance, false rejection and average error rates were 23.3%, 23.5% and 23.4%, respectively, as shown in Table 6.

4.4. Comparison with other published methods [2,7]

It is difficult to compare the performance of different signature verification systems since different systems use different signature data sets. The lack of a standard international signature database is a major obstacle to performance comparison. Hence, to compare the performance of the proposed methods with other published works, two methods for signature verification [2,7] were implemented and tested with the same database as in this study. The average error rates of Ammar's [2] and Sabourin's [7] methods were 22.8% and 17.8%, respectively. The error rates of the proposed methods reported above are within the same range. As reported in Section 4.2, when the positional displacements of the local
Table 7
Comparison of the performance of the proposed method with the methods given in Refs. [2,7]

Method                       Average error rate (%)
Proposed method              18.1
Global shape features [2]    22.8
Extended shadow code [7]     17.8

Note: The error rate of the proposed method corresponds to the case of using the positional displacements at the local peaks of the vertical projection of the grey-level signature image with the full estimated covariance matrix incorporated.
Table 8
Error rates of two experienced volunteers

Volunteer         False rejection rate (%)   False acceptance rate (%)   Average error rate (%)
Volunteer 1       20.0                       18.6                        19.3
Volunteer 2       22.9                       14.8                        18.9
Overall average   21.5                       16.7                        19.1
peaks of the grey-level projections were used for verification with the full estimated covariance matrix incorporated, the average error rate was 18.1%. This shows that the performance of the proposed method is comparable to that of the other methods. The above results are given in Table 7.

4.5. Comparison with human performance

It is instructive to compare the performance of the proposed methods with that of human beings. Two volunteers were recruited to perform the signature verification task. They were familiar with the signature verification methods reported in the literature. One half of the signature database (i.e., 660 authentic signatures and 660 forgeries) was given to them for training. They studied the signatures to gain an appreciation of the characteristics of authentic signatures and forgeries. The remaining half of the database, which they had not seen, was used for testing.

During the actual verification work, 13 signatures were displayed on the computer screen each time: (i) the test signature whose authenticity was to be verified, and (ii) 12 authentic signatures to serve as reference. The 12 authentic signatures were those previously given to the volunteer for training. The volunteer had to either accept or reject the signature in question. Each volunteer was told that the probability of each unknown signature being authentic was 0.5. The average error rates of the two volunteers were 19.3% and 18.9%, giving an overall average error rate of 19.1%. Compared with the best performance of 18.1% of the proposed system, the volunteers were slightly inferior. The results are summarized in Table 8.
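The three error measures reported in the tables can be computed from a list of accept/reject decisions as sketched below. The sketch assumes that the average error rate is the arithmetic mean of the false rejection and false acceptance rates, which is consistent with the tabulated values (for Volunteer 1 in Table 8, (20.0 + 18.6)/2 = 19.3) but is an inference from the tables rather than a stated definition; the function name `error_rates` is hypothetical.

```python
def error_rates(decisions):
    """Return (FRR, FAR, average error rate) in percent.

    `decisions` is a sequence of (is_genuine, accepted) boolean pairs, one per
    test signature, and must contain at least one genuine signature and one forgery.
    """
    decisions = list(decisions)
    genuine = [accepted for is_genuine, accepted in decisions if is_genuine]
    forged = [accepted for is_genuine, accepted in decisions if not is_genuine]
    frr = 100.0 * sum(1 for accepted in genuine if not accepted) / len(genuine)
    far = 100.0 * sum(1 for accepted in forged if accepted) / len(forged)
    return frr, far, (frr + far) / 2.0
```

When the test set contains equal numbers of authentic signatures and forgeries, as in the volunteer experiment, this average coincides with the overall fraction of wrong decisions.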
5. Discussions and conclusions

In this paper, two methods are proposed to build up statistics on the positional variations of the features or strokes of signature samples for verification. One method is based on the optimal matching of the one-dimensional projection profiles of the signature patterns, and the other is based on the elastic matching of the strokes in the two-dimensional signature patterns. Given a test signature to be verified, the positional variations are compared with the statistics of the training set and a decision based on a distance measure is made. Both binary and grey-level signature images were tested.

Experimental results show that the positional variations are stable features and are useful for signature verification. An average verification error rate of 18.1% was achieved when the local peaks of the vertical projection profiles of grey-level signature images were used for matching with the full estimated covariance matrix incorporated. The verification performance of the two-dimensional elastic matching method (23.4%) was somewhat worse than that of the projection matching method.

Two methods published by other research workers [2,7] were implemented and tested with the same database as the present study. The performance of the proposed method was about the same as that of Sabourin and Genest [7] and was better than that of Ammar [2]. To compare with human performance, two volunteers with knowledge in the signature verification field were recruited to perform signature verification. Results show that the error rate of the proposed method was 1.0 percentage point lower than that achieved by the volunteers (18.1% versus 19.1%).

Intuitively, the statistics on the positional variations of the features or strokes of signature samples should be useful for verification. The present study was aimed at evaluating the usefulness of the method. Although it is not the best among all existing methods, there is the possibility of combining it with other methods to achieve better results.
As with other real-world problems, no single approach may solve the signature verification problem perfectly, and practical solutions are often derived by combining different approaches.

6. Summary

Human signature patterns are often subject to variations. In this paper, two methods are proposed to track the variations for the purpose of signature verification. Given the set of training signature samples, the first method employs an optimal matching technique to measure the positional variations in the one-dimensional projection profiles of the signature patterns. The second method employs a two-dimensional elastic matching technique to determine the variations in relative stroke positions in the two-dimensional signature patterns. The statistics on these variations are derived from the training set. Given a signature to be verified, the positional displacements in the one-dimensional projection profiles as well as the stroke displacements in the two-dimensional signature pattern are determined, and the authenticity is decided based on the statistics of the training samples.

The proposed methods were compared with two existing methods proposed by other researchers. For a fair comparison, these two existing methods were implemented and tested on the same database as the proposed method. Besides the comparison with other computer algorithms, the proposed methods were compared with human performance. Two volunteers were recruited to perform the same verification task. Results show that the proposed system compares favorably with other methods and outperforms the volunteers.
Acknowledgements

This research is supported by a research grant from the Hong Kong Research Grant Council, reference no. HKU7044/97E.
References

[1] F. Leclerc, R. Plamondon, Automatic signature verification: the state of the art – 1989–1993, Int. J. Pattern Recognition Artif. Intell. (Special Issue on Automatic Signature Verification) 8 (3) (1994) 643–660.
[2] M. Ammar, Progress in verification of skillfully simulated handwritten signatures, Int. J. Pattern Recognition Artif. Intell. 5 (1991) 337–351.
[3] R. Sabourin, G. Genest, F. Prêteux, Off-line signature verification by local granulometric size distributions, IEEE Trans. Pattern Anal. Mach. Intell. 19 (9) (1997) 976–988.
[4] Y. Qi, B.R. Hunt, Signature verification using global and grid features, Pattern Recognition 27 (12) (1994) 1621–1629.
[5] J.K. Guo, D. Doermann, A. Rosenfeld, Local correspondence for detecting random forgeries, Proceedings of the Fourth IAPR Conference on Document Analysis and Recognition, Ulm, Germany, 1997, pp. 319–323.
[6] M. Ammar, Y. Yoshida, T. Fukumura, A new effective approach for off-line verification of signatures by using pressure features, Proceedings of the Eighth ICPR, Washington, DC, USA, 1986, pp. 566–569.
[7] R. Sabourin, G. Genest, An extended-shadow-code-based approach for off-line signature verification: Part I. Evaluation of the bar mask definition, Proceedings of the 12th ICPR, Jerusalem, Israel, 1994, pp. 450–453.
[8] E.J.R. Justino, F. Bortolozzi, R. Sabourin, Off-line signature verification using HMM for random, simple and skilled forgeries, Proceedings of the Sixth International Conference on Document Analysis and Recognition, 2001, pp. 1031–1034.
[9] Y. Mizukami, H. Miike, M. Yoshimura, I. Yoshimura, An off-line signature verification system using an extracted displacement function, Proceedings of the ICDAR'99 Fifth International Conference on Document Analysis and Recognition, 1999, pp. 757–760.
[10] J.N. de Gouvea Ribeiro, G.C. Vasconcelos, Off-line signature verification using an auto-associator cascade architecture, Proceedings of the IJCNN'99 International Joint Conference on Neural Networks, Vol. 4, 1999, pp. 2882–2886.
[11] N. Papamarkos, H. Baltzakis, Off-line signature verification using multiple neural network classification structures, Proceedings of DSP 97, 13th International Conference on Digital Signal Processing, Vol. 2, 1997, pp. 727–730.
[12] F. Itakura, Minimum prediction residual principle applied to speech recognition, IEEE Trans. Acoust. Speech Signal Process. 23 (1975) 67–72.
[13] Y. Sato, K. Kogure, Online signature verification based on shape, motion, and writing pressure, Proceedings of the Sixth ICPR, Munich, 1982, pp. 823–826.
[14] M. Yasuhara, M. Oka, Signature verification experiment based on non-linear time alignment: a feasibility study, IEEE Trans. Systems Man Cybernet. 17 (1977) 212–216.
[15] M. Parizeau, R. Plamondon, A comparative analysis of regional correlation, dynamic time warping, and skeletal tree matching for signature verification, IEEE Trans. Pattern Anal. Mach. Intell. 12 (7) (1990) 710–717.
[16] J.H. Friedman, Regularized discriminant analysis, J. Am. Statist. Assoc. 84 (1989) 165–175.
[17] J.P. Hoffbeck, D.A. Landgrebe, Covariance matrix estimation and classification with limited training data, IEEE Trans. Pattern Anal. Mach. Intell. 18 (7) (1996) 763–767.
[18] C.H. Leung, C.Y. Suen, Matching of complex patterns by energy minimization, IEEE Trans. Systems Man Cybernet. Part B 28 (5) (1998) 712–720.
About the Author – B. FANG received the B.Eng. degree in Electrical Engineering from Xi'an Jiaotong University, the M.Sc. degree in Electrical Engineering from Sichuan University and the Ph.D. degree in Electrical Engineering from the University of Hong Kong. He is currently a Research Fellow at the Singapore-MIT Alliance of the National University of Singapore. His research interests include computer vision, pattern recognition, document analysis and medical image processing.

About the Author – C.H. LEUNG received the B.Sc. (Eng.) and Ph.D. degrees from the University of Hong Kong, and the M.Eng. degree from McGill University, all in Electrical Engineering. He worked for several years in industry and as a Lecturer with the Polytechnics in Hong Kong. Since 1986, he has been working at the Department of Electrical and Electronic Engineering, University of Hong Kong. He is currently an Associate Professor. His main research interest is in pattern recognition and computer vision.

About the Author – YUAN Y. TANG received the B.S. degree in Electrical and Computer Engineering from Chongqing University, Chongqing, China, the M.Eng. degree in Electrical Engineering from the Graduate School of Post and Telecommunications, Beijing, China, and the Ph.D. degree in Computer Science from Concordia University, Montreal, Canada. He is presently a Professor in the Department of Computer Science at Hong Kong Baptist University and Adjunct Professor in the Centre for Pattern Recognition and Machine Intelligence, Concordia University. He is an Honorary Lecturer at the University of Hong Kong and an Advisory Professor at many institutes in China. His current interests include wavelet theory and applications, pattern recognition, image processing, document processing, artificial intelligence, parallel processing, Chinese computing and VLSI architecture. Professor Tang has published more than 200 technical papers and is the author/co-author of 18 books on subjects ranging from electrical engineering to computer science. He was the General Chairman of the 17th International Conference on Computer Processing of Oriental Languages (1997), and a Program Chair of the 2nd International Conference on Multimodal Interface (1999) and the 2nd International Conference on Wavelet Analysis and Applications (2001). He is a Senior Member of IEEE. He is the Founder and Editor-in-Chief of the International Journal on Wavelets, Multiresolution, and Data Processing (IJWMDP) and an Associate Editor of the International Journal of Pattern Recognition and Artificial Intelligence.

About the Author – K.W. TSE received his B.Sc. degree in Computer Science and Ph.D. degree in Computation from the University of Manchester, UK. He worked for several companies in the UK and Canada, including Burroughs Machines Ltd., International Computers Ltd., Bell-Northern Research and Microtel Pacific Research, before joining the Department of Electrical Engineering, University of Hong Kong in 1986. His current interests include computer communications, computer architecture and applications.

About the Author – PAUL C.K. KWOK holds a B.Sc. in Telecommunication Engineering from the University of Essex and a Ph.D. degree in Electrical Engineering from the University of Cambridge. He has worked for Monotype International, the Hong Kong Polytechnic, The Chinese University of Hong Kong, The University of Hong Kong and The University of Calgary. He is currently an Associate Professor at The Open University of Hong Kong. His current interest is in document processing, computer architecture and LED design. He co-discovered the Pentium II Sign Extension Bug, otherwise known as Errata 44 in Intel's Pentium II Specification Update.

About the Author – Y.K. WONG received the B.Sc. and M.Sc. degrees from the University of London, and the Ph.D. degree from the Heriot-Watt University, UK. He joined the Hong Kong Polytechnic University in 1980. Dr. Wong is a Member of the IEE and a Senior Member of the IEEE. His current research interests include modeling, simulation, artificial intelligence, intelligent control and power system control.