Deterring Password Sharing: User Authentication via Fuzzy c-Means Clustering Applied to Keystroke Biometric Data

Salvador Mandujano, Rogelio Soto
Instituto Tecnológico y de Estudios Superiores de Monterrey
Center for Intelligent Systems
Monterrey, Mexico
{smv, rsoto}@itesm.mx

Abstract

This paper describes a clustering-based system that enhances user authentication by applying fuzzy techniques to biometric data in order to deter password sharing. Fuzzy c-Means is used to train personal, per-keyboard profiles based on the keystroke dynamics of users as they enter passwords on a keyboard. These profiles are encrypted with DES, using the actual password as the key, and are read at logon time by the access control mechanism in order to further validate the identity of the user. Fuzzy values obtained from membership functions applied to the input (i.e., keystroke latencies) are compared against profile values, and a match within a certain precision threshold γ grants access to the user. With this technique, even when user A shares password $P_A$ with user B, B will still be denied access unless he is capable of mimicking the keystroke dynamics of A. We describe the motivation, design, and implementation of a prototype whose results indicate the accuracy level and feasibility of the approach.

1. Introduction

Biometric mechanisms represent the strongest means to authenticate people [1, 3, 13]. As human beings we have characteristics that help distinguish us from others. Our genetic code, fingerprints, handwriting, and ocular retinal pattern are examples of biometric features that make us unique and distinguishable as individuals. There is another source of biometric data which has not been exploited for the purpose of strengthening user identification: the typing patterns of a person when using a computer keyboard [12, 5]. The keystroke frequency of a user is a distinctive feature that, even though it is not as precise as others in terms of entropy and classification power [1], has the advantage of not requiring costly equipment and software to be implemented, and therefore can be used to strengthen password-based authentication.

Password systems have been the favorite authentication method in electronic systems for years for several reasons: they are straightforward to implement, easy to use and maintain, their precision can be adjusted by enforcing password-structure policies or by changing cryptographic algorithms depending on the security level desired, and they are an inexpensive, scalable way of validating users, both locally and remotely, to all sorts of services [10, 2]. If a username or password does not match the information stored in the access control repository at logon time, the user will be denied access; otherwise, he will be able to use the system.

The security of traditional password systems resides in the actual ciphered string containing the password and the inability to decipher it while it is stored on the filesystem (shadow-password models hide password information from users in order to avoid dictionary attacks and password cracking, although many systems typically allow users to read this information given that it is encrypted [2]). These authentication mechanisms are based on something the user knows, in this case, a password. If someone gets to know the password, he will be perfectly (although inappropriately) able to log on to the system. There exist other models based on something the user owns, for instance, a lock, a key, or a badge. Any person who gets one of these objects will be able to access the protected resource with no trouble. A third type of system is based on something the user is, meaning biometric features of the user. These methods convey much more information for user validation and are more difficult to break, as they are based on something that is more difficult to share (unlike passwords and access badges, for example) [10].

The system introduced in this paper fuses two of these security mechanisms in order to fortify user authentication. It employs a password string as something the user must know, and complements it with the corresponding keystroke pattern of the user, which represents something the user must be. If he knows the password but cannot type it on the keyboard at the right pace, he will be unable to log on. Similarly, if the user does not remember the password but is actually the legitimate user, he will still be denied access. Both components need to be present for the user to be let in: the password and a “good-enough” keystroke pattern (see Section 4 for details on the matching mechanism). If either is missing, no access is permitted.

This paper comprises the following sections. Section 2 describes the problem of user authentication and password sharing. Section 3 reviews background information on clustering and password systems, including related research projects. Section 4 outlines the design and implementation of the fuzzy c-Means prototype. Section 5 describes the experimental results obtained with the prototype. Conclusions and references are at the end of the document.

Proceedings of the Fifth Mexican International Conference in Computer Science (ENC’04) 0-7695-2160-6/04 $20.00 © 2004 IEEE

2. Problem description

Users share their personal passwords with others in order to give them access to individual or corporate electronic accounts [11, 12, 6]. If a user wants someone else to enter the system on his behalf, he just needs to let that someone know the password. This weakness can also be observed if a password has been sniffed on a connection line: someone tapping a network is potentially able to capture all the passwords that travel in the clear [13]. If passwords are not protected with cryptography or with a secondary mechanism, such as application wrappers that hide all human-readable strings by scrambling the data, they will give access to an intruder who will be able to abuse the privileges of the account and, perhaps, extend the break-in to other areas of the compromised host. These incidents have caused serious losses over the last years and constitute a priority for the information security teams of many governments and corporations around the globe [11].

For system administrators, if more than one user is logging on to a host using the same user account, they will be unable to tell which of those users should be held accountable for which actions, especially when it comes to anomalous activity. Multi-user systems require mechanisms to make sure that all the accounting is done correctly, and hardening password-based user authentication is a way of guaranteeing the integrity of system records.

By improving authentication through biometrics, in our case keystroke pattern analysis, it is possible to prevent people from using passwords that do not belong to them. Consequently, an intruder will have to do two things in order to get access to a system account: 1) get the password of one of the users, and 2) reproduce the typing patterns of the owner of that password. This additional security layer on top of password strings makes security stronger, as biometric data is difficult to imitate and even to communicate over the phone or through email [1]. With this mechanism we deter intentional password sharing and reduce the threat posed by a compromised password.

3. Background

This section covers three topics. It first describes crisp clustering methods and then goes on to compare them with fuzzy clustering. Toward the end of the section, we cite other projects related to the keystroke approach followed by this paper.

3.1 Data clustering

Clustering algorithms are a form of unsupervised learning used to identify groupings among a population of individuals [7, 8]. They analyze the similarity of a set of samples in order to identify possible groupings to split up the set. Once these groupings or clusters are defined, a new incoming sample point can be classified according to its particular features and can be put into one of the clusters. A point describing the members of a cluster is calculated so that new points can be compared against it [14]. Some methods call this point the centroid of the cluster; it is recomputed during the learning phase and/or as new members are received.

Let $G$ be a set of points and let $W(G)$ be the power set of $G$. $C$ will be a cluster or partition of $G$ if and only if $C \in W(G)$ (i.e., $C$ is a possible subset of $G$). We can define a binary membership function for $C$ as follows:

$$u_C(x) = \begin{cases} 1 & \text{if } x \in C \\ 0 & \text{if } x \notin C \end{cases}$$

where $x$ is an incoming sample. Now suppose there are two clusters, $C_1$ and $C_2$. We can apply a function like the one above to a group of three input points $p$, $q$ and $r$, and build a partition matrix $U$ with their crisp membership values. $U$ will indicate to which of the clusters every sample belongs:

$$U = \begin{bmatrix} 1 & 0 & 1 \\ 0 & 1 & 0 \end{bmatrix}$$

A value of 1 denotes absolute membership and a 0 means absolute non-membership. Any given point will belong to one and only one cluster depending on the outcome of a similarity measure used to group similar individuals together. For evaluating similarity, the point to classify is compared with the centroids. In the case of data points in an n-dimensional space, the similarity measure could be a certain type of distance function between them [8, 14]. The closer they are, the stronger the probability of belonging in the same cluster. Figure 1 shows three clusters and how they split data points. Note that there is no overlapping since a point will belong exclusively to the cluster that is closest in distance.

Figure 1. Three crisp-membership clusters ($C_1$, $C_2$, and $C_3$) on a two-dimensional space (axes $x_1$ and $x_2$) defined by elliptic functions. Crosses represent centroids.

These clusters classify points according to their coordinate values in $x_1$ and $x_2$ and the position of the centroid. In circular clusters, for instance, the centroid is located at the geometrical center of the set and its coordinates correspond to the mean values of the coordinates of the cluster’s members (for other cluster shapes, the centroid and similarity function may vary [14]). When a new point $p_i$ needs to be classified, it is compared against current cluster centroids. In the case of the example in Figure 1, three membership values will be computed for each point $p_i$: $u_{C_1}(p_i)$, $u_{C_2}(p_i)$, and $u_{C_3}(p_i)$. The highest membership value will determine the right cluster for $p_i$.
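The crisp assignment described in this section can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the prototype's code: the function name `crisp_partition`, the sample coordinates, and the use of Euclidean distance as the similarity measure are all hypothetical choices made for the example.

```python
import math

def crisp_partition(points, centroids):
    """Return the crisp partition matrix U (one row per cluster, one column per point)."""
    U = [[0] * len(points) for _ in centroids]
    for k, p in enumerate(points):
        # Euclidean distance from p to every centroid; the nearest centroid wins.
        distances = [math.dist(p, c) for c in centroids]
        nearest = distances.index(min(distances))
        U[nearest][k] = 1
    return U

# Hypothetical data: two centroids and three sample points p, q, r.
centroids = [(0.0, 0.0), (10.0, 10.0)]
points = [(1.0, 2.0), (9.0, 8.0), (0.5, 0.5)]
U = crisp_partition(points, centroids)
print(U)  # [[1, 0, 1], [0, 1, 0]]
```

With these sample values the result reproduces the partition matrix shown earlier in this section: every column contains exactly one 1, so each point belongs to one and only one cluster.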

3.2 Fuzzy c-Means clustering

The c-Means algorithm [14] is a fuzzy clustering technique that works much like the above method but provides additional flexibility regarding membership. An individual will belong to one or more classes or clusters with different membership degrees. This idea arises from the fact that it can be ambiguous whether a point must go into a certain cluster and not into another (consider points with equal membership for two clusters, for instance). To deal with this ambiguity, it is necessary to introduce some fuzziness into the formulation of the problem. Instead of having precise, crisp boundaries for a cluster, representing a binary threshold which indicates whether a point definitely belongs to a cluster or not, fuzzy membership functions compute a membership degree of each point for every cluster. c-Means defines clusters from a set of input points using this loose membership strategy; it is the best-known algorithm developed for this purpose [14].

The output of a fuzzy membership function will be a real value between 0 and 1, that is, $u_F(x) \in [0, 1]$ for a fuzzy cluster $F$ and an input point $x$. The partition matrix for two fuzzy clusters and three input points will look something like this:

$$U_f = \begin{bmatrix} 0.24 & 0.15 & 0.93 \\ 0.76 & 0.85 & 0.03 \end{bmatrix}$$

Each number represents the membership degree of a point with respect to a cluster. The first of the three points will belong to the first cluster with membership 0.24 and, at the same time, it will also belong to the second cluster but with a membership of 0.76. The c-Means algorithm will build the clusters, compute their corresponding centroids and maintain $U_f$. The algorithm works as follows:

Step 1. Given an input data set $X = (x_1, x_2, \ldots, x_n)$, where $x_i \in \mathbb{R}^k$, fix the number $c$ of clusters with $c \in \{2, 3, \ldots, n-1\}$ ($c$ is the variable that gives its name to the algorithm). Fix the weighting exponent $m \in (1, \infty)$ (it must be strictly greater than 1 for the update in Step 3 to be defined) and initialize the partition matrix $U^{0}$.

Step 2. At iteration $l$, with $l \in \mathbb{N} \cup \{0\}$, compute the $c$ mean vectors $v_i$ with $i \in \{1, 2, \ldots, c\}$; these are the average points of the $c$ clusters. With $u_{ik}$ being the membership of $x_k$ with respect to cluster $i$:

$$v_i^{l} = \frac{\sum_{k=1}^{n} (u_{ik}^{l})^{m} \, x_k}{\sum_{k=1}^{n} (u_{ik}^{l})^{m}}, \quad 1 \le i \le c$$

Step 3. Update $U^{l} = [u_{ik}^{l}]$ to $U^{l+1} = [u_{ik}^{l+1}]$:

$$u_{ik}^{l+1} = \frac{1}{\sum_{j=1}^{c} \left( \dfrac{\lVert x_k - v_i^{l} \rVert}{\lVert x_k - v_j^{l} \rVert} \right)^{2/(m-1)}}, \quad 1 \le i \le c, \; 1 \le k \le n$$

Step 4. If $\lVert U^{l+1} - U^{l} \rVert < e$, where $e$ is the error tolerance, stop; otherwise let $l = l + 1$ for the next iteration and go to Step 2.

The algorithm will converge to a set of $c$ clusters and a partition matrix $U$ which contains the membership values of each point with respect to the clusters.
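The four steps above can be sketched in plain Python as follows. This is a minimal illustration, not the prototype's implementation: the function name `fuzzy_c_means`, the default exponent `m=2.0`, the seeded random initial partition, the iteration cap, the use of the largest entry-wise change as the stopping norm, and the sample latencies are all assumptions made for the example.

```python
import math
import random

def fuzzy_c_means(X, c, m=2.0, e=1e-5, max_iter=300, seed=0):
    """Cluster the points in X (tuples of floats) into c fuzzy clusters.

    Returns (V, U): V holds the c centroids and U[i][k] is the
    membership of X[k] in cluster i.
    """
    rng = random.Random(seed)
    n, dim = len(X), len(X[0])
    # Step 1: random initial partition whose columns sum to 1 (assumed scheme).
    U = [[rng.random() for _ in range(n)] for _ in range(c)]
    for k in range(n):
        col = sum(U[i][k] for i in range(c))
        for i in range(c):
            U[i][k] /= col
    V = []
    for _ in range(max_iter):
        # Step 2: centroids as membership-weighted means.
        V = []
        for i in range(c):
            w = [U[i][k] ** m for k in range(n)]
            total = sum(w)
            V.append(tuple(sum(w[k] * X[k][d] for k in range(n)) / total
                           for d in range(dim)))
        # Step 3: update memberships from the distances to the centroids.
        new_U = [[0.0] * n for _ in range(c)]
        for k in range(n):
            dist = [math.dist(X[k], V[i]) for i in range(c)]
            zeros = [i for i in range(c) if dist[i] == 0.0]
            for i in range(c):
                if zeros:
                    # The point coincides with a centroid: split membership among those.
                    new_U[i][k] = 1.0 / len(zeros) if i in zeros else 0.0
                else:
                    new_U[i][k] = 1.0 / sum((dist[i] / dist[j]) ** (2.0 / (m - 1.0))
                                            for j in range(c))
        # Step 4: stop when the largest membership change drops below e.
        change = max(abs(new_U[i][k] - U[i][k]) for i in range(c) for k in range(n))
        U = new_U
        if change < e:
            break
    return V, U

# Hypothetical one-dimensional latencies in milliseconds.
latencies = [(95.0,), (100.0,), (105.0,), (195.0,), (200.0,), (205.0,)]
centroids, U = fuzzy_c_means(latencies, c=2)
```

On this well-separated sample the two centroids settle close to 100 ms and 200 ms, and each column of `U` sums to 1, which is the property that distinguishes a fuzzy partition matrix from an arbitrary table of scores.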

3.3 Related projects

This approach to user authentication has not been widely explored and just a few projects involving enhanced security through keystroke dynamics have been developed, the main difference among them being the type of technique used [12, 4, 9].

Ru and Eloff used fuzzy classes to characterize the typing behavior of system users but they did not apply any sort of clustering [12]. In addition to keystroke information, they incorporated a password complexity value based on the distances between keys on the keyboard.

Joyce and Gupta used the same sort of “variables that make a handwritten signature a unique human identifier” in order to define a stream of latency values that make up a profile [4]. No clustering was used here either, but the results obtained from this project clearly support the use of keystroke biometrics for password-based authentication.

Ogoshi et al. created a variant of the traditional keystroke-speed model [9]. They generate “user rhythms” which capture a broader pattern describing a user’s keystroke frequencies regardless of any password. This model is less accurate than the others, but it can certainly be used in a more elaborate sort of authentication (perhaps challenge-response authentication using phrases).

Given that all of these solutions use individual variables to capture different features from the user, clustering lends itself naturally to this purpose, as it can be used to extract information from those variables in order to learn the behavior of each biometric aspect they capture. We explore the fuzzy version of this method as an alternative solution by defining considerably more variables per key, which lets us increase detection accuracy at the key level and not at the password level as in the other models.

4. Design and implementation

4.1 Input variables

This system is designed to learn the keystroke patterns of users. When a password is entered, two variables are considered: 1) the time a key stays pressed, and 2) the time interval between releasing a key and pressing the next one (see Figure 2).

Figure 2. Pressed-key ($p_1$, $p_2$, $p_3$) and released-key ($r_1$, $r_2$) variables for a three-letter password (keys X, Y, and Z) along the time axis $t$.

These two values, often referred to as latencies, are read every time a key is pressed and they constitute the input data to the clustering module. If we have, for example, a password length of three characters, we will have two released-key latencies, $r_1$ and $r_2$ (which are intervals), and three pressed-key latencies, $p_1$, $p_2$, and $p_3$. In general, for an n-character password, there will be $n$ pressed-key values and $n - 1$ released-key values.

A password needs to be entered $k$ times by the user during training. For convenience, the prototype offers four values of $k$: 5, 10, 20, or 30. From the training phase we will get $kn$ pressed-key values and $k(n - 1)$ released-key values. The c-Means algorithm is then applied to define clusters that will capture the speed at which each key is being pressed and released. Every $p_i$ and $r_i$ variable will have three latency clusters attached to it: a slow latency, a correct latency, and a fast latency. Each one represents a fuzzy range that describes how accurate the entered value is. Figure 3 depicts three clusters and their corresponding centroids. Unlike crisp clustering, there is overlapping in fuzzy models.

Figure 3. Fuzzy clusters for a single password character along the time axis $t$: $d_1$ corresponds to slow latency, $d_2$ to correct latency, and $d_3$ to fast latency. Each cluster has a centroid denoted by $c_i$.

These values are based on time and, as such, are one-dimensional. The computation of centroids, in this case, is equivalent to computing the arithmetic mean of input times. Clusters are created with the purpose of finding the actual centroids, which, coupled with standard deviation, represent a biometric feature captured from the user.
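As an illustration of the two input variables, the following Python sketch derives pressed-key and released-key latencies from raw key events. It is a hypothetical helper, not part of the prototype: the function name `keystroke_latencies`, the event format (one press timestamp and one release timestamp per key), and the sample timestamps are assumptions made for the example.

```python
def keystroke_latencies(events):
    """Compute latencies from key events given in typing order.

    events: list of (press_time, release_time) pairs, one per key.
    Returns (pressed, released): n pressed-key values and n - 1 released-key values.
    """
    # Pressed-key latency: how long each key stays down.
    pressed = [release - press for press, release in events]
    # Released-key latency: gap between releasing a key and pressing the next one.
    released = [events[i + 1][0] - events[i][1] for i in range(len(events) - 1)]
    return pressed, released

# Hypothetical timestamps in milliseconds for the keys X, Y, Z.
events = [(0, 90), (150, 230), (310, 405)]
pressed, released = keystroke_latencies(events)
print(pressed)   # [90, 80, 95]
print(released)  # [60, 80]
```

A three-character entry yields three pressed-key values and two released-key values, matching the $n$ and $n - 1$ counts given in the text. Note that a released-key value would be negative if the next key were pressed before the previous one was released; how the prototype treats that case is not stated in the source.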

4.2 Authentication with fuzzy values

Once the system is trained and it has learned a keystroke pattern, the information is ciphered with DES [13] using the actual password as encryption and decryption key. (If the password is wrong, regardless of the keystroke pattern, the profile will not be accessed. If it is correct, the profile is deciphered and compared against the observed pattern.) Clustering data stored on the profile will be used to decide whether or not a user is the person he says he is.


If the user wants to authenticate to the system, he will type in his password and keystroke latencies will be read by the security system (see the modules of the prototype in Figure 5). For instance, if the second letter of a three-character password is “Y” then the following six centroids will be computed: $p_{(2,s)}$, $p_{(2,c)}$, and $p_{(2,f)}$, which correspond to the slow, correct, and fast pressed-key values, and $r_{(2,s)}$, $r_{(2,c)}$, and $r_{(2,f)}$, corresponding to the released-key values (the time interval previous to character “Y” has already been computed with the first input character).

The observed keystroke latencies are computed and then compared to the corresponding centroid using the standard deviation $\sigma_i$ of the cluster computed during training. The correct matches (that is, matches that fall into the correct latency cluster) for all variables are counted in a variable $h$, which is divided by the maximum number of possible correct matches $m$. The quotient is compared to the access threshold $\alpha$ to determine whether the user will be granted access:

$$q = \frac{h}{m}, \qquad \begin{cases} \text{if } q \le \alpha, & \text{then “access granted”} \\ \text{if } q > \alpha, & \text{then “access denied”} \end{cases}$$

There is a precision constant $\gamma$ which helps fine-tune the evaluation of profiles. This value is multiplied by all $\sigma_i$ in order to adjust the acceptance range of clusters (see Figure 4). A small $\gamma$ yields a very narrow range within which a variable can be considered correct, whereas a larger $\gamma$ provides more flexibility when entering the password (i.e., the matching will not be as tight, since the input need not be too close to the centroids and standard deviation values stored in the profile). $\alpha$ and $\gamma$ are configuration settings that the administrator can use to regulate how narrow the matching area will be.

The above computations determine profile values which characterize the keystroke latencies of the user. If the user types in his password at the usual keystroke rhythm, the system will take that to mean he is probably the legitimate owner of the account. If $\gamma$ is too tight, the user might have to try several times before successfully logging in. Since latencies may differ among keyboards, the prototype allows the user to define per-keyboard profiles.

The proposed structure comprises three fuzzy sets that correspond to the three fuzzy clusters defined for each variable (Figure 4). Every key entered by the user will generate two or three variables (depending on whether it is an initial or an intermediate character in the password), which generate three membership values each. This increases the level of detail used to characterize keystroke behavior. If a variable has a higher membership for the correct set, it means it was typed at the right pace. If it belongs to any other set, the matching algorithm will determine whether that value, along with the rest of the characters, forms, as a whole, an acceptable password.
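The computation of the quotient $q = h/m$ can be sketched in Python as follows. This is a hedged illustration rather than the prototype's matching code: the function name `match_quotient`, the profile layout (one centroid and one standard deviation for the correct cluster of each variable), the sample numbers, and in particular the rule that a value counts as a correct match when it lies within $\gamma\sigma_i$ of the centroid are assumptions drawn from the description of Figure 4.

```python
def match_quotient(observed, profile, gamma):
    """Return q = h / m for one login attempt.

    observed: latencies of the attempt, one per variable.
    profile:  (centroid, sigma) of the 'correct' cluster for each variable.
    gamma:    precision constant that scales every sigma.
    """
    # h counts the variables whose latency falls inside the acceptance range
    # centroid +/- gamma * sigma (assumed matching rule).
    h = sum(1 for value, (centroid, sigma) in zip(observed, profile)
            if abs(value - centroid) <= gamma * sigma)
    m = len(profile)  # maximum number of possible correct matches
    return h / m

# Hypothetical profile and login attempt (milliseconds).
profile = [(90.0, 5.0), (60.0, 10.0), (80.0, 4.0), (80.0, 8.0), (95.0, 6.0)]
observed = [92.0, 75.0, 81.0, 79.0, 120.0]

q_tight = match_quotient(observed, profile, gamma=1.0)
q_loose = match_quotient(observed, profile, gamma=3.0)
print(q_tight, q_loose)  # 0.6 0.8
```

The example shows the effect of the precision constant: widening the range from $\sigma_i$ to $3\sigma_i$ turns one additional variable into a correct match. The resulting quotient is then compared with the access threshold $\alpha$ as described in the text; that comparison is deliberately left out of the sketch.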

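Figure 4. Fuzzy membership functions $u(t)$ (slow, correct, and fast) for a single character along the time axis $t$. Standard deviation values ($\sigma$) are computed during training and are adjusted by the tuning variable $\gamma$, giving an acceptance range of $\gamma\sigma$.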

5. Experimental results

The prototype was developed in Java and is composed of two modules: the training module and the test module. The training module is designed to learn the keystroke patterns from users. Values for keyboard number, user name, and number of training rounds need to be defined before starting the training. The keyboard number lets the user have several profiles, given that the typing patterns of a person may vary from keyboard to keyboard.

Figure 5. The training and testing modules communicate by reading from and writing to keystroke profiles

In Figure 5, the lower panel of the training module plots keystroke latencies read from the keyboard. This gives an intuitive idea of what a keystroke pattern looks like, but it is displayed for visualization purposes only. Once the user has entered the password the requested number of times (only correct passwords are taken into account for the computation of clusters; latencies corresponding to wrong passwords are discarded), a profile is created. This profile stores the centroids and standard deviations computed with c-Means. The profile is then ciphered with the actual password using DES, which is implemented by the org.logi.crypto.keys.DESkey Java class (the prototype system was developed in Java 2 using a 1.4.2 build).

Once the user has trained the system, he can launch the test module to evaluate the learned patterns. A logon screen will send the user an “access granted” or an “access denied” message indicating whether the authentication succeeded or not (matching variables α and γ are currently hard-coded into the program).

For experimentation purposes, one keyboard was used by 15 users who trained the system with their passwords and created their profiles. They were asked to type the password of each of the other 14 users 15 times (that amounts to roughly 15² training samples and 15² tests, which generate 2n latency variables each, where n is the length of the password). The results are shown in Table 1.

ID   User      Password        Success   Failure
1    gork      Kennedy1        0.95      0.17
2    cowboy    Winding96       0.91      0.21
3    alex      blacknight      0.95      0.32
4    donna     asdfasdf        0.94      0.31
5    watcher   w2567           0.89      0.11
6    vrije     r011ing$tone$   0.94      0.06
7    hicss     anthropos       0.93      0.24
8    three     aebgtw1         0.92      0.09
9    lucky     wanfcp0e        0.89      0.11
10   domino    DOMINO          0.96      0.23
11   fam       haml00          0.97      0.09
12   edward    e2mv            0.94      0.21
13   walta     Haba11owd00$    0.94      0.04
14   iaomi     iaomi76         0.98      0.22
15   Duff      d123998         0.89      0.10

Table 1. The Success column denotes the success rate at identifying legitimate owners (high is good). The Failure column denotes the rate of failure to detect an impostor (high is bad).

The overhead of this fuzzy authentication mechanism is quite small. It grows as O(n) with the length of the password and, considering the salt variation used by many password modules of Unix systems [2], this overhead would be equivalent to the use of a salt variable from a busy-waiting perspective.

Longer passwords provide a better means to learn a user’s keystroke pattern: the number of variables increases with the length of the password, which allows for increased accuracy. It can also be inferred that passwords containing dictionary words are weaker, and that an unauthorized user can correctly type passwords that are short. The failure rate to detect an impostor is high for easy-to-type passwords, but the inclusion of special characters and numbers provides additional security to the password and its corresponding keystroke profile.

An important point is that, if an attacker is not aware that the password system features this biometric support, he will probably try a password only a few times before giving up (in the experiments, all users were requested to try each password 15 times). This considerably increases the success of our approach.

From the learning perspective, the success rate obtained with fuzzy clustering is high, resulting in the positive identification of legitimate users. Failure rates are low if we consider that 14 users tried to break into a system knowing the password beforehand and attempting to log on 15 times with each password. It would be convenient to combine a supporting authentication module like this with a password policy that eliminates passwords that are easy to guess and type [2].

6. Conclusions

Password authentication can be conveniently enhanced through keystroke pattern monitoring. The proposed fuzzy method using c-Means clustering provides an extra level of security that makes password authentication stronger. The main benefit of this approach is limiting the effects of password sharing and password stealing by including additional variables in the authentication equation. Our experimental results show that this sort of biometric measure effectively identifies legitimate users and impostors, and the prototype can be fine-tuned to regulate the level of accuracy required for gaining access to the system.

References

[1] R. Bolle. Guide to Biometrics. Springer-Verlag, 1st edition, December 2003.
[2] S. Garfinkel and E. H. Spafford. Practical UNIX Security. O'Reilly, 2nd edition, April 1996.
[3] R. Hsu, M. Abdel-Mottaleb, and A. Jain. Face detection in color images. IEEE Transactions on Pattern Analysis and Machine Intelligence, 25(5):696–706, March 2002.
[4] R. Joyce and G. Gupta. Identity authentication based on keystroke latencies. Communications of the ACM, 33(2):168–176, 1990.
[5] S. Kumar. Classification and Detection of Computer Intrusions. PhD thesis, Department of Computer Sciences, Purdue University, West Lafayette, IN, 1995.
[6] A. K. Lenstra and E. R. Verheul. Selecting cryptographic key sizes. Journal of Cryptology: the journal of the International Association for Cryptologic Research, 14(4):255–293, 2001.
[7] D. Matula. Graph theoretic techniques for cluster analysis algorithms. Classification and Clustering, 1977.
[8] T. M. Mitchell. Machine Learning. McGraw Hill, 1st edition, 1997.
[9] Y. Ogoshi, A. Hinata, S. Hirose, and H. Kimura. Improving user authentication based on keystroke intervals by using intentional keystroke rhythm. IPSJ Journal, 44(2–21), March 2003.
[10] C. P. Pfleeger. Security in Computing. Prentice Hall Inc., Upper Saddle River, NJ, 2nd edition, 1997.
[11] R. Richardson. Computer crime & security survey 2003. Technical report, Computer Security Institute (CSI) and Federal Bureau of Investigation (FBI), 2003.
[12] W. G. Ru and J. H. Eloff. Enhanced password authentication through fuzzy logic. In IEEE Expert, volume 12, pages 38–45, Nov/Dec 1997.
[13] B. Schneier. Applied Cryptography. John Wiley and Sons, New York, NY, 2nd edition, 1996.
[14] L.-X. Wang. A Course in Fuzzy Systems and Control. Prentice Hall, Inc., Upper Saddle River, NJ, 1997.
