Knowledge-Based Systems 56 (2014) 216–225


Combining QoS prediction and customer satisfaction estimation to solve cloud service trustworthiness evaluation problems

Shuai Ding a,b,*, Shanlin Yang a,b, Youtao Zhang c, Changyong Liang a,b, Chengyi Xia d

a School of Management, Hefei University of Technology, Box 270, Hefei 230009, Anhui, PR China
b Key Laboratory of Process Optimization and Intelligent Decision-Making, Ministry of Education, Hefei 230009, Anhui, PR China
c Department of Computer Science, University of Pittsburgh, Pittsburgh 15213, PA, USA
d Tianjin Key Laboratory of Intelligence Computing and Novel Software Technology, Tianjin University of Technology, Tianjin 300191, PR China

Article info

Article history: Received 12 February 2013; Received in revised form 10 November 2013; Accepted 16 November 2013; Available online 23 November 2013

Keywords: Cloud computing; Service trustworthiness; Multi-attribute evaluation; QoS prediction; Customer satisfaction

Abstract

The collection and combination of assessment data in the trustworthiness evaluation of cloud services is challenging, notably because QoS values may be missing in offline evaluation situations due to the time-consuming and costly nature of cloud service invocation. Considering the fact that many trustworthiness evaluation problems require not only objective measurement but also subjective perception, this paper designs a novel framework named CSTrust for conducting cloud service trustworthiness evaluation by combining QoS prediction and customer satisfaction estimation. The proposed framework considers how to improve the accuracy of QoS value prediction on quantitative trustworthy attributes, as well as how to estimate the customer satisfaction of the target cloud service by taking advantage of the perception ratings on qualitative attributes. The proposed methods are validated through simulations, demonstrating that CSTrust can effectively predict assessment data and release trustworthiness evaluation results.

© 2013 Elsevier B.V. All rights reserved.

1. Introduction

In cloud computing environments, computing and storage resources are modeled as services which can be delivered or used as utilities wherever needed [1–3]. With the fast and wide adoption of cloud computing, many small and medium enterprises (SMEs) and individual users prefer to apply cloud services to build their business systems or personal applications. In many domains, multiple cloud services often provide similar functional properties [4]. For example, in customer relationship management (CRM), vendors offer many functionally-equivalent cloud services, such as Microsoft Dynamics CRM, Salesforce Sales Cloud, SAP Sales OnDemand, and Oracle Cloud CRM. However, given that SMEs and individual users often lack cloud computing expertise, manually selecting an appropriate candidate from a set of functionally-equivalent cloud services is tedious.

Cloud service trustworthiness is known as a comprehensive quality measure of cloud services [5]. It also reflects the user's cognition of a cloud service with respect to multiple trustworthy attributes, such as reliability, scalability, availability, safety, and security [6–10]. For many of the same reasons that distinguishing and evaluating software trustworthiness [11–13] is closely

* Corresponding author at: School of Management, Hefei University of Technology, Box 270, Hefei 230009, Anhui, PR China. E-mail address: [email protected] (S. Ding).
0950-7051/$ - see front matter © 2013 Elsevier B.V. All rights reserved. http://dx.doi.org/10.1016/j.knosys.2013.11.014

related to quality guarantee, designing effective methods or models for cloud service trustworthiness evaluation has become a challenging and urgently required research problem.

Trustworthiness-aware service evaluation and selection has gained much attention in the Service-Oriented Computing and Cloud Computing research communities over the past two decades. Some widely adopted methods include: AHP-based cloud service ranking [7], reputation-aware service selection and rating [14], trust-aware service selection [15], SLA-based trustworthiness estimation [16], QoS ranking prediction [4,17], and probabilistic models of service trustworthiness [18]. More recently, with the development of cloud computing and social networks, feedback-based risk analysis [19] and trust modeling [20] have become more and more important as ways to improve the security, reliability, and trustworthiness of cloud service systems. Petri et al. [20] propose an innovative feedback-based trust model for forming and evaluating trustworthiness between various end-users in P2P clouds.

Most research pertaining to cloud service trustworthiness analysis focuses on quantitative measurement depending on a full assessment dataset. However, there are several challenges when employing existing trustworthiness evaluation methods to distinguish and select cloud services with the desired trustworthy attributes. Firstly, commercial cloud service vendors release computing or storage services to end users at different levels of trustworthy assurance; some vendors disclose neither their trustworthiness guarantees nor their service level agreements. Secondly, since


conducting cloud service invocations for the purpose of trustworthiness evaluation is costly and time-consuming, it is not practical to measure the trustworthy attributes of all candidates for each user. Moreover, due to the dynamic and uncertain nature of cloud service applications, there may be a high demand to ensure the validity and integrity of assessment data. Without sufficient assessment data, an accurate cloud service trustworthiness evaluation cannot be obtained.

To address this challenge, we propose a novel cloud service trustworthiness evaluation framework (named CSTrust) based on collaborative filtering recommendation technology and utility theory. The proposed method seeks to predict the missing QoS values of quantitative attributes by employing the past usage experiences of other similar services; these are then combined with the customer satisfaction on qualitative attributes to determine the comprehensive trustworthiness value of the target cloud service. In this way, the proposed framework alleviates a potentially complex and costly evaluation task, while enhancing the objectivity of cloud service trustworthiness reports.

The remainder of this paper is structured as follows: Section 2 presents the cloud service trustworthiness evaluation framework. Section 3 discusses our QoS prediction approach for cloud service trustworthiness evaluation. Section 4 proposes the customer satisfaction estimation approach and the trustworthiness evaluation approach in CSTrust. Section 5 describes our experiments. Section 6 concludes the paper.

2. Trustworthiness evaluation framework

Service trustworthiness evaluation is a very important research topic in cloud computing [6,21]. In the real world, trustworthiness evaluation can be conducted in an online or an offline situation [22]. While online trustworthiness evaluation supports the real-time dynamic measurement of cloud service capacities, offline trustworthiness evaluation provides more comprehensive cognition by integrating past usage experiences. For the purpose of trustworthy cloud service selection, users prefer to employ offline methods to enhance the accuracy of trustworthiness evaluation. Since trustworthiness usually consists of several attributes, many investigations identify offline trustworthiness evaluation as a Multiple Attribute Decision Analysis (MADA) [23] or Multi-Criteria Decision Making (MCDM) [7] problem. Trustworthy attributes are often divided into quantitative and qualitative attributes, of which the former are measured by objective test results or observation data (e.g. QoS values [24]), and the latter are assessed by customer perception ratings [14,25].

Extending our previous work on software trustworthiness models [23,26], we design a personalized evaluation framework (CSTrust) to evaluate the trustworthiness of cloud services via the combination of multi-source assessment data. CSTrust not only assists users in making decisions when selecting the best performing candidates from a set of functionally-equivalent cloud services, but also helps discover more trustworthy services for the current users. As shown in Fig. 1, employing CSTrust to solve a cloud service trustworthiness evaluation problem includes three phases. Firstly, based on trustworthiness analysis, a trustworthiness evaluation indicator system (TEIS) is developed via trustworthy attribute identification and metrics design. Subsequently, the relative weights of the trustworthy attributes are determined by using traditional objective or subjective weighting methods, and the past usage experiences corresponding to cloud service trustworthiness are collected by analyzing test records or log files. Secondly, CSTrust predicts the QoS values on quantitative attributes by applying collaborative filtering technology, and estimates the customer satisfactions on qualitative attributes by integrating

Fig. 1. System architecture of CSTrust (TEIS developing, attribute weighting, and experiences collection; similarity computation, QoS predicting, and utility calculation; customer rating and satisfaction estimation; combination into evaluation reports).

customer rating. Finally, the trustworthiness reports are released by combining the multi-source assessment data.

To make offline trustworthiness evaluation possible, TEIS explores the personalized trustworthy requirements corresponding to different cloud service selection cases. When CSTrust is employed to evaluate cloud service trustworthiness, the developed TEIS should be mutually exclusive and collectively exhaustive. In our previous work [23], a unified TEIS definition and an objective weighting method were presented to support automated trustworthiness calculation. It derives from classic MADA theory and is also suitable for offline cloud service trustworthiness evaluation. We omit the details of the first phase for brevity. CSTrust aims to remedy the shortcomings of previous methods by avoiding costly and time-consuming real-world cloud service invocations. In the following sections, the QoS prediction approach mentioned in phase 2 and the customer satisfaction estimation approach mentioned in phase 3 are proposed, respectively.

3. Cloud service QoS value prediction

Since cloud service evaluation is often a time-consuming and costly process, it would be attractive if we could obtain reliable assessment data using small amounts of past usage experience. In this section, we provide a QoS prediction approach to extend existing cloud service trustworthiness test and measurement techniques. This approach helps CSTrust predict the missing assessment data on quantitative attributes by taking advantage of similar services' usage experiences.

3.1. Cloud services similarity computation

There are many functionally-equivalent services within the cloud computing environment. It is too expensive to test all candidates, since invoking real-world services incurs charges. Without sufficient test results or observation data on trustworthy attributes, it is difficult to obtain an accurate trustworthiness evaluation of the target cloud service. Fortunately, different from traditional software applications, cloud service vendors can easily collect usage experiences (e.g. QoS values) from different applications in the cloud computing environment [17]. Moreover, many trustworthy attributes can be measured by QoS values, such as throughput, failure-rate, and response-time. We will employ item-based collaborative filtering (CF) technology [27] to predict collaborative information for reinforcing cloud service trustworthiness evaluation. Suppose CSTrust contains M users and N cloud services. The user-service matrix for missing QoS value prediction is described as:

$$Q = \begin{bmatrix} q_{1,1} & \cdots & q_{1,N} \\ \vdots & \ddots & \vdots \\ q_{M,1} & \cdots & q_{M,N} \end{bmatrix}$$

where the entry q_{m,n} denotes the QoS value of cloud service cs_n observed by user m, and q_{m,n} = Null denotes that m did not invoke cs_n previously. To determine the similarity between two services, the Pearson Correlation Coefficient (PCC) has been successfully applied in many QoS-aware service recommendation studies.

Definition 1 [28]. Suppose cs_n and cs_y are two cloud services. PCC is applied to calculate the similarity between cs_n and cs_y by

$$Sim(cs_n, cs_y) = \frac{\sum_{m \in U} (q_{m,n} - \bar{q}_n)(q_{m,y} - \bar{q}_y)}{\sqrt{\sum_{m \in U} (q_{m,n} - \bar{q}_n)^2}\,\sqrt{\sum_{m \in U} (q_{m,y} - \bar{q}_y)^2}}, \qquad (1)$$

where Sim(cs_n, cs_y) is in the interval [-1, 1], U is the subset of users who have invoked both cs_n and cs_y, and \bar{q}_n and \bar{q}_y are the average QoS values observed by different users.

From the above definition, a larger Sim(cs_n, cs_y) value indicates that the cloud services cs_n and cs_y are more similar. However, the investigation in [29] shows that PCC will overestimate the similarities of negative cloud services, i.e., services that are actually not similar to each other but happen to have similar usage experiences observed by a few users.

Example 1. Suppose cs_1 is the target cloud service, and cs_2, cs_3 and cs_4 are the neighborhood cloud services in the user-service matrix; the QoS values of response-time observed by six users are defined in Table 1.

Table 1
The user-service matrix for assessing cs_1.

Users   cs_1    cs_2    cs_3    cs_4
1       1.15    1.35    1.25    1.10
2       1.10    1.15    Null    0.85
3       0.95    0.90    Null    Null
4       1.25    1.05    Null    1.05
5       0.85    0.70    Null    0.70
6       Null    0.95    1.10    0.85

The similarities between the cloud services calculated by PCC are Sim(cs_1, cs_2) = 0.7672, Sim(cs_1, cs_3) = 1, and Sim(cs_1, cs_4) = 0.8858, which suggests that cs_3 is more similar to cs_1 than cs_2 and cs_4 are. This is clearly contrary to reality and is caused by the limited usage experience.

In order to reduce the influence of negative cloud services, Zheng et al. [29] propose the significance weight and an enhanced PCC, Sim', to improve the accuracy of similarity computation. This enhanced PCC integrates the significance weight 2|U_{n,y}| / (|U_n| + |U_y|) with PCC (i.e. Eq. (1)) by:

$$Sim'(cs_n, cs_y) = \frac{2\,|U_{n,y}|}{|U_n| + |U_y|}\, Sim(cs_n, cs_y), \qquad (2)$$

where |U_{n,y}| is the number of users who invoked both cs_n and cs_y before, and Sim(cs_n, cs_y) is the PCC value between cs_n and cs_y. It is worth noting that this method also affects the similarity computation of positive cloud services (e.g. cs_2) with more useful usage experiences. To address this problem, we introduce an extended PCC approach (named f-PCC) that makes use of a usage structure factor C_{n,y} reflecting the characteristics of the similarity distribution. C_{n,y} is defined as a convex function, considering that the accuracy of QoS value prediction is more sensitive to changes in positive cloud services than in negative ones.

Definition 2. Suppose cs_n and cs_y are two cloud services in the user-service matrix. The usage structure factor is defined as

$$C_{n,y} = 2^{\frac{|U_{n,y}|}{\max_{y \in [1,N],\, y \neq n}(|U_{n,y}|)}} - 1. \qquad (3)$$

Proposition 1. C_{n,y} ≥ 0 holds for all cloud services in the user-service matrix, where y ∈ [1, N] and y ≠ n.

Proposition 2. C_{n,y} attains its maximum C_{n,y}^{MAX} = 1 when |U_{n,y}| = \max_{y \in [1,N],\, y \neq n}(|U_{n,y}|).

Fig. 2 shows the shape of the usage structure factor C_{n,y} with increasing |U_{n,y}|, where the total number of users is fixed at 100. It indicates that the more similar usage experiences two cloud services share, the greater C_{n,y} is. Based on the usage structure factor, the f-PCC approach is defined as follows. Since the value of C_{n,y} is in the interval [0, 1], the value of Sim^C(cs_n, cs_y) is also in the interval [-1, 1].

Definition 3. Suppose cs_n and cs_y are two cloud services. The f-PCC approach is defined as

$$Sim^C(cs_n, cs_y) = C_{n,y}\, Sim(cs_n, cs_y), \qquad (4)$$

where C_{n,y} is the usage structure factor of cs_y, and Sim^C(cs_n, cs_y) is the f-PCC value between cs_n and cs_y.

Let us reconsider the motivating example. Employing PCC, Zheng's method (Eq. (2)), and f-PCC, the results of the similarity computation are listed in Table 2. According to these results, we can infer that f-PCC (Sim^C) is better than Zheng's method (Sim') at calculating the similarity of the positive cloud service cs_2: the corresponding usage structure factor is C_{1,2} = 1, while the significance weight is 2|U_{1,2}| / (|U_1| + |U_2|) = 10/11. Moreover, the f-PCC approach can also improve the accuracy of similarity computation by reducing the influence of the negative cloud service (cs_3).

3.2. Missing QoS value prediction

QoS values are usually collected as objective assessment data to evaluate the trustworthiness of cloud services. However, it is hard to obtain QoS values for all candidates due to the time-consuming and costly nature of cloud service invocation. Based on the existing analysis, the missing QoS values of the target cloud service can be predicted by employing the observed QoS data of other similar services. In reality, the user-service matrix is very sparse due to the huge differences in cloud services' usage experiences. Therefore, before predicting the missing QoS values in the user-service matrix, the similar neighbors need to be selected.

Definition 4. Suppose the user-service matrix contains M users and N cloud services, and cs_n is the target cloud service. The best performing similar set U(ε) from the N cloud services is defined as

$$U(\varepsilon) = \{cs_y \mid Sim^C(cs_n, cs_y) \geq \varepsilon,\; y \neq n\}, \qquad (5)$$
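As a concrete illustration, the PCC, usage structure factor, and f-PCC computations above can be sketched over the Table 1 data as follows (a minimal sketch; the function names are ours, not from the paper, and the PCC means are taken over the common-user subset):

```python
import math

# Table 1 (response-time): rows are users 1-6, columns are cs1-cs4; None = not invoked.
Q = [
    [1.15, 1.35, 1.25, 1.10],
    [1.10, 1.15, None, 0.85],
    [0.95, 0.90, None, None],
    [1.25, 1.05, None, 1.05],
    [0.85, 0.70, None, 0.70],
    [None, 0.95, 1.10, 0.85],
]

def users_of(j):
    """Indices of users who invoked service j."""
    return {m for m in range(len(Q)) if Q[m][j] is not None}

def pcc(n, y):
    """Eq. (1): PCC over the users who invoked both services."""
    common = sorted(users_of(n) & users_of(y))
    qn, qy = [Q[m][n] for m in common], [Q[m][y] for m in common]
    mn, my = sum(qn) / len(qn), sum(qy) / len(qy)
    num = sum((a - mn) * (b - my) for a, b in zip(qn, qy))
    den = math.sqrt(sum((a - mn) ** 2 for a in qn)) * \
          math.sqrt(sum((b - my) ** 2 for b in qy))
    return num / den if den else 1.0  # degenerate overlap, cf. Sim(cs1, cs3) = 1 in Example 1

def usage_structure_factor(n, y):
    """Eq. (3): C_{n,y} = 2^(|U_{n,y}| / max_y' |U_{n,y'}|) - 1."""
    overlaps = {j: len(users_of(n) & users_of(j)) for j in range(len(Q[0])) if j != n}
    return 2 ** (overlaps[y] / max(overlaps.values())) - 1

def f_pcc(n, y):
    """Eq. (4): Sim^C(cs_n, cs_y) = C_{n,y} * Sim(cs_n, cs_y)."""
    return usage_structure_factor(n, y) * pcc(n, y)
```

With 0-based indices, `f_pcc(0, 2)` gives C_{1,3}·Sim(cs_1, cs_3) ≈ 0.149, matching Table 2's Sim^C(cs_1, cs_3), while `f_pcc(0, 1)` stays close to the raw PCC because C_{1,2} = 1. Small last-digit differences from the paper's figures can arise from how the averages \bar{q} are taken.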

Fig. 2. Impact of |U_{n,y}| on the usage structure factor.


Table 2
The similarity of cs_2, cs_3, and cs_4 in computing with cs_1.

Services       Sim      Sim'     Sim^C
(cs_1, cs_2)   0.7672   0.6975   0.7672
(cs_1, cs_3)   1        0.3333   0.1486
(cs_1, cs_4)   0.8858   0.7086   0.6565

where Sim^C(cs_n, cs_y) is the f-PCC value between cs_n and cs_y, and ε ≥ min{Sim^C(cs_n, cs_y), y ∈ [1, N]} is the similarity parameter. To alleviate the stronger influence of dissimilar cloud services (Sim^C(cs_n, cs_y) < 0), ε should be determined in the interval [0, 1]. The similarity parameter also cannot be set to a very high value, since that would significantly reduce the amount of training data. When U(ε) = ∅, the missing QoS value prediction needs to degrade by decreasing the similarity parameter ε. As an item-based CF method, we predict the missing QoS value by taking advantage of similar neighbors' observed QoS data.

Definition 5. Suppose the user-service matrix contains M users and N cloud services. The missing QoS value q̂_{m,n} on cloud service cs_n is calculated as:

$$\hat{q}_{m,n} = \bar{q}_n + \frac{\sum_{cs_y \in U(\varepsilon)} Sim^C(cs_n, cs_y)\,(q_{m,y} - \bar{q}_y)}{\sum_{cs_y \in U(\varepsilon)} Sim^C(cs_n, cs_y)}, \qquad (6)$$

where U(ε) is the best performing similar set of cs_n, \bar{q}_n and \bar{q}_y are the average QoS values of cloud services cs_n and cs_y observed by different users, and Sim^C(cs_n, cs_y) is the f-PCC value between cs_n and cs_y.

As in Example 1, after adopting the proposed method for predicting the missing QoS value on cs_1, we get q̂_{6,1} = 0.9967 with the similarity parameter ε set to 0.5. The predicted QoS value q̂_{m,n} can then be further applied to the evaluation of cloud service trustworthiness.

4. Cloud service trustworthiness evaluation

Many trustworthiness problems require not only quantitative measurement but also qualitative analysis [14,23]. Methods and models focusing on the quantitative measurement of trustworthiness have been proposed in previous investigations [16–18,30]. While these models have successfully solved many real-world problems, they face a practical challenge: how to integrate the customer's preference and satisfaction into the trustworthiness evaluation model. In this section, we introduce a novel cloud service evaluation approach that supports the aggregation of quantitative QoS values and qualitative customer satisfaction.

4.1. Customer satisfaction estimation

The goal of customer satisfaction estimation is to provide perceived feedback on the qualitative trustworthy attributes of a delivered cloud service. In view of CSAT defined in [31], we identify customer satisfaction as a linear combination of a perception function f_p and a disconfirmation function f_d. Assuming A_t is a trustworthy attribute and r_t ∈ [0, 1] is the perceived rating of the target cloud service on A_t, f_p maps r_t to the customer's baseline satisfaction, whereas f_d maps it to the referred satisfaction. In practice, a decision-maker prefers to construct a confidence interval rather than a tipping point when selecting trustworthy cloud services. This implies that f_d will be sensitive to the expected interval of ratings. To ensure the objectivity of the expected interval determination, we use the maximum and minimum ratings on similar cloud services in our method.

Definition 6. Suppose cs_n is the target cloud service. The customer satisfaction is defined as:

$$CU_n(r_t) = f_p(r_t) + f_d\!\left(\frac{r_t - r_t^-}{r_t^+ - r_t^-}\right), \qquad (7)$$

where r_t is the perceived rating of cs_n, and f_p and f_d are the perception function and the disconfirmation function. The expected interval of ratings [r_t^-, r_t^+] is subject to the following constraints:

$$r_t^+ = \mathrm{MAX}(r_{t,y} \mid cs_y \in U(\varepsilon)), \qquad r_t^- = \mathrm{MIN}(r_{t,y} \mid cs_y \in U(\varepsilon)). \qquad (8)$$

Since different trustworthiness evaluation problems may inherit their own rating preference and data distribution, a parameter ρ (0 ≤ ρ ≤ 1) is defined to determine how much our customer satisfaction estimation method relies on the perception function and the disconfirmation function. We describe the perception function as an increasing linear function f_p(r_t) = ρ·r_t bounded between 0 and ρ, and express the disconfirmation function as a three-piece function grounded on the fact that the referred satisfaction of cs_n increases significantly when r_t ∈ [r_t^-, r_t^+]. The customer satisfaction of cloud service cs_n on A_t can be formalized as follows:

$$CU_n(r_t) = \begin{cases} \rho\, r_t + (1-\rho)\,\dfrac{r_t - r_t^-}{r_t^+ - r_t^-}, & r_t^- \le r_t \le r_t^+, \\ \rho\, r_t, & r_t < r_t^-, \\ 1 + \rho\,(r_t - 1), & r_t > r_t^+, \end{cases} \qquad (9)$$

where CU_n(r_t) is a piecewise linear function defined on [0, 1] and bounded between CU_n(0) = 0 and CU_n(1) = 1. In the special situation where r_t^+ = 1 and r_t^- = 0, the customer satisfaction of cs_n degrades to the following equation:

$$CU_n(r_t) = \rho\, r_t + (1-\rho)\,\frac{r_t - r_t^-}{r_t^+ - r_t^-} = r_t. \qquad (10)$$

This means that the customer satisfaction in cloud service trustworthiness evaluation is equal to the utility function of the perceived rating when the expected interval of ratings is [0, 1]. Based on the customer satisfactions {CU_n(r_t^m) | m = 1, ..., M} of the target cloud service from different customers, we can obtain a group customer satisfaction by

$$CU_{n,t} = \frac{\sum_{r_t^m \in R} CU_n(r_t^m)}{|R|}, \qquad (11)$$

where R = {r_t^m, m = 1, ..., M} is the rating set on A_t and |R| is the number of customers who have assigned a rating to cs_n.

Example 2. Suppose cs_1 is the target cloud service and cs_2, cs_3 and cs_4 are the neighborhood cloud services; the perceived ratings on the qualitative attribute Security are described in Table 3, where "-1" denotes that the user did not invoke the corresponding cloud service previously.
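A sketch of the prediction step of Definition 5, reusing the Sim^C values from Table 2 and the Table 1 observations (helper names are ours, not from the paper):

```python
# Sim^C values from Table 2 (f-PCC similarities of cs2..cs4 with the target cs1).
sim_c = {"cs2": 0.7672, "cs3": 0.1486, "cs4": 0.6565}

# Table 1 response-time observations; None = not invoked.
q = {
    "cs1": [1.15, 1.10, 0.95, 1.25, 0.85, None],
    "cs2": [1.35, 1.15, 0.90, 1.05, 0.70, 0.95],
    "cs3": [1.25, None, None, None, None, 1.10],
    "cs4": [1.10, 0.85, None, 1.05, 0.70, 0.85],
}

def mean(vals):
    obs = [v for v in vals if v is not None]
    return sum(obs) / len(obs)

def predict(target, user, eps):
    """Eq. (6): deviation-from-mean prediction over the similar set U(eps) of Eq. (5)."""
    neighbors = [s for s, sim in sim_c.items()
                 if sim >= eps and q[s][user] is not None]
    num = sum(sim_c[s] * (q[s][user] - mean(q[s])) for s in neighbors)
    den = sum(sim_c[s] for s in neighbors)
    return mean(q[target]) + num / den

print(predict("cs1", 5, 0.5))  # ≈ 0.9964; the paper reports 0.9967 for this case
```

With ε = 0.5 only cs_2 and cs_4 enter U(ε), exactly as in the text; the small gap to the paper's 0.9967 comes from rounding in the tabulated similarities.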

Table 3
The perceived ratings of Security.

Users   cs_1   cs_2   cs_3   cs_4
1       0.7    0.8    0.8    0.7
2       0.6    0.6    -1     0
3       0.6    0.7    -1     -1
4       0.7    0.6    -1     0.8
5       0.7    0.9    -1     0.7
6       -1     0.7    0.7    0.6
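Example 2's satisfaction arithmetic is a direct transcription of Eqs. (9) and (11); in this sketch the interval [0.6, 0.7] and the predicted rating 0.6081 are the values derived in the example, and the function names are ours:

```python
def satisfaction(r, r_lo, r_hi, rho=0.5):
    """Eq. (9): baseline (perception) plus referred (disconfirmation) satisfaction."""
    if r < r_lo:
        return rho * r
    if r > r_hi:
        return 1 + rho * (r - 1)
    return rho * r + (1 - rho) * (r - r_lo) / (r_hi - r_lo)

def group_satisfaction(ratings, r_lo, r_hi, rho=0.5):
    """Eq. (11): average satisfaction over all customers' ratings."""
    return sum(satisfaction(r, r_lo, r_hi, rho) for r in ratings) / len(ratings)

# User 6's predicted rating on cs1, with the expected interval [0.6, 0.7]:
print(satisfaction(0.6081, 0.6, 0.7))  # ≈ 0.3446, as in Example 2
```

Feeding the five observed cs_1 ratings from Table 3 plus the predicted 0.6081 into `group_satisfaction` reproduces the group value ≈ 0.5824; with the degenerate interval [0, 1] the function returns the rating itself, as Eq. (10) states.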


We employ f-PCC to calculate the similarities between the cloud services as Sim^C(cs_1, cs_2) = 0.4899, Sim^C(cs_1, cs_3) = 0.1487, and Sim^C(cs_1, cs_4) = 0.6162, and thus get the similar set of the target cloud service U(ε) = {cs_2, cs_4}, where the similarity parameter ε = 0.4. The predicted rating of cs_1 is computed by Eq. (6) as q̂_{6,1} = 0.6081, and the expected interval of ratings is determined as [0.6, 0.7]. Adopting the above method for customer satisfaction estimation, we have CU_1(r_Security^6) = 0.3446, where the parameter ρ is set to 0.5. Similarly, CU_1(r_Security^1), ..., CU_1(r_Security^5) can also be estimated by Eq. (9). Then, the group customer satisfaction can be computed as CU_{1,Security} = Σ_{m=1}^{6} CU_1(r_Security^m) / 6 = 0.5824.

As opposed to previous works on cloud service trustworthiness evaluation, where customer satisfaction is either neglected [16] or dealt with as an ordinary rating parameter [14], customer satisfaction is considered here as the combination of the customer's baseline and referred satisfaction.

4.2. QoS value utility computation

To make use of QoS values for trustworthiness evaluation, it is worth noting that different QoS values may have different types of characteristics depending on personalized user preferences. In general, QoS values can be grouped into two classes: "benefit" and "cost". If A_t is a "benefit" (respectively, "cost") attribute, e.g. throughput (respectively, response-time), the higher (respectively, lower) its value, the keener a user would be to choose it. As outlined before, utility reflects the decision maker's attitude toward profit and risk [32,33]. In our previous research [23], results show that utility can be applied to identify the trustworthiness perception considering the user's interests and preferences. Therefore, it is rational to construct a utility function for deriving utility from QoS values in cloud service trustworthiness evaluation. If the QoS value q̂_{m,n} is of "benefit" (respectively, "cost") type, the user's absolute risk aversion generally increases (respectively, decreases) as q̂_{m,n} increases. Therefore, we employ the constant relative risk aversion (CRRA) [34] utility function to calculate the actual utility of the QoS value.

Definition 7. Suppose cs_n is the target cloud service and q̂_{m,n} is a "benefit" QoS value. The utility of q̂_{m,n} is computed as:

$$u(\hat{q}_{m,n}) = \frac{\hat{q}_{m,n}^{\,1-\theta}}{1-\theta}, \qquad (12)$$

where 0 ≤ θ < 1 specifies the CRRA coefficient. From this definition, the utility of a QoS value is in the interval [0, +∞), where a larger utility indicates that the user is more satisfied with the target cloud service cs_n. In order to combine QoS values with customer satisfaction, we introduce the definition of mapping utility.

Definition 8. Suppose q̂_{m,n} is a "benefit" QoS value and u(q̂_{m,n}) is the CRRA utility of q̂_{m,n}. The mapping utility is defined as:

$$U_{m,n} = \begin{cases} \dfrac{u(\hat{q}_{m,n}) - u(q_m^-)}{u(q_m^+) - u(q_m^-)}, & q_m^- \le \hat{q}_{m,n} \le q_m^+, \\ 1, & \hat{q}_{m,n} > q_m^+, \\ 0, & \hat{q}_{m,n} < q_m^-, \end{cases} \qquad (13)$$

where [q_m^-, q_m^+] is the expected interval of the QoS value on A_t for user m. Different from the expected interval of ratings discussed in Section 4.1, [q_m^-, q_m^+] is determined by analyzing the QoS value distribution on A_t in the user-service matrix, which will be further discussed in Section 5.4. In this way, U_{m,n} ∈ [0, 1] holds for all QoS values in the user-service matrix.

When q̂_{m,n} is a "cost" QoS value, the mapping utility is calculated as:

$$U_{m,n} = \begin{cases} \dfrac{u(q_m^+) - u(\hat{q}_{m,n})}{u(q_m^+) - u(q_m^-)}, & q_m^- \le \hat{q}_{m,n} \le q_m^+, \\ 0, & \hat{q}_{m,n} > q_m^+, \\ 1, & \hat{q}_{m,n} < q_m^-, \end{cases} \qquad (14)$$

where the CRRA utility u(q̂_{m,n}) is described in Eq. (12).

Let us consider Example 1 again. Assuming the CRRA coefficient θ is set to 0.3, the CRRA utility of the predicted QoS value q̂_{6,1} is calculated as u(q̂_{6,1}) = q̂_{6,1}^{1-θ}/(1-θ) = 1.4253. By analyzing the domain of q_{6,y} (y = 2, 3, 4) in the user-service matrix, we get the expected interval of the QoS value on response-time as [0.85, 1.1]. Since response-time is a "cost" QoS value, the mapping utility U_{6,1} = 0.4039 is obtained by Eq. (14), where u(q_6^+) = 1.5271 and u(q_6^-) = 1.275.

Instead of the raw QoS value, our approach utilizes the mapping utility in trustworthiness evaluation to reveal the users' actual perception of the target cloud service. Our approach is generally effective at estimating the assessment data on quantitative attributes relating to all users and services in the user-service matrix. Note that available cloud service QoS datasets are normally very sparse, and this problem strongly affects the performance of QoS prediction; it is therefore natural to adopt the utility rather than the predicted QoS value as the assessment data to reinforce trustworthiness evaluation.

4.3. Trustworthiness evaluation

After QoS value utility computation, the trustworthiness value of the target cloud service can be obtained by combining the customer satisfactions on qualitative attributes and the QoS value utilities on quantitative attributes.

Definition 9. Suppose cs_n is the target cloud service, u^1_{m,n}, ..., u^I_{m,n} are the mapping utilities on quantitative attributes A_1, ..., A_I, and CU_{n,I+1}, ..., CU_{n,J} are the group customer satisfactions on qualitative attributes A_{I+1}, ..., A_J. The trustworthiness value of cs_n is defined as:

$$trust_n = \frac{\sum_{i=1}^{I} \omega_i\, u^i_{m,n}}{I} + \frac{\sum_{j=I+1}^{J} \omega_j\, CU_{n,j}}{J - I - 1}, \qquad (15)$$

where ω_i and ω_j are the relative weights of the trustworthy attributes, with Σ_{i=1}^{I} ω_i + Σ_{j=I+1}^{J} ω_j = 1. The above equation defines the combination of customer satisfactions and QoS value utilities under this constraint. When ω_{I+1} = ⋯ = ω_J = 0 and Σ_{i=1}^{I} ω_i = 1, our method is also applicable to the situations mentioned above, in which the qualitative attributes of the cloud service are ignored, enhancing the flexibility of trustworthiness evaluation. Trust information has been applied to improve the recommendation accuracy and coverage of CF approaches [21]. Our method makes it possible to deal with various types of cloud service recommendation by using multi-attribute trustworthiness evaluation.

5. Simulations and evaluation

5.1. Simulation experiments

Inspired by the representative simulation work in service trustworthiness evaluation [14], a prototype system of CSTrust was developed using Microsoft .NET and SQL Server. We conduct a simulation experiment to demonstrate the effectiveness of our method. Based on previous investigations [34–36], the commonly used cloud service trustworthy attributes include: Availability, Performance, Security, Privacy, Maturity, and Controllability. Their evaluation characteristics and weights are summarized in Table 4.
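The utility mapping of Section 4.2 (Eqs. (12) and (14)) reduces to a few lines; this sketch reproduces the worked response-time example (θ = 0.3, expected interval [0.85, 1.1], predicted value 0.9967), with helper names of our own choosing:

```python
def crra_utility(q, theta=0.3):
    """Eq. (12): CRRA utility u(q) = q^(1 - theta) / (1 - theta), 0 <= theta < 1."""
    return q ** (1 - theta) / (1 - theta)

def mapping_utility_cost(q_hat, q_lo, q_hi, theta=0.3):
    """Eq. (14): mapping utility of a 'cost' QoS value onto [0, 1]."""
    if q_hat > q_hi:     # worse than the expected interval
        return 0.0
    if q_hat < q_lo:     # better than the expected interval
        return 1.0
    u = crra_utility
    return (u(q_hi, theta) - u(q_hat, theta)) / (u(q_hi, theta) - u(q_lo, theta))

print(crra_utility(0.9967))                      # ≈ 1.4253
print(mapping_utility_cost(0.9967, 0.85, 1.1))   # ≈ 0.4039, as in the worked example
```

Both printed values match the numbers derived in Section 4.2 (u(q_6^+) ≈ 1.5271, u(q_6^-) ≈ 1.275).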

Table 4
The trustworthy attributes of cloud service.

Attributes             Metrics         Characteristics   Weights   [r_t^-, r_t^+]   Values
Availability (A1)      Response-time   Quantitative      0.25      /                1.1742 s
Performance (A2)       Throughput      Quantitative      0.18      /                76.45 kbps
Security (A3)          /               Qualitative       0.12      [0.3, 0.9]       0.7625
Privacy (A4)           /               Qualitative       0.15      [0.5, 0.8]       0.6834
Maturity (A5)          /               Qualitative       0.13      [0.5, 0.9]       0.8255
Controllability (A6)   /               Qualitative       0.17      [0.2, 0.8]       0.6288

Fig. 3. QoS value distributions in the user-service matrix: (a) response-time, (b) throughput.

Fig. 4. Customer rating distributions in the user-service matrix: (a) Security, (b) Privacy, (c) Maturity, (d) Controllability.

To ensure the typicality and objectivity of QoS prediction, an open QoS research dataset [37] is employed to predict the assessment data on Performance and Availability. These QoS data (response-time and throughput) were collected from 339 users on 5825 web services in a real-world environment. Since it is impractical to discover and distinguish all functionally-equivalent cloud services at selection time, we randomly select 100 web services' QoS reports for our experiment. Fig. 3 shows the value distributions of response-time and throughput in the user-service matrix. The ranges of response-time and throughput are 0-16.053 s and 0-541.546 kbps, respectively.

Suppose cs100 is the target cloud service. By employing the f-PCC approach on the 339 x 100 user-service matrix, we obtain the predicted QoS values for evaluating the trustworthiness of cs100 as $\hat{q}^{res} = 1.1742$ s and $\hat{q}^{thr} = 76.45$ kbps, where the similarity parameter $\epsilon$ is set to 0 this time.

For the qualitative attributes (A3-A6), we assume that customers assign perceived ratings to each cloud service. For each customer's satisfaction estimation, the ratings are randomly generated for all 339 users within the domain [0, 1]. Fig. 4 shows the customer rating distributions on Security, Privacy, Maturity, and Controllability. The expected intervals of ratings $[r_t^-, r_t^+]$ on A3-A6 are listed in Table 4. Assuming the parameter q = 0.5, the customer satisfactions $CU_{100}(r_3^m), \ldots, CU_{100}(r_6^m)$ $(m = 1, \ldots, 339)$ on A3-A6 can be estimated according to Eq. (9). In addition, we obtained the group customer satisfactions $CU_{100,3}, \ldots, CU_{100,6}$ on A3-A6 for the target cloud


Fig. 5. Impact of $\epsilon$ on the prediction accuracy. [Panels (a) and (b): MAE for response-time and throughput; panels (c) and (d): RMSE for response-time and throughput. Each panel compares IPCC, IPCC-sw, and f-PCC as $\epsilon$ varies from 0.0 to 0.9.]

service by Eq. (11). The simulation results are also reported in Table 4.

After obtaining the mapping utilities of $\hat{q}^{res}$ and $\hat{q}^{thr}$ using Eqs. (13) and (14), in which the CRRA coefficient $\theta$ is set to 0.3, the trustworthiness value of the target cloud service cs100 is computed by

$$trust_{100} = \sum_{i=1}^{2}\omega_i \cdot u_{339,100}^{i} + \sum_{j=3}^{6}\omega_j \cdot CU_{100,j} = 0.6449.$$
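Numerically, this aggregation is a weighted sum over the six attributes. The sketch below uses the weights and group satisfactions from Table 4; the two mapped utilities for A1 and A2 are hypothetical placeholders, since the text reports only the final aggregate, not the individual utility values produced by Eqs. (13) and (14).

```python
# Weights and group customer satisfactions taken from Table 4; the two
# mapped utilities below are HYPOTHETICAL placeholders, not values
# reported in the paper.
weights = {"A1": 0.25, "A2": 0.18, "A3": 0.12,
           "A4": 0.15, "A5": 0.13, "A6": 0.17}
utilities = {"A1": 0.55, "A2": 0.55}             # hypothetical u^i_339,100
satisfactions = {"A3": 0.7625, "A4": 0.6834,
                 "A5": 0.8255, "A6": 0.6288}     # CU_100,j from Table 4

# The combination formula requires the weights to sum to 1.
assert abs(sum(weights.values()) - 1.0) < 1e-9

trust = (sum(weights[a] * utilities[a] for a in utilities)
         + sum(weights[a] * satisfactions[a] for a in satisfactions))
```

With these placeholder utilities the score comes out near the reported 0.6449; note that the qualitative half contributes a fixed value of about 0.4082 regardless of which utilities are assumed.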

We can further obtain the assessment level "Slightly untrustworthy" according to the trustworthiness decision-making rule set [23]. Although we only study six cloud service attributes in the experiments, the proposed method can easily be applied to other trustworthiness evaluation problems. When evaluating trustworthiness with a certain TEIS, the input data of CSTrust are the corresponding QoS values or customer ratings reported by past users on a set of functionally-equivalent cloud services. In the following sections, we study the impacts of the parameters $\epsilon$, q, and $\theta$ on system performance.

5.2. Impact of $\epsilon$

In this work, we present a new CF approach to predict the missing QoS values on quantitative trustworthy attributes. To study the effectiveness of our approach in QoS prediction, we compare it with two widely used item-based methods: item-based CF adopting PCC (IPCC) [27], and item-based CF adopting enhanced PCC with significance weighting (IPCC-sw). We use Mean Absolute Error (MAE) and Root Mean Square Error (RMSE) to evaluate the prediction performance of our method in comparison with the other methods. MAE and RMSE are defined as

$$MAE = \frac{\sum_{m,n}\left|\hat{q}_{m,n} - q_{m,n}\right|}{Q}, \qquad RMSE = \sqrt{\frac{\sum_{m,n}\left(\hat{q}_{m,n} - q_{m,n}\right)^2}{Q}}, \qquad (17)$$

where $\hat{q}_{m,n}$ and $q_{m,n}$ are the predicted and actual QoS values of the cloud service $cs_n$, and $Q$ is the total number of predicted values. The similarity parameter $\epsilon$ determines how many neighbors' records are employed to predict a missing QoS value. Define the density of the 339 x 100 user-service matrix as $density = T/(339 \times 100)$, where $T$ denotes the number of valid QoS records in CSTrust. For the user-service matrices of response-time and throughput, we separate each into a training set (80% of the records in the matrix) and a test set (the remaining 20%). On this basis, we set the density to 30% and vary $\epsilon$ from 0 to 1 with a step value of 0.1.

Fig. 5 shows the experimental results on response-time and throughput. Under the same simulation conditions, f-PCC and IPCC-sw significantly outperform IPCC. Since our method employs the usage structure factor $C_{n,y}$ to alleviate the influence of negative cloud services, f-PCC consistently achieves smaller MAE and RMSE than IPCC-sw for both response-time and throughput. For both f-PCC and IPCC-sw, the MAE and RMSE values drop at first as $\epsilon$ increases, which indicates that prediction performance can be enhanced by removing the records observed on neighbors with poor similarity. However, once $\epsilon$ surpasses a certain value, the errors stop dropping and instead rise sharply as $\epsilon$ increases further, because useful information from similar neighbors is discarded. These observations indicate that our method achieves better accuracy when more past usage experiences on similar services are employed in CSTrust.
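As a concrete sketch of this evaluation loop, the fragment below implements a plain item-based PCC predictor with a similarity threshold $\epsilon$, together with the MAE and RMSE metrics of Eq. (17). It is a simplified stand-in for f-PCC, not the paper's exact algorithm: the usage structure factor $C_{n,y}$ and significance weighting are omitted, missing entries are marked with None, and all names are illustrative.

```python
from math import sqrt

def pcc(a, b):
    """Pearson correlation over entries observed in both columns (None = missing)."""
    pairs = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    if len(pairs) < 2:
        return 0.0
    mx = sum(x for x, _ in pairs) / len(pairs)
    my = sum(y for _, y in pairs) / len(pairs)
    cov = sum((x - mx) * (y - my) for x, y in pairs)
    vx = sum((x - mx) ** 2 for x, _ in pairs)
    vy = sum((y - my) ** 2 for _, y in pairs)
    return 0.0 if vx == 0 or vy == 0 else cov / sqrt(vx * vy)

def predict_qos(matrix, user, target, eps=0.0):
    """Item-based prediction of matrix[user][target]: similarity-weighted
    deviations from each neighbor service's mean, keeping only neighbors
    whose PCC similarity exceeds the threshold eps."""
    def col(j):
        return [row[j] for row in matrix]

    def mean(c):
        vals = [v for v in c if v is not None]
        return sum(vals) / len(vals)

    t_col = col(target)
    t_mean = mean(t_col)
    num = den = 0.0
    for s in range(len(matrix[0])):
        if s == target or matrix[user][s] is None:
            continue
        s_col = col(s)
        sim = pcc(t_col, s_col)
        if sim <= eps:                      # drop dissimilar neighbors
            continue
        num += sim * (matrix[user][s] - mean(s_col))
        den += sim
    return t_mean if den == 0.0 else t_mean + num / den

def mae(pred, actual):
    """Mean absolute error, as in Eq. (17)."""
    return sum(abs(p - a) for p, a in zip(pred, actual)) / len(pred)

def rmse(pred, actual):
    """Root mean square error, as in Eq. (17)."""
    return sqrt(sum((p - a) ** 2 for p, a in zip(pred, actual)) / len(pred))
```

For the toy matrix `[[1.0, 1.1, 5.0], [2.0, 2.1, 4.0], [3.0, 3.1, 3.0], [4.0, None, 2.0]]`, `predict_qos(m, 3, 1)` returns 3.6: service 0 correlates perfectly with the target column, while the negatively correlated service 2 is dropped by the $\epsilon = 0$ threshold.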

5.3. Impact of q

In order to evaluate the impact of customer satisfaction on our CSTrust system, we run additional experiments with a variable parameter q. The parameter q balances the contributions to customer satisfaction estimation from the perception function and the disconfirmation function. If q = 0, we extract only the customer's baseline satisfaction when making the estimation, and if q = 1, we employ only the referred satisfaction. In this experiment, we vary q


from 0 to 1 with a step value of 0.1, where the density of the customer rating matrix is set to 100% this time. The outcome of the experiment is shown in Fig. 6.

As shown in Fig. 6, the group customer satisfaction drops slightly on A3 and A6 and rises on A4 and A5 as q increases. This is due to the fact that customer satisfaction with the target service is perceptibly higher when the neighbors have worse usage experiences compared to when the neighbors have better usage experiences. For instance, suppose $r_5^{339} = 0.7$ is the perceived rating on A3 proposed by user 339. We can obtain the customer satisfactions on cs100 as $CU_{100,1}(r_5^{339}) = 0.6$ and $CU_{100,2}(r_5^{339}) = 0.7667$, considering

Fig. 6. Impact of q. [Group customer satisfaction vs. q (0.0-1.0) for Security, Privacy, Maturity, and Controllability.]

Fig. 7. Impact of $\theta$ on the mapping utility. [Panels (a)-(d): mapping utility vs. response-time over [0, 2] s for $\theta$ = 0.2, 0.4, 0.6, 0.8; panels (e)-(h): mapping utility vs. throughput over [0, 100] kbps for the same $\theta$ values.]


two different situations with the same parameter q = 0.5, in which the former has the expected interval of ratings [0.5, 0.9] and the latter has [0.2, 0.8].
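The two worked values above are consistent with a q-weighted blend of the raw perceived rating (baseline satisfaction) and the rating normalized to the neighbors' expected interval (referred satisfaction). The sketch below is an assumed reconstruction of Eq. (9) inferred from those numbers, not the paper's exact formula.

```python
def customer_satisfaction(rating, r_lo, r_hi, q=0.5):
    """q-weighted blend of baseline satisfaction (the raw perceived rating)
    and referred satisfaction (the rating normalized to the expected
    interval [r_lo, r_hi] of neighbors' ratings).

    Illustrative reconstruction only -- not the paper's Eq. (9).
    """
    referred = (rating - r_lo) / (r_hi - r_lo)
    referred = min(1.0, max(0.0, referred))   # clamp to [0, 1]
    return (1 - q) * rating + q * referred
```

With q = 0.5 this reproduces both worked values: `customer_satisfaction(0.7, 0.5, 0.9)` gives 0.6 and `customer_satisfaction(0.7, 0.2, 0.8)` gives about 0.7667, showing how the same rating yields higher satisfaction against a lower expected interval.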

5.4. Impact of $\theta$

Different QoS values may have different data dimensions and user preferences. Instead of the raw QoS value, we use the mapping utility to identify the quality perception on quantitative attributes, considering the user's interests and the QoS value characteristics. The CRRA coefficient $\theta$ determines how the actual utility is derived from a QoS value so as to approximate the decision maker's attitude toward profit and risk. To study the impact of $\theta$, we implement a further set of experiments in CSTrust. Considering that 88.07% of response-time values are smaller than 2 s and 81.32% of throughput values are smaller than 100 kbps (Fig. 3), we set the expected intervals of QoS values to [0, 2] and [0, 100] for A1 and A2, respectively.

Fig. 7 shows the impact of the CRRA coefficient $\theta$ on the mapping utility estimation results. For response-time (respectively, throughput), the higher the value, the lower (respectively, higher) the mapping utility $u_{m,n}$. We can also observe how $u_{m,n}$ turns from an approximate straight line at $\theta = 0.2$ to a strictly concave curve at $\theta = 0.8$. This is because the mapping utility of a "cost" QoS value (e.g. $\hat{q}^{res}$) drops as the user's risk aversion level rises, whereas the mapping utility of a "benefit" QoS value (e.g. $\hat{q}^{thr}$) rises with the risk aversion level. The figure also shows that the curves of the two mapping utility functions (Eqs. (13) and (14)) are symmetric about the axis $u_{m,n} = 0.5$. This phenomenon confirms the intuition that the user's risk aversion level should be consistent for both "cost" and "benefit" QoS values under the same trustworthiness evaluation condition.
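The qualitative behavior described above can be reproduced with a normalized CRRA-style pair of mapping functions. Since Eqs. (13) and (14) themselves are not shown in this section, the forms below are illustrative assumptions that only capture the reported properties: near-linear at $\theta = 0.2$, strictly concave at $\theta = 0.8$, and cost/benefit curves symmetric about $u = 0.5$.

```python
def benefit_utility(q, lo, hi, theta):
    """CRRA-style mapping for a 'benefit' QoS value (e.g. throughput):
    utility rises with q; a larger theta gives a more concave curve.
    Illustrative form, NOT the paper's Eq. (14)."""
    x = min(1.0, max(0.0, (q - lo) / (hi - lo)))   # normalize to [0, 1]
    return x ** (1.0 - theta)

def cost_utility(q, lo, hi, theta):
    """Mapping for a 'cost' QoS value (e.g. response-time): mirror of the
    benefit curve about u = 0.5, so utility falls as q rises.
    Illustrative form, NOT the paper's Eq. (13)."""
    return 1.0 - benefit_utility(q, lo, hi, theta)
```

By construction the two curves sum to 1 at every point, which gives the symmetry about $u_{m,n} = 0.5$ noted in the text, and raising theta increases the concavity of the benefit curve.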

6. Conclusion and future work

Based on the fact that trustworthiness can be evaluated by both objective measurement and subjective perception, we have proposed a cloud service trustworthiness evaluation framework that combines multi-source assessment data. To improve the accuracy of QoS prediction, our method defines a usage structure factor to reduce the influence of negative neighbors in similarity computation, and introduces a similarity parameter to determine how many neighbors' records are adopted to predict a missing QoS value. Our method also provides a customer satisfaction estimation approach for assessing qualitative attributes.

The proposed method can only be applied to evaluate cloud service trustworthiness in the offline situation. In the online situation, trustworthiness evaluation needs to pay more attention to the dynamics of cloud service applications. Our current method does not address this problem, owing to the complicated nature of dynamic trustworthiness evaluation [38]. An uncertainty theory based cloud service evaluation model will be investigated in our future work. Furthermore, we also plan to conduct more experimental studies on additional QoS values (e.g. failure-probability).

Acknowledgements This work has been supported by the National Natural Science Foundation of China (Nos. 61374169, 71131002, 71331002, 71201042 and 71071045), the National Key Basic Research Program of China (No. 2013CB329603), and the Specialized Research Fund for the Doctoral Program of Higher Education of MOE of China (No. 20120111110020).

References

[1] B. Hayes, Cloud computing, Commun. ACM 51 (7) (2008) 9-11.
[2] A.Y. Du, S. Das, R. Ramesh, Efficient risk hedging by dynamic forward pricing: a study in cloud computing, INFORMS J. Comput. (2012), http://dx.doi.org/10.1287/ijoc.1120.0526.
[3] S. Marston, Z. Li, S. Bandyopadhyay, et al., Cloud computing - the business perspective, Decis. Support Syst. 51 (1) (2011) 176-189.
[4] Z.B. Zheng, X.M. Xu, Y.L. Zhang, et al., QoS ranking prediction for cloud services, IEEE Trans. Parall. Distrib. Syst. (2012), http://dx.doi.org/10.1109/TPDS.2012.285.
[5] M. Alhamad, T. Dillon, E. Chang, A trust-evaluation metric for cloud applications, Int. J. Mach. Learn. Comput. 1 (4) (2011) 416-421.
[6] C. Wang, Q. Wang, K. Ren, et al., Toward secure and dependable storage services in cloud computing, IEEE Trans. Serv. Comput. 5 (2) (2012) 220-232.
[7] S.K. Garg, S. Versteeg, R. Buyya, A framework for ranking of cloud computing services, Future Gener. Comput. Syst. 29 (4) (2013) 1012-1023.
[8] A. Benlian, T. Hess, Opportunities and risks of software-as-a-service: findings from a survey of IT executives, Decis. Support Syst. 52 (1) (2011) 232-246.
[9] J. Gao, P. Pattabhiraman, X.Y. Bai, et al., SaaS performance and scalability evaluation in clouds, in: Proceedings of the 6th IEEE International Symposium on Service Oriented System Engineering, Irvine, CA, 2011, pp. 61-71.
[10] Z. Li, L. O'Brien, R. Cai, et al., Towards a taxonomy of performance evaluation of commercial cloud services, in: IEEE Fifth International Conference on Cloud Computing, Honolulu, HI, 2012, pp. 344-351.
[11] D.H. McKnight, M. Carter, J.B. Thatcher, et al., Trust in a specific technology: an investigation of its components and measures, ACM Trans. Manage. Inform. Syst. 2 (2) (2011) 1-15.
[12] A. Avizienis, J.C. Laprie, B. Randell, Basic concepts and taxonomy of dependable and secure computing, IEEE Trans. Dependable Secur. Comput. 1 (1) (2004) 11-33.
[13] T.A. DeLong, D.T. Smith, B.W. Johnson, Dependability metrics to assess safety-critical systems, IEEE Trans. Reliab. 54 (3) (2005) 498-505.
[14] N. Limam, R. Boutaba, Assessing software service quality and trustworthiness at selection time, IEEE Trans. Softw. Eng. 36 (4) (2010) 559-574.
[15] C.W. Hang, M.P. Singh, Trustworthy service selection and composition, ACM Trans. Auton. Adapt. Syst. 6 (1) (2011) 1-17.
[16] S. Chakraborty, K. Roy, An SLA-based framework for estimating trustworthiness of a cloud, in: IEEE 11th International Conference on Trust, Security and Privacy in Computing and Communications, 2012, pp. 937-942.
[17] Z.B. Zheng, T.C. Zhou, M.R. Lyu, et al., Component ranking for fault-tolerant cloud applications, IEEE Trans. Serv. Comput. 5 (4) (2012) 540-550.
[18] M. Mehdi, N. Bouguila, J. Bentahar, Trustworthy web service selection using probabilistic models, in: IEEE 19th International Conference on Web Services (ICWS), Honolulu, HI, 2012, pp. 17-24.
[19] I. Petri, O.F. Rana, Y. Rezgui, G.C. Silaghi, Trust modelling and analysis in peer-to-peer clouds, Int. J. Cloud Comput. 1 (2/3) (2012) 221-239.
[20] I. Petri, O.F. Rana, Y. Rezgui, G.C. Silaghi, Risk assessment in service provider communities, in: GECON'11: Proceedings of the 8th International Conference on Economics of Grids, Clouds, Systems, and Services, 2011, pp. 135-147.
[21] Q. Shambour, J. Lu, A trust-semantic fusion-based recommendation approach for e-business applications, Decis. Support Syst. 54 (1) (2012) 768-780.
[22] C.L. Corritore, B.K. Kracher, S. Wiedenbeck, On-line trust: concepts, evolving themes, a model, Int. J. Hum.-Comput. Stud. 58 (6) (2003) 737-758.
[23] S. Ding, X.J. Ma, S.L. Yang, A software trustworthiness evaluation model using objective weight based evidential reasoning approach, Knowl. Inform. Syst. 33 (1) (2012) 171-189.
[24] Z.B. Zheng, Y. Zhang, M. Lyu, Investigating QoS of real-world web services, IEEE Trans. Serv. Comput. (2012), http://dx.doi.org/10.1109/TSC.2012.34.
[25] F. Belanger, J.S. Hiller, W.J. Smith, Trustworthiness in electronic commerce: the role of privacy, security, and site attributes, J. Strateg. Inform. Syst. 11 (3-4) (2002) 245-270.
[26] S. Ding, S.L. Yang, C. Fu, A novel evidential reasoning based method for software trustworthiness evaluation under the uncertain and unreliable environment, Expert Syst. Appl. 39 (3) (2012) 2700-2709.
[27] M. Deshpande, G. Karypis, Item-based top-N recommendation algorithms, ACM Trans. Inform. Syst. 22 (1) (2004) 143-177.
[28] B. Sarwar, G. Karypis, J. Konstan, et al., Item-based collaborative filtering recommendation algorithms, in: Proceedings of the 10th International Conference on World Wide Web (WWW), 2001, pp. 285-295.
[29] Z.B. Zheng, H. Ma, M.R. Lyu, et al., QoS-aware web service recommendation by collaborative filtering, IEEE Trans. Serv. Comput. 4 (2) (2011) 140-152.
[30] V. Basili, P. Donzelli, S. Asgari, A unified model of dependability: capturing dependability in context, IEEE Softw. 21 (6) (2004) 19-25.
[31] J. Xiao, R. Boutaba, Assessing network service profitability: modeling from market science perspective, IEEE/ACM Trans. Netw. 15 (6) (2007) 1307-1320.
[32] R.M. Musal, R. Soyer, C. McCabe, et al., Estimating the population utility function: a parametric Bayesian approach, Eur. J. Oper. Res. 218 (2) (2012) 538-547.
[33] J.B. Yang, Rule and utility based evidential reasoning approach for multiattribute decision analysis under uncertainties, Eur. J. Oper. Res. 131 (1) (2001) 31-61.
[34] M.C. Chiu, H.Y. Wong, Optimal investment for an insurer with cointegrated assets: CRRA utility, Insur.: Math. Econ. 52 (1) (2013) 52-64.
[35] S. Subashini, V. Kavitha, A survey on security issues in service delivery models of cloud computing, J. Netw. Comput. Appl. 34 (1) (2011) 1-11.
[36] J.M. Cruz, Z.G. Liu, Modeling and analysis of the effects of QoS and reliability on pricing, profitability, and risk management in multiperiod grid-computing networks, Decis. Support Syst. 52 (3) (2012) 562-576.
[37] Z.B. Zheng, Y.L. Zhang, M.R. Lyu, Distributed QoS evaluation for real-world web services, in: Proceedings of the 8th International Conference on Web Services (ICWS 2010), Miami, FL, USA, 2010, pp. 83-90.
[38] K. Karaoglanoglou, H. Karatza, Resource discovery in a grid system: directing requests to trustworthy virtual organizations based on global trust values, J. Syst. Softw. 84 (3) (2011) 465-478.