Fuzzy Interval Decision-theoretic Rough Sets

Dun Liu
School of Economics and Management
Tsinghua University
Beijing, 100084, China

Tianrui Li
School of Information Science and Technology
Southwest Jiaotong University
Chengdu, 610031, China

Decui Liang
School of Economics and Management
Southwest Jiaotong University
Chengdu, 610031, China

Abstract - In this paper, we introduce the fuzzy interval number to decision-theoretic rough sets (DTRS), and propose a novel three-way decision model of fuzzy interval decision-theoretic rough sets (FIDTRS). The fuzzy interval number is used to describe the uncertainty of loss functions in DTRS. The corresponding propositions and criteria of FIDTRS are analyzed. An illustrative example of mines management is given to illuminate the proposed model in applications.

978-1-4799-0348-1/13/$31.00 ©2013 IEEE

I. INTRODUCTION

Decision-theoretic rough sets (DTRS), proposed by Yao in the 1990s [26-28], have become one of the hot topics in rough set (RS) research. As a representative model of probabilistic rough sets (PRS), DTRS introduces loss functions to calculate the two thresholds in PRS, which gives the thresholds a concise semantic explanation: they minimize the decision cost under Bayesian theory. Based on DTRS, Yao further proposed the concept of three-way decisions, consisting of positive, boundary and negative decision rules [29-32]. The two thresholds generated by DTRS divide the universe into three parts: the positive region, the negative region and the boundary region [20,28,36-37], which correspond to decisions of acceptance, rejection and non-commitment, respectively. Three-way decisions derived from DTRS have been applied in many domains, such as email filtering [34,35], investment and government decisions [11,15], medical clinics [19], document classification [8], data pack selection [24], web-based support systems [25], etc.

DTRS requires that the loss functions be given precisely as real numbers. Sometimes one can directly use the values of objects (e.g., money, energy and time) to estimate the losses [12]. However, in most cases, decision makers can hardly estimate precise loss function values in a real decision procedure. There are two methods to address this limitation. One is the behavioural method: questionnaires or behavioural experiments can be designed to obtain the losses in DTRS [12]. The other is to replace the precise values in DTRS with uncertain values. Uncertain information (including stochastic, vague or rough information) has been applied in DTRS [16,17,41]. Under the stochastic environment, Liu et al. assumed that the loss functions obey a certain probability distribution, and proposed extensions of decision-theoretic rough set models under the uniform distribution and the normal distribution [17]. Under the vague and rough environments, considering the fact that modelling possible values of quantities by means of real intervals accounts for some vague knowledge in a very simple way, Liu et al. introduced interval-valued loss functions and proposed interval-valued decision-theoretic rough sets [16]. However, the expressive power of intervals is limited: if an interval is too narrow, the quantity it represents may lie outside of it; if it is too wide, the obtained results will be uninformative [5]. So it is natural to extend the interval number to the fuzzy interval number when modelling vague knowledge.

In this paper, we introduce the fuzzy interval number to DTRS and propose a novel model of fuzzy interval decision-theoretic rough sets (FIDTRS). The remainder of this paper is organized as follows: Section II provides the basic concepts of fuzzy interval numbers and DTRS. The corresponding propositions and criteria of FIDTRS are investigated in Section III. Then, a case study of mines management is given to illustrate our model in Section IV. Section V concludes the paper and outlines future work.

II. PRELIMINARIES

Basic concepts, notations and results of fuzzy interval numbers and RS are briefly reviewed in this section [5-7,101 8,20,22,23,28,33,36,37,39-43].

Definition 1. A closed interval $\mu = [\mu^l, \mu^u]$, with $\mu^l, \mu^u \in \mathbb{R}$ and $\mu^l \le \mu^u$, is called a real-valued interval number. If $\mu^l, \mu^u \in [0,1]$, $[\mu^l, \mu^u]$ is called an interval number on the unit interval, or simply an interval number. Let

$$N_{[0,1]} = \{\mu = [\mu^l, \mu^u] \mid 0 \le \mu^l \le \mu^u \le 1\},$$

then it is the collection of all interval numbers (on the unit interval [0,1]).

Definition 2. A fuzzy set $F$, defined in the universal space $X$, is a function defined in $X$ which assumes values in the range [0,1]. A fuzzy set $F$ is written as a set of pairs $\{x, \mu_F(x)\}$, as $F = \{\{x, \mu_F(x)\}\}$, $x \in X$, where $x$ is an element of the universal space $X$ and $\mu_F(x)$ is the membership grade of the element $x$ in the fuzzy set $F$.

A fuzzy number $\tilde{\lambda} = (l, m, n)$ is a triangular fuzzy number, with its membership function defined as:

$$\mu_{\tilde{\lambda}}(x) = \begin{cases} (x-l)/(m-l), & l \le x \le m \\ (x-n)/(m-n), & m \le x \le n \\ 0, & \text{otherwise} \end{cases}$$
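The membership function of a triangular fuzzy number can be evaluated directly. The following is a minimal sketch, not part of the original paper: the function name `triangular_membership` is hypothetical, and the check `l < m < n` is an added assumption that keeps both divisions well defined.

```python
def triangular_membership(x: float, l: float, m: float, n: float) -> float:
    """Membership grade of x in the triangular fuzzy number (l, m, n).

    Assumption (not stated in the paper): l < m < n, so that neither
    denominator below is zero.
    """
    if not (l < m < n):
        raise ValueError("expected l < m < n")
    if l <= x <= m:
        # Rising edge: grade grows linearly from 0 at l to 1 at m.
        return (x - l) / (m - l)
    if m < x <= n:
        # Falling edge: grade decreases linearly from 1 at m to 0 at n.
        return (x - n) / (m - n)
    # Outside the support [l, n] the grade is zero.
    return 0.0
```

For instance, with the hypothetical number (1, 2, 3), the grade is 1 at the peak `x = 2` and 0.5 halfway up either edge.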

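A fuzzy interval is viewed as a pair of fuzzy thresholds $[\tilde{\lambda}^l, \tilde{\lambda}^u]$. If $\tilde{\lambda}^l$ and $\tilde{\lambda}^u$ are triangular fuzzy numbers, $[\tilde{\lambda}^l, \tilde{\lambda}^u]$ can be rewritten as $[(l_1, m_1, n_1), (l_2, m_2, n_2)]$, and all the fuzzy numbers are positive ($l_1, l_2, m_1, m_2, n_1, n_2 \ge 0$).

Definition 3. Let $U$ be a finite and nonempty set and $R$ an equivalence relation on $U$. The equivalence relation $R$ induces a partition of $U$, denoted by $U/R$. The pair $apr = (U, R)$ is called a Pawlak approximation space. For a subset $X \subseteq U$, the lower approximation and the upper approximation of Pawlak rough sets are defined by:

$$\underline{apr}(X) = \{x \in U \mid [x] \subseteq X\}; \qquad \overline{apr}(X) = \{x \in U \mid [x] \cap X \neq \emptyset\}. \tag{1}$$

Based on the rough set approximations of $X$, three pair-wise disjoint regions are generated: the positive region POS(X), the boundary region BND(X) and the negative region NEG(X):

$$\begin{aligned} POS(X) &= \underline{apr}(X), \\ BND(X) &= \overline{apr}(X) - \underline{apr}(X), \\ NEG(X) &= U - \overline{apr}(X) = \neg(\overline{apr}(X)). \end{aligned} \tag{2}$$

For all $x \in U$, one can make an acceptance decision when $x \in POS(X)$, a deferment decision when $x \in BND(X)$, and a rejection decision when $x \in NEG(X)$. The three regions lead to three-way decisions.

As an extension of Pawlak rough sets, probabilistic rough sets generalize the restrictive definition of the lower and the upper approximations in (1): two thresholds $\alpha$ and $\beta$ are introduced to construct the $(\alpha, \beta)$-approximation space [28,32,37]. Given $\alpha > \beta$, the $(\alpha, \beta)$-probabilistic lower and upper approximations are defined as follows.

$$\underline{apr}_{(\alpha,\beta)}(X) = \{x \in U \mid Pr(X \mid [x]) \ge \alpha\}; \qquad \overline{apr}_{(\alpha,\beta)}(X) = \{x \in U \mid Pr(X \mid [x]) > \beta\}, \tag{3}$$

where $Pr(X \mid [x]) = |[x] \cap X| / |[x]|$ denotes the conditional probability of the classification. Obviously, the $(\alpha, \beta)$-probabilistic positive, boundary and negative regions are defined as:

$$\begin{aligned} POS_{(\alpha,\beta)}(X) &= \{x \in U \mid Pr(X \mid [x]) \ge \alpha\}, \\ BND_{(\alpha,\beta)}(X) &= \{x \in U \mid \beta < Pr(X \mid [x]) < \alpha\}, \\ NEG_{(\alpha,\beta)}(X) &= \{x \in U \mid Pr(X \mid [x]) \le \beta\}. \end{aligned} \tag{4}$$

III. FUZZY INTERVAL DECISION-THEORETIC ROUGH SET MODEL

Let $\Omega = \{X, \neg X\}$ be a set of 2 states, indicating that an element is in $X$ and not in $X$, respectively. Let $A = \{a_P, a_B, a_N\}$ be a finite set of 3 possible actions, which represent classifying an object in the positive region, boundary region or negative region. The losses of those 3 classification actions with respect to different states are defined by 6 fuzzy interval numbers. $\tilde{\lambda}_{PP} = [\tilde{\lambda}^l_{PP}, \tilde{\lambda}^u_{PP}]$, $\tilde{\lambda}_{BP} = [\tilde{\lambda}^l_{BP}, \tilde{\lambda}^u_{BP}]$ and $\tilde{\lambda}_{NP} = [\tilde{\lambda}^l_{NP}, \tilde{\lambda}^u_{NP}]$ denote the losses incurred for taking actions $a_P$, $a_B$ and $a_N$ when an object belongs to $X$; $\tilde{\lambda}_{PN} = [\tilde{\lambda}^l_{PN}, \tilde{\lambda}^u_{PN}]$, $\tilde{\lambda}_{BN} = [\tilde{\lambda}^l_{BN}, \tilde{\lambda}^u_{BN}]$ and $\tilde{\lambda}_{NN} = [\tilde{\lambda}^l_{NN}, \tilde{\lambda}^u_{NN}]$ denote the losses incurred for taking actions $a_P$, $a_B$ and $a_N$ when an object does not belong to $X$. $\tilde{\lambda}^l_{\bullet\bullet}$ ($\bullet = P, B, N$) is the fuzzy number that serves as the lower bound of $\tilde{\lambda}_{\bullet\bullet}$, and $\tilde{\lambda}^u_{\bullet\bullet}$ is the fuzzy number that serves as the upper bound of $\tilde{\lambda}_{\bullet\bullet}$.

In light of [4,26,27], the loss functions regarding the risk or cost of actions in the different states under fuzzy intervals are given in Table 1.

Table 1: The fuzzy interval loss function regarding the risk or cost of actions in the different states

| Action | $X$ (P) | $\neg X$ (N) |
|---|---|---|
| $a_P$ | $\tilde{\lambda}_{PP} = [\tilde{\lambda}^l_{PP}, \tilde{\lambda}^u_{PP}]$ | $\tilde{\lambda}_{PN} = [\tilde{\lambda}^l_{PN}, \tilde{\lambda}^u_{PN}]$ |
| $a_B$ | $\tilde{\lambda}_{BP} = [\tilde{\lambda}^l_{BP}, \tilde{\lambda}^u_{BP}]$ | $\tilde{\lambda}_{BN} = [\tilde{\lambda}^l_{BN}, \tilde{\lambda}^u_{BN}]$ |
| $a_N$ | $\tilde{\lambda}_{NP} = [\tilde{\lambda}^l_{NP}, \tilde{\lambda}^u_{NP}]$ | $\tilde{\lambda}_{NN} = [\tilde{\lambda}^l_{NN}, \tilde{\lambda}^u_{NN}]$ |

For simplicity, we suppose all the parameters $\tilde{\lambda}$ in Table 1 are triangular fuzzy numbers. Thus, the loss functions can be redefined as:

$$\begin{aligned} \tilde{\lambda}_{PP} &= [(l^l_{PP}, m^l_{PP}, n^l_{PP}), (l^u_{PP}, m^u_{PP}, n^u_{PP})]; \\ \tilde{\lambda}_{BP} &= [(l^l_{BP}, m^l_{BP}, n^l_{BP}), (l^u_{BP}, m^u_{BP}, n^u_{BP})]; \\ \tilde{\lambda}_{NP} &= [(l^l_{NP}, m^l_{NP}, n^l_{NP}), (l^u_{NP}, m^u_{NP}, n^u_{NP})]; \\ \tilde{\lambda}_{PN} &= [(l^l_{PN}, m^l_{PN}, n^l_{PN}), (l^u_{PN}, m^u_{PN}, n^u_{PN})]; \\ \tilde{\lambda}_{BN} &= [(l^l_{BN}, m^l_{BN}, n^l_{BN}), (l^u_{BN}, m^u_{BN}, n^u_{BN})]; \\ \tilde{\lambda}_{NN} &= [(l^l_{NN}, m^l_{NN}, n^l_{NN}), (l^u_{NN}, m^u_{NN}, n^u_{NN})]. \end{aligned} \tag{5}$$

There are many ranking methods for fuzzy numbers, including the integral value method [9], the distance minimization method [2] and the deviation degree method [1]. According to the results presented in [3,21], a triangular fuzzy number $\tilde{\lambda} = (l, m, n)$ can be transformed into a crisp number by employing the following equation:

$$\lambda = \frac{l + 4m + n}{6}. \tag{6}$$

Furthermore, the loss functions in Table 1 can be calculated as:

$$\begin{aligned} \lambda_{PP} &= \left[\frac{l^l_{PP} + 4m^l_{PP} + n^l_{PP}}{6}, \frac{l^u_{PP} + 4m^u_{PP} + n^u_{PP}}{6}\right]; \\ \lambda_{BP} &= \left[\frac{l^l_{BP} + 4m^l_{BP} + n^l_{BP}}{6}, \frac{l^u_{BP} + 4m^u_{BP} + n^u_{BP}}{6}\right]; \\ \lambda_{NP} &= \left[\frac{l^l_{NP} + 4m^l_{NP} + n^l_{NP}}{6}, \frac{l^u_{NP} + 4m^u_{NP} + n^u_{NP}}{6}\right]; \\ \lambda_{PN} &= \left[\frac{l^l_{PN} + 4m^l_{PN} + n^l_{PN}}{6}, \frac{l^u_{PN} + 4m^u_{PN} + n^u_{PN}}{6}\right]; \\ \lambda_{BN} &= \left[\frac{l^l_{BN} + 4m^l_{BN} + n^l_{BN}}{6}, \frac{l^u_{BN} + 4m^u_{BN} + n^u_{BN}}{6}\right]; \\ \lambda_{NN} &= \left[\frac{l^l_{NN} + 4m^l_{NN} + n^l_{NN}}{6}, \frac{l^u_{NN} + 4m^u_{NN} + n^u_{NN}}{6}\right]. \end{aligned} \tag{7}$$

At this moment, the loss functions in (7) are changed into intervals. The lower bound and the upper bound of $\lambda$ in (7) reflect different risk attitudes of decision makers. Risk-lovers tend to choose the lower bound of each loss function as the corresponding loss function value. On the contrary, risk-averters believe that the upper bound of each loss function represents their loss function values [7,12]. For a risk-lover, the expected loss can be expressed as:

$$\begin{aligned} R(a_P \mid [x]) &= \frac{l^l_{PP} + 4m^l_{PP} + n^l_{PP}}{6} Pr(X \mid [x]) + \frac{l^l_{PN} + 4m^l_{PN} + n^l_{PN}}{6} Pr(\neg X \mid [x]); \\ R(a_B \mid [x]) &= \frac{l^l_{BP} + 4m^l_{BP} + n^l_{BP}}{6} Pr(X \mid [x]) + \frac{l^l_{BN} + 4m^l_{BN} + n^l_{BN}}{6} Pr(\neg X \mid [x]); \\ R(a_N \mid [x]) &= \frac{l^l_{NP} + 4m^l_{NP} + n^l_{NP}}{6} Pr(X \mid [x]) + \frac{l^l_{NN} + 4m^l_{NN} + n^l_{NN}}{6} Pr(\neg X \mid [x]). \end{aligned}$$

The Bayesian decision procedure suggests the following minimum-cost decision rules:

(P) If $R(a_P \mid [x]) \le R(a_B \mid [x])$ and $R(a_P \mid [x]) \le R(a_N \mid [x])$, decide $x \in POS(X)$;

(B) If $R(a_B \mid [x]) \le R(a_P \mid [x])$ and $R(a_B \mid [x]) \le R(a_N \mid [x])$, decide $x \in BND(X)$;

(N) If $R(a_N \mid [x]) \le R(a_P \mid [x])$ and $R(a_N \mid [x]) \le R(a_B \mid [x])$, decide $x \in NEG(X)$.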
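The chain from fuzzy interval losses to a three-way decision can be sketched in code: defuzzify each triangular bound with (6), pick the lower bound (risk-lover) or upper bound (risk-averter) of each interval in (7), compute the three expected losses, and apply rules (P), (B), (N). This is a minimal sketch under stated assumptions, not the authors' implementation. All names (`conditional_probability`, `defuzzify`, `interval_loss`, `decide`, `LOSSES`) and all numeric loss values are hypothetical, the risk-averter branch simply mirrors the risk-lover formula with upper bounds, and ties are resolved in the order POS, BND, NEG, which the text does not specify.

```python
from typing import Dict, Set, Tuple

Triangular = Tuple[float, float, float]        # (l, m, n)
FuzzyInterval = Tuple[Triangular, Triangular]  # (lower-bound number, upper-bound number)


def conditional_probability(block: Set[str], target: Set[str]) -> float:
    """Pr(X | [x]) = |[x] ∩ X| / |[x]| for a nonempty equivalence class [x]."""
    if not block:
        raise ValueError("equivalence class must be nonempty")
    return len(block & target) / len(block)


def defuzzify(t: Triangular) -> float:
    """Crisp value of a triangular fuzzy number, (l + 4m + n) / 6 as in (6)."""
    l, m, n = t
    return (l + 4 * m + n) / 6


def interval_loss(fi: FuzzyInterval) -> Tuple[float, float]:
    """Crisp interval [lower, upper] of a fuzzy interval loss, as in (7)."""
    lower, upper = fi
    return defuzzify(lower), defuzzify(upper)


def decide(losses: Dict[str, FuzzyInterval], pr_x: float, risk_lover: bool = True) -> str:
    """Return 'POS', 'BND' or 'NEG' by minimum expected loss.

    A risk-lover uses the lower bound of every loss interval; the
    risk-averter branch (upper bounds) is an assumption by analogy.
    """
    if not 0.0 <= pr_x <= 1.0:
        raise ValueError("pr_x must lie in [0, 1]")
    bound = 0 if risk_lover else 1
    lam = {key: interval_loss(value)[bound] for key, value in losses.items()}
    risk = {
        "POS": lam["PP"] * pr_x + lam["PN"] * (1 - pr_x),  # R(a_P | [x])
        "BND": lam["BP"] * pr_x + lam["BN"] * (1 - pr_x),  # R(a_B | [x])
        "NEG": lam["NP"] * pr_x + lam["NN"] * (1 - pr_x),  # R(a_N | [x])
    }
    # min() keeps the first minimal element, so ties go to POS, then BND
    # (a tie-breaking assumption, not taken from the paper).
    return min(("POS", "BND", "NEG"), key=lambda region: risk[region])


# Hypothetical loss values, chosen only to illustrate the computation.
LOSSES: Dict[str, FuzzyInterval] = {
    "PP": ((0, 0, 0), (1, 1, 1)),
    "BP": ((1, 2, 3), (2, 3, 4)),
    "NP": ((5, 6, 7), (7, 8, 9)),
    "PN": ((6, 7, 8), (8, 9, 10)),
    "BN": ((1, 2, 3), (3, 4, 5)),
    "NN": ((0, 0, 0), (1, 1, 1)),
}
```

With these illustrative losses, `decide(LOSSES, 0.9)` accepts, `decide(LOSSES, 0.5)` defers and `decide(LOSSES, 0.1)` rejects; the probability argument can be obtained from `conditional_probability` for a concrete equivalence class.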

I;p + 4m;p + n;p

6

NEG(X).

:s;

I;p + 4m;p + n;p

6