Using a Fuzzy Based Pseudometric in Classification
Sandra Sandri, Flávia Martins-Bedê, and Luciano Dutra
Instituto Nacional de Pesquisas Espaciais, 12201-970, São José dos Campos, SP
[email protected], {flavinha,luciano.dutra}@dpi.inpe.br
Abstract. In this work, we propose a pseudometric based on a fuzzy relation, which is itself derived from a fuzzy partition. This pseudometric is a metric in the particular case in which the fuzzy partition is composed solely of triangular fuzzy sets. We prove that the proposed function is a pseudometric in general and a metric in the triangular case, and illustrate its use in an experiment on the classification of land use in an area of the Brazilian Amazon region.
Keywords: classification, k-NN, fuzzy partitions, distance, metric, pseudometric.
1 Introduction
One of the main approaches to classification is that of k-NN classifiers, in which an element of a domain is assigned the class that represents the majority of the classes of its k closest neighbours (see [9] for a review of classification techniques). The function used to obtain the neighbours of an element x in a multi-dimensional domain O is usually a metric (or distance) on O, but there are a few works proposing the use of pseudometrics instead (see for instance [4]).
In this work, we propose the use of a function that is the complement in [0, 1] of a particular kind of similarity relation, called an Order Compatible Fuzzy Relation (OCFR⪯), defined using a total order (Ω, ⪯) [8]. An OCFR⪯ itself is derived from a particular type of fuzzy partition (a collection of fuzzy sets), called a Convex Fuzzy Partition (CFP⪯). The creation of OCFR⪯ was motivated by the need to ease the burden of creating suitable relations for use in a particular fuzzy case-based reasoning classification approach [6].
The main goal of this work is to prove that the proposed function is i) a pseudometric, when obtained from a specific type of CFP⪯, called 2-Ruspini, and, in particular, ii) a metric, when this CFP⪯ is moreover composed solely of triangular fuzzy sets. We also briefly describe an application of this function in the classification of land use in an area of the Brazilian Amazon region.
A. Laurent et al. (Eds.): IPMU 2014, Part I, CCIS 442, pp. 189–198, 2014. © Springer International Publishing Switzerland 2014
2 Fuzzy Convex Partitions and Order-Compatible Relations
In Fuzzy Sets Theory, membership to sets is no longer an all-or-nothing notion [1]. A fuzzy set A on a domain Ω is characterized by a mapping A : Ω → [0, 1], called the membership function of A. It is said to be normalized when ∃ x0 ∈ Ω such that A(x0) = 1. A level cut of A is defined as ∀α ∈ (0, 1], [A]α = {x ∈ Ω | A(x) ≥ α}. The core and support of a fuzzy set A are particular types of level cuts defined as core(A) = {x ∈ Ω | A(x) = 1} and supp(A) = {x ∈ Ω | A(x) > 0}, respectively.
Particular types of fuzzy sets are those described by piecewise linear membership functions, in which all level cuts are (nested) intervals of the domain. If A0 is such a fuzzy set and its core is given by core(A0) = [xl, xu], with xl, xu ∈ Ω, then A0 is trapezoidal (respec. triangular) when xl ≠ xu (respec. xl = xu).
A collection of fuzzy sets is usually called a fuzzy partition, with more specific definitions depending on the properties obeyed by the fuzzy sets composing the partition.
A fuzzy relation S is characterized by a mapping from a multidimensional domain O = Ω1 × ... × Ωn to [0, 1]. Its normalization is defined similarly to that for fuzzy sets in one-dimensional domains. Given two distinct fuzzy relations S and S′, we say that S is finer than S′ when ∀x, y ∈ Ω, S(x, y) ≤ S′(x, y).
A T-norm operator is a mapping ⊤ : [0, 1]² → [0, 1] that is commutative, associative, monotonic and has 1 as neutral element. Given a T-norm ⊤, its associated residuated implication operator I⊤ is defined as [7]
I⊤(x, y) = sup{z ∈ [0, 1] | ⊤(x, z) ≤ y},
and its associated biresiduation BI⊤ is defined by
BI⊤(x, y) = min(I⊤(x, y), I⊤(y, x)).
In particular, the Łukasiewicz T-norm operator is defined as
⊤_L(x, y) = max(x + y − 1, 0),
and its associated residuated operator and biresiduation are respectively defined as
– I_L(x, y) = min(1 − x + y, 1),
– BI_L(x, y) = 1 − |x − y|,
where | . | denotes the absolute value of a real number.
Let S : Ω² → [0, 1] be a fuzzy binary relation and (Ω, ⪯) be a total order. Formally, S is an Order Compatible Fuzzy Relation with Respect to a Total Order (Ω, ⪯) (OCFR⪯, or OCFR for short), when it obeys the following properties [8]:
– ∀x ∈ Ω, S(x, x) = 1 (reflexivity)
– ∀x, y ∈ Ω, S(x, y) = S(y, x) (symmetry)
– ∀x, y, z ∈ Ω, if x ⪯ y ⪯ z, then S(x, z) ≤ min(S(x, y), S(y, z)) (compatibility with total order (Ω, ⪯), or ⪯-compatibility for short).
Let (Ω, ⪯) be a total order and let A = {A1, ..., At} be a collection of fuzzy sets in Ω. Formally, A is a Convex Fuzzy Partition with Respect to a Total Order (Ω, ⪯) (CFP⪯, or CFP for short), if it obeys the following properties [8]:
1. ∀Ai ∈ A, ∃x ∈ Ω, Ai(x) = 1 (normalization),
2. ∀x, y, z ∈ Ω, ∀Ai ∈ A, if x ⪯ y ⪯ z then Ai(y) ≥ min(Ai(x), Ai(z)) (convexity),
3. ∀x ∈ Ω, ∃Ai ∈ A, Ai(x) > 0 (domain-covering),
4. ∀Ai, Aj ∈ A, if i ≠ j then core(Ai) ∩ core(Aj) = ∅ (non-core-intersection).
Let A(Ω,⪯) denote the set of all CFPs that can be derived considering a total order (Ω, ⪯). A CFP A ∈ A(Ω,⪯) is said to be an n-CFP if each element in Ω has non-null membership to at most n fuzzy sets in A (n ≥ 1). In particular, a 2-CFP A is called a 2-Ruspini partition when it obeys additivity:
– ∀x ∈ Ω, Σ_i Ai(x) = 1 (additivity)
In [8], the authors propose to generate an OCFR S⁺ : Ω² → [0, 1] from a CFP A as
$$S^+(x, y) = \begin{cases} 0, & \text{if } S^*(x, y) = 0, \\ S_L(x, y), & \text{otherwise,} \end{cases}$$
where
$$\forall x, y \in \Omega,\ S^*(x, y) = \sup_i \min(A_i(x), A_i(y)),$$
$$\forall x, y \in \Omega,\ S_L(x, y) = \inf_i \big(1 - |A_i(x) - A_i(y)|\big).$$
Note that S_L is constructed based on the Łukasiewicz biresiduation BI_L. When A is a 2-Ruspini partition, S⁺ obeys the following properties (see [8] for proofs):
– ∀Ai ∈ A, ∀c ∈ core(Ai), ∀x ∈ Ω, S⁺(c, x) = S⁺(x, c) = Ai(x) (core-restrictivity);
– ∀x, y ∈ Ω, S⁺(x, y) ≥ S*(x, y) = sup_i min(Ai(x), Ai(y)) (level-cut-compatibility).
Core-restrictivity states that the "slice" from S⁺ corresponding to an element c at the core of a set Ai (the column of element c in [7]) is exactly the same as Ai, whereas level-cut-compatibility ensures that any two elements of Ω that belong to level cut [Ai]α of a fuzzy set Ai in A will also belong to level cut [S⁺]α of relation S⁺ [8].
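The construction of S⁺ from a partition can be sketched in a few lines of Python. This is only an illustrative sketch, not code from [8]: the helper names (`triangular_partition`, `s_star`, `s_luk`, `s_plus`) are hypothetical, and the partition is assumed to be a triangular 2-Ruspini partition on the interval between the first and last peak.

```python
from typing import Callable, List, Sequence

Membership = Callable[[float], float]


def triangular_partition(peaks: Sequence[float]) -> List[Membership]:
    """Triangular fuzzy sets with cores at the sorted `peaks` (hypothetical helper).

    On [peaks[0], peaks[-1]] neighbouring memberships sum to 1, so the
    collection is assumed to behave as a 2-Ruspini partition on that interval.
    """
    def make(i: int) -> Membership:
        c = peaks[i]
        left = peaks[i - 1] if i > 0 else None
        right = peaks[i + 1] if i + 1 < len(peaks) else None

        def mu(x: float) -> float:
            if left is not None and left < x <= c:    # rising edge
                return (x - left) / (c - left)
            if right is not None and c <= x < right:  # falling edge
                return (right - x) / (right - c)
            return 1.0 if x == c else 0.0

        return mu

    return [make(i) for i in range(len(peaks))]


def s_star(sets: Sequence[Membership], x: float, y: float) -> float:
    """S*(x, y) = sup_i min(A_i(x), A_i(y))."""
    return max(min(a(x), a(y)) for a in sets)


def s_luk(sets: Sequence[Membership], x: float, y: float) -> float:
    """S_L(x, y) = inf_i (1 - |A_i(x) - A_i(y)|), the Lukasiewicz biresiduation."""
    return min(1.0 - abs(a(x) - a(y)) for a in sets)


def s_plus(sets: Sequence[Membership], x: float, y: float) -> float:
    """S+(x, y): 0 when x and y share no fuzzy set, S_L(x, y) otherwise."""
    return 0.0 if s_star(sets, x, y) == 0 else s_luk(sets, x, y)


sets = triangular_partition([0.0, 1.0, 2.0])
print(s_plus(sets, 0.25, 0.75))  # related through the first two sets
print(s_plus(sets, 0.0, 2.0))    # no common fuzzy set
```

With peaks at 0, 1 and 2, the element 1 lies in the core of the middle set, so `s_plus(sets, 1.0, x)` should coincide with the middle membership function, which is one way to observe core-restrictivity numerically.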
3 Function f⁺
A metric, or distance function, d : Ω² → R satisfies the following properties:
– ∀x, y ∈ Ω, d(x, y) ≥ 0 (non-negativity)
– ∀x, y ∈ Ω, d(x, y) = 0 if and only if x = y (identity of indiscernibles)
– ∀x, y ∈ Ω, d(x, y) = d(y, x) (symmetry)
– ∀x, y, z ∈ Ω, d(x, z) ≤ d(x, y) + d(y, z) (triangle inequality).
A pseudometric satisfies non-negativity, symmetry and the triangle inequality, but the identity of indiscernibles property is substituted by a weaker property: – ∀x ∈ Ω, d(x, x) = 0 (anti-reflexivity)
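The difference between the two notions can be checked empirically on a finite sample of points. The sketch below is an illustration only: `check_axioms`, `is_pseudometric` and `is_metric` are hypothetical helper names, and passing the checks on a sample is evidence, not a proof, that a function satisfies the axioms on the whole domain.

```python
import math
from itertools import product
from typing import Callable, Dict, Sequence


def check_axioms(d: Callable[[float, float], float],
                 points: Sequence[float],
                 tol: float = 1e-12) -> Dict[str, bool]:
    """Test each (pseudo)metric axiom on every pair and triple of `points`."""
    pairs = list(product(points, repeat=2))
    triples = list(product(points, repeat=3))
    return {
        "non_negativity": all(d(x, y) >= -tol for x, y in pairs),
        "symmetry": all(abs(d(x, y) - d(y, x)) <= tol for x, y in pairs),
        "anti_reflexivity": all(abs(d(x, x)) <= tol for x in points),
        "identity_of_indiscernibles": all(
            (abs(d(x, y)) <= tol) == (x == y) for x, y in pairs),
        "triangle_inequality": all(
            d(x, z) <= d(x, y) + d(y, z) + tol for x, y, z in triples),
    }


def is_pseudometric(report: Dict[str, bool]) -> bool:
    """All axioms except the identity of indiscernibles."""
    return all(report[k] for k in
               ("non_negativity", "symmetry", "anti_reflexivity", "triangle_inequality"))


def is_metric(report: Dict[str, bool]) -> bool:
    """A pseudometric that also separates distinct points."""
    return is_pseudometric(report) and report["identity_of_indiscernibles"]


points = [0.0, 0.5, 1.0, 2.25]
absolute = check_axioms(lambda x, y: abs(x - y), points)
# Distance between integer parts: distinct points in the same unit cell get 0.
floored = check_axioms(lambda x, y: abs(math.floor(x) - math.floor(y)), points)
print(is_metric(absolute), is_pseudometric(floored), is_metric(floored))
```

The second function plays the same role as a partition with non-point cores: it collapses distinct elements to distance 0 while still respecting the triangle inequality.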
Let A be a CFP and S⁺_A be the result of applying S⁺ to A. Here we propose the use of function f⁺_A : Ω² → [0, 1] in tasks in which metrics and pseudometrics are employed:
$$\forall x, y \in \Omega,\ f^+_A(x, y) = 1 - S^+_A(x, y).$$
This formula can be written directly as:
$$\forall x, y \in \Omega,\ f^+_A(x, y) = \begin{cases} 1, & \text{if } \forall i,\ \min(A_i(x), A_i(y)) = 0, \\ \sup_i |A_i(x) - A_i(y)|, & \text{otherwise.} \end{cases}$$
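The piecewise formula translates almost literally into code once each element is represented by its vector of membership degrees (A_1(x), ..., A_t(x)). The following is a minimal sketch under that assumption; the function name `f_plus` and the vector representation are illustrative choices, not part of the original formulation.

```python
from typing import Sequence


def f_plus(mx: Sequence[float], my: Sequence[float]) -> float:
    """f+ computed from the membership vectors of two elements x and y.

    mx[i] and my[i] are assumed to hold A_i(x) and A_i(y) for the same
    partition, listed in the same order.
    """
    # First branch: no fuzzy set contains both elements in its support.
    if all(min(a, b) == 0 for a, b in zip(mx, my)):
        return 1.0
    # Second branch: largest membership difference over all fuzzy sets.
    return max(abs(a - b) for a, b in zip(mx, my))


# Elements sharing two contiguous sets of a 2-Ruspini partition.
print(f_plus([0.75, 0.25, 0.0], [0.25, 0.75, 0.0]))  # 0.5
# Elements lying in the cores of two different sets.
print(f_plus([1.0, 0.0, 0.0], [0.0, 0.0, 1.0]))      # 1.0
```

Working on membership vectors keeps the sketch independent of the shape of the fuzzy sets, so the same function applies to triangular and trapezoidal partitions alike.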
When no confusion is possible, we denote f⁺_A simply as f⁺.
Below, we first prove some properties of f⁺ and then prove that it is a pseudometric in general and a metric when A is composed solely of triangular fuzzy sets. Note that, since S⁺_A obeys the compatibility with total order property, we obtain the inequality
∀x, y, z ∈ Ω, if x ⪯ y ⪯ z then f⁺_A(x, z) ≥ max(f⁺_A(x, y), f⁺_A(y, z)).
Therefore, that property gives a lower bound for f⁺_A(x, z) in terms of f⁺_A(x, y) and f⁺_A(y, z), when x ⪯ y ⪯ z. By proving the triangle inequality for f⁺_A, we will then obtain an upper bound for f⁺_A(x, z) that does not depend on the order of x, y and z in Ω.
In the following, we say that two elements p and q in Ω relate to each other with respect to a CFP A when they both have non-null membership degree to at least one common fuzzy set in A, i.e. when ∃A ∈ A, min(A(p), A(q)) > 0.
Let A be a 2-Ruspini CFP. Each element in Ω has non-null membership to either 1 or 2 fuzzy sets in A, due to the additivity and covering properties. Therefore, with respect to A, any two elements p and q in Ω can either:
i) be unrelated, when ∀Ai ∈ A, min(Ai(p), Ai(q)) = 0,
ii) be related by a single fuzzy set, when ∃B ∈ A such that p, q ∈ supp(B) and ∀Ai ∈ A, Ai ≠ B, min(Ai(p), Ai(q)) = 0, or
iii) be related by exactly two fuzzy sets, when there exist two contiguous fuzzy sets B1, B2 in A such that p, q ∈ supp(B1) ∩ supp(B2) and for all Ai ∈ A such that Ai ≠ B1 and Ai ≠ B2, min(Ai(p), Ai(q)) = 0.
Corollary 1. Let (Ω, ⪯) be a total order, A be a 2-Ruspini CFP, S⁺ be the OCFR derived from A and f⁺ = 1 − S⁺.
1. If p and q are unrelated, we have f⁺(p, q) = 1.
Proof. In this case, for all Ai ∈ A, min(Ai(p), Ai(q)) = 0, and thus the result follows from the definition of f⁺.
2. If p and q are related by a single fuzzy set B ∈ A, then either (a) f⁺(p, q) = 0, when p and q both belong to the core of B, or (b) f⁺(p, q) = max(1 − B(p), 1 − B(q)), otherwise.
Proof. In this case, there exists B ∈ A such that p, q ∈ supp(B) and for all Ai ∈ A, Ai ≠ B, min(Ai(p), Ai(q)) = 0.
– a) If p and q belong to the core of B, B(p) = B(q) = 1 and thus |B(p) − B(q)| = 0. Due to additivity in 2-Ruspini partitions, for all Ai ∈ A such that Ai ≠ B, Ai(p) = Ai(q) = 0 and hence |Ai(p) − Ai(q)| = 0. Therefore, f⁺(p, q) = 0.
– b) If p and q do not both belong to the core of B, then there exist two fuzzy sets B⁻ and B⁺ in A adjacent to B, to its left and right, respectively, which are such that p ∈ supp(B⁻) ∩ supp(B) and q ∈ supp(B) ∩ supp(B⁺). All fuzzy sets Ai ∈ A such that Ai ∉ {B⁻, B, B⁺} can be disregarded in the calculation of f⁺(p, q), because both p and q have null membership to each of them, and thus |Ai(p) − Ai(q)| = 0. We thus have ∀p, q ∈ supp(B), f⁺(p, q) = max_{A ∈ {B⁻, B, B⁺}} |A(p) − A(q)| = max(|B⁻(p) − B⁻(q)|, |B(p) − B(q)|, |B⁺(p) − B⁺(q)|). But B⁺(p) = B⁻(q) = 0, because p ∉ supp(B⁺) and q ∉ supp(B⁻). Moreover, ∀p ∈ supp(B⁻) ∩ supp(B), we have B⁻(p) = 1 − B(p) and ∀q ∈ supp(B) ∩ supp(B⁺), B⁺(q) = 1 − B(q), due to additivity in A. We thus have f⁺(p, q) = max(|1 − B(p) − 0|, |B(p) − B(q)|, |0 − 1 + B(q)|) = max(1 − B(p), 1 − B(q)), where the last equality holds because |B(p) − B(q)| ≤ max(1 − B(p), 1 − B(q)) for B(p), B(q) ∈ [0, 1].
3. If p and q are related by 2 fuzzy sets B1 and B2 in A, we have f⁺(p, q) = |B1(p) − B1(q)| = |B2(p) − B2(q)|.
Proof. In this case, there exist two contiguous fuzzy sets B1, B2 in A such that p, q ∈ supp(B1) ∩ supp(B2) and for all Ai ∈ A such that Ai ≠ B1 and Ai ≠ B2, min(Ai(p), Ai(q)) = 0. As seen previously, we only have to consider fuzzy sets B1 and B2 from A. We thus have f⁺(p, q) = max(|B1(p) − B1(q)|, |B2(p) − B2(q)|). But since A is 2-Ruspini and B1 and B2 are contiguous, B2(p) = 1 − B1(p) and B2(q) = 1 − B1(q), so that f⁺(p, q) = |B1(p) − B1(q)| = |B2(p) − B2(q)|, which completes the proof.
Corollary 2. Let (Ω, ⪯) be a total order and A be a 2-Ruspini CFP. Function f⁺ derived from A satisfies the triangle inequality.
Proof. To prove the triangle inequality, we have to verify that ∀x, y, z ∈ Ω, f⁺(x, z) ≤ f⁺(x, y) + f⁺(y, z).
Let supp_p stand for the union of the supports of the fuzzy sets to which an element p in Ω has non-null membership, i.e. ∀p ∈ Ω, supp_p = ∪_i {supp(Ai) | Ai(p) > 0}. We have to verify two cases.
– Case 1: x and z are unrelated in A, i.e. for all Ai ∈ A, min(Ai(x), Ai(z)) = 0. In this case, by definition, f⁺(x, z) = 1. When y ∉ supp_x ∪ supp_z, i.e. y is related to neither x nor z in A, we have f⁺(x, y) = f⁺(y, z) = 1 and thus f⁺(x, z) = 1 ≤ f⁺(x, y) + f⁺(y, z) = 2. When y ∈ supp_x, we have f⁺(y, z) = 1, since by hypothesis supp_x ∩ supp_z = ∅, and therefore f⁺(x, z) = 1 ≤ f⁺(x, y) + f⁺(y, z) = f⁺(x, y) + 1. We use a similar reasoning when y ∈ supp_z.
– Case 2: x and z are related in A, i.e. ∃A ∈ A, min(A(x), A(z)) > 0. When y ∉ supp_x ∪ supp_z, i.e. y is related to neither x nor z in A, we have f⁺(x, y) = f⁺(y, z) = 1. We thus obtain f⁺(x, z) ≤ f⁺(x, y) + f⁺(y, z) = 2.
Let us now examine the case in which y is related to either x or z or both. As mentioned above, since A is a 2-Ruspini CFP, we have two possibilities for x and z: they are related by either 1 or 2 fuzzy sets.
• Case 2.1: x and z are related by a single fuzzy set, i.e., there exists B ∈ A such that x, z ∈ supp(B) and ∀Ai ∈ A, Ai ≠ B, min(Ai(x), Ai(z)) = 0. If y ∉ supp(B), since y ∈ supp_x ∪ supp_z, it is unrelated either to x or to z (but not both) and thus, using Corollary 1.1, either f⁺(x, y) = 1 or f⁺(y, z) = 1 and therefore f⁺(x, z) ≤ f⁺(x, y) + f⁺(y, z). Let us now suppose that y ∈ supp(B). Without loss of generality, let us assume that x ≤ z. From Corollary 1.2, if x and z belong to the core of B, we trivially have f⁺(x, z) = 0 ≤ f⁺(x, y) + f⁺(y, z). Otherwise, we have f⁺(x, z) = max(1 − B(x), 1 − B(z)). Let us consider the latter case. There are two possibilities for y: it is either i) related to both x and z or ii) related to either x or z but not both. In i), x, y, z ∈ supp(B) and thus f⁺(x, z) = max(1 − B(x), 1 − B(z)) ≤ max(1 − B(x), 1 − B(y)) + max(1 − B(y), 1 − B(z)) = f⁺(x, y) + f⁺(y, z). In ii), either y ∈ supp(B⁻) or y ∈ supp(B⁺), where B⁻ and B⁺ are the fuzzy sets in A to the left and right of B, respectively. If y ∈ supp(B⁻), x and y are related by two fuzzy sets, B⁻ and B, and by Corollary 1.3 we have f⁺(x, y) = |B(x) − B(y)|. Moreover, y and z are unrelated and thus f⁺(y, z) = 1. Therefore, f⁺(x, z) = max(1 − B(x), 1 − B(z)) ≤ |B(x) − B(y)| + 1 = f⁺(x, y) + f⁺(y, z). The case in which y ∈ supp(B⁺) is proved in a similar manner.
• Case 2.2: x and z are related by exactly 2 fuzzy sets, i.e. there exist two contiguous fuzzy sets B1, B2 in A such that x, z ∈ supp(B1) ∩ supp(B2) and for all Ai ∈ A such that Ai ≠ B1 and Ai ≠ B2, min(Ai(x), Ai(z)) = 0. From Corollary 1.3, we have f⁺(x, z) = |B1(x) − B1(z)|. Without loss of generality, let us suppose that B1 stands to the left of B2 and that x ≤ z.
We then have B1(x) ≥ B1(z) and thus f⁺(x, z) = B1(x) − B1(z). We verify two possibilities for y:
∗ i) y ∈ supp(B1) ∩ supp(B2): We have to consider three orderings as regards y:
· y ≤ x ≤ z: In this case, B1(y) ≥ B1(x) ≥ B1(z) and, using Corollary 1.3, f⁺(x, y) = B1(y) − B1(x) and f⁺(y, z) = B1(y) − B1(z). But f⁺(x, z) = B1(x) − B1(z) ≤ B1(y) − B1(x) + B1(y) − B1(z) = f⁺(x, y) + f⁺(y, z).
· x ≤ y ≤ z: We have B1(x) ≥ B1(y) ≥ B1(z) and in a similar manner we obtain f⁺(x, z) = B1(x) − B1(z) = B1(x) − B1(y) + B1(y) − B1(z) = f⁺(x, y) + f⁺(y, z).
· x ≤ z ≤ y: We have B1(x) ≥ B1(z) ≥ B1(y) and, in a similar manner to y ≤ x ≤ z, we obtain f⁺(x, z) ≤ f⁺(x, y) + f⁺(y, z).
∗ ii) y ∉ supp(B1) ∩ supp(B2): Since y ∈ supp(B1) ∪ supp(B2), either a) y ∈ supp(B1) − supp(B2) or b) y ∈ supp(B2) − supp(B1).
In a), x and y are related by two fuzzy sets, B1 itself and a fuzzy set to the left of B1, which we will call B1⁻. Applying Corollary 1.3 we obtain f⁺(x, y) = |B1(x) − B1(y)|. Moreover, since y is unrelated to z, we obtain f⁺(y, z) = 1 (see Corollary 1.1). Therefore f⁺(x, z) = |B1(x) − B1(z)| ≤ |B1(x) − B1(y)| + 1 = f⁺(x, y) + f⁺(y, z). We obtain a similar result for the case b), which ends the proof.
Corollary 3. Let (Ω, ⪯) be a total order, A a 2-Ruspini CFP, and f⁺ derived from A. If all fuzzy sets in A are triangular, f⁺ satisfies the identity of indiscernibles property.
Proof. To satisfy the identity of indiscernibles property, we should have f⁺(x, y) = 0 if and only if x = y. Let us suppose that there exist two elements p and q in Ω such that f⁺(p, q) = 0 and p ≠ q. But f⁺(p, q) = 0 only when two conditions are satisfied:
1. ∃i, min(Ai(p), Ai(q)) > 0 and
2. sup_i |Ai(p) − Ai(q)| = 0.
As seen in Corollary 1, any two elements in the domain may be either unrelated or related by 1 or 2 fuzzy sets. But p and q are necessarily related, due to the first condition above. Therefore, we have to analyse two cases:
– Case 1: p and q are related by a single fuzzy set B. In this case, as seen in Corollary 1.2, we have to verify whether both p and q belong to core(B). But in a triangular fuzzy set A ∈ A, core(A) = {c_A} with c_A ∈ Ω, i.e. the core is a point. Since p and q are distinct and the core of B is a point, p and q cannot both belong to core(B). The remaining possibility is that f⁺(p, q) = max(1 − B(p), 1 − B(q)). But this expression equals 0 only when B(p) = B(q) = 1, which would mean that they both belong to the core of B, which contradicts the previous result.
– Case 2: p and q are related by two contiguous fuzzy sets B1 and B2. As seen in Corollary 1.3, we have f⁺(p, q) = |B1(p) − B1(q)|. But in this case, f⁺(p, q) = 0 only when B1(p) = B1(q).
Since B1 is triangular, it is strictly monotonic on supp(B1) ∩ supp(B2), so B1(p) = B1(q) only when p = q, which contradicts the hypothesis. Since in both cases we obtain a contradiction of the hypothesis, we conclude that the identity of indiscernibles property holds.
Function f⁺ derived from a 2-Ruspini CFP A is not a distance in the general case, because identity of indiscernibles does not hold when there exists a fuzzy set A in A such that core(A) = [a, b], with a, b ∈ Ω and a ≠ b, i.e., A is either a trapezoidal fuzzy set or a crisp interval. Indeed, in such a case, ∀p, q ∈ [a, b] we have f⁺(p, q) = 0.
We now prove that f⁺ is a pseudometric for any 2-Ruspini CFP A and a distance when all fuzzy sets in A are triangular.
Proposition 1. Let f⁺ be derived from a 2-Ruspini CFP A on Ω.
1. Function f⁺ is a pseudometric.
2. Function f⁺ is a distance when all fuzzy sets in A are triangular.
Proof. It is straightforward to verify that f⁺ satisfies symmetry, anti-reflexivity and non-negativity, because S⁺ is a symmetric and reflexive fuzzy relation. Moreover, in Corollary 2 we proved that the triangle inequality property holds for function f⁺. This proves part 1. We proved in Corollary 3 that f⁺ satisfies the identity of indiscernibles property when all fuzzy sets in A are triangular. This proves part 2, which completes the proof.
We now propose to use the arithmetic mean in order to extend f⁺ to multidimensional domains. In Proposition 2, we prove that the resulting function has the same properties as in the one-dimensional domain.
Proposition 2. Let O = Ω1 × ... × Ωm, where ∀i, (Ωi, ⪯) is a total order. Let Ai be a 2-Ruspini CFP on Ωi and f⁺_i be derived from Ai. Let f⁺_(μ) : O² → [0, 1] be the extension of function f⁺ to multidimensional domains, defined as
$$f^+_{(\mu)}(x, y) = \mu\big(f^+_1(x_1, y_1), \dots, f^+_m(x_m, y_m)\big),$$
where μ : [0, 1]^m → [0, 1] is the arithmetic mean, i.e.,
$$\mu(a_1, \dots, a_m) = \frac{\sum_{1 \le i \le m} a_i}{m}.$$
1. Function f⁺_(μ) is a pseudometric.
2. Function f⁺_(μ) is a distance when all fuzzy sets in each Ai are triangular.
Proof. Function f⁺_(μ) trivially satisfies symmetry, anti-reflexivity and non-negativity. We now prove that the triangle inequality property holds for f⁺_(μ). From the definitions of f⁺ and f⁺_(μ) and from Proposition 1, we have
$$f^+_{(\mu)}(x, z) = \frac{\sum_i f^+_i(x_i, z_i)}{m} \le \frac{\sum_i \big(f^+_i(x_i, y_i) + f^+_i(y_i, z_i)\big)}{m} = \frac{\sum_i f^+_i(x_i, y_i)}{m} + \frac{\sum_i f^+_i(y_i, z_i)}{m} = f^+_{(\mu)}(x, y) + f^+_{(\mu)}(y, z).$$
Therefore, f⁺_(μ) satisfies the triangle inequality and is thus a pseudometric. This proves part 1.
Let x and y be two elements in the m-dimensional space O. From Proposition 1, when each 2-Ruspini CFP Ai, defined on its corresponding Ωi, is composed solely of triangular fuzzy sets, we have ∀i (f⁺_i(xi, yi) = 0 ⇔ xi = yi). But ∀i (f⁺_i(xi, yi) = 0 ⇔ xi = yi) ⇒ [∀i (f⁺_i(xi, yi) = 0) ⇔ ∀i (xi = yi)]. Since we trivially have i) x = y iff ∀i ∈ [1, m], xi = yi and ii) f⁺_(μ)(x, y) = 0 iff ∀i ∈ [1, m], f⁺_i(xi, yi) = 0, we thus have f⁺_(μ)(x, y) = 0 ⇔ x = y. Therefore f⁺_(μ) satisfies the identity of indiscernibles and is thus a metric when the Ai's are composed of triangular fuzzy sets, which concludes the proof.
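The multidimensional extension and the triangle inequality of Proposition 2 can be explored numerically. The sketch below is an assumption-laden illustration: it uses triangular 2-Ruspini partitions described by their sorted peaks, restricts each coordinate to the interval between the first and last peak, and the names `tri_memberships`, `f_plus`, `f_mu` and `max_triangle_violation` are hypothetical. A random check of this kind supports the result but does not replace the proof.

```python
import random
from typing import List, Sequence


def tri_memberships(x: float, peaks: Sequence[float]) -> List[float]:
    """Membership vector of x in a triangular 2-Ruspini partition.

    `peaks` are the sorted core points; x is assumed to lie in
    [peaks[0], peaks[-1]].
    """
    out = [0.0] * len(peaks)
    for i in range(len(peaks) - 1):
        lo, hi = peaks[i], peaks[i + 1]
        if lo <= x <= hi:
            t = (x - lo) / (hi - lo)
            out[i], out[i + 1] = 1.0 - t, t  # the two degrees sum to 1
            break
    return out


def f_plus(mx: Sequence[float], my: Sequence[float]) -> float:
    """One-dimensional f+ computed from two membership vectors."""
    if all(min(a, b) == 0 for a, b in zip(mx, my)):
        return 1.0
    return max(abs(a - b) for a, b in zip(mx, my))


def f_mu(x: Sequence[float], y: Sequence[float],
         peaks_per_dim: Sequence[Sequence[float]]) -> float:
    """Arithmetic mean of the per-dimension f+ values."""
    dists = [f_plus(tri_memberships(xi, p), tri_memberships(yi, p))
             for xi, yi, p in zip(x, y, peaks_per_dim)]
    return sum(dists) / len(dists)


def max_triangle_violation(peaks_per_dim: Sequence[Sequence[float]],
                           trials: int = 2000, seed: int = 0) -> float:
    """Largest observed f(x,z) - f(x,y) - f(y,z) over random triples."""
    rng = random.Random(seed)

    def point() -> tuple:
        return tuple(rng.uniform(p[0], p[-1]) for p in peaks_per_dim)

    worst = float("-inf")
    for _ in range(trials):
        x, y, z = point(), point(), point()
        gap = (f_mu(x, z, peaks_per_dim)
               - f_mu(x, y, peaks_per_dim) - f_mu(y, z, peaks_per_dim))
        worst = max(worst, gap)
    return worst


peaks = [[0.0, 1.0, 2.0], [0.0, 5.0, 10.0]]
print(f_mu((0.25, 5.0), (0.75, 5.0), peaks))  # 0.25: mean of 0.5 and 0.0
print(max_triangle_violation(peaks) <= 1e-9)  # expected to hold by Proposition 2
```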
4 Use of f⁺ in a Classification Application
In the following, we briefly describe an experiment that illustrates the use of function f⁺ in a land use classification task in the Brazilian Amazon region. The area of interest covers approximately 411 km² in the municipality of Belterra, state of Pará, in the Brazilian Amazon region, and is partially contained in the National Forest of Tapajós. An intense occupation process occurred in the region along the BR-163 highway (Cuiabá-Santarém), with the opening of roads to establish small farms, after deforestation of primary
forest areas [2]. As a result, there are mosaics of secondary vegetation in various stages, with pastures and cultivated areas embedded in a forest matrix [3].
In this application [5], 14 attributes have been considered, derived from either radar or optical satellite images, with 6 classes: forest, initial or intermediate regeneration, advanced regeneration or degraded forest, cultivated area, exposed soil, and pasture. The samples consist of 428 hand-drawn polygons based on ground information. The attribute value for each polygon is the average of the values for the pixels composing it. The experiments have been done using 10 folds (9 for training and 1 for testing), partitioned as 2 sets containing 42 elements and 8 sets containing 43 elements.
Figure 1 shows the accuracy results for this application, considering k-NN with 1 to 19 neighbours, using the Euclidean distance (kNN_dE) and the Mahalanobis distance (kNN_dM), as well as function f⁺_(μ) obtained from two types of partitions using 3 fuzzy sets each, a triangular one (kNN_dFtg) and a trapezoidal one (kNN_dFtz). In order to calculate f⁺ for each attribute, the corresponding domain was reduced to the interval between the minimal and maximal sample values, each extended by 20%. For both the triangular and trapezoidal partitions, the fuzzy sets were homogeneously distributed on the reduced domain.
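The triangular variant of this setup can be sketched as follows. This is not the code used in the experiment [5]: it assumes that "extended by 20%" means widening each end of the sample interval by 20% of its range, it builds 3 evenly spaced triangular sets per attribute, and the function names and the toy data are hypothetical.

```python
from collections import Counter
from typing import List, Sequence


def make_peaks(values: Sequence[float], n_sets: int = 3,
               margin: float = 0.2) -> List[float]:
    """Evenly spaced core points on the sample interval widened by `margin`."""
    lo, hi = min(values), max(values)
    span = (hi - lo) or 1.0  # assumption: avoid a zero-width domain
    lo, hi = lo - margin * span, hi + margin * span
    step = (hi - lo) / (n_sets - 1)
    return [lo + i * step for i in range(n_sets)]


def tri_memberships(x: float, peaks: Sequence[float]) -> List[float]:
    """Membership vector of x in the triangular partition with cores at `peaks`."""
    x = min(max(x, peaks[0]), peaks[-1])  # clamp queries to the reduced domain
    out = [0.0] * len(peaks)
    for i in range(len(peaks) - 1):
        lo, hi = peaks[i], peaks[i + 1]
        if lo <= x <= hi:
            t = (x - lo) / (hi - lo)
            out[i], out[i + 1] = 1.0 - t, t
            break
    return out


def f_plus(mx: Sequence[float], my: Sequence[float]) -> float:
    """One-dimensional f+ computed from two membership vectors."""
    if all(min(a, b) == 0 for a, b in zip(mx, my)):
        return 1.0
    return max(abs(a - b) for a, b in zip(mx, my))


def f_mu(x, y, peaks_per_dim) -> float:
    """Arithmetic mean of the per-attribute f+ values."""
    dists = [f_plus(tri_memberships(xi, p), tri_memberships(yi, p))
             for xi, yi, p in zip(x, y, peaks_per_dim)]
    return sum(dists) / len(dists)


def knn_predict(train_x, train_y, query, k: int = 3):
    """Majority vote among the k training samples closest to `query` under f_mu."""
    peaks_per_dim = [make_peaks(col) for col in zip(*train_x)]
    ranked = sorted(zip(train_x, train_y),
                    key=lambda item: f_mu(item[0], query, peaks_per_dim))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Toy data, invented for illustration only.
train_x = [(0.0, 0.0), (0.1, 0.2), (1.0, 1.0), (0.9, 0.8)]
train_y = ["a", "a", "b", "b"]
print(knn_predict(train_x, train_y, (0.05, 0.1), k=3))  # a
```

Building the partitions from the training samples only, as done here, keeps the test fold out of the distance definition; whether the original experiment did the same is not stated in the text.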
Fig. 1. Classification accuracy results for: a) k-NN average and b) k-NN maximum
We see from the figures that all methods had high accuracy and that the best results were obtained with the Euclidean distance and with f⁺_(μ) for the chosen triangular partitions. We also see that the triangular partition with f⁺_(μ) has the advantage of producing its best results with a small number of neighbours, contrary to all of the other methods.
5 Conclusions
In this work, we have proposed the use of a pseudometric based on a particular type of fuzzy partition, which becomes a distance when the underlying partition is formed solely of triangular fuzzy sets. The extension of this function to multi-dimensional domains was also proposed. We have proved that the functions derived from both trapezoidal and triangular partitions enjoy good properties, which make them formally suitable to be used in k-NN classifiers. Finally, we have described a real-world experiment in which this function obtained very good results, showing its practical use in applications. The obtained results are very promising and future work includes applying these functions in other experiments.
Acknowledgements. The authors acknowledge financial support from FAPESP and are very grateful to Mariane S. Reis for help on the Tapajós case study.
References
1. Dubois, D., Prade, H.: Possibility Theory: An Approach to Computerized Processing of Uncertainty. Plenum Press, New York (1988)
2. Brazilian Institute of Environment and Renewable Natural Resources (IBAMA): Floresta Nacional do Tapajós Plano de Manejo, vol. I (2009) (in Portuguese), http://www.icmbio.gov.br/portal/images/stories/imgs-unidades-coservacao/flona_tapajoss.pdf (Date accessed: December 30, 2013)
3. Escada, M.I.S., Amaral, S., Rennó, C.D., Pinheiro, T.F.: Levantamento do uso e cobertura da terra e da rede de infraestrutura no distrito florestal da BR-163. Report INPE-15739RPQ/824, INPE. S.J. Campos, Brazil (2009), http://urlib.net/sid.inpe.br/mtc-m18@80/2009/04.24.14.45 (Date accessed: December 30, 2013)
4. Korsrilabutr, T., Kijsirikul, B.: Pseudometrics for Nearest Neighbor Classification of Time Series Data. Engineering Journal 13 (May 2009), http://engj.org/index.php/ej/article/view/46 (Date accessed: December 30, 2013)
5. Martins-Bedê, F.T.: An extension to kNN classifier for multiple spaces, PhD Thesis. INPE. S.J. Campos, Brazil (in press, 2014) (in Portuguese)
6. Mendonça, J.H., Sandri, S., Martins-Bedê, F.T., Guimarães, R., Carvalho, O.: Training strategies for a fuzzy CBR cluster-based approach. Mathware & Soft Computing 20, 42–49 (2013)
7. Recasens, J.: Indistinguishability operators, modelling fuzzy equalities and fuzzy equivalence relations. STUDFUZZ, vol. 260. Springer (2011)
8. Sandri, S., Martins-Bedê, F.T.: A method for deriving order compatible fuzzy relations from convex fuzzy partitions. Fuzzy Sets and Systems (in press, 2014)
9. Theodoridis, S., Koutroumbas, K.: Pattern recognition, 3rd edn. Academic Press (2006)