A Hybrid Evolutionary Multiobjective Approach for the Dynamic Component Selection Problem

Andreea Vescan and Crina Groşan
Department of Computer Science, Babeş-Bolyai University, Cluj-Napoca, Romania
{avescan,cgrosan}@cs.ubbcluj.ro

Shengxiang Yang
Department of Information Systems, Computing and Mathematics, Brunel University, London, United Kingdom
[email protected]

978-1-4577-2152-6/11/$26.00 © 2011 IEEE

Abstract

Component selection is a crucial problem in Component-Based Software Engineering (CBSE). CBSE is concerned with assembling pre-existing software components into a software system that meets client-specific requirements. This work deals with the component selection problem, which we formulate as a multiobjective optimization problem involving four objectives: the number of used components, the number of new requirements, the number of provided interfaces and the number of initial requirements that are not in the solution. We use the Pareto dominance principle to deal with the multiobjective optimization problem. Needless to say, the last two objectives should be zero. Afterwards, we investigate the problem in a dynamic or changing environment, for which two practical scenarios are envisaged: the repository containing the components varies over time, and the system requirements change over time. The algorithm employed combines an evolutionary algorithm with a repair mechanism. The idea is to avoid restarting the algorithm from randomly generated solutions and to make use of the ones found at the previous step.

1. Introduction

Complex versions of the component selection problem have two important characteristics: they can depend on an uncertain, dynamic or changing environment, and their characterization requires more than one criterion (the problem is multicriterial). This paper addresses both aspects and proposes a hybrid evolutionary approach which adapts to dynamic environments while trying to satisfy all the criteria in an optimal way. The approach combines an evolutionary algorithm with a repair mechanism, which is required to ensure the validity of the candidate solutions. There is still plenty of ongoing research on the behavior of evolutionary approaches in dynamic environments and in situations where more than one objective is involved in the optimization process. The current work provides both an analysis of the algorithm and its drawbacks and an application to a practical problem.

The paper is organized as follows: Section 2 presents a short introduction to components and their compositions. The proposed approach (which uses an evolutionary algorithm) is presented in Section 3. In Section 4 some experiments and comparisons are performed. We conclude the paper and discuss future work in Section 5.

2. Components and Compositions

A component is an independent software package that provides functionality via well-defined interfaces. The interface may be an export interface, through which a component provides functionality to other components, or an import interface, through which a component obtains services from other components. The purpose of a component [1] is to provide functionality that can be used in different contexts. This functionality is accessible through the component's provides interface. A component may have multiple provides interfaces. A component can depend on functionality offered by other components. The functionality that is required by a component forms the component's requires interface; a component may also have several of these. A component with a requires interface can be bound to any component that implements this interface. Thus, by specifying functionality in terms of interfaces, no dependencies on concrete components are introduced. This property makes components independently deployable. However, non-functional properties of components, such as performance characteristics, may yield such component dependencies. These are ignored in this article.

The “black-box” nature of a component is important: a component can be incorporated in a software system without regard to how it is implemented. In other words, the interface of a component should provide all the information that users need. Moreover, this information should be the only information they need. Consequently, the interface of a component should be the only point of access to the component.

Components are used as building blocks to form larger software entities. Assembling components [1] is called composition. Composition involves putting components together and connecting provided functionality to required functionality. Composition can be static or dynamic. With static composition, the collection of components that form an application is statically known. With dynamic composition, the composition of components is determined dynamically, e.g., at run-time. In this article we only consider static composition.

Component selection methods are traditionally applied in an architecture-centric manner. One such approach was proposed in [6], whose authors present a method for simultaneously defining the software architecture and selecting off-the-shelf components. Another type of component selection approach is built around the relationship between the requirements and the components available for use. In [4] the authors present a framework for the construction of optimal component systems based on term rewriting strategies. Paper [5] compares a Greedy algorithm with a Genetic Algorithm. Various genetic algorithm representations were proposed in [9], [10], [8], [7].

3. Proposed approach description

3.1. Problem formulation

An informal specification of our aim is as follows. We need to construct a final system specified by input data (what is given) and output data (what must be computed). We can view the final system as a compound component: the input data become the required interfaces of the component and the output data become its provided interfaces. In this context the required interfaces are given, and we need to supply the internal structure of the final compound component so that it offers the provided interfaces. Figure 1 represents this discussion graphically.

[Figure 1: the final system as a compound component. The input data (given) are its required interfaces r1, r2; the output data (results) are its provided interfaces p1, p2, p3; the internal structure of the final system is what must be found.]

Figure 1. Graphical representation of the problem formulation

A formal definition of the problem (seen as a compound component) is as follows. Consider $SR$ the set of final system requirements (the provided functionalities of the final compound component), $SR = \{r_1, r_2, \ldots, r_n\}$, and $SC$ the set of components (repository) available for selection, $SC = \{c_1, c_2, \ldots, c_m\}$. Each component $c_i$ can satisfy a subset of the requirements from $SR$ (its provided functionalities), denoted $SP_{c_i} = \{p_{i1}, p_{i2}, \ldots, p_{ik}\}$, and has a set of requirements denoted $SR_{c_i} = \{r_{i1}, r_{i2}, \ldots, r_{ih}\}$. The goal is to find a set of components $Sol$ such that every requirement $r_j$ ($j = 1, \ldots, n$) from the set $SR$ can be assigned a component $c_i$ from $Sol$ with $r_j \in SP_{c_i}$ ($i = 1, \ldots, m$), while minimizing the number of used components. All the requirements of the selected components must be satisfied by the components in the solution.

The dynamics of the component selection problem can be viewed in two ways:
1) The repository containing the components varies over time. A set of possible components is initially considered. Each component has two states: on and off (or 0 and 1). At each time step, a constant number of components are kept on while the remaining ones are kept off. When moving to the next time step, the status of some of the components is changed from on to off and vice versa, ensuring that the number of components that are on remains constant.
2) The system requirements change over time. A few operations are allowed in this dynamic situation: a) new requirements are introduced, in addition to the existing ones; b) some of the requirements are removed; and c) a combination of the two: some of the requirements are removed while new ones are added, without keeping the number of requirements constant.
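The formal definition can be illustrated with a minimal sketch. The `Component` class, the `unmet_requirements` helper and the toy components below are hypothetical names and data introduced only for illustration; they are not taken from the paper's repository. The sketch checks the validity condition stated above: every system requirement and every requirement of a selected component must be covered, either by a selected component or by the given inputs of the final system.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Component:
    """A repository component c with its sets SP_c (provides) and SR_c (requires)."""
    name: str
    provides: frozenset
    requires: frozenset


def unmet_requirements(system_requirements, given_inputs, solution):
    """Return the requirements that the selected components leave uncovered.

    system_requirements: the set SR of the final system
    given_inputs: interfaces available implicitly (the input data of the system)
    solution: iterable of selected Component objects (the set Sol)
    """
    covered = set(given_inputs)
    needed = set(system_requirements)
    for comp in solution:
        covered |= comp.provides
        needed |= comp.requires
    return needed - covered


# Hypothetical toy data, chosen only to exercise the check.
c4 = Component("C4", frozenset({2}), frozenset({9}))
c1 = Component("C1", frozenset({5}), frozenset({3}))
c7 = Component("C7", frozenset({6}), frozenset({1}))
c6 = Component("C6", frozenset({9}), frozenset({1}))

partial = unmet_requirements({2, 5, 6}, {1, 3}, [c4, c1, c7])
complete = unmet_requirements({2, 5, 6}, {1, 3}, [c4, c1, c7, c6])
```

Under these assumptions a selection is a valid solution exactly when the returned set is empty; the number of selected components is then the quantity to minimize.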

2011 11th International Conference on Hybrid Intelligent Systems (HIS)

3.2. Solution representation

A solution (chromosome) is represented as a 4-tuple (lstProv, lstComp, lstInitReq, lstNewReq) with the following information:
• list of provided interfaces (lstProv);
• list of components (lstComp);
• list of initial requirements (lstInitReq);
• list of new requirements (lstNewReq).

The value of the i-th component represents the component satisfying the i-th provided interface from the list of provided interfaces. A chromosome is initialized by setting the list of provided interfaces to the requirements of the final required system (viewed as provided interfaces), and the list of initial requirements to the required interfaces of the final required system (these are provided implicitly, being input data of the problem/system).

To describe the representation we use the component repository in Figure 2. Many components may provide the same functionality with different requires interfaces. Starting from a set of three requirements and a set of 18 available components, the goal is to find a subset of the given components such that all the requirements are satisfied. The set of requirements $SR = \{r_2, r_5, r_6\}$ (viewed as provided interfaces $\{p_2, p_5, p_6\}$, see the discussion in Section 3.1) and the set of components $SC = \{c_0, c_1, c_2, c_3, c_4, c_5, c_6, c_7, c_8, c_9, \ldots, c_{18}\}$ are given as in Figure 2.

[Figure 2: component repository (components C1 to C18, each drawn with its provided interfaces p and required interfaces r) and the final system.]

Figure 2. Components repository and final system for the first case of experiments

A valid chromosome using the components repository above may be structured as follows: Crom0 = ((2, 5, 6), (4, 1, 2), (1, 3), (9, 4)). This chromosome does not represent a solution; it is only an initialized chromosome to which no genetic operator has been applied. The provided interfaces (2, 5, 6) are offered by the components (4, 1, 2). The set of initial requirements is (1, 3). When a component is used, its requirements must also be provided: component 4 requires the 9-th new requirement and component 2 requires the 4-th new requirement.
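The 4-tuple can be written down directly as a small data structure. The sketch below is only one possible encoding; the class name `Chromosome`, the snake_case field names and the two helper methods are assumptions made for illustration, while the values of `crom0` are those of Crom0 above.

```python
from typing import NamedTuple, Tuple


class Chromosome(NamedTuple):
    """(lstProv, lstComp, lstInitReq, lstNewReq) as an immutable tuple."""
    lst_prov: Tuple[int, ...]      # provided interfaces
    lst_comp: Tuple[int, ...]      # lst_comp[i] offers lst_prov[i]
    lst_init_req: Tuple[int, ...]  # initial requirements (implicitly provided)
    lst_new_req: Tuple[int, ...]   # requirements still to be satisfied

    def provider_of(self, interface: int) -> int:
        """Component that satisfies the given provided interface."""
        return self.lst_comp[self.lst_prov.index(interface)]

    def is_complete(self) -> bool:
        """True when no new requirement is left pending."""
        return not self.lst_new_req


crom0 = Chromosome((2, 5, 6), (4, 1, 2), (1, 3), (9, 4))
```

Keeping `lst_prov` and `lst_comp` as parallel tuples mirrors the positional pairing described in the text: the i-th component serves the i-th provided interface.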

3.3. Genetic operator

The genetic operator used is mutation. The mutation operator applies the following steps:
• randomly select a requirement from the list of new requirements;
• add the provided interface associated with the new requirement to the list of provided interfaces;
• add to the list of components a component that satisfies the added provided interface (the component is randomly selected from the set of components that offer it);
• remove the selected requirement from the list of new requirements;
• add the requirements of the added component (if there are any) to the list of new requirements.

The same chromosome (from Section 3.2) has the following internal structure after applying the mutation operator: Crom1 = ((2, 5, 6, 9, 4), (4, 1, 2, 6, 3), (1, 3), ()). In order to provide the 9-th new requirement we selected component 6, and for the 4-th new requirement component 3 was chosen. No new requirements are added (the requirements of the newly selected components are in the set of initial requirements).

There is another way of modifying a chromosome: if no more new requirements need to be satisfied, the mutation operator performs as follows:
• randomly select a provided interface from the list of provided interfaces;
• replace the component that satisfies the selected provided interface (the new component is randomly selected from the set of components that offer it);
• remove from the list of provided interfaces all the dependencies of the replaced component (if no other component has the same dependency);
• add the requirements of the new component (if there are any) to the list of new requirements.

For the above chromosome the first kind of mutation cannot be applied, but the second one can. For the selected provided interface 6, from the list of available components {C2, C5, C7} we select component 7. Thus component 2 is replaced by component 7 and the dependency of component 2 is removed, i.e. provided interface 4 with component 3. The new chromosome has the following structure: Crom3 = ((2, 5, 6, 9), (4, 1, 7, 6), (1, 3), ()).
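The first kind of mutation can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `mutate_add`, the dictionary layout of `REPO` and its contents are hypothetical (the real repository is the one drawn in Figure 2), and a chromosome is assumed to be a plain tuple of four tuples.

```python
import random

# Hypothetical toy repository: component id -> (provided interfaces, required interfaces).
REPO = {
    4: ({2}, {9}),
    1: ({5}, {3}),
    2: ({6}, {4}),
    6: ({9}, {1}),
    3: ({4}, {3}),
}


def mutate_add(chrom, repo, rng=random):
    """First kind of mutation: satisfy one randomly chosen pending requirement."""
    prov, comp, init_req, new_req = (list(part) for part in chrom)
    if not new_req:
        return chrom  # nothing pending; the second kind of mutation applies instead
    req = rng.choice(new_req)
    candidates = sorted(c for c, (provided, _) in repo.items() if req in provided)
    if not candidates:
        return chrom  # no component in the repository offers this interface
    chosen = rng.choice(candidates)
    new_req.remove(req)
    prov.append(req)
    comp.append(chosen)
    # Requirements of the added component become pending unless already covered.
    for r in sorted(repo[chosen][1]):
        if r not in init_req and r not in prov and r not in new_req:
            new_req.append(r)
    return tuple(prov), tuple(comp), tuple(init_req), tuple(new_req)


crom0 = ((2, 5, 6), (4, 1, 2), (1, 3), (9, 4))
crom1 = mutate_add(mutate_add(crom0, REPO), REPO)
```

With this toy repository each pending requirement has a single provider, so two applications always pair interface 9 with component 6 and interface 4 with component 3, as in Crom1; only the order in which the two pairs are appended depends on the random choice.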

3.4. Algorithm description

An evolutionary algorithm such as the one described above is adequate for static situations. In dynamic environments, the solutions usually change from one step to another, and the idea is to reuse as much as possible of the information previously found by the algorithm instead of restarting it from scratch. There are situations in which a small adjustment of the solutions found at the previous step is enough; on the other hand, there are situations in which the solutions at the next step are completely different from the ones already found, and a simple adjustment will not lead to the desired solutions. The situations treated in this paper are closer to the second scenario: given the way a solution is built, it is impossible to evolve towards a solution at the next step without any intervention. Adding new requirements or removing some of them involves changing the size of the chromosome.


In this respect, we combined the evolutionary approach with a repair mechanism. The role of the repair operator is to adapt each chromosome (by modifying its structure) to the new, changed problem. The repair operator for a chromosome is given below:
• if the final requirements change, the chromosome is modified in the following way:
– if a new requirement is added, the requirement is added to the list of provided interfaces, and a component that satisfies it is added to the list of components;
– if a requirement is deleted, the corresponding component is also deleted from the list of components; all the dependencies of the deleted component are deleted from the list of provided interfaces (or new requirements), if no other components depend on them;
– the list of requirements is also modified according to the new list of (final) required interfaces of the final system.
• if the set of components in the repository changes, the chromosome is modified in the following way:
– if a component is deleted from the repository and that component is used by the chromosome, a new component that offers the same provided interface is searched for in the new set of components;
– all the dependencies of the deleted component are deleted from the list of provided interfaces (or new requirements), if no other components depend on them.

The Crom3 chromosome changes as follows if we delete the requirement (provided interface) 2 and add a new requirement (transformed into a provided interface) 4, with component 3 chosen to provide requirement 4: Crom4 = ((5, 6, 4), (1, 7, 3), (1, 3), ()). Provided interface 2 was offered by component 4, which depends on component 6, so the first and the last elements of the list of provided interfaces and of the list of components are removed.

The same Crom3 chromosome changes into Crom5 = ((2, 5, 6), (1, 1, 7), (1, 3), ()) if component 4 is removed from the repository. Because requirement 2 was offered by component 4, a new component is randomly chosen, in this example component 1. The dependency of component 4 on component 6 is also removed from the chromosome.
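The requirement-change branch of the repair operator can be sketched as below. This is an illustrative reading of the rules above under stated assumptions: the function names, the `REPO` dictionary and its contents are hypothetical, a chromosome is a plain tuple of four tuples, and the repository-change branch is not covered.

```python
import random

# Hypothetical toy repository: component id -> (provided interfaces, required interfaces).
REPO = {
    4: ({2}, {9}),
    1: ({2, 5}, {3}),
    7: ({6}, {1}),
    6: ({9}, {1}),
    3: ({4}, {3}),
}


def drop_provided(chrom, repo, interface):
    """Remove a provided interface, its component, and dependencies nobody else needs."""
    prov, comp, init_req, new_req = (list(part) for part in chrom)
    idx = prov.index(interface)
    prov.pop(idx)
    removed = comp.pop(idx)
    chrom = (tuple(prov), tuple(comp), tuple(init_req), tuple(new_req))
    for dep in sorted(repo[removed][1]):
        prov_now, comp_now, init_now, new_now = chrom
        needed = set().union(*(repo[c][1] for c in comp_now))
        if dep in needed:
            continue  # another selected component still depends on it
        if dep in prov_now:
            chrom = drop_provided(chrom, repo, dep)  # cascade to the provider
        elif dep in new_now:
            chrom = (prov_now, comp_now, init_now,
                     tuple(r for r in new_now if r != dep))
    return chrom


def repair_requirements(chrom, repo, removed=(), added=(), rng=random):
    """Adapt a chromosome after final-system requirements are removed and/or added."""
    for req in removed:
        if req in chrom[0]:
            chrom = drop_provided(chrom, repo, req)
    for req in added:
        candidates = sorted(c for c, (provided, _) in repo.items() if req in provided)
        if req in chrom[0] or not candidates:
            continue
        chosen = rng.choice(candidates)
        prov, comp, init_req, new_req = chrom
        pending = tuple(r for r in sorted(repo[chosen][1])
                        if r not in init_req and r not in prov and r not in new_req)
        chrom = (prov + (req,), comp + (chosen,), init_req, new_req + pending)
    return chrom


crom3 = ((2, 5, 6, 9), (4, 1, 7, 6), (1, 3), ())
crom4 = repair_requirements(crom3, REPO, removed=[2], added=[4])
```

With this toy repository the call reproduces Crom4 from the text: removing requirement 2 drops component 4 and, by cascade, its dependency (interface 9 with component 6), and the new requirement 4 is served by component 3.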

3.5. Multiobjective aspect of the problem

The approach presented in this paper uses principles of evolutionary computation and multiobjective optimization [3]. First, the problem is formulated as a multiobjective optimization problem having four objectives: the number of used components, the number of new requirements, the number of provided interfaces and the number of initial requirements that are not in the solution. All objectives are to be minimized.

There are several ways to deal with a multiobjective optimization problem. In this paper the Pareto dominance [11] principle is used.

Definition 1: Pareto dominance. Consider a maximization problem. Let $x$, $y$ be two decision vectors (solutions) from the definition domain. Solution $x$ dominates $y$ (also written as $x \succ y$) if and only if the following conditions are fulfilled:
1) $f_i(x) \geq f_i(y), \ \forall i = 1, \ldots, n$;
2) $\exists j \in \{1, 2, \ldots, n\} : f_j(x) > f_j(y)$.

That is, a feasible vector $x$ is Pareto optimal if no feasible vector $y$ can increase some criterion without causing a simultaneous decrease in at least one other criterion. For a minimization problem, as is the case here, the inequalities are reversed.

It is well known that a high number of objectives poses problems for evolutionary algorithms that use the Pareto dominance relation, since it can be difficult to find solutions that are better than the ones already found with respect to all the criteria. This is the case with our problem as well: we observed that the evolution stagnates after a few iterations because all the solutions become nondominated and a newly generated solution very rarely dominates the others. Therefore, we introduce a supplementary condition for comparing two solutions that are mutually nondominated: we prefer the one for which the aggregation (sum) of all the objective values is lower.
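The comparison rule can be sketched in a few lines, written here for minimization since all four objectives are minimized. The function names and the sample objective vectors are hypothetical; the tie-break on the sum of objectives follows the supplementary condition described above.

```python
def dominates(a, b):
    """True if objective vector a Pareto-dominates b (all objectives minimized)."""
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))


def better(a, b):
    """Prefer a over b: dominance first, then the lower sum of objectives."""
    if dominates(a, b):
        return True
    if dominates(b, a):
        return False
    return sum(a) < sum(b)  # supplementary condition for nondominated pairs


def nondominated(vectors):
    """Keep the vectors that no other vector in the list dominates."""
    return [v for v in vectors if not any(dominates(w, v) for w in vectors)]


# Hypothetical objective vectors:
# (used components, new requirements, provided interfaces, missing initial requirements)
front = nondominated([(3, 0, 4, 0), (4, 0, 4, 0), (2, 1, 5, 0)])
```

Note that the sum-based tie-break imposes a total preference among mutually nondominated solutions, which is what lets the search keep moving once the whole population is nondominated.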

4. Experiments

We consider two types of dynamics and, consequently, two experiments, one corresponding to each of them:
- the requirements of the problem change over time;
- the components available at a certain time step change.

4.1. Case 1: Dynamic (changing) requirements

The final system requirements change from one step to another. There are a few possible scenarios: adding new requirements to the ones at the previous step; removing some of the requirements from the previous step; and a combination of the two, i.e. adding new requirements and removing some of the ones at the previous step (in this situation we always ensure that the added components are different from the removed ones). We do not treat each of these situations separately because we did not observe a distinct behavior for any particular situation. It appears that the complexity is the same regardless of the kind of dynamics involved in this case.


Six different time steps are built using artificially generated data, and the dynamics at each of these steps are shown in Table 1. The initial step is T = 1 and the changes start with time step T = 2. It is worth noting that the repository containing the components is unchanged for the entire duration of the algorithm and for all time steps.

Table 1. Dynamics for test case 1: changing requirements.

Step     Changes
T = 2    Add two new requirements
T = 3    Remove one requirement
T = 4    Remove one requirement and add a new requirement
T = 5    Add a new requirement
T = 6    Remove two requirements and add a new requirement

In order to analyze the behavior of the algorithm, we performed a few tests. Their role is to see whether the number of iterations and the population size play a role in finding the Pareto solutions. For each time step we report the number of nondominated solutions in the final population and the number of distinct solutions (some solutions have multiple copies, and in the end we count each of them only once). Figure 3 shows the effect of the population size, which varies from 10 to 100 in steps of 10, and Figure 4 presents the influence of the number of iterations for the same variation. The values of the other parameters of the algorithm are: mutation probability 0.7; number of iterations 20 for the test of Figure 3; population size 50 for the test of Figure 4.

Some remarks can be derived from the experiments performed. We are interested in finding as many nondominated solutions as possible but, on the other hand, we look for diversity as well and wish to have a large number of distinct solutions among the nondominated ones. As is evident from Figure 3, all the individuals in the population are able to converge to a solution (in all situations, the number of nondominated solutions obtained at the end of a time step equals the population size), which indicates good convergence of the algorithm. Population size plays a role in identifying distinct nondominated solutions, and we can observe a slight variation in at least one of the time steps as the population size increases. On the other hand, the number of iterations does not play a crucial role; we can see an increasing trend in the number of distinct individuals at the end of a time step, but the increase is not constant. This is to be expected: a larger population is of greater help in finding more solutions than an increased number of iterations (which may, in turn, help achieve better convergence).

Figure 4. Number of nondominated solutions at each time step and for different numbers of iterations (population size is 50).

4.2. Case 2: Dynamic components

In this case, the repository containing the components changes over time. The repository contains a given number of components, but only a percentage of them is available for selection at a time step. This percentage remains constant for all the time steps, but the set of available components changes with each time step. Figure 5 shows the component repository for the second type of experiment. The modifications for each step are: step 1, adding two components; step 2, removing one component; step 3, adding and removing one component; and step 4, adding three components.

[Figure 5: component repository (components C1 to C17, each drawn with its provided interfaces p and required interfaces r) and the final system.]

Figure 5. Components repository and final system for the second case of experiments

Figure 6 shows the effect of the population size on the number of distinct nondominated solutions. 30 iterations are considered for each time step. It can easily be observed that beyond a population size of 40 there is no improvement, and that even with a small population and fewer iterations the algorithm adapts very well (we therefore present the graphs for populations of up to 80 individuals).


Figure 3. Influence of the population size for the final number of nondominated solutions (and distinct nondominated solutions).

As with the previous test, we studied the effect of the number of iterations, considering several experiments with the number of iterations varying from 10 to 100 while the population size is kept constant at 30. Figure 7 depicts the results.

4.3. Comparisons

The algorithm employed here combines an evolutionary algorithm with a repair mechanism. The idea is to avoid restarting the algorithm from randomly generated solutions and to make use of the ones found at the previous step. We compare these results with an algorithm which restarts every time from a new randomly generated population. We use a population of size 100 and 100 iterations, and we compare the two algorithms in terms of overall fitness (i.e. the average value of the sum of all four objectives) and in terms of the number of distinct nondominated solutions found at the end of each time step. The results of this comparison are given in Table 2 for the case with dynamic requirements and in Table 3 for the case with dynamic components.

Comparing the two algorithms shows the advantage of the hybrid approach over the algorithm which uses a complete restart. The hybrid approach is more efficient in terms of both overall fitness and diversity of the nondominated solutions found.
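The two comparison measures can be computed as in the sketch below. It assumes, for illustration only, that a population is given as a list of objective vectors (one 4-tuple per individual) and that distinctness is judged on those vectors; the function names and the sample population are hypothetical and are not data from the paper.

```python
def overall_fitness(population):
    """Average, over the population, of the sum of the objective values."""
    return sum(sum(objectives) for objectives in population) / len(population)


def distinct_nondominated(population):
    """Distinct objective vectors that no other vector dominates (minimization)."""
    unique = set(population)

    def is_dominated(v):
        # For distinct vectors, "no worse everywhere" implies "better somewhere".
        return any(w != v and all(a <= b for a, b in zip(w, v)) for w in unique)

    return sorted(v for v in unique if not is_dominated(v))


# Hypothetical final population of four individuals.
population = [(1, 0, 1, 0), (1, 0, 1, 0), (2, 0, 1, 0), (0, 1, 1, 0)]
fitness = overall_fitness(population)
front = distinct_nondominated(population)
```

Reporting the length of `front` alongside `fitness` at the end of each time step would give the kind of pair shown in the "Fit." and "Nr. non" columns of the comparison tables.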

Figure 6. Number of nondominated solutions and number of distinct nondominated solutions at each time step for the dynamic component set.

Table 2. Comparison between the EA with repair mechanism and the EA with complete restart for the Case 1 experiment (Fit. is the averaged sum of all the objective values and Nr. non is the number of distinct nondominated solutions at the end of a time step).

Algorithm         T=2             T=3             T=4             T=5             T=6
                  Fit.  Nr. non   Fit.  Nr. non   Fit.  Nr. non   Fit.  Nr. non   Fit.  Nr. non
EA with repair    2.5   2         2     3         2     5         2     5         2     4
EA restarted      values in extraction order (per-step alignment not recoverable): 2, 2.5, 2, 2, 2.5, 1, 1, 4, 4

Figure 7. Effect of the number of iterations on convergence and diversity at each time step for the case with a dynamic component set. Population size is kept constant at 30.

5. Conclusion

The current work investigates the potential of evolutionary algorithms in a particular case of a multiobjective dynamic system: the software component selection problem. Two types of dynamics have been considered: the requirements of the system change over time (in the first case) and the components available in the repository change over time (in the second case). The multiobjective evolutionary approach is combined with a repair mechanism which helps the algorithm continue from the population found at the previous step instead of restarting the whole process. The tests performed show the potential of evolutionary algorithms for this particular problem and for other similar ones. This work will be further extended to deal with more complicated problems (the next step is to consider hierarchies of components [12]).


3

Table 3. Comparison between the EA with repair mechanism and the EA with complete restart for the Case 2 experiment (Fit. is the averaged sum of all the objective values and Nr. non is the number of distinct nondominated solutions at the end of a time step).

Algorithm         T=2             T=3             T=4             T=5
                  Fit.  Nr. non   Fit.  Nr. non   Fit.  Nr. non   Fit.  Nr. non
EA with repair    2     5         2     5         2     4         2     4
EA restarted      values in extraction order (per-step alignment not recoverable): 2, 1.5, 1.5, 1.5, 2, 4, 3

References

[1] I. Crnkovic, M. Larsson, Building Reliable Component-Based Software Systems, Artech House publisher, 2002.
[2] A. Vescan, Components ordered assembly construction based on temporal restraint, Proceedings of the 3rd Doctoral Workshop on Mathematical and Engineering Methods in Computer Science, Znojmo, Czechia, pp. 249–256, 2007.
[3] C. Grosan, A comparison of several evolutionary models and representations for multiobjective optimization, ISE Book Series on Real Word Multi-Objective System Engineering, chapter 3, Nova Science, 2005.
[4] L. Gesellensetter, S. Glesner, Only the Best Can Make It: Optimal Component Selection, Electron. Notes Theor. Comput. Sci., vol. 176 (2), pp. 105–124, 2007.
[5] N. Haghpanah, S. Moaven, J. Habibi, M. Kargar, S. H. Yeganeh, Approximation Algorithms for Software Component Selection Problem, APSEC conference, pp. 159–166, 2007.
[6] E. Mancebo, A. Andrews, A strategy for selecting multiple components, SAC ’05: Proceedings of the 2005 ACM symposium on Applied computing, pp. 1505–1510, 2005.
[7] A. Vescan, Optimal component selection using a multiobjective evolutionary algorithm, Neural Network World Journal, no. 2, pp. 201–213, 2009.
[8] A. Vescan, A Metrics-based Evolutionary Approach for the Component Selection Problem, the 11th International Conference on Computer Modelling and Simulation (UKSim 2009), pp. 83–88, 2009.
[9] A. Vescan, C. Grosan, Two Evolutionary Multiobjective Approaches for the Component Selection Problem, Proceedings of the Fourth International Workshop on Evolutionary Multiobjective Optimization Design and Applications, Kaohsiung, Taiwan, pp. 395–400, 2008.
[10] A. Vescan, C. Grosan, A Hybrid Evolutionary Multiobjective Approach for the Component Selection Problem, Proceedings of the 3rd International Workshop on Hybrid Artificial Intelligence Systems, Burgos, Spain, LNCS/LNAI 5271, pp. 164–171, 2008.
[11] A. Abraham, L. Jain, R. Goldberg, Evolutionary Multiobjective Optimization: Theoretical Advances and Applications, Springer Verlag, ISBN 1852337877, 2005.
[12] A. Vescan, C. Grosan, Evolutionary Multiobjective Approach for Multilevel Component Composition, Studia Universitatis Babes-Bolyai Series Informatica, Vol. LV, No. 4, pp. 18–32, 2010.
