Evolving Optimal Parameters for Swarm Control
Jennifer Golbeck
University of Maryland, College Park
A.V. Williams Building, College Park, Maryland 20742 USA
ABSTRACT

Using many inexpensive rovers in place of a single costly one is an idea that has been gaining attention in the last decade. How to effectively control those rovers is an open question, and swarming is an attractive option. While much research in the field investigates intelligent swarming, recent work has shown that the unintelligent swarm is an effective control mechanism for thoroughly covering a space and for maintaining swarm-like behavior in the face of widespread failures. This paper takes that research one step further, exploring the application of a genetic algorithm to evolve optimal parameters for an exploratory swarm.
KEY WORDS: swarm behavior, distributed control, evolutionary optimization, genetic algorithms

1. INTRODUCTION

Much research in swarm intelligence (SI) has focused on issues such as task switching, agent allocation, and adaptive tasks [1]. Swarm behavior has been used as a tool to achieve these much more complicated behaviors. While this research is both interesting and tremendously useful, the swarm in its most basic form has interesting properties that deserve attention. Many real problems do not require intelligent agents that can change their behavior based on perceived changes in group needs. Often, simple, thorough information gathering is the goal.

Many animals use swarming behavior as a method of information gathering. From a central nest location, individuals travel out in differing patterns. Once an individual has found a target, such as food, it returns to signal for help. Before locating something of interest, however, the group as a whole conducts a rather comprehensive search of a given area. The pattern of this search originates from the nest and inspects local areas more frequently than distant ones. This makes intuitive sense: why travel 500 yards for food when there is a source 50 yards from the nest? As a result, terrain near the nest is more heavily inspected. The pattern of this coverage in nature often looks very much like a Gaussian distribution [2].

Why should this behavior interest us? Insects are relatively unintelligent individuals with little to no concept of the larger group's needs or goals, yet they are able to self-organize without supervision to accomplish tasks effectively. With tasks such as foraging, the group is able to cover a large area of terrain thoroughly in its search. Additionally, the group is not reliant on any one individual. One ant may get eaten while foraging, but the group does not measurably suffer for the loss. In our terms, the group is highly fault tolerant.

When rovers need to cover ground, it is important to consider how to organize and control them. Centralized control can be expensive and difficult. Furthermore, if the central control system fails, the entire network of rovers is incapacitated. Using a deterministic plan for covering ground is also susceptible to faults: if a rover meets an obstacle or is damaged, the unfinished portion of its route may never be covered. By contrast, an approach that mimics the swarming seen in nature avoids all of these pitfalls. The built-in redundancy and non-reliance on any individual produce a system that is highly fault tolerant and balanced between exploration and localization.

Previous research has investigated the unintelligent swarm as a control mechanism, showing that the swarm is an effective strategy for covering territory with a normal distribution and for dealing with large failure rates within the system [6]. An unanswered question is: what makes an ideal swarm? What is the best way to organize a swarm internally? This research employs a genetic algorithm to evolve optimal parameters for a swarm of rovers engaged in an exploratory mission.
2. BACKGROUND AND PREVIOUS RESEARCH

Golbeck [6] analyzed the performance of an unintelligent swarm of rovers, using a simulator to examine the behavior of software agents. The present work extends that initial study of unintelligent swarms, and the results presented here are based on the simulation developed in [6]. The rationale and analysis behind that study are summarized in this section.

The main method used to implement swarm behavior has each individual move toward its two closest neighbors. Using only one neighbor would simply pair off individuals; using two neighbors draws individuals in different directions. For example, Figure 1 shows a subgroup of five agents. Agent 1 is closest to agents 2 and 5. Based on the distance to those two agents, agent 1 will move in the direction indicated by the dashed line. Although agent 2 is a closest neighbor of agent 1, agents 3 and 4 are both closer to agent 2 than agent 1 is. Thus, agent 2 will move in the direction indicated by the dashed line originating from it. To prevent individuals from colliding, they are programmed to move away from any other individual that has come too close: instead of factoring an attraction to that individual into the movement vector, a repulsion of the same strength is factored in.
Figure 1: Nearest neighbors and movement vectors for two agents.
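To make the neighbor-following rule concrete, the sketch below computes a single agent's movement vector from its two nearest neighbors, switching attraction to repulsion for any neighbor inside a minimum separation distance. It is a minimal illustration of the rule described above, not the original simulator's code; the function and parameter names (movement_vector, repel_distance, neighbor_weight) are hypothetical.

```python
import math

def movement_vector(agent, others, repel_distance=32.0, neighbor_weight=1.0):
    """Compute one agent's movement vector from its two nearest neighbors.

    agent and others are (x, y) tuples. Neighbors closer than repel_distance
    contribute a repulsion of the same magnitude instead of an attraction.
    Illustrative sketch only, not the simulator from [6].
    """
    # Sort the other agents by distance and keep the two closest.
    nearest = sorted(others, key=lambda o: math.dist(agent, o))[:2]

    vx, vy = 0.0, 0.0
    for (ox, oy) in nearest:
        dx, dy = ox - agent[0], oy - agent[1]
        dist = math.hypot(dx, dy)
        if dist == 0.0:
            continue  # co-located agents define no direction
        # Attract toward the neighbor, or repel with equal strength if too close.
        sign = 1.0 if dist > repel_distance else -1.0
        vx += sign * neighbor_weight * dx / dist
        vy += sign * neighbor_weight * dy / dist
    return vx, vy
```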
Since this research focuses on using the swarm to explore the area around a central point, it is important that the swarm stay centered. While the traditional swarm moves freely within a given space, individuals in an exploratory swarm must frequently return toward the origin. To achieve this behavior, each individual has a fixed acceleration toward the center that is factored into its movement vector, which prevents an individual or group of individuals from flying out of range. Finally, a strict limit is placed on how far each individual can move from the center. If it reaches that limit, an individual is programmed to "bounce back" toward the center: the movement vector is simply reversed if it would take the individual beyond the established boundary. Simulations in [6] implemented this algorithm with varying numbers of rovers. The simulated terrain was divided into a grid, and the number of visits to each section was recorded. As statistics would suggest, simulations with more agents produced a more normal distribution of visits. Figure 2 shows the graph for a population of 50 agents; the smooth Gaussian shape of the data is clearly visible.
Figure 2: A graph of the frequency with which each plot of territory is visited by 50 rovers over 50,000 observations.
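The centering and boundary rules described before Figure 2 can be sketched as a small post-processing step on the movement vector: add a constant pull toward the origin, then reverse the vector if the resulting step would leave the allowed radius. This is an illustrative sketch under assumed names and constants (center_acceleration, max_radius), not the simulator's actual code.

```python
import math

def apply_center_and_boundary(agent, vx, vy, center=(0.0, 0.0),
                              center_acceleration=20.0, max_radius=20000.0):
    """Add a fixed pull toward the center, then 'bounce back' at the boundary.

    Illustrative sketch: the constants here are assumptions, not values from [6].
    """
    # Fixed acceleration toward the center, applied along a unit direction.
    cx, cy = center[0] - agent[0], center[1] - agent[1]
    dist_to_center = math.hypot(cx, cy)
    if dist_to_center > 0.0:
        vx += center_acceleration * cx / dist_to_center
        vy += center_acceleration * cy / dist_to_center

    # If the step would carry the agent past the boundary, reverse it.
    new_x, new_y = agent[0] + vx, agent[1] + vy
    if math.hypot(new_x - center[0], new_y - center[1]) > max_radius:
        vx, vy = -vx, -vy
    return vx, vy
```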
Figure 2 records visits over 50,000 observations. An observation is recorded at each point visited by an individual. The simulation used to produce the graph in Figure 2 had each agent making 50,000 observations, and the simulator could produce data for that many observations in less than one minute. However, applying this technology to the control of mobile robots places it in a very different environment. Physical rovers cannot move as quickly as simulated agents, and this fact must be taken into account when describing the behavior they will exhibit. A more realistic number of observations is 1,400. This number comes from the very rough estimate that, over a ten-day data-collection mission, a rover could make an observation and relocate in about 10 minutes; ten days contain 14,400 minutes, or roughly 1,400 such observation-and-move cycles. Figure 3 illustrates the distribution of visits for the same 50 rovers making only 1,400 observations. While the distribution is not as smooth as what we see with 50,000 observations in Figure 2, the graph still clearly shows a Gaussian distribution. The roughness of this curve is simply an issue of the size of the data set. Figure 2 demonstrates that with 50 rovers a smooth distribution will eventually be achieved, so it can be inferred that the rough curve shown in Figure 3 is produced by the same Gaussian behavior of the agents.
Figure 3: A graph of the frequency with which each plot of territory is visited by rovers. This graph reflects a data set of 50 rovers taking 1,400 measurements.
Using only ten agents that each make 1,400 observations produces a rougher, but still normal, distribution of visits.

In the swarm algorithm, faults are handled in such a way that the swarm can still perform effectively. If an individual becomes disabled, either through system failure or by becoming stuck in the terrain, it is removed from the swarm. It stops transmitting its location, and as a result the other individuals in the swarm ignore it and no longer consider it a neighbor. This feature of the algorithm means that if an original swarm of 50 agents experiences even an 80% failure rate, the 10 remaining active agents will still be able to accomplish a survey of the territory, albeit a less thorough one. Those 10 agents behave as though the other 40 had never existed. With a large initial swarm, this allows incredibly high failure rates without loss of system functionality. With smaller initial swarms, the failure rate cannot be so high: it is not the failure rate itself that causes the breakdown of the system; it is the number of active rovers that determines whether or not the swarm behaves in a normal and predictable way. While a rough normal distribution is still achieved over 1,400 observations with 10 rovers, no such distribution is achieved with only 7. The distribution appears to lose its organized form when the number of agents falls to around 7 or 8. Unlike the roughness seen with only 10 agents, the disorganization seen with these smaller numbers is not resolved by longer simulations; more observations by each rover do not eventually produce a smooth, normal curve. This is because the forces that hold larger groups together weaken as the group shrinks. With fewer than eight individuals in a simulation, it is not uncommon for the entire population to move as a whole. With neighbors heading toward the edge of the territory, the draw toward the center can be overwhelmed, producing a larger number of visits to the far edges of the territory. The small population also prevents the neighbor randomization technique from creating a coherent group: since all eight members can be in close proximity to one another, randomization has little effect. Contrast this with a population of fifty individuals that, by size alone, must be spaced further apart, thereby increasing the efficacy of randomization.
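Fault handling of this kind amounts to filtering the neighbor pool before the nearest-neighbor computation: any agent that has stopped reporting its position simply never appears in another agent's neighbor list. A minimal sketch of that filtering, with hypothetical names and a hypothetical timeout rule, is shown below; it illustrates the idea rather than reproducing the simulator.

```python
def active_neighbors(agent_id, positions, last_report, now, timeout=3):
    """Return positions of agents still eligible to be neighbors.

    An agent that has not reported its location within `timeout` time steps
    is treated as disabled and ignored, so the survivors swarm as though it
    had never existed. Names and the timeout rule are illustrative assumptions.
    """
    return [
        pos for other_id, pos in positions.items()
        if other_id != agent_id and now - last_report[other_id] <= timeout
    ]
```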
3. EXPERIMENT AND RESULTS

The simple swarm algorithm is computationally quick and efficient, a significant win for systems whose computational power is dedicated to other purposes. Because performance is so important in exploratory missions, it is important to consider the optimal parameters for a swarm. Rough guesses were used to create the swarms described in the previous section. The goal of this paper is to apply a scientific technique to the problem of optimizing swarm parameters.

Brute force search is not an ideal way to find the optimal parameters in this case. This research is very approximate in determining parameters, looking for values within 10% of the optimal. Even with these limitations in place, the size of the search space is still about 2^40. With the time required for each simulation, an exhaustive search would take about 1,000 years of computation time. Although parallel processing and faster machines may reduce that to a more reasonable time frame, our hypothesis is that the genetic algorithm is an ideal method for quickly optimizing these parameters.

Five parameters were isolated for optimization:

1. Number of neighbors: ranging from 2, the minimum required to achieve swarming behavior, to n, where n is the number of rovers in the simulation.
2. Acceleration to center: how strongly each rover is drawn back toward the center of the area.
3. Attraction toward neighbors: how strongly each rover is pulled in the direction of the nearest-neighbors vector.
4. Randomization frequency: how often neighbors are randomized to prevent clustering.
5. Repelling distance: how close rovers can come to one another before they start to repel. This prevents collisions, and also prevents several rovers from investigating the same small area at the same time.

The number of rovers was not considered as a parameter. As the previous results described, the swarm maintains its integrity down to a population of approximately eight rovers. We vary the number of rovers in this experiment to see whether different optimal parameter values evolve for different population sizes, but we do not include it in the evolution.

3.1 The Simulation

This research started with the same simulation described in Section 2 of this paper. While previous research looked for a Gaussian distribution, that was not the purpose here. A normal distribution leads to fairly even coverage of territory. However, the goal of exploration is not necessarily a normally distributed coverage of the area; the normal distribution is used to ensure a high likelihood of finding interesting events. For a more terrestrial example, consider someone clearing a minefield. She may apply a fairly Gaussian distribution of attention to a given area, but her success is not determined by how normal that distribution is; it is determined by how many mines are cleared. Following from that, this research creates a set of ten unique points randomly scattered throughout the simulation space. The score for a simulation is a function of the number of events discovered and how often each event is observed. Each simulation runs for 1,400 observations per rover, and fitness is determined by the average score over 5 simulations.
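One way to picture this fitness evaluation is as a scoring pass over the recorded visits: scatter ten hidden event points in the space, run the swarm, and score a parameter set by how many events were discovered and how often they were observed, averaged over five runs. The sketch below assumes a hypothetical run_simulation function and a simple additive scoring rule, since the paper does not specify the exact score function.

```python
import random

def score_run(visited_cells, event_cells):
    """Score one simulation: one point per discovered event, plus a point for
    each repeat observation of an event. The exact weighting is an assumption;
    the paper states only that the score depends on the number of events
    discovered and how often each is observed."""
    discovered = {e for e in event_cells if visited_cells.get(e, 0) > 0}
    repeat_observations = sum(visited_cells.get(e, 0) - 1 for e in discovered)
    return len(discovered) + repeat_observations

def fitness(parameter_set, run_simulation, grid_size=(200, 200), runs=5):
    """Average score over several simulations, as in Section 3.1.

    run_simulation(parameter_set, events) is assumed to return a dict mapping
    grid cells to visit counts after 1,400 observations per rover.
    """
    total = 0.0
    for _ in range(runs):
        # Ten unique event points scattered randomly through the space.
        events = set()
        while len(events) < 10:
            events.add((random.randrange(grid_size[0]),
                        random.randrange(grid_size[1])))
        visited = run_simulation(parameter_set, events)
        total += score_run(visited, events)
    return total / runs
```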
3.2 The Genetic Algorithm

The evolutionary population was comprised of 20 genetic entities. Each was encoded as a chromosome, with each gene on the chromosome representing a different parameter of the swarm behavior. All but two of the individuals were randomly initialized; those two were initialized with chromosomes of all 0's and all 1's, respectively. This ensured that the extreme ends of each trait were present in the initial population.

After calculating fitness as described above, this study implements roulette wheel selection, also called stochastic sampling with replacement [3]. In this stochastic algorithm, the fitness of each individual is normalized. Based on their fitness, individuals are mapped to contiguous segments of a line, such that each individual's segment is equal in size to its normalized fitness. A random number is generated, and the individual whose segment spans that number is selected. The process is repeated until the required number of individuals is obtained. Table 1 shows a sample population, and Figure 4 shows the corresponding line; a sketch of the genetic operators follows Figure 4.

Individual          1     2     3     4     5     6     7     8     9     10
Fitness             27    22    18    15    13    12    9     8     4     3
Normalized Fitness  0.20  0.17  0.14  0.11  0.10  0.10  0.07  0.06  0.03  0.02
Table 1: Sample Population showing fitness and normalized fitness for each individual
Figure 4: Continuous line divided for the 10 individuals of the sample population. Random points generated along this line determine which individuals will be selected for reproduction.
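The following is a minimal sketch of one generation of this procedure, combining the roulette wheel selection just described with the bit-flip mutation and two-individual elitism described in the next paragraph. The 40-bit chromosome length, the omission of crossover details, and all names and constants are assumptions made for illustration, not the paper's implementation.

```python
import random

POPULATION_SIZE = 20
ELITE_COUNT = 2          # elitism: carry the top two individuals forward

def roulette_wheel_select(population, fitnesses, num_selections):
    """Stochastic sampling with replacement [3]: each individual occupies a
    segment of a unit line proportional to its normalized fitness, and a
    uniform random point picks the individual whose segment spans it."""
    total = sum(fitnesses)
    boundaries, running = [], 0.0
    for f in fitnesses:
        running += f / total
        boundaries.append(running)

    selected = []
    for _ in range(num_selections):
        r = random.random()
        for individual, upper in zip(population, boundaries):
            if r <= upper:
                selected.append(individual)
                break
        else:
            selected.append(population[-1])  # guard against rounding at 1.0
    return selected

def mutate(chromosome, expected_mutations_per_population=1.0):
    """Bit-flip mutation averaging about one mutated gene per population per
    generation; chromosomes are assumed to be lists of 0/1 ints."""
    rate = expected_mutations_per_population / (POPULATION_SIZE * len(chromosome))
    return [bit ^ 1 if random.random() < rate else bit for bit in chromosome]

def next_generation(population, fitnesses):
    """One generation: keep the two fittest individuals unchanged, fill the
    rest by roulette wheel selection and mutation. Crossover is omitted here
    for brevity; this is an illustrative sketch, not the original code."""
    ranked = sorted(zip(fitnesses, population), key=lambda p: p[0], reverse=True)
    elites = [ind for _, ind in ranked[:ELITE_COUNT]]
    parents = roulette_wheel_select(population, fitnesses,
                                    POPULATION_SIZE - ELITE_COUNT)
    return elites + [mutate(p) for p in parents]
```

With the sample fitnesses of Table 1, for instance, individual 1 occupies the first 20% of the line and would therefore be selected roughly a fifth of the time.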
After reproduction, mutation occurs with a small probability, averaging about one mutated gene per population per generation. This mutation takes the form of a bit flip in the chromosome. We also applied elitism, maintaining copies of the top two individuals in the next generation; both operators appear in the sketch following Figure 4.

3.3 Results

Many simulations were used to analyze the effectiveness of the genetic algorithm. To start, simulations used ten rovers, the lower bound for a population that will operate as a swarm.
Figure 5: Average Fitness of the populations at each generation
With the initial conditions described above and a population of 20 parameter sets, we see a significant increase in fitness over 25 generations of evolution. Figure 5 shows the average fitness at each generation. Clearly, there is great improvement over the course of the algorithm. While only a few consecutive generations show a statistically significant improvement between them, the overall results are dramatic: the improvement between the first and last generation is significant (by the standard t-test) with p < 10^-21.

In addition to tracking the average fitness of the population at each generation, the simulator also kept track of the best performing individuals overall. Those individuals far exceeded the performance of the rest of the population. While the upper end of the populations' average fitness per rover was around 50, these individuals achieved scores near or above 100. An examination of the features of these unique performers, in conjunction with analysis of the makeup of a fully evolved swarm, gives a strong indication of exactly which parameter values are optimal. It is worth noting that using the parameter values of these outstanding individuals as an initial population did not produce results that exceeded those achieved by a randomly initialized population. This indicates that the genetic algorithm is capable of introducing any features that are not present a priori and evolving them into the population.

For 10 rovers, parameter values were relatively consistent among the evolved populations and the outstanding individuals from each simulation. For each of the five parameters, the results were as follows:

• Acceleration toward the center: All but one of the outstanding individuals had a value of 20, and that value was also clearly dominant among the evolved populations. It represents about 0.1% of the distance across the territory.

• Number of neighbors: Again, with tremendous consistency, a value of 2 neighbors was observed. This is interesting because it is the simplest implementation of the swarm behavior. It can likely be attributed to the fact that any higher number makes it more likely for individuals to form a self-attracting clique that strays out of the larger swarm behavior.

• Weight of neighbors: All of our populations had a weight over 80%, and more than half had a weight of 100%. This indicates that the draw of neighbors must not be diminished. Where the value is too low, rovers are overwhelmed by the draw toward the center, and a large portion of the area remains unexplored.

• Neighbor randomization: The average and most common value was for randomization to occur at the 840th observation, meaning randomization occurs once, about halfway through the simulation. More frequent randomizations tend to disrupt the organization of the swarm, while values that are too large prevent randomization altogether, which tends to allow the formation of small cliques among the rovers.

• Repelling distance: A repelling distance of 32 was very common among all of our populations. With a space of 20,000 x 40,000 units, a distance of 32 means that rovers can come very close to one another, but still keep a separation of roughly 0.1% to 0.2% of the territory's dimensions between themselves.
Figure 6: Average score per rover per generation for swarms with 50 rovers
It is possible that swarm size affects these optimal parameters. Since the number of rovers is dictated by many factors (e.g., hardware cost, payload weight), it did not make sense to include it as an evolvable parameter. The initial size of 10 rovers approximates performance at the lowest point where swarm behavior is present. To test the impact of swarm size on the evolution of parameters, another set of simulations was run with a swarm size of 50. The improvement here, shown in Figure 6, is as clearly visible as it is for a population of 10 rovers, and the results are statistically quite significant. One difference, however, is that there is much greater variance in the actual values of the parameters than is present for 10 rovers. While for the smaller swarm size over half of the population (and often over 80%) shared a common value, there was no such majority for 50 rovers. Values are generally spread evenly within the middle 50% of their range, and there do not appear to be relationships in the data that would account for this variance. The exceptions are the randomization frequency and the number of neighbors. Again, most of the evolved populations and outstanding individuals showed 2 neighbors for the swarm. Randomization frequency also shared a common value with the swarm of size 10, with a majority of the results calling for randomization after about 840 observations. This seems to indicate that, regardless of swarm size, using only 2 neighbors produces the best behavior and randomization of neighbors should occur about once, halfway through the simulation.
4. CONCLUSIONS

From these data, we can clearly conclude that the genetic algorithm is an effective strategy for increasing swarm performance by optimizing parameters. The scores of an optimized swarm are significantly higher than those of both a swarm with random parameter values and a swarm using the well-thought-out parameter values of earlier research. Depending on the hardware available for a particular mission, this research suggests that scientists will see great improvement in swarm performance by first evolving parameters. For individual cases, factors like swarm size and number of observations per mission can be altered to find a relevant set of values.

There is certainly room to expand this work. A more thorough set of simulations for varied mission lengths (encoded by the number of observations in a simulation), and perhaps much larger swarm sizes, may lend greater insight into why each of the parameters consistently evolves to the same values.
5. ACKNOWLEDGEMENT

My grateful thanks to Irene Golbeck for her review and support of this and all of my research.
REFERENCES

[1] W. Agassounon, A. Martinoli, and R. M. Goodman, "A Scalable, Distributed Algorithm for Allocating Workers in Embedded Systems," Proc. of the IEEE Conf. on Systems, Man and Cybernetics (SMC-01), October 2001, pp. 3367-3373.

[2] K. Cheng, M. L. Spetch, and M. Johnston, "Spatial peak shift and generalization in pigeons," Journal of Experimental Psychology: Animal Behavior Processes, 23, 1997, pp. 469-481.

[3] J. E. Baker, "Reducing Bias and Inefficiency in the Selection Algorithm," Proceedings of the Second International Conference on Genetic Algorithms and their Application, Hillsdale, New Jersey, USA: Lawrence Erlbaum Associates, 1987, pp. 14-21.

[4] G. Beni and J. Wang, "Swarm Intelligence," Proc. of the Seventh Annual Meeting of the Robotics Society of Japan, 1989, pp. 425-428.

[5] R. A. Brooks and A. M. Flynn, "Fast, Cheap and Out of Control: A Robot Invasion of the Solar System," Journal of the British Interplanetary Society, October 1989, pp. 478-485.

[6] J. Golbeck, "Unintelligent Swarming for Robust Exploratory Systems," Proc. of the IASTED International Conference on Control and Applications, May 2002.