The Induction of Finite Transducers Using Genetic Programming
Amashini Naidoo1, Nelishia Pillay2
1 School of Computer Science, University of KwaZulu-Natal, Westville Campus, Westville, KwaZulu-Natal, South Africa
[email protected]
2 School of Computer Science, University of KwaZulu-Natal, Pietermaritzburg Campus, Pietermaritzburg, KwaZulu-Natal, South Africa
[email protected]
Abstract. This paper reports on the results of a preliminary study conducted to evaluate genetic programming (GP) as a means of evolving finite state transducers. A genetic programming system representing each individual as a directed graph was implemented to evolve Mealy machines. Tournament selection was used to choose parents for the next generation and the reproduction, mutation and crossover operators were applied to the selected parents to create the next generation. The system was tested on six standard Mealy machine problems. The GP system was able to successfully induce solutions to all six problems. Furthermore, the solutions evolved were human-competitive and in all cases the minimal transducer was evolved.
Keywords: genetic programming, finite state transducers.
1 Introduction
Finite state transducers are used in various domains of Computer Science such as natural language processing and image processing. A fair amount of research has investigated the evolution of finite state machines; however, almost all of these efforts have focused on inducing finite acceptors, and very little work has examined the effectiveness of evolutionary algorithms in generating finite transducers. The study presented in this paper is an initial attempt at assessing genetic programming (GP) for the purpose of inducing finite transducers, namely, Mealy machines. The following section provides an overview of finite state transducers. Section 3 presents a brief introduction to genetic programming and describes previous attempts at evolving transducers. A genetic programming system for evolving Mealy machines is proposed in section 4. Section 5 discusses the overall methodology employed to evaluate GP as a means of generating Mealy machines. The performance of this GP system on six standard finite transducer problems is analyzed in section 6. Finally, section 7 summarizes the findings of this study.
2 Finite Transducers
Finite transducers are finite state machines that produce an output string for a given input string. There are two types of transducers, namely, Mealy machines and Moore machines. Mealy machines are formally defined as: Me = (q0, Σ, Γ, Q, δ) where
q0 is the start state
Σ is the input string alphabet
Γ is the output string alphabet
Q is the set of states
δ: Q × Σ → Q × Γ defines the transitions, i.e. each state and input symbol determine the next state and the output symbol
A Mealy machine that takes a string consisting of a's and b's as input and outputs a binary string containing a 1 at the position of every substring "ab" in the original string is illustrated in Figure 2.1.
[Figure 2.1: state diagram with two states, A and B, and transitions labelled a/0, a/0, b/0 and b/1]
Figure 2.1. Example of a Mealy machine.
Notice that Mealy machines produce the output string as part of the transition. For example, if a b is read at state A, a 0 is output. Moore machines differ from Mealy machines in that the output is produced at the state. Both these machines are deterministic. Since algorithms exist for converting Mealy machines to Moore machines, only the induction of Mealy machines is examined in this study.
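To make the transition-based output concrete, the following is a minimal Python sketch of the "ab" detector described above. The transition table is an assumption reconstructed from the edge labels of Figure 2.1 and the surrounding description (state A is taken as the start state); the names `AB_DETECTOR` and `run_mealy` are illustrative only and do not come from the original system.

```python
from typing import Dict, Tuple

# (state, input symbol) -> (next state, output symbol)
Transitions = Dict[Tuple[str, str], Tuple[str, str]]

# Assumed reconstruction of Figure 2.1: state B means "the last symbol was an a".
AB_DETECTOR: Transitions = {
    ("A", "a"): ("B", "0"),
    ("A", "b"): ("A", "0"),
    ("B", "a"): ("B", "0"),
    ("B", "b"): ("A", "1"),  # an "ab" has just been completed
}


def run_mealy(transitions: Transitions, start: str, text: str) -> str:
    """Process text one symbol at a time, emitting one output symbol per transition."""
    state = start
    emitted = []
    for symbol in text:
        state, output = transitions[(state, symbol)]
        emitted.append(output)
    return "".join(emitted)


if __name__ == "__main__":
    print(run_mealy(AB_DETECTOR, "A", "aabab"))  # prints 00101
```

Because every transition emits exactly one symbol, the output string always has the same length as the input string.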
3 GP and Finite Transducers
Genetic programming (GP) is an evolutionary algorithm that represents elements of a population using variable-sized structures [1]. Examples of representations used by different genetic programming systems include parse trees, matrices and directed graphs. GP has been successfully employed in numerous domains and in a number of cases has produced human-competitive results. Research into the generation of finite automata using evolutionary algorithms was initiated as early as the 1960s with the study by Fogel et al. [2] investigating the evolution of deterministic finite automata (DFAs). Later attempts at inducing DFAs include the genetic algorithm implemented by Dupont [3]. This system takes a maximal canonical automaton for a set of positive sentences for the language as input.
Dunay et al. [4] employ a genetic programming system to generate S-expressions representing DFAs. Brave [5] takes a similar approach that uses genetic programming to evolve cellular automata encodings of the DFAs. More recent studies investigate the use of gene regulation [6] and a genetic programming system representing each DFA as a transition matrix [7] for the purposes of DFA evolution. While a lot of work has been directed at examining the generation of finite acceptors, not much research has addressed evolving finite transducers. The only study applying evolutionary algorithms to this domain is that conducted by Lucas [8] to induce chaining codes for binary images. Lucas [8] implements a random hill-climber to induce a Mealy machine for converting a 4-direction chain code for a binary image to an 8-direction chain code. Each potential solution is a table representing the transition matrix of the corresponding finite transducer. Each transducer has a maximum of ten states. One of three non-destructive mutation operators is applied at each stage of the search to further evolve the potential solution. The training set consisted of fifty pairs of input strings and their corresponding target output. A general solution was induced within fifty runs of the system. The solution contained five redundant states which, once removed, yielded a transducer that was equivalent to the target machine. The following section presents a genetic programming system that uses a direct representation for each finite transducer, namely a transition graph, to evolve Mealy machines.
4 Proposed GP System
This section proposes a genetic programming system for the evolution of Mealy machines. The generational control model is employed by the system. The number of elements of the population is kept fixed from one generation to the next. During each generation the reproduction, mutation and crossover operators are applied to parents that have been selected using tournament selection. These processes are described in more detail below.
4.1 Representation
Each element of the population is represented as illustrated in Figure 2.1. A direct representation is used for each Mealy machine, i.e. each element of the population is a directed graph with each node representing a state of the finite transducer and each edge representing a transition between states. An edge label specifies an input character and the corresponding output character. The Mealy machines are deterministic, hence each state has exactly one outgoing arc for each element of the input alphabet. Thus, the terminal set is comprised of elements of the input and output alphabets and function nodes represent states in the Mealy machine. During initial population generation each individual is created by randomly choosing source and destination states and input and output characters for the edges joining these nodes, until the maximum node limit per individual is reached.
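The representation above can be sketched in Python as a transition table keyed by state and input symbol. This is a simplified illustration, not the authors' construction procedure: it assumes the number of states is drawn uniformly between 1 and the node limit, that state 0 is the start state, and that every state receives one randomly directed arc per input symbol. The function name `random_mealy` is hypothetical.

```python
import random


def random_mealy(input_alphabet, output_alphabet, max_nodes, rng):
    """Create a random deterministic Mealy machine as a transition table.

    Keys are (state, input symbol); values are (next state, output symbol).
    States are numbered 0..n-1 and state 0 is assumed to be the start state.
    """
    n_states = rng.randint(1, max_nodes)  # assumption: size chosen uniformly
    return {
        (state, symbol): (rng.randrange(n_states), rng.choice(output_alphabet))
        for state in range(n_states)
        for symbol in input_alphabet
    }


if __name__ == "__main__":
    rng = random.Random(42)
    machine = random_mealy("ab", "01", max_nodes=6, rng=rng)
    for (state, symbol), (target, output) in sorted(machine.items()):
        print(f"{state} --{symbol}/{output}--> {target}")
```

Keying the table on (state, input symbol) enforces determinism by construction, since a dictionary cannot hold two arcs for the same pair.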
4.2 Interpretation, Fitness Evaluation and Selection
The fitness cases are essentially pairs of input strings and the corresponding output string that must be produced by the Mealy machine. The interpretation process takes an input string and the start state of the finite transducer as input. As each character of the input string is processed the corresponding transition of the transducer is applied and the output characters specified by each transition are concatenated to produce the output string. The fitness of an individual is the number of fitness cases for which it produces the correct output string. These fitness measures are used by the tournament selection method to choose the parents for the next generation.
4.3 Genetic Operators
The GP system applies the reproduction, mutation and crossover operators to the selected parents to create the next generation. The reproduction operator clones the chosen parent. Figure 4.3.1 provides an overview of the crossover process.
[Figure 4.3.1: crossover example with parents P1 and P2 and offspring O1 and O2. Step 1: randomly choose crossover points in both parents. Step 2: swap the subgraphs rooted at the crossover points and randomly allocate external edges to nodes in the new graph.]
Figure 4.3.1. Application of the crossover operator.
The crossover operator randomly selects crossover points in copies of both the selected parents. The subgraphs rooted at these points are swapped. Internal edges refer to those edges directed at nodes remaining in the parent while external edges refer to edges that were connected to nodes in the removed subgraph. In Figure 4.3.1 internal edges are illustrated by a solid line and external edges by a broken line. The external edges in both parents are randomly allocated to target nodes in the newly inserted subgraphs. The mutation operator replaces a randomly chosen subgraph in the selected parent with a newly generated subgraph. All external edges, i.e. edges that were previously connected to nodes in the removed subgraph, are randomly redirected to nodes in the new subgraph. This process is illustrated in Figure 4.3.2.
[Figure 4.3.2: mutation example on a parent P. Step 1: randomly select a mutation point in the parent. Step 2: remove the subgraph rooted at this point. Step 3: insert the newly created subgraph.]
Figure 4.3.2. Application of the mutation operator.
Genetic operators often produce offspring that have worse fitness than their parents. In such cases these operators impede the success of the GP algorithm in finding solutions. Thus, non-destructive versions of both the mutation and crossover operators, i.e. genetic operators that produce offspring that are at least as fit as their parents, have been implemented and are used if needed. In both cases the operation is repeated until an offspring at least as good as the parent (or one of the parents in the case of crossover) is produced. A limit is set on the number of improvement steps. If this limit is exceeded the offspring with worse fitness than its parent is accepted.
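The retry logic of a non-destructive operator can be sketched as a generic wrapper. This is a hedged illustration for a single-parent operator such as mutation; for crossover the comparison would be made against either parent. The names `non_destructive` and `max_attempts`, and the convention that an operator takes a parent and a random generator, are assumptions for the sketch.

```python
import random


def non_destructive(operator, parent, fitness, max_attempts, rng):
    """Apply operator repeatedly until the offspring is at least as fit as the parent.

    If max_attempts applications all yield worse offspring, the last
    (worse) offspring is accepted, as described in the text.
    """
    parent_fitness = fitness(parent)
    child = operator(parent, rng)
    for _ in range(max_attempts - 1):
        if fitness(child) >= parent_fitness:
            break
        child = operator(parent, rng)
    return child


if __name__ == "__main__":
    # Toy stand-in: individuals are integers and fitness is the value itself.
    rng = random.Random(0)
    perturb = lambda value, r: value + r.choice([-1, 1])
    print(non_destructive(perturb, 10, lambda v: v, max_attempts=20, rng=rng))
```

The attempt limit bounds the extra fitness evaluations each offspring can cost, which matters because every retry requires interpreting the candidate on all fitness cases.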
5 Experimental Methodology
The study reported in this paper is an initial attempt at evaluating genetic programming as a means of evolving finite transducers. Hence, the genetic programming system proposed in the previous section will be tested on the standard finite transducer benchmarks described in the general literature on theory of machines and formal languages such as [9]. These benchmarks are listed in Table 5.1. Based on the findings of this study, revisions will be made to the original system and further evaluations will be performed in a specific application domain such as natural language processing.
Table 5.1: Mealy machine data set (from [9] and [10]).
Machine | Description | Example
M1 | Mealy machine that outputs a 1 for each substring 'aaa', Σ = {a, b}. | Input: aaaaabaaa; Output: 001000001
M2 | Mealy machine that outputs a 1 for each substring 'aab', Σ = {a, b}. | Input: abaabaab; Output: 00001001
M3 | Mealy machine that takes a binary string in reverse order as input and outputs a binary string representing the number input incremented by 1. | Input: 001; Output: 101
M4 | Mealy machine that outputs a 1 at every double letter, Σ = {a, b}. | Input: baaabbabb; Output: 001101001
M5 | Mealy machine that takes a binary string as input and outputs the 1's complement of the string. | Input: 011010; Output: 100101
M6 | Mealy machine that takes a binary string as input and outputs a string consisting of E's and O's such that an E occurs at a position if the number of 1's read in so far is even and an O if it is odd. | Input: 11100101; Output: OEOOOEEO
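As a sanity check on the examples in Table 5.1, the sketch below encodes one hand-written machine per benchmark and replays the example inputs. These tables are hypothetical illustrations, not the solutions from [9], [10] or the GP system. Two readings are assumed: for M1, the example (five consecutive a's yield a single 1) suggests that matching restarts after each detected 'aaa'; for M3, the machine starts with a pending carry because the input is read least significant bit first.

```python
def run_mealy(transitions, start, text):
    """Feed text through a Mealy machine and return the emitted string."""
    state, emitted = start, []
    for symbol in text:
        state, output = transitions[(state, symbol)]
        emitted.append(output)
    return "".join(emitted)


# Hypothetical machines: (state, input) -> (next state, output); start state is 0.
M1 = {(0, "a"): (1, "0"), (0, "b"): (0, "0"),
      (1, "a"): (2, "0"), (1, "b"): (0, "0"),
      (2, "a"): (0, "1"), (2, "b"): (0, "0")}   # restart after each 'aaa'
M2 = {(0, "a"): (1, "0"), (0, "b"): (0, "0"),
      (1, "a"): (2, "0"), (1, "b"): (0, "0"),
      (2, "a"): (2, "0"), (2, "b"): (0, "1")}
M3 = {(0, "0"): (1, "1"), (0, "1"): (0, "0"),   # state 0: carry pending
      (1, "0"): (1, "0"), (1, "1"): (1, "1")}   # state 1: copy remaining bits
M4 = {(0, "a"): (1, "0"), (0, "b"): (2, "0"),
      (1, "a"): (1, "1"), (1, "b"): (2, "0"),   # state 1: last symbol was a
      (2, "a"): (1, "0"), (2, "b"): (2, "1")}   # state 2: last symbol was b
M5 = {(0, "0"): (0, "1"), (0, "1"): (0, "0")}
M6 = {(0, "0"): (0, "E"), (0, "1"): (1, "O"),   # state 0: even number of 1's
      (1, "0"): (1, "O"), (1, "1"): (0, "E")}   # state 1: odd number of 1's

# Example input/output pairs copied from Table 5.1.
EXAMPLES = {
    "M1": (M1, "aaaaabaaa", "001000001"),
    "M2": (M2, "abaabaab", "00001001"),
    "M3": (M3, "001", "101"),
    "M4": (M4, "baaabbabb", "001101001"),
    "M5": (M5, "011010", "100101"),
    "M6": (M6, "11100101", "OEOOOEEO"),
}


def check_examples():
    """Return, per machine, whether it reproduces its Table 5.1 example."""
    return {name: run_mealy(machine, 0, text) == expected
            for name, (machine, text, expected) in EXAMPLES.items()}


if __name__ == "__main__":
    print(check_examples())
```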
The system was implemented in Java (JDK 1.5.0_6) and all simulations were run on a Windows-based 1.86 GHz PC with 1GB of RAM. The random number generator used is that provided by the JDK library. The GP parameters used are tabulated in Table 5.2. Note that the application rates of the genetic operators were empirically derived. A different number of fitness cases was needed for each machine, ranging from a minimum of 20 to a maximum of 56 fitness cases. The next section discusses the performance of the proposed GP system when applied to the benchmarks in Table 5.1.
Table 5.2. GP Parameters
Parameter | Value
Population size | 2000
Selection method | Tournament selection
Tournament size | 5
Maximum number of nodes | 6
Maximum generations | 50
Crossover rate | 85%
Mutation rate | 5%
Reproduction rate | 10%
Fitness cases | Pairs of input and corresponding output strings.
Raw fitness | The number of correct output strings.
Termination criteria | A solution has been found or 50 generations are completed.
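The selection and fitness settings in Table 5.2 can be illustrated with a short Python sketch. It is written under stated assumptions rather than taken from the original Java system: tournament contenders are sampled without replacement, ties go to the first contender found, and one operator is drawn per offspring according to the tabulated rates. All function and constant names are hypothetical.

```python
import random

TOURNAMENT_SIZE = 5                                    # Table 5.2
CROSSOVER_RATE, MUTATION_RATE = 0.85, 0.05             # reproduction takes the remaining 10%


def run_mealy(transitions, start, text):
    """Feed text through a Mealy machine and return the emitted string."""
    state, emitted = start, []
    for symbol in text:
        state, output = transitions[(state, symbol)]
        emitted.append(output)
    return "".join(emitted)


def raw_fitness(machine, start, cases):
    """Count the fitness cases for which the whole output string is correct."""
    return sum(1 for text, expected in cases
               if run_mealy(machine, start, text) == expected)


def tournament_select(population, fitnesses, tournament_size, rng):
    """Return the fittest of tournament_size randomly drawn individuals."""
    contenders = rng.sample(range(len(population)), tournament_size)
    best = max(contenders, key=lambda index: fitnesses[index])
    return population[best]


def choose_operator(rng):
    """Pick a genetic operator according to the rates in Table 5.2."""
    draw = rng.random()
    if draw < CROSSOVER_RATE:
        return "crossover"
    if draw < CROSSOVER_RATE + MUTATION_RATE:
        return "mutation"
    return "reproduction"


if __name__ == "__main__":
    rng = random.Random(7)
    complement = {(0, "0"): (0, "1"), (0, "1"): (0, "0")}
    cases = [("011010", "100101"), ("1", "0")]
    print(raw_fitness(complement, 0, cases), choose_operator(rng))
```

Under the termination criteria in Table 5.2, a run would stop as soon as some individual's raw fitness equals the number of fitness cases, or after 50 generations.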
6 Results and Discussion
The genetic programming system proposed in section 4 was applied to the data set in Table 5.1. The system was able to generate general solutions for all six machines. Ten runs were performed for each machine. Each solution was evolved in under a minute. The only other study applying evolutionary algorithms to the induction of finite transducers is that conducted by Lucas [8]. However, the domain for which transducers have been evolved is different from those presented in this paper and thus a direct comparison of the results is not possible. Hence, the evolved solutions are compared to human-generated solutions. A solution for each of the machines and the corresponding "human generated" solutions are listed in Table 6.1. Note that for all six machines the evolved solutions are equivalent to the "human generated" solutions and for all of these machines the minimal transducer was evolved. Table 6.2 lists the success rates for both standard and non-destructive operators.
The standard genetic operators implemented generally produced fit individuals and produced a 100% success rate for all machines except M1. In the case of M1 the destructive effects of the genetic operators resulted in a lower success rate of 70%. The application of non-destructive operators for this machine produces a success rate of 100%.
Table 6.1. Mealy machine solutions.
[Table 6.1: state diagrams of the "Human Generated" Solution and the GP Generated Solution for each of the machines M1 to M6]
Table 6.2: Success Rates for Mealy Machine Simulations
Machine | Standard Operators | Non-destructive Operators
M1 | 70% | 100%
M2 | 100% | 100%
M3 | 100% | 100%
M4 | 100% | 100%
M5 | 100% | 100%
M6 | 100% | 100%
7 Conclusion
The main aim of the study presented in this paper is to assess the potential of genetic programming as a means of inducing finite transducers. A GP system, using directed graphs to represent transducers, was implemented and tested on six standard transducer problems. The results obtained are promising. The system was able to evolve solutions to all six problems. Furthermore, the solutions evolved were human-competitive and in all cases the minimal transducer was found. Thus, the main contribution of this study is the identification of a genetic programming system, using a simple direct representation of each transducer, as an effective methodology for generating finite transducers. Future extensions of this study will investigate applying the current system to specific applications in the domain of natural language processing.
Acknowledgments. The authors would like to thank the NRF (National Research Foundation) of South Africa for funding this project.
References
1. Koza, J.R.: Genetic Programming: On the Programming of Computers by Means of Natural Selection. MIT Press (1992).
2. Fogel, L.J., Owens, A.J., Walsh, M.J.: Artificial Intelligence Through Simulated Evolution. Wiley and Sons, New York (1966).
3. Dupont, P.: Regular Grammatical Inference from Positive and Negative Samples by Genetic Search: the GIG Method. In: Carrasco, R.C., Oncina, J. (eds.): Grammatical Inference and Applications (ICGI-94). Springer, Berlin Heidelberg (1994) 236-245.
4. Dunay, B.D., Petry, F.E., Buckles, B.P.: Regular Language Induction with Genetic Programming. In: Proceedings of the 1994 IEEE World Congress on Computational Intelligence, Orlando, Florida, USA. IEEE Press (1994) 396-400.
5. Brave, S.: Evolving Deterministic Finite Automata Using Cellular Encoding. In: Koza, J.R. et al. (eds.): Proceedings of the First Annual Conference on Genetic Programming (GP 96). MIT Press (1996) 39-44.
6. Luke, S., Hamahashi, S., Kitano, H.: "Genetic" Programming. In: Banzhaf, W., Daida, J., Eiben, A.E., Garzon, M.H., Honavar, V., Jakiela, M., Smith, R.E. (eds.): Proceedings of the Genetic Programming and Evolutionary Computation Conference, Orlando, Florida, USA, Vol. 2. (1999) 1098-1105.
7. Lucas, S.M., Reynolds, T.: Learning DFA: Evolution versus Evidence Driven State Merging. In: The Proceedings of the 2003 Congress on Evolutionary Computation (CEC 2003). IEEE Press (2003) 351-358.
8. Lucas, S.M.: Evolving Finite State Transducers: Some Initial Explorations. In: Genetic Programming: 6th European Conference, EuroGP 2003, Essex, UK, April 14-16, 2003, Lecture Notes in Computer Science, Vol. 2610. Springer (2003) 130-141.
9. Cohen, D.I.A.: Introduction to Computer Theory. John Wiley & Sons (1986).
10. Forcada, M.L.: Neural Networks: Automata and Formal Methods of Computation (January 2002). http://www.dlsi.ua.es/~mlf/nnafmc/pbook.pdf.