On Learning Spatio-Temporal Relational Structures in Two Different Domains

Adrian R. Pearce1, Terry Caelli1, and Simon Goss2

1 School of Computing, Curtin University, Perth WA 6845, Australia
2 Aeronautical and Maritime Research Laboratories, DSTO, Melbourne, Australia
Abstract. In this paper we consider the types of representations and learning procedures required to construct rules which can adequately describe relational information as it occurs in spatio-temporal sequences. We compare the interpretation of on-line hand drawings with the automatic generation of flight manoeuvre descriptions, both based on a relational learning system we have developed, the Consolidated Learning Algorithm based on Relational Evidence Theory (CLARET). The package adapts relational learning techniques to utilise the constraints present in time series data. Our approach supports queries, automatic descriptions and/or predictions from spatio-temporal action sequences.
1 Introduction
We describe a systematic approach for automatically generating human-readable descriptions (or prescriptions) from the relational structures present in spatio-temporal sequences. For example, in describing flight, an approach-to-land manoeuvre is defined by the subsequence of different roll–pitch–yaw states of the aeroplane and different actions on the control yoke. On-line hand-drawn schematic diagrams are defined by different strokes, drawn at different times at particular orientations.

Relational representation and the associated graph matching approach form a powerful architecture for interpreting visual and temporal information. In schematic hand-drawn domains it has been used for pattern recognition using graph matching and subgraph isomorphism [7]. We have developed a software package, the Consolidated Learning Algorithm (CLARET) [8], which is based on Relational Evidence Theory [3]. The package adapts relational learning techniques [9, 2] to utilise the constraints present in time series data: those of states and their continuous-valued, attributed relationships (the scenery) and actions or designs (the scenario). Relational rules are generated which explicitly depict actions and relationships between states, in the form

    WHILE interpreting (or intending to achieve) goal
    IF this state and that state have these relationships in space
    AND this action and that action have those relationships in time
    AND ...
    THEN describe (or prescribe) sub-goal at time t.

In the on-line hand-drawn schematic system based on CLARET [8], unconstrained schematic drawing is allowed; that is, symbols can be input with no constraints on rotation, scale, or positioning. A vocabulary of symbols is provided by pasting their images across the top of the application window. Handwritten "components" are defined using an on-line digitiser which produces time-stamped position and pressure points from pen-down to pen-up condition. Matched interpretations are projected back by calculating rotation, scale, and shift orientation parameters (see Figure 1).

Fig. 1. Left: Scientific Symbols Application: The application window is shown, comprising 128 scientific symbols. Symbols are drawn on the central canvas and matched interpretations are back-projected by calculating orientation parameters (shown here using vertical offset). The interpretations using CLARET are shown. Right: Rotation, scale, and shift invariance: Symbols are shown drawn at different orientations. Matched interpretations are projected back by calculating rotation, scale, and shift orientation parameters.

In the CLARET algorithm, an unknown segmented and labelled trajectory is presented to the system together with n known trajectories. First, relational descriptions are generated by extracting relationships between pattern segments. Pen trajectories are first segmented into line-based descriptions using a hierarchical multi-scale polygonal approximation method [11]. Features are then extracted from individual states in continuous numerical form (see Figure 2). Second, the patterns are matched using the attribute values and labelled parts from relational descriptions. This involves the repetition of the following steps: attribute generalisation, graph matching and relational specialisation.
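To make the segmentation step concrete, the following is a minimal sketch of reducing a pen trajectory to a line-based description. It does not reproduce the hierarchical multi-scale method of [11]; it substitutes the simpler, single-scale Ramer-Douglas-Peucker polygonal approximation as a stand-in. The function names, the tolerance parameter `tol` and the use of bare (x, y) points (ignoring time stamps and pressure) are assumptions made for illustration only.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]


def point_line_distance(pt: Point, a: Point, b: Point) -> float:
    """Perpendicular distance from pt to the line through a and b."""
    (x, y), (ax, ay), (bx, by) = pt, a, b
    dx, dy = bx - ax, by - ay
    norm = math.hypot(dx, dy)
    if norm == 0.0:
        # Degenerate chord: fall back to the distance to the single point a.
        return math.hypot(x - ax, y - ay)
    return abs(dy * (x - ax) - dx * (y - ay)) / norm


def polygonal_approximation(points: List[Point], tol: float) -> List[Point]:
    """Ramer-Douglas-Peucker stand-in (NOT the method of [11]).

    Returns the vertices of a polyline that stays within `tol` of the
    input trajectory; consecutive vertices define the line segments.
    """
    if len(points) < 3:
        return list(points)
    a, b = points[0], points[-1]
    # Interior point farthest from the chord a-b.
    idx = max(range(1, len(points) - 1),
              key=lambda i: point_line_distance(points[i], a, b))
    if point_line_distance(points[idx], a, b) <= tol:
        return [a, b]
    left = polygonal_approximation(points[:idx + 1], tol)
    right = polygonal_approximation(points[idx:], tol)
    return left[:-1] + right  # drop the duplicated split vertex


if __name__ == "__main__":
    stroke = [(0.0, 0.0), (1.0, 0.02), (2.0, 0.0), (2.0, 1.0), (2.0, 2.0)]
    print(polygonal_approximation(stroke, tol=0.1))
```

Each pair of consecutive vertices returned here would play the role of one pattern part (segment) in the relational description.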
[Figure 2 graphic: the current pattern with parts $p_1$, $p_2$, $p_3$ and relations $(\overrightarrow{p_1 p_2} : 135, 1)$, $(\overrightarrow{p_2 p_1} : 225, 1)$, $(\overrightarrow{p_2 p_3} : 270, 0.6)$, $(\overrightarrow{p_3 p_2} : 90, 1.7)$, $(\overrightarrow{p_1 p_3} : 45, 0.6)$, $(\overrightarrow{p_3 p_1} : 315, 1.7)$; attribute space $B_{IJ}$ plotted as angle (0 to 360) against length ratio (0.5 to 2.0).]

Fig. 2. Generating binary relations: For all existing (known) patterns, relations are generated which form directed, planar graph relational structures and form an attribute space $B_{IJ}$ (closed circles). The relations $(\overrightarrow{p_I p_J} : ANG, LEN)$ represent difference in angle (ANG) and length ratio (LEN). Relations are also generated for the current (unknown) pattern (open circles) and are included in attribute space $B_{IJ}$ (bottom right hand side).
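The relation generation summarised in Figure 2 can be sketched as follows. This is an illustration under stated assumptions, not the CLARET implementation: each part is taken to be a directed line segment, ANG is taken as the direction of $p_J$ minus the direction of $p_I$ (modulo 360 degrees), LEN as the length of $p_J$ divided by the length of $p_I$, and relations are generated for every ordered pair of parts rather than only for the pairs of a planar graph. The names `binary_relations`, `direction_deg` and `length` are hypothetical.

```python
import math
from itertools import permutations
from typing import Dict, Tuple

Point = Tuple[float, float]
Segment = Tuple[Point, Point]  # directed: (start, end)


def direction_deg(seg: Segment) -> float:
    """Direction of a segment in degrees, in [0, 360)."""
    (x0, y0), (x1, y1) = seg
    return math.degrees(math.atan2(y1 - y0, x1 - x0)) % 360.0


def length(seg: Segment) -> float:
    (x0, y0), (x1, y1) = seg
    return math.hypot(x1 - x0, y1 - y0)


def binary_relations(
    parts: Dict[str, Segment]
) -> Dict[Tuple[str, str], Tuple[float, float]]:
    """Map each ordered pair (p_I, p_J) to its attributes (ANG, LEN).

    Assumed conventions: ANG = direction(p_J) - direction(p_I) mod 360,
    LEN = |p_J| / |p_I|.
    """
    relations = {}
    for (name_i, seg_i), (name_j, seg_j) in permutations(parts.items(), 2):
        len_i = length(seg_i)
        if len_i == 0.0:
            raise ValueError(f"part {name_i} has zero length")
        ang = (direction_deg(seg_j) - direction_deg(seg_i)) % 360.0
        relations[(name_i, name_j)] = (ang, length(seg_j) / len_i)
    return relations


if __name__ == "__main__":
    parts = {"p1": ((0.0, 0.0), (1.0, 0.0)), "p2": ((1.0, 0.0), (1.0, 2.0))}
    for pair, attrs in binary_relations(parts).items():
        print(pair, attrs)
```

Under these conventions the two relations between any pair of parts have angles that sum to 360 and reciprocal length ratios, which matches the pattern of values listed for the current pattern in Figure 2.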
The method used for generalising over attributes is based on the attribute selection and splitting technique used in decision trees, notably C4.5 [10], except that the information metric is replaced by a variance criterion. Attributes are partitioned into regions, and each split results in two new regions defined by conjunctions of attribute bounds: rules ($r_i$); see Figure 3 (right hand side). A least general technique is used which approximates K-means splitting [5] and minimises the attribute range of each new rule by maximising the split between rules. Each partitioned rule is replaced by two new least general rules (see Figure 3 (top middle), where rule $r_0$ is replaced with rules $r_1$ and $r_2$). Such hashing of the observed relational attributes provides an initial hypothesis about the resolution of parts required to recognise and discriminate between the known patterns. The question can now be asked: are these rules specific enough to differentiate between the different relational structures present, and so to interpret the current pattern correctly?

Graph matching is used to solve this problem of mapping current (unknown) pattern parts to existing (known) pattern parts, by finding compatible sets of labelled rules. Inter-rule dependencies are determined for subgraphs of rules, Rulegraphs [3]. Rulegraphs are formed when labelled relations instantiate rules according to their attribute values. Two rules $r_i$ and $r_j$ are connected if there exist instantiations $r_i(p_I p_J)$ and $r_j(p_J p_K)$ such that part $p_J$ is shared. For example, relation $\overrightarrow{p_1 p_2}$ (Figure 2) instantiates rule $r_1$ to form $r_1(\overrightarrow{p_1 p_2})$ (Figure 3).

[Figure 3 graphic: panels "Rulegraph Interpretations", "Attribute Space $B_{IJ}$" and "Conditional Attribute Spaces $B_{JK}$", linked by the stages Attribute Generalisation, Graph Matching and Relational Specialisation. The part mappings narrow during the search from $q_1, q_2, q_3 \mapsto p_1 \vee p_2 \vee p_3 \vee p_4$, through $q_1 \mapsto p_1 \vee p_2$, $q_2 \mapsto p_1 \vee p_2$, $q_3 \mapsto p_3$, to $q_1 \mapsto p_1$, $q_2 \mapsto p_2$, $q_3 \mapsto p_3$.]

Fig. 3. The search for the interpretation of the example in Figure 2 is shown here, which involves successive applications of attribute generalisation, graph matching and relational specialisation (right hand side). A queue contains rulegraph interpretations of the current pattern in terms of all existing patterns. During the search, mappings between parts in the current pattern (open circles) are solved with respect to the existing patterns (closed circles); the shades of grey in the rules (vertices) correspond to the degree of specialisation (left hand side). A relational evidence network is used to prune the search space while guaranteeing a finite interpretation (see text for details).

Part mappings between current and existing patterns ($q_J \mapsto p_J$) are
solved by checking the existence of common parts in rules. This relies on checking the compatibility between rules based on the consistency of the mapping states in the current pattern with respect to the existing pattern. Rulegraph interpretations are built up by testing the compatibility of new (candidate) rules $r_c$ for possible addition to the set of already existing rules. Interpretations are made by mapping parts which instantiate rules for the current pattern with respect to each existing pattern. Initially, parts are mapped from current patterns to existing patterns for parts only in the root rule $r_0$. The mapping will initially be many to many, in the sense that rule $r_0$ is completely general (all nodes of the relational graph have the same 'colour') and has not yet been specialised, that is, partitioned by the attribute learner.

The purpose of relational specialisation is to generate conditional rules by adding literals to the current clause. This creates paths through the relational structures. The method is based on Conditional Rule Generation [1], which allows (intra-rule) dependencies between attribute states to be represented by creating paths through relational descriptions. Conditional attribute spaces $B_{JK}$ are formed by traversing paths via relations in attribute space $B_{IJ}$ to relations in attribute space $B_{JK}$. Relational specialisation forms conditional attribute spaces

$$B_{JK} = \{\, r_j(\overrightarrow{p_J p_K}) \mid r_i(\overrightarrow{p_I p_J}) \in B_{IJ} \,\}$$

for each of the rules $r_i \in B_{IJ}$. Connected paths are traversed from instantiations $r_i(\overrightarrow{p_I p_J})$ to relations $\overrightarrow{p_J p_K}$ (via part $p_J$). Labelled, first order rules are generated in the form

$$r_j(\overrightarrow{p_J p_K}, A_1, A_2, \ldots) \leftarrow r_i(\overrightarrow{p_I p_J}, A_1, A_2, \ldots),\; 0.25 \le A_1 \le 0.95,\; \ldots$$

where $p_I$ are part variables and $A_i$ are attributes. Note that label compatibility is represented via paths through relational structures of arbitrary arity, $p_I p_J \rightarrow p_J p_K \rightarrow \ldots$, which instantiate conditional rules $r_i, r_j, \ldots$ hierarchically.
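The three repeated steps can be illustrated on the six relations listed for the current pattern in Figure 2. The sketch below is a simplified reading of the text, not the CLARET code: the variance criterion is taken to be the summed squared deviation of a single attribute (ANG) on either side of a cut, the "least general" rules are taken to be the tightest bounds around each side, and the conditional attribute space follows the set definition of $B_{JK}$ literally (every relation leaving a shared part $p_J$ is included, even one returning to $p_I$). All function names are hypothetical.

```python
from typing import List, Sequence, Tuple

Relation = Tuple[str, str, float, float]  # (p_I, p_J, ANG, LEN)
Bounds = Tuple[float, float]              # closed interval on one attribute

# Relations of the current pattern as listed in Figure 2.
RELATIONS: List[Relation] = [
    ("p1", "p2", 135.0, 1.0), ("p2", "p1", 225.0, 1.0),
    ("p2", "p3", 270.0, 0.6), ("p3", "p2", 90.0, 1.7),
    ("p1", "p3", 45.0, 0.6), ("p3", "p1", 315.0, 1.7),
]


def sum_sq_dev(values: Sequence[float]) -> float:
    """Sum of squared deviations from the mean (0 for an empty side)."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def variance_split(values: Sequence[float]) -> Tuple[Bounds, Bounds]:
    """Attribute generalisation: cut where the within-side spread is smallest,
    then return the tightest (least general) bounds around each side."""
    ordered = sorted(values)
    if len(ordered) < 2:
        raise ValueError("need at least two values to split")
    cut = min(range(1, len(ordered)),
              key=lambda k: sum_sq_dev(ordered[:k]) + sum_sq_dev(ordered[k:]))
    left, right = ordered[:cut], ordered[cut:]
    return (left[0], left[-1]), (right[0], right[-1])


def instantiate(relations: List[Relation], bounds: Bounds,
                attr: int = 2) -> List[Relation]:
    """Labelled relations whose attribute value falls inside a rule's bounds."""
    lo, hi = bounds
    return [r for r in relations if lo <= r[attr] <= hi]


def rules_connected(inst_i: List[Relation], inst_j: List[Relation]) -> bool:
    """Rules are connected if some r_i(p_I p_J) and r_j(p_J p_K) share p_J."""
    return any(a[1] == b[0] for a in inst_i for b in inst_j)


def conditional_space(inst_i: List[Relation],
                      relations: List[Relation]) -> List[Relation]:
    """Relational specialisation: relations p_J -> p_K reached via a shared p_J."""
    shared = {r[1] for r in inst_i}
    return [r for r in relations if r[0] in shared]


if __name__ == "__main__":
    r1, r2 = variance_split([r[2] for r in RELATIONS])
    inst1, inst2 = instantiate(RELATIONS, r1), instantiate(RELATIONS, r2)
    print("r1 bounds:", r1, "instantiated by", [(a, b) for a, b, _, _ in inst1])
    print("r2 bounds:", r2, "instantiated by", [(a, b) for a, b, _, _ in inst2])
    print("connected:", rules_connected(inst1, inst2))
    print("B_JK for r1:", conditional_space(inst1, RELATIONS))
```

With these six angle values the lowest-spread cut falls between 135 and 225, so the relation $\overrightarrow{p_1 p_2}$ instantiates the first of the two rules. Splitting the relations reached through `conditional_space` again would give the next, more specialised level of rules.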
To determine the best interpretation an evidential measure is required. A relational evidence network is used to impose an ordering of interpretations during the search, which maximally prunes the search space and thus determines the best interpretation in optimal time. In order to represent uncertainty in the CLARET hierarchy of rules, a model must be used which captures the dependency between specific instantiations (labelled parts) and rules (attribute states). In CLARET there is a clear dependency of parts via the instantiation of conditional rules with labelled relations, and so a Bayesian network model can be formulated to include both attribute and part indexing.

Each pattern $\omega_p$ is from an existing (though updatable) closed world of patterns. Patterns give rise to relations $(\overrightarrow{p_I p_J} : A_1, \ldots, A_n)$, drawn from the set of all relations, with unique parts $p_I$ and $p_J$ and attribute values $A_1, \ldots, A_n$. Attribute values are partitioned into rules $r_i$, which represent all possible attribute states. This allows for hierarchical modelling of such processes by

$$p(\omega_p \mid \cdot) = \sum_{i,j} \sum_{I,J,K} \prod_{I,J,K} p\big(r_j(\overrightarrow{p_J p_K}) \mid \overrightarrow{p_I p_J}, \omega_p\big)\, p\big(\overrightarrow{p_I p_J} \mid r_i(\overrightarrow{p_I p_J}), \omega_p\big) \qquad (1)$$

for relations $\overrightarrow{p_I p_J}$, $\overrightarrow{p_J p_K}$ and where rules $r_i$ and $r_j$ are from conditional attribute spaces $r_i \in B_{IJ}$ and $r_j \in B_{JK}$. The CLARET algorithm is designed to maximise $p(\omega_p \mid \cdot)$ in Equation 1 while minimising the right hand side with respect to least generalisation of attributes. The formulation relates to the class of exact solutions to Bayesian Networks [6], allowing for analytical determination of probabilities during learning. The relational evidence theory techniques have been shown to compare favourably in a number of empirical comparisons with other learning methods, such as Neural Networks, for three dimensional object recognition [3] and schematic interpretation [8].
2 Recognising Flight Manoeuvres
Over the past few years a new class of flight simulator has been developed which can record not only the actions of pilots and the instrument and simulation internal status variables, but also the time-stamped positions of objects and dynamic entities in the three dimensional flight course relative to the pilot. In our on-going work, the "behavioural clones" approach of Sammut and co-workers is extended to include dynamic knowledge of the world [12]. In addition to the status of navigation instruments and past actions, pilots use knowledge of the world, both in own-ship (egocentric or view-dependent) and map-view (exocentric or view-independent) representations. Such information is critical in control and trajectory planning in the visual flight regime.

In the agent oriented Smart Whole Air Mission Model (SWARM) decision support system, the binding of the physical process of flight (responses) with the symbolic process of tactics selection (plans) is required. This system is based on the procedural reasoning system architecture [4]. During training of the flight simulation system, the pilot selects a particular manoeuvre from an ontology of different manoeuvres and records the beginning and end of each manoeuvre and event. The aim here is to predict pilot actions in a given time interval from pilot actions, instrument settings, object co-ordinates and near-object characteristics at previous time intervals.

Figure 4 shows the process of learning action in sequences of input trajectory using the CLARET learning algorithm. In recognition mode, the system dynamically binds to different manoeuvres as they occur in the input time series on-line. The system can be used interactively either to query partially enacted sequences in predictive mode or to describe sequences in descriptive mode.

Descriptions of manoeuvres are generated which define action sequences of the form

    WHILE in the context of LEVEL-LEFT-TURN
    IF STICK-LEFT before INIT-BANK
    AND MOUNTAIN-IN-VIEW before STICK-RIGHT
    AND ...
    THEN believe intention is to TURN-TO-CROSSWIND-LEG.
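As an illustration of how a description of this form could be checked against a time-stamped event stream, the sketch below encodes the example rule as data. It is a hypothetical encoding, not the representation used in CLARET or SWARM: "a before b" is assumed to mean that the first occurrence of a precedes the first occurrence of b, only the two conditions stated in the example are encoded (the elided conditions remain elided), and the names `ManoeuvreRule`, `first_occurrence` and `rule_fires` are invented for this example.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

Event = Tuple[float, str]  # (time stamp, event label)


@dataclass(frozen=True)
class ManoeuvreRule:
    context: str                         # WHILE in the context of ...
    before: Tuple[Tuple[str, str], ...]  # IF a before b AND ...
    intention: str                       # THEN believe intention is to ...


EXAMPLE_RULE = ManoeuvreRule(
    context="LEVEL-LEFT-TURN",
    before=(("STICK-LEFT", "INIT-BANK"), ("MOUNTAIN-IN-VIEW", "STICK-RIGHT")),
    intention="TURN-TO-CROSSWIND-LEG",
)


def first_occurrence(events: List[Event]) -> Dict[str, float]:
    """Earliest time stamp at which each event label was observed."""
    times: Dict[str, float] = {}
    for t, label in events:
        if label not in times or t < times[label]:
            times[label] = t
    return times


def rule_fires(rule: ManoeuvreRule, context: str, events: List[Event]) -> bool:
    """True if the context matches and every 'before' condition holds.

    Assumption: 'a before b' compares first occurrences only.
    """
    if context != rule.context:
        return False
    times = first_occurrence(events)
    return all(a in times and b in times and times[a] < times[b]
               for a, b in rule.before)


if __name__ == "__main__":
    observed = [(0.0, "STICK-LEFT"), (1.2, "INIT-BANK"),
                (2.0, "MOUNTAIN-IN-VIEW"), (3.5, "STICK-RIGHT")]
    if rule_fires(EXAMPLE_RULE, "LEVEL-LEFT-TURN", observed):
        print("believe intention is to", EXAMPLE_RULE.intention)
```

In predictive mode a partially enacted sequence would satisfy only some of the conditions; a fuller treatment would rank candidate rules by evidence rather than return a single truth value.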
[Figure 4 graphic: processing pipeline with components Trajectory Segmenter, Attribute Generalisation, Relational Specialisation, Graph Matching, State Interpreter, Descriptive Interpreter and Prescriptive Rules (rules $r_1$, $r_2$), together with a Relational Structure over numbered trajectory segments plotted on Roll, Pitch and Yaw axes.]

Fig. 4. Spatio-temporal Interpretation using the Consolidated Learning Algorithm (CLARET): Input flight trajectory (numeric roll–pitch–yaw and object relationships) from the flight simulator is collected and segmented into line-based descriptions. Similar segmentation and relational attribute extraction is carried out on the time-series trajectories obtained from each separate domain, e.g. controls, plane and objects. Successive applications of the matching algorithm result in both scene interpretations (states) and hierarchical scenario interpretations (actions). Hierarchical interpretation involves the successive application of attribute generalisation (numerical learning), graph matching and relational specialisation.
These behavioural models "cloned" from examples of pilot performance have significance in the provision of agency (artificial players) in simulation. The use of simulators as knowledge engineering tools for the construction of pilot models for operational research and modelling is also possible.

3 Discussion
Approaches to spatio-temporal interpretation have been described for two different domains: on-line schematic interpretation and automatic description generation for flight manoeuvres. Both applications rely on the same framework of trajectory segmentation followed by attribute generalisation, relational specialisation and graph matching stages. These relational techniques complement the planning and query tools currently available in computer vision systems by exploring new ways of representing and generalising over temporal information.

References

1. Bischof, W. F., Caelli, T.: Learning Structural Descriptions of Patterns: A New Technique for Conditional Clustering and Rule Generation. Pattern Recognition 27 (1994) 689–97
2. Bratko, I., Muggleton, S.: Applications of inductive logic programming. Communications of the ACM 38 (1995) 65–70
3. Pearce, A. R., Caelli, T., Bischof, W. F.: Rulegraphs for Graph Matching in Pattern Recognition. Pattern Recognition 27 (1994) 1231–47
4. Ingrand, F. F., Georgeff, M. P.: An Architecture for Real-Time Reasoning and System Control. IEEE Expert, December (1992) 34–44
5. Jain, A. K., Dubes, R. C.: Algorithms for Clustering Data. Prentice Hall, Englewood Cliffs (1988)
6. Lauritzen, S. L.: Graphical Models. Clarendon Press, Oxford (1996)
7. Messmer, B. T., Bunke, H.: Automatic Learning and Recognition of Graphical Symbols in Engineering Drawings. Springer Verlag Lecture Notes in Computer Science 1072, May (1996) 123–34
8. Pearce, A., Caelli, T., Bischof, W.: CLARET: A new Relational Learning Algorithm for Interpretation in Spatial Domains. Proceedings of the Fourth International Conference on Control, Automation, Robotics and Vision (ICARV'96), Singapore, December (1996) 650–654
9. Quinlan, J. R.: Learning logical definitions from relations. Machine Learning 5 (1990) 239–66
10. Quinlan, J. R.: C4.5: Programs for Machine Learning. Morgan Kaufmann (1993)
11. Rocha, J., Pavlidis, T.: A shape analysis model with applications to a character recognition system. IEEE PAMI 16:4 (1994) 393–404
12. Shirazi, G. M., Sammut, C.: An Interactive Method for Learning to Control Dynamic Systems. Proceedings of the Knowledge Acquisition Workshop 1996 (KA96), Sydney, Australia (1996) 60–76