51st IEEE Conference on Decision and Control December 10-13, 2012. Maui, Hawaii, USA
Path Planning for Optimal Classification
M. Faied, P. Kabamba, B. Hyun, and A. Girard
Abstract— As stated in the Office of the Secretary of Defense's Unmanned Aircraft Systems Roadmap 2005-2030, reconnaissance is the number one priority mission for Unmanned Air Vehicles (UAVs) of all sizes. During reconnaissance missions, classification of objects of interest (e.g., as friend or foe) is key to mission performance. Classification is based on information collection, and it has generally been assumed that the more information collected, the better the classification decision. Although this is a correct general trend, a recent study has shown that it does not hold in all cases. This paper presents methods to plan paths for unmanned vehicles that optimize classification decisions, as opposed to the amount of information collected. We consider an unmanned vehicle (agent) classifying an object of interest in a given area. The agent plans its path to collect the information most relevant to optimizing its classification performance, based on the maximum likelihood ratio. In addition, a classification performance measure for multiple measurements is analytically derived.
The research was supported in part by the United States Air Force grant FA 8650-07-2-3744. The authors are with the Department of Aerospace Engineering, University of Michigan, Ann Arbor, MI 48109, USA. {mfaieda, kabamba, bhyun, anouck}@umich.edu

I. INTRODUCTION

A. Overview

Today, autonomous vehicles (agents) are often employed to explore an area and investigate objects of interest located within that area. The Air Force, Navy, and NASA all utilize autonomous vehicles to collect information [5], [20], [21], and their specific roles vary from application to application. For instance, in the Air Force's intelligence, surveillance and reconnaissance (ISR) missions, one goal is to collect enough information to determine whether an object of interest is friendly or threatening. Other examples include the Navy's underwater mine-finding missions, where an unmanned underwater vehicle searches for underwater mines [5], and NASA's Mars exploration missions, where unmanned ground vehicles are sent to investigate environmental conditions on Mars [21].

Although the benefits of employing unmanned vehicles in remote areas are obvious, their utilization is often restricted by limited time and resources. Therefore, to maximize vehicle utility, vehicle trajectories must be planned to optimize mission objectives. Additionally, for successful mission completion, it is crucial that the path planning strategy account for information accumulation from the early design phase. In [14], an information-based formulation for time-optimal cooperative exploration scenarios is presented. To reduce the uncertainty in the sensor measurement, a Kalman filter is
employed, where the states include the object position, visibility, and status. Moreover, a connection between Kalman filters and Shannon's equation is shown analytically through the use of a range-dependent covariance. Reference [11] presents an experimental implementation of an information-based path planning algorithm using a three-wheeled ground robot.

These information-based formulations for path planning are used even when the objective is to classify the objects of interest. In such missions, it is believed that collecting sufficient information is a necessary condition for good classification [7]. Thus, the objective of the mission is often to gather as much information as possible so that the classification is correct. Although it is a widely accepted view that more information implies better classification performance, there has been little work on formally proving this. Our previous work [10] shows that increasing the amount of information, in the sense of Shannon, does not always improve classification performance when classification is made by the likelihood-ratio rule and the classification performance is the probability of misclassification.

In this paper, a path planning formulation for optimal classification is presented. In this scenario, a mobile agent carries a gimbaled sensor, examines an unidentified object within a given search area, collects information using its sensor, and utilizes that information to classify the object. The agent plans its path in a manner that maximizes the probability of correct classification.

B. Literature Review

A large body of research has been published in recent years on motion control and collaborative control of networked autonomous vehicles. Although an exhaustive overview of the state of the art is beyond the scope of this paper, a brief review of the most relevant literature follows. Many methods exist for solving the basic trajectory planning problem [16]. Despite many apparent differences, the methods are based on a few general approaches: roadmap [16], [23], [18], cell decomposition [22], [23], potential field [15], [1], and probabilistic [13]. Each trajectory planning problem is phrased in terms of optimizing some performance index. Information-based exploration has been discussed in a number of papers in recent years, most notably in Reference [9]. Other methods have used information collection to conduct area searches [6], decentralized sensor control [4], and optimal sensor placement [19]. Resource allocation for cooperative classification, on the other hand, has been discussed in [3]. That paper focuses on the cooperative use of multiple vehicles to maximize the
probability of correct target classification. A hierarchical distributed decision system is presented that has three levels of decomposition: the top level performs task assignment using a market-based bidding scheme; the middle, sub-team level coordinates cooperative tasks; and the lower level executes the elementary tasks (path planning). This combines statistical reasoning from two orthogonal views. In [24], a stochastic projection method incorporating the statistics of navigation dynamics and target motion is developed to project the estimated pixel location of a feature with an arbitrary level of confidence.

Classification is of crucial importance because the agent will make decisions based upon the classification result. These decisions directly contribute to generating proper mission plans and evaluating mission performance. The current literature addresses classification from the information point of view and discusses various methods of planning optimal paths for information-based exploration. To the best of the authors' knowledge, there is no work on optimal paths for classification. Based upon the result in [10] showing the noncongruence between classification and information, we propose a path planning method for optimal classification. The problem is challenging because it is a dynamic path planning problem, the mission time is often limited, the agent is only provided with partial a priori information, and the amount of information that the agent can collect depends on the position of the agent relative to the object.

C. Original Contributions

This paper presents an integrated model of an agent, its kinematics, and its information collection in terms of measurements, applicable to classification performance, with the following original features. First, the sensor's information about the object of interest is specified using the Johnson model [17], [12] and the Object Information Signature (OIS). Second, the Johnson model depends on the range from the agent to the object of interest. Third, the OIS changes with the relative azimuth between the agent and the object of interest. Based on a generic integrated system model, the problem of optimal classification for an autonomous agent is formulated as an optimal path planning problem where the states are the Cartesian coordinates of the agent, the control input is the time history of the agent heading angle, the objective function is the probability of misclassification for multiple measurements, and the boundary conditions are subject to inequality constraints that reflect the requirements of the model behavior. The present paper studies this optimization problem and provides the following original contributions:
1) A path planning problem for optimal classification is solved.
2) An information collection model, in terms of the Johnson model and the Object Information Signature, is presented.
3) A classification performance measure for multiple measurements is analytically derived.
D. Paper Outline

The remainder of the paper is organized as follows. In Section II, the integrated model is presented. In Section III, the optimization problem is formulated to minimize the probability of misclassification for multiple measurements. In Section IV, the problem solution is presented, and in Section V, simulation results are illustrated through an example. Conclusions and future work are discussed in Section VI.

II. MODELING

In this section, we present the model used throughout the paper. The model consists of five subsystems: agent kinematics, sensor, objects of interest, agent information collection, and agent decision. We present each of these subsystems in detail, along with their related variables, and outline their interaction.

A. Agent Kinematics

In this work, the following discrete-time unicycle model is used:

x(t + 1) = x(t) + l cos φ(t),
y(t + 1) = y(t) + l sin φ(t),   (1)

where 1 ≤ t ≤ T; (x(t), y(t)) are the Cartesian coordinates of the agent at step t; l is a constant displacement; φ(t) is the heading of the agent at step t; and T is the final time step. We assume that the agent moves at a predefined constant speed and has the capability of choosing the time history of the heading angle φ.

B. Sensor

The mobile agent carries a gimbaled sensor that is characterized by a performance prediction model (the Johnson model [17], [12]). This model provides the probability of correct classification of an object as a function of resolution and of the range from the sensor to the object. Specifically, the probability of correct classification is

P(N) = (N/N50)^E / [1 + (N/N50)^E],   (2)

E = 2.7 + 0.7 (N/N50),   (3)

where N is the number of so-called resolvable cycles across the object (see [12]), calculated as

N(r) = ρ d / r,   (4)

where ρ is the maximum resolvable spatial frequency, d is the characteristic object dimension, r is the range from the agent to the object, and N50 is the number of cycles required for a 50% probability of correct classification. The parameter N50 reflects the difficulty of finding an object in various levels of clutter. Guidelines that relate various values of N50 to a particular task are obtained by fitting field test data [8].
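To make the sensor model concrete, the following is a minimal Python sketch, not part of the original paper, that propagates the unicycle model (1) for a given heading history and evaluates the Johnson model (2)-(4) at each step. The numerical values (agent start, object location, step length, headings) are taken from the simulation example in Section V; the helper names are ours.

```python
import numpy as np

def propagate(x0, y0, headings, l=0.5):
    """Discrete-time unicycle model, Eq. (1): one step of length l per heading."""
    xs, ys = [x0], [y0]
    for phi in headings:
        xs.append(xs[-1] + l * np.cos(phi))
        ys.append(ys[-1] + l * np.sin(phi))
    return np.array(xs), np.array(ys)

def p_correct(r, rho=1.0, d=7.0, n50=0.9):
    """Johnson model, Eqs. (2)-(4): probability of correct classification at range r."""
    n = rho * d / r                                   # resolvable cycles across the object, Eq. (4)
    e = 2.7 + 0.7 * (n / n50)                         # Eq. (3)
    return (n / n50) ** e / (1.0 + (n / n50) ** e)    # Eq. (2)

# Agent starts at (10.5, 10.5); object at (10.2, 11); values from Section V.
xs, ys = propagate(10.5, 10.5, headings=[0.3927, 0.4488, 0.3491, 5.026])
ranges = np.hypot(xs - 10.2, ys - 11.0)
print(p_correct(ranges))
```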
C. Objects of Interest

The object of interest is characterized by the 4-tuple (xj, yj, θ, X): its Cartesian coordinates (xj, yj) ∈ [0, Xarea] × [0, Yarea], its forward direction θ ∈ [0, 2π], and its category

X ∈ {T, F}.   (5)

All of these variables are known except for the category X. The probability that X takes a specific realization is

P(X = T) = u,   P(X = F) = 1 − u,   (6)

where u ∈ [0, 1]. We refer to u as the a priori probability in Bayes' sense [2]; it represents the proportion of T objects among the objects of interest.

Fig. 1. Relative angles between the agent at (x, y) and the object of interest at (xj, yj).
D. Agent Information Collection Model

The classification decision is based on a number of measurements taken around the object. These measurements are generally obtained through an imperfect sensor that introduces randomness and uncertainty. Let a(k) ∈ A be a discrete random variable from the set of measurements A = {a(1), a(2), · · · , a(k), · · · , a(K)} that denotes a measurement of the object property taken by the sensor on the agent; it maps (range r, relative azimuth ψ, random variable n) into {T, F}, that is,

a(k) : (r, ψ, n(k)) → {T, F}, 1 ≤ k ≤ K,   (7)
where
• r is the range from the sensor carried by the agent to the object at time step t,

r = √((x(t) − xj)^2 + (y(t) − yj)^2),   (8)

• ψ is the relative azimuth of the object of interest with respect to the agent at time step t,

ψ = θs(x(t), y(t), xj, yj) − θ,   (9)

where θs is the angle between the reference direction and the line of sight from the agent to the object of interest, and θ is the angle between the reference direction and the forward direction of the object of interest, as in Figure 1,
• n(k) accounts for the randomness in the measurements for fixed agent and object positions,
• T represents the sensor measuring a property from a True object,
• F represents the sensor measuring a property from a False object, and
• K is the maximum number of measurements.

The boundary between the T and F measurement classes may not be clear-cut. The incidence of a(k) = T and a(k) = F depends not only on the range but also on the object property X and on the current azimuth. So the major parameters that affect P(a(k) = T | X) and P(a(k) = F | X) must include the range r and the relative azimuth ψ.

For the object, we define the Object Information Signature (OIS), which is a measure of how correctly the object can be classified from the corresponding relative azimuth ψ. The OIS can be represented by a vector diagram where the magnitude of the vector represents the OIS content in the direction of the vector. A larger OIS indicates that an object is more likely to be classified correctly. For simplicity, we assume that there are only two distinct OIS profiles, related to the two object categories,

α ∈ {αT, αF}.   (10)

The object in this work is non-isotropic and does not move. By non-isotropic we mean that the OIS is not uniform in all azimuths. For example, assume we have a top view of a military tank, as shown in Figure 2, that can be correctly recognized from directions 1 and 3. Its OIS is therefore higher in those directions, that is,

α(ψ) = 0.75,   45° ≤ ψ < 135°,   (11a)
α(ψ) = 0.75,   225° ≤ ψ < 315°,   (11b)
α(ψ) = 0.55,   −45° ≤ ψ < 45°,   (11c)
α(ψ) = 0.25,   135° ≤ ψ < 225°.   (11d)

Fig. 2. Example of different OIS values for a military tank (viewing directions 1-4).

The OIS instance assumed in the previous equation reflects that the agent's view from directions 1 and 3, as in Figure 3(a), increases its ability to classify the tank compared to its view from directions 2 and 4, as in Figure 3(b).
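The piecewise-constant tank OIS of (11) can be encoded as a simple lookup on the relative azimuth. The sketch below is illustrative only; the wrapping of ψ into [−45°, 315°) is our assumption, made so that every angle falls into one of the four intervals of (11).

```python
def ois_tank(psi_deg):
    """OIS profile alpha(psi) of Eq. (11) for the tank example (psi in degrees)."""
    psi = (psi_deg + 45.0) % 360.0 - 45.0   # wrap into [-45, 315), an assumed convention
    if 45.0 <= psi < 135.0:
        return 0.75      # direction 1, Eq. (11a)
    if 225.0 <= psi < 315.0:
        return 0.75      # direction 3, Eq. (11b)
    if -45.0 <= psi < 45.0:
        return 0.55      # Eq. (11c)
    return 0.25          # 135 <= psi < 225, Eq. (11d)

print([ois_tank(p) for p in (0, 90, 180, 270)])   # [0.55, 0.75, 0.25, 0.75]
```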
Fig. 3. Different views of the tank related to different azimuths: (a) agent view when looking from direction 1 or 3; (b) agent view when looking from direction 2 or 4.
Based on the OIS profile for the object of interest, we can formulate the agent information collection model in terms of P(N) and the OIS as follows:

P(a(k) = T | X = T) = α(ψ) ∗ P(N),   (12a)
P(a(k) = F | X = T) = [1 − α(ψ)] ∗ P(N)   (12b)
                    = β(ψ) ∗ P(N),   (12c)
P(a(k) = T | X = F) = 1 − γ(ψ) ∗ P(N)   (12d)
                    = η(ψ) ∗ P(N),   (12e)
P(a(k) = F | X = F) = γ(ψ) ∗ P(N).   (12f)
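As a reading aid, the following sketch evaluates the conditional probabilities (12) from the OIS values at the current azimuth and the Johnson-model probability P(N). The function name and the example arguments are ours; only the algebra of (12) comes from the paper, and it is reproduced as written.

```python
def measurement_likelihoods(alpha_psi, gamma_psi, p_n):
    """Conditional probabilities of Eq. (12), as written in the paper.

    alpha_psi: OIS value alpha(psi) of a True object at the current azimuth
    gamma_psi: OIS value gamma(psi) of a False object at the current azimuth
    p_n:       Johnson-model probability of correct classification, Eq. (2)
    Keys are (measurement a(k), object category X).
    """
    return {
        ("T", "T"): alpha_psi * p_n,            # Eq. (12a)
        ("F", "T"): (1 - alpha_psi) * p_n,      # Eqs. (12b)-(12c), beta(psi) * P(N)
        ("T", "F"): 1 - gamma_psi * p_n,        # Eqs. (12d)-(12e), eta(psi) * P(N)
        ("F", "F"): gamma_psi * p_n,            # Eq. (12f)
    }

likes = measurement_likelihoods(alpha_psi=0.75, gamma_psi=0.6, p_n=0.9)  # illustrative values
```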
E. Agent Decision Model

The agent collects a sequence of measurements A = (a(1), a(2), · · · , a(k), · · · , a(K)); based on these, the posterior probability of a hypothesis (the object being a target or not) is calculated. The agent then classifies the object based on the maximum likelihood ratio, i.e.,

Os = T if P(X = T | A)/P(X = F | A) > 1,   Os = F if P(X = T | A)/P(X = F | A) ≤ 1.   (13)

The analysis of the posterior probability and the classification decision is discussed in detail in the problem solution section. The classification performance is defined by the probability of misclassification for the object, where misclassification means a classification opposite to the object category. Hence, the probability of misclassification is

Pm = P(Os = T ∧ X = F) + P(Os = F ∧ X = T).   (14)
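The decision rule (13) reduces to a single comparison once the aggregated posteriors are available (their computation is given in Section IV). A minimal sketch, with names of our choosing:

```python
def classify(posterior_T, posterior_F):
    """Maximum-likelihood-ratio rule, Eq. (13): return the decision Os."""
    return "T" if posterior_T / posterior_F > 1.0 else "F"
```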
F. Model Interactions

In the following, we show the dependency between the probability of misclassification and the time history of the agent's heading angle φ. As illustrated in Figure 4, the heading angle time history φ shapes the time history of the agent kinematics (x, y) in a causal manner as in (1). Changing the kinematics changes two parameters: the range r between the agent and the object through (8), and the relative azimuth ψ as in (9). Accordingly, the sensor's information collection model P(a(k) | X) changes as in (12), and the classification decision follows (13). Hence the probability of misclassification depends on the time history of the agent's heading angle.

Fig. 4. Probability of misclassification as a function of φ: the heading history drives the agent kinematics (1), which determine the range (8) and relative azimuth (9), the conditional probabilities (12), the classification decision (13), and the performance measure (14).

III. PROBLEM FORMULATION

We allow the agent to move K steps around the object of interest and collect a measurement at each step. Given prior probabilities for this object of interest, the collected measurements are used to calculate the posterior probability of a hypothesis (object being T or F). Using a likelihood-ratio rule [2], which is a decision rule based on posterior probabilities, we formulate our classification decision for the object. The classification performance is measured by the probability of misclassification. In this paper, we focus on planning the agent's path in a manner that minimizes the probability of misclassification for multiple measurements. Our goal is to solve the optimization problem

min_{φ(·)} Pm(K),   (15)

where Pm(K) is the probability of the agent misclassifying the object after taking K measurements and φ is the agent heading, subject to the constraints:

φ ≥ 0,   (16a)
φ < 2π,   (16b)
α(ψ) ∗ P(N) ≥ 0.5,   (16c)
γ(ψ) ∗ P(N) ≥ 0.5,   (16d)
α(ψ) ∗ P(N) ≤ 1,   (16e)
γ(ψ) ∗ P(N) ≤ 1.   (16f)
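The authors solve (15)-(16) with the MATLAB solver fmincon (Section V). The sketch below shows an analogous setup in Python with SciPy; this is our assumption, not the paper's implementation. pm_of_headings stands in for the evaluation of Pm(K) derived in Section IV, and the path-dependent constraints (16c)-(16f) would be added as nonlinear inequality constraints.

```python
import numpy as np
from scipy.optimize import minimize

def pm_of_headings(phi):
    """Stand-in for Pm(K) as a function of the heading history.
    Replace with the enumeration of Eq. (37); a smooth surrogate is used here
    only so that the sketch runs end to end."""
    return 0.2 + 0.05 * float(np.cos(phi).sum())

K = 4
phi0 = np.array([0.3927, 0.4488, 0.3491, 5.026])   # initial heading history (Section V)
bounds = [(0.0, 2 * np.pi)] * K                    # constraints (16a)-(16b)
res = minimize(pm_of_headings, phi0, bounds=bounds, method="SLSQP")
print(res.x, res.fun)
```

Because the problem is nonconvex, a single local solve of this kind can stop in a local minimum; the paper reports the same behavior with fmincon, which motivates the global heuristics discussed in Section VI.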
IV. PROBLEM SOLUTION

A. Sample Space

Combining the types of measurements and the possible object categories, we define the problem sample space as

Ω = {T, F} × {T, F}.   (17)

The probability mass function fX is defined as

fX(x) = P(X = x) = P(ω ∈ Ω : X(ω) = x),   (18)

and fY is defined as

fY(y) = P(Y = y) = P(ω ∈ Ω : Y(ω) = y).   (19)

The probability of the sample space is the summation over all possible joint probabilities of the two random variables X and a(k):

P(Ω) = P(X = T, a(k) = T) + P(X = T, a(k) = F) + P(X = F, a(k) = T) + P(X = F, a(k) = F) = 1,   (20)

where P(X, a(k)) is the joint probability of X and a(k). Note that X = T and X = F are mutually exclusive events, and so are a(k) = T and a(k) = F. We now rewrite (20) using the product rule P(X, A) = P(A | X)P(X):

P(Ω) = P(a(k) = T | X = T)P(X = T) + P(a(k) = F | X = T)P(X = T)
       + P(a(k) = T | X = F)P(X = F) + P(a(k) = F | X = F)P(X = F)
     = P(X = T){P(a(k) = T | X = T) + P(a(k) = F | X = T)}
       + P(X = F){P(a(k) = T | X = F) + P(a(k) = F | X = F)} = 1,   (21)

where P(a(k) | X) is the conditional probability of measurement a(k) given X.

The likelihood of the object property given the object category is modeled by conditional probabilities. For two-option object categories and two-option object properties, the conditional probabilities are given as

P(a(k) = T | X = T) = σT,   (22a)
P(a(k) = F | X = F) = σF,   (22b)
P(a(k) = F | X = T) = 1 − σT,   (22c)
P(a(k) = T | X = F) = 1 − σF,   (22d)

where σT is the rate of true positives (recognizing truth out of truth) and σF is the rate of true negatives (recognizing falsehood out of falsehood).

B. Posterior Probability

We use Bayes' theorem to compute the posterior probability of a hypothesis (object being T or F) given evidence that supports the hypothesis, i.e., measurement a(k):

P(X = T | a(k)) = P(a(k) | X = T)P(X = T) / P(a(k)),   (23)

where P(a(k) | X = T) is the likelihood function, P(X = T) is the a priori probability, and P(a(k)) = P(a(k) | X = T)P(X = T) + P(a(k) | X = F)P(X = F) by the theorem of total probability.

C. Multiple Measurement Aggregation

Bayes' theorem holds for multiple measurements A = {a(1), a(2), · · · , a(K)}. Assuming that the evidence is conditionally independent given X, we obtain

P(X = T | a(1), · · · , a(K)) = P(a(1), · · · , a(k), · · · , a(K) | X = T)P(X = T) / P(a(1), · · · , a(k), · · · , a(K))
                             = P(a(1) | X = T) · · · P(a(K) | X = T)P(X = T) / P(a(1), · · · , a(K)),   (24)

where k denotes the measurement order and K is the total number of measurements. The quantities σT and σF are equal to the OIS values at the current relative azimuth, α(ψ) and γ(ψ) respectively, multiplied by the probability of correct classification P(N), as in (12). Each time a new measurement a(k) around the object is received, the rate of true positives σTψ and the rate of true negatives σFψ at the current relative azimuth, together with the current measurement a(k), are used to update the posterior probability. Assume that the agent can view the current object from four different angles. Given four values of σT, namely (σT1, σT2, σT3, σT4), one per angle, another four values of σF, and four measurements (T(1), F(2), T(3), F(4)), where the index denotes the measurement order, the posterior probability of this object is updated as follows:

P(X = T | a(1), a(2), · · · , a(4)) = σT1 σT3 u (σT2 − 1)(σT4 − 1) / [σT1 σT3 u (σT2 − 1)(σT4 − 1) − σF2 σF4 (σF1 − 1)(σF3 − 1)(u − 1)].   (25)
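A sequential rendering of (24): starting from the prior u, each measurement multiplies in its per-step likelihood from (22), and normalization by P(a(1), · · · , a(K)) happens at the end. This is a sketch with variable names of our choosing; for the outcome pattern (T, F, T, F) it reproduces the closed form (25).

```python
def posterior_T(u, sigma_T, sigma_F, measurements):
    """Aggregated posterior P(X = T | a(1), ..., a(K)), Eq. (24).

    u:            prior P(X = T)
    sigma_T[k]:   P(a(k) = T | X = T) at step k (OIS times P(N), Eq. (12))
    sigma_F[k]:   P(a(k) = F | X = F) at step k
    measurements: sequence of 'T'/'F' outcomes a(1), ..., a(K)
    """
    like_T, like_F = u, 1.0 - u
    for st, sf, a in zip(sigma_T, sigma_F, measurements):
        like_T *= st if a == "T" else (1.0 - st)         # P(a(k) | X = T), Eq. (22)
        like_F *= (1.0 - sf) if a == "T" else sf         # P(a(k) | X = F), Eq. (22)
    return like_T / (like_T + like_F)                    # normalize by P(a(1), ..., a(K))

# Illustrative numbers (not from the paper): four angles, outcomes T, F, T, F as in Eq. (25).
print(posterior_T(0.5, [0.8, 0.7, 0.9, 0.75], [0.8, 0.7, 0.9, 0.75], "TFTF"))
```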
D. Maximum Likelihood Classification

We formulate our classification decision using maximum likelihood classification.

Definition I (Likelihood-ratio rule): Let Os ∈ {T, F} be a decision variable that follows the likelihood-ratio rule; then

P(Os = T | A) = 1 if P(X = T | A)/P(X = F | A) > 1, and 0 if P(X = T | A)/P(X = F | A) ≤ 1,   (26)
P(Os = F | A) = 0 if P(X = T | A)/P(X = F | A) > 1, and 1 if P(X = T | A)/P(X = F | A) ≤ 1,   (27)

where P(X = T | A) and P(X = F | A) are the aggregated posterior probabilities of the object category X being T or F, respectively, from the set of measurements A = {a(1), a(2), · · · , a(K)}.

E. Classification Performance

The classification performance is defined by the probability of misclassification.

Definition II (Probability of misclassification): The probability of misclassification is the sum of the probabilities of the two faulty outcomes, false positives and false negatives:

Pm = P(Os = T ∧ X = F) + P(Os = F ∧ X = T).   (28)

Although we consider the generic case of equal weights for the two outcomes, different weights could be associated with the outcomes depending on the strategic objective of the classifier.

Pm = P(Os = T ∧ X = F) + P(Os = F ∧ X = T)
   = P(Os = T ∧ X = F | a)P(a) + P(Os = F ∧ X = T | a)P(a),   (29)

by the theorem of total probability. Assuming that the classification is unbiased, we can relax the expression by conditional independence, i.e., P(Os = Os0 ∧ X = X0 | A = A0) = P(Os = Os0 | A = A0) · P(X = X0 | A = A0). This means that, given a measurement A = A0, the classifier's decision Os depends only on the measurements rather than on the category of the object that produced the measurements. Substituting into (29) yields

Pm = P(Os = T | a)P(X = F | a)P(a) + P(Os = F | a)P(X = T | a)P(a).   (30)

F. Classification Performance for Multiple Measurements

Assessing the probability of misclassification for multiple measurements yields

Pm(k) = P(Os = T ∧ X = F) + P(Os = F ∧ X = T)
      = P(Os = T ∧ X = F | a(1), · · · , a(k)) ∗ P(a(1), · · · , a(k))
      + P(Os = F ∧ X = T | a(1), · · · , a(k)) ∗ P(a(1), · · · , a(k)).   (31)

Assuming that the classification is unbiased, and using conditional independence, we obtain

Pm(k) = P(Os = T | a(1), a(2), · · · , a(k)) ∗ P(X = F | a(1), a(2), · · · , a(k)) ∗ P(a(1), a(2), · · · , a(k))
      + P(Os = F | a(1), a(2), · · · , a(k)) ∗ P(X = T | a(1), a(2), · · · , a(k)) ∗ P(a(1), a(2), · · · , a(k)).   (32)

Using Bayes' theorem to substitute for the posterior probability factor in each term of the previous equation yields

Pm(k) = P(Os = T | a(1), a(2), · · · , a(k))P(X = F) ∗ P(a(1), a(2), · · · , a(k) | X = F)
      + P(Os = F | a(1), a(2), · · · , a(k))P(X = T) ∗ P(a(1), a(2), · · · , a(k) | X = T).   (33)

Assuming that the measurements are conditionally independent given X, we get

Pm(k) = P(Os = T | a(1), · · · , a(k))P(X = F) ∗ P(a(1) | X = F) · · · P(a(k) | X = F)
      + P(Os = F | a(1), · · · , a(k))P(X = T) ∗ P(a(1) | X = T) · · · P(a(k) | X = T),   (34)

with a(k) ∈ {T, F}, 1 ≤ k ≤ K. Substituting the two options for each measurement occurrence a(k) into the probability of misclassification, Equation (34) for one measurement yields

Pm(1) = P(Os = T | a(1) = T) ∗ P(a(1) = T | X = F)P(X = F)
      + P(Os = T | a(1) = F) ∗ P(a(1) = F | X = F)P(X = F)
      + P(Os = F | a(1) = T) ∗ P(a(1) = T | X = T)P(X = T)
      + P(Os = F | a(1) = F) ∗ P(a(1) = F | X = T)P(X = T).   (35)
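Before writing out the two-measurement case (36) and the general form (37), note that these sums can be evaluated by brute-force enumeration of the 2^K measurement sequences, which is what makes the complexity grow exponentially in K. The sketch below is ours; it assumes the per-step likelihoods σT(k) and σF(k) are already available from (12), and it embeds the likelihood-ratio decision (26)-(27).

```python
from itertools import product

def pm_K(u, sigma_T, sigma_F):
    """Probability of misclassification after K measurements, Eq. (37).

    u:        prior P(X = T)
    sigma_T:  per-step P(a(k) = T | X = T), k = 1, ..., K
    sigma_F:  per-step P(a(k) = F | X = F), k = 1, ..., K
    """
    K, pm = len(sigma_T), 0.0
    for seq in product("TF", repeat=K):                  # all 2^K measurement sequences
        like_T = like_F = 1.0
        for st, sf, a in zip(sigma_T, sigma_F, seq):
            like_T *= st if a == "T" else (1.0 - st)     # P(a(k) | X = T)
            like_F *= (1.0 - sf) if a == "T" else sf     # P(a(k) | X = F)
        decide_T = u * like_T > (1.0 - u) * like_F       # likelihood-ratio rule, Eqs. (26)-(27)
        if decide_T:
            pm += (1.0 - u) * like_F                     # decided T while X = F
        else:
            pm += u * like_T                             # decided F while X = T
    return pm

print(pm_K(0.5, [0.8, 0.7, 0.9, 0.75], [0.8, 0.7, 0.9, 0.75]))   # illustrative values
```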
Similarly, for two measurements:

Pm(2) = P(Os = T | a(1) = T, a(2) = T)P(X = F) ∗ P(a(1) = T | X = F)P(a(2) = T | X = F)
      + P(Os = T | a(1) = F, a(2) = T)P(X = F) ∗ P(a(1) = F | X = F)P(a(2) = T | X = F)
      + P(Os = T | a(1) = T, a(2) = F)P(X = F) ∗ P(a(1) = T | X = F)P(a(2) = F | X = F)
      + P(Os = T | a(1) = F, a(2) = F)P(X = F) ∗ P(a(1) = F | X = F)P(a(2) = F | X = F)
      + P(Os = F | a(1) = T, a(2) = T)P(X = T) ∗ P(a(1) = T | X = T)P(a(2) = T | X = T)
      + P(Os = F | a(1) = F, a(2) = T)P(X = T) ∗ P(a(1) = F | X = T)P(a(2) = T | X = T)
      + P(Os = F | a(1) = T, a(2) = F)P(X = T) ∗ P(a(1) = T | X = T)P(a(2) = F | X = T)
      + P(Os = F | a(1) = F, a(2) = F)P(X = T) ∗ P(a(1) = F | X = T)P(a(2) = F | X = T).   (36)

For an arbitrary number of measurements K, the probability of misclassification can be expressed as

Pm(K) = Σ_{a(·)∈{T,F}} P(Os = T | a(1), · · · , a(K)) P(X = F) Π_{k=1}^{K} P(a(k) | X = F)
      + Σ_{a(·)∈{T,F}} P(Os = F | a(1), · · · , a(K)) P(X = T) Π_{k=1}^{K} P(a(k) | X = T).   (37)

Finally, we emphasize that this is a stochastic nonlinear optimization problem whose complexity increases exponentially with the number of measurements; its search space size is O(2 · 2^k), where k is the number of measurements. This optimization problem is dynamic in nature. By dynamic, we mean that the agent kinematics evolve in response to two stimuli: the previous history of the states and the heading decision made by the agent.

V. SIMULATION RESULTS

In this section, we present the algorithm for calculating the probability of misclassification for multiple measurements and its minimization. In the following subsections, we present the actual model used in simulation and detail the calculation of the probability of misclassification.

A. Agent

Following the unicycle model described in (1), the agent starts approaching the object and taking measurements at (10.5, 10.5) with l = 0.5.

B. Sensor

In the sensor model, N50 = 0.9. Although the characteristic target dimension d and the maximum resolvable spatial frequency ρ belong to the object model, we include them here for sensor model completeness: d = 7, ρ = 1. The last variable in the sensor model, r, is a function of the agent's current location and the object's location as in (8).

C. Object

We assume that our unidentified object is located at (10.2, 11). The object forward direction is θ = 0.6108. The OIS for that object is as follows:

α(ψ) = 0.5,   0 ≤ ψ < π/2,   (38a)
α(ψ) = 0.8,   π/2 ≤ ψ < π,   (38b)
α(ψ) = 0.6,   π ≤ ψ < 3π/2,   (38c)
α(ψ) = 0.7,   3π/2 ≤ ψ < 2π,   (38d)

for a True object, and

γ(ψ) = 0.6,   0 ≤ ψ < π/2,   (39a)
γ(ψ) = 0.8,   π/2 ≤ ψ < π,   (39b)
γ(ψ) = 0.5,   π ≤ ψ < 3π/2,   (39c)
γ(ψ) = 0.9,   3π/2 ≤ ψ < 2π,   (39d)

for a False object.

The agent has the capability of taking 4 measurements of the object. We initialize the agent's time history of headings as φ = [0.3927 0.4488 0.3491 5.026]. The numerical solver fmincon solves the optimization problem given in (15) subject to the constraints given in (16) and returns the optimal heading sequence φ = [0.7854 3.1416 0.5236 3.7699]. The initial path yields a probability of misclassification Pm(4) = 0.2301, and the probability of misclassification for the optimized path is Pm(4) = 0.1892. Due to the nonconvex nature of the problem, we have encountered many local minima of different depths for different initial conditions.

Fig. 5. Planned path for optimal classification.

VI. CONCLUSIONS AND FUTURE WORK

This paper has presented a new classification-based formulation for optimal path planning. The problem of optimal path planning is phrased in terms of the probability of misclassification for multiple measurements. We derive the analytical expression for the probability of misclassification for multiple measurements. The model depends on the relative azimuth
between the agent and the object as well as the range. Future work will focus on four directions. The first one is to find a heuristic that locates a good approximation to the global optimum of the search space (e.g., simulated annealing). The second direction is creating an exhaustive enumeration algorithm that returns the best possible solution, and comparing its solution time with that of the first direction. A third direction is comparing these planned paths for optimal classification versus those obtained for optimal information. Extending this formulation to the multi-agent multi-object case is our fourth direction for the future work.
REFERENCES

[1] J. Barraquand and J. C. Latombe, "Robot motion planning: A distributed representation approach," International Journal of Robotics Research, vol. 10, pp. 628-649, 1991.
[2] T. Bayes, "An essay towards solving a problem in the doctrine of chances," Philosophical Transactions of the Royal Society of London, pp. 370-418, 1763.
[3] P. Chandler, M. Pachter, K. Nygard, and D. Swaroop, "Cooperative control for target classification," Cooperative Control and Optimization, 2002.
[4] T. Chung, V. Gupta, J. Burdick, and R. Murray, "On a decentralized active sensing strategy using mobile sensor platforms in a network," in IEEE Conf. on Decision and Control, 2004, pp. 1914-1919.
[5] J. Fernandez, J. Christoff, D. Cook, C. Station, N. Center, and S. Spring, "Synthetic aperture sonar on AUV," OCEANS, 2003.
[6] B. Grocholsky, Information-Theoretic Control of Multiple Sensor Platforms, Ph.D. dissertation, University of Sydney, 2002.
[7] R. Holsapple, P. Chandler, J. Baker, A. Girard, and M. Pachter, "Autonomous decision making with uncertainty for an urban intelligence, surveillance and reconnaissance (ISR) scenario," in AIAA Guidance, Navigation and Control Conference and Exhibit, 2008.
[8] J. Howe, "Thermal imaging systems modeling - present status and future challenges," in SPIE Infrared Technology XX, B. F. Andresen, Ed., 1994, pp. 538-550.
[9] I. Hussein, "Kalman filtering with optimal sensor motion planning," in IEEE American Control Conference, 2008.
[10] B. Hyun, M. Faied, P. Kabamba, and A. Girard, "On the independence of information and classification performance," IEEE Transactions on Systems, Man and Cybernetics, Part B: Cybernetics, submitted.
[11] B. Hyun, J. Jackson, A. Klesh, A. Girard, and P. Kabamba, "Robotic exploration with non-isotropic sensors," in AIAA Guidance, Navigation, and Control Conference, 2009.
[12] J. Johnson, "Analysis of image forming systems," in Image Intensifier Symposium, U.S. Army Engineering Branch and Development Laboratories, 1958, pp. 249-273.
[13] L. E. Kavraki, P. Latombe, and M. Overmars, "Probabilistic roadmaps for path planning in high-dimensional configuration spaces," IEEE Transactions on Robotics and Automation, vol. 12, pp. 566-580, 1996.
[14] A. Klesh, P. Kabamba, and A. Girard, "Optimal path planning for uncertain exploration," in IEEE American Control Conference, 2009.
[15] D. E. Koditschek, "Exact robot navigation by means of potential functions: Some topological considerations," in IEEE International Conference on Robotics and Automation, 1987.
[16] J. C. Latombe, Robot Motion Planning, 1991.
[17] J. Leachtenauer and R. Driggers, Surveillance and Reconnaissance Imaging Systems, Artech House, 2001.
[18] T. Lozano-Perez, "Automatic planning of manipulator transfer movements," IEEE Transactions on Systems, Man and Cybernetics, vol. 11, pp. 681-698, 1981.
[19] S. Martínez and F. Bullo, "Optimal sensor placement and motion coordination for target tracking," Automatica, vol. 42, no. 4, pp. 661-668, 2006.
[20] Office of the Secretary of Defense, Unmanned Aircraft Systems Roadmap 2007-2030, 2007.
[21] P. Schenker, T. Huntsberger, P. Pirjanian, E. Baumgartner, and E. Tunstel, "Planetary rover developments supporting Mars exploration, sample return and future human-robotics colonization," Autonomous Robots, no. 2, pp. 103-126, 2003.
[22] T. Schwartz and M. Sharir, "On the piano movers problem: I. The case of a two-dimensional rigid polygonal body moving amidst polygonal barriers," Communications on Pure and Applied Mathematics, vol. 36, pp. 345-398, 1983.
[23] S. Udupa, "Collision detection and avoidance in computer controlled manipulators," in Fifth International Joint Conference on Artificial Intelligence, 2009.
[24] M. Veth, J. Raquet, and M. Pachter, "Stochastic constraints for fast image correspondence search with uncertain terrain model," IEEE Transactions on Aerospace and Electronic Systems, vol. 42, no. 3, pp. 973-982, 2006.