IEEE TRANSACTIONS ON ROBOTICS, VOL. 22, NO. 6, DECEMBER 2006
Norm-Optimal Iterative Learning Control Applied to Gantry Robots for Automation Applications
James D. Ratcliffe, Paul L. Lewin, Eric Rogers, Jari J. Hätönen, and David H. Owens
Abstract—This paper is concerned with the practical implementation of the norm-optimal iterative learning control (NOILC) algorithm. Here, the complexity of this algorithm is first considered with respect to real-time control applications, and a new modified version, fast norm-optimal ILC (F-NOILC), is derived for this application, which potentially allows implementation with a sampling rate three times faster than the original algorithm. A performance index is used to assess the experimental results obtained from applying F-NOILC to an industrial gantry robot system and, in particular, the effects of varying the parameters in the cost function, which is at the heart of the norm-optimal approach.
Index Terms—Iterative control, iterative methods, learning control, real-time control, signal processing.
I. INTRODUCTION
Iterative learning control (ILC) is concerned with trajectory tracking-control problems, where the required trajectory is repeated ad infinitum over a finite duration known as the trial or iteration length. This applies to many industrial applications such as injection molding, robotics, automated manufacturing plants, and food processing, to name a few; see, for example, [1]. The novel principle behind ILC is to suitably use information from previous trials, often in combination with appropriate current trial information, to select the current trial input to sequentially improve performance from trial to trial.
ILC was initially defined by Arimoto et al. [2], and early algorithms were relatively simple in structure, (typically) consisting of a previous input term and a function of previous error. Since then, algorithm complexity has significantly increased as researchers from robust, adaptive, optimal, and other control disciplines have applied their specific expertise to ILC systems. One such algorithm is norm-optimal ILC (NOILC); see, for example, [3]. In the cost function used, it is the difference between the inputs used on successive trials which is penalized, the aim being to achieve optimal correction while being cautious in the use of control action (to avoid excessive/dangerous actuator demands). As with other “advanced” ILC algorithms, simulation studies have shown that in the right circumstances, NOILC can deliver superior performance, compared with more simply structured algorithms, and there is a clear need to determine whether this can also be achieved in practice. Only with such evidence available, coupled with efficient implementation structures, will it be possible for an end user to make an informed choice as to which algorithm to use.
Manuscript received February 20, 2006; revised May 6, 2006. This paper was recommended for publication by Associate Editor D. Sun and Editor K. Lynch upon evaluation of the reviewers’ comments. J. Ratcliffe, P. Lewin, and E. Rogers are with the School of Electronics and Computer Science, University of Southampton, Southampton, SO17 1BJ, U.K. (e-mail: [email protected]). J. Hätönen and D. H. Owens are with the Department of Automatic Control and Systems Engineering, University of Sheffield, Sheffield, S1 3JD, U.K. Digital Object Identifier 10.1109/TRO.2006.882927
Experimental comparison of NOILC with a conventional feedback control implementation on a single-input single-output (SISO) plant consisting of a chain conveyor has shown that this algorithm structure is
capable of better performance in terms of trajectory following [4]. The work presented here aims to expand on these initial tests, and investigate numerous aspects of ILC implementation, which are specifically relevant to the NOILC algorithm, with the eventual aim of informing the choice referred to above.
NOILC is implemented on a gantry robot consisting of three SISO axes operating simultaneously. The number of calculations which the controller must perform within each sample interval is, therefore, significant, and limitations on processor frequency define the shortest time in which all the calculations can be performed. For example, a complex algorithm and a slow processor result in a low sampling frequency, which produces poor control performance. To overcome this potential problem, a new implementation of the NOILC controller, termed the fast (F-)NOILC, is developed, which allows faster computation of the control input, together with a high sample frequency and a high-order model, to be implemented on each axis.
By varying the cost function weighting terms, it is possible to adjust the balance between convergence speed and performance along the trial. The effects caused by adjusting these parameters have never been experimentally investigated, but are clearly critical to a successful industrial application. Therefore, in Section VII, experimental results are presented comparing performance with choice of parameter values. A new performance index is defined in Section III specifically for this task. The penultimate section considers robustness to initial state error, and then conclusions are given in the last section.
II. F-NOILC
The state-space model of the plant to be controlled by an ILC scheme is assumed to be of the following form:
$$x_k(t+1) = A x_k(t) + B u_k(t), \qquad 0 \le t \le T$$
$$y_k(t) = C x_k(t) \tag{1}$$
where on trial $k$, $x_k(t)$ is the $n \times 1$ state vector, $y_k(t)$ is the $m \times 1$ output vector, $u_k(t)$ is the $r \times 1$ vector of control inputs, and the trial length $T < \infty$. Equivalently, the plant operates for a fixed number, $M = T/h$, of samples (where $h$ is the sampling period). The basic ILC requirement is that
$$\lim_{k \to \infty} y_k = r_d, \qquad \lim_{k \to \infty} u_k = u_\infty$$
where $r_d$ is the desired reference trajectory, $u_\infty$ is termed the learned control, and the tracking error on trial $k$ is given by $e_k = r_d - y_k$. In this paper, we consider the following cost function for processes described by (1):
$$J_{k+1}(u_{k+1}) = \frac{1}{2} \sum_{t=0}^{M} \left\{ e_{k+1}^T(t)\, Q\, e_{k+1}(t) + H^T R\, H \right\}, \qquad H = u_{k+1}(t) - u_k(t)$$
where the weighting matrices $Q$ and $R$ are of compatible dimensions, and are symmetric positive-semidefinite and positive-definite, respectively. Following [3], the solution is
• matrix gain (Riccati) equation
$$K(t) = A^T K(t+1) A + C^T Q(t+1) C - \left[ A^T K(t+1) B \left\{ B^T K(t+1) B + R(t+1) \right\}^{-1} B^T K(t+1) A \right] \tag{2}$$
where $K(t)$ is a matrix gain which has the terminal condition $K(M) = 0$;
• predictive component equation with $\xi_{k+1}(M) = 0$
$$\xi_{k+1}(t) = \left\{ I + K(t) B R^{-1}(t) B^T \right\}^{-1} \left\{ A^T \xi_{k+1}(t+1) + C^T Q(t+1) e_k(t+1) \right\} \tag{3}$$
• input update equation
$$u_{k+1}(t) = u_k(t) - \left[ \left\{ B^T K(t) B + R(t) \right\}^{-1} B^T K(t) A \left\{ x_{k+1}(t) - x_k(t) \right\} \right] + R^{-1}(t) B^T \xi_{k+1}(t). \tag{4}$$
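The recursions (2)-(4), and the precomputed-gain form that F-NOILC uses in (5)-(8), can be sketched in Python with NumPy. This is a minimal sketch under stated assumptions, not the authors' implementation: Q and R are constant, the two-state plant in `demo` is hypothetical, the states are read directly rather than from an observer, and every trial starts from the same initial state. All function names are illustrative, and `beta`, `gamma`, `lam`, and `omega` are assumed names for the constant matrices that multiply the predictive term, the previous-trial error, the state difference, and the predictive term in the input update, respectively.

```python
import numpy as np

def riccati_gains(A, B, C, Q, R, M):
    """Backward recursion (2) with terminal condition K(M) = 0."""
    n = A.shape[0]
    K = np.zeros((M + 1, n, n))
    for t in range(M - 1, -1, -1):
        Kn = K[t + 1]
        S = np.linalg.inv(B.T @ Kn @ B + R)
        K[t] = A.T @ Kn @ A + C.T @ Q @ C - A.T @ Kn @ B @ S @ B.T @ Kn @ A
    return K

def fast_gains(A, B, C, Q, R, K):
    """Constant matrices of (5)-(8), computed once before operation."""
    n = A.shape[0]
    Rinv = np.linalg.inv(R)
    beta, gamma, lam = [], [], []
    for Kt in K:
        alpha = np.linalg.inv(np.eye(n) + Kt @ B @ Rinv @ B.T)
        beta.append(alpha @ A.T)
        gamma.append(alpha @ C.T @ Q)
        lam.append(np.linalg.inv(B.T @ Kt @ B + R) @ B.T @ Kt @ A)
    return np.array(beta), np.array(gamma), np.array(lam), Rinv @ B.T

def predictive_term(beta, gamma, e_prev):
    """Equation (7), run between trials in descending sample order."""
    M = e_prev.shape[0] - 1
    xi = np.zeros((M + 1, beta.shape[1]))  # xi(M) = 0
    for t in range(M - 1, -1, -1):
        xi[t] = beta[t] @ xi[t + 1] + gamma[t] @ e_prev[t + 1]
    return xi

def run_trial(A, B, C, x0, ref, u_prev, x_prev, xi, lam, omega):
    """Equation (8) at each sample, followed by the plant model (1)."""
    M = ref.shape[0] - 1
    x = np.zeros((M + 1, A.shape[0]))
    u = np.zeros_like(u_prev)
    x[0] = x0
    for t in range(M):
        # Only multiplications, additions, and subtractions remain here
        u[t] = u_prev[t] - lam[t] @ (x[t] - x_prev[t]) + omega @ xi[t]
        x[t + 1] = A @ x[t] + B @ u[t]
    return u, x, ref - x @ C.T

def demo(trials=4, M=50, q=100.0, r=0.01):
    """Hypothetical two-state SISO plant tracking one sine period."""
    A = np.array([[0.9, 0.1], [0.0, 0.8]])
    B = np.array([[0.5], [1.0]])
    C = np.array([[1.0, 0.0]])
    Q, R = np.array([[q]]), np.array([[r]])
    ref = np.sin(2 * np.pi * np.arange(M + 1) / M)[:, None]
    K = riccati_gains(A, B, C, Q, R, M)
    beta, gamma, lam, omega = fast_gains(A, B, C, Q, R, K)
    x0 = np.zeros(2)
    # Trial 1 uses zero input, so the output stays at zero and e = ref
    u, x, e = np.zeros((M, 1)), np.zeros((M + 1, 2)), ref.copy()
    norms = [float(np.linalg.norm(e))]
    for _ in range(trials - 1):
        xi = predictive_term(beta, gamma, e)
        u, x, e = run_trial(A, B, C, x0, ref, u, x, xi, lam, omega)
        norms.append(float(np.linalg.norm(e)))
    return norms
```

Calling `demo()` returns the error norm for each trial. With an exact model and identical initial states, the norm should not increase from one trial to the next, which is the property the cost function is designed to give.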
The main disadvantage limiting the practical application of the NOILC algorithm is the large amount of computation which must be performed within each sample interval. To remedy this problem, a faster version of the algorithm (F-NOILC) can be used, which allows the majority of calculations to be performed during the design and commissioning of the controller. The remaining calculations are significantly reduced in number, and consist solely of multiplications, additions, and subtractions.
Implementation of the algorithm is as follows. The matrix gain $K(t)$ defined by (2) can be calculated before the system operates, and hence, does not contribute to the real-time processing load. The predictive term (3) must be calculated between each iteration. Note that this equation has a terminal, as opposed to an initial, condition, and must therefore be computed in descending sample order. The input update (4) must be calculated at each sample instant. It is, therefore, the input update equation which particularly contributes to the real-time processing load, and has a significant influence on the minimum sample time.
The F-NOILC algorithm is derived by identifying simplifications which can be made to the computation of the original. Consider the predictive component (3); the only variables in this equation are the tracking error $e_k$ and the predictive term itself $\xi_{k+1}$, and all of the other terms can be combined together to produce constant matrices
$$\alpha(t) = \left\{ I + K(t) B R^{-1}(t) B^T \right\}^{-1}, \qquad \beta(t) = \alpha(t) A^T \tag{5}$$
$$\gamma(t) = \alpha(t) C^T Q(t+1) \tag{6}$$
leading to the computationally simpler predictive component equation
$$\xi_{k+1}(t) = \beta(t) \xi_{k+1}(t+1) + \gamma(t) e_k(t+1). \tag{7}$$
Exactly the same concept can be applied to the input update (4), resulting in the simplified input update equation
$$u_{k+1}(t) = u_k(t) - \lambda(t) \left\{ x_{k+1}(t) - x_k(t) \right\} + \omega(t) \xi_{k+1}(t)$$
$$\omega(t) = R^{-1}(t) B^T, \qquad \lambda(t) = \left( B^T K(t) B + R(t) \right)^{-1} B^T K(t) A. \tag{8}$$
The resulting implementation therefore requires seven matrices ($A$, $B$, $C$, $\beta$, $\gamma$, $\lambda$, and $\omega$) to be supplied to the real-time controller. The F-NOILC algorithm uses significantly more memory than the original, because the memory allocation is static, rather than dynamic. The NOILC can recycle memory once calculations are complete. However, it is worth observing that the process of recycling the memory takes time and decreases the amount of time available for computation of the algorithm. The F-NOILC approach is preferable, because it is relatively easier and cheaper to upgrade memory than to upgrade the processor.
With respect to the improvement in computation speed, due to the reduced number of calculations, it is possible to calculate exactly the time required to perform each algebraic operation for both the NOILC and the F-NOILC, then find the total time for each variant. However, the results of this process still ultimately depend on the characteristics of the controller, the operating system, and the efficiency of the program functions. In simulation comparisons, there was a factor of three increase in available speed, when using an identical setup for both NOILC algorithms and running them at maximum simulation rate, without specifying the need for strict sample intervals. However, care must be exercised when transferring this result to experimental implementation. Although some increase in speed will be achieved by the F-NOILC algorithm, a quantifiable guarantee is beyond the scope of this paper.
III. MEASURING ILC PERFORMANCE
In NOILC, the weighting matrices (scalars here) Q and R can be used to adjust the balance between convergence speed and robustness, respectively. One of the specific objectives of this paper is to investigate just how much the choices of Q and R affect the performance of the NOILC algorithm. It is generally recognized that there are three variables which are of particular importance when describing the performance of an ILC algorithm. These are: 1) convergence speed; 2) minimum tracking error; and 3) long-term performance.
Although the instantaneous data recorded during each trial (such as input voltage and output error) is useful for analyzing the learning process and its stability, it is far more common to calculate some general measure of the tracking accuracy for each iteration and observe how this changes, as the number of iterations increases. This can specifically indicate minimum error, time to reach minimum error (convergence speed), and any sign of performance degradation from trial to trial. Popular measures of tracking accuracy are the error norm or the mean-squared error (MSE).
The proposed performance index simply involves calculating the area under the MSE curve for the first N iterations, where N can be selected appropriately for the system being analyzed. This results in the performance index for N iterations, denoted by PIN. Based on analysis of a range of experiments implemented on the gantry robot, a reasonable indication of performance can be ascertained during the first 100 iterations of any test. For these experiments, PI100, a simple summation of the first 100 MSE values was used, where
$$\mathrm{PI}_{100} = \sum_{n=1}^{100} e_n^2 \tag{9}$$
where n is the iteration number. To allow a fair comparison of algorithm performance, several test parameters must be held constant; namely, the plant (or plant model in simulation), the reference trajectory, the value of N , and the MSE value for the first iteration (e1 ). Parameter e1 is perhaps the most difficult of these values to hold constant. However, this can be achieved if the plant input is set to zero for the duration of the first iteration. The plant output should, therefore, remain constant, and the value of e1 will be the MSE of the reference trajectory. With e1 held constant, it is logical to remove the PIN dependency on the unit of MSE, by normalizing the MSE so that e1 = 1. It is now possible to define upper and lower bounds on the value of PIN ; first, suppose that the algorithm achieves perfect tracking after only one iteration of learning. As specified previously, the MSE for the first iteration is normalized to one, but by the second iteration, the MSE will be zero and will remain zero for all N > 1. Then the minimum value of the performance bound, denoted by PIN (min), is equal to one, which defines the lower bound. Conversely, when the algorithm learns nothing at any iteration, the MSE will be equal to one for each iteration, and therefore, the upper bound, denoted by PIN (max), can be taken as N . The closer the value of PIN is to one, the better the tracking performance.
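The normalized index and its bounds can be illustrated with a short Python sketch. It assumes the per-iteration MSE values are already available as a sequence and that the first entry is the MSE of the zero-input first iteration; the function name `performance_index` and the sample sequences in the usage note are hypothetical.

```python
import numpy as np

def performance_index(mse, n_iter=100):
    """Sum of the first n_iter MSE values, normalized so the first equals 1."""
    mse = np.asarray(mse, dtype=float)
    if mse.size < n_iter:
        raise ValueError("need at least n_iter MSE values")
    if mse[0] <= 0.0:
        raise ValueError("first-iteration MSE must be positive")
    normalized = mse[:n_iter] / mse[0]  # removes the dependency on MSE units
    return float(normalized.sum())
```

For example, a sequence that drops to zero after the first iteration gives the lower bound of one, a constant sequence gives the upper bound of N, and an MSE that halves at every iteration gives a value close to two.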
Fig. 1. X-axis Bode plot.
IV. GANTRY ROBOT TEST FACILITY
The gantry robot is a commercially available system found in several industrial applications. The robot is located above one end of a plastic chain conveyor, and is tasked with collecting payloads from a dispenser and placing them onto the moving conveyor beneath. The robot must synchronize both speed and position with the conveyor to achieve accurate placement of the payload. The gantry robot can be treated as three separate SISO systems (one for each axis) which can operate simultaneously to locate the end-effector anywhere within a cuboid work envelope. The lowest axis, the X-axis, moves in the horizontal plane, parallel to the conveyor beneath. The Y-axis is mounted on the X-axis and moves in the horizontal plane, but perpendicular to the conveyor. The Z-axis is the shorter vertical axis mounted on the Y-axis. The X- and Y-axes consist of linear brushless dc motors, while the Z-axis is a linear ball-screw stage powered by a rotary brushless dc motor. All motors are energized by performance-matched dc amplifiers. Axis position is measured by means of linear or rotary optical incremental encoders as appropriate. In this paper, the conveyor beneath the gantry robot is not considered.
To implement F-NOILC, it is necessary to obtain a model of the plant which is to be controlled. Each axis of the gantry was modeled independently by means of sinusoidal frequency-response tests. From this data, it was possible to construct Bode plots for each axis, and hence, approximate transfer functions could be determined. These were then refined by means of a least-mean-square optimization technique, to minimize the difference between the frequency response of the real plant and that of the model. The resulting Bode plots comparing the plant and the model for the X-axis are given in Fig. 1.
V. TEST PARAMETERS
With all axes operating simultaneously, the reference trajectories for the axes produce a 3-D synchronizing “pick and place” action (see Fig. 2).
The trajectories produce a work rate of 30 units per minute, which is equivalent to an iteration time period of 2 s. Using a sampling frequency of 1 kHz, this generates 2000 samples per iteration. All tests were performed in ILC format, and hence, the following hold: 1) there is a stoppage time between iterations; 2) the plant is reset to known initial states before the next iteration; 3) calculation of the next ILC plant input occurs between iterations. A 2-s stoppage time exists between each iteration, during which the next input to the plant is calculated. The stoppage time also allows vibrations induced in the previous iteration to die away and prevents vibrations from being propagated between iterations. Before each iteration, the axes are homed to within ±30 µm of a known starting location to minimize the effects of initial state error. The plant input voltage for the first iteration is zero. Therefore, the algorithm must learn to track the reference in its entirety. There is no assistance from any other form of controller. In the practical implementation, the system states are estimated by means of a tuned full-state observer.
Fig. 2. Implementation. Five iterations.
VI. F-NOILC: INITIAL IMPLEMENTATION
Initial implementation of the F-NOILC algorithm demonstrates excellent performance in terms of convergence speed, minimum error, and long-term performance (number of iterations without degradation in performance), also termed long-term stability in some of the literature. Here, the current trial error and control input are scalars, and hence, we replace Q and R in the performance index by the scalars q and r, respectively. Fig. 2 shows the tracking performance for the first five iterations, with q and r set to 100 and 0.01, respectively, for the X- and Y-axes, and 1000 and 0.01, respectively, for the Z-axis. These values of q and r were used for initial tests because they resulted in good tracking performance. At this stage, no attempt has been made to investigate the effects of changing q and r, and no attempt has been made to optimize them. Note that iteration 1 is a point located at the start of the trajectory, because the plant input is held at zero for the first iteration, as specified in Section V. Iteration 2 is already a good estimate of the required trajectory, and performance improves rapidly at each iteration, resulting in excellent tracking by iteration 5. Fig. 3 shows the MSE calculated for each axis during a 5000 iteration test designed to investigate the long-term performance of the F-NOILC algorithm.
The most important feature of each plot is that there is no indication of an increasing MSE; an increasing MSE typically indicates that the algorithm is diverging from the minimum error value and is unstable. The lack of increasing MSE strongly suggests that the algorithm is stable. It is important to state that the 5000 iteration test does not guarantee infinite iteration performance. However, it is a good indicator that the algorithm can achieve long-term performance.
Fig. 3. Implementation. MSE 5000 iterations.
VII. q AND r TUNING PARAMETERS
To investigate the effect of varying q and r, a batch of tests has been performed using different combinations of these parameters. A range of values from 0.1 to $10^6$ for q were individually compared with values of r from $10^{-4}$ to 100. Each combination was implemented for 100 iterations, and the PI100 performance index described in Section III was calculated. Because the F-NOILC has two tuning parameters, it is particularly suitable to plot the algorithm performance on a 3-D surface chart. Fig. 4 displays the performance plot for the X-axis. The performance plots for the other two axes are very similar, particularly the Y-axis, where the low-frequency gain of the linear motor is practically identical to that of the X-axis.
Remembering that q affects the rate of error reduction and r limits the input change, interpreting the plots becomes a simple task. To the right of the chart is a region of poor tracking performance, where the PI100 value is near or equal to 100, indicating that virtually nothing is learnt during the 100 iterations of the test. As could be expected, this corresponds to a small value for q and a large value for r. With these settings, the algorithm is far too conservative. As the ratio of q to r increases, PI100 gradually reduces, indicating that the performance is improving. This is represented by the slope to the right side of the chart. As the q/r ratio continues to increase, PI100 is reduced to values very close to one, indicating that the perfect trajectory is learned in almost one iteration. The balance of error reduction to input change is now approaching optimality. Temporarily increasing q/r has little effect on the performance, until the system becomes unstable and PI100 jumps back to 100. This is represented by the channel, and then the steep slope to the left of the chart. It is important to note that the ratio of q to r is what determines the algorithm performance rather than the actual values for each parameter. If a larger range of q and r values were used, the chart would still have a channel cutting diagonally across it.
Fig. 4. X-axis PI100 for various q and r.
VIII. ROBUSTNESS TO INITIAL STATE ERROR
Fig. 5. Effect of initial error on MSE (q = 100; r = 0.0001).
ILC algorithms (including NOILC) are often derived on the assumption that the initial conditions are the same at the start of each new trial. Here, this corresponds to assuming that the axes are reset to exactly the same position before each iteration. In practice, this is almost impossible to achieve. In most, if not all, cases there will always be some residual error, which results in initial state error. As long as this error is many orders of magnitude smaller than the range of the reference trajectory, it is reasonable to assume that the initial state error is zero. If this is not the case, then problems can be expected to arise. This subject area has been investigated from an analysis standpoint by various authors, e.g., [5] (for algorithms other than norm-optimal), but there is little experimental evidence to support the development of this area. In this section, we experimentally examine the norm-optimal algorithm performance in this respect.
Each axis of the gantry robot is able to home to within ±30 µm of the desired starting position. Compared with the required motion profiles, this error is very small. Therefore, to investigate the effect of initial state error on F-NOILC, homing error has been artificially introduced into the system. The closed-loop controller used for homing the axes deliberately adds an offset to the required homing position by using a bounded pseudorandom number generator. The bounds must be specified and define the maximum limits which the offset can reach. Within these bounds, the offset is strictly pseudorandom, because a seeded random number generator is used. The seed is the iteration number. Therefore, within one test, the data appears to be random, but for different tests, the same random number is generated for corresponding iterations. Fig. 5 shows MSE plots for various bounds of homing error. In these tests, q and r were set to 100 and 0.0001 for all axes.
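The seeded offset scheme described in this section can be sketched in a few lines of Python. This is an illustration only: the uniform distribution, the function name `homing_offset`, and the use of Python's `random.Random` are assumptions, since the generator used on the test facility is not specified.

```python
import random

def homing_offset(iteration, bound_mm):
    """Bounded pseudorandom homing offset in mm, seeded by the iteration number."""
    rng = random.Random(iteration)  # seed = iteration number, so tests are repeatable
    # Assumed uniform distribution between the specified bounds
    return rng.uniform(-bound_mm, bound_mm)

def offsets_for_test(n_iterations, bound_mm):
    """Offsets applied over one test; iterations are numbered from 1."""
    return [homing_offset(k, bound_mm) for k in range(1, n_iterations + 1)]
```

Because the seed depends only on the iteration number, two tests with the same bound see identical offset sequences, and in this sketch tests with different bounds see the same underlying draw scaled by the bound.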
The values q = 100 and r = 0.0001 give excellent tracking, as shown in the performance plots presented in Section VII. The values for the offset bound are 0 mm, ±1 mm, ±2 mm, and ±3 mm. Clearly, adding initial error has an effect on the minimum tracking error, but not noticeably on the convergence speed or the long-term performance. As the initial error bound increases, the minimum tracking error also increases. This is because the feedforward component of the F-NOILC algorithm cannot compensate for the error at the start of the next iteration; it can only compensate for the error which was present at the start of the previous iteration. The larger the difference between these values, the less the feedforward algorithm is able to compensate. Data for initial error with bounds larger than ±3 mm is not presented, because for ±3 mm, the feedback controller, which is also an intrinsic part of the F-NOILC algorithm, generated an impulse input at the start of the trajectory to try to compensate for the initial error. This can be seen clearly in Fig. 6.
Fig. 6. X-axis input, iteration 100, offset bound ±3 mm (q = 100; r = 0.0001).
IX. DISCUSSION AND CONCLUSIONS
With any proposed implementation of an advanced ILC algorithm, there is clearly a need to determine if this will lead to significantly improved performance when compared with simply structured algorithms such as P-type ILC. Of course, it is not possible to derive equations to answer this question, and hence, any judgement can only be based on appropriate available experimental comparisons. In terms of the algorithm class considered here, the authors have previously implemented a range of P-type algorithms on the same experimental facility [6]. Comparing these results with the ones given here reveals that the F-NOILC algorithm achieves minimum MSE within 5 iterations, compared with 20 required for the P-type implementations. However, both algorithms achieve similar levels of MSE reduction.
A fast variant of NOILC has been developed, implemented, and tested on a gantry robot. A performance index to assess how the algorithm tuning parameters affect learning performance has been developed and applied. This has provided a substantial set of experimental evidence on one of the critical problems between optimal control-based theory and its implementation (in ILC context), namely, the selection of the cost function weighting parameters. Practical assessment of the performance of the F-NOILC has indicated that it does have robustness to initial state error.
REFERENCES
[1] H. Havlicsek and A. Alleyne, “Nonlinear control of an electrohydraulic injection molding machine via iterative adaptive learning,” IEEE/ASME Trans. Mechatron., vol. 4, no. 3, pp. 312–323, May 1999.
[2] S. Arimoto, S. Kawamura, and F. Miyazaki, “Bettering operation of robots by learning,” J. Robot. Syst., vol. 1, pp. 123–140, 1984.
[3] N. Amann, D. H. Owens, and E. Rogers, “Iterative learning control for discrete-time systems with exponential rate of convergence,” IEE Proc. Control Theory Appl., vol. 143, no. 2, pp. 217–224, 1996.
[4] T. Al-Towaim, A. D. Barton, P. L. Lewin, E. Rogers, and D. H. Owens, “Iterative learning control—2D control systems from theory to application,” Int. J. Control, vol. 77, no. 9, pp. 877–893, 2004.
[5] H.-S. Lee and Z. Bien, “Study on robustness of iterative learning control with non-zero initial error,” Int. J. Control, vol. 64, no. 3, pp. 345–359, 1996.
[6] J. D. Ratcliffe, J. J. Hatonen, P. L. Lewin, E. Rogers, T. J. Harte, and D. H. Owens, “P-type iterative learning control for systems that contain resonance,” Int. J. Adapt. Control Signal Process., vol. 19, no. 10, pp. 769–796, 2005.