
IEEE TRANSACTIONS ON EDUCATION, VOL. 47, NO. 3, AUGUST 2004

Solving Optimal Control Problems With State Constraints Using Nonlinear Programming and Simulation Tools
Victor M. Becerra, Senior Member, IEEE
Abstract: This paper illustrates how nonlinear programming and simulation tools, which are available in packages such as MATLAB and SIMULINK, can easily be used to solve optimal control problems with state- and/or input-dependent inequality constraints. The method presented is illustrated with a model of a single-link manipulator. The method is suitable to be taught to advanced undergraduate and Masters-level students in control engineering.

Index Terms: Input constraints, nonlinear programming, optimal control, simulation, state constraints.

I. INTRODUCTION

THE PURPOSE of this paper is to illustrate how nonlinear programming and simulation tools, which are available in packages such as MATLAB [1] and SIMULINK [2], can easily be used to solve optimal control problems with state- and/or input-dependent inequality constraints. The presence of state-dependent inequality constraints complicates the treatment of these problems, since analytical methods require previous knowledge of the number and sequence of state-constrained arcs, which are normally unknown beforehand [3]. A state-constrained arc must be anticipated so that the system arrives at the constraint with zero state-constraint rate; otherwise, the system cannot stay on the constraint. One possible solution method for this kind of problem is the penalty function method [4], where the problem is converted into an unconstrained problem by augmenting the objective function with a term that becomes positive and large when any inequality constraint is violated. The disadvantage of penalty function methods is that they usually cause ill conditioning and, hence, numerical convergence problems. A rather inefficient solution technique is known as the slack variable method [5], which converts these problems into unconstrained problems by introducing slack variables, so that the slack variables are nonzero off the constraint but zero on the constraint, which may introduce singularities in the calculations [6]. An indirect method is known as inverse dynamic optimization [7]. It is regarded as an inverse control method, since points on the state-variable trajectories are chosen so that the path is

optimal. The control variable trajectories are found by numerical differentiation of the state trajectories. This method involves discretization and uses nonlinear programming. The method is limited, since accuracy is lost as the number of time steps increases [6]. Nonlinear programming [4] codes handle inequality constraints in a more efficient and accurate way, compared, for instance, with slack variable or penalty function methods. Collocation methods, which use nonlinear programming, are the most reliable methods for solving problems with state-variable inequality constraints [8], [9]. However, these methods are difficult to introduce to undergraduate students or even Masters-level postgraduate students in control engineering. The method presented in this paper, which is inspired by the control parameterization method [10]–[13], uses tools that are often already known (depending on the programme) to advanced undergraduates or Masters-level postgraduates in control engineering, and is suitable to be taught at such levels. Specifically, fourth-year undergraduates taking a four-year Master of Engineering (M.Eng.) degree course in Cybernetics at the University of Reading, Reading, U.K., take a 10-h module on nonlinear programming as part of a 40-h course on advanced control that provides them with the fundamentals to use nonlinear programming tools to solve control problems. An example syllabus of a nonlinear programming module for advanced control engineering students is provided in the Appendix. The paper is organized as follows. Section II presents the formulation of the method. Section III provides a numerical example solved by the proposed method (using constrained optimization) and by the penalty function method (using unconstrained optimization). The penalty function method was used as a comparison since it is also suitable to be taught at this level. Section IV discusses the generality of the method and implementation issues. Section V provides concluding remarks.
II. FORMULATION OF THE METHOD

A. The Dynamic System

Consider a dynamic system described by the set of state equations

ẋ(t) = f(x(t), u(t))   (1)

Manuscript received April 25, 2003; revised August 12, 2003. The author is with the Department of Cybernetics, University of Reading, Reading RG6 6AY, U.K. Digital Object Identifier 10.1109/TE.2004.825925

where x is an n × 1 vector of states, u is an m × 1 vector of manipulated inputs, t is continuous time, and f is a vector function.

0018-9359/04$20.00 © 2004 IEEE


Using a discretization method with a certain time step T, such that t = kT, where k is an integer, the discretized system can be described by the set of discrete state equations

x(k + 1) = f_d(x(k), u(k))   (2)

where f_d represents a vector mapping that, given the control input u(k), provides the transition between the state x(k) and the next state x(k + 1).

B. Performance Index and Constraints

Suppose that it is desired to minimize a performance index

J = φ(x(N)) + Σ_{k=0}^{N−1} L(x(k), u(k))   (3)

for system (2) over the time period k = 0, 1, …, N, where φ is a scalar terminal weighting function and L is a scalar intermediate weighting function. Furthermore, suppose that the states of the system are constrained such that the following inequality should be satisfied along the optimal trajectory:

g(x(k)) ≤ 0,  k = 0, 1, …, N   (4)

and the following equality constraint should be satisfied at the end of the optimal trajectory:

ψ(x(N)) = 0   (5)

where g is a vector mapping of state inequality constraints and ψ is a vector mapping of terminal equality constraints. In addition, suppose that the system starts at the initial state x(0) = x_0 and that the control input is constrained between upper and lower limits

u_L ≤ u(k) ≤ u_U,  k = 0, 1, …, N − 1   (6)

C. Optimal Control Problem

Thus, the optimal control problem can be formulated as

min_{u(0), …, u(N−1)} J   (7)

subject to

The discrete-time optimal control problem defined by (7) and (8) is quite general and covers most practical situations. Additional but less common types of constraints are described, for example, in [13]. Continuous-time problems may also be included in the previous formulation by using a suitable discretization method, as discussed previously, noting that it may also be necessary to discretize the performance index.

D. Nonlinear Programming Problem

The optimal control problem defined previously can be formulated as a nonlinear programming problem [4] by using the control parameterization concept [10]–[13]. Define the decision vector as

z = [u(0)^T, u(1)^T, …, u(N − 1)^T]^T   (9)

where z ∈ R^{mN}. Define also the admissible region for the decision vector as

Ω = {z ∈ R^{mN} : z_L ≤ z ≤ z_U}   (10)

where z_L = [u_L^T, …, u_L^T]^T and z_U = [u_U^T, …, u_U^T]^T. Then, the optimal control problem given previously can be written as

min_{z ∈ Ω} F(z)   (11)

subject to

G(z) ≤ 0   (12)
Ψ(z) = 0   (13)

where F represents the mapping between the decision vector and the objective function value (given the initial state x(0)), G is a function of inequality constraints that represents the mapping from the set of decision vectors to the range of the constraint mapping g over the period k = 0, 1, …, N, and Ψ is a function of equality constraints, which represents the mapping from the set of decision vectors to the range of the terminal equality constraint mapping ψ. It should be emphasized that the equality constraints related to the state equations of the system are not part of the equality constraints described by function Ψ. The state equations of the system are implicitly enforced, since they are evaluated in order to obtain the state trajectory x(1), …, x(N). Notice that it is not necessary to have an explicit expression for the performance index in terms of the decision variables. It suffices to be able to evaluate it numerically. The same applies to the constraint mappings G and Ψ. The functions F, G, and Ψ should be differentiable if a gradient-based algorithm is to be used to solve the nonlinear programming problem (which is usually the case).
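The control parameterization idea can be sketched in Python (a hedged illustration, not the paper's own code: NumPy stands in for the MATLAB tools, and all function and variable names here are illustrative):

```python
import numpy as np

def rollout(fd, x0, u_seq):
    """Simulate x(k+1) = fd(x(k), u(k)) forward from x0,
    returning the trajectory x(0), ..., x(N)."""
    xs = [np.asarray(x0, dtype=float)]
    for u in u_seq:
        xs.append(np.asarray(fd(xs[-1], u), dtype=float))
    return np.array(xs)

def objective(z, fd, x0, L, phi, N, m):
    """F(z): the performance index evaluated by simulation.
    The state equations are enforced implicitly by the rollout,
    not passed to the optimizer as equality constraints."""
    u_seq = z.reshape(N, m)
    xs = rollout(fd, x0, u_seq)
    return phi(xs[-1]) + sum(L(xs[k], u_seq[k]) for k in range(N))

# Illustrative check on a scalar integrator x(k+1) = x(k) + u(k):
fd = lambda x, u: x + u
J = objective(np.zeros(3), fd, [0.0],
              L=lambda x, u: float(u[0] ** 2),
              phi=lambda x: float((x[0] - 1.0) ** 2),
              N=3, m=1)
```

With the zero control sequence the running cost vanishes and J reduces to the terminal term, which is exactly the "evaluate numerically by simulating" structure described above.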
Most nonlinear programming software routines have options for evaluating the required gradients numerically, so that analytical gradients are not usually required. Notice that software routines for constrained optimization usually handle bound constraints on the decision vector (z_L ≤ z ≤ z_U) separately from the more general inequality constraints G(z) ≤ 0.

x(k + 1) = f_d(x(k), u(k)),  x(0) = x_0
g(x(k)) ≤ 0,  k = 0, 1, …, N
ψ(x(N)) = 0
u_L ≤ u(k) ≤ u_U,  k = 0, 1, …, N − 1   (8)

Given the initial state x(0) and the input sequence u(0), …, u(N − 1), it is possible to use the dynamic model to obtain the state sequence x(1), …, x(N). With the input trajectory and the state trajectory, it is possible to evaluate the performance index and also the inequality and equality constraints.


The velocity and the control signal are restricted to

|x₂(t)| ≤ x₂,max,  |u(t)| ≤ u_max   (18)

C. Solution Using Constrained Optimization

1) Formulation as a Constrained Optimal Control Problem: For example, using Euler's method for discretization with a step T = 0.05 s, it is possible to define the following optimal control problem:
Fig. 1. Single-link manipulator.

E. The Penalty Function Approach

It is possible to convert the previously described constrained nonlinear programming problem into an unconstrained one by using the penalty function approach [4]. Using a quadratic (and, hence, differentiable) penalty function P(z), the formulation is

min_{z ∈ Ω} F(z) + ρ P(z)   (14)

where the penalty function is given by

min_{u(0), …, u(N−1)} J   (19)

subject to

x(k + 1) = x(k) + T f(x(k), u(k)),  x(0) = x_0, together with the velocity and input restrictions in (18)   (20)

Notice that the steady-state torque value provides the required angle in the steady state and, hence, is used in the performance index. If the discretization method is of higher order than Euler's method (e.g., fifth-order Runge–Kutta), then the expressions for the discretized dynamics change; but apart from that change, the problem formulation remains the same. Notice that, to obtain a solution, it is not necessary to be able to write explicitly the discretized form of the dynamics. It suffices to be able to integrate the model between the discretization points. From the optimization point of view, a fixed-step integration algorithm is the best choice if that is sufficient to solve the state equations. However, in the case of a stiff system of differential equations [14], a variable-step method may be required. If a variable-step method is used, then it is important to insist that the integration algorithm provide results at fixed time intervals, since, otherwise, the resulting trajectories are nonsmooth (nondifferentiable) with respect to the decision variables for optimization. Lack of smoothness may cause convergence problems for the optimization method.

2) MATLAB Implementation Using Constrained Optimization: The implemented solution is based around MATLAB's fmincon function, which is part of the Optimization Toolbox [15]. Function fmincon implements a sequential quadratic programming algorithm [4] to solve nonlinearly constrained optimization problems. The following MATLAB function is used to compute the objective function F(z). A simulation of the system is carried out to evaluate the objective function. Note that slm is a SIMULINK model of the single-link manipulator, which is illustrated in Fig. 2. In Fig. 2, notice the use of input and output ports.

P(z) = Σ_i max[0, G_i(z)]²   (15)

where ρ is a penalty factor and max[0, G_i(z)] is the maximum between 0 and the ith inequality constraint value. Because of the use of unconstrained optimization, it is necessary to include bound constraints on the decision vector as part of the inequality constraints given by G. Despite its simplicity, the penalty function method has some well-known disadvantages that will be illustrated in this paper.

III. EXAMPLE

A. Single-Link Manipulator Model

Consider the following model of a single-link manipulator, which is illustrated in Fig. 1:

m l² θ̈ + b θ̇ + m g l sin θ = τ   (16)

where θ is the angular position, m is the mass of the end-of-rod element, l is the length of the rod, b is the friction coefficient at the pivot point, and τ is the applied torque at the pivot point. Define x₁ = θ and x₂ = θ̇, and suppose that m = 2 kg, l = 1 m, and b = 6 kg·m²/s. Then, the model can be written as

ẋ₁ = x₂
ẋ₂ = −(g/l) sin x₁ − [b/(m l²)] x₂ + τ/(m l²)   (17)
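These state equations are straightforward to evaluate numerically. The following Python sketch is a hedged illustration (m, l, and b are the values given in the text; the gravity term with g = 9.81 m/s² reflects the pendulum-like model assumed here, and the constant test torque is illustrative):

```python
import numpy as np

# Parameters from the text: m = 2 kg, l = 1 m, b = 6 kg*m^2/s.
# g = 9.81 m/s^2 is an assumption of this sketch.
M, L, B, G = 2.0, 1.0, 6.0, 9.81

def f(x, tau):
    """Continuous-time state equations: x = [angle, angular velocity]."""
    x1, x2 = x
    return np.array([x2, (tau - B * x2 - M * G * L * np.sin(x1)) / (M * L ** 2)])

def euler_step(x, tau, T=0.05):
    """One Euler step with the 0.05 s step size used in the paper."""
    return x + T * f(x, tau)

# Simulate 2 s (40 steps) from rest under a constant torque (illustrative).
x = np.zeros(2)
for _ in range(40):
    x = euler_step(x, 10.0)
```

The same `f` can be handed to any fixed-step integrator; the Euler step above is the discretization used in the problem formulation.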

B. The Control Requirements

Suppose that it is desired to take the manipulator from the initial resting condition x₁(0) = 0, x₂(0) = 0 to the vicinity of a target angle x₁ = x₁d at t = 2 s, with the restrictions on the velocity and the control signal given in (18).


Fig. 2. Single-link manipulator model implemented in SIMULINK.

The input port corresponds to the control signal, and the output ports correspond to the states of the system. The sorting order of the output ports is important. The solver used for integration was a fixed-step, fifth-order Runge–Kutta method, with a step size of 0.05 s.
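Since the original MATLAB listings do not survive in this copy, here is a hedged SciPy sketch of the same workflow: an Euler-discretized model stands in for the SIMULINK simulation, and SciPy's SLSQP routine stands in for the Optimization Toolbox SQP solver. The gravity term, target angle, bounds, and weights are illustrative assumptions, not values from the paper.

```python
import numpy as np
from scipy.optimize import minimize

# m, l, b and the 0.05 s step are from the paper; everything else below
# (target angle, velocity/torque bounds, weights) is illustrative.
M, L, B, G = 2.0, 1.0, 6.0, 9.81
T, N = 0.05, 40                      # 40 Euler steps cover 2 s
X1D, VMAX, UMAX = np.pi / 2, 1.5, 20.0

def rollout(u):
    """Evaluate objective/constraints by simulation (Euler integration)."""
    xs = np.zeros((N + 1, 2))
    for k in range(N):
        x1, x2 = xs[k]
        xs[k + 1] = xs[k] + T * np.array(
            [x2, (u[k] - B * x2 - M * G * L * np.sin(x1)) / (M * L ** 2)])
    return xs

def objective(u):
    xs = rollout(u)
    return np.sum((xs[:, 0] - X1D) ** 2) + 0.01 * np.sum(np.asarray(u) ** 2)

def velocity_margin(u):              # >= 0 wherever |x2| <= VMAX
    return VMAX - np.abs(rollout(u)[:, 1])

res = minimize(objective, np.zeros(N), method="SLSQP",
               bounds=[(-UMAX, UMAX)] * N,
               constraints=[{"type": "ineq", "fun": velocity_margin}],
               options={"maxiter": 100})
```

Note the division of labor, which mirrors the paper's MATLAB setup: input bounds go in as box bounds on the decision vector, while the state-dependent velocity limit goes in as a general inequality constraint evaluated by simulation.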

[MATLAB listing omitted: the objective function, which simulates the SIMULINK model slm to evaluate the performance index.]
The following MATLAB script is used to call the constrained optimization routine and solve the problem. [MATLAB listing omitted.]

The following MATLAB function is used to compute the constraint mappings G and Ψ. Again, a simulation of the system is carried out to evaluate these functions.


Fig. 3. Optimal states and control signal.

[MATLAB listing omitted: the constraint function, which simulates the system to evaluate the inequality and equality constraint mappings.]

Notice that input bound constraints are specified in this script. Notice also the initial guess used for the solution.

3) Results Using Constrained Optimization: The optimal solution was achieved after six major sequential quadratic programming (SQP) iterations and 297 objective function evaluations. The optimal state and input trajectories

are shown in Fig. 3. Notice that the velocity reaches the constraint and stays on the constraint for a period of time before leaving it (this is a state-constrained arc). Notice also that the input constraints are not active in the optimal solution.

D. Solution Using a Penalty Function and Unconstrained Optimization

1) Formulation as an Unconstrained Optimal Control Problem With a Penalty Function: For example, using Euler's method for discretization with a step T = 0.05 s, it is possible to define the following optimal control problem, which includes a penalty term to account for the inequality constraints:

min_{u(0), …, u(N−1)} J + ρ P   (21)

subject to

x(k + 1) = x(k) + T f(x(k), u(k)),  x(0) = x_0   (22)

where the penalty term is given by

P = Σ_k { max[0, |x₂(k)| − x₂,max]² + max[0, |u(k)| − u_max]² }   (23)


where max[0, ·] denotes the maximum between 0 and its argument, and ρ is a penalty factor. Notice that the penalty function becomes nonzero and positive when any of the velocity or input inequality constraints is violated.

2) MATLAB Implementation Using a Penalty Function and Unconstrained Optimization: The implemented solution is based around MATLAB's fminunc function, which is part of the Optimization Toolbox [15]. Function fminunc implements a quasi-Newton optimization algorithm [4] to solve unconstrained optimization problems. The following MATLAB function is used to compute the objective function, which now includes a penalty term. A simulation of the system is carried out to evaluate the objective function. The solver used for integration was a fixed-step, fifth-order Runge–Kutta method, with a step size of 0.05 s.
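The penalty-function variant can likewise be sketched in SciPy (a hedged stand-in for the lost MATLAB listings): the velocity and input bounds are folded into the objective as ρ·Σ max(0, violation)², and the result is minimized with an unconstrained quasi-Newton (BFGS) method, mirroring the paper's use of an unconstrained routine. As before, the gravity term, target angle, bounds, weights, and the value of ρ are illustrative assumptions.

```python
import numpy as np
from scipy.optimize import minimize

# m, l, b and the 0.05 s step are from the paper; the rest is illustrative.
M, L, B, G = 2.0, 1.0, 6.0, 9.81
T, N = 0.05, 40
X1D, VMAX, UMAX, RHO = np.pi / 2, 1.5, 20.0, 100.0

def rollout(u):
    xs = np.zeros((N + 1, 2))
    for k in range(N):
        x1, x2 = xs[k]
        xs[k + 1] = xs[k] + T * np.array(
            [x2, (u[k] - B * x2 - M * G * L * np.sin(x1)) / (M * L ** 2)])
    return xs

def penalized_objective(u):
    """J + rho * P: quadratic penalty on velocity and input bound violations."""
    u = np.asarray(u)
    xs = rollout(u)
    J = np.sum((xs[:, 0] - X1D) ** 2) + 0.01 * np.sum(u ** 2)
    P = (np.sum(np.maximum(0.0, np.abs(xs[:, 1]) - VMAX) ** 2)
         + np.sum(np.maximum(0.0, np.abs(u) - UMAX) ** 2))
    return J + RHO * P

res = minimize(penalized_objective, np.zeros(N), method="BFGS",
               options={"maxiter": 100})
```

Rerunning this sketch with increasing RHO reproduces the qualitative tradeoff reported below: larger penalty factors shrink the constraint violation but make the problem harder for the optimizer.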

[MATLAB listings omitted: the penalized objective function and the script that calls the unconstrained optimization routine.]

3) Results Using the Penalty Function Approach and Unconstrained Optimization: The resulting trajectories are similar to those obtained using constrained optimization, which are shown in Fig. 3, particularly for large values of the penalty factor ρ. Note that, in order to avoid a significant constraint violation, the value of the penalty factor needs to be large, as can be seen in Fig. 4. However, for large values of the penalty factor, the problem becomes ill conditioned, which can be noticed since the number of function evaluations required to achieve convergence increases with ρ. This ill conditioning is illustrated in Fig. 5. These results illustrate the main disadvantages of the penalty function approach: ill conditioning and difficulty in choosing a suitable value for the penalty factor.

IV. DISCUSSION


The method presented in this paper for solving optimal control problems with state constraints, using MATLAB and SIMULINK, is fairly general and can be applied to a wide variety of optimal control problems. Depending on the complexity of the dynamics, it may be better to implement the


Fig. 4. Maximum constraint violation on x₂ for different values of the penalty factor ρ.

Fig. 5. Number of function evaluations for different values of the penalty factor ρ.

model of the system as an S-function using MATLAB code or, if computational efficiency is paramount, the C programming language [16], rather than by connecting SIMULINK blocks. If an S-function is used to implement the dynamics of the system, the structure of the remaining MATLAB code that is required by the method (an objective function file, a constraint function file, and a script to call the optimization routine) remains the same.

V. CONCLUSION This paper has illustrated the use of nonlinear programming and simulation tools, particularly those available as part of MATLAB and SIMULINK, to solve optimal control problems with state- and/or input-dependent inequality constraints. This kind of problem is notoriously difficult to treat in an


analytical way, even if the dynamics of the system are linear. The numerical method presented, which is inspired by the control parameterization method, is illustrated with a model of a single-link manipulator and is compared with the penalty function approach. The method presented can be introduced in an advanced-level undergraduate course or in a Masters-level course in control engineering.

APPENDIX
EXAMPLE SYLLABUS OF A NONLINEAR PROGRAMMING COURSE FOR ADVANCED CONTROL ENGINEERING STUDENTS

Lecture 1 – Introduction to optimization: Static optimization; dynamic optimization; basic mathematical formulation; feasible region; steps to solve optimization problems; classification of optimization problems (linear, nonlinear, quadratic, constrained, unconstrained); difficulties often encountered; typical examples and applications.

Lecture 2 – Unconstrained optimization: The gradient vector and the Hessian matrix; formulation; necessary conditions; sufficient conditions; gradient descent method; steepest descent method.

Lecture 3 – Line search, Newton, and quasi-Newton methods: Line search; quadratic interpolation formulas; Newton's method in several dimensions; modified Newton method; the Davidon–Fletcher–Powell algorithm.

Lecture 4 – Broyden–Fletcher–Goldfarb–Shanno (BFGS) algorithm and control applications of unconstrained optimization.

Lecture 5 – Genetic algorithms: Introduction to genetic algorithms; chromosome and representation scheme; initial population; selection; crossover and mutation operators.

Lecture 6 – Constrained optimization: Optimization with equality constraints; Lagrange multipliers; necessary conditions; optimization with mixed equality and inequality constraints; active and inactive constraints; regular point; Karush–Kuhn–Tucker conditions.

Lecture 7 – Algorithms for constrained optimization: Penalty function methods; quadratic programming; sequential quadratic programming.

Lecture 8 – Control applications of constrained optimization: Nonlinear optimal control; optimal design of controller parameters.

Tutorial 1: Unconstrained problems using MATLAB and SIMULINK.
Tutorial 2: Constrained problems using MATLAB and SIMULINK.

REFERENCES

[1] Using MATLAB, The MathWorks, Inc., Natick, MA, 2000.
[2] SIMULINK: Dynamic System Simulation for MATLAB, The MathWorks, Inc., Natick, MA, 2000.
[3] A. Bryson and Y. Ho, Applied Optimal Control. Washington, DC: Hemisphere, 1975.
[4] M. Bazaraa, H. Sherali, and C. Shetty, Nonlinear Programming: Theory and Algorithms. New York: Wiley, 1993.
[5] Y. Zhao, "Optimal control of an aircraft flying through a downburst," Ph.D. thesis, Dept. of Aeronautics and Astronautics, Stanford Univ., Stanford, CA, 1989.
[6] A. Bryson, Dynamic Optimization. Menlo Park, CA: Addison-Wesley, 1999.
[7] H. Seywald, "Trajectory optimization based on differential inclusion," J. Guid. Control Dyn., vol. 17, pp. 480–487, 1994.
[8] R. Mehra and R. Davies, "A generalized gradient method for optimal control problems with inequality constraints and singular arcs," IEEE Trans. Automat. Contr., vol. AC-17, pp. 69–78, Feb. 1972.
[9] C. Hargraves and S. Paris, "Direct trajectory optimization using nonlinear programming and collocation," J. Guid. Control Dyn., vol. 10, 1987.
[10] D. Tabak and B. C. Kuo, "Application of mathematical programming in the design of optimal control systems," Int. J. Control, vol. 10, pp. 545–552, 1969.
[11] G. Hicks and W. Ray, "Approximation methods for optimal control synthesis," Can. J. Chem. Eng., vol. 49, pp. 522–528, 1971.
[12] C. Goh and K. Teo, "Control parametrization: A unified approach to optimal control problems with general constraints," Automatica, vol. 24, pp. 3–18, 1988.
[13] K. Teo, C. Goh, and K. Wong, A Unified Computational Approach to Optimal Control Problems. New York: Longman Scientific and Technical and Wiley, 1991.
[14] E. Hairer and G. Wanner, Solving Ordinary Differential Equations II: Stiff and Differential Algebraic Problems. Berlin, Germany: Springer-Verlag, 1996.
[15] Optimization Toolbox User's Guide, The MathWorks, Inc., Natick, MA, 1999.
[16] Writing S-Functions, Version 4, The MathWorks, Inc., Natick, MA, 2000.

Victor M. Becerra (S'91–M'95–SM'03) received the Bachelor's degree in electrical engineering (cum laude) from Simon Bolivar University, Caracas, Venezuela, in 1990; the Ph.D. degree in control engineering from City University, London, U.K., in 1994; and the M.Sc. degree in financial management from Middlesex University, London, U.K., in 2001. He was with C.V.G. Edelca, Caracas, Venezuela, between 1989 and 1991, working on power systems analysis and control. He was a Research Fellow at City University between 1994 and 1999. Since January 2000, he has been a Lecturer with the Department of Cybernetics, University of Reading, Reading, U.K., where he lectures in the field of control engineering. He is the Chair of the Cybernetics Intelligence Research Group at the University of Reading. His main research interests are in the fields of optimal control, predictive control, adaptive control, nonlinear control, and the intersections between control systems and artificial intelligence. Dr. Becerra is a Member of the Institution of Electrical Engineers (IEE).
