This Notebook illustrates the following libraries:

- SymPy (http://sympy.org)
- python-control (http://python-control.sourceforge.net/manual/)
- matplotlib (http://matplotlib.org/)

...applied on the example of a Linear Quadratic control problem.
PH, December 2014
import sympy
from sympy import symbols, simplify, Eq, solve
sympy.init_printing()
Integrator system, in discrete time:
\[x_{k+1} = x_k + u_k\]
x, u = symbols('x_k u_k')
x_next = x+u
x_next
Instantaneous penalty (quadratic cost):
\[ c(x,u) = x^2 + r \, u^2 \]
r = symbols('r')
c = x**2 + r*u**2
c
The optimal control is the solution of the Bellman equation:
\[J(x) = \min_u \left[ c(x,u) + J(f(x,u)) \right] \]
with \(J(x)\) known to be quadratic, \(J(x) = V x^2\) (for linear dynamics and a quadratic cost, the value function is quadratic in \(x\)).
V = symbols('V')
# left-hand side
Bell_lhs = V*x**2
# right-hand side
Bell_rhs = c+V*x_next**2
Bell_rhs
Minimize the right-hand side by solving \(\partial\cdot/\partial u = 0\):
Bell_rhs.diff(u)
solve(Bell_rhs.diff(u), u)
The optimal control is a linear feedback:
\[u^* = -K^* x\]
u_opt = solve(Bell_rhs.diff(u), u)[0]
K_opt = -u_opt/x
K_opt
But we still need to find \(V\)...
→ Reinject the optimal feedback into the Bellman equation.
Bell_rhs_opt = Bell_rhs.subs(u, u_opt)
Bell_rhs_opt
Solve for \(V\) (the Bellman equation simplifies to finding the roots of a 2nd-order polynomial):
solve(Bell_lhs-Bell_rhs_opt, V)
We know that \(V>0\) and \(r>0\), so we only keep the positive solution.
V_opt = solve(Bell_lhs-Bell_rhs_opt, V)[1]
V_opt
Final expression of the optimal gain:
K_opt = simplify(K_opt.subs(V, V_opt))
K_opt
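As a quick sanity check, we can verify that \(V^*\) indeed satisfies the Bellman equation: the residual between the two sides should simplify to zero.
# Sanity check: with V = V_opt, the Bellman equation holds identically,
# so this residual should simplify to 0.
simplify((Bell_lhs - Bell_rhs_opt).subs(V, V_opt))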
(Notice: \(K^*\) is asymptotically equivalent to \(\frac{1}{\sqrt{r}}\) as \(r\to \infty\).)
Special values of \(K^*\):
(K_opt.subs(r, 0),
K_opt.subs(r, 1).evalf(3),
K_opt.subs(r, 10).evalf(3))
Transform the symbolic expression into a numerical function, to be evaluated and plotted:
import numpy
from numpy import array, sqrt
K_opt_r = sympy.lambdify(r, K_opt, numpy)
K_opt_r(array([0, 1, 10]))
import matplotlib.pyplot as plt
%matplotlib inline
r_list = numpy.logspace(-2,3.9, 100)
plt.loglog(r_list, K_opt_r(r_list), label='Optimal gain $K^*$')
plt.loglog(r_list, 1/sqrt(r_list), 'r--', label=r'$1/\sqrt{r}$ asymptote')
plt.grid(True)
plt.xlabel('Control penalty r')
plt.ylim(top=2)  # 'top' replaces the deprecated 'ymax' keyword in recent matplotlib
plt.legend(loc='lower left');
Using python-control to find the optimal feedback numerically (http://python-control.sourceforge.net/manual/synthesis.html), based on SLICOT Fortran routines (http://slicot.org/).

→ Requires the control and slycot packages (installing slycot requires a Fortran compiler, like gfortran on Linux, and lapack-dev for the LAPACK headers).
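For reference, a typical installation on a Debian-like system might look like the following (a sketch; package names are indicative and may vary by distribution and version):
# Indicative installation (Debian/Ubuntu; package names may vary):
# sudo apt-get install gfortran liblapack-dev
!pip install slycot control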
import control
Linear quadratic regulator design with lqr (http://python-control.sourceforge.net/manual/synthesis.html#control.lqr).
Problem: lqr only handles continuous time; discrete time is not yet implemented (→ student project?). A SciPy-based workaround is sketched below.
control.lqr??
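In the meantime, a workaround sketch (assuming SciPy is available; this uses scipy.linalg.solve_discrete_are rather than python-control): solve the discrete-time Riccati equation numerically for the integrator with \(r = 10\), and recover the gain, which should match the symbolic \(K^*\) evaluated above (≈ 0.270).
# Workaround sketch: discrete-time LQR via SciPy (not python-control)
from scipy.linalg import solve_discrete_are

Ad = numpy.array([[1.]])   # x_{k+1} = 1*x_k + 1*u_k
Bd = numpy.array([[1.]])
Qd = numpy.array([[1.]])   # state penalty
Rd = numpy.array([[10.]])  # control penalty r = 10
P = solve_discrete_are(Ad, Bd, Qd, Rd)
Kd = numpy.linalg.solve(Rd + Bd.T.dot(P).dot(Bd), Bd.T.dot(P).dot(Ad))
Kd  # ≈ 0.270, matching K_opt.subs(r, 10) above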
Integrator, in continuous time:
\[\dot{x} = u\]
A = 0
B = 1
Q = 1
R = 100
K, S, E = control.lqr(A, B, Q, R)
K
1/numpy.sqrt(R)
In continuous time, the optimal control is the solution of the Hamilton-Jacobi-Bellman (HJB) equation
\[0 = \min_u \left( c(x,u) + \frac{\partial J}{\partial x} f(x,u) \right)\]
(in steady state).
x, u = symbols('x u')
x_der = u
x_der
r = symbols('r')
c = x**2 + r*u**2
c
V = symbols('V')
J = V*x**2
HJB = c + J.diff(x)*x_der
HJB
Minimize with respect to \(u\):
solve(HJB.diff(u), u)
u_opt = solve(HJB.diff(u), u)[0]
K_opt = -u_opt/x
K_opt
Solve for \(V\) (again keeping only the positive solution):
solve(HJB.subs(u, u_opt), V)
V_opt = solve(HJB.subs(u, u_opt), V)[1]
V_opt
Optimal gain:
K_opt = K_opt.subs(V, V_opt)
K_opt
Observation: the optimal gains in discrete time and continuous time match for large values of \(r\) (i.e. for systems with slow dynamics).
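To illustrate this numerically, a sketch using the closed-form gains derived above (\(K^* = V^*/(r+V^*)\) with \(V^* = (1+\sqrt{1+4r})/2\) in discrete time, and \(K^* = 1/\sqrt{r}\) in continuous time):
# Compare the discrete-time and continuous-time optimal gains:
# they converge as r grows.
r_vals = numpy.array([1., 10., 100., 1000.])
V_vals = (1 + numpy.sqrt(1 + 4*r_vals))/2   # discrete-time V*
K_discrete = V_vals/(r_vals + V_vals)       # discrete-time gain V*/(r+V*)
K_continuous = 1/numpy.sqrt(r_vals)         # continuous-time gain 1/sqrt(r)
numpy.column_stack([r_vals, K_discrete, K_continuous])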