Problems with previous approaches:
dynamic programming (DP)
Lyapunov approaches
<aside> 💡 These are model-based approaches! Maybe useful to apply e.g. SOS to model-free RL?
</aside>
Approach in this chap.:
Instead of trying to solve for the entire state space (realistically there are infinitely many states),
we will formulate a simpler version of the optimization problem:
attempt to find an optimal control solution for only a single initial condition $x[0] = x_0$.
Given a control system $\dot{x} = f(x, u)$ with an initial condition $x_0$, and an input traj. $u(t)$ defined over a finite interval $t \in [t_0, t_f]$
$\min_{u(.)} \int_{t_0}^{t_f} l(x,u)\ dt$
subject to $\dot{x} = f(x, u)$ and $x(t_0) = x_0$, plus additional constraints (e.g. collision avoidance, input/torque limits)
where $l(x, u)$ is the instantaneous cost and $u(.)$ denotes $u(t)$ for $t \in [t_0, t_f]$
BUT to formulate this as a numerical optimization,
we need to apply a finite parameterization of $u(.)$
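One possible finite parameterization, sketched under assumptions that are not in these notes: hold the input constant over $N$ equal time steps, integrate the dynamics with forward Euler, and approximate the integral cost with a left Riemann sum. The name `rollout_cost` and the scalar example ($\dot{x} = -x + u$, $l = x^2 + u^2$) are hypothetical and chosen only for illustration.

```python
import math

def rollout_cost(f, l, x0, u_knots, t0, tf):
    """Approximate the continuous-time cost with N piecewise-constant inputs.

    f: dynamics xdot = f(x, u); l: instantaneous cost l(x, u).
    Returns the Euler state samples x[0..N] and the Riemann-sum cost.
    """
    N = len(u_knots)
    h = (tf - t0) / N            # uniform time step (an assumption)
    x = x0
    xs = [x0]
    cost = 0.0
    for u in u_knots:
        cost += l(x, u) * h      # left Riemann sum of the running cost
        x = x + h * f(x, u)      # forward Euler step with u held constant
        xs.append(x)
    return xs, cost

# Hypothetical scalar example: xdot = -x + u, l = x^2 + u^2, zero input.
f = lambda x, u: -x + u
l = lambda x, u: x * x + u * u
xs, J = rollout_cost(f, l, 1.0, [0.0] * 100, 0.0, 1.0)

# With u = 0 the exact cost is the integral of exp(-2t) over [0, 1].
J_exact = (1.0 - math.exp(-2.0)) / 2.0
```

The decision variables are now the $N$ numbers in `u_knots` rather than a function $u(t)$, which is what makes the problem accessible to a numerical solver. Finer grids or higher-order integrators would reduce the discretization error.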
Different ways:
1) Direct Transcription
For a discrete-time linear system, the analogous problem is
$\min_{ u[.]} \displaystyle\sum_{n=0}^{N-1} l(x[n], u[n])$ subject to $x[n+1] = Ax[n] + Bu[n]$ and $x[0] = x_0$
Direct transcription:
adding x[.] as decision variables
$\min_{x[.], u[.]} \displaystyle\sum_{n=0}^{N-1} l(x[n], u[n])$ subject to $x[n+1] = Ax[n] + Bu[n]$ and $x[0] = x_0$
“If we can restrict our additional constraints to linear inequality constraints and our objective function to being linear/quadratic in x and u, then the resulting trajectory optimization is a convex optimization.”
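As a rough illustration of the convex case, here is a minimal sketch for a scalar system with quadratic cost $l = q x^2 + r u^2$ and no inequality constraints. Under those assumptions the problem is an equality-constrained QP, and its optimum can be found by solving the KKT linear system directly. The function names, the scalar setting, and the hand-written Gaussian elimination are all choices made for this example, not something the notes prescribe; in practice a QP solver would handle this.

```python
def solve_linear(M, rhs):
    """Solve M z = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    A = [row[:] + [rhs[i]] for i, row in enumerate(M)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(A[r][col]))
        A[col], A[piv] = A[piv], A[col]
        for r in range(col + 1, n):
            factor = A[r][col] / A[col][col]
            for c in range(col, n + 1):
                A[r][c] -= factor * A[col][c]
    z = [0.0] * n
    for i in reversed(range(n)):
        s = sum(A[i][j] * z[j] for j in range(i + 1, n))
        z[i] = (A[i][n] - s) / A[i][i]
    return z

def direct_transcription(a, b, q, r, x0, N):
    """Decision variables z = [x[0..N], u[0..N-1]]; returns (xs, us)."""
    nx, nu = N + 1, N
    nz, nc = nx + nu, N + 1          # number of variables and constraints
    size = nz + nc
    K = [[0.0] * size for _ in range(size)]   # KKT matrix [[H, C^T], [C, 0]]
    rhs = [0.0] * size
    for n in range(N):
        K[n][n] = 2.0 * q            # Hessian entry for x[n], n < N
        K[nx + n][nx + n] = 2.0 * r  # Hessian entry for u[n]

    def add_constraint(row, coeffs, value):
        # Fill one row of C (and the matching column of C^T).
        for col, coef in coeffs:
            K[nz + row][col] = coef
            K[col][nz + row] = coef
        rhs[nz + row] = value

    add_constraint(0, [(0, 1.0)], x0)                  # x[0] = x0
    for n in range(N):                                 # x[n+1] - a x[n] - b u[n] = 0
        add_constraint(n + 1, [(n + 1, 1.0), (n, -a), (nx + n, -b)], 0.0)
    sol = solve_linear(K, rhs)
    return sol[:nx], sol[nx:nz]

# Hypothetical numbers for illustration.
xs_opt, us_opt = direct_transcription(a=0.9, b=0.5, q=1.0, r=1.0, x0=1.0, N=5)
J_opt = sum(x * x + u * u for x, u in zip(xs_opt, us_opt))
```

Note that the dynamics appear only as linear equality constraints tying neighbouring entries of `x[.]` and `u[.]` together. The solver never simulates the system; feasibility of the trajectory is enforced by the constraints.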
2) Direct Shooting
You might have noticed adding x[.] as decision variables was not necessary.
If we know x[0] and u[.], then we can compute x[.] by forward simulation, so only u[.] needs to remain as a decision variable
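A minimal sketch of this idea for a scalar linear system: `rollout` recovers x[.] from x[0] and u[.], and `shooting_cost` is then a function of u[.] alone. The default parameter values, the quadratic cost, and the finite-difference gradient descent used to minimize it are assumptions made for this example. They stand in for whatever optimizer one would actually use.

```python
def rollout(a, b, x0, us):
    """Forward-simulate x[n+1] = a x[n] + b u[n]; returns x[0..N]."""
    xs = [x0]
    for u in us:
        xs.append(a * xs[-1] + b * u)
    return xs

def shooting_cost(us, a=0.9, b=0.5, q=1.0, r=1.0, x0=1.0):
    """Cost as a function of u[.] only; x[.] comes from the rollout."""
    xs = rollout(a, b, x0, us)
    # zip pairs x[0..N-1] with u[0..N-1], matching the sum over n = 0..N-1
    return sum(q * x * x + r * u * u for x, u in zip(xs, us))

def finite_diff_grad(cost, us, eps=1e-6):
    """Central-difference gradient of cost with respect to each u[n]."""
    grad = []
    for i in range(len(us)):
        up = us[:i] + [us[i] + eps] + us[i + 1:]
        dn = us[:i] + [us[i] - eps] + us[i + 1:]
        grad.append((cost(up) - cost(dn)) / (2.0 * eps))
    return grad

# Plain gradient descent on u[.], starting from zero input.
us = [0.0] * 5
J_init = shooting_cost(us)
for _ in range(200):
    g = finite_diff_grad(shooting_cost, us)
    us = [u - 0.1 * gi for u, gi in zip(us, g)]
J_final = shooting_cost(us)
```

Compared with direct transcription, there are fewer decision variables and no dynamics constraints, but the cost now depends on u[.] through the whole simulated trajectory.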