Feedback Control For Cassie with Deep Reinforcement Learning


Problems with model-based approaches:

Control strategies often rely on reduced-order abstractions (simplified models) to make the problem solvable.

⇒ the controller is not fully aware of all the details of the full system (torque limits, joint limits, etc.)

Alternative solution ⇒ Deep RL offers a model-free approach

But problem: at the time of the paper, DRL results were often based on ad hoc (purpose-specific), simplified simulation models.

In this paper:

Demonstrate the effectiveness of DRL on a real robot, Cassie

Summarized approach:

Formulate the feedback control problem as a search for an optimal imitation policy for a Markov Decision Process (MDP) → apply DRL to train controllers for bipedal walking tasks in a model-free manner from a single reference motion

Contents

Background

A. RL & Policy Gradient Methods

Markov Decision Process
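
For reference, a standard textbook formulation of an MDP and the objective that policy gradient methods maximize is sketched below. The symbols $S$, $A$, $P$, $r$, $\gamma$, $\pi_\theta$ and $J$ are the usual conventions and are assumed here; they are not taken from these notes, and the paper's notation may differ.

$$
\text{MDP} = (S, A, P, r, \gamma), \qquad P(s_{t+1} \mid s_t, a_t), \qquad r : S \times A \to \mathbb{R}, \qquad \gamma \in [0, 1)
$$

$$
J(\theta) = \mathbb{E}_{\tau \sim \pi_\theta}\left[ \sum_{t=0}^{T} \gamma^{t} \, r(s_t, a_t) \right], \qquad \theta^{*} = \arg\max_{\theta} J(\theta)
$$

Here $\tau = (s_0, a_0, s_1, a_1, \dots)$ is a trajectory obtained by following the policy $\pi_\theta(a \mid s)$, and policy gradient methods improve $\theta$ by ascending an estimate of $\nabla_\theta J(\theta)$.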

B. Feedback Control

Given a dynamical system $x_{t+1} = f(x_t, u_t)$, where $x_t, x_{t+1} \in X \subseteq \mathbb{R}^n$ and $u_t \in U \subseteq \mathbb{R}^m$,

trajectory optimization is often done offline to produce a nominal trajectory, with states $\hat{X}$ and controls $\hat{U}$, that satisfies the equations of motion.
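
To make "satisfies the equations of motion" concrete, here is a minimal sketch that rolls a control sequence through $x_{t+1} = f(x_t, u_t)$ and measures how far a candidate trajectory deviates from the dynamics. The double-integrator `f`, the time step `dt`, and the helper names `rollout` and `max_defect` are all hypothetical stand-ins chosen for illustration; they are not Cassie's dynamics and not the paper's trajectory optimizer.

```python
from typing import Callable, List, Sequence

State = List[float]
Control = List[float]
Dynamics = Callable[[Sequence[float], Sequence[float]], State]


def f(x: Sequence[float], u: Sequence[float], dt: float = 0.1) -> State:
    """Toy double integrator (assumed stand-in for the real robot dynamics).

    State x = [position, velocity], control u = [acceleration].
    """
    pos, vel = x
    (acc,) = u
    return [pos + dt * vel, vel + dt * acc]


def rollout(x0: Sequence[float], controls: Sequence[Control],
            dynamics: Dynamics = f) -> List[State]:
    """Apply x_{t+1} = f(x_t, u_t) repeatedly; T controls yield T + 1 states."""
    states = [list(x0)]
    for u in controls:
        states.append(dynamics(states[-1], u))
    return states


def max_defect(states: Sequence[State], controls: Sequence[Control],
               dynamics: Dynamics = f) -> float:
    """Largest violation of x_{t+1} = f(x_t, u_t) along a candidate trajectory."""
    worst = 0.0
    for x, u, x_next in zip(states, controls, states[1:]):
        predicted = dynamics(x, u)
        worst = max(worst, max(abs(p - q) for p, q in zip(predicted, x_next)))
    return worst


# Nominal controls: accelerate for 5 steps, then brake for 5 steps.
u_hat = [[1.0]] * 5 + [[-1.0]] * 5
# Nominal states obtained by simulating the dynamics from rest.
x_hat = rollout([0.0, 0.0], u_hat)

# A trajectory generated by the dynamics has zero defect by construction.
print(max_defect(x_hat, u_hat))
```

A real trajectory optimizer would additionally minimize a cost and enforce limits on $x_t$ and $u_t$; the defect check above only illustrates the dynamics constraint that the nominal pair $(\hat{X}, \hat{U})$ has to satisfy.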