On Learning Symmetric Locomotion

The paper starts from the idea that motion symmetry may improve or speed up the learning of locomotion.

In this paper, four different methods of incorporating symmetry into the learning process are introduced:

DUP - Duplicating tuples with their symmetric counterparts.
LOSS - Adding a symmetry auxiliary loss.
PHASE - Motion phase mirroring.
NET - Enforcing symmetry in the network itself.

Contents:

Intro. to Symmetry Enforcement Methods
Methods
- DUP
- LOSS
- PHASE
- NET

Intro. to Symmetry Enforcement Methods

Def) Two trajectories are symmetric if each state-action tuple $(s, a)$ from one trajectory corresponds to a state-action tuple $(M_s(s), M_a(a))$ from the other trajectory.

$M_s : S \to S$, $M_a : A \to A$

$M_s(s)$ - the mirror state of $s$
$M_a(a)$ - the mirror action of $a$
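To make the mirror functions concrete, here is a minimal sketch for a toy biped. The state and action layouts, and the names `mirror_state` and `mirror_action`, are assumptions made up for illustration and are not taken from the paper. In this toy layout, mirroring swaps the left and right leg entries and negates the lateral component.

```python
# Hypothetical layout (an assumption for illustration, not from the paper):
# state  = [torso_lateral_vel, left_hip, left_knee, right_hip, right_knee]
# action = [left_hip_torque, left_knee_torque, right_hip_torque, right_knee_torque]

def mirror_state(s):
    """M_s: negate the lateral component and swap left/right leg entries."""
    lateral, l_hip, l_knee, r_hip, r_knee = s
    return [-lateral, r_hip, r_knee, l_hip, l_knee]


def mirror_action(a):
    """M_a: swap left/right leg torques."""
    l_hip, l_knee, r_hip, r_knee = a
    return [r_hip, r_knee, l_hip, l_knee]


s = [0.3, 0.1, -0.5, 0.4, -0.2]
a = [1.0, -2.0, 0.5, 0.0]

# The symmetric counterpart of the tuple (s, a) is (M_s(s), M_a(a)).
mirrored_tuple = (mirror_state(s), mirror_action(a))

# In this toy layout, mirroring twice returns the original tuple.
assert mirror_state(mirror_state(s)) == s
assert mirror_action(mirror_action(a)) == a
```

For a real robot model, which entries are swapped and which are negated depends on how its observation and action vectors are laid out.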

Similarly, we can define a symmetric policy as one that satisfies, for every state $s$,

$\pi_{\theta}(M_s (s)) = M_a (\pi_{\theta}(s))$

⇒ A symmetric policy thus produces the mirrored action when given the mirrored state as input.
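The condition above can be checked numerically. The following is a small sketch under assumed names: the three-element state layout, the two hand-written policies, and the helper `symmetry_error` are all hypothetical and only illustrate the definition. The helper measures how far $\pi(M_s(s))$ is from $M_a(\pi(s))$ for a given state.

```python
# Hypothetical toy layout (an assumption, not from the paper):
# state  = [lateral_vel, left_joint, right_joint]
# action = [left_torque, right_torque]

def mirror_state(s):
    lateral, left, right = s
    return [-lateral, right, left]


def mirror_action(a):
    left, right = a
    return [right, left]


def symmetry_error(policy, s):
    """Largest absolute gap between pi(M_s(s)) and M_a(pi(s))."""
    lhs = policy(mirror_state(s))
    rhs = mirror_action(policy(s))
    return max(abs(x - y) for x, y in zip(lhs, rhs))


def symmetric_policy(s):
    # Both legs use the same gain; the lateral term enters with opposite sign.
    lateral, left, right = s
    return [-2.0 * left + lateral, -2.0 * right - lateral]


def asymmetric_policy(s):
    # The legs use different gains, so the mirrored state does not
    # produce the mirrored action.
    lateral, left, right = s
    return [-2.0 * left, -1.0 * right]


s = [0.2, 0.5, -0.3]
err_sym = symmetry_error(symmetric_policy, s)    # zero for this policy
err_asym = symmetry_error(asymmetric_policy, s)  # clearly nonzero
```

A single state only gives a spot check; in practice the error would be evaluated over many sampled states.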