
In this paper:

Contents:

Previous Methods

Method

Command input: forward velocity, lateral velocity, and yaw rate.

Policy network: maps the observation of the current state and the joint state history to the joint position targets.

Actuator network: maps the joint state history and the joint position targets to 12 joint torque values.

Rigid-body simulator: outputs the next state of the robot, given the joint torques and the current state as input
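The data flow between the four components above can be sketched as a single simulation step. This is only an illustrative sketch, not the paper's implementation: the function names, the history length, the gain, the time step, and the placeholder math inside each function are all assumptions, and the joint state history is simplified here to a short list of past joint positions. Only the wiring between the components follows the notes.

```python
NUM_JOINTS = 12    # the notes mention 12 joint torque values
HISTORY_LEN = 3    # assumed history length (hypothetical)


def policy_network(observation, joint_history):
    """Stand-in for the learned policy: returns 12 joint position targets."""
    forward_velocity = observation["command"][0]
    latest_positions = joint_history[-1]
    # Placeholder rule: nudge each joint in proportion to the forward command.
    return [q + 0.01 * forward_velocity for q in latest_positions]


def actuator_network(joint_history, position_targets):
    """Stand-in for the learned actuator model: returns 12 joint torques."""
    kp = 20.0  # assumed proportional gain, for illustration only
    latest_positions = joint_history[-1]
    return [kp * (target - q) for target, q in zip(position_targets, latest_positions)]


def rigid_body_simulator(state, torques, dt=0.01):
    """Stand-in simulator: unit-inertia joints with explicit Euler integration."""
    velocities = [v + tau * dt for v, tau in zip(state["joint_velocities"], torques)]
    positions = [q + v * dt for q, v in zip(state["joint_positions"], velocities)]
    return {"joint_positions": positions, "joint_velocities": velocities}


def step(state, joint_history, command):
    """One pass: command + state -> policy -> actuator model -> simulator."""
    observation = {"command": command, **state}
    targets = policy_network(observation, joint_history)
    torques = actuator_network(joint_history, targets)
    next_state = rigid_body_simulator(state, torques)
    # Keep only the most recent HISTORY_LEN joint position snapshots.
    new_history = (joint_history + [next_state["joint_positions"]])[-HISTORY_LEN:]
    return next_state, new_history, torques


# Command input: (forward velocity, lateral velocity, yaw rate).
command = (0.5, 0.0, 0.0)
state = {
    "joint_positions": [0.0] * NUM_JOINTS,
    "joint_velocities": [0.0] * NUM_JOINTS,
}
history = [list(state["joint_positions"]) for _ in range(HISTORY_LEN)]

for _ in range(5):
    state, history, torques = step(state, history, command)
```

In an actual setup the two stand-in functions would be replaced by trained networks and the simulator by a real rigid-body engine; the loop structure is the part this sketch is meant to show.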
