Lee, A. X., Lu, H., Gupta, A., Levine, S., & Abbeel, P. (2015). Learning Force-Based Manipulation of Deformable Objects from Multiple Demonstrations. Proceedings of the IEEE International Conference on Robotics and Automation. doi:10.1109/ICRA.2015.7138997

Summary

This paper builds on Schulman et al. by warping not only the poses of the robot’s trajectory but also the forces applied to the objects. This makes for a more robust and generalizable scheme for manipulating deformable objects (such as rope).

Methods

n/a

Algorithm

The algorithm has several steps:

Step 1

Perform scene registration, where the points from the test scene are matched to the points from the demonstrated scene to obtain a function $f$ that transforms between the two.
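As a rough illustration of what "finding $f$" means, here is a minimal sketch that fits an affine map by least squares. It is a stand-in, not the registration method used in the paper (these notes do not describe that method): it assumes point correspondences are already known and that an affine $f$ is adequate. The function names `fit_affine_registration` and `apply_affine` are made up for this example.

```python
import numpy as np

def fit_affine_registration(demo_points, test_points):
    """Least-squares affine map f(p) = A @ p + b from demo points to test points.

    Assumption: row i of demo_points corresponds to row i of test_points.
    """
    demo_points = np.asarray(demo_points, dtype=float)
    test_points = np.asarray(test_points, dtype=float)
    n = demo_points.shape[0]
    # Append a column of ones so the offset b is estimated together with A.
    X = np.hstack([demo_points, np.ones((n, 1))])
    params, *_ = np.linalg.lstsq(X, test_points, rcond=None)
    A = params[:-1].T  # linear part
    b = params[-1]     # translation
    return A, b

def apply_affine(A, b, points):
    """Apply f(p) = A @ p + b to each row of points."""
    return np.asarray(points, dtype=float) @ A.T + b
```

A deformable object such as rope generally needs a nonlinear $f$; the affine case is only the simplest instance of the same idea.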

Step 2

The trajectories from the demonstrated scenes are mapped to the new scene using $f$: the poses are given by $\tilde{\mathbf{Q}}^d=f(\hat{\mathbf{Q}}^d)$, and the forces are transformed by the Jacobian of $f$, giving $\tilde{\mathbf{U}}^d=\frac{df}{d\mathbf{p}}(\hat{\mathbf{Q}}^d)\hat{\mathbf{U}}^d$.
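The transfer step can be sketched as follows, assuming `f` is any callable mapping a 3-D point to a 3-D point. For simplicity the sketch treats poses as positions only (orientation is ignored) and approximates the Jacobian by central differences; both are assumptions of this example, and `numerical_jacobian` and `transfer_trajectory` are hypothetical names.

```python
import numpy as np

def numerical_jacobian(f, p, eps=1e-6):
    """Central-difference Jacobian of f at p; entry [j, i] is d f_j / d p_i."""
    p = np.asarray(p, dtype=float)
    cols = []
    for i in range(p.size):
        step = np.zeros(p.size)
        step[i] = eps
        cols.append((f(p + step) - f(p - step)) / (2 * eps))
    return np.stack(cols, axis=1)

def transfer_trajectory(f, positions, forces):
    """Warp positions with f and forces with the Jacobian of f at each position."""
    positions = np.asarray(positions, dtype=float)
    forces = np.asarray(forces, dtype=float)
    new_positions = np.array([f(p) for p in positions])
    new_forces = np.array(
        [numerical_jacobian(f, p) @ u for p, u in zip(positions, forces)]
    )
    return new_positions, new_forces
```

For a pure scaling `f = lambda p: 2 * p + 1`, every position is scaled and shifted while every force vector is simply doubled, since the Jacobian is `2 * I` everywhere.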

Step 3

The trajectories are time-aligned using “dynamic time warping” such that all demonstrated trajectories follow the same time course. This yields $\mathbf{Q}^d$ and $\mathbf{U}^d$.
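For reference, a minimal sketch of textbook dynamic time warping between one reference trajectory and one query trajectory. The paper may use a different variant or cost; the Euclidean cost, the averaging used to resample the query, and the names `dtw_path` and `align_to_reference` are assumptions of this example.

```python
import numpy as np

def _as_2d(x):
    """Make a trajectory an array of shape (T, k)."""
    x = np.asarray(x, dtype=float)
    return x[:, None] if x.ndim == 1 else x

def dtw_path(reference, query):
    """Return the optimal warping path as a list of (reference_idx, query_idx)."""
    reference, query = _as_2d(reference), _as_2d(query)
    n, m = len(reference), len(query)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = np.linalg.norm(reference[i - 1] - query[j - 1])
            cost[i, j] = d + min(cost[i - 1, j], cost[i, j - 1], cost[i - 1, j - 1])
    # Backtrack from the end, always moving to the cheapest predecessor.
    i, j = n, m
    path = [(i - 1, j - 1)]
    while i > 1 or j > 1:
        candidates = [
            (cost[i - 1, j - 1], i - 1, j - 1),
            (cost[i - 1, j], i - 1, j),
            (cost[i, j - 1], i, j - 1),
        ]
        _, i, j = min(candidates)
        path.append((i - 1, j - 1))
    path.reverse()
    return path

def align_to_reference(reference, query):
    """Resample query onto the reference's time axis by averaging matched samples."""
    query = _as_2d(query)
    path = dtw_path(reference, query)
    aligned = []
    for i in range(len(reference)):
        matched = [j for (ri, j) in path if ri == i]
        aligned.append(query[matched].mean(axis=0))
    return np.array(aligned)
```

Applying `align_to_reference` to every demonstration against a common reference gives trajectories of equal length, which is what the per-time-step fitting in the next step needs.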

Step 4

To extract the final trajectory based on the time-aligned demonstrated trajectories, Lee et al. use a Gaussian model of the force:

Here, $\bar{\mathbf{q}}_t$, $\dot{\bar{\mathbf{q}}}_t$, and $\bar{\mathbf{u}}_t$ are the positions, velocities, and forces of the desired trajectory and are computed by fitting a joint Gaussian distribution to $[\mathbf{q}_t^d,\dot{\mathbf{q}}_t^d,\mathbf{u}_t^d]$ at each time step and then taking the mean of that distribution. The variables $\mathbf{K}_{pt}$ and $\mathbf{K}_{vt}$ are the position and velocity gains and are computed from the covariances of the fitted joint Gaussians: $\mathbf{K}_{pt}=-\Sigma_{\mathbf{uq},t}\Sigma^{-1}_{\mathbf{qq},t}$ and $\mathbf{K}_{vt}=-\Sigma_{\mathbf{u\dot{q}},t}\Sigma^{-1}_{\mathbf{\dot{q}\dot{q}},t}$ (see Section 8.1.3 of the Matrix Cookbook for intuition about where this comes from).
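To spell out the conditional-Gaussian intuition, here is a short sketch. It assumes the standard formula for the conditional mean of a joint Gaussian and writes $\mathbf{q}_t$ and $\dot{\mathbf{q}}_t$ for the observed position and velocity, symbols these notes do not define. Conditioning the force on the position gives

$$\mathbb{E}[\mathbf{u}_t \mid \mathbf{q}_t] = \bar{\mathbf{u}}_t + \Sigma_{\mathbf{uq},t}\Sigma^{-1}_{\mathbf{qq},t}(\mathbf{q}_t-\bar{\mathbf{q}}_t) = \bar{\mathbf{u}}_t + \mathbf{K}_{pt}(\bar{\mathbf{q}}_t-\mathbf{q}_t),$$

which is where the minus sign in $\mathbf{K}_{pt}$ comes from. Treating the velocity term the same way suggests a controller of the form

$$\mathbf{u}_t = \bar{\mathbf{u}}_t + \mathbf{K}_{pt}(\bar{\mathbf{q}}_t-\mathbf{q}_t) + \mathbf{K}_{vt}(\dot{\bar{\mathbf{q}}}_t-\dot{\mathbf{q}}_t).$$

This is a reconstruction from the definitions above, not a quotation of the paper's equation.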

Rather than directly fitting the covariances $\Sigma_t$, they use an inverse-Wishart prior on the covariance in order to encourage the gains to be non-negative. They additionally include samples from nearby time points, which encourages the estimate to vary smoothly over time.
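A minimal sketch of how such a regularized fit could look, under several assumptions that are not from the paper: the sample mean is treated as the true mean, samples from a window of neighboring time steps are pooled around one common mean, the force has the same dimension `k` as the position, and the prior parameters `prior_scale` and `prior_dof` are placeholder defaults. The covariance estimate is the mode of the inverse-Wishart posterior, and the velocity gain uses the velocity-velocity covariance block by analogy with the position gain.

```python
import numpy as np

def map_covariance(samples, prior_scale, prior_dof):
    """Posterior-mode covariance under an inverse-Wishart(prior_scale, prior_dof) prior.

    Simplification: the sample mean is treated as the known mean.
    """
    samples = np.asarray(samples, dtype=float)
    n, p = samples.shape
    mean = samples.mean(axis=0)
    centered = samples - mean
    scatter = centered.T @ centered
    # Posterior is inverse-Wishart(prior_scale + scatter, prior_dof + n); this is its mode.
    cov = (prior_scale + scatter) / (prior_dof + n + p + 1)
    return mean, cov

def gains_at_time(Q, Qdot, U, t, window=1, prior_scale=None, prior_dof=None):
    """Mean and gains at time t from arrays of shape (num_demos, T, k)."""
    Q, Qdot, U = (np.asarray(a, dtype=float) for a in (Q, Qdot, U))
    D, T, k = Q.shape
    # Pool samples from neighboring time steps to smooth the estimate over time.
    lo, hi = max(0, t - window), min(T, t + window + 1)
    X = np.concatenate([Q[:, lo:hi], Qdot[:, lo:hi], U[:, lo:hi]], axis=2)
    X = X.reshape(-1, 3 * k)
    if prior_scale is None:
        prior_scale = np.eye(3 * k)  # hypothetical default
    if prior_dof is None:
        prior_dof = 3 * k + 2        # hypothetical default
    mean, cov = map_covariance(X, prior_scale, prior_dof)
    S_qq, S_vv = cov[:k, :k], cov[k:2 * k, k:2 * k]
    S_uq, S_uv = cov[2 * k:, :k], cov[2 * k:, k:2 * k]
    # K = -S_u* @ inv(S_**), computed with solve instead of an explicit inverse.
    K_p = -np.linalg.solve(S_qq, S_uq.T).T
    K_v = -np.linalg.solve(S_vv, S_uv.T).T
    return mean, K_p, K_v
```

How the prior is chosen determines what it encourages; the identity scale used here only pulls the estimate toward uncorrelated position, velocity, and force, which shrinks the gains toward zero.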

Step 5

Lee et al. now optimize the robot joint angles $\Theta=[\theta_1,\ldots{},\theta_T]^\top$ to match the desired trajectory (as computed in the previous step). Then, they convert the gains into joint-space gains $\mathbf{K}_{pt}^\theta$ and $\mathbf{K}_{vt}^\theta$ and compute the torque as:

where $\mathbf{J}(\theta)$ is the Jacobian and where $\theta_\mathrm{obs}$ and $\dot{\theta}_\mathrm{obs}$ are the observed joint angles and velocities.
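One standard way to arrive at a joint-space controller of this kind, given here as a sketch rather than the paper's exact expression, assumes the Jacobian-transpose relation $\tau=\mathbf{J}(\theta)^\top\mathbf{u}$ and the linearization $\bar{\mathbf{q}}_t-\mathbf{q}_t\approx\mathbf{J}(\theta_t)(\theta_t-\theta_\mathrm{obs})$. Substituting these into a force law with a feedforward term $\bar{\mathbf{u}}_t$ and gains $\mathbf{K}_{pt}$ and $\mathbf{K}_{vt}$ gives

$$\tau_t = \mathbf{J}(\theta_t)^\top\bar{\mathbf{u}}_t + \mathbf{K}_{pt}^\theta(\theta_t-\theta_\mathrm{obs}) + \mathbf{K}_{vt}^\theta(\dot{\theta}_t-\dot{\theta}_\mathrm{obs}),$$

with joint-space gains

$$\mathbf{K}_{pt}^\theta = \mathbf{J}(\theta_t)^\top\mathbf{K}_{pt}\,\mathbf{J}(\theta_t), \qquad \mathbf{K}_{vt}^\theta = \mathbf{J}(\theta_t)^\top\mathbf{K}_{vt}\,\mathbf{J}(\theta_t).$$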

Takeaways

This approach is a significant improvement over previous work, which was unable to generalize to situations in which the applied force was more important than the exact position (e.g. folding a larger towel than the one in the demonstrations requires applying the same force, but moving the arms out further in order to apply that force). The core idea is still the same, however: without having any task-specific or physical knowledge, map from demonstrated trajectories (including forces) to new situations.

I think it’s really interesting that they are able to take force into account and trade off between force and position without actually having knowledge of the task; I’m surprised it works as well as it does! The intuition behind why it works is that variance in the demonstrated positions implies that absolute position is less important and that something else, such as force, is more important. Thus, where the variance in position is high, the force should be weighted more strongly, and where it is low, position should be weighted more strongly.

I do wonder, though, whether this property always holds. For example, if you were trying to unscrew a cap from a jar, and different demonstrations had caps that were stuck more or less tightly, I think the position of the gripper would still end up being about the same; it would be the force that differs. In that case, I think this approach wouldn’t work, because it would move the gripper in the appropriate way but not apply the right force. In fact, having demonstrations of the force wouldn’t necessarily be sufficient either: to accomplish this task, you would need some notion of the position of the jar and the cap, how much the cap is moving, etc.