Kitaev, N., Mordatch, I., Patil, S., & Abbeel, P. (2015). Physics-Based Trajectory Optimization for Grasping in Cluttered Environments. Proceedings Of the IEEE International Conference on Robotics and Automation. Retrieved from http://www.eecs.berkeley.edu/ pabbeel/papers/2015-ICRA-clutter.pdf

Summary

In this paper, Kitaev et al. use rollouts from a physics engine in order to compute the gradient of the object, which takes into account physical dynamics. They compare two different methods: a baseline which only uses straight trajectories, and their physics-based method. They find that the physics-based method is able to produce behavior that is similar to that exhibited by people (e.g. weaving through cluttered objects) and which avoids catastrophic physical events.

Algorithm

Baseline method

First, the baseline approach finds a straight line trajectory at some particular approach angle $\alpha$. For any particular value of $\alpha$, the trajectory $\mathbf{q}_\alpha$ is computed using sequental quadratic programming to minimize an objective function:

$\min_{\mathbf{q}^{1,\ldots{},\Delta}} \sum_t \sum_k w_k c_k^t (\mathbf{q}^t,\mathbf{q}^{t-1})$ $\begin{align*} \mathrm{s.t.}\ \ & \mathbf{p}_\mathrm{gripper}(\mathbf{q}^\Delta)=\mathbf{p}_\mathrm{gripper}(\mathbf{q}_\alpha^\Delta)\\ & R_\mathrm{gripper}(\mathbf{q}^\Delta)=R_\mathrm{gripper}(\mathbf{q}_\alpha^\Delta) \end{align*}$

where $\mathbf{p}_\mathrm{gripper}$ is the position of the gripper and $R_\mathrm{gripper}$ is the rotation of the gripper. The cost terms are:

$c_\mathrm{static\underline{\ }hull}$ – penalizes penetration between the robot and static obstacles if they are closer than some distance $d_\mathrm{safe}$.
$c_\mathrm{vel}$ – penalizes the robot’s velocity to ensure smooth trajectories

Trajectories are computed for each value of $\alpha$ and then physics simulations are used to evaluate each $\mathbf{q}_\alpha$ (criteria described below). The $\alpha$ corresponding to the best trajectory is chosen.

Physics-based method

The physics-based method also takes into account the physical dynamics of the static objects – not just that of the robot. The objective is:

$\begin{align*} & \min_{\mathbf{q}^{1,\ldots{},\Delta}} \sum_t \sum_k \sum_i w_k c_k^t (\mathbf{q}^t,\mathbf{q}^{t-1},\mathbf{x}_i^t)\\ & \mathrm{s.t.}\ \ \mathcal{X}^t=\phi(\mathcal{X}^{t-1},\mathbf{q}^t),\ \mathcal{X}^1\ \mathrm{is\ fixed} \end{align*}$

where $\phi$ is the physical dynamics. They solve this optimization by picking an initial trajectory and rolling out physics for that trajectory. Then, they linearize the dynamics around that trajectory (which they can do by taking the derivative of the dynamics using a smooth contact model), and optimize that approximation.

The cost terms are:

$c_\mathrm{static}$ – similar to $c_\mathrm{static\underline{\ }hull}$, except that it checks collision at each time step rather than using the swept-out convex hull between timesteps
$c_\mathrm{acc}$ – similar to $c_\mathrm{vel}$, but ensures smooth acceleration rather than velocity
$c_\mathrm{force}$ – penalizes contact forces to prevent kinematically impossible trajectories
$c_\mathrm{motion}$ – penalizes static objects falling off the table
$c_\mathrm{upright}$ – penalizes static objects tipping over
$c_\mathrm{grasp}$ – encourages the gripper to be close to the target object
$c_\mathrm{grp\underline{\ }horiz}$ – enforces that the gripper must be horizontal
$c_\mathrm{grp\underline{\ }open}$ – enforces that the gripper must be open on the last timestep

(there are a few other cost terms that are used in pre- and post-processing steps, but I am not going to go into the details of those here).

Methods

Kitaev et al. evaluate their approach on “shelf” and “refridgerator” scenes with varying numbers of objects, and they define success as:

Success – the robot grasps the target object without knocking over any other objects
Partial success – the robot grasps the target object and some of the other objects are tipped over (but not knocked off the shelf)
Failure – any other behavior, including failure to grasp the object, knocking objects off the shelf, and kinematic failures (e.g. attempting to move through a static object).

Takeaways

The problem that Kitaev et al. are solving here is that sometimes when reaching for an object, there will be other objects in the way. Traditionally, trajectory optimization has just dealt with situations where this isn’t an issue by avoiding obstacles entirely, but this isn’t always an option. Instead, some situations call for actually reasoning about the dynamics of objects and executing a trajectory that does collide with those objects, but that does so in an intelligent way such that they are less likely to fall over or cause other objects to fall over.

The approach that they take in this paper is to include a term for linearized approximate dynamics in the objective function. They make the simplifying assumption that the dynamics are smooth (e.g. rather than having a step function saying “in contact” and “not in contact”, there is some smooth function that varies between the two).

It’s not clear to me how well this approach would work in practice, though. Their experiments show improvement over the baseline, but the baseline seems to be pretty simple and doesn’t include cost terms that might be helpful to it (e.g. by allowing the gripper to rotate and remain closed until the end). For the most crowded cases, these things probably wouldn’t help the baseline much, but for intermediate cases a better baseline would probably be one that allows the gripper to move around rather than enforcing a straight line. Additionally, would the smooth contact model work well in scenarios in which objects need to remain stacked, for example? Especially in the case of the fridge scenario, there are often objects stacked on top of each other, and it would be important to represent this in the simulation. I’m not sure if their smooth contact model would be stable for stacked objects, though.

That said, it is really cool that they seem to find realistic behavior emerging from the physics-based approach – for example, the “snaking” behavior of the gripper in very crowded scenes to push objects to the side, rather than forward. And, it is interesting to see this as yet another type of approach that can be used to incorporate physical knowledge into reasoning about the world.