Zheng, B., Zhao, Y., Yu, J. C., Ikeuchi, K., & Zhu, S.-C. (2014). Detecting Potential Falling Objects by Inferring Human Action and Natural Disturbance. Proceedings Of the IEEE International Conference on Robotics and Automation. doi:10.1109/ICRA.2014.6907351

Summary

In this paper, Zheng et al. outline a system for inferring how likely objects are to fall, based on various disturbance models (e.g. human motion, earthquake, wind) and work/energy calculations. They compare their model’s predictions to human judgments and find that they correspond fairly well (at least based on rank orderings).

Methods

n/a

Algorithm

First, they define the “energy barrier” to be the work required to move an object from it’s current resting position $x_0$ (in an energy minima) to an unstable equilibrium $\tilde{x}$. Then, the “falling risk” of an object is defined to be the amount of change in energy between the unstable equilibrium to a smaller local minima $x_{0^\prime}$.

Second, they define a “disturbance field” to be a probability distribution over forces in the scene:

  • Earthquake – horizontal forces distributed uniformly at random
  • Wind – forces in a particular direction distributed uniformly at random
  • Human action – they calculate this by convolving two different distributions. First, they calculate the areas that humans are most likely to be in, based on computing shortest paths between random pairs of points in the room and then computing how likely it is for someone to cross a particular point across all shortest paths. Second, they calculate the regions of space that a human’s arms or legs are most likely to move through, by using a 3D motion capture (they don’t say, though, what motions they used in the motion capture – just walking?). By convolving these two fields, they end up with a probability distribution for how likely it is for motion to occur at particular places in space.

They then calculate the expected falling risk as:

where $p(W,\mathbf{x}_0)$ is given by the disturbance field and $R(a, \mathbf{x}_0, W)$ is the maximum energy that the object can release when it is moved out of its energy barrier by work $W$. This is related to, for example, how high up the object is – there will be a higher risk if the object is high up because it can fall further.

Takeaways

It is a really interesting idea to estimate how likely it is for things to fall based on computing a possible disturbance field. I wonder, though, how similar this is to how people estimate things. I definitely think we probably need something more efficient and robust than running simulations for every single object in the room (a la what we did with the tower simulations), but I also don’t believe that people would estimate how likely a place is to be walked in. Perhaps an approach that would be more similar to human behavior would be to include this type of calculate into the loss function. That is, for a given path, you estimate where movement is likely to occur just for that path and then you can estimate the expected risk for objects near the path. Then you don’t need to consider the whole scene at once. This would essentially just take into account the local human disturbance field rather than the global one.