Kemp, C., & Tenenbaum, J. B. (2008). The discovery of structural form. Proceedings Of the National Academy of Sciences of the United States of America, 105(31), 10687–92. doi:10.1073/pnas.0802631105

Summary

Many types of data have structured underlying representations. But, these structures can be widely different (e.g., a hierarchical tree vs. a spectrum vs. clusters). Kemp & Tenenbaum show how many of these types of representations can be captured using a graph grammar, and both the structure and form of the representation can be jointly inferred from the given data.

Methods

n/a

Algorithm

If $F$ is the form (e.g., chain) and $S$ is the particular structure of that form and $D$ is the data, then:

$P(S,F\vert D)\propto P(D\vert S)P(S\vert F)P(F)$

They set $P(F)$ to be uniform, where $P(S\vert F)\propto\theta^k$ (where $k$ is the number of clusters).

If the data is given by a binary feature matrix, then $P(D\vert S)$ is computed by assuming that features of the data are independently generated frm a multivariate Gaussian distribution with a dimension for each node in the graph $S$. The covariance of the distribution is defined such that connected nodes should have more similar features.

If the data is relational, then $P(D\vert S)$ is computed such that it is high if the large entries in $D$ correspond to edges in the graph.

There are several different simple forms that they define as well. Each node in the graph is a cluster of data points:

Partition
Chain
Order
Ring
Hierarchy
Tree
Grid (Chain x Chain)
Cylinder (Chain x Ring)

Takeaways

The structure of the world is usually not flat, but may have a more complex structure (clusters, hierarchies, etc.). We can specify these forms a priori, but it is better if we can determine them from the data itself. By using a graph grammar, we can capture a wide range of these structures, including things that are usually considered flat (e.g. grid) all the way to very-not-flat things, like tree structures.