Mozer, M. C., Pashler, H., & Homaei, H. (2008). Optimal predictions in everyday cognition: the wisdom of individuals or crowds? Cognitive Science, 32(7), 1133–1147. Retrieved from http://onlinelibrary.wiley.com/doi/10.1080/03640210802353016/abstract

Summary

In this paper, Mozer et al. question whether the conclusion of Griffiths & Tenenbaum (2006), that people have accurate knowledge of full prior distributions, is justified. In particular, they argue that the same results can be obtained with a model that uses only a few ($k$) samples from the prior. They show that a simple heuristic model (termed Min$k$) actually outperforms the full Bayesian model in fitting the data, and matches the inter-participant variance better as well.

Mozer et al. suggest that their Min$k$ model is perhaps better viewed as an algorithmic-level theory, while the fully Bayesian model is a computational-level theory. However, they caution that a purely computational-level focus may miss important factors such as variability and processing time.

Methods

n/a

Algorithm

Mozer et al. consider a few different models. First, the model from Griffiths & Tenenbaum (the G&T-Bayesian model) predicts the median of the posterior over $t_\mathrm{total}$:
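$$
p(t_\mathrm{total} \mid t_\mathrm{cur}) \propto \frac{p(t_\mathrm{total})}{t_\mathrm{total}} \quad \text{for } t_\mathrm{total} \ge t_\mathrm{cur}, \qquad t^* \;\text{such that}\; P(t_\mathrm{total} \ge t^* \mid t_\mathrm{cur}) = \frac{1}{2},
$$

where $p(t_\mathrm{total})$ is the prior and $1/t_\mathrm{total}$ is G&T’s uniform-sampling likelihood $p(t_\mathrm{cur} \mid t_\mathrm{total})$ (notation adapted from G&T, 2006, to this summary’s $t_\mathrm{cur}$/$t_\mathrm{total}$).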

The Min$k$ model retrieves $k$ samples from memory and predicts the minimum sample that is greater than $t_\mathrm{cur}$. If no samples are greater than $t_\mathrm{cur}$, then the model guesses $(1+g)t_\mathrm{cur}$, where $g$ is a free parameter specifying the proportion by which to inflate the query value $t_\mathrm{cur}$.
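As a concrete illustration, here is a minimal Python sketch of the Min$k$ rule (only the decision rule follows the paper’s description; the prior and parameter values are hypothetical):

```python
import random

def min_k_prediction(samples, t_cur, g):
    """Min-k rule: predict the smallest retrieved sample exceeding the
    current duration t_cur; if none does, fall back on the proportional
    guess (1 + g) * t_cur."""
    consistent = [s for s in samples if s > t_cur]
    if consistent:
        return min(consistent)
    return (1 + g) * t_cur

# Example with k = 2 samples from a hypothetical lifespan prior.
prior_samples = [random.gauss(75, 15) for _ in range(2)]
print(min_k_prediction(prior_samples, t_cur=40, g=0.5))
```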

The GT$k$Guess model is similar to Min$k$, except that instead of returning the minimum, it returns the median of the posterior computed with an empirical prior estimated via $k$ samples:
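Writing the retrieved samples as $s_1, \dots, s_k$, the empirical prior is a mixture of point masses (this notation is mine, not the paper’s):

$$
\hat{p}(t_\mathrm{total}) = \frac{1}{k} \sum_{i=1}^{k} \delta(t_\mathrm{total} - s_i),
$$

and the model predicts the median of the posterior obtained by plugging $\hat{p}$ into the G&T computation above.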

If none of the $k$ samples is greater than $t_\mathrm{cur}$ (so the posterior is undefined), then the GT$k$Guess model makes the same guess with $g$ as the Min$k$ model.

The GT$k$Smooth model doesn’t guess, but instead uses something like a kernel density estimate:
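A natural way to write such a smoothed prior, assuming Gaussian kernels with bandwidth $h$ (the paper’s exact kernel may differ), is

$$
\hat{p}(t_\mathrm{total}) = \frac{1}{k} \sum_{i=1}^{k} \mathcal{N}(t_\mathrm{total};\, s_i,\, h^2),
$$

which places probability mass beyond the retrieved samples, so the posterior median is always defined and no guess fallback is needed.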

Takeaways

I think this is actually a really important point: average participant judgments can’t necessarily be used to say something about the knowledge of individuals. I do think, though, that starting at the computational level and working down can still be revealing. In this case, the computational-level analysis reveals that, at least in aggregate, people’s judgments reflect true statistical regularities in the world. It could have turned out otherwise, and a computational-level analysis allows us to look at the high-level picture first. Given that, we can then ask questions like the one that Mozer et al. ask here: why do people’s judgments reflect these regularities? Is it that each participant has accurate knowledge of the prior, or that knowledge across people is distributed according to the prior? Mozer et al. show that it is the latter.

I would be interested to see how well an analysis based on mechanisms for sampling would do. For example, would a model that uses rejection sampling also give a consistent algorithmic-level account? I would hypothesize that it would give the same results. You could also measure response times to see whether people take longer to respond in cases where samples from the prior are unlikely to be greater than $t_\mathrm{cur}$, i.e., where they have to make more rejections and so take longer to respond.
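For concreteness, here is a rough sketch of that hypothesized mechanism (the prior, the number of required samples, and the use of draw count as a response-time proxy are all my assumptions, not claims from the paper):

```python
import random

def rejection_sample_prediction(prior_draw, t_cur, n_samples=2, max_tries=1000):
    """Draw from the prior, rejecting samples inconsistent with the query
    (a total duration must exceed the observed t_cur); the number of draws
    needed serves as a crude response-time proxy."""
    accepted, tries = [], 0
    while len(accepted) < n_samples and tries < max_tries:
        sample = prior_draw()
        tries += 1
        if sample > t_cur:
            accepted.append(sample)
    prediction = min(accepted) if accepted else (1 + 0.5) * t_cur  # guess fallback
    return prediction, tries

# Hypothetical lifespan prior; a larger t_cur should force more rejections,
# predicting a longer response time.
pred, rt = rejection_sample_prediction(lambda: random.gauss(75, 15), t_cur=90)
print(pred, rt)
```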