Khemlani, S. S., Mackiewicz, R., Bucciarelli, M., & Johnson-Laird, P. N. (2013). Kinematic mental simulations in abduction and deduction. Proceedings of the National Academy of Sciences of the United States of America, 110(42), 16766–16771. doi:10.1073/pnas.1316275110

Summary

In this paper, Khemlani et al. conduct a series of experiments in which they have people solve programming-like problems, come up with algorithmic solutions to those problems, and execute existing algorithms. Their experiments use a domain of train tracks holding a set of cars, where each problem is to rearrange the cars by sliding them to different sections of the track. They also propose a model of how people solve these kinds of tasks, based on Johnson-Laird's model theory extended to kinematic mental models. Importantly, they make three assumptions about the way in which people use mental models (I sketch a concrete version of the representation after this list):

  1. Mental models are iconic (i.e., they have the same form as the thing they represent)
  2. Kinematic mental models unfold in time (i.e., the sequence of operations they simulate is ordered in time)
  3. Mental models can be schematic (i.e., not necessarily a visual mental image)
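
To make the iconicity and kinematics assumptions concrete, here is a minimal sketch (mine, not the paper's code) of what such a representation could look like: the state mirrors the layout of the track, with a left track, a siding, and a right track each holding an ordered sequence of cars, and a simulation is just a sequence of moves applied in temporal order. The particular move set and geometry here are simplifying assumptions on my part.

    # Hypothetical iconic representation of the railway domain (not the paper's code).
    from dataclasses import dataclass, field
    from typing import List

    @dataclass
    class Track:
        left: List[str]                                   # cars left of the junction, leftmost first
        siding: List[str] = field(default_factory=list)   # cars on the siding, junction end first
        right: List[str] = field(default_factory=list)    # cars right of the junction, junction end first

        def left_to_siding(self, k: int) -> None:
            self.siding = self.left[-k:] + self.siding    # push the k cars nearest the junction
            self.left = self.left[:-k]

        def siding_to_left(self, k: int) -> None:
            self.left = self.left + self.siding[:k]       # pull k cars back off the siding
            self.siding = self.siding[k:]

        def left_to_right(self, k: int) -> None:
            self.right = self.left[-k:] + self.right      # slide k cars across the junction
            self.left = self.left[:-k]

    # A kinematic simulation is a timeline of such moves:
    t = Track(left=list("ABCD"))
    for move, k in [("left_to_siding", 3), ("left_to_right", 1), ("siding_to_left", 3)]:
        getattr(t, move)(k)
    print(t.left, t.siding, t.right)   # ['B', 'C', 'D'] [] ['A']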

The kinematic mental models theory makes several predictions about people’s performance on programming-like tasks:

  • Solutions that involve more steps, and steps that operate on more objects (operands), will take longer and be more error-prone. Thus a solution that takes 5 steps, each operating on a single car, should be faster than one that takes 7 steps, and also faster than one that takes 5 steps each operating on two cars.
  • People should find it easier to generate algorithms that use while loops than algorithms that use for loops, because “naive individuals use simulations to abduce algorithms”. (It’s not really clear to me how this follows; I try to illustrate one possible intuition in the sketch after this list.)
  • People should find it more difficult to abduce algorithms that have higher Kolmogorov complexity.
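
To convince myself about the while/for prediction, here is a hedged sketch of the same car-reversal routine written both ways, using plain lists for the three track sections from the sketch above. The intuition I take from it: the while version halts on a condition that can be read directly off the simulated state, whereas the for version has to know the number of cars (and hence the loop bound) in advance, which is presumably harder to recover from a simulation.

    # Two formulations of the same routine (my toy domain, not the paper's materials).
    def reverse_while(cars):
        """Reverse the cars; the halting test inspects the current simulated state."""
        left, siding, right = list(cars), [], []
        while len(left) > 1:                      # condition observable during simulation
            siding, left = left[1:], left[:1]     # move all but one car onto the siding
            right, left = left + right, []        # move the remaining car to the right track
            left, siding = siding, []             # bring the siding cars back
        return left + right                       # move the last car across

    def reverse_for(cars, n):
        """The same moves, but the number of iterations must be known up front."""
        left, siding, right = list(cars), [], []
        for i in range(n - 1, 0, -1):
            siding, left = left[-i:], left[:-i]   # move i cars onto the siding
            right, left = left + right, []
            left, siding = siding, []
        return left + right

    assert reverse_while("ABCD") == reverse_for("ABCD", 4) == list("DCBA")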

Methods

Experiment 1 (“problem solving”) had participants solve rearrangement problems; they could actually move the cars around the track using a computer interface. Khemlani et al. found that participants made more moves on problems for which the authors’ model required more moves, and that participants also made more moves as the number of operands increased. They found similar results for response times.

Experiment 2 (“abduction”) had participants come up with an algorithm to solve each problem (i.e., a description, in English, of how to solve it). For each type of problem, they had to solve it first for 8 cars and then for an unspecified number of cars. Participants were close to ceiling in coming up with algorithms for the 8-car problems, but varied on the problems with an unspecified number of cars (where, as predicted, solutions with higher Kolmogorov complexity had lower success rates). Participants also used while loops more frequently than for loops.

Experiment 3 (“deduction”) gave participants a description of a procedure along with the initial state of the cars, and asked them to determine what the end result would be after executing the procedure. The authors attempted to control for the amount of information in the descriptions by making sure that each description was the same length (in number of words). They again found that people’s success tracked the algorithm’s Kolmogorov complexity (higher complexity, lower accuracy).

Algorithm

Creation of an algorithm (abduction) involves three steps: first computing solutions to two specific problems, then recovering the loop that must be performed, and finally converting the structure of the solution into a verbal description. The specific solutions are found using a “partial means-ends analysis”, in which the problem is broken down into subgoals, where the first goal is to get the rightmost car onto the right track, and so on. To recover the loop, the model first finds repeated sequences of at least two moves in both of the solutions to the specific problems, and then (I sketch this step after the list below) either:

  • In the case of a for loop, solve a system of linear equations based on the solutions to the two specific problems.
  • In the case of a while loop, determine the condition that needs to be satisfied for the while loop to halt.
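
Here is a rough sketch of how I imagine the for-loop branch of this step working (my reconstruction, not mAbducer itself): find a block of at least two moves that repeats in each specific solution, assuming moves are first abstracted over their operands so that iterations match literally, then solve for the loop bound from the two (problem size, repetition count) pairs. The while-loop branch would instead look for a condition on the simulated state that fails exactly when the repetitions stop. All names below are mine.

    # Hypothetical loop-recovery helpers (not the paper's code).
    def repeated_block(moves):
        """Find a block of >= 2 moves that repeats back-to-back; largest coverage wins."""
        best = None
        for size in range(2, len(moves) // 2 + 1):
            for start in range(len(moves) - 2 * size + 1):
                block = moves[start:start + size]
                count, i = 1, start + size
                while moves[i:i + size] == block:
                    count, i = count + 1, i + size
                if count >= 2 and (best is None or count * size > best[1] * len(best[0])):
                    best = (block, count)
        return best

    def fit_loop_bound(n1, c1, n2, c2):
        """Solve repetitions = a*n + b from two (problem size, repetition count) pairs."""
        a = (c2 - c1) / (n2 - n1)
        return a, c1 - a * n1

    # E.g., an operand-abstracted 4-car solution whose 3-move body repeats 3 times;
    # if the 8-car solution repeats the same body 7 times, the fit gives
    # repetitions = n - 1, i.e. "for i = 1 to n - 1: <body>".
    moves_4cars = ["to_siding", "to_right", "from_siding"] * 3 + ["to_right"]
    print(repeated_block(moves_4cars))   # (['to_siding', 'to_right', 'from_siding'], 3)
    print(fit_loop_bound(4, 3, 8, 7))    # (1.0, -1.0)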

To estimate the Kolmogorov complexity of an algorithm, they simply take the number of characters in its LISP program and multiply by the number of bits per character.
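
In code, this estimate is trivial (the bits-per-character value below is a placeholder, not necessarily the figure they used):

    # Rough character-count estimate of a program's complexity.
    def complexity_bits(program_text, bits_per_char=7):   # 7 bits/char is an assumed placeholder
        return len(program_text) * bits_per_char

    print(complexity_bits("(loop while (> (length left) 1) ...)"))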

Takeaways

This is a really cool exploration of how people reason about more structured plans such as algorithms. It would be really interesting to translate people’s algorithms from Experiment 2 into actual code and see how well they correspond to the algorithms produced by mAbducer, rather than just comparing success rates. That is, are people actually coming up with the kinds of solutions predicted by the model theory?

I also wonder how well an approach based on some type of grammar would work (e.g., something like Kevin Ellis’s NIPS 2015 paper, “Unsupervised Learning by Program Synthesis”), and how mAbducer compares to typical approaches to program induction in general. Are there key differences that the model theory predicts but that general program-induction algorithms would not?

Khemlani et al. use a very specific notion of “simulation” in this paper, essentially the simulation of a computer program: the sequential application of known rules, beginning from a given initial state. They assume other kinds of computation too (e.g., the ones used to solve the specific problems in the first place), but once specific instances of the programs have been solved, simulations of those programs are used to abduce a general solution to the problem. Simulations of the general solution are then used to perform deduction on new problems.

One potential issue with the third experiment is that participants didn’t actually come up with the solutions themselves. I wonder if there is something important about the way people form their mental models: it might be that you can’t just give them a description of the program, and that they actually need to abduce it themselves. If that is the case, I wonder whether accuracy on the deduction problems would have been higher had participants abduced the algorithms themselves.