Visual representations are often important for problem solving (Schrager, 1990; Farah, 1988; Casakin & Goldschmidt, 1999; Monaghan & Clement, 1999). For example, problem solving can be facilitated by animations (Pedone et al., 2001) and by visually evocative phrases in stimuli (Gick & Holyoak, 1980; Beveridge & Parkins, 1987). There is also anecdotal and documentary evidence for visual thinking in science (Miller, 1984; Nersessian, 1984; 1992; Gooding, 1994; Shepard, 1988; Thagard & Hardy, 1992; Boden, 1990). But the details of how people use visual resources in problem solving are poorly understood.
Some visual reasoning involves analogical problem solving, which is gaining knowledge about some target analog by transferring it from a base or source analog. By visual analogy I mean analogical reasoning with visual knowledge. This work will focus on visual properties such as spatial relationships between objects, shapes and sizes, rather than on textures and colors.
This dissertation will deal with two main problems. The first is that we do not know the conditions under which visual analogy is useful. The second is that we do not know how to represent visual information, or exactly what visual information to represent, for computational purposes. What kinds of visual symbols are useful for analogical problem solving? Which symbols should be used?
A sub-problem of analogical problem solving is the symbolic mismatch problem. A symbolic mismatch occurs when processing is hindered because the symbols representing two things are not the same.
Let us examine Duncker's classic fortress/tumor problem as an example of a situation in which symbolic mismatches could cause an analogical problem solving agent to fail (Duncker, 1926). Imagine the agent knows of a solved problem which involves breaking up an army into smaller groups. The army is, quite reasonably, represented as a group of constituent soldiers. The target problem involves a ray of radiation which must be turned into a number of rays with less intensity (see Figure 2). The ray might be represented as energy, with a number associated with its intensity, a representation chosen to serve a different function (e.g. so that numeric intensities can be added). If the need to align the ray and the army was not anticipated, the two could have been encoded with incompatible representations.
Symbolic mismatches can be encountered during analogical retrieval, mapping, or transfer. In this example we have two symbolic mismatches. First, without some similarity (perhaps through their relational structure), the ray and the army, being different symbols, cannot be aligned. They are, semantically, rather distant. But even if this alignment problem is somehow overcome, the agent would still have a problem with transferring the solution strategy. The transformation applied to the army will not work on the ray because the representation of the ray, in this example, does not have constituent parts. Breaking something into parts is different from dispersing energy.
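The two mismatches can be made concrete with a minimal Python sketch. Everything in it is a hypothetical illustration rather than part of the proposed model: the classes `Group` and `Quantity` stand in for the two incompatible encodings, `can_align` is a deliberately naive mapping test that only compares symbol types, and `decompose` stands in for the solution strategy learned from the army problem.

```python
from dataclasses import dataclass


@dataclass
class Group:
    """Hypothetical encoding: an entity made of constituent parts."""
    name: str
    parts: list


@dataclass
class Quantity:
    """Hypothetical encoding: an entity with a single numeric magnitude."""
    name: str
    intensity: float


def can_align(a, b):
    """Naive mapping test (an assumption): align only identical symbol types."""
    return type(a) is type(b)


def decompose(entity, n):
    """Source-problem strategy: split a Group into n smaller Groups."""
    if not isinstance(entity, Group):
        raise TypeError(
            f"cannot decompose {type(entity).__name__}: no constituent parts"
        )
    return [
        Group(f"{entity.name}-{i}", entity.parts[i::n])
        for i in range(n)
    ]


def try_transfer(entity, n):
    """Attempt to apply the source strategy; return None on a mismatch."""
    try:
        return decompose(entity, n)
    except TypeError:
        return None


army = Group("army", [f"soldier{i}" for i in range(12)])
ray = Quantity("ray", 100.0)

mapping_ok = can_align(army, ray)      # first mismatch: different symbols
army_result = try_transfer(army, 3)    # works: the army has parts
ray_result = try_transfer(ray, 3)      # second mismatch: the ray has no parts
```

In this sketch the mapping fails because the two entities are encoded with different symbols, and the transfer fails independently because the operation is defined over parts while the ray carries only an intensity. Re-encoding the ray as a `Group` of weaker rays would remove both failures, which is the sense in which the choice of representation, rather than the problem itself, causes the mismatch.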
The point of this example is to show that a reasonable non-visual representation can fail for analogical problem solving. It is possible to represent this problem with no symbolic mismatches, as Holyoak & Thagard (1989) do, but symbolic mismatches are bound to occur in any large knowledge base (Lenat & Guha, 1990).
I propose to develop a theory of visual analogy that addresses these problems, and a representation-level model to show how it could work. In the next two sections I will describe the work done so far in this regard, and lay out a plan for its completion. In my theory evaluation section I will describe the planned computer implementation, along with other planned evaluations.