The role of visual analogy in scientific discovery
By Jim R. Davies, Nancy J. Nersessian, & Ashok K. Goel
College of Computing
Georgia Institute of Technology
Atlanta, Georgia, 30332-0280
{jimmyd, nancyn, goel}@cc.gatech.edu
http://www.jimdavies.org/research/visual-analogy/
Computational model-based analogy research has used structural and causal knowledge (e.g. Holyoak & Thagard 1997; Falkenhainer et al. 1990; Griffith et al. 1996), but not knowledge that is specifically visual in nature. Causal and structural knowledge describes a system's components and only those connections that are relevant for predicting its behavior. In contrast, visual knowledge can contain information about where things are on the image plane, their sizes and orientations, and so on. Several historical episodes of scientific analogy appear to have involved a visual component. What we call "visual analogy" involves the analogical retrieval, mapping, and transfer of visual knowledge.
For example, Nersessian (1984, 1992, in press a) has argued that Maxwell used a visual analogy between spinning vortices and rotating gears in constructing a mechanical model of electromagnetic induction in the aether. Maxwell's model had adjacent vortices of aether, all swirling in the same direction, and Maxwell noted that it is impossible to imagine how the vortices could continue in motion in such a system. According to Nersessian's cognitive-historical analysis, it is plausible that Maxwell noticed, through a perceptual process, that the vortices would stop because of friction. He then abstracted a generic model of the vortices as "spinning wheels" and was able to solve the problem by making an analogy to his knowledge of gear systems. In the talk we will describe more examples of visual thinking in science.
Visual knowledge is useful for analogical mapping when causal knowledge is incomplete and fails to support matches between domains. Even if two systems, like the sun and an orange, turn out to be quite different in their causal structure, a match might still be made on the basis of visual similarity, and that match could prove salient for a problem. The purpose of this paper is to investigate the roles of visual representations in analogy, with emphasis on scientific model creation. We will use a simplified version of the Rutherford-Bohr analogy between an atom and the solar system as a running example. Though its historical role in the scientific case is debated, it provides a simple example that helps illustrate our arguments.
Most computational visual analogy systems represent images as symbolic networks. We will use the term "simage" to mean such a symbolic network image. Some examples might be
ABOVE(circle, box)
to represent a circle being above a box, or
[sun: location(top) color(yellow)]
to represent the sun being yellow and at the top of an image. Simages are sets of symbols, representing visual elements and properties, connected with labeled, directed links. For example, a simage solar system representation might look like this:
[sun looks-like circle]
[sun has-color yellow]
[sun has-size small]
[sun has-location center]
[planet looks-like circle]
[planet has-size small]
[planet has-location revolution-path]
[revolution-path looks-like ellipse]
[revolution-path has-location center]
[revolution-path has-size large]
Figure 1 shows a functionally identical graphical simage representation.
Figure 1. A simage representation of a simple solar system image.
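To make this concrete, here is a minimal sketch, in Python, of how such a simage might be stored; the triple format and the solar-system symbol names are our illustrative choices, not a claim about how any particular system is implemented:

# A simage as a set of labeled, directed links:
# (from-symbol, link-label, to-symbol). Symbols follow the example above.
solar_system_simage = {
    ("sun", "looks-like", "circle"),
    ("sun", "has-color", "yellow"),
    ("sun", "has-size", "small"),
    ("sun", "has-location", "center"),
    ("planet", "looks-like", "circle"),
    ("planet", "has-size", "small"),
    ("planet", "has-location", "revolution-path"),
    ("revolution-path", "looks-like", "ellipse"),
    ("revolution-path", "has-location", "center"),
    ("revolution-path", "has-size", "large"),
}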
Another kind of visual representation is a bitmap image, which represents points of light in specific locations. We will discuss this kind of representation later.
All computational accounts of visual analogy use symbols in the representations over which the analogy is made. MAGI (Ferguson 1994) uses the Structure Mapping Engine (SME) (Falkenhainer et al. 1990) to make analogies with visual information. SME uses a symbolic network representation, so MAGI represents its visual knowledge as simages. LetterSpirit (McGraw & Hofstadter 1993), which makes analogical transfers of font style, begins with a bitmap representation of sorts, but first uses a perceptual process to determine which letter is being viewed and what the letter's different pieces are. The mapping is, again, at a higher level than the bitmap. The VAMP system used a hybrid bitmap-symbolic representation (Thagard et al. 1992). Thagard has since moved to a purely simage representation with the DIVA system (Croft & Thagard 2000).
These computational analogy systems find mappings based on the similarity of symbols (where the symbols represent components of systems or the links between them). In the domains these systems work with, the same symbol names are used across domains, so finding correspondences between components is possible. For example, if a planet "revolves around" the sun and an electron "revolves around" the nucleus, then many of these systems could make an analogical mapping between the two because both domains use the identical symbol for the relation. But what recourse does an agent have when there is a symbol mismatch? For example, imagine a person who, for whatever reason, encoded the electron as "orbiting" the nucleus and the earth as "revolving around" the sun. The analogy cannot be made in this case because the symbols ("orbiting" and "revolving around") are not the same. This is a common problem (see Yarlett & Ramscar 2000 for an attempt to solve it using Latent Semantic Analysis and SME).
Though at this level the systems may differ, abstractions of these systems may have symbolic correspondences. Nersessian (in press a, in press b; Griffith et al. 1996) suggested that generic abstractions could mediate between systems that are very dissimilar on the surface. Bhatta and Goel (1993) showed that analogy with a generic abstract model is useful in computational design. We suggest that visual representations provide one kind of generic abstraction. Though the symbols "orbiting" and "revolving around" are different, the way they are visualized may not be. For example, the paths of orbiting electrons and revolving planets can both be visualized as elliptical. In the simage, both paths may be represented using the symbol for an ellipse, showing the system that the two concepts are indeed similar. The agent can map the paths because the symbol "ellipse" is used for both visualizations.
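To illustrate, the following Python sketch (with hypothetical symbol names, and a toy matcher rather than a real mapping engine such as SME) shows how a mapping attempt that fails on mismatched causal symbols can still succeed on a shared visual symbol:

def map_by_shared_symbols(source, target):
    # Propose correspondences wherever two simage links share the same
    # link label and value symbol. A toy matcher: real mapping engines
    # also enforce structural consistency across correspondences.
    correspondences = set()
    for s_node, s_link, s_val in source:
        for t_node, t_link, t_val in target:
            if s_link == t_link and s_val == t_val:
                correspondences.add((s_node, t_node))
    return correspondences

# Hypothetical encodings with a symbol mismatch at the causal level:
solar = {("planet", "revolves-around", "sun"),
         ("planet-path", "looks-like", "ellipse")}
atom = {("electron", "orbiting", "nucleus"),
        ("electron-path", "looks-like", "ellipse")}

print(map_by_shared_symbols(solar, atom))
# "revolves-around" and "orbiting" fail to match, but the shared visual
# symbol "ellipse" yields {("planet-path", "electron-path")}.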
In summary, simages can be useful in analogy when dealing with a symbol mismatch. Traditional computational accounts of analogy have gone as far as they have because the examples used, which are usually hand-coded, are designed not to have symbol mismatches.
Visual knowledge can be expressed in several different representations. So far we have only discussed the simage representation. In contrast, "bitmap" images are sub-symbolic. Such representations describe nothing but points of light at specific locations, like dots on a grid. A bitmap representation of a square might be a set of coordinates:
(0,1) (0,2) (0,3) (1,1) (1,3) (2,1) (2,2) (2,3)
As you can see, such images are uninterpreted (Kellman & Arterberry 1998, Kosslyn 1994, Pylyshyn 1978). To know there's a square there, you need to apply some perceptual process to it. If bitmap representations actually contained explicit representations of the objects in them, computer vision would be no problem at all.
If one applies a perceptual process to a bitmap image and detects some object, like a line or a box, then the perceptual process has used some kind of knowledge, either implicit or explicit, to extract a symbol out of it. In this paper we include in the notion of ‘symbol’ any identification or detection of a visual object in a bitmap. This is consistent with Barsalou's perceptual symbol system theory (1999).
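As an illustration, here is a small Python sketch of such a perceptual process, one of our own devising rather than any of the cited systems: it extracts the symbol "square" from the coordinate set given earlier by checking that the points form the hollow border of a square region.

def detect_square(points):
    # A toy perceptual process: the bitmap contains only coordinates,
    # so the "square" symbol must be extracted by inspection.
    xs = [x for x, y in points]
    ys = [y for x, y in points]
    x0, x1, y0, y1 = min(xs), max(xs), min(ys), max(ys)
    if x1 - x0 != y1 - y0:          # the bounding box must be square
        return None
    border = {(x, y)
              for x in range(x0, x1 + 1)
              for y in range(y0, y1 + 1)
              if x in (x0, x1) or y in (y0, y1)}
    return "square" if set(points) == border else None

bitmap = {(0, 1), (0, 2), (0, 3), (1, 1), (1, 3), (2, 1), (2, 2), (2, 3)}
print(detect_square(bitmap))  # prints "square"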
Making an analogical mapping between two visual representations means finding a set of correspondences between the objects in the images. Since there are not, strictly speaking, explicitly represented objects in bitmap images, the correspondences must be made between symbols extracted from them: the mapping occurs at the symbolic level. For example, in the atom/solar system analogy, maps are made between the electron and the planet, or between the sun and the nucleus. From here analogical transfer can occur.
Contrast this with a hypothetical mapping at the bitmap level, which would consist of correspondences between points of light. Points of light are meaningless for mapping and transfer, so analogical mapping needs to occur at the simage level. This is why researchers have avoided bitmap image representations. Why create a bitmap if you just have to turn it back into symbols before you can do anything with it?
Even though computational analogy researchers have not found a function for bitmaps in the human mind, there is good evidence that people use bitmap as well as symbolic representations of images. Visual mental imagery seems to play a part in visual analogy. According to Kosslyn (1994), mental imagery is a projection into the visual cortex of an image in roughly the same spatial layout as the original retinal stimulation; that is, it involves taking a symbolic representation and generating a bitmap representation from it. This poses some questions: Why do people use mental imagery in visual analogy? What is the use of a bitmap image? To answer these questions we present a hypothesis regarding the usefulness of bitmap imagery in visual analogy: people visualize so they can re-perceive and generate new symbolic structures that may be used to make better analogical mappings.
Kosslyn (1994) has championed the idea that re-perception is the purpose of mental imagery. Here is a common example: if we ask whether your home has a door, you can probably answer without picturing the door in your mind. But if we ask how many windows your house has, you would probably have to visualize walking through the house and count the windows from the mental imagery. The explanation is that the door is explicitly associated with your concept of your house, but the number of windows is not. So you need to generate the mental image and count, just as you might with normal perception. After that, the number of windows is associated with your concept of your house, and you might not need to count if asked again.
Imagine our subject once again, representing the electron as "orbiting" the nucleus and the planet as "revolving around" the sun, making analogical mapping difficult. Further, let us imagine that the simages of these two systems differ as well: perhaps the path of the electron's orbit is represented as a "circle" while the path of the planet's orbit is represented as an "ellipse." Here again we have a symbol mismatch, this time at the simage level. When an attempt at mapping at these first two levels fails, mental imagery can be used to see the similarity. When a circle and an ellipse are rendered into a bitmap representation, the similarity is there, but only implicitly. For the similarity to be recognized, perceptual processes must be applied to the bitmap. For example, our subject could see the circle as an ellipse.
Suppose our subject is trying to make an analogical mapping between the orbit and the revolution. There are no matching symbols, so she uses mental imagery to visualize the circle, which generates a bitmap representation. She is trying to see an ellipse in the bitmap image, which provides top-down guidance to perception. Since a circle is a kind of ellipse, when she imagines the circle her perceptual system recognizes it as an ellipse. Now the ellipse symbol is associated with the orbit in a newly generated simage. There is a better symbol match between the two systems, and an analogical mapping can be made between the simages.
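The following Python sketch illustrates this re-perception step under our assumptions; the rendering and detection routines are hypothetical stand-ins for whatever imagery-generation and perceptual machinery an agent actually has. A "circle" symbol is visualized as a bitmap of points, and a top-down attempt to see an ellipse succeeds because a circle satisfies the ellipse equation with equal radii.

import math

def render_circle(cx, cy, r, n=64):
    # Visualize: generate a bitmap (a list of points) from the
    # symbol "circle". A stand-in for mental-imagery generation.
    return [(cx + r * math.cos(2 * math.pi * k / n),
             cy + r * math.sin(2 * math.pi * k / n)) for k in range(n)]

def detect_ellipse(points, tol=0.05):
    # Top-down perception: try to see an ellipse in the bitmap by
    # estimating a center and radii, then checking the ellipse
    # equation (x-cx)^2/a^2 + (y-cy)^2/b^2 = 1 within a tolerance.
    cx = sum(x for x, y in points) / len(points)
    cy = sum(y for x, y in points) / len(points)
    a = max(abs(x - cx) for x, y in points)
    b = max(abs(y - cy) for x, y in points)
    ok = all(abs((x - cx) ** 2 / a ** 2 + (y - cy) ** 2 / b ** 2 - 1) < tol
             for x, y in points)
    return "ellipse" if ok else None

bitmap = render_circle(0.0, 0.0, 5.0)
print(detect_ellipse(bitmap))  # prints "ellipse": the circle is re-seen as one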
We are implementing a computational model (called Proteus) that can make visual analogies and use mental imagery for re-representation, as described above. It can create mappings between simages and transfer problem solutions from a source to a target. In the case of a simage symbol mismatch, Proteus will visualize, generating bitmap images from the simages in memory. To reinterpret them as different symbols, Proteus will also be able to make perceptual identifications from those bitmaps, generating new symbolic representations. It then re-attempts a mapping based on the new simages.
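Because Proteus is still under construction, the following is only a hypothetical Python sketch of the control loop just described, reusing the toy matcher from the earlier sketch: attempt a simage mapping, and on failure visualize, re-perceive, and retry.

def proteus_map(source_simage, target_simage, visualize, perceive):
    # Hypothetical control loop. `visualize` renders a simage into a
    # bitmap; `perceive` extracts a new simage from a bitmap (e.g.
    # re-seeing a circle as an ellipse); map_by_shared_symbols is the
    # toy matcher defined earlier.
    mapping = map_by_shared_symbols(source_simage, target_simage)
    if mapping:
        return mapping
    # Symbol mismatch: generate bitmaps, re-perceive them to obtain
    # new simages, and re-attempt the mapping.
    new_source = perceive(visualize(source_simage))
    new_target = perceive(visualize(target_simage))
    return map_by_shared_symbols(new_source, new_target)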
In summary, we hypothesize that one use of visual representations is for mapping when the causal description fails to support an appropriate analogy, perhaps because of a symbol mismatch. Scientists attempt to analogize with simages; failing that, the simages are used to generate a bitmap representation with visual mental imagery. Bitmap representations are re-examined with the perceptual system to generate new symbolic structures that may be used to make better analogical mappings. In this work simages are stored, interpreted representations of images, while bitmaps are temporary, unstored structures generated when needed for re-perception. In the talk we will describe Proteus and present more detail on the importance of visual representations for scientific model creation, as well as psychological evidence for the use of mental imagery for purposes of re-perception.
References:
Barsalou, L.W. (1999). Perceptual symbol systems. Behavioral and Brain Sciences, 22, 577-609.
Bhatta, S. R. & Goel, A. K. (1993). Learning generic mechanisms from experiences for analogical reasoning. In Proceedings of the Fifteenth Annual Conference of the Cognitive Science Society, Boulder, CO. Hillsdale, NJ: Lawrence Erlbaum. pp. 237-242.
Croft, D. & Thagard, P. (2000). Dynamic imagery: A computational model of motion and visual analogy. Unpublished manuscript.
Falkenhainer, B., Forbus, K. D., & Gentner, D. (1990). The structure-mapping engine: Algorithm and examples. Artificial Intelligence, 41, 1-63.
Ferguson, R. W. (1994). MAGI: Analogy-based encoding using regularity and symmetry. In Proceedings of the Sixteenth Annual Conference of the Cognitive Science Society, edited by A. Ram & K. Eiselt. Hillsdale, NJ: Lawrence Erlbaum Associates. pp. 283-288.
Griffith, T. W., Nersessian, N. J. & Goel, A. K. (1996). The role of generic models in conceptual change. In Proceedings of the Eighteenth Annual Conference of the Cognitive Science Society, Mahwah, NJ: Lawrence Erlbaum.
Holyoak, K. J. & Thagard, P. (1997). The analogical mind. American Psychologist, 52(1), 35-44.
Kellman, P. J. & Arterberry, M. E. (1998). Chapter 5: Object perception. In The Cradle of Knowledge: Development of Perception in Infancy, edited by P. J. Kellman & M. E. Arterberry. Cambridge, MA: MIT Press.
Kosslyn, S. M. (1994). Image and Brain: The Resolution of the Imagery Debate. Cambridge, MA: MIT Press.
McGraw, G. & Hofstadter, D. R. (1993) Perception and Creation of Alphabetic Style. In Artificial Intelligence and Creativity: Papers from the 1993 Spring Symposium, AAAI Technical Report SS-93-01, AAAI Press.
Nersessian, N. J. (in press a) Abstraction via generic modeling in concept formation in science. In Correcting the Model: Idealization and Abstraction in Science. M. R. Jones and N. Cartwright (eds.). Amsterdam: Editions Rodopi.
Nersessian, N. J. (in press b). Maxwell and "the method of physical analogy": Model-based reasoning, generic abstraction, and conceptual change. In The Incomparable Mr. Stein: Essays in the History and Philosophy of Science and Mathematics to Honor Howard Stein on his 70th Birthday, edited by D. Malament. La Salle, IL: Open Court.
Nersessian, N. J. (1984). Faraday to Einstein: Constructing Meaning in Scientific Theories. Dordrecht: Martinus Nijhoff/Kluwer Academic Publishers.
Nersessian, N. J. (1992) How do scientists think? Capturing the dynamics of conceptual change in science. In Minnesota Studies in the Philosophy of Science, edited by R. Giere. Minneapolis: University of Minnesota Press.
Pylyshyn, Z. W. (1978). Imagery and artificial intelligence. In Perception and Cognition. Issues in the Foundations of Psychology, Minnesota Studies in the Philosophy of Science, vol. 9, edited by C.W. Savage. Minneapolis: University of Minnesota Press. pp. 19-55
Thagard, P., Gochfeld, D., & Hardy, S. (1992). Visual analogical mapping. In Proceedings of the Fourteenth Annual Conference of the Cognitive Science Society. Hillsdale, NJ: Lawrence Erlbaum. pp. 522-527.
Yarlett, D. & Ramscar, M. (2000). Structure-mapping theory and lexico-semantic information. In L. R. Gleitman & A. K. Joshi (Eds.), Proceedings of the Twenty-Second Annual Conference of the Cognitive Science Society. Mahwah, NJ: Lawrence Erlbaum. pp. 571-576.