@Article{, author = {Janice Glasgow and Dimitri Papadias}, title = {Computational imagery}, year = {1998}, OPTpages = {157--205}, }
This paper proposes a theory of visual representation designed for expressive power, inferential adequacy, and efficiency. It was motivated by study results but is not, strictly speaking, a cognitive model: for example, the scheme's representations overcome attention limitations (p167). "internal image representations are informationally equivalent to representations involved in our scheme, that is, information in one representation is inferable from the other (Larkin and Simon 1987)." (p195)
Visual information: What things look like [p158]
Spatial information: Where things are in relation to one another
These representations are stored, descriptively, in long term memory, and the images can be generated from them as needed.
Three kinds of representations: deep, spatial, visual
Deep representations: hierarchically organized, descriptive information about the image.
Spatial representations: symbolic information that preserves spatial properties (working memory structure only).
The spatial and visual representations correspond to the "where" and "what" pathways, respectively. [p162]
Visual representations: represent space as an occupancy array, capturing shape and size (working memory structure only).
Array theory provides primitive functions for image representations. This has been implemented in a system called Nial.
SYSTEM: Nial
Paivio (1975) suggests that imagistic memory is different from, and works in parallel with, verbal memory. [p160]
Standing (1973) showed that visual memory can be superior in recall.
Pinker (1988) suggests that images can be object- or world-centered.[p173] They are represented and manipulated in 3d. (p178)
Reed (1974): Image descriptions are hierarchically organized.
Finke, Pinker, & Farah (1989): Images can be re-interpreted.
Finke & Slayton (1988): Images can be combined for creative discoveries.
Farah (1988b): People typically use mental imagery or spatial reasoning.[p162]
[p165] Certain questions can be answered without using mental imagery, because they are spatial in nature:
The working memory representations are consciously experienced.[p166]
In an occupancy array, cells denote objects filling space. "nested, rectangularly arranged collections of data objects."[p167] Nesting is a recursive object description system in this case.
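(summary author note: To make the nesting concrete, here is a minimal Python sketch. It is not Nial and not the authors' code. The names Cell and labels_in, the toy scene, and the use of a 2D array instead of the paper's 3D one are all assumptions made for illustration.)

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class Cell:
    """One cell of an occupancy array: a label plus an optional nested array."""
    label: Optional[str] = None                  # None means empty space
    parts: Optional[List[List["Cell"]]] = None   # finer-grained array for this cell

def labels_in(array):
    """Recursively collect every label in an array, including nested ones."""
    found = []
    for row in array:
        for cell in row:
            if cell.label is not None:
                found.append(cell.label)
            if cell.parts is not None:
                found.extend(labels_in(cell.parts))
    return found

# Hypothetical toy scene: the "house" cell nests a finer array of its parts.
house_parts = [[Cell("window"), Cell()],
               [Cell("door"), Cell()]]
scene = [[Cell(), Cell("tree")],
         [Cell("house", house_parts), Cell()]]

print(labels_in(scene))
```

Each cell is either empty, a labeled object, or a labeled object that carries its own array of parts, so the same description applies at every level of detail.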
Israel (1987): Believes the best way to make AI is to imitate the way the mind works.
In Nial, the cells appear to be filled with labels, not just grayscales. (summary author note: This is like a 3d version of VAMP.1 by Thagard). See Figure 7.2, p170.
Kosslyn (1980); Pinker (1984): There is a separate LTM image description.
[p171] In the Nial programming language, frames are connected with either AKO (a kind of) or PARTS (denoting part/whole relationships). [p172]
M. Levine (1978): relations like left-of, behind, above, etc.[p174]
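(summary author note: A rough Python sketch of how relations such as left-of and above could be read off a symbolic array by comparing indices. The function names, the row/column convention, and the coarse toy map are assumptions, not taken from the paper.)

```python
# Symbolic array: symbols are placed so that array indices preserve relative
# position. Assumed convention: rows run north to south, columns west to east.
europe = [
    [None,     "Germany"],
    ["France", "Switzerland"],
    ["Spain",  "Italy"],
]

def position(array, symbol):
    """Return the (row, column) of a symbol, or raise KeyError if absent."""
    for r, row in enumerate(array):
        for c, cell in enumerate(row):
            if cell == symbol:
                return r, c
    raise KeyError(symbol)

def left_of(array, a, b):
    """True if a sits in a column strictly to the left of b."""
    return position(array, a)[1] < position(array, b)[1]

def above(array, a, b):
    """True if a sits in a row strictly above b."""
    return position(array, a)[0] < position(array, b)[0]

print(left_of(europe, "France", "Italy"))
print(above(europe, "Germany", "Spain"))
```

The relations are never stored. They are inspected from the arrangement of symbols, which is the point of preserving spatial properties in the representation.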
This approach is unlike Kosslyn's in that it is 3d and viewer-independent.[p175]
The part relation and the recursive nature of the occupancy array allow for zooming in and out.[p177]
The authors claim that if you represent spatial information as propositions, and use logic to make inferences, it's computationally inefficient. [p181] (summary author's note: just because you're using propositions doesn't mean you've got to use logic.)
The symbolic array solution helps with the frame problem too. "In a propositional representation we would have to consider all of the effects that this would have on the current state. Using a symbolic array to store the map, we need only delete the country from its previous position and insert it in the new one... There still remains, however, the problem of dealing with truth maintenance if we desire to preserve relations as changes are made." (summary author note: Can this be right? Sounds like an easy example, moving something on a map. Why would it be so hard propositionally?)
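(summary author note: A small Python sketch of the contrast the authors draw, under my own assumptions. The move function updates a symbolic array by deleting and reinserting one symbol. The north_of_facts function lists the explicit facts that a purely propositional store would hold, so the facts invalidated by the move can be counted. All names and the toy map are hypothetical.)

```python
def move(array, symbol, new_pos):
    """Delete symbol from its old cell and insert it at new_pos (row, col)."""
    r, c = new_pos
    if array[r][c] not in (None, symbol):
        raise ValueError("target cell is occupied")
    for row in array:
        for i, cell in enumerate(row):
            if cell == symbol:
                row[i] = None
    array[r][c] = symbol

def north_of_facts(array):
    """Enumerate every north_of(a, b) fact implied by the array (row 0 is north)."""
    placed = [(r, s) for r, row in enumerate(array) for s in row if s is not None]
    return {(a, b) for ra, a in placed for rb, b in placed if ra < rb}

toy_map = [["A", None],
           ["B", "C"],
           [None, None]]

before = north_of_facts(toy_map)
move(toy_map, "A", (2, 0))       # one delete and one insert in the array
after = north_of_facts(toy_map)

stale = before - after           # facts a propositional store must retract
new = after - before             # facts it must assert
print(len(stale), len(new))
```

In the array, the update touches two cells no matter how many objects are on the map. The number of stored relation facts that change grows with the number of other objects, which may be what the authors have in mind.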
A weakness of a symbolic network with these relations is that it cannot represent quantification or disjunction (e.g. It's either north or south of France). [p182]
Nial: semantic network of frames. AKO relations allow for inheritance. fdefine puts values in slots. fchange, fput, and fdelete modify frames. fget retrieves information. PARTS and the orientation slot enable rendering into the occupancy array. [p184]
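(summary author note: A minimal Python sketch of a frame store with AKO inheritance. The function names mirror those listed above, but the signatures and behavior are my assumptions and are not Nial's actual definitions. fchange is omitted. The example frames are hypothetical.)

```python
frames = {}  # frame name -> dict mapping slot names to values

def fdefine(name, **slots):
    """Create (or replace) a frame with initial slot values."""
    frames[name] = dict(slots)

def fput(name, slot, value):
    """Set a slot on an existing frame."""
    frames[name][slot] = value

def fdelete(name, slot):
    """Remove a slot from a frame if it is present."""
    frames[name].pop(slot, None)

def fget(name, slot):
    """Look up a slot, following AKO links so values are inherited."""
    while name is not None:
        frame = frames[name]
        if slot in frame:
            return frame[slot]
        name = frame.get("AKO")   # climb to the parent frame, if any
    return None

# Hypothetical frames: a generic map and a specific map that is "a kind of" map.
fdefine("map", dimensions=2)
fdefine("map-of-europe", AKO="map", PARTS=["France", "Spain"])

print(fget("map-of-europe", "dimensions"))  # inherited from "map"
```

A value set locally with fput shadows the inherited one, and removing it with fdelete makes the inherited value visible again.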
Occupancy arrays can be rotated, translated, and zoomed. Functions can retrieve volume and shape. This is for visual reasoning.
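(summary author note: A Python sketch of what such operations might look like. As a simplification of my own, the occupancy array is stored sparsely as a dict from (x, y, z) cells to labels instead of a dense nested array. The function names are assumptions and not the system's primitives.)

```python
def translate(cells, dx, dy, dz):
    """Shift every occupied cell by the given offsets."""
    return {(x + dx, y + dy, z + dz): label for (x, y, z), label in cells.items()}

def rotate_z90(cells):
    """Rotate 90 degrees counterclockwise about the z axis: (x, y) -> (-y, x)."""
    return {(-y, x, z): label for (x, y, z), label in cells.items()}

def zoom(cells, k):
    """Scale by an integer factor k: each cell becomes a k*k*k block of cells."""
    return {(x * k + i, y * k + j, z * k + l): label
            for (x, y, z), label in cells.items()
            for i in range(k) for j in range(k) for l in range(k)}

def volume(cells, label):
    """Count the cells occupied by the given label."""
    return sum(1 for v in cells.values() if v == label)

# Hypothetical object: a bar three cells long, lying along the x axis.
bar = {(0, 0, 0): "bar", (1, 0, 0): "bar", (2, 0, 0): "bar"}

print(volume(bar, "bar"))
print(sorted(rotate_z90(bar)))
print(volume(zoom(bar, 2), "bar"))
```

Rotation and translation leave the volume unchanged, while zooming by k multiplies it by k cubed, so shape and size can be read from the array without any stored description.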
The system has attention-focusing and attention-shifting functions.
This system has been applied to molecular scene analysis.[p193]
Sloman (1985): "I believe that when we know how to represent shapes, spatial structures and spatial relationships, many other areas of AI will benefit, since spatial analogies and spatial modes of reasoning are so pervasive." pp386--287 [p200]