Keywords: Vision
Systems: No specific system is discussed.
Summary:
Chapter 1: Discusses the history of research in perception, leading to the notion that the key problem is a rigorous study of the internal mechanisms of vision rather than merely the behavioral characteristics that emerge from these mechanisms. Introduces the notion of levels of analysis and asserts that the study of complex systems depends heavily on it. Presents a very general discussion of representation and process. Describes three levels of analysis of information processing: theory, representation/algorithm, and implementation. Argues that both of the top two levels (theory and representation/algorithm) are crucial objects of study. Introduces the problem of vision as a mapping from inputs (arrays of photoreceptor values) to outputs (less obvious internal representations). Discusses the visual system of the housefly. Briefly discusses human vision, concluding with a three-stage process for vision: from image to primal sketch to 2.5-D sketch to 3-D model representation.
Chapter 2: Discusses early vision: the first two stages of the vision process (converting an image to a primal sketch and converting that primal sketch to a 2.5-D sketch). Describes the underlying images as consisting of surfaces, often composed of hierarchically organized elements (e.g. the stripes on a cat, which are themselves composed of hairs). Discusses continuity, boundaries, and motion. Describes the concept of the primal sketch in detail. Provides mathematical formulas for detecting zero-crossings (i.e. locations where image intensity changes sharply). Describes the conversion from zero-crossings into a raw primal sketch of edges, blobs, etc. Discusses key issues in the representation of localized orientation and organization. Discusses light source and transparency effects. Presents the construction of the full primal sketch as the recursive composition of elements from the raw primal sketch into larger, more general tokens.
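As an illustration of the zero-crossing idea mentioned in the Chapter 2 summary, here is a minimal 1-D sketch in Python. It is not taken from the book: it assumes a second-derivative-of-Gaussian filter (a 1-D analogue of a Laplacian-of-Gaussian operator), and the function names, the kernel radius of four standard deviations, and the `min_slope` threshold are all hypothetical choices made for this example.

```python
import math


def gaussian_second_derivative_kernel(sigma, radius=None):
    """Sample the second derivative of a 1-D Gaussian (illustrative filter choice)."""
    if radius is None:
        radius = int(math.ceil(4 * sigma))  # assumed truncation radius
    kernel = []
    for x in range(-radius, radius + 1):
        g = math.exp(-(x * x) / (2 * sigma * sigma))
        kernel.append((x * x / sigma ** 4 - 1 / sigma ** 2) * g)
    # Subtract the mean so the kernel sums to (nearly) zero and
    # constant-intensity regions produce (nearly) zero response.
    mean = sum(kernel) / len(kernel)
    return [k - mean for k in kernel]


def convolve_valid(signal, kernel):
    """Convolve, keeping only positions where the kernel fits entirely inside the signal."""
    flipped = kernel[::-1]
    m = len(kernel)
    return [
        sum(signal[i + j] * flipped[j] for j in range(m))
        for i in range(len(signal) - m + 1)
    ]


def zero_crossings(response, min_slope=0.05):
    """Return indices i where the response changes sign between i and i + 1.

    min_slope is a hypothetical threshold that discards sign changes
    caused by tiny numerical noise in flat regions.
    """
    crossings = []
    for i in range(len(response) - 1):
        a, b = response[i], response[i + 1]
        if a * b < 0 and abs(b - a) >= min_slope:
            crossings.append(i)
    return crossings


# A step edge: intensity jumps from 0 to 1 between samples 19 and 20.
signal = [0.0] * 20 + [1.0] * 20
kernel = gaussian_second_derivative_kernel(sigma=2.0)
response = convolve_valid(signal, kernel)
radius = len(kernel) // 2
# Shift by the kernel radius to map response indices back to signal positions.
edge_positions = [i + radius for i in zero_crossings(response)]
print(edge_positions)
```

For this step signal, the filtered response is positive just before the jump and negative just after it, so the single reported crossing sits at the intensity change. Larger values of `sigma` would respond to coarser structure, at the cost of a wider kernel.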
Chapter 5: Discusses the conversion from a homogeneous 2.5-D sketch into a modularized (i.e. multiple levels of abstraction) 3-D model. Focuses on the issues of representation and recognition of shapes and coordinate axes. Describes potential extensions to the theory such as 2-D vision, curved axes, and relationships between multiple objects. Presents a series of issues in greater detail: building the 3-D model, relating the object-centered coordinate system in the 3-D model to the viewer-centered one in the earlier stages, cataloging (i.e. memory storage, indexing, and retrieval) of the models, and recognition. Provides some psychological evidence for the preceding discussion.
Chapter 6: Summarizes four major points of the preceding work: levels of explanation, vision as an information-processing task, process-oriented accounts of visual behavior, and the heterogeneity of both subject (e.g. content, process, representation, etc.) and methodology (e.g. mathematical analysis, microscopic neurological observation, psychological experimentation, etc.).
Chapter 7: Introduces a question-and-answer format for addressing key issues in this theory. Defends the notion of levels of explanation while admitting that the levels do have interconnections. Describes systems based on feature detection as inherently too limited to do effective general visual information processing. Distinguishes between representation and implementation with regard to the issues of procedural and declarative information. Briefly characterizes the transition between images and zero-crossings as involving a change from a numerical domain to a symbolic one. Argues against microworld analyses such as blocks-world and Waltz's prism figures as being inherently non-scalable. Argues that Minsky's frames are really implementational mechanisms rather than representations. Characterizes the majority of AI (including ELIZA, production systems, etc.) as being inherently mechanism-based and claims that "the goal of such studies is mimicry rather than true understanding." Further discusses the numerical-to-symbolic transition. Discusses computational efficiency issues (within neurons) relating to eye movements. Briefly discusses natural language processing, planning, etc. within the context of the modularized levels-of-explanation framework.