Gary McGraw & Daniel Drasin. (1993) Recognition of Gridletters: Probing
the Behavior of Three Competing Models. In Proceedings of the Fifth Midwest
AI and Cognitive Science Conference, pages 63-67, April 1993.
Author of the summary: Patrawadee Prasangsit, 1999, pp@cc.gatech.edu
The actual paper is online.
Cite this paper for:
- Letter Spirit: a project that models aspects of human high-level perception and creativity on a computer, focusing on the creative act of artistic letter-design.
- DumRec, NetRec, and FnetRec: the three models compared in the paper.
Summary:
This paper compares the performance of three different models of letter
recognition in the Letter Spirit domain.
Categorical sameness is the property possessed by instances of a
single letter in various styles (e.g., the letter 'a' in Times, Courier,
and Palatino).
Stylistic sameness is the property possessed by instances of various
letters in a single style (e.g., the letters 'a', 'b', and 'c' in Times).
Each letter is formed from a set of short line segments, called quanta,
on a fixed grid of dimension 3x7. See figure 2.
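The quanta representation can be sketched as a fixed-length binary vector; the indices used below are hypothetical (the actual numbering of the 56 quanta is defined by the paper's figure 2):

```python
# A letter on the Letter Spirit grid is a set of "on" quanta out of 56.
# Quantum indices here are placeholders; the real numbering follows figure 2.
NUM_QUANTA = 56

def encode_letter(on_quanta):
    """Return a 56-element 0/1 vector with the given quanta turned on."""
    vec = [0] * NUM_QUANTA
    for q in on_quanta:
        vec[q] = 1
    return vec

vec = encode_letter({3, 10, 17, 24})
```

This vector form is what makes the letters directly usable as network input in the models below.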
The three models are:
DumRec
- Associated with each training letter is a property list.
- Given a mystery letter, DumRec computes its property list and compares it with that of each training letter. The score is a weighted sum of the matches between the property lists.
- The weights play a crucial role in DumRec's performance; modifying them is how DumRec is "tuned".
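The weighted property-list matching can be sketched as follows (the property names and weights are invented for illustration; the paper's actual property lists are richer):

```python
# Hypothetical sketch of DumRec-style scoring: compare a mystery letter's
# property list against each training letter's, with a weight per property.
def match_score(mystery_props, training_props, weights):
    """Weighted sum of per-property agreement (1 if equal, 0 otherwise)."""
    return sum(w * (mystery_props.get(p) == training_props.get(p))
               for p, w in weights.items())

def classify(mystery_props, training_set, weights):
    """Return the training letter whose property list matches best."""
    return max(training_set,
               key=lambda letter: match_score(mystery_props,
                                              training_set[letter], weights))

# Toy training set and weights (assumptions, not from the paper).
training = {"a": {"tips": 2, "has_ascender": False},
            "b": {"tips": 1, "has_ascender": True}}
weights = {"tips": 1.0, "has_ascender": 2.0}
guess = classify({"tips": 2, "has_ascender": False}, training, weights)  # "a"
```

Tuning DumRec then amounts to adjusting the `weights` dictionary until recognition improves.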
NetRec
- 2- or 3-layer feedforward connectionist networks trained using backpropagation.
- 56 input units, each corresponding to a quantum; 26 output units, each corresponding to a letter of the alphabet. The hidden layer may have 0-120 units.
- Major open parameters: the learning rate for backpropagation and the number of hidden units.
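The architecture can be sketched as a plain forward pass (the hidden-layer size, weight initialization, and sigmoid activation are assumptions; the paper experiments with a range of configurations):

```python
import math
import random

random.seed(0)

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

# NetRec-style sizes: 56 quantum inputs, an optional hidden layer
# (0-120 units; 30 chosen arbitrarily here), 26 letter outputs.
n_in, n_hidden, n_out = 56, 30, 26
W1 = [[random.gauss(0, 0.1) for _ in range(n_in)] for _ in range(n_hidden)]
W2 = [[random.gauss(0, 0.1) for _ in range(n_hidden)] for _ in range(n_out)]

def layer(weights, inputs):
    """One fully connected layer with sigmoid activation."""
    return [sigmoid(sum(w * x for w, x in zip(row, inputs))) for row in weights]

def forward(quanta_vec):
    """56-element 0/1 quanta vector -> 26 letter activations."""
    return layer(W2, layer(W1, quanta_vec))

out = forward([0] * 56)
```

The index of the largest output activation would be taken as the network's letter guess.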
FnetRec
- A variation of NetRec that forces the network to pay more attention to certain features chosen by humans.
- A number of small "subnets" are trained to detect particular features. Examples of features are height, weight, descenders, ascenders, different numbers of tips, etc.
- The input to the letter-recognizer network consists of the existing 56 input units plus the outputs of the subnets.
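The input augmentation can be sketched as simple vector concatenation; the stand-in "subnets" below are hypothetical placeholders for the small trained networks the paper describes:

```python
# Sketch of FnetRec-style input augmentation: the outputs of feature
# "subnets" are appended to the 56 raw quanta before they reach the
# letter-recognizer network. These lambdas are crude stand-ins, not
# the paper's trained subnets.
subnets = [
    lambda v: sum(v) / 56.0,          # e.g. overall density/weight of the letter
    lambda v: 1.0 if v[0] else 0.0,   # e.g. a toy detector for one quantum region
]

def augmented_input(quanta_vec):
    """Raw 56 quanta plus one extra input unit per subnet output."""
    return list(quanta_vec) + [net(quanta_vec) for net in subnets]

x = augmented_input([1] * 56)
```

The recognizer network is then trained on these longer input vectors, so the human-chosen features are guaranteed to be available to it.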
Comparing Performance
- The percentage of successful recognitions: DumRec 74.3%, FnetRec 72.84%, NetRec 70.45%.
- DumRec performs best of the three; however, the differences are not large.
- Most of the time, DumRec guesses the correct letter or a "reasonable" wrong one.
- NetRec and FnetRec behave very similarly when compared letter by letter, though in general the latter slightly outperforms the former.
For all models, performance is still unacceptable (too many mis-categorizations). Possible reasons:
- DumRec: probably the features considered are too low-level. Better recognition requires the use of higher-level features (e.g., roles).
- NetRec and FnetRec: style may interfere with recognition.
Summary author's notes:
Cognitive Science Summaries Webmaster:
JimDavies ( jim@jimdavies.org
)
Last modified: Thu May 6 09:02:17 EDT 1999