http://www.jimdavies.org/summaries/
Rowley, H.A., S. Baluja, & T. Kanade (1998), Neural Network-Based
Face Detection, IEEE Transactions on PAMI, 20(1):23-38.
@Article{rowley1998,
author = {H.A. Rowley and S. Baluja and T. Kanade},
title = {Neural network-based face detection},
journal = {IEEE Transactions on PAMI},
year = {1998},
OPTkey = {},
volume = {20},
number = {1},
pages = {23--38},
}
Author of the summary: Jim R. Davies, 2000, jim@jimdavies.org
Cite this paper for:
- face detection
- to get bad examples, use what the nn incorrectly identifies as
a face. [p4]
- neural network to arbitrate the output of other neural
networks.
Purpose: To detect faces in an image.
Results: Detected 90.5% of the faces, with an acceptable number of
false positives, on a test set of 130 complex images.
How it works
A filter takes a 20x20 pixel window and outputs whether or not it
contains a face. The filter is applied at every location in the
image. It is also applied at every scale, so that even a face that
fills the entire image will be detected. To do this, the image is
repeatedly subsampled, so that larger faces shrink to fit the 20x20
window.
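The scan over positions and scales can be sketched as below. This is
only an illustration, not the authors' code: the function names, the
nearest-neighbour subsampling, the stride, and the default scale step
of 1.2 are all assumptions that this summary does not state.

```python
def subsample(image, factor):
    """Shrink a 2-D list-of-lists image by `factor` (nearest-neighbour, an assumption)."""
    height, width = len(image), len(image[0])
    new_h, new_w = int(height / factor), int(width / factor)
    return [[image[int(r * factor)][int(c * factor)] for c in range(new_w)]
            for r in range(new_h)]


def pyramid_windows(image, window=20, scale_step=1.2, stride=1):
    """Yield (level, row, col, patch) for each window x window patch at each scale."""
    level = 0
    current = image
    # Stop once the shrunken image is smaller than one window.
    while len(current) >= window and len(current[0]) >= window:
        for r in range(0, len(current) - window + 1, stride):
            for c in range(0, len(current[0]) - window + 1, stride):
                patch = [row[c:c + window] for row in current[r:r + window]]
                yield level, r, c, patch
        level += 1
        # Always subsample the original image to avoid compounding errors.
        current = subsample(image, scale_step ** level)
```

Each yielded patch would then be handed to the face/non-face filter.
A face that fills the whole image is found at the pyramid level
where the image has shrunk to roughly 20x20.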
The input image is made into a pyramid of progressively subsampled
copies. Each 20x20 window taken from the pyramid goes through the
following process:
- The lighting is corrected. This compensates for uneven
illumination across the window.
- The histogram is equalized. This effectively raises contrast.
- The resulting 20x20 window is input to the neural net.
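The two preprocessing steps above might look roughly like this. It is
a sketch under assumptions: the summary does not say how lighting is
corrected, so the code assumes a best-fit linear brightness plane is
subtracted from the whole window, and it equalizes by mapping each
pixel through the empirical CDF. Both function names are made up for
illustration.

```python
def correct_lighting(patch):
    """Subtract a least-squares plane a*x + b*y + c (an assumed correction method)."""
    h, w = len(patch), len(patch[0])
    mean = sum(map(sum, patch)) / (h * w)
    xm, ym = (w - 1) / 2, (h - 1) / 2
    # On a full rectangular grid the centred x and y coordinates are
    # orthogonal, so each slope can be solved for independently.
    sxx = h * sum((x - xm) ** 2 for x in range(w))
    syy = w * sum((y - ym) ** 2 for y in range(h))
    a = sum((x - xm) * patch[y][x] for y in range(h) for x in range(w)) / sxx
    b = sum((y - ym) * patch[y][x] for y in range(h) for x in range(w)) / syy
    return [[patch[y][x] - (a * (x - xm) + b * (y - ym) + mean) for x in range(w)]
            for y in range(h)]


def equalize_histogram(patch, levels=256):
    """Map each intensity through the empirical CDF onto 0 .. levels-1."""
    flat = sorted(v for row in patch for v in row)
    n = len(flat)
    cdf = {}
    for i, v in enumerate(flat):
        cdf[v] = (i + 1) / n  # fraction of pixels with intensity <= v
    return [[round(cdf[v] * (levels - 1)) for v in row] for row in patch]
```

A window would be passed through correct_lighting first and
equalize_histogram second before going to the net.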
This paper's main contribution is how the nn analyzes the window.
The hidden units are split into three sets of receptive fields:
first, 4 that each look at a 10x10 pixel area; second, 16 that each
look at a 5x5 area; and third, 6 that each look at an overlapping
horizontal band (20 wide by 5 high). The bands are suited to
detecting wide facial features such as a mouth or a pair of eyes.
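The layout of the three sets of receptive fields can be written down
as regions of the 20x20 window. This is a sketch: six bands that are
each 5 rows high cannot tile 20 rows without overlapping, so the code
assumes they are spaced 3 rows apart. That spacing and the function
name are assumptions, not taken from the summary.

```python
def receptive_fields(size=20):
    """Return three lists of (row, col, height, width) regions in a size x size window."""
    # 4 square fields of 10x10 pixels, tiling the window.
    quadrants = [(r, c, 10, 10) for r in range(0, size, 10) for c in range(0, size, 10)]
    # 16 square fields of 5x5 pixels, tiling the window.
    blocks = [(r, c, 5, 5) for r in range(0, size, 5) for c in range(0, size, 5)]
    # 6 full-width bands, 5 rows high, starting every 3 rows (assumed spacing).
    bands = [(r, 0, 5, size) for r in range(0, size - 4, 3)]
    return quadrants, blocks, bands
```

Each region would feed its own hidden unit (or group of units), so a
band unit sees a full-width strip while a 5x5 unit sees only a small
local patch.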
Training the network
Another big contribution of this paper is how they got representative
non-face images. They ran the partially trained nn on scenes that
contain no faces. Wherever it falsely identified a face, they added
that sub-image to the non-face training examples. Thus you get
non-faces that one might think were faces. Pretty smart!
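The idea of collecting non-faces from the net's own mistakes can be
sketched as a loop. Everything here is hypothetical scaffolding:
`train` stands for any routine that returns a classifier,
`windows_of` for any routine that cuts a scene into windows, and the
number of rounds is arbitrary.

```python
def bootstrap_negatives(train, faces, initial_nonfaces, scenes, windows_of, rounds=3):
    """Grow the non-face set from false detections, retraining after each round."""
    nonfaces = list(initial_nonfaces)
    model = train(faces, nonfaces)
    for _ in range(rounds):
        # The scenes contain no faces, so every positive is a false detection.
        false_alarms = [w for scene in scenes for w in windows_of(scene) if model(w)]
        if not false_alarms:
            break
        nonfaces.extend(false_alarms)
        model = train(faces, nonfaces)
    return model, nonfaces
```

The loop stops early once the classifier no longer fires on any
face-free scene, which is why the negative set stays small compared
with sampling non-face windows at random.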
"For each location and scale, the number of detections within a
specified neighborhood of that location can be counted. If that number
is above a threshold, then that location is classified as a face."
This is called "thresholding". If a face is detected at a location,
then any other detections that overlap it are probably errors and
are discarded. This is called "overlap elimination".
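A rough sketch of the two heuristics follows. It assumes each
detection is a (row, col, size) box in original-image pixels, and the
neighborhood radius, size tolerance, and minimum count are
illustrative values, not the paper's settings.

```python
def threshold_detections(boxes, radius=2, size_tol=4, min_count=2):
    """Keep boxes supported by at least min_count nearby boxes (itself included)."""
    scored = []
    for r, c, s in boxes:
        count = sum(1 for r2, c2, s2 in boxes
                    if abs(r - r2) <= radius and abs(c - c2) <= radius
                    and abs(s - s2) <= size_tol)
        if count >= min_count:
            scored.append((count, r, c, s))
    return scored


def eliminate_overlaps(scored):
    """Visit boxes from most to least supported; drop any that overlaps a kept box."""
    kept = []
    for count, r, c, s in sorted(scored, reverse=True):
        overlaps = any(r < r2 + s2 and r2 < r + s and c < c2 + s2 and c2 < c + s
                       for _, r2, c2, s2 in kept)
        if not overlaps:
            kept.append((count, r, c, s))
    return kept
```

Isolated detections are dropped by the first function, and each
cluster of mutually overlapping detections collapses to a single box
in the second.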
To further reduce false positives, multiple networks
arbitrate. [p5] Arbitration works by ANDing the output pyramids of
two networks: a detection is kept only if both networks report a face
at the same position and scale. Two networks are unlikely to make the
same false detection at the same position and scale. They also
trained an arbitration nn to take the outputs of the other nn's and
decide whether there really is a face. This worked about as well as
the AND and OR heuristics. [p6]
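The AND and OR heuristics amount to set operations on the two
networks' detections. A minimal sketch, assuming each detection is a
(row, col, level) tuple and that only exact matches count (the
function names are made up for illustration):

```python
def arbitrate_and(dets_a, dets_b):
    """Keep only detections that both networks report at the same position and scale."""
    return sorted(set(dets_a) & set(dets_b))


def arbitrate_or(dets_a, dets_b):
    """Keep detections reported by either network."""
    return sorted(set(dets_a) | set(dets_b))
```

AND trades some missed faces for fewer false positives, while OR does
the opposite.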
The experiments, related work, and future work sections are not
summarized here.
Summary author's notes:
- page numbers are from the pre-print version. Add 23 to each to
get the journal page number. :)
Cognitive Science Summaries Webmaster: Jim Davies (jim@jimdavies.org)
Last modified: Fri Mar 3 11:12:16 EST 2000