
Fisher, D.H. (1987), Knowledge Acquisition via Incremental Conceptual Clustering, Machine Learning 2:139-172, reprinted in Shavlik & Dietterich (eds.), Readings in Machine Learning, section 3.2.1.

@Article{fisher87,
  author  = {D. H. Fisher},
  title   = {Knowledge Acquisition via Incremental Conceptual Clustering},
  journal = {Machine Learning},
  year    = {1987},
  volume  = {2},
  pages   = {139--172},
}

Author of the summary: Alexander Stoytchev, 2000, saho@cc.gatech.edu

Cite this paper for:

A conceptual clustering system accepts a set of object descriptions
(events, observations, facts) and produces a classification scheme over
the observations. These systems do not require a teacher (i.e., they are
unsupervised), but they use an evaluation function to discover classes
with "good" conceptual descriptions. Learning of this kind is referred to
as learning from observation (as opposed to learning from examples).

Typically, conceptual clustering systems assume that the observations
remain available indefinitely, so that batch processing (as opposed to
incremental processing) using all observations is possible. COBWEB, by
contrast, is an incremental system.

Clustering forms a classification tree over objects. Two problems need
to be addressed:
  * the clustering problem: determining useful subsets of a set of objects
  * the characterization problem: determining useful concepts for each
    object class

Search control and direction
============================

Notation:
  Ai=Vij - an attribute-value pair (attribute Ai takes value Vij)
  Ck     - a class

Intra-class similarity is given by conditional probabilities of the
following kind:

   P(Ai=Vij | Ck)

The larger this probability, the greater the proportion of class members
sharing the value Vij, and the more predictable the value is of class
members.


Inter-class similarity is given by

  P(Ck | Ai=Vij)

The larger this probability, the fewer the objects in contrasting classes
that share this value, and the more predictive the value Vij is of the
class Ck.
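
As a concrete illustration, here is a minimal sketch that estimates both
quantities from raw counts. The data and function names are made up for
illustration; they are not from the paper.

from collections import Counter

# Hypothetical observations: (class, attribute-value) pairs.
observations = [
    ("bird", "flies=yes"), ("bird", "flies=yes"), ("bird", "flies=no"),
    ("plane", "flies=yes"), ("fish", "flies=no"),
]

pair_counts  = Counter(observations)                 # N(Ck, Ai=Vij)
class_counts = Counter(c for c, _ in observations)   # N(Ck)
value_counts = Counter(v for _, v in observations)   # N(Ai=Vij)

def predictability(value, cls):
    """P(Ai=Vij | Ck): how predictable the value is of class members."""
    return pair_counts[(cls, value)] / class_counts[cls]

def predictiveness(value, cls):
    """P(Ck | Ai=Vij): how predictive the value is of the class."""
    return pair_counts[(cls, value)] / value_counts[value]

print(predictability("flies=yes", "bird"))   # 2/3
print(predictiveness("flies=yes", "bird"))   # 2/3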

These probabilities can be combined to give an overall measure of
partition quality, where a partition is a set of mutually exclusive
object classes {C1, C2, ..., Cn}. The measure used in this paper is:

   n
  Sum  Sum  Sum  P(Ai=Vij) * P(Ck|Ai=Vij) * P(Ai=Vij|Ck)       (3-1)
  k=1   i    j

In essence, the formula represents a trade-off between intra-class
similarity and inter-class dissimilarity, summed across all classes (k),
attributes (i), and values (j). The probability P(Ai=Vij) weights the
importance of individual values, in essence saying that it is more
important to increase the class-conditioned predictability and
predictiveness of frequently occurring values than of infrequently
occurring ones.
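
As a sketch of how (3-1) could be computed from data, assuming objects
are represented as attribute->value dicts (my representation, not the
paper's):

from collections import Counter

# Hypothetical partition: class name -> list of objects.
partition = {
    "C1": [{"color": "red", "size": "big"},
           {"color": "red", "size": "small"}],
    "C2": [{"color": "blue", "size": "big"}],
}

def partition_quality(partition):
    """Formula (3-1): sum over k, i, j of
    P(Ai=Vij) * P(Ck|Ai=Vij) * P(Ai=Vij|Ck), estimated from counts."""
    objects = [o for objs in partition.values() for o in objs]
    total = len(objects)
    value_counts = Counter((a, v) for o in objects for a, v in o.items())
    score = 0.0
    for objs in partition.values():
        class_counts = Counter((a, v) for o in objs for a, v in o.items())
        for (a, v), n_kv in class_counts.items():
            p_v         = value_counts[(a, v)] / total    # P(Ai=Vij)
            p_k_given_v = n_kv / value_counts[(a, v)]     # P(Ck|Ai=Vij)
            p_v_given_k = n_kv / len(objs)                # P(Ai=Vij|Ck)
            score += p_v * p_k_given_v * p_v_given_k
    return score

print(partition_quality(partition))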


The above formula can be rewritten using Bayes' rule:

                                           P(Ck, Ai=Vij)
  P(Ai=Vij) * P(Ck|Ai=Vij) = P(Ai=Vij) * ---------------
                                            P(Ai=Vij)

                           = P(Ck, Ai=Vij)

                           = P(Ck) * P(Ai=Vij|Ck)

Thus formula (3-1) can be rewritten as:


   n
  Sum P(Ck) Sum Sum  P(Ai=Vij|Ck)^2                           (3-2)
  k=1        i   j
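
The equivalence is easy to verify numerically. A self-contained check on
a made-up joint distribution over two classes and one three-valued
attribute:

# Joint probabilities P(Ck, Ai=Vij); rows are classes, columns are values.
joint = [[0.30, 0.10, 0.10],
         [0.05, 0.15, 0.30]]

p_class = [sum(row) for row in joint]         # P(Ck)
p_value = [sum(col) for col in zip(*joint)]   # P(Ai=Vij)

# Formula (3-1): Sum of P(Ai=Vij) * P(Ck|Ai=Vij) * P(Ai=Vij|Ck).
f31 = sum(p_value[j] * (joint[k][j] / p_value[j]) * (joint[k][j] / p_class[k])
          for k in range(2) for j in range(3))

# Formula (3-2): Sum of P(Ck) * P(Ai=Vij|Ck)^2.
f32 = sum(p_class[k] * (joint[k][j] / p_class[k]) ** 2
          for k in range(2) for j in range(3))

assert abs(f31 - f32) < 1e-12   # the two formulas agree
print(f31, f32)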



Finally, the formula can be transformed once more, based on category
utility: the increase in the expected number of attribute values that
can be correctly guessed given a partition {C1, C2, ..., Cn},

  P(Ck) Sum Sum P(Ai=Vij|Ck)^2 ,
         i   j

over the expected number of correct guesses given no such knowledge,

  Sum Sum P(Ai=Vij)^2 :
   i   j

   n
  Sum P(Ck) { Sum Sum [ P(Ai=Vij|Ck)^2 - P(Ai=Vij)^2 ] }       (3-3)
  k=1          i   j
  ----------------------------------------------------
                           n

where n is the number of categories in the partition.
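
A minimal sketch of (3-3), mine rather than the paper's, reusing the
dict-of-attribute-values object representation assumed above:

from collections import Counter

def category_utility(partition):
    """Formula (3-3): average over the n classes of
    P(Ck) * Sum_ij [ P(Ai=Vij|Ck)^2 - P(Ai=Vij)^2 ]."""
    objects = [o for objs in partition.values() for o in objs]
    total = len(objects)
    value_counts = Counter((a, v) for o in objects for a, v in o.items())
    # Expected correct guesses with no class knowledge: Sum_ij P(Ai=Vij)^2.
    base = sum((n / total) ** 2 for n in value_counts.values())
    cu = 0.0
    for objs in partition.values():
        p_k = len(objs) / total
        class_counts = Counter((a, v) for o in objs for a, v in o.items())
        # Expected correct guesses given Ck: Sum_ij P(Ai=Vij|Ck)^2.
        cond = sum((n / len(objs)) ** 2 for n in class_counts.values())
        cu += p_k * (cond - base)
    return cu / len(partition)

partition = {
    "C1": [{"color": "red", "size": "big"},
           {"color": "red", "size": "small"}],
    "C2": [{"color": "blue", "size": "big"}],
}
print(category_utility(partition))   # 5/18 for this toy partition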

The attribute-value probabilities are computed from integer counts.
For example:

                   (#times-a-bird-was-observed-to-fly)
  P(flies|bird) = -------------------------------------
                     (#times-a-bird-was-observed)



Algorithm:
==========

COBWEB incrementally incorporates objects into a classification tree.
Each node of the tree is a probabilistic concept that represents an
object class. Incorporating an object is a process of classifying the
object by descending the tree along an appropriate path, updating counts
(for the probabilities) along the way, and performing one of the
following operations (a rough sketch of this loop is given after the
list):

   * classify the object with respect to an existing class
   * create a new class
   * merge two classes
   * divide a class into several classes
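
To make the control flow concrete, here is a heavily simplified sketch of
the incorporation loop. It is my own code, not Fisher's: it implements
only the first two operators (classify into the best existing child,
create a new class), omits merging and splitting, and compares candidate
placements by the category utility of the resulting partition.

from collections import Counter

class Node:
    """A probabilistic concept: attribute-value counts over the objects
    classified under this node."""
    def __init__(self):
        self.counts = Counter()    # (attribute, value) -> count
        self.n = 0                 # number of objects under this node
        self.children = []

    def add(self, obj):
        self.n += 1
        self.counts.update(obj.items())

def leaf_for(obj):
    node = Node()
    node.add(obj)
    return node

def cu_of_split(parent, children):
    """Category utility (3-3) of partitioning parent's objects."""
    base = sum((c / parent.n) ** 2 for c in parent.counts.values())
    cu = 0.0
    for child in children:
        cond = sum((c / child.n) ** 2 for c in child.counts.values())
        cu += (child.n / parent.n) * (cond - base)
    return cu / len(children)

def cobweb(node, obj):
    node.add(obj)                       # update counts along the path
    if not node.children:
        return
    # Operator 1: trial-insert obj into each child; remember the best host.
    best, best_cu = None, float("-inf")
    for child in node.children:
        child.add(obj)
        cu = cu_of_split(node, node.children)
        if cu > best_cu:
            best, best_cu = child, cu
        child.n -= 1                    # undo the trial insertion
        child.counts.subtract(obj.items())
    # Operator 2: create a new singleton class for obj.
    if cu_of_split(node, node.children + [leaf_for(obj)]) > best_cu:
        node.children.append(leaf_for(obj))
    else:
        cobweb(best, obj)               # descend into the best host

root = Node()
for obj in [{"flies": "yes"}, {"flies": "no"}, {"flies": "yes"}]:
    if root.n == 0:
        root.add(obj)                   # the very first object
    else:
        if not root.children:           # promote the first object to a child
            first = Node()
            first.counts = root.counts.copy()
            first.n = root.n
            root.children = [first]
        cobweb(root, obj)
print(root.n, len(root.children))       # e.g. "3 2"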

 
The paper continues with examples and results on several data sets. The
author tries to establish the claim that clustering systems can be
useful for making predictions about future objects.

Furthermore, he examines whether the value of a given attribute can be
used to predict the value of another attribute (not just the class of
the object). This is quantified by formula (4-1).


============
COBWEB is an incremental conceptual clustering system and can be viewed
as hill climbing through the space of classification trees. Several
criteria can be used to evaluate incremental systems:

  * the cost of incorporating a single instance into a classification tree
  * the quality of learned classification trees
  * the number of objects required to converge on a stable classification
    tree

Results and graphs for these criteria are given in section 4 of the paper.


COBWEB seeks a classification in which the first level of the tree is
optimal with respect to a measure of clustering quality. However, like
all hill-climbing methods, it is sensitive to the order of the examples
and can get stuck in a local optimum.


Similar systems:

  * Michalski and Stepp (1983): conceptual clustering
  * Lebowitz (1982): UNIMEM
  * CYRUS
  * CLASSIT (an extension of COBWEB)
  * CLUSTER/2

Summary author's notes:

