Quillian, M. (1968). Semantic Memory, in
M. Minsky (ed.), Semantic
Information Processing, pp 227-270, MIT Press;
reprinted in Collins & Smith (eds.), Readings in
Cognitive Science, section 2.1
Author of the summary: Jim Davies, 1998, jim@jimdavies.org
Cite this paper for:
- First semantic memory model that covers both general knowledge and
word knowledge
- Type nodes carry a word's meaning; token nodes point to type nodes. Many
token nodes can point to the same type node.
- A memory model that aims to store no information that could be inferred.
- Six kinds of links between concepts in a semantic network (p87).
- Three types of parameter symbols, S, D, and M (slots to be filled when
parsing sentences) (p88).
- Spreading activation with labels indicating the source (p91).
p80 (pages are of the Readings in Cognitive Science reprint)
The question dealt with: "What constitutes a reasonable view of how
semantic information is organized within a person's memory?" The model
presented here is intended as a model of human cognition (p96).
Here is the first task we want the model to be able to do: Given two
English words, compare and contrast the meanings. The model presented
in this chapter does a reasonable job at this.
First, some caveats:
- The model deals only with the "objective" meanings, as opposed to
the emotive connotations
- It is not a model of learning
- The model focuses on recognition, not recall
p81 Some background work:
There used to be two competing theories of semantic memory: one treating it
as an aggregate of associated elements, the other as organized around plans.
This distinction is no longer important, because it has been shown that
attributes, concepts, and plans can all be represented as lists (as in the
IPL language of Newell, Shaw, and Simon). Learning theorists also accept
this. The lack of a distinction allows modelers to use all kinds of
cognitive elements (plans, attributes, associations) as building blocks
(BASEBALL, SAD-SAM, and STUDENT all did it this way).
Simmons's Synthex project used a memory of verbatim text with an
index. It failed when it had to make inferences.
Green et al. and Lindsay organized memory as a single predefined
hierarchy. This failed when inferences required jumping around the
hierarchy (e.g., when the subject changed), and it only becomes more rigid
as the information in the tree grows.
p82:
The models mentioned above do not deal much with permanent memory; they
are intended to model cognitive processes. Linguistic theories appear to
care about permanent memory even less.
Following the tradition of Chomsky, linguists attempt to understand the
nature of language apart from people's use of it. Actually, they are not
completely consistent about whether they believe their grammars are a model
of human language use. In any case, the ideas the tradition has spawned
have inspired a lot of psycholinguistic work.
Some assume that semantic memory for words is separate from semantic
memory for other things (like the memory of your dog's face). That will
not be assumed here. Instead, we will assume that a single semantic memory
accounts for words, facts, perceptions, etc.
p83
Our theory says that "language is remembered, dealt with in thought, and
united to nonlinguistic concepts in a form that looks like the result of
phrase structure rules. . ." The grammar of an uttered sentence is decided
after the meaning, not before.
The model
The model is a mass of nodes connected by links. The nodes roughly
correspond to words, and they get their meaning in two ways:
- Type node: the node links to other nodes that explicitly make up its
definition.
- Token node: the node links to its type node. There can be many tokens
for a given type ("water" and "agua" point to the same meaning), but each
token points to only one type.
p85:
A type node has many connections, which in turn have their own
connections. The full meaning of a concept is the entire set of nodes that
can be reached from its type node. Each link is directed and labeled.
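To make the type/token distinction and the labeled links concrete, here is
a minimal sketch in Python. The class names, fields, and example words are
illustrative assumptions (the paper gives no code); the sketch only shows
that each token points to exactly one type node, and that a type node's
meaning is whatever can be reached from it through directed, labeled links.

# A minimal sketch (not Quillian's own data structures) of type vs. token
# nodes connected by directed, labeled links.
from dataclasses import dataclass, field

@dataclass
class TypeNode:
    word: str                                   # the word this node defines
    links: list = field(default_factory=list)   # outgoing (label, target) pairs

@dataclass
class TokenNode:
    word: str
    type_node: TypeNode                         # each token points to exactly one type

# "water" and "agua" are separate tokens that share one type (one meaning).
water_type = TypeNode("water")
water_en = TokenNode("water", water_type)
water_es = TokenNode("agua", water_type)

# Links are directed and labeled; the full meaning of "water" is everything
# reachable from water_type by following such links.
liquid = TypeNode("liquid")
water_type.links.append(("superclass", liquid))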
p86:
The sheer quantity of information to be stored argues that there should be
no redundancy, nor any information that could be inferred from more
primitive information. (see the summary author's note at the end)
It may be that visual and spatial representations are stored in the
same semantic network, but that such representations can be retrieved and
experienced "directly" to do spatial reasoning.
There need to be many kinds of links.
The ontology of links:
- Subclass to superclass
- Modification (adjective or adverb)
- Disjunction (e.g. earth, air, fire, water)
- Conjunction (e.g. old and red need to be conjoined so they both
can modify house in the phrase "old red house")
- The final two kinds are open-ended: each links two "thing" concepts to a
relationship concept, forming a sort of custom link.
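For reference, here is a hypothetical enumeration of the six link kinds in
Python; the member names follow the summary's descriptions above rather
than Quillian's own labels.

from enum import Enum, auto

class LinkKind(Enum):
    SUBCLASS = auto()       # subclass to superclass
    MODIFICATION = auto()   # an adjective or adverb modifies a concept
    DISJUNCTION = auto()    # "earth, air, fire, OR water"
    CONJUNCTION = auto()    # "old AND red" jointly modify "house"
    OPEN_ENDED_A = auto()   # the two open-ended kinds link two "thing"
    OPEN_ENDED_B = auto()   # concepts to a relationship concept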
p87-8:
What the nodes really represent are properties, which are flexible and
primitive. When a property is connected to a concept, there is a numerical
tag (with a fineness of nine gradations) specifying the intensity of that
relationship. Words like "a," "six," "perhaps," "very," and "not" are not
nodes; instead they dictate that range-restricting tags be attached to the
token nodes of other words.
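A small sketch of what an intensity-tagged property link might look like;
the PropertyLink class and the example values are assumptions, with only
the 1-to-9 scale taken from the nine gradations mentioned above.

from dataclasses import dataclass

@dataclass
class PropertyLink:
    concept: str
    prop: str
    intensity: int   # 1 (incidental) .. 9 (criterial): the nine gradations

links = [
    PropertyLink("plant", "alive", intensity=9),  # criterial: plants must be alive
    PropertyLink("plant", "green", intensity=5),  # typical but not required
]

# On this reading, words like "very" or "not" would not get nodes of their
# own; they would adjust or constrain tags like these on other words' tokens.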
p88:
Pronouns in sentences are replaced with explicit references to nodes
according to what that pronoun represents.
Plane:
Each different meaning of a word has its own plane. For example, the
word "plant" means 1. a living thing, 2. a place of manufacture, 3. a
verb meaning to place or put somewhere. Each sense of the word
corresponds to its own plane. The meaning, given a plane, is a
function of connections within the plane and connections to things off
the plane. See the diagram on page 84 for the plant example.
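A toy sketch of planes for the "plant" example; the dictionary layout and
the particular in-plane words are assumptions, meant only to illustrate
that each sense owns its own defining subgraph, which then links out into
the rest of the network.

# Each sense of "plant" gets its own plane (its own defining subgraph).
planes = {
    "plant": [
        {"sense": "living structure", "in_plane": ["live", "structure", "food", "air"]},
        {"sense": "place of manufacture", "in_plane": ["apparatus", "industry"]},
        {"sense": "to put somewhere", "in_plane": ["put", "earth", "grow"]},
    ]
}

# The meaning of a token of "plant" is fixed by which plane it sits in, plus
# the links running from that plane out to the rest of the network.
for plane in planes["plant"]:
    print(plane["sense"], "->", plane["in_plane"])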
There are three kinds of parameter symbols (S, D, M):
- S
"the parameter symbol whose value is to be any word related to
the present word as its subject."
- D
"the parameter symbol whose value is to be any word related to the
present word as its direct object."
- M
"the parameter symbol whose value is to be any word that the
present word directly modifies."
p89:
These are necessary for specifying the relationships between words
on the same definition plane. So in the definition of "to comb," there
would be a parameter slot D which was expecting something to comb
through-- the object of the combing. This slot stays open, expecting
something in the text to come along and fill it. "D always refers to
some object of the word in whose defining plane it appears. There may be
clue words (like "hair") which tell what the slot is likely to be
filled with.
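One way to picture the S, D, and M parameters as open slots in a defining
plane, using "comb" as in the text; the DefiningPlane class, its field
names, the clue-word field, and the example sentence are illustrative
assumptions.

from dataclasses import dataclass
from typing import Optional

@dataclass
class DefiningPlane:
    word: str
    S: Optional[str] = None       # word related to this word as its subject
    D: Optional[str] = None       # word related to this word as its direct object
    M: Optional[str] = None       # word that this word directly modifies
    D_clue: Optional[str] = None  # clue word suggesting a likely filler for D

# The defining plane of "to comb": D stays open, waiting for text to fill
# it, and the clue word "hair" hints at what kind of filler to expect.
comb = DefiningPlane("comb", D_clue="hair")

# Parsing a sentence such as "She combed the wig" would bind the open slots:
comb.S, comb.D = "she", "wig"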
p91:
One thing they had the program do was take two words and say what their
relationship was. The output was compared to human data, and the program
altered, to try to get it about right.
This is how it finds the connection between the meanings of two words:
starting from each word, expand outward along associative links, raising
the activation of each node encountered and labeling that node with its
source patriarch (the word that side of the spread started from). An
intersection node is a node that has been activated by both patriarchs'
searches.
Each activated node is labeled with
- where the spread started (its patriarch)
- which node most recently activated it (its immediate predecessor)
Using the second label, you can follow a path from a node back to its
patriarch. An intersection node will have two such paths, one to each
patriarch. In a sense, this is the two patriarchs searching for each
other.
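Here is a minimal sketch of this intersection search over a toy graph. The
interleaved breadth-first spread, the tagging of each node with its
patriarch and its immediate predecessor, and the reconstruction of a path
back to each patriarch follow the description above, but the adjacency-list
graph, the function name, and the example words are assumptions; Quillian's
program worked over his much richer link structure.

from collections import deque

def intersection_search(graph, word_a, word_b):
    # graph: dict mapping each node to the nodes it links to
    # tags: node -> {patriarch: immediate predecessor on the path from it}
    tags = {word_a: {word_a: None}, word_b: {word_b: None}}
    frontier = deque([(word_a, word_a), (word_b, word_b)])  # interleaved spread

    def path_to(node, patriarch):
        # Follow the predecessor labels back to the patriarch.
        path = [node]
        while tags[node][patriarch] is not None:
            node = tags[node][patriarch]
            path.append(node)
        return list(reversed(path))

    while frontier:
        node, patriarch = frontier.popleft()
        for nxt in graph.get(node, []):
            seen = tags.setdefault(nxt, {})
            if patriarch in seen:
                continue                      # already activated from this side
            seen[patriarch] = node            # label: which node activated it
            if len(seen) == 2:                # activated by both patriarchs
                return nxt, path_to(nxt, word_a), path_to(nxt, word_b)
            frontier.append((nxt, patriarch))
    return None

# Toy graph: "plant" and "animal" intersect at "live".
graph = {
    "plant": ["live", "structure", "food"],
    "animal": ["live", "thing"],
    "food": ["thing"],
}
print(intersection_search(graph, "plant", "animal"))

The returned intersection node and the two paths back to the patriarchs are
the raw material that the third part of the program turns into the
sentence-like strings described next.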
The third part of the program outputs sentence-like strings that describe
the relationship between the two concepts, like, given "plant" and "live,"
it returns: "A plant is a live structure" and "Plant is structure which
gets food from air. This food is thing being has to take into itself to
keep live."
p95:
In an experiment, the model correctly disambiguated 12 out of 19
ambiguous words.
p96:
Improvements to be made to the model:
- The parameters D, S, and M are not sufficient. E.g., in the definition of
"swarm" there would be a connection to "bees" with an S link (subject). But
in the sentence "The garden swarmed with bees," the garden is not itself
clustering in some area, as we might think from the definition. This could
be fixed by distinguishing ergative and locative kinds of subjecthood.
p98:
It is widely held that the same grammar is used to generate and
understand speech. For example, the analysis by synthesis theory
(Miller and Chomsky) claims that language understanding happens as
a result of trying to recreate the conditions which would result
in uttering the sentence yourself.
There are contradictory facts, though: Children can understand
sentences more complex than they can generate, as can foreigners
learning the language.
This model can understand without recreating a generative hypothesis.
This may have broad implications, as the facts stated above are part of
the reason Chomsky claims that there is an innate grammar.
Summary author's notes:
- In the modern ACT-R cognitive architecture, inferred facts become
represented and can eventually be retrieved on their own. This corresponds
to the psychological difference between retrieving addition facts and
figuring them out. The model presented in this paper, by insisting that no
information be stored that could be inferred, would have to figure out
every addition problem by adding, with no facts to retrieve. I think the
reaction-time data for addition facts argues against this view.