Slide1
A Comparison of Rule-Based versus Exemplar-Based Categorization Using the ACT-R Architecture
Matthew F. RUTLEDGE-TAYLOR, Christian LEBIERE, Robert THOMSON, James STASZEWSKI, and John R. ANDERSON
Carnegie Mellon University, Pittsburgh, PA, USA
19th Annual ACT-R Workshop: Pittsburgh, PA, USA
Slide2
Overview
Categorization theories
Facility Identification Task
Study examples of four different facilities
Categorize unseen facilities
ACT-R Models
Rule-based versus Exemplar-based
Three different varieties of each based on information attended
Model Results
Rule-based models are equivalent to exemplar-based models in terms of hit-rate performance
Discussion
Slide3
Categorization theories
Rule-based theories (Goodman, Tenenbaum, Feldman & Griffiths, 2008)
Exceptions, e.g. RULEX (Nosofsky & Palmeri, 1995)
Probabilistic membership (Goodman et al., 2008)
Prototype theories (Rosch, 1973)
Multiple prototype theories
Exemplar theories (Nosofsky, 1986)
WTA vs. weighted similarity
ACT-R has been used previously to compare and contrast exemplar-based and rule-based approaches to categorization (Anderson & Betz, 2001)
Slide4
Facility Identification Task
Building (IMINT)
Hardware
MASINT1
MASINT2
SIGINT
Notional Simulated Imagery
Four kinds of facilities
Probabilistic feature composition
Slide5
Facility Identification Task
Probabilistic occurrences of features

Feature      Facility A   Facility B   Facility C   Facility D
Building 1   High         Mid          High         Mid
Building 2   High         Mid          High         High
Building 3   High         Mid          Mid          High
Building 4   High         High         Mid          Mid
Building 5   Low          High         Mid          High
Building 6   Low          High         High         High
Building 7   Low          High         High         Mid
Building 8   Low          Mid          Mid          High
MASINT1      Few          Many         Few          Many
MASINT2      Few          Many         Many         Few
SIGINT       Many         Few          Many         Few
Hardware     Few          Few          Few          Many
Slide6
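The probabilistic feature composition above can be sketched in code. This is a minimal illustration only: the High/Mid/Low and Many/Few labels are qualitative on the slide, so the numeric probabilities, count ranges, and the `P_PRESENT`, `COUNT_RANGE`, and `sample_facility` names below are all assumptions, not the task's actual generation parameters.

```python
import random

# Assumed, illustrative parameters (the slide gives only qualitative labels)
P_PRESENT = {"High": 0.8, "Mid": 0.5, "Low": 0.2}   # P(building present)
COUNT_RANGE = {"Many": (5, 8), "Few": (0, 3)}       # feature count ranges

FACILITY_A = {  # column "Facility A" of the table above
    "buildings": ["High", "High", "High", "High", "Low", "Low", "Low", "Low"],
    "MASINT1": "Few", "MASINT2": "Few", "SIGINT": "Many", "Hardware": "Few",
}

def sample_facility(spec, rng=random):
    """Sample one notional facility instance from a column of the table."""
    instance = {}
    for i, level in enumerate(spec["buildings"], start=1):
        instance[f"building{i}"] = rng.random() < P_PRESENT[level]
    for feature in ("MASINT1", "MASINT2", "SIGINT", "Hardware"):
        lo, hi = COUNT_RANGE[spec[feature]]
        instance[feature] = rng.randint(lo, hi)
    return instance
```

Each sampled instance is one study or test trial: a set of present buildings plus counts for the four non-building feature types.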
Three comparisons
Human data versus model data
Hit-rate accuracy
Exemplar model versus rule-based model
Blended retrieval of facility chunk, vs. retrieval of one or more rules that manipulate a probability distribution
Cognitive phenotypes: versions of both exemplar and rule-based models that attend to different data
Feature counts
Buildings that are present
Both
Slide7
Three participant phenotypes
Phenotype #1: Assumes buildings are key
Attentive to specific buildings in the image
Ignores the MASINT, SIGINT, and Hardware
Phenotype #2: Assumes the number of each feature type is key
Attentive to counts of each facility feature
Ignores the types of buildings (just counts them)
Phenotype #3: Attends to both specific buildings and feature counts
Slide8
Facility Identification
Phenotype #1
Specific Buildings only: SA model
Building #2
Building #3
Building #6
Building #7
Slide9
Facility Identification
Phenotype #2
Feature type counts only: PM model
Buildings: 4
Hardware: 1
MASINT1: 6
MASINT2: 2
SIGINT: 5
Slide10
Facility Identification
Phenotype #3
SA and PM
Building #2
Building #3
Building #6
Building #7
Hardware: 1
MASINT1: 6
MASINT2: 2
SIGINT: 5
Slide11
ACT-R Exemplar based model
Implicit statistical learning
Commits tokens of facilities to declarative memory
Slots for facility type (A, B, C, or D)
Slots for sums of each feature type
Slot for presence (or absence) of each building (IMINT)
Categorization
Retrieval request made to DM based on facility features in target
Category slot value of the retrieved chunk is used as the categorization decision of the model
Slide12
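The exemplar encoding above can be sketched in plain Python. This is a stand-in for ACT-R chunks, not the actual model code: the `encode_facility` function, dict-based chunks, and slot names are illustrative assumptions following the slot structure described on the slide.

```python
# Illustrative sketch: each studied facility token becomes a chunk with a
# category slot, a count slot per feature type, and a presence slot per
# building (IMINT). A Python dict stands in for an ACT-R chunk.
def encode_facility(dm, category, counts, buildings):
    chunk = {"category": category}
    chunk.update(counts)                 # sums of each feature type
    for i in range(1, 9):                # presence/absence of each building
        chunk[f"b{i}"] = i in buildings
    dm.append(chunk)

dm = []  # declarative memory as a list of chunks
encode_facility(dm, "d",
                {"MASINT1": 6, "MASINT2": 2, "SIGINT": 5, "Hardware": 1},
                buildings={2, 3, 6, 7})
```

At test, a retrieval cue built from the target's features is matched against these chunks, and the `category` slot of the retrieved chunk is the model's answer.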
Facility chunk
Slide13
ACT-R: Chunk activation
A_i = B_i + S_i + P_i + ε_i
A_i is the net activation, B_i is the base-level activation, S_i is the effect of spreading activation, P_i is the effect of the partial-matching mismatch penalty, and ε_i is the magnitude of activation noise.
Slide14
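The activation equation can be sketched directly. In ACT-R the transient noise is logistically distributed with a modeler-set scale; the default `s=0.25` below is an illustrative value, not the parameter fitted for this model.

```python
import math
import random

def activation(B, S, P, s=0.25, rng=random):
    """A_i = B_i + S_i + P_i + eps_i.  eps_i is logistic noise with
    scale s (ACT-R's transient activation noise); s=0.25 is illustrative."""
    u = rng.random()
    eps = s * math.log(u / (1.0 - u))   # inverse-CDF sample of a logistic
    return B + S + P + eps
```

The chunk with the highest net activation (above the retrieval threshold) wins the retrieval.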
Spreading Activation
All values in all included buffers spread activation to DM
All facility features held in the visual buffer spread activation to all chunks in DM
Primary retrieval factor for phenotype #1 (buildings)
Slide15
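The mechanism above can be sketched as follows. Splitting the source activation W evenly over buffer slots and fixing the association strength S_ji at 1.0 for any shared value are simplifying assumptions for illustration; the `spreading_activation` function is not the model's actual code.

```python
# Illustrative sketch: each non-empty buffer slot is a source j that
# contributes W_j * S_ji to every DM chunk i containing its value.
def spreading_activation(buffer_slots, chunk, W=1.0):
    sources = [v for v in buffer_slots.values() if v is not None]
    if not sources:
        return 0.0
    w_j = W / len(sources)          # source activation split over sources
    s_ji = 1.0                      # assumed uniform association strength
    return sum(w_j * s_ji for v in sources if v in chunk.values())
```

Chunks sharing more buildings with the visual buffer thus receive more spread, which is why this term dominates retrieval for phenotype #1.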
Spreading Activation
Visual buffer (facility chunk):
b1 nil, b2 building2, b3 building3, b4 nil, b5 nil, b6 building6, b7 building7, category d
Declarative memory (facility chunks):
b1 nil, b2 building2, b3 nil, b4 nil, b5 nil, b6 building6, b7 building7, category d
b1 nil, b2 building2, b3 building3, b4 building4, b5 nil, b6 building6, b7 nil, category a
b1 building1, b2 nil, b3 building3, b4 nil, b5 building5, b6 nil, b7 nil, category d
Slide16
Partial Matching
The partial match is on a slot-by-slot basis
For each chunk in DM, the degree to which each slot mismatches the corresponding slot in the retrieval cue determines the mismatch penalty
Primary retrieval factor for phenotype #2 (counts)
Slide17
Partial Matching
Retrieval buffer (facility chunk):
buildings b4, Masint1 m6, Masint2 n2, Sigints s5, hardware h1, category d
Declarative memory (facility chunks):
buildings b4, Masint1 m7, Masint2 n0, Sigints s7, hardware h2, category d
buildings b5, Masint1 m4, Masint2 n1, Sigints s5, hardware h2, category d
buildings b5, Masint1 m1, Masint2 n8, Sigints s5, hardware h0, category c
Equal values = no penalty; similar values = low penalty; dissimilar values = high penalty
Slide18
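The slot-by-slot penalty can be sketched on the count slots. The linear similarity on counts and the `mp` and `max_diff` values are assumptions for illustration; in ACT-R the modeler supplies the similarity function and the mismatch-penalty scale.

```python
# Illustrative sketch of the mismatch penalty: equal counts incur no
# penalty, similar counts a small one, dissimilar counts a large one.
def mismatch_penalty(cue, chunk, mp=1.0, max_diff=10.0):
    penalty = 0.0
    for slot, wanted in cue.items():
        held = chunk.get(slot)
        if held is None:
            continue
        similarity = -min(abs(wanted - held), max_diff) / max_diff  # 0 .. -1
        penalty += mp * similarity
    return penalty   # P_i <= 0, added to the chunk's activation

cue   = {"buildings": 4, "MASINT1": 6, "MASINT2": 2, "SIGINT": 5, "Hardware": 1}
close = {"buildings": 5, "MASINT1": 4, "MASINT2": 1, "SIGINT": 5, "Hardware": 2}
far   = {"buildings": 5, "MASINT1": 1, "MASINT2": 8, "SIGINT": 5, "Hardware": 0}
```

Under this sketch the `close` chunk is penalized less than the `far` one, so, other terms equal, it is the more likely retrieval.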
Heat Map on Counts of Features
Slide19
Results of Exemplar Based Model
PM only: 0.462
SA only: 0.665
PM + SA: 0.720
Human participant accuracy: 0.535
Performance and interviews suggest:
Mix of phenotypes, with #2 (PM-like) most prevalent
Employment of some explicit rules
Slide20
ACT-R Rule Based Model
Applies a set of rules to the unidentified target facility
Accumulates a net probability distribution over the four possible facility categories
The facility category with the greatest probability is the model's forced-choice response
Slide21
ACT-R Rule Based Model
Two kinds of rules
SA-like: applies to presence of buildings
PM-like: applies to feature counts
Rules implemented as chunks in DM
Sets of dedicated productions for retrieving relevant rules
High confidence in choice of rules
Based on analysis of probabilities of features
Slide22
ACT-R Rule Based Model
Example building rule:
If [building] is present, then facility A is 1.38 times more likely (than if not present)
Example count rule:
If there are 5 MASINT1, then facility A is 3 times more likely (than if more or fewer)
Note: Count rules apply if the count total in the target is within a threshold difference of the number in the rule
Slide23
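The accumulation of rule evidence can be sketched as multiplying category odds by likelihood factors and normalizing. The rule encoding, the `apply_rules` function, and the `building_x` slot name are illustrative assumptions; only the two factor values come from the slide's examples.

```python
# Illustrative sketch: each applicable rule multiplies one category's
# weight by its likelihood factor; the result is normalized into a
# probability distribution over the four categories.
def apply_rules(rules, target, threshold=1):
    weights = {"A": 1.0, "B": 1.0, "C": 1.0, "D": 1.0}
    for kind, slot, value, category, factor in rules:
        if kind == "building" and target.get(slot):
            weights[category] *= factor            # building-presence rule
        elif kind == "count" and abs(target.get(slot, 0) - value) <= threshold:
            weights[category] *= factor            # count rule, within threshold
    total = sum(weights.values())
    return {c: w / total for c, w in weights.items()}

rules = [
    ("building", "building_x", None, "A", 1.38),   # slide's building example
    ("count", "MASINT1", 5, "A", 3.0),             # slide's count example
]
dist = apply_rules(rules, {"building_x": True, "MASINT1": 6})
choice = max(dist, key=dist.get)                   # forced-choice response
```

With both example rules firing for A, the distribution concentrates on A, matching the forced-choice behavior described above.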
Rule chunks
Slide24
ACT-R Rule Based Model
Three versions of the rule-based model
Only apply building rules: similar to SA exemplar model
Only apply count rules: similar to PM exemplar model
Apply both building and count rules: similar to combined exemplar model
Slide25
ACT-R Rule Based Model Results
Building rules only: 0.657
Count rules only: 0.476
Both building and count rules: 0.755

Strategy         Rule-based   Exemplar   % Difference
SA / Buildings   0.657        0.655      0.30
PM / Counts      0.476        0.462      2.94
Combined         0.755        0.720      4.64
Slide26
Discussion
Agreement between rule-based and exemplar models, implemented in ACT-R, supports the equivalence of these approaches
They exploit the same available information
The performance equivalence between the two establishes that functional Bayesian inferencing can be accomplished in ACT-R either through:
Explicit rule application
Implicit, subsymbolic processes of the activation calculus, which support the exemplar model
The learning mechanisms of ACT-R's subsymbolic system are Bayesian in nature (Anderson, 1990; 1993)
Blending allows ACT-R to implement importance sampling (Shi et al., 2010)
Slide27
Acknowledgements
This work is supported by the Intelligence Advanced Research Projects Activity (IARPA) via Department of the Interior (DOI) contract number D10PC20021. The U.S. Government is authorized to reproduce and distribute reprints for Governmental purposes notwithstanding any copyright annotation thereon. The views and conclusions contained herein are those of the authors and should not be interpreted as necessarily representing the official policies or endorsements, either expressed or implied, of IARPA, DOI, or the U.S. Government.
Slide28
Blended Retrieval
Standard retrieval
One previously existing chunk is retrieved
Effectively, WTA closest exemplar
Blending
One new chunk, which is a blend of matching chunks, is retrieved (created)
All slots not specified in the retrieval cue are assigned blended values
The contribution each exemplar chunk makes to blended slot values is proportional to the activation of the chunk
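The activation-proportional contribution above can be sketched for a numeric slot. The temperature `t` and the example activations are assumptions; for symbolic slots (like `category`), ACT-R blending instead picks the value minimizing the expected mismatch, which this sketch does not cover.

```python
import math

# Illustrative sketch of blending: the blended value of a numeric slot is
# the average over matching chunks, weighted by a softmax of activations.
def blend_slot(chunks, activations, slot, t=0.5):
    weights = [math.exp(a / t) for a in activations]
    z = sum(weights)
    return sum((w / z) * c[slot] for w, c in zip(weights, chunks))

chunks = [{"MASINT1": 7}, {"MASINT1": 4}, {"MASINT1": 1}]
blended = blend_slot(chunks, [1.2, 0.8, -0.5], "MASINT1")
```

Because the highest-activation chunk dominates the weights, blending interpolates between WTA retrieval (low `t`) and an even average over matching exemplars (high `t`).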