Individual Differences: Marsha C. Lovett Carnegie Mellon University

Individual Differences
Marsha C. Lovett Carnegie Mellon University
Differences Beyond Mere Noise

Between-subject differences: systematic, reliable
Architectural (processing) differences
e.g., processing speed, working memory capacity, decay
Knowledge-based differences
Knowledge contents (e.g., facts, strategies, etc.) Same content, but differences in experience/practice
e.g., different trial sequences, different real-world experiences
Representational differences
Features represented, knowledge structures
Differences Beyond Mere Noise

Within-subject differences: temporal, subtle
Knowledge/experience grows (learning) Processing parameters change (e.g., fatigue) Representation changes (insight)
Why model Ind Diffs in ACT-R?

Additional constraints on model/theory
Models can fit the averaged data and yet fail to fit individuals (or subbroups) data (cf. Siegler, 1987)
0.7
J
0.7 Global Average Human Average

G J J G G J J G G J J G J G J G
Proportion StrategyA
Proportion StrategyA
0.6 0.5 0.4 0.3 0.2 0.1 0 1 2 3
0.6 0.5 0.4 0.3 0.2 0.1 0 1 2 3

J G J G J G
Global Average Unaware Aware Human

G J G J J G G J
Block
Block

Predicted variability often lower than observed
100 80
% Correct
60
40
20
0 3 4 5 6
Memory Size
(from Lovett, Reder, & Lebiere, 1997)

ACT-R is a nonlinear system variation in input can have surprising consequences
FIXED CASE
fixed input
W1
Linear System f(W) = aW
fixed output
a1 = a
fixed input
W1
Nonlinear System g(W) = eW
fixed output
e1 = e

VARYING CASE variation in input
W~
Normal(1,s2)
Linear System f(W) = aW
variability in output
Normal(a,as2)
mean same as fixed
variation in input
W~
Normal(1,s2)
Nonlinear System g(W) = eW
variability in output
Normal(e1+.5s ,?#@!)
change in mean!

Adding Ind Diffs changes variability and mean
Memory Size
This affects fitting of other global parameters
Especially within unified theories

Unified theories like ACT-R use single set of mechanisms to capture data across tasks Modeling Ind Diffs within ACT-R
Allows modeling of an individual across tasks Allows testing of individual difference theories across tasks
How to model Ind Diffs in ACT-R?

Architectural (processing) differences
Global parameters: G (motivation), W (WM), P/M
Knowledge differences content vs experience

Symbolic: Different sets of productions, chunks Subsymbolic: Production utilities, chunk activations
Representational differences (cf. Lovett & Schunn, 1999)

Chunk types, production conditions, proc vs. decl
ACT-R Ind Diff Models

Architectural Parameters Varied
Indl Subject
Param G egs d W W* W
Subgroup
Task (Modeler) Strat choice in BST (Schunn) KA-ATC (Taatgen) Scheduling (Taatgen, Jongman) WM tasks (Lovett, Reder,Lebiere) List memory(Reder, Schunn) Exptal Design (Schunn) Strat choice in memory (Reder, Schunn) Scaling (Petrov) Device operation (Byrne) Digit symbol (Byrne)

Symbolic Knowledge Varied
Knowledge productions productions productions

Indl Subject
Task (Modeler) User interface (Gray) 2-col subtraction (Young) Seriation length ( Young)
Subgroup
productions chunks
Exptal Design (Schunn)

Other combinations varied
? varied egs p-utils knwldg wm

Indl Subject
Task (Modeler) Analogy in prob solving (Salvucci) Unix tutor, piloting (Doane)
Subgroup
p-utils Early algebra (Koedinger) rt egs p-conds TON (Jones, Ritter)
Example: Archl Param Varied

WM capacitys effect on performance
Model individuals in a WM task called MODS Take MODS model and vary W parameter
(Lovett, Reder, & Lebiere, 1999)
Example contd
Can fit individual data at even finer level
Serial position effect (w/ W fit from set size effect)
Subject 221 W = 0.7 1.0 0.9 0.8 0.7 0.6 0.5 0.4 0.3 0.2 0.1 0.0
E B
Subject 201 W = 0.9 1.0 0.9 0.8 0.7 0.6 0.5 0.4 0.3 0.2 0.1 0.0
B E
Proportion Correct
B E
B E
B E
B E
B E
B E B E E B
B E
3 4 5 Serial Position
B E
Da ta Mod el
Subject 211 W = 1.0 1.0 0.9 0.8 0.7 0.6 0.5 0.4 0.3 0.2 0.1 0.0
Subject 203 W = 1.1 1.0 0.9 0.8 0.7 0.6 0.5 0.4 0.3 0.2 0.1 0.0
B E B E E E B E B B B E
Proportion Correct
E B E B B E E B E B
E B
Example contd
Can other params account for indl patterns?
W varying
Subject 221 W = 0.7 1.0 0.9 0.8 0.7 0.6 0.5 0.4 0.3 0.2 0.1 0.0
E B
d varying
Subject 221 d = 0.8
B E
Subject 201 W = 0.9 1.0 0.9 0.8 0.7 0.6 0.5 0.4 0.3 0.2 0.1 0.0
Subject 201 d = 0.6

E E B
B E
B E
B E
B E
B E
B E B E E B
B E
1.0 0.9 0.8 0.7 0.6 0.5 0.4 0.3 0.2 0.1 0.0
E E B E B B B
1.0 0.9 0.8 0.7 0.6 0.5 0.4 0.3 0.2 0.1 0.0
Proportion Correct
Proportion Correct
E E
B E
B E
E B B
B E
Da ta Mod el
B E
Da ta Mod el
Subject 211 W = 1.0 1.0 0.9 0.8 0.7 0.6 0.5 0.4 0.3 0.2 0.1 0.0
Subject 203 W = 1.1 1.0 0.9 0.8 0.7 0.6 0.5 0.4 0.3 0.2 0.1 0.0
Subject 226 d = 0.7 1.0 0.9 0.8 0.7 0.6 0.5 0.4 0.3 0.2 0.1 0.0
E E B E E B E B B B E B
Subject 203 d = 0.2 1.0 0.9 0.8 0.7 0.6 0.5 0.4 0.3 0.2 0.1 0.0
E E B E B E E B B B E B
Proportion Correct
E B E B B E E B E B
E B
Proportion Correct
(Daily, Lovett, & Reder, 2001)
Example contd
Cross-task predictions w/ no new parameters!
Subjects performed MODS & N-back tasks Ws estimated from MODS to predict N-back
(Lovett, Daily, & Reder, 2000)
Summary: Architectural Ind Diffs

Vary global parameter(s) to represent Ind Diffs Parameter values take on meaning
Predict other measures within task Predict performance on other tasks Relate to other empirical measures Change across time
Can compare different theories of Ind Diffs
Ex: Symbolic Knowledge Varied

Scientific Discovery in Microworld
(Schunn & Anderson, 1998)
Task: reveal truth behind data by conducting experiments and interpreting data tables Large performance differences in experiment designs and interpretation Subgroups modeled with different sets of procedural & declarative knowledge
Issues in Symbolic Knowledge Diffs

Varying procedural/declarative knowledge
Consider elements as qualitative parameters
Model is set of elements drawn/not from fixed set
Use other constraints to winnow possible model versions from the power set
Developmental progression Learnable via symbolic learning mechanisms
Ex: Subsymbolic Knowledge Diffs

Interface use (Gray et al., 2001)
Model runs get same experience as subjects Use same priors for chunk activations and production utilities, but different experience across model runs leads outputs to diverge
Early algebra (Koedinger & Maclaren, 2001)

Vary priors for production utilities to account for subjects different pre-experimental experience
Ex: Representational Ind Diffs

Tower of Nottingham (Jones, Ritter, Wood, 2000)
Goal: Capture developmental differences by implementing developmental theories in ACT-R Besides global parameters (rt, egn), vary # of conditions in productions left hand sides
Analogical Problem Solving (Salvucci, 1998)

Goal: Capture wide variation in eye-fixation strategies when subjects refer to source problem Besides varying parameters (egn, prod utilities), built decl-based and proc-based models
Ind Diffs Model Fitting

Many Ind Diff parameters
Fit IndDiff parameters for each subject NPbig
Few Ind Diff (esp 1) parameters

Fit global params while IndDiff param(s) are drawn from distribution (i.e., not fixed) Fit IndDiff param(s) for each subject G+NPsmall
Hierarchical modeling: best of both!

Fit global and IndDiff params together IndDiff param-values from distribution NsmallP
Concluding Bold Statement

All ACT-R models should be Ind Diffs Models
You have the data (simply omit averaging step) Its just a few more fitting cycles (see prev slide) Avoids perils of averaging over subjects Increases model variability (closer to observed)
Default parameter settings would become distributions, not fixed values
More on Why: Guess the Authors

Computational models need to be able to account for both the commonalty across
individuals processing as well as the variation between individuals performance. Cognitive models should be developed to predict the performance of individual participants across tasks and along multiple dimensions. Ideally, such a modeling effort would be able to predict individuals performance in a new task with no new free parameters, presumably after deriving an estimate of each individuals processing parameters from previous modeling of other tasks. (Lovett, Daily, & Reder, 2001)
A way to keep the multiple-constraint advantage offered by unified theories of cognition while making their development tractable is to do Individual Data Modelling (IDM). That is, to gather a large number or empircal/experimental observations on a single subject (or a few subjects analysed individually) using a variety of tasks that exercise multiple abilities (e.g., perception, memory, problem solving), and then to use these data to develop a detailed computational model of the subject that is able to learn while performing the tasks. (Gobet & Ritter, 2000)
Example contd
Sensitivity analysis: do other params manage?
W varying
Subject 221 W = 0.7 1.0 0.9 0.8 0.7 0.6 0.5 0.4 0.3 0.2 0.1 0.0
E B
rt varying
Subject 221 RT = 1.194
B E
Subject 201 W = 0.9 1.0 0.9 0.8 0.7 0.6 0.5 0.4 0.3 0.2 0.1 0.0
Subject 201 RT = 0.694

E B
B E
B E
B E
B E
B E
B E B E E B
B E
1.0 0.9 0.8 0.7 0.6 0.5 0.4 0.3 0.2 0.1 0.0
E B
E B
B E
B E
1.0 0.9 0.8 0.7 0.6 0.5 0.4 0.3 0.2 0.1 0.0
Proportion Correct
B E E E B E B B E E B B
Proportion Correct
B E
Da ta Mod el
B E
Da ta Mod el
Subject 211 W = 1.0 1.0 0.9 0.8 0.7 0.6 0.5 0.4 0.3 0.2 0.1 0.0
Subject 203 W = 1.1 1.0 0.9 0.8 0.7 0.6 0.5 0.4 0.3 0.2 0.1 0.0
Subject 204 RT = 0.194 1.0 0.9 0.8 0.7 0.6 0.5 0.4 0.3 0.2 0.1 0.0
Subject 203 RT = -0.056 1.0 0.9 0.8 0.7 0.6 0.5 0.4 0.3 0.2 0.1 0.0
E B B E E B E E B B E B
Proportion Correct
Proportion Correct
E B E B B E E B E B
E B E B E B E B E E B B
E B
(Daily, Lovett, & Reder, 2001)

Individual Differences: Marsha C. Lovett Carnegie Mellon University

Uploaded by

Document Information

Original Description:

Original Title

Copyright

Available Formats

Share this document

Share or Embed Document

Sharing Options

Did you find this document useful?

Is this content inappropriate?

Copyright:

Available Formats

Individual Differences: Marsha C. Lovett Carnegie Mellon University

Uploaded by

Copyright:

Available Formats

Individual Differences

Marsha C. Lovett Carnegie Mellon University

Differences Beyond Mere Noise

Differences Beyond Mere Noise

Why model Ind Diffs in ACT-R?

0.7 Global Average Human Average

0.6 0.5 0.4 0.3 0.2 0.1 0 1 2 3

0.6 0.5 0.4 0.3 0.2 0.1 0 1 2 3

Global Average Unaware Aware Human

Why model Ind Diffs in ACT-R?

(from Lovett, Reder, & Lebiere, 1997)

Why model Ind Diffs in ACT-R?

Linear System f(W) = aW

Nonlinear System g(W) = eW

Why model Ind Diffs in ACT-R?

Linear System f(W) = aW

Nonlinear System g(W) = eW

Why model Ind Diffs in ACT-R?

This affects fitting of other global parameters

Especially within unified theories

How to model Ind Diffs in ACT-R?

Knowledge differences content vs experience

Representational differences (cf. Lovett & Schunn, 1999)

ACT-R Ind Diff Models

ACT-R Ind Diff Models

Knowledge productions productions productions

Exptal Design (Schunn)

ACT-R Ind Diff Models

? varied egs p-utils knwldg wm

p-utils Early algebra (Koedinger) rt egs p-conds TON (Jones, Ritter)

Example: Archl Param Varied

(Lovett, Reder, & Lebiere, 1999)

Subject 201 d = 0.6

(Daily, Lovett, & Reder, 2001)

(Lovett, Daily, & Reder, 2000)

Summary: Architectural Ind Diffs

Can compare different theories of Ind Diffs

Ex: Symbolic Knowledge Varied

Issues in Symbolic Knowledge Diffs

Ex: Subsymbolic Knowledge Diffs

Early algebra (Koedinger & Maclaren, 2001)

Ex: Representational Ind Diffs

Analogical Problem Solving (Salvucci, 1998)

Ind Diffs Model Fitting

Few Ind Diff (esp 1) parameters

Hierarchical modeling: best of both!

Concluding Bold Statement

Default parameter settings would become distributions, not fixed values

More on Why: Guess the Authors

Subject 201 RT = 0.694

(Daily, Lovett, & Reder, 2001)

You might also like