COGNITIVE LOAD FACTORS IN
INSTRUCTIONAL DESIGN FOR
ADVANCED LEARNERS
No part of this digital document may be reproduced, stored in a retrieval system or transmitted in any form or
by any means. The publisher has taken reasonable care in the preparation of this digital document, but makes no
expressed or implied warranty of any kind and assumes no responsibility for any errors or omissions. No
liability is assumed for incidental or consequential damages in connection with or arising out of information
contained herein. This digital document is sold with the clear understanding that the publisher is not engaged in
rendering legal, medical or any other professional services.
SLAVA KALYUGA
Nova Science Publishers, Inc.

New York
Copyright 2009 by Nova Science Publishers, Inc.
All rights reserved. No part of this book may be reproduced, stored in a retrieval system
or transmitted in any form or by any means: electronic, electrostatic, magnetic, tape,
mechanical photocopying, recording or otherwise without the written permission of the
Publisher.
For permission to use material from this book please contact us:
Telephone 631-231-7269; Fax 631-231-8175
Web Site: http://www.novapublishers.com
NOTICE TO THE READER

The Publisher has taken reasonable care in the preparation of this book, but makes no
expressed or implied warranty of any kind and assumes no responsibility for any errors or
omissions. No liability is assumed for incidental or consequential damages in connection
with or arising out of information contained in this book. The Publisher shall not be liable
for any special, consequential, or exemplary damages resulting, in whole or in part, from
the reader's use of, or reliance upon, this material.
Independent verification should be sought for any data, advice or recommendations
contained in this book. In addition, no responsibility is assumed by the publisher for any
injury and/or damage to persons or property arising from any methods, products,
instructions, ideas or otherwise contained in this publication.
This publication is designed to provide accurate and authoritative information with regard
to the subject matter covered herein. It is sold with the clear understanding that the
Publisher is not engaged in rendering legal or any other professional services. If legal or
any other expert assistance is required, the services of a competent person should be
sought. FROM A DECLARATION OF PARTICIPANTS JOINTLY ADOPTED BY A
COMMITTEE OF THE AMERICAN BAR ASSOCIATION AND A COMMITTEE OF
PUBLISHERS.
LIBRARY OF CONGRESS CATALOGING-IN-PUBLICATION DATA
ISBN: 978-1-60741-685-2 (E-Book)
Available upon request
Published by Nova Science Publishers, Inc. New York

CONTENTS
Preface
Chapter 1 Basic Architecture of Human Cognition
Chapter 2 Cognitive Studies of Expert-Novice Differences and Design of Instruction
Chapter 3 Cognitive Load Perspective in Instructional Design
Chapter 4 Cognitive Load Principles in Instructional Design for Advanced Learners
Summary Toward a Cognitively Efficient Instructional Technology for Advanced Learners
Index
PREFACE
The empirical evidence described in this book indicates that instructional
designs and procedures that are cognitively optimal for less knowledgeable
learners may not be optimal for more advanced learners. Instructional designers or
instructors need to evaluate learners' levels of expertise accurately to design or
select optimal instructional procedures and formats. Frequently, learners need to
be assessed in real time during an instructional session in order to adjust the
design of further instruction appropriately. Traditional testing procedures may not
be suitable for this purpose. The following chapters describe a cognitive load
approach to the development of rapid schema-based tests of learner expertise. The
proposed methods of cognitive diagnosis will be based on contemporary
knowledge of human cognitive architecture and will be further used as a means of
optimizing cognitive load in learner-tailored computer-based learning
environments.
Chapter 1
BASIC ARCHITECTURE OF HUMAN
COGNITION
A cognitive approach to human learning emphasizes the internal cognitive
mechanisms of learning. Such mechanisms are usually described as
transformations performed on various mental representations of situations and
tasks. An important assumption of the approach is that a single general cognitive
system underlies human cognition. Different theoretical approaches specify this
general cognitive system as corresponding cognitive architectures. The
understanding of human cognition within a cognitive architecture requires
knowledge of corresponding models of memory organization, forms of knowledge
representation, mechanisms of problem solving, and the nature of human
expertise.
MEMORY ORGANIZATION
The major characteristics of human memory are its strength or durability,
capacity (number of items of information stored in memory), and speed of access.
According to these characteristics, memory is divided into long-term memory and
short-term memory. Long-term memory (LTM) is characterized by high strength
and includes well-learned knowledge, for example, the name of the first US
President, 5 x 5 = 25, or the spelling of the word "potatoes". It is presumed to have
unlimited capacity, although the access to the stored information could be slow.
Both the strength of memory and the speed of access increase with practice. More
fully elaborated and more deeply processed material results in better long-term
memory.
Short-term memory (STM), on the other hand, includes information that has
just been encoded from sensory registers or retrieved from long-term memory,
for example, what you were thinking about just before reading this, or what you are
thinking about when dialing the phone number 8344 2124. The durability of
STM is a matter of seconds (Peterson & Peterson, 1959), and information in STM
could be accessed very rapidly. The number of items of information that can be
maintained in an active state simultaneously in STM is about seven units for most
people (Miller, 1956). For example, it is very difficult for us to recall more than
approximately seven serially presented random numbers (e.g., an unfamiliar
phone number) a few seconds after we hear or see them, unless the numbers have
been intentionally rehearsed. When asked to copy strings of digits from one page
to another, we usually do this by grouping the digits into easily manageable units of
three or four at a time.
The most generally specified basic human cognitive architecture includes
these two substructures (STM and LTM). Examples are the standard model
(Newell & Simon, 1972) and modal model (Atkinson & Shiffrin, 1968; Waugh &
Norman, 1965). In more specific models, these substructures might be regarded
either as a single memory store with different modes of activation for long-term
and short-term components, or as separate memory stores. These distinctions are
not essential when considering the basic level of cognitive architecture. However,
in order to explain human cognition, this general model needs to be supplemented
by some attention control mechanism (central processor or central executive)
which determines what information from sensory stores or LTM is brought into
STM. The information that is actually attended to is limited to a small number of
chunks in STM (Simon, 1979; Ericsson & Simon, 1993a, 1993b).
Various cognitive architectures and elaborations of the general model extend
the described memory structure. For example, the concept of working memory
(WM) was introduced to account for processing of units of information that are
interconnected, rather than random, and should be processed concurrently because
of the nature of things they reflect or due to established associations in long-term
memory. Working memory is considered as "a system for the temporary holding
and manipulation of information during the performance of a range of cognitive
tasks" (Baddeley, 1986, p. 34), and has been described as a desktop of the brain that
keeps track of what we are doing or where we are moment to moment, and that holds
information long enough to make a decision, to dial a telephone number, or to repeat
a strange foreign word that we have just heard (Logie, 1999, p. 174). Some simple
examples of working memory operation could be provided by the following tasks:
close your eyes and pick up a pen in front of you; count the number of windows in
your house or apartment; mentally rearrange the furniture in your room, or
mentally complete a mathematical operation (for more examples, see Logie,
1999).
After incoming stimuli from an external source are registered in sensory
memory, perceived or matched to recognizable patterns by using prior knowledge
(if any) in LTM and context, and are paid attention to, they are transferred into
WM. If a unit of information is not recognized due to the lack of appropriate LTM
patterns, it still could be attended to and processed in WM, with appropriate
cognitive resources allocated for the task. Attended units of information in WM
are assigned meaning and used for constructing integrated mental representations
of a situation or task (Figure 1). This information, however, may fade very
quickly if attention is diverted or if the capacity of WM is overloaded.
Baddeley and Hitch (1974) first proposed that WM performed both
processing and storage functions. They suggested three structural components of
working memory: a central executive and two separate auditory and visual stores
for handling verbal information and visual images. These two stores serve as
maintenance systems controlled by the central executive and are called
respectively an articulatory or phonological loop (inner voice) and a visuospatial
sketchpad (inner eye). The limited capacity of the central executive is used for
processing incoming information, with the remainder used for the storage of
intermediate and final products of that processing. Storage and processing
capabilities of WM trade off against each other. When memory load increases
above some threshold, our performance could be inhibited. To get a feeling for
WM limitations, try to mentally add two large numbers (for example, 83 468 437
and 93 849 040). As a concurrent task, you may also try to attend to a comedy show
on your TV at the same time. This would be very difficult because each of
these activities alone may take all of your WM resources.
There are three major functional aspects of working memory operation:
temporary storage, manipulation of information, and executive control.
Temporary storage of information was the focus of classic models of STM and
was studied using standard word or digit STM span tasks. These were simple
tasks involving recalling a list of digits or unrelated words and not requiring much
prior knowledge. Active manipulation of information has been the focus of
models of WM and has been studied using WM span tests that require concurrent
processing of several tasks. These are relatively more complex tasks involving
meaningful cognitive operations such as reading sentences or performing
numerical transformations, and then recalling the final words of those sentences or
results of the math operations. Performance of complex cognitive tasks requires
simultaneous use and integration of various sources of information, and coordination
of separate processes and representations. It is the executive functioning of WM and
the interactions between WM and LTM knowledge structures that have become the
focus of research in recent years (see Miyake & Shah, 1999, for a recent overview
of WM models and the state of the field).
A number of hypotheses have been proposed to explain individual differences
in WM capacity and its relation to performance. These theories considered
differences in total WM capacity, differences in processing efficiency of WM, or
both. According to the total capacity approach (Baddeley & Hitch, 1974; Cantor
& Engle, 1993; Case, 1985; Engle, Cantor, & Carullo, 1992), all cognitive
processes require resources from a fixed pool. Any resources not allocated to the
operations can be used for short-term storage. The storage and processing
capabilities of working memory trade off against each other. When memory load
increases above some threshold, a person's performance may decline. A change in
total capacity caused, for example, by fatigue or age should affect the
performance in a wide range of tasks.
Figure 1. Basic architecture of human cognition. [Diagram boxes: Sensory Memory (incoming information); Working Memory (constructing mental representations of a situation or task); Long-Term Memory (knowledge base).]

The task-specific hypothesis (Daneman & Carpenter, 1980) assumed that
WM capacity is specific to the particular task being performed. Efficient
processing skills leave more WM capacity for storage of processing products. A
change in processing efficiency should be specific to a particular task and result
from intensive practice or training (Just & Carpenter, 1992). Performance would
be influenced only if available resources are in short supply when a person
operates at the limit of WM capacity. The processing efficiency approach assumes
that a single central system is responsible for the processing and temporary
storage of information. Its limited capacity must be shared between the processing
and the storage demands. Individuals with inefficient processes have a
functionally smaller storage capacity because they must allocate more resources to
the processes (Daneman & Carpenter, 1983; Daneman & Tardif, 1987).
Working memory capacity was measured in terms of operational capacity
dependent on the type of specific background task used in a particular domain
(Carpenter & Just, 1989). For example, the reading span test was used to measure
WM capacity as the largest size of the set of simple sentences from which a
subject can reliably recall the final words of all the sentences (Daneman &
Carpenter, 1983). Daneman and Tardif (1987) established that the reading span
was a measure specific to the language skills, not a measure of general working
memory capacity, and it correlated significantly with reading comprehension
ability.
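The span measure described above can be illustrated with a short scoring sketch. The function name `reading_span`, the data layout, and the `criterion` parameter are assumptions introduced here for illustration only; the cited studies define "reliable" recall through their own procedures, which are not reproduced here.

```python
def reading_span(trials, criterion=1.0):
    """Return the largest sentence-set size that was recalled reliably.

    trials maps a set size (number of sentences) to a list of booleans,
    one per trial: True if the final words of all sentences were recalled.
    criterion is an assumed proportion of successful trials that counts
    as "reliable" recall.
    """
    span = 0
    for size in sorted(trials):
        outcomes = trials[size]
        if outcomes and sum(outcomes) / len(outcomes) >= criterion:
            span = size
        else:
            break  # stop at the first set size that is not recalled reliably
    return span


# Hypothetical results for one participant.
results = {
    2: [True, True, True],
    3: [True, True, False],
    4: [False, False, True],
}
print(reading_span(results, criterion=2 / 3))  # 3
print(reading_span(results))                   # 2
```

The same data give different spans under different criteria, which is one reason such a score is better read as an operational, task-specific measure than as a fixed capacity.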
Although there obviously are systematic differences among individuals in
their working memory capacity for specific tasks, and these differences influence
performance when the person operates at the limit of his or her working memory
capacity, no single approach or hypothesis concerning the interpretation of
individual differences in WM capacity has received convincing empirical support.
Such differences could be strongly influenced by knowledge structures available
in long-term memory. Any WM span implicitly reflects an individual's knowledge
and experience in a domain, and this knowledge inevitably influences his or her
performance in both processing and storage parts of the task (e.g., Hulme,
Maughan, & Brown, 1991; Hulme, Roodenrys, Brown, & Mercer, 1995). WM
span measures thus could be used as predictors of a person's performance in the
corresponding domain rather than measures of his or her true general WM
capacity. It is practically impossible to eliminate the influence of the person's
knowledge base when meaningful tasks are involved in WM span tests. From this
point of view, approaches that focus on connections between the content and
operation of working memory and long-term memory could be more relevant and
productive.
Simple chunking mechanisms provide an example of using long-term
memory structures in transforming the content of working memory. The chunk is
a familiar unit of information based upon previous learning. For example, it could
be difficult to remember and recall a string of random letters like
B,B,C,C,I,A,A,B,C,F,B,I, unless we chunk them together into BBC, CIA, ABC,
FBI. In this case, we use our prior knowledge stored in LTM to reduce the number
of elements to a manageable four chunks. The same method could be used with
the following string of numbers: 1,9,1,4,1,9,4,5,1,9,9,6,2,0,0,1. Another common
example of chunking in language comprehension is the way we chunk letters into
familiar words, and words into familiar phrases. An STM capacity estimate of
around seven units (Miller, 1956) actually indicates the number of chunks rather
than total amount of information stored in STM. This mechanism explains how
we manage to get around the information-processing bottleneck created by our
limited working memory capacity, and to learn the enormous amount of
knowledge in our LTM.
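The chunking described above can be sketched in code. This is a minimal illustration rather than a model of memory: the function `chunk` and the set `ltm` are hypothetical names, the greedy longest-match strategy is an assumption, and reading the digit string as the years 1914, 1945, 1996, and 2001 is one possible interpretation.

```python
def chunk(items, known_chunks):
    """Greedily group a sequence into the longest chunks found in known_chunks.

    Items that match no known chunk remain single-element units.
    """
    longest = max((len(c) for c in known_chunks), default=1)
    units = []
    i = 0
    while i < len(items):
        # Try the longest candidate first, down to two items.
        for size in range(min(longest, len(items) - i), 1, -1):
            candidate = "".join(items[i:i + size])
            if candidate in known_chunks:
                units.append(candidate)
                i += size
                break
        else:
            # No familiar chunk starts here: keep the item as its own unit.
            units.append(items[i])
            i += 1
    return units


# Prior knowledge assumed to be available in LTM.
ltm = {"BBC", "CIA", "ABC", "FBI", "1914", "1945", "1996", "2001"}

letters = list("BBCCIAABCFBI")
digits = list("1914194519962001")

print(chunk(letters, ltm))  # ['BBC', 'CIA', 'ABC', 'FBI']
print(chunk(digits, ltm))   # ['1914', '1945', '1996', '2001']
print(len(letters), "->", len(chunk(letters, ltm)))  # 12 -> 4
```

Without the relevant entries in `ltm`, the same strings stay at 12 and 16 separate units, which is the point of the example: the number of units depends on prior knowledge, not on the amount of raw information.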
People can effectively increase their memory capacity to an
amazing degree through extensive training in chunking and re-chunking
information into meaningful units using their prior knowledge stored in LTM. The
skilled memory theory (Chase & Ericsson, 1982) claims that people develop
mechanisms that enable them to use a large and familiar knowledge base to
rapidly encode, store, and retrieve information within the area of their expertise
and thus circumvent the working memory capacity limitations. As a result, experts
possess an enhanced functional working memory capacity in domains of their
expertise (Ericsson & Staszewski, 1989).
Available domain-specific knowledge enables experts to quickly encode and
retain large amounts of information in LTM. Such LTM storage and retrieval
operations speed up with practice and are comparable with STM encoding and
retrieval, resulting in experts' superior task performance and superior recall for
familiar materials (the skilled memory effect; Ericsson & Staszewski, 1989). For
example, expert mnemonists can increase their digit spans far beyond the limit of
Miller's seven plus-or-minus two digits. They use familiar chunks of knowledge
in LTM to encode new information in an easily accessible form. Ericsson and
Staszewski (1989) described a person who expanded his digit span to 84 digits by
grouping them into short sequences and encoding them as athletic running times,
dates, and ages that were familiar to him. He nevertheless operated under the
constraints of limited-capacity STM: the size of digit groups never exceeded five
digits, and these groups were never clustered into supergroups of more than four
groups.
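The arithmetic of such a grouping can be sketched as follows. The helper `group`, the placeholder digit list, and the group size of four are assumptions chosen to stay within the reported limits (at most five digits per group, at most four groups per supergroup); the sketch does not reproduce the actual encoding used by the person described.

```python
def group(units, max_size):
    """Split a list into consecutive groups of at most max_size units."""
    return [units[i:i + max_size] for i in range(0, len(units), max_size)]


# Placeholder 84-digit sequence (the actual digits are not given in the text).
digits = [str(n % 10) for n in range(84)]

groups = group(digits, 4)        # assumed group size, within the limit of five
supergroups = group(groups, 4)   # at most four groups per supergroup

print(len(digits), len(groups), len(supergroups))  # 84 21 6
```

Under these assumptions, 84 separate digits become 21 groups organized into 6 supergroups, and no level of the structure holds more than a handful of units at once.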
In the WM model of Carpenter and Just (1989), the operation of WM during
reading comprehension is also based on relations between WM and LTM. In this
model, WM consists of currently active pointers to LTM structures and partial or
final products of processing. A reader stores the theme of the text, the general
representation of the situation, the major propositions from preceding sentences,
as well as a representation of the sentence he or she is currently reading (Just &
Carpenter, 1992). When dealing with an unstructured series of words, we can
usually recall only six or seven unrelated words in order (according to our STM
span). Skillful readers, on the other hand, can recall and understand long
sentences (about 77% of words in up to 22-word sentences) because they use
internal structures in LTM to circumvent WM limitations. Thus, sentence
comprehension can be considered as recoding (chunking) incoming symbols into
some structure (Carpenter & Just, 1989).
Ericsson and Kintsch (1995) further developed these ideas into the theory of
long-term working memory (LT-WM). In this theory, LTM knowledge structures
associated with components of working memory form an LT-WM structure that is
capable of holding a virtually unlimited amount of information. Some additional
mechanisms were introduced for overcoming the effects of interference in experts'
use of LTM knowledge for storage and retrieval of newly encoded information.
The proposed mechanism of LT-WM operation involves cue-based retrieval of
information from LTM. The encoding method can be based on a
specifically constructed retrieval structure, an elaborated existing memory
structure, or a combination of the two. Skilled performance depends on domain-
specific knowledge structures relevant to particular tasks, and, consequently, there
are individual differences in the operation of LT-WM for a given task (Ericsson &
Kintsch, 1995).
KNOWLEDGE REPRESENTATIONS
Our knowledge base in LTM profoundly influences cognitive processes in
most situations. Therefore, forms of knowledge representations are critical for
understanding human cognition. Several major ways of representing the meaning
of information in memory have been suggested: propositional representations
(semantic networks), procedural representations (production systems), and
schemas. Analogical representations or mental models (Rumelhart & Norman,
1983) can be generally considered as schemas. The concept of a proposition
denotes the primitive unit of meaning, or the smallest unit of knowledge about
which it is possible to make a judgment of true or false. Networks of such
interconnected units can be used to represent the meaning of sentences and
pictures.
Newell and Simon (1972) suggested that knowledge could be represented by
a set of conditional rules or productions of the form condition → action. The production rules
are stored in long-term memory and are retrieved and used in working memory.
The current contents of working memory are matched against the conditions of all
the production rules in long-term memory. Whenever the conditions of a rule
occur in working memory, the rule is triggered and its action is carried out. The action
of the rule can change the contents of working memory and determine which rule
is triggered next. Thus, the principles determining how one rule is followed by
another are built into the rules themselves.
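The match-and-fire cycle described above can be sketched as a very small production system. Everything named here (the function `run`, the rule format, the example rules and facts) is hypothetical, and firing the first matching rule is a simplifying assumption rather than a claim about any particular theory.

```python
def run(rules, working_memory, max_cycles=10):
    """Repeatedly fire the first rule whose conditions all occur in working memory.

    Each rule is a tuple (name, conditions, additions, deletions), where the
    last three are sets of facts. The loop stops when no rule matches or when
    max_cycles is reached.
    """
    wm = set(working_memory)
    fired = []
    for _ in range(max_cycles):
        for name, conditions, additions, deletions in rules:
            if conditions <= wm:                    # all conditions present in WM
                wm = (wm - deletions) | additions   # the action changes WM
                fired.append(name)
                break
        else:
            break  # no rule matched the current contents of WM
    return wm, fired


rules = [
    ("isolate-x",
     {"goal: solve ax=b", "a is nonzero"},
     {"divide both sides by a"},
     {"goal: solve ax=b"}),
    ("report",
     {"divide both sides by a"},
     {"x = b/a"},
     {"divide both sides by a"}),
]

wm, fired = run(rules, {"goal: solve ax=b", "a is nonzero"})
print(fired)       # ['isolate-x', 'report']
print(sorted(wm))  # ['a is nonzero', 'x = b/a']
```

No separate controller decides the order of the two rules: the first rule's action places in working memory the fact that the second rule's condition requires, so the sequencing is carried by the rules themselves.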
One of the most advanced theories based on the idea of production rules, the
ACT* theory (Adaptive Control of Thought; Anderson, 1983), or its updated
version ACT-R (R for "rational"; Anderson, 1993), suggests a separate type of
long-term memory for production rules (for skills) in addition to the declarative
memory (propositions, images, and other representations for facts and
experiences). The items in these memories can vary in their degree of activity. If
the contents of working memory match more than one rule in procedural memory,
then whichever is the most active is triggered.
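The selection among several matching rules can be sketched as below. This illustrates only the verbal description given here, not the actual equations of ACT* or ACT-R; the function `select_rule`, the rule dictionaries, and the activation values are hypothetical.

```python
def select_rule(rules, working_memory):
    """Among the rules whose conditions match working memory, return the most active.

    Returns None when no rule matches.
    """
    matching = [r for r in rules if r["conditions"] <= working_memory]
    if not matching:
        return None
    return max(matching, key=lambda r: r["activation"])


# Hypothetical procedural memory with made-up activation values.
rules = [
    {"name": "count-on-fingers",
     "conditions": {"goal: add 2 + 3"},
     "activation": 0.2},
    {"name": "retrieve-fact",
     "conditions": {"goal: add 2 + 3", "fact 2 + 3 = 5 known"},
     "activation": 0.9},
    {"name": "carry",
     "conditions": {"sum exceeds 9"},
     "activation": 1.5},
]

wm = {"goal: add 2 + 3", "fact 2 + 3 = 5 known"}
print(select_rule(rules, wm)["name"])  # retrieve-fact
```

The rule named `carry` has the highest activation but is never selected here, because activation only decides among rules whose conditions already match the contents of working memory.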
The concept of a schema, originally discussed by Bartlett (1932), came into
cognitive psychology from research in artificial intelligence (Minsky, 1975;
Bobrow & Winograd, 1977). Schemas generally represent an object as a set of
attributes (slots). Schemas abstract generalizations about objects from specific
instances and encode general categories and typical features. They may include not
only propositions, but also perceptual features (for example, spatial images) and
stereotypic sequences of events. Schemas may have slots with fixed or variable
values; slots with variable values usually have some default or most probable
values.
The most important features of schemas are stable patterns of relationships
between variables (slots). Each schema contains information about some class of
structures. When particular values are assigned to slots of a schema, a schema-
based knowledge structure could be obtained in the form of concepts,
propositions, etc. The obtained knowledge structures could be more general or
more specific depending on those values. Multiple schemas can be linked together
and organized into sophisticated hierarchical structures where one schema can
form part of a more complex schema.
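The slot structure described above can be sketched as a small data structure. The class `Schema`, its field names, and the bird example are hypothetical and serve only to show fixed slots, variable slots with default values, and the assignment of particular values to obtain a more specific knowledge structure.

```python
from dataclasses import dataclass, field


@dataclass
class Schema:
    name: str
    fixed: dict = field(default_factory=dict)     # slots with fixed values
    defaults: dict = field(default_factory=dict)  # variable slots with default values

    def instantiate(self, **values):
        """Assign particular values to variable slots; unfilled slots keep defaults."""
        unknown = set(values) - set(self.defaults)
        if unknown:
            raise KeyError(f"not variable slots of {self.name}: {sorted(unknown)}")
        return {**self.fixed, **self.defaults, **values}


bird = Schema(
    "bird",
    fixed={"has_feathers": True},
    defaults={"can_fly": True, "size": "small"},
)

typical = bird.instantiate()               # all defaults: the most probable values
penguin = bird.instantiate(can_fly=False)  # one slot overridden by a specific value

print(typical)  # {'has_feathers': True, 'can_fly': True, 'size': 'small'}
print(penguin)  # {'has_feathers': True, 'can_fly': False, 'size': 'small'}
```

Leaving every variable slot at its default yields the most general instance, while each assigned value makes the resulting structure more specific.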
Schemas may represent knowledge of all kinds and levels: from individual
letters (allowing us to recognize different variations of handwritten letters) to
complex electronic or organizational systems, behavioral patterns, visual and
auditory perceptual images. For example, our schema for a human face includes
slots for eyes, a nose, a mouth, ears, etc. These components are arranged in a
certain configuration that is not a rigid one. However, some general requirements
should be met: the nose and eyes should be located above the mouth; eyes should
be located above the nose on different sides of it, etc. This general schema allows
us to recognize instances of human faces in limitless situations, including some
peculiar forms of visual arts.
A student's schema for solving linear algebraic equations of the type ax = b
may include three slots: 1) a number b on the right-hand side of the equation; 2) a
number a on the left-hand side of the equation; and 3) the division operation:
divide the content of the first slot by the content of the second slot. For less
experienced students, the schema may include the operation of dividing both sides
of the equation by the same number a. In this case, the schema would contain
slots for both parts of the equation, the dividing number a, and the division
operation.
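The two versions of this schema can be contrasted in a short sketch. The function names `solve_direct` and `solve_both_sides` are hypothetical, and exact fractions are used only to keep the arithmetic transparent.

```python
from fractions import Fraction


def solve_direct(a, b):
    """Three-slot schema: slot 1 holds b, slot 2 holds a, operation divides slot 1 by slot 2."""
    slot_1, slot_2 = Fraction(b), Fraction(a)
    return slot_1 / slot_2


def solve_both_sides(a, b):
    """Schema of a less experienced student: divide both sides of ax = b by a."""
    left_coefficient = Fraction(a) / Fraction(a)  # a/a = 1, leaving x alone on the left
    right_side = Fraction(b) / Fraction(a)
    return left_coefficient, right_side


print(solve_direct(4, 10))                 # 5/2
coefficient, x = solve_both_sides(4, 10)
print(coefficient, x)                      # 1 5/2
```

Both versions give the same solution, but the second keeps track of more elements (both sides of the equation as well as the divisor), which corresponds to the larger number of slots described above. Either function raises `ZeroDivisionError` when a is zero, a case the schema in the text does not cover.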
For an example of higher-level schematic knowledge representations,
consider the technical domain that includes knowledge about various technical
objects (e.g., tools, devices, machines, technological procedures). This variety of
knowledge in any technical area could be represented with different levels of
specification: from descriptions of general features to specific details. A
schematic framework for representing knowledge about a technical object may
include three main interconnected components that could be referred to as
functional, operational, and structural descriptions. Any technical object could be
characterized by some functions or purpose it was designed for (what is this
object for?), processes utilized in the object's operation (how does it operate?),
and the object's internal structure including links between its components (what
does it consist of?). To explain an object's operation means to explain why a
given set of linked parts performs specific functions utilizing certain processes
during operation. A learner should establish connections between functional,
operational, and structural components of the object's description in order to
understand how it works (Kalyuga, 1984; 1990).
Gruber and Russell (1996) suggested similar classes of an artifact description:
structure (the physical and/or logical composition of an artifact in terms of the
composition of parts and connection topologies), behavior (something an artifact
might do in terms of observable states or changes), function (effect or goal to
achieve by artifact behavior), requirements (prescriptions concerning the
structure, behavior, and/or function that the artifact must satisfy), and objectives
(specifications of desired properties of the artifact other than pure functions, such
as cost and reliability). Requirements and objectives could be generally included
in the functional description (as functional requirements and general functions).
Figure 2. General schematic structure of technical knowledge. [Diagram levels: functions of the object; alternative combinations of processes realizing a set of functions; alternative technical solutions realizing a combination of processes.]
Each of the above aspects of technical knowledge may have different levels of
generalization. It is possible to describe an object in very general terms (a global
level or general overview) or in more detail with different levels of specification.
When combined together, all aspects, components, and levels of the description of
a technical object create a sophisticated multilevel hierarchical schematic
structure of technical knowledge. In an abstract form, this structure could be
represented by the graph in Figure 2. Three levels of description are shown for
functions, processes, and structural components of a technical object. Simple and
superficial knowledge about the object may include only isolated components
corresponding to the upper rows in the depicted clusters of knowledge elements.
Further deepening of knowledge requires establishing relations between these
components and adding elaborated knowledge on more specific levels of
description.
There are many definitions of schemas depending on the theoretical
perspective of the researcher. It is practically impossible to precisely describe the
schematic knowledge structures held by an individual. As Norman (1983) noted,
"we must discard our hopes of finding neat, elegant mental models, but instead
learn to understand the messy, sloppy, incomplete, and indistinct structures that
people actually have" (p. 14). In general, a schema can be described functionally
as a cognitive construct (an organized knowledge structure) that allows people to
classify information according to the manner in which it will be used (e.g., Chi,
Glaser, & Rees, 1982; Sweller, 1993). Such organized knowledge structures
represent a major mechanism for extracting meaning from new information,
acquiring and storing knowledge, circumventing the limitations of working
memory, increasing the strength of memory, and recalling information. They
impose an organization on the information, guide retrieval, and provide
connections to prior knowledge.
In schema theory, the process of learning can be considered as encoding new
information in terms of existing schemas, as schema modification, or as the
creation of new schemas. The creation or modification of a schema is based on
conscious cognitive processing of information in working memory. In a more
general context, schema acquisition could be regarded as an example of a non-
linear process where the schema emerges from lower-level components during
learning or practice. As a cognitive unit, the schema represents a higher level of
organization than just a simple collection of lower-level components.
The need for the emergence of higher levels of schema hierarchy could be
associated with general limitations of human information processing. In a wider
context, any qualitatively new level of a system emerges in a non-linear way as a
means to overcome the combinatorial barrier caused by the immense number of
possible combinations of the variety of elements of the previous, lower level.
Examples of such processes are the emergence of the molecular level from atoms,
biochemical structures from molecules, or nerve impulses from biochemical
structures (Scott, 1995; Turchin, 1977). Structured neuronal groups might
represent the qualitatively new biological level of conscious cognitive functioning
(Edelman, 1992). On the psychological level of description, our abstract high-
level schematic knowledge representations in long-term memory (and
corresponding intellectual abilities associated with operating such structures)
might have emerged as a means of overcoming the combinatorial barrier under
conditions of limited processing capacity.
Because a schema is treated as a single unit in working memory, such high-
level structures require less working memory capacity for processing than the
multiple, lower-level elements they contain, making the working memory load
more manageable. Our abilities to construct and use higher-order hierarchical
cognitive configurations of knowledge structures in long-term memory might
have emerged during evolution as a way of providing structure to the elements
being dealt with by working memory (Sweller, 2003, 2004). Thus, by allowing
multiple elements to be treated as a single element in working memory, long-term
memory schematic structures may have, as one of their functions, the reduction of
working memory load.
Specific schema selection in a particular situation is usually automated and
quick. Our first impression about an unfamiliar person (which is said to be the
most important), our comprehension of movies, fiction, music, humor, or art is
guided by our acquired domain-specific schematic knowledge structures. Schemas
guide our recall of different past events. Our memory usually retains the gist of a
situation or event according to our schematic knowledge of it. The schema defines
what is encoded and stored. When recalling the event, we create schema
instantiations filling in missing information and inferring unavailable components
using our schemas for the event. Sometimes such recall may produce various
distortions to fit our schemas or expectations (e.g., recall scenes of court
procedures from movies and fiction stories with witnesses remembering details
they have not actually witnessed).
The structure of the schematic knowledge can be empirically assessed, for
example, by asking students to group problems into clusters on the basis of
similarity; to categorize problems after hearing only part of the text; to provide
answers to problems when content words have been replaced by nonsense words;
to solve problems when material in the text is ambiguous; to contrast problems
using a nominated principle; to recall problems that were presented earlier; to
identify which information within problems is necessary and sufficient for
solution; and to classify problems in terms of whether the text of each problem
provides sufficient, missing or irrelevant information for solution (text editing)
(Low & Over, 1992).
Previously acquired schematic knowledge structures are the most important
factor that influences learning new material. A student's understanding of an
instruction means instantiation of appropriate familiar schemas that would allow
her or him to assimilate new information with prior knowledge. A failure to
comprehend instruction might be caused by the lack of any appropriate schemas
in LTM, by the lack of sufficient cues in the situation to elicit a schema, or by the
learner applying a different schema than that intended by the instruction.
Students' preexisting schemas often resist change: everything that cannot be
understood within the available schematic frameworks is ignored or learned by
rote. It is important to build new knowledge on top of students' existing schemas
or help them to acquire an appropriate schematic framework by relating it to
something already known. Useful instructional techniques could be analogies or
diagrams, to establish links with existing knowledge, and advance-organizers to
elicit or activate existing relevant schemas or provide new ones (concept maps,
headings, summaries at the start of chapters, etc.).
Similar to production systems, a schema-based approach to representing
knowledge provides a general framework that can be instantiated by specific
theories. In all schema-based models of cognitive architecture, schemas are
matched to the contents of working memory for recognition. If a schema is
partially matched by the information in working memory, it will create further
information to complete the match. Schemas instantiated in working memory
could be modified or reorganized, then placed back into long-term memory and
serve as a new, more specific schema for further recognition.
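The matching-and-completion step described above can be illustrated with a minimal Python sketch. The Schema class, its slot names, and the restaurant and lecture examples are hypothetical and chosen only for illustration; they are not taken from any particular schema-based model of cognitive architecture.

```python
from dataclasses import dataclass


@dataclass
class Schema:
    """A named set of slots, each with a default filler (hypothetical structure)."""
    name: str
    defaults: dict

    def match_score(self, observed):
        # Count how many observed elements correspond to slots of this schema.
        return sum(1 for slot in observed if slot in self.defaults)

    def instantiate(self, observed):
        # Start from the default fillers, then overwrite with what was observed.
        filled = dict(self.defaults)
        filled.update({s: v for s, v in observed.items() if s in self.defaults})
        return filled


def recognize(schemas, working_memory):
    """Select the best-matching schema and complete the partial match."""
    best = max(schemas, key=lambda s: s.match_score(working_memory))
    return best.name, best.instantiate(working_memory)


schemas = [
    Schema("restaurant", {"menu": "some menu", "waiter": "a waiter",
                          "bill": "paid at the end"}),
    Schema("lecture", {"speaker": "a lecturer", "slides": "some slides",
                       "audience": "students"}),
]
working_memory = {"menu": "pasta dishes", "waiter": "Anna"}
name, instance = recognize(schemas, working_memory)
```

In this sketch the "bill" slot was never observed but is supplied by the schema. This is the sense in which a partially matched schema creates further information to complete the match.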
Schema theories do not differentiate between procedural and declarative
knowledge. Instructions for actions may be produced by matching a schema to a
situation and adding missing pieces of information. For example, recognizing a
situation as an instance of a schema for solving a simple linear algebraic equation
and recognizing the values of the corresponding slots would provide directions for
the necessary operations. Production rules could be considered as a form of
schematic knowledge. There is a tendency toward convergence of the production
system and schema-based approaches. For example, Koedinger and Anderson (1990)
integrated the
two approaches by constructing a computational (production-system-style) model
of solving geometry problems using schema-based knowledge structures. The
schemas (diagram configuration schemas) were described as clusters of
geometry facts that were associated with a single prototypical geometric image.
In this book, schematic knowledge structures will be used as the basic unit
and prevailing form of knowledge representations in long-term memory.
Accordingly, the approach to human performance that is based on studies of
schematic knowledge structures will be further referred to as a schema approach.
PROBLEM SOLVING AND THE NATURE OF HUMAN EXPERTISE
All of our purposeful cognitive activities can be considered as problem
solving. Initially, in the 1950s and 1960s, most research studies on problem
solving were concerned with knowledge-lean task domains that required no
special training or background knowledge (for example, the famous Tower of
Hanoi task, various puzzles, etc.). The study of such tasks led to the formulation
of a general theory of human problem solving (Newell & Simon, 1972). In this
theory, a problem contains three main components: a given state, a goal state, and
a set of operators for transforming the given state into the goal state. Problem-
solving activity is considered as a search in the problem space that consists of
separate problem states (knowledge states). The task of problem solving is to find
a sequence of operators that can transform the initial state into a goal state within
the problem space.
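The three components of a problem (given state, goal state, operators) and the idea of search in a problem space can be sketched in Python for the Tower of Hanoi task mentioned above. The state encoding (a tuple of three pegs) and the use of breadth-first search are assumptions made for this illustration. Human solvers are not claimed to search exhaustively in this way.

```python
from collections import deque


def legal_moves(state):
    """Yield (operator, next_state) pairs.

    A state is a tuple of three pegs; each peg is a tuple of disk sizes
    listed from bottom to top. An operator is a (source, target) peg pair.
    """
    for src in range(3):
        if not state[src]:
            continue
        disk = state[src][-1]
        for dst in range(3):
            # A disk may only be placed on an empty peg or on a larger disk.
            if dst != src and (not state[dst] or state[dst][-1] > disk):
                pegs = [list(p) for p in state]
                pegs[src].pop()
                pegs[dst].append(disk)
                yield (src, dst), tuple(tuple(p) for p in pegs)


def solve(given, goal):
    """Breadth-first search of the problem space for a shortest operator sequence."""
    frontier = deque([(given, [])])
    visited = {given}
    while frontier:
        state, path = frontier.popleft()
        if state == goal:
            return path
        for op, nxt in legal_moves(state):
            if nxt not in visited:
                visited.add(nxt)
                frontier.append((nxt, path + [op]))
    return None


given = ((3, 2, 1), (), ())   # all three disks on the first peg
goal = ((), (), (3, 2, 1))    # all three disks on the last peg
plan = solve(given, goal)     # a sequence of 7 operators for three disks
```

Each element of plan is one operator application, and the visited set corresponds to the knowledge states explored along the way.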
So-called weak methods could be used in solving knowledge-lean tasks. We
often use general heuristics (rules of thumb) for choosing necessary sequences of
operators. For example, the difference reduction heuristic suggests choosing
operators that maximally reduce the difference between the current state and the
desired state. However, this method does not guarantee success in solving the
problem, and more advanced methods are usually adopted. Forward chaining
starts with the initial problem state, applies a heuristically selected operator, and
then repeats from the resulting state. Backward chaining starts with the desired
solution state, and a heuristically chosen operator is applied in reverse. A
subgoaling strategy chooses an operator and forms a subgoal to find a way to
change the current state so that the chosen operator could be applied. The method
of solving by analogy uses the structure of the solution to one problem to obtain
the solution to another problem (van Lehn, 1989).
The weak methods are often used in combined forms. For example, the GPS
(General Problem Solver) production system-based mechanism developed by
Newell and Simon (1972) uses the means-ends analysis method. This method
consists of looking for an operation that reduces the difference between the goal
and initial state, setting up subgoals whose solution provides a solution of the
original goal, and building up a hierarchical plan to solve a problem. Means-ends
analysis thus combines forward chaining and operator subgoaling: the current
state of problem solving is compared to the goal state and actions are selected to
reduce the difference (van Lehn, 1989).
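A much simplified sketch of means-ends analysis is given below in Python. It is not the original GPS program: the representation of operators by precondition, add, and delete lists, the depth limit, and the tea-making domain are all assumptions introduced for this illustration. The sketch shows the two ingredients named above, namely selecting an operator that reduces a difference and setting up subgoals to make that operator applicable.

```python
def achieve(state, goal, operators, depth=10):
    """Achieve one goal condition; return (new_state, plan) or None."""
    if goal in state:
        return state, []          # no difference left for this condition
    if depth == 0:
        return None               # give up on overly deep subgoal chains
    for op in operators:
        if goal not in op["add"]:
            continue              # operator does not reduce this difference
        # Subgoal: make the operator applicable by achieving its preconditions.
        result = achieve_all(state, op["pre"], operators, depth - 1)
        if result is None:
            continue
        sub_state, plan = result
        new_state = (sub_state - op["delete"]) | op["add"]
        return new_state, plan + [op["name"]]
    return None


def achieve_all(state, goals, operators, depth=10):
    """Achieve every condition in order; fail if an earlier one gets undone."""
    plan = []
    for goal in goals:
        result = achieve(state, goal, operators, depth)
        if result is None:
            return None
        state, steps = result
        plan = plan + steps
    if not all(goal in state for goal in goals):
        return None
    return state, plan


# Hypothetical everyday domain used only to exercise the sketch.
operators = [
    {"name": "brew tea", "pre": ["water hot", "have teabag"],
     "add": {"tea made"}, "delete": {"have teabag"}},
    {"name": "boil water", "pre": ["kettle filled"],
     "add": {"water hot"}, "delete": set()},
    {"name": "fill kettle", "pre": [],
     "add": {"kettle filled"}, "delete": set()},
]
final_state, plan = achieve_all(frozenset({"have teabag"}), ["tea made"], operators)
# plan is ["fill kettle", "boil water", "brew tea"]
```

The nested calls form the hierarchy of subgoals. Every pending subgoal has to be kept track of until it is resolved, which is the bookkeeping that a human solver would have to carry in working memory.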
In the early 1980s, experiments with puzzle problems demonstrated that, even
after extensive problem solving by means-ends analysis, participants still did not
induce a simple solution rule. Rule induction occurred only after some additional
information had been provided (Mawer & Sweller, 1982; Sweller & Levine, 1982;
Sweller, Mawer, & Howe, 1982). Empirical evidence was obtained that extensive
practice in conventional problem solving was not an effective way of acquiring
schemas that are required to successfully solve corresponding problems (Owen &
Sweller, 1985; Sweller & Cooper, 1985; Sweller & Levine, 1982; Sweller,
Mawer, & Ward, 1983). These studies suggested that a means-ends strategy could
inhibit schema acquisition.
A means-ends strategy focuses attention on specific features of the problem
situation required to reach the goal and on reducing the difference between current
and goal problem states by selecting proper operators. Maintaining subgoals and
considering alternative solution pathways are cognitively demanding mental
activities that might result in working memory overload. Additionally, these
activities are unrelated to learning solution schemas that are critical for successful
future problem solving. They reduce resources devoted to learning other
important aspects of problem structure. For example, studies of two-step problems
demonstrated that cognitive load might be very high at the subgoal stages
resulting in more errors than on the final goal stage (Ayres & Sweller, 1990).
Sweller and Levine (1982) demonstrated rapid learning of maze problem-solving
schemas when the specific goal state was unknown, and it was not
possible to reduce differences between the goal and given problem states. Sweller,
Mawer, and Ward (1983) found that using a means-ends strategy can actually
impair learning, and that less directed exploration of the problems facilitated
acquisition of useful problem schemas. They used simple physics and geometry
problems without a specific goal stated (goal-free problems such as "Calculate the
value of as many variables as you can") and observed enhanced development of
problem-solving skills. Owen and Sweller (1985) found that problem solvers
using a means-ends strategy made significantly more errors than those using other
methods, supposedly due to the working memory load associated with means-
ends analysis.
In a theoretical investigation of the cognitive (working memory) load
phenomena, Sweller (1988) constructed and analyzed a computational model of
cognitive processes based on a theory of production systems (Newell & Simon,
1972). The model operates by matching elements on the condition side of each
production to elements in a working memory (for example, the knowns,
unknowns, goal, possible equations or theorems). If the condition side of a
production is matched by some of the elements in working memory, the
production can fire, and its action alters the content of working memory allowing
other productions to fire. The cognitive load in such a model could be measured
considering the number of statements in working memory, the number of
productions, the number of cycles to solution, and the total number of conditions
matched. Application of this model to novice cognitive behavior in various
instructional procedures provided evidence of the heavy cognitive load associated
with a means-ends strategy compared with a forward-working goal-free strategy.
It also explained why the use of goal-free problems or worked examples was a more
effective means of acquiring schemas than conventional problem solving
(Sweller, 1988; Ayres & Sweller, 1990).
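The general mechanism of such a production-system model, together with the kind of load measures listed above, can be sketched in Python as follows. This is not Sweller's (1988) model: the productions, the kinematics example, and the exact way the counters are computed are assumptions made for illustration only.

```python
def run(productions, working_memory, goal, max_cycles=50):
    """Fire productions until the goal statement appears in working memory.

    Each production is a (name, conditions, additions) triple, where
    conditions and additions are sets of working-memory statements.
    """
    wm = set(working_memory)
    stats = {"cycles": 0, "conditions_matched": 0, "peak_statements": len(wm)}
    while goal not in wm and stats["cycles"] < max_cycles:
        fired = False
        for name, conditions, additions in productions:
            # Fire if all conditions are matched and the action adds something new.
            if conditions <= wm and not additions <= wm:
                wm |= additions
                stats["cycles"] += 1
                stats["conditions_matched"] += len(conditions)
                stats["peak_statements"] = max(stats["peak_statements"], len(wm))
                fired = True
                break
        if not fired:
            break  # no production can fire, so the model stops
    return wm, stats


# Hypothetical forward-working productions for a constant-acceleration problem:
# the knowns are u (initial velocity), a (acceleration), and t (time).
productions = [
    ("v = u + a*t", {"u", "a", "t"}, {"v"}),
    ("s = (u + v)/2 * t", {"u", "v", "t"}, {"s"}),
]
wm, stats = run(productions, {"u", "a", "t"}, goal="s")
# stats is {"cycles": 2, "conditions_matched": 6, "peak_statements": 5}
```

A means-ends variant of such a model would additionally have to hold goal and subgoal statements in working memory, which would raise the counters relative to this forward-working run.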
Since the late 1970s, the research focus in problem solving shifted to studying
knowledge-rich task domains (algebra, geometry, physics, thermodynamics,
computer programming, chess, bridge, etc.) that required an essential knowledge
base as a prerequisite. Problem solving in such domains has additional
complexities. Representation of a problem requires a great deal of domain
knowledge, and operators that are usually used are domain-specific operators. The
central questions of research in such domains are how knowledge is used to build
up a problem representation and how it influences the actual problem-solving
process (Reimann & Chi, 1989).
In semantically rich domains, problem solving involves searching one's
knowledge of the domain in order to find the operators for solving the problem.
Research on the use of knowledge in problem solving suggests that people use
two types of domain-specific knowledge to solve problems: declarative
conceptual knowledge (knowledge of the principles of the domain) and procedural
knowledge (knowledge of how to perform cognitive activities). Procedural
knowledge may be described as a set of production rules that define actions for
achieving goals (Anderson, 1983). Conceptual and procedural knowledge in
problem solving can be considered as organized into problem schemas. They form
the general framework of knowledge that corresponds to classes of problems.
Problem solving in complex domains thus can be viewed as finding an
appropriate problem schema in long-term memory and filling in this schema with
the specific parameters of the problem (Chi, Feltovich, & Glaser, 1981; Chi &
Glaser, 1985). The problem schema determines what conceptual knowledge is
used to build a representation of the problem statement, and what procedures are
used to solve the problem. Much research in knowledge-rich domains is
concerned with the differences between expert and novice problem solving. It has
become evident that experts' behavior is mostly determined by their knowledge
base. Therefore, the learning processes in which the experts acquired this
knowledge are critical in explaining their performance. The focus of attention in
the later studies shifted to learning theories as theories of the acquisition of
expertise (Van Lehn, 1989).
A considerable number of recent research studies in cognitive psychology
have been concerned with the investigation of the structures and processes of
human competent performance as a consequence of learning. It is generally
accepted that development of expert performance is a very complex process
involving a great deal of deliberate effort. Studies have shown that at least 10
years of practice are necessary for people in various fields of culture and science
to reach superior levels of skilled performance (Ericsson & Charness, 1994;
Ericsson, Krampe, & Tesch-Romer, 1993; Simon & Chase, 1973).
Expert performance is usually acquired during extensive deliberate practice in
a domain. Such practice should be organized at an appropriate and challenging
level of difficulty, allow steady skill refinement by repetition and error correction,
and provide informative feedback to the learner (Ericsson et al., 1993; Ericsson &
Lehman, 1996). Competent expert performance generally requires well-developed
cognitive skills, well-organized structures of knowledge, and self-regulatory
performance control or metacognitive strategies (Glaser, 1990).
Well-developed cognitive skills as a major characteristic of expert
performance require functional (related to conditions of applicability) automated
knowledge (Fitts & Posner, 1967; Anderson, 1983, 1993; Klahr, Langley, &
Neches, 1987). The process of skill learning is claimed to occur in several stages.
In the first stage (cognitive stage), a description of the procedure is learned in the
form of declarative knowledge. In the second stage (an associative stage), the
declarative information is transformed into a procedural form, and a set of
procedures for performing the skill is acquired. Such a process of converting
declarative knowledge into a procedural form is called proceduralization. In this
stage, two forms of knowledge (declarative and procedural) coexist. In the third
stage (autonomous stage), the skill becomes more rapid and automatic (Anderson,
1983).
When knowledge becomes automated during the development of proficiency,
conscious processing capacity can be concentrated on higher levels of cognition.
Automated performance requires only limited attentional capacity. Processing that
once demanded active control can become automatic after extensive practice,
freeing limited attentional capacity for other tasks (Kotovsky, Hayes, & Simon,
1985; Schneider & Shiffrin, 1977; Shiffrin & Schneider, 1977). For example,
while the use of declarative knowledge initially requires much conscious
cognitive processing, automatic application of proceduralized knowledge frees
working memory and allows its capacity to be used for the processing of new
knowledge. Intensive training on certain procedural elements of a task can make
them more automatic and free cognitive capacity for other more creative elements.
This is especially important for transfer of training (Cooper & Sweller, 1987;
Howell & Cooke, 1989). Automated lower level routine procedures enable
learners to concentrate on finding new ways of applying their knowledge in
unfamiliar situations.
The process of learning could be considered as the acquisition of new
schemas that eliminate the need to apply weak problem-solving methods (e.g.,
means-ends analysis) to solve future similar problems. The result is a shift from a
novice strategy of working backward from the goal using means-ends analysis
and subgoaling, to a more expert knowledge-based strategy of working forward
from the initial state to the goal. Availability of a sufficient set of relevant
domain-specific schematic knowledge structures that could be used in performing
tasks is an important feature of competent human performance. With experience
in a domain, knowledge is organized into larger interconnected aggregate
structures that explain the skilled performance of experts (Chi, Glaser, & Farr,
1988; Lord & Maher, 1991).
Under a schema-based approach, learning can take different forms. Schema
evolution is a central mechanism in the development of expertise. New
information could be encoded in terms of existing schemas without involving any
new schemas. Schemas evolve as they are applied and utilized as learner
experience in the domain increases. Another form of learning is restructuring or
creation of new schemas. In order to explain how schemas can be built up through
experience, Rumelhart and Norman (1981) proposed a mechanism of learning by
analogy. Initially, a new schema could be created by modeling it on an existing
schema followed by a process of refinement (tuning). When a learner encounters a
new situation in a familiar domain, she or he tries to interpret it using existing
schemas. If none of them suits the situation, the best existing schema can serve as
a model from which to start the tuning process. The characteristics of this model
that do not contradict the new situation are carried over into the new schema.
Planning and self-regulatory (metacognitive) skills allow experts to control
their performance, assess their work, and predict its results. These self-regulatory
skills are an important condition of expert ability to use the available knowledge
base (Chi, Bassok, Lewis, Reimann, & Glaser, 1989; Larkin, McDermott, Simon,
& Simon, 1980). Chi et al. (1989) proposed that students learn and understand
examples of problem solutions via the self-explanations they give while studying.
Students who are successful problem-solvers tend to study example exercises by
explaining and providing justifications for each action and relating these actions
to the principles and concepts of the domain. These students read the example
with understanding and self-monitoring. Students who are less successful
problem-solvers do not connect their explanations (if any) with their
understanding of the principles of the domain. During problem solving, successful
students may use examples for a specific reference, whereas less successful
students repeat them in search for ready-made solutions. The level of performance
significantly depends on the metacognitive skills that learners bring to the task.
Cognitive studies of human performance and learning have the potential to
greatly influence instructional design principles. Generally, instructional design
should minimize learners' involvement in activities that overburden their limited
working memory and be adapted to the learners' available knowledge structures
in long-term memory. Appropriate design of instruction should be based on the
knowledge of characteristics of expert performance, expert-novice differences,
and the transition process from novice to expert. Cognitive models of expert
performance and their influence on the design of instruction are considered in the
following chapter.
Chapter 2
COGNITIVE STUDIES OF EXPERT-NOVICE DIFFERENCES AND DESIGN OF INSTRUCTION
SCHEMA-BASED APPROACH TO STUDYING EXPERT PERFORMANCE
The purpose of cognitive studies of human expertise is to identify the
cognitive structures and processes responsible for skilled performance. Expert
performance has been studied in a variety of domains, for example, chess (de
Groot, 1965), physics (Chi, Feltovich, & Glaser, 1981; Larkin, McDermott,
Simon, & Simon, 1980), programming (Anderson, Boyle, & Reiser, 1985) and
radiology (Lesgold, Rubinson, Feltovich, Glaser, Klopfer, & Wang, 1988), to
name just a few. Various techniques and approaches have been applied to find out
the organization of experts' knowledge, the characteristics of their understanding,
information processing requirements and the nature of competency in such areas
as chess (Chase & Simon, 1973; Simon, 1979), geometry (Greeno, 1977),
genetics (Smith & Goodman, 1984), physics (Larkin & Reif, 1976), electronic
troubleshooting (Brown & Duguid, 1989; Forbus & Gentner, 1986; Gitomer,
1988; Lesgold & Lajoie, 1991; Morris & Rouse, 1985; Perez, 1991; Rasmussen,
1986; Swezey, Perez, & Allen, 1988; Tenney & Kurland, 1988; Wiggs & Perez,
1988), and mechanical troubleshooting (de Kleer & Brown, 1983, 1984; diSessa,
1983; Forbus, 1984; Hegarty, 1991; Hegarty & Just, 1989; Heller & Reif, 1984;
Miyake, 1986; Reif, 1987; Stanfill, 1983; White, 1983; White & Frederiksen,
1986).
As discussed in the previous chapter, schemas are a major type of knowledge
representation in long-term memory that reflects prototypical features of objects,
situations, and events. To understand or interpret incoming information, the
human cognitive system matches this information with existing schemas
(Rumelhart & Norman, 1983). In general, studies of expert-novice differences
demonstrate that expertise is not so much a function of superior problem-solving
strategies or a better working memory, but rather experts have a better domain-
specific schematic knowledge base.
Chunks have played an important role in the development of the
understanding of expert-novice differences. Since Miller's (1956) finding that
short-term memory is limited to approximately seven units, or chunks, of
information, a chunk has served as a unit of measurement for memory capacity. A
chunk can be considered as a generalized example of a schema. De Groot (1965;
1966) was one of the first psychologists who investigated expert-novice
differences and demonstrated that expertise can be explained by the enormous
amounts of knowledge that experts can access. In his classic studies, chess players
had to reconstruct the positions of chess pieces on a board, after a brief exposure
(5 seconds). De Groot's finding that chess masters could recall many more pieces
from briefly exposed real chess positions than novices could was explained by masters
having larger chunks. Chase and Simon (1973) noticed that experts placed chess
pieces on the board in groups that represented meaningful configurations. The
experts did not show superior performance when random placements of the chess
pieces were used.
Egan and Schwartz (1979) studied expertise in electronics with a
methodology similar to that used by Chase and Simon (1973) in studying chess
expertise. They found that experts could reconstruct large circuit diagrams from
memory recalling them in chunks of meaningfully related components. The
experts were better than novices at recalling meaningful (not random) circuit
diagrams. The size, rather than number, of recalled chunks increased with study
time. Chase and Ericsson (1982) further suggested that the superior memory of
chess masters and other experts was due to possession of schema structures with
specific slots filled in with the index information that served as retrieval cues. The
material could be recalled by reading out the contents of these slots and selecting
schemas that corresponded to familiar stimuli.
The schema-based approach was successfully used to explain various
phenomena related to expert performance and differences between experts and
novices (Chi et al., 1981; Reimann & Chi, 1989). For example, in the domain of
physics, experts' categories were based on the principles of mechanics
(conservation of energy and momentum, etc.), whereas novices' categories were
based on objects and surface features stated in each specific problem (inclined
plane, spring, etc.). In the case of an object being balanced on an inclined plane,
the experts saw it as an example of a class of problems requiring a balance-of-forces
approach, while novices saw it as an "inclined plane" problem type. The
failure of a novice to solve this problem may result from the fact that different
inclined plane tasks may require different approaches (based on balance of forces,
energy conservation, etc.), and the presence of the inclined plane alone does not
determine the appropriate approach.
One of the reasons for novices' difficulties in problem solving is that they
activate only lower-level schemas that incorporate only surface aspects of the
problem, whereas experts activate higher-level schemas that contain information
critical to the problem solution (Chi & Glaser, 1985). Thus, experts categorize
problems in terms of deep structures such as the laws used to solve the problems,
while novices categorize problems based on surface structures such as common
physical attributes. The same problem may elicit different schemas for experts
than for novices.
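The contrast between surface-based and principle-based categorization can be made concrete with a small Python sketch. The three problems and their labels are hypothetical and serve only to show how the same problem set falls into different groups depending on which feature drives the sorting.

```python
from collections import defaultdict


def categorize(problems, feature):
    """Group problem ids by the value of the given feature."""
    groups = defaultdict(list)
    for problem in problems:
        groups[problem[feature]].append(problem["id"])
    return dict(groups)


# Hypothetical problem set: "surface" is what the problem statement mentions,
# "principle" is the physical law needed to solve it.
problems = [
    {"id": 1, "surface": "inclined plane", "principle": "balance of forces"},
    {"id": 2, "surface": "inclined plane", "principle": "energy conservation"},
    {"id": 3, "surface": "spring", "principle": "energy conservation"},
]

novice_groups = categorize(problems, "surface")
# {"inclined plane": [1, 2], "spring": [3]}
expert_groups = categorize(problems, "principle")
# {"balance of forces": [1], "energy conservation": [2, 3]}
```

Problems 1 and 2 look alike on the surface but call for different approaches, whereas problems 2 and 3 look different but share a solution principle.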
Schematic knowledge structures in long-term memory effectively provide
necessary executive guidance during high-level cognitive processing (Sweller,
2003). Without such guidance and in the absence of external instructions, people
usually resort to random search or weak problem-solving methods such as means-
ends analysis (a gradual reduction of differences between current and goal
problem states). Such methods are cognitively inefficient and time consuming.
They may impose a heavy working memory load interfering with construction of
new schemas (Sweller, 1988).
In contrast, when experts in a domain encounter a familiar problem situation,
they rapidly retrieve appropriate previously acquired schemas from long-term
memory and apply them in a cognitively efficient way (Chi et al., 1981; Larkin
et al., 1980). Schemas allow them to categorize different problem states and
decide the most appropriate solutions. Due to their available knowledge base in
long-term memory, experts are able to avoid cognitively inefficient mental
activities and perform with greater accuracy and lower cognitive loads.
Schematic knowledge structures can be described functionally by indicating
how a person with a specific level of schema acquisition would act in relevant
problem situations. For example, without any schematic knowledge of procedures
for solving the equation 4x + 2 = 3 and in the absence of any guidance, a student will
treat each symbol separately and may try to use a means-ends analysis approach
by reducing differences between a current problem state and the goal state (x = ?)
or attempt to apply various random operations to the numbers.
With some previously acquired knowledge of an appropriate procedure,
another student may immediately proceed to subtract the constant 2 from both
sides of the equation: 4x + 2 - 2 = 3 - 2. The whole combination of elements (e.g.,
4x + 2) will be treated as a meaningful single unit or chunk. If a student has practiced
considerably with this kind of equation, the schema for this procedure may be
automated and her or his first solution step will be 4x = 1. Another, even more
experienced student may have all the relevant solution procedures well learned or
automated and would write the final answer (x = 1/4) almost immediately.
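The solution steps for 4x + 2 = 3 at different levels of schema acquisition can be laid out explicitly. The following Python sketch uses a hypothetical helper, solve_linear, that records every intermediate equation, and then shows which steps a learner at each level would plausibly write down. It assumes a nonzero coefficient a and, for readable output, a positive constant b.

```python
from fractions import Fraction


def solve_linear(a, b, c):
    """Solve a*x + b = c and return x with each intermediate equation as text."""
    a, b, c = Fraction(a), Fraction(b), Fraction(c)
    steps = [f"{a}x + {b} = {c}"]                    # the problem as given
    steps.append(f"{a}x + {b} - {b} = {c} - {b}")    # subtract b from both sides
    rhs = c - b
    steps.append(f"{a}x = {rhs}")                    # simplify
    x = rhs / a
    steps.append(f"x = {x}")                         # divide both sides by a
    return x, steps


x, steps = solve_linear(4, 2, 3)

# Which written steps correspond to each level described in the text.
written = {
    "some knowledge of the procedure": steps[1:],   # starts with 4x + 2 - 2 = 3 - 2
    "automated first step": steps[2:],              # starts with 4x = 1
    "fully automated procedure": steps[-1:],        # writes x = 1/4 directly
}
```

The more of the procedure is automated, the fewer intermediate states need to be written out or held in working memory.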
Similar examples of expert-novice differences could be demonstrated in other
areas. Each symbol in a wiring diagram could be treated as a separate element by
a novice electrician, while an experienced professional would see the whole
diagram as representing a complete system. To someone who cannot read a foreign
language, a printed text in it might look like a collection of unfamiliar symbols, while fluent native
readers would be able to make sense out of the whole text. They would treat
words or even combinations of words as single elements.
By combining multiple elements of information into a single chunk in
working memory, long-term memory schemas allow experts to avoid processing
overwhelming amounts of information and to effectively reduce working memory
load during high-level cognitive processing. In addition, experts are also able to
bypass working memory limitations by having many of their schemas highly
automated due to extensive practice. Human cognitive architecture has evolved in
a way that information processing changes significantly as this information
becomes more familiar to an individual (Sweller, 2003). Schematic knowledge
structures held in long-term memory significantly influence the content and
characteristics of working memory by effectively transforming it into long-term
working memory (Ericsson & Kintsch, 1995).
An expert's routine problem solving in a familiar domain usually involves a
selection of an appropriate schema, adapting it to the problem, and executing the
solution procedure. Often it occurs as a direct recognition early in the perception
of the problem (Chi, Feltovich, & Glaser, 1981). Non-routine problem solving
includes additional procedures such as search (when more than one schema is
applicable to the situation) or combining the schemas (when no one schema will
cover the whole problem) (Larkin, 1985). Substantial evidence has accumulated
that a schema theory of problem solving can be successfully used to explain
experts' performance in various task domains (Reimann & Chi, 1989).
Building a problem representation is a key process in problem solving
(Larkin, 1985; McDermott & Larkin, 1978; Simon & Simon, 1978). It has been
found that experts spend more time on a qualitative analysis of the problem and
building explicit representations of the situation (for example, by drawing the
diagrams of causal relationships between the objects). Experts also form more
abstract and enriched representations than novices do. For example, according to
Chi, Feltovich, and Glaser (1981), experts classify physics problems based on
abstract physics categories and principles, while novices do it according to surface
characteristics of the problem. Thus, the level of problem representation depends
on the solver's problem schemas. An initial cue (first sentences in the problem
statement, etc.) may activate a particular schema that is then matched to the
problem. Any mismatch results in the rejection of that schema and triggering of
another schema.
Successful problem solving in technical domains depends on the solver's
schemas for the causal relations between components of a technical system which
allow mental simulations of the system operation (de Kleer & Brown, 1983;
Gentner & Stevens, 1983; Miyake, 1986). Providing learners with a causal
description of a device's operation in addition to information about its
components was shown to enhance their ability to operate the device (Kieras &
Bovair, 1984; Mayer, 1989a).
Different types of schemas are appropriate for solving different types of
problems. At higher levels of skill, the choice of schematic knowledge types is
determined by higher level structures in which an expert's representations are
organized (Hegarty, 1991). Initially, problem schemas are specific to the
situations from which they were induced. With experience, they become indexed
by the general principles and problem solving becomes faster and takes less effort.
Organization of the solvers' knowledge into large groups of chunks or schemas
decreases the demands on working memory and allows learners to activate
appropriate procedures. As soon as experts retrieve a problem schema, they
automatically access the procedures for solving the problem (Chi et al., 1981;
Smith, 1991).
The development of a problem representation can be viewed as the sequential
attempts of schema refining, which depends on the structure of the domain-
specific knowledge of the solver. This results in experts spending more time on
planning and using forward-working and efficient problem-solving processes
(Reimann & Chi, 1989). Empirical studies in various domains have revealed that
problem-solving strategies are determined by the nature of the problem
representations, differences in the organization of knowledge, and the number of
domain-specific problem schemas that solvers have because of their experience in
a domain (Larkin, 1985; Lesgold, Feltovich, Glaser, & Wang, 1981).
Experts' performance is schema-driven. Experts possess more domain-specific
schemas and can access and use them more efficiently than novices.
Experts work forward, deriving the appropriate problem schema from the problem
statement. In contrast, novices' performance is goal-driven. Novices work
backward from the goal, searching for operators that will allow them to derive the
needed solution. However, working backwards is a default strategy that both
experts and novices use when there is no schema for a given type of problem. In
a novel situation, experts use various types of general heuristics together with
domain-specific knowledge (Perkins, Schwartz, & Simmons, 1991; Rist, 1989;
Schultz & Lochhead, 1991).
Thus, expert performance depends on available problem representations,
knowledge base (facts, concepts, principles, knowledge of a system, and rules for how
to use this knowledge), availability of appropriate domain-specific schemas,
general procedures (strategies, heuristics, algorithms), and relations among all
these elements (Hart, 1986; Lesgold & Lajoie, 1991). According to Chi, Glaser,
and Farr (1988), the main features of competent expert performance are:
1) domain-specificity (experts exhibit superior performance mainly in their
own domains);
2) perception of problem situations by large meaningful patterns;
3) high speed of performance;
4) superior well-organized long-term memory knowledge base;
5) deep-level and principle-based problem representations;
6) thorough qualitative analysis of problems; and
7) strong self-monitoring skills.
COGNITIVE STUDIES OF EXPERT-NOVICE DIFFERENCES
AND INSTRUCTIONAL APPROACHES
Most studies of expertise have focused on discrete expert-novice differences
in solving specific tasks. The existence of a continuum between novices and experts
has frequently been ignored. As a result, our knowledge about the development of
expertise and about changes in cognitive processes as expertise is acquired is
limited. Groen and Patel (1991) suggested four developmental levels: 1) novices
with no training in the domain (possessing only common sense knowledge and
everyday experience); 2) intermediates who have received some instruction in the
domain; 3) sub-experts who have expertise in a closely related domain (they may
also be viewed as intermediates); and 4) experts who are always correct in solving
routine problems and solve them by way of forward reasoning. It is impossible for
novices to learn expert approaches directly. When expert rules are taught to
beginners, they form isolated pieces of knowledge that are not retained for a long
period of time (Groen & Patel, 1991). Thus, an existing theory of expert
performance cannot be applied directly to instruction, and theoretical models of
student transition from one level to another should be developed.
Expert routine problem solving is traditionally associated with using a
forward-working strategy; novices tend to work backward. In the case of
unfamiliar problems, experts also use backward reasoning. The studies of Sweller
and his colleagues (Mawer & Sweller, 1982; Sweller & Levine, 1982; Sweller et
al., 1983) brought some understanding of when the switch occurs during the
development of expertise and what factors would facilitate the switch. It was
demonstrated that means-ends analysis might prevent the acquisition of problem-
specific rules because this method could leave no cognitive resources available for
meaningful learning.
Rule acquisition occurred or improved under conditions where subjects were
provided with information additional to the problem goal (for example, a set of
subgoals) or were given goal-free problems. Sweller et al. (1983) hypothesized
that the main factor responsible for this result was the kind of information a
learner focuses on during problem solving. If knowledge or schema acquisition is
an aim of problem solving, then the influence of the goal as a control mechanism
should be reduced.
In some studies, forward-reasoning intermediate-level medical students
performed more poorly than either experts or novices (Groen & Patel, 1991). This
result was explained by their dogmatic reliance on existing basic science
knowledge. When students' knowledge contains misconceptions, forward
reasoning might be harmful for learning. If they reasoned backward, then the
misconceptions would be just temporary hypotheses. It was suggested that in such
cases an emphasis should be placed on self-explanations and testing their
adequacy (explanation-based learning) rather than on correct problem solving
(Groen & Patel, 1991).
Most of the experimental evidence in the area of expert-novice differences
was obtained by contrasting performance of experts and novices. Schoenfeld and
Hermann (1982) conducted one of the first longitudinal studies of the relationship
between problem perception and expertise. Students' perceptions of mathematical
problems were examined before and after intensive training in mathematical
problem solving. It was demonstrated that novices sorted problems based on
surface components mentioned in the problem statement. After the training, they
sorted them in a more expert-like way according to the principles of problem
solution. Thus, problem perception and problem schemas on which such
perception is based changed as learners became more experienced in the domain.
With the development of expertise, problem schemas change in their level of
specificity (diSessa, 1983; Forbus & Gentner, 1986; Kaiser, Jonides, &
Alexander, 1986). Initially induced from specific situations, they become more
general and indexed by the underlying principles (Chi et al., 1981). At higher
levels of development, schemas may also change from qualitative to quantitative,
representing relationships between components of problem situations more
precisely (Forbus & Gentner, 1986; Hegarty, Just, & Morrison, 1988). As people
gain more experience with technical systems, they learn relations between their
common subsystems and learn to chunk components of systems into these
subsystems (Hegarty, 1991). New information is then assimilated into existing
sophisticated knowledge structures.
The learning mechanisms and strategies evolve as a learner becomes more
experienced (Langley & Simon, 1981). Lesgold et al. (1988) hypothesized that
early learning is perceptual and different from later cognitive learning. Experts
use schemas to interpret incoming information, intermediates often reshape their
perceptions to fit the schema, whereas novices completely rely on their
perceptions. The previously mentioned decline in performance at intermediate
levels can also be due to the shift from perceptual learning to cognitive schema-
based learning.
According to the triarchic/global/local architecture of expert cognition
(Sternberg & Frensch, 1992), when processing information from new domains, an
expert relies mostly on controlled, global processing. If information belongs to the
expert's narrow area of expertise, she or he relies mostly on automatic, local
processing. Such local processing systems can operate in parallel, be automated,
and be characterized by almost unlimited processing capacity. As expertise develops,
learned portions of processing procedures are transferred to a local processing
system. This enables experts to automate more processing and thus to free global
processing resources for dealing with new situations (Sternberg & Frensch, 1992).
However, experts may be inflexible in new situations because it is difficult to
reorganize an automated schema. Experiments with bridge players confirmed that
experts were more affected when new task demands required changing deep,
abstract principles rather than surface features. Novices were more affected by
surface changes than by deep, abstract changes (Sternberg & Frensch, 1992).
Nevertheless, Schraagen (1993) demonstrated that when domain-specific
knowledge is missing, experts could still maintain a more structured approach
than novices could by making use of more abstract high-level knowledge.
According to the theory of skill acquisition (Anderson, 1983), the instruction
in specific performance procedures must be preceded by the instruction in the
concepts, rules, and principles of how things work (declarative knowledge). In
addition to the theoretical principles, the ability to apply them in concrete
situations should be developed (Morris & Rouse, 1985). A procedural approach
alone is not sufficient, because it is impossible to predict all possible situations in
advance, especially in complex domains like modern digital electronics. Thus,
training should combine knowledge of system principles with procedures of how
to use this knowledge in a specific context. In general, teaching expert
performance might require a basic conceptual explanation of how things work,
practice in carrying out basic procedures, and variation in experiences for tuning
of procedural knowledge and the development of persistence and confidence
(Gentner & Stevens, 1983; Greeno & Simon, 1988).
Kieras and Bovair (1984) demonstrated that providing students with
conceptual models of a complex system prior to information on how to use that
system produced better recall, faster learning, and fewer errors in the operation of
the system. Combined structural and functional descriptions of system operations
are recommended for effective learning (Psotka, Massey, & Mutter, 1988).
However, specific instructional strategies should be based on the cognitive
requirements of particular tasks. The user does not always need a complete
knowledge of the system in order to be able to operate it.
For example, many experts in technical areas have a very limited
understanding of general physics principles but satisfactorily perform their duties.
If a device is simple, or a procedure is easily learned and practiced (e.g., a
telephone), there may be no need to provide a device model. The user may infer a
usable model without instruction (Kieras & Bovair, 1984). Limited underlying
knowledge and understanding of how certain functions are fulfilled are required
for operating and troubleshooting systems with simple functions. For more
complex systems, a deeper understanding of their components and operation is
required (Lesgold & Lajoie, 1991).
Novices often have difficulties integrating general theoretical concepts with
their intuitions because of conflicts between everyday meanings of new concepts
(e.g., acceleration, mass) and their meaning in theory (Reif, 1987), conflicts
between students' intuitive knowledge and theoretical laws (diSessa, 1982), or
because of the lack of procedural knowledge of solving specific problems that is
often not explicitly taught (Heller & Reif, 1984).
There have been two major approaches in using the results of cognitive
research on knowledge structures in the design of instructional systems (Glaser,
1990). The first approach has been developed in the tradition of knowledge
engineering in artificial intelligence and design of expert systems. It requires
exposing the learner to the knowledge characteristics of well-developed
expertise. A well-known example of a computer-based instructional system
designed in accordance with this approach is the GUIDON project (Clancey &
Letsinger, 1984).
The second approach has been developed in cognitive science and is based on
cognitive models of students' knowledge. For example, in instructional systems
based on qualitative models (Chi, 1988; Forbus & Gentner, 1986), a learner has to
progress from simple to more sophisticated domain-specific conceptual models
(e.g., coordinated functional, causal, and structural models; qualitative and
quantitative models). This progression occurs in the context of solving
specifically designed problems with gradually increasing levels of complexity. An
example of this approach is QUEST, a program for teaching the troubleshooting of
electric circuits (White & Frederiksen, 1986).
Similar ideas were realized in the STEAMER project (the simulator for
training engineers to operate steam propulsion plants aboard large naval ships).
The primary goal was to teach a robust conceptual model (rather than specific
procedures) that could be used to reason about the steam plant qualitatively
(Holland, Hutchins, McCandless, Rosenstein, & Weitzman, 1987). Abstract
graphic images of the steam plant were organized in a hierarchical manner with
the major plant parameters presented first, followed by more detailed simulations
of subsystem components.
SHERLOCK is an example of a coached-practice learning environment in
which learners compare their own performance with expert performance (Gabrys,
Weiner, & Lesgold, 1993; Lesgold & Lajoie, 1991). Such reflection, however,
may place a large demand on working memory if solution paths are long or
complicated. SHERLOCK supports reflection by a replay of the trainee's and an
expert's performance. During replay, the system provides a summary of the
information the user has obtained on previous steps. The system allows learners to
observe the expert's decision process, reasons behind it, and the overall goal
structure for the expert performance. This technique reduces the cognitive load
associated with remembering the details of the trainee's own performance while
observing the expert's actions (Gabrys et al., 1993).
Another well-known example of a similar approach is the model-tracing
methodology in intelligent tutoring systems (Anderson, 1993). The tutoring
system simulates a student's cognitive behavior in real time and maintains a
model of the student's knowledge state. It provides an example-based learning
environment in which students can induce rules from examples. The learner's
actual performance is compared to the ideal structure of solution (production rules
model), and the student is kept on the correct solution path. The tutor estimates
the availability of acquired productions based on their correct and incorrect
applications and selects appropriate problems for exercises. Many tutoring
programs based on the model-tracing methodology have been effectively used in
the fields of programming, geometry proofs, and solving algebraic equations
(Anderson, Boyle, & Reiser, 1985; Anderson & Corbett, 1993; Anderson,
Corbett, Fincham, Hoffman, & Pelletier, 1992; Anderson, Farrell, & Sauers,
1984).
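The comparison of a learner's steps with an ideal solution path, and the estimation of how well each production has been acquired, can be illustrated with a minimal sketch. It is not the algorithm used in the tutors cited above: the class `ModelTracer`, its method names, the rule names, and the simple proportion-correct estimate are all assumptions introduced here for illustration.

```python
from collections import defaultdict


class ModelTracer:
    """Toy tracer (hypothetical): compares student steps with an ideal solution path."""

    def __init__(self):
        # Per-rule tallies: [correct applications, incorrect applications]
        self.tallies = defaultdict(lambda: [0, 0])

    def trace(self, ideal_path, student_steps):
        """Check each student step against the ideal path.

        ideal_path: list of (rule_name, expected_result) pairs.
        student_steps: list of results, one per step.
        Returns the indices of the steps where the student left the path.
        """
        off_path = []
        for i, ((rule, expected), actual) in enumerate(zip(ideal_path, student_steps)):
            if actual == expected:
                self.tallies[rule][0] += 1
            else:
                self.tallies[rule][1] += 1
                off_path.append(i)  # a real tutor would intervene at this point
        return off_path

    def mastery(self, rule):
        """Assumed estimate: proportion of correct applications (0.0 if unseen)."""
        correct, incorrect = self.tallies[rule]
        total = correct + incorrect
        return correct / total if total else 0.0

    def weakest_rule(self):
        """Rule with the lowest estimated mastery, used to pick the next exercise."""
        return min(self.tallies, key=self.mastery)


# Ideal path for solving 2x + 3 = 7 (rule names are illustrative).
ideal = [("subtract_constant", "2x = 4"), ("divide_coefficient", "x = 2")]
tracer = ModelTracer()
errors = tracer.trace(ideal, ["2x = 4", "x = 8"])
print(errors)                 # [1]
print(tracer.weakest_rule())  # divide_coefficient
```

In this sketch the second step deviates from the ideal path, so the tally for the corresponding rule drops and that rule is selected as the target for further practice.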
COGNITIVE MODELS OF DEVELOPMENT OF EXPERTISE
AND INSTRUCTIONAL DESIGN
Cognitive studies of human performance and learning have demonstrated that
learning processes are supported by a basic cognitive architecture that includes a
powerful long-term memory and a limited working memory. Schema acquisition
and automation as the major learning mechanisms are critical in intellectual skills
formation. Studies of chess skills and other domains indicate that our knowledge
base provides the foundation of intellectual skills. Schemas held in long-term
memory allow experts to avoid processing overwhelming amounts of information
in working memory and thus by-pass working memory limitations.
Automatic processing allows mental processes to occur rapidly, smoothly,
without conscious control and associated burden on working memory. With time
and practice, all cognitive processes can occur automatically (van Merriënboer &
Paas, 1990). For example, solving a/b = c for a initially requires considering the
problem consciously before realizing that it belongs to the category that requires
multiplying out the denominator. After substantial practice, the schema becomes
automated and allows instant recognition of the category of this problem (Sweller
& Chandler, 1994).
Initially, a novice learner deals with isolated pieces of information without an
organizing structure. Available lower-level schemas could be used to interpret
these isolated pieces of information. After studying relevant examples and
problem-solving practice, isolated pieces of information may form higher order
structures according to similarities and relationships among them. As a result, new
schemas are formed. In the next phase, new facts are added, schemas become
more integrated (schemas consisting of other schemas rather than facts), and
performance becomes more automated and unconscious. Such a transition
between different phases of learning is a continuous process and boundaries
between phases are vague (Shuell, 1990).
Learners acquire new meaningful knowledge by integrating their existing
knowledge structures with schemas induced from studying examples or problem
solving, for example, by explaining each step to themselves in terms of
knowledge they had already acquired (Chi et al., 1989). A learner may not
understand an instruction if she or he lacks an appropriate schema or cues to
retrieve it, or if the learner activates a different schema from that intended by the
instruction (Rumelhart, 1980). Students' schemas might interfere with instruction
when there are mismatches between the existing schemas and those the
instructional designers assumed they had (Osborne & Schollum, 1983).
Thus, in order to be understood, instruction should correspond to the students'
existing schemas. However, students' own schemas in particular domains are
often quite different from those of experts or teachers. These schemas (alternative
frameworks, misconceptions, preconceptions, phenomenological primitives)
might make much information incomprehensible and require special procedures to
alter them (diSessa, 1993; Howard, 1987; Slotta, Chi, & Joram, 1995). Some
means to determine students' misconceptions are interviews, analysis of students'
reasoning in problem solving, word associations (associations to a concept name
as indicators of the underlying conceptual structure), and concept mapping to
represent key concepts and relationships (Howard, 1987; Sutton, 1980).
General principles derived from cognitive research suggest that in order to
provide consistency between instruction and cognitive processes leading to expert
performance, instruction should be adapted to a learner's prior knowledge, and the
learner should be actively involved in development of the skills (Tannenbaum &
Yukl, 1992). Cognitive task analysis could be used to determine underlying
knowledge structures and cognitive skills required for the task. For example,
Gagne (1984) suggested identifying for each part of a task what a person must be
able to do in order to perform it. Reigeluth (1983) proposed a general-to-specific
approach which requires identifying the main idea followed by determining the
specific aspects of this idea. Broader concepts are subsequently differentiated into
more specific ones. Knowledge engineering methods that have been
developed in the field of artificial intelligence could also be used to extract expert
knowledge structures and use them in the design of instructional materials.
In the framework of a competency-based approach, van Merriënboer (1997)
developed the four-component instructional design model (4C/ID). This model
provides methods for analysis of complex cognitive skills, knowledge structures
required for performing the skills, and development of appropriate sequences of
whole task practice situations that would support acquisition of those skills. The
model takes into account the limited working memory processing capacity by
gradually increasing the level of cognitive load imposed by the sequences of
whole tasks (van Merriënboer, Kirschner, & Kester, 2003). A set of software tools
to assist designers in applying the 4C/ID methodology has also been developed
(de Croock, Paas, Schlanbusch, & van Merriënboer, 2002).
According to the 4C/ID methodology, cognitively complex learning
environments include four interconnected components: 1) learning tasks
organized in a simple-to-complex sequence of task classes with gradually
diminishing levels of support within each class (process of scaffolding); 2)
supportive information for more general aspects of the learning tasks that change
over different specific problem situations; 3) just-in-time (algorithmic)
information for invariant aspects of the learning tasks; and 4) part-task practice
providing additional repetitive practice for constituent skills that need to be
performed at a very high level of automaticity (van Merriënboer, Clark, & de
Croock, 2002).
Development of these four components requires deconstructing complex
skills to build an intertwined skills hierarchy; sequencing task classes around
authentic problem situations; analyzing mental models and cognitive strategies to
describe knowledge structures guiding non-recurrent aspects of competent
performance; analyzing rules and procedures, and prerequisite knowledge
supporting recurrent skills; and selecting appropriate timing of supportive and
procedural information presentation (Kester, Kirschner, & van Merriënboer, 2004;
van Merriënboer & Dijkstra, 1997; van Merriënboer, Jelsma, & Paas, 1992).
A cognitive approach clearly distinguishes between the actual expert
performance sequence and the instruction sequence (the sequence of learners'
activities designed to achieve desired instructional goals). Different instructional
approaches could be used in designing instruction for separate parts of
performance. For example, some skills might be developed initially to a high
degree of efficiency to free working memory for the following changes in
knowledge structures. In other cases, structures of conceptual knowledge could be
taught at the beginning followed by practice with complex procedures (Glaser,
1990).
Hierarchical, multi-level instructional sequences with explicit connections
between different levels of the hierarchy could be more appropriate for building
schematic knowledge structures than linear sequences (Reigeluth, 1983; Ausubel,
Novak, & Hanesian, 1978). The big picture (or central idea, overview of the
content) has to be presented first, followed by the specific knowledge. Moving
from a central idea to its elaboration and back (zooming in and out) results in the
acquisition of specific knowledge as part of a whole rather than as isolated
information. Building hierarchical schematic knowledge structures in memory
enhances retention and provides cognitive mapping of the material (Eylon & Reif,
1984).
For example, advance organizers assist learners in understanding the
organization of the content and relating it to already available knowledge. They
usually include a brief general introduction to the following material at a higher
level of abstraction using titles and graphical devices to highlight the hierarchical
structure (Ausubel et al., 1978; Mayer, 1983; Mayer & Bromage, 1980). Mayer
(1989a) demonstrated significant advantages of using conceptual models that
highlight the major parts, states, and actions in the system as well as the causal
relations among them. Such models helped learners to build internal models of the
system by directing attention toward the important conceptual information,
organizing the information and integrating it with existing knowledge. Similar
conclusions were derived from the theories of analogical transfer that have
pointed to the crucial role of learners' conceptual models in enabling transfer by
mentally running these models in various situations (de Kleer & Brown, 1981,
1983; Gentner & Stevens, 1983).
However, external conceptual models should be used cautiously when dealing
with more advanced students who may possess well-organized schemas in the
domain. The simplified conceptual models may conflict with the students' more
sophisticated knowledge structures and inhibit learning (Mayer, 1989a). White
and Frederiksen (1986) suggested that these conflicts could be overcome by an
appropriate instructional design that is based not only on the expert knowledge
structures, but also on the knowledge of expert-novice differences and transition
processes from novice to expert states. Cognitive conflicts between instruction-based
conceptual models and learners' internal knowledge structures may increase
processing demands on their limited working memory. Cognitive load factors
involved in complex cognitive performances and in the process of acquisition of
expertise will be considered in the following chapter.
Chapter 3
COGNITIVE LOAD PERSPECTIVE IN
INSTRUCTIONAL DESIGN
THEORETICAL AND EMPIRICAL BACKGROUND
OF COGNITIVE LOAD THEORY
The concept of mental load was initially introduced in the 1950s and was
based on the concept of a communication channel with limited capacity.
Overloading this channel means operating above the limits of one's capacity
resulting in errors or missed signals. An underload is associated with considerable
spare capacity. Capacity theory of human information processing in relation to
attention was originally developed to explain an operator's limited ability to
perform multiple activities simultaneously. Not specifying the nature of capacity
or resources, it provided an explanation for performance decrements that occurred
when the resource demands of the task exceeded the available supply (Kahneman,
1973; Navon, 1984).
Specifying the nature of capacity or resources requires adopting a specific
mechanism of human information processing or cognitive architecture. In the
framework of the standard basic model of cognitive architecture, the working
memory is associated with capacity limitations and cognitive resources
consumption. Studies of cognitive load phenomena during problem solving
clearly demonstrated that when cognitive load was greater than working memory
capacity, learning was difficult, and schema acquisition and rule automation were
inhibited. It was suggested that many traditional instructional materials were
ineffective because they ignored limitations of the human cognitive processing
system, especially the limited processing capacity of working memory. Cognitive
load theory (Sweller, 1988; 1989; 1993; 1994; 1999; 2003; 2004; Sweller &
Chandler, 1994; Sweller, van Merriënboer, & Paas, 1998; Chandler & Sweller,
1991, 1996; Paas & van Merriënboer, 1994b; Paas, Renkl, & Sweller, 2003, 2004)
determined some important cognitive principles relevant to processing
instructional information and their consequences for instructional design.
Two independent sources of cognitive load that place demands on working
memory capacity were initially proposed within the cognitive load theory.
Intrinsic cognitive load is determined by the intellectual complexity of the
instructional material to be learned. For example, operation of an intricate
electrical circuit might be much more difficult to learn than the working of any
individual element of this circuit. Extraneous cognitive load is imposed solely by
the format of instruction that can take various forms (written instructions,
practical demonstrations, etc.) and require different activities of learners (solving
problems, studying worked examples, etc.).
Germane load, which is caused by cognitive activities that contribute to
learning, has been introduced into the theory at a later stage to account for the
learning-relevant demands on working memory (Paas & van Merriënboer, 1994a;
Sweller, van Merriënboer, & Paas, 1998). For example, the cognitive load caused by
self-explanations during learning from worked examples is an instance of
germane load. Such activities would obviously increase cognitive load, but would
directly contribute to schema construction. The intrinsic, germane, and extraneous
cognitive load combined result in the total cognitive load imposed on a learner.
While intrinsic cognitive load is initially fixed for the learner, extraneous and
germane cognitive load can be manipulated by instructional design.
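The combination of the three sources of load described above can be written compactly. The symbols below (L for each load component and C_WM for working memory capacity) are notation assumed here for illustration; they do not appear in the original text, and the simple sum is a shorthand for "combined" rather than a measured quantity.

```latex
L_{\text{total}} = L_{\text{intrinsic}} + L_{\text{extraneous}} + L_{\text{germane}},
\qquad \text{with learning inhibited when } L_{\text{total}} > C_{\text{WM}}
```

Read this way, instructional design can lower the extraneous term or raise the germane term, while the intrinsic term is initially fixed for a given learner and material.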
Cognitive load theory asserts that intrinsic load is determined by the degree of
interactivity between individual learning elements. Any instructional material
consists of elements of information that should be processed by learners. An
element can be regarded as a learning item in its simplest form for a particular
learner (Chandler & Sweller, 1996). If the elements can be processed individually,
the information is considered low in element interactivity. It places little load on
working memory because each element can be learned independently. For
example, for a person learning individual words of a second language, intrinsic
cognitive load is low because little or no interaction exists between learning
elements. The task still might be difficult because there are many new words to
learn.
When learning elements need to be processed simultaneously, the material is
high in element interactivity. Learning the syntax of a language (the appropriate
order of words in a sentence) requires all the relevant words to be held in working
memory simultaneously. Because all words interact, a sentence can only be
understood if individual words and their relations are processed concurrently. This
cognitive processing may increase the burden on working memory (Chandler &
Sweller, 1996). The influence of element interactivity on cognitive load was
demonstrated by Maybery, Bain, and Halford (1986) and Halford, Maybery, and
Bain (1986). Using transitive inference problems (e.g., a is larger than b; b is
larger than c; which is the largest?), they provided evidence that cognitive load
was heaviest when learners attempted to integrate two premises. The integration
required considering all the elements (a, b, and c) and their relations concurrently,
bringing element interactivity to its highest point.
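The integration step in such a problem can be illustrated with a minimal sketch. The function name `largest` and the encoding of premises as (larger, smaller) pairs are assumptions made here for illustration and are not taken from the cited studies.

```python
def largest(premises):
    """Return the largest item implied by 'x is larger than y' premises.

    Each premise is a (larger, smaller) pair. No single premise answers
    the question: all premises and all items have to be considered
    together, which is what makes the task high in element interactivity.
    """
    items = {item for pair in premises for item in pair}
    smaller_items = {smaller for _, smaller in premises}
    candidates = items - smaller_items  # items never stated to be smaller
    if len(candidates) != 1:
        raise ValueError("premises do not determine a unique largest item")
    return candidates.pop()


# a is larger than b; b is larger than c; which is the largest?
print(largest([("a", "b"), ("b", "c")]))  # a
```

The answer only emerges once both premises are held together; either premise alone leaves one of the three elements unplaced.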
Thus, the intrinsic cognitive load caused by instructional material depends on
the level of interaction between the elements of that material. Material including
many elements, but with a low level of interactivity results in a low cognitive
load. Difficulty in learning such material might not be due to working memory
overload, but simply to the total number of elements to be learned. On the other
hand, instructional materials that consist of heavily interacting elements might
impose a significant cognitive load even if the number of elements is relatively
small. Even simple electrical circuits with a small number of components are
usually difficult to learn because the elements of the circuits are highly interactive
and must be learned as a whole and simultaneously (Sweller & Chandler, 1994).
Assume a person is learning a simple application of a wiring configuration
(the starter) for switching on/off a light by a single push on start/stop push buttons
(Figure 3). In isolation, all five elements used in the circuit (two push
buttons, a switch, a light, and a coil) might be well known and simple. Combined
in the circuit, they become interconnected and need to be considered
simultaneously to understand the operation of the circuit. For example, to find out
the state of the light (on or off) the learner should determine whether (1) the stop-
button is in its normally closed (not pushed on) position and (2) the switch is
closed. The state of the switch depends on whether (3) the coil is energized. The
state of the coil depends on whether (4) the start-button has been pushed on to
energize the coil initially and (5) the stop-button has not been pushed on to
interrupt the flow of current through the circuit.
Thus, the state of the light depends on the states of all other components of
the circuit. The number of elements and their relationships that must be
considered simultaneously in this case is five. It must be emphasized that this
estimation is based on the assumption that the learner is familiar with the
operation of separate components of the circuit. For example, she or he
understands that energizing the coil would close the switch, or that a single push
on the stop-button would open the circuit. Otherwise, the number of elements to
consider would expand considerably (Figure 4).
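The chain of dependencies described above can be made concrete with a small simulation. The following is only an illustrative sketch, not part of the original analysis: the class name `Starter` and its `step` method are hypothetical, and the circuit is idealized (instantaneous switching, no electrical detail). It shows that the state of the light cannot be read from any single component in isolation.

```python
from dataclasses import dataclass


@dataclass
class Starter:
    """Idealized model of the start/stop circuit in Figure 3 (names are illustrative)."""

    coil_energized: bool = False

    def step(self, start_pushed: bool, stop_pushed: bool) -> bool:
        """Update the circuit for the current button positions; return True if the light is on."""
        # The stop button is normally closed; pushing it opens the circuit.
        stop_closed = not stop_pushed
        # The switch is closed only while the coil is energized.
        switch_closed = self.coil_energized
        # The coil receives current through the start button or through the closed switch.
        self.coil_energized = stop_closed and (start_pushed or switch_closed)
        # The light is fed through the switch, which follows the coil.
        light_on = stop_closed and self.coil_energized
        return light_on


starter = Starter()
states = [
    starter.step(start_pushed=False, stop_pushed=False),  # idle
    starter.step(start_pushed=True, stop_pushed=False),   # start pushed
    starter.step(start_pushed=False, stop_pushed=False),  # start released
    starter.step(start_pushed=False, stop_pushed=True),   # stop pushed
    starter.step(start_pushed=False, stop_pushed=False),  # stop released
]
print(states)
```

In this sketch the light stays on after the start button is released because the closed switch keeps the coil energized, which is the kind of interaction a learner has to hold in mind all at once.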
Figure 3. Electrical circuit for switching on/off a light by start/stop push buttons.
[Diagram labels: Stop, Start, A, N, coil, switch, light.]
Figure 4. Components of the Starter.
[Diagram labels: Stop (normally closed push button), Start (normally open push button), switch 1 (normally open switch).]
The degree of element interactivity is not only determined by the nature of the
instructional material. It is also influenced by the learner's expertise in a particular
instructional domain and her or his pre-acquired schemas in this area. Because
schemas have a hierarchical structure, what is an element on one level may be a
complex structure when a lower-order schema level is considered. With the
development of expertise in a domain, lower-order schemas may become the
elements of a higher-order schema. In other words, with expertise the size of a
person's chunks (and the amount of information encapsulated within these
chunks) increases.
For example, a child learning to read must first acquire a schema for each
letter of the alphabet to be able to recognize this letter in a variety of situations.
These lower-order letter schemas later become the elements within higher-order
word schemas. For experienced readers, these schemas act as elements in phrases
and sentences (Sweller & Chandler, 1994). Higher-order schema acquisition
reduces cognitive load by reducing the number of interacting elements in working
memory. Many interacting elements for a novice may be a single element for an
expert.
For the learner in the above wiring circuit example, individual components of
the circuit (push buttons, switch, light, and coil) acted as pre-acquired schemas.
The learner was considered as an expert on the level of individual components.
Once the interactions of the components of the circuit (the Starter) have been
learned and the learner has become an expert in this class of electrical circuits,
these lower-order schemas become the elements of a higher-order schema (the
schema for a starter). This new schema can further act as a single element, and the
above-described element interactivity is no longer relevant.
If this advanced learner encounters a Starter configuration of electrical
components in a new wiring diagram, cognitive processing associated with these
components probably will be carried out with minimal cognitive effort. These
components will be considered as a single functional element in more complex
electrical circuits, with the function of turning the circuits on or off. The novice-
expert distinction is always relative to the instructional goals: experts possess
knowledge at the highest level targeted by the given set of instructional materials,
whereas novices may possess only some lower-level knowledge.
Thus, the intrinsic cognitive load imposed by the content of instructional
materials is determined by its subjective degree of element interactivity that in
turn depends on the learner's level of expertise. On the other hand, extraneous
cognitive load is associated with cognitive activity that a person is involved in
because of the way the task is organized and presented, rather than because the
load is essential for achieving instructional goals (Sweller, Chandler, Tierney, &
Cooper, 1990; Sweller & Chandler, 1994). For example, when some interrelated
elements of instruction (textual, graphical, audio, etc.) are separated over distance
or time, their integration might require intense search processes and remembering
some elements until other elements are attended and processed. Such processes
require additional resources and might significantly increase total cognitive load.
Consider the above simple electrical circuit (Figure 3) accompanied by the
following separate textual explanations:
1. The Starter consists of a start push button, a stop push button, and a
switch activated by the coil.
2. Pressing down the start push button closes the circuit and allows the
current to flow through the coil.
3. The energized coil closes the switch, which provides an alternative closed
circuit for the coil to that provided by the start push button. The start
push button now can be released without breaking the current flow
through the coil.
4. The light is operational, as the closed switch provides a closed circuit for
it.
5. To cease operation of the light the stop push button is pressed. The
circuit in the Starter is now open, the coil is no longer energized, and the
switch returns to its normal open position.
For a novice electrical trainee, understanding these instructions requires
integration of the text and diagram. This may involve holding segments of text in
working memory until corresponding components of the circuit's diagram are
located, attended to, and processed; or keeping some images of the diagram active
until corresponding fragments of the text are found, read, and processed. These
search-and-match processes are likely to significantly increase extraneous cognitive
load. Similarly, problem solving using means-ends analysis (see Chapter 1)
usually involves a large number of interacting statements in working memory
(e.g., interconnected subgoals and steps to solution). Such problem solving might
require significant cognitive resources that become unavailable for learning. If
learning is the goal of activity, then this cognitive demand should be considered as
an extraneous cognitive load.
According to cognitive load theory, when instructional material is
characterized by a low intrinsic cognitive load, the extraneous cognitive load due
to instructional design may be of little concern because total cognitive load may
not exceed working memory capacity. In contrast, a heavy intrinsic cognitive load
may be produced for many learners when instructional material is characterized
by a high degree of element interactivity. In such a situation, an additional
extraneous cognitive load caused by an inappropriate design can be harmful to
learning because total cognitive load may exceed a learner's working memory
capacity. There may be no cognitive resources available for meaningful learning,
and schema acquisition and automation could be inhibited. In order for learning to
be more effective, total cognitive load should be reduced.
Thus, when instructional material has a high intrinsic load (i.e., high element
interactivity) the extraneous cognitive load imposed by the instructional design
may be critical for learning (Chandler & Sweller, 1996; Sweller & Chandler,
1994). Studies in cognitive load theory and its instructional implications have
demonstrated that designing instructional materials in a way that reduces
extraneous cognitive load can significantly improve learning. A brief review of
those studies is presented in the following section. More extended reviews can
be found in Mayer (2005) and Sweller (1999).
In cognitive load research, working memory load has been measured through
various methods, including computational models (Sweller, 1988), instructional
processing times (Sweller, Chandler, Tierney, & Cooper, 1990), and dual-task
paradigms (Brünken, Plass, & Leutner, 2003, 2004; Chandler & Sweller, 1996).
For a comprehensive overview of cognitive load measurement methods, see Paas,
Tuovinen, Tabbers, & van Gerven (2003). The dual-task paradigms use
performance on a secondary task as an indicator of cognitive load associated with
learning on a primary task. Various simple responses can be used as secondary
tasks, for example, reaction times to some events such as a computer mouse click
(Britton, Glynn, Meyer, & Penland, 1982; Lansman & Hunt, 1982), or counting
backwards (Lindberg & Garling, 1982). For example, the secondary task used by
Chandler and Sweller (1996) consisted of recalling the previous letter seen on the
screen of a separate computer while encoding the new letter appearing after a tone
sounded. An important requirement is that a secondary task should affect the same
working memory processing system (visual and/or auditory) as the primary task;
otherwise, it may not be sensitive to changes in actual cognitive load.
Dual-task techniques for measurement of cognitive load in multimedia
learning were studied by Brünken, Plass, & Leutner (2003, 2004), Brünken,
Steinbacher, Plass, & Leutner (2002), and Plass, Chun, Mayer, & Leutner (2003).
The secondary task represented a simple visual-monitoring task requiring learners
to react (e.g., press a key on the computer keyboard) as soon as possible to a color
change of a letter displayed in a small frame above the main task frame. Reaction
time in the secondary monitoring task was used as a measure of cognitive load
induced by the primary multimedia instruction. The studies demonstrated the
applicability of the dual-task approach to measurement of cognitive load
experienced by each individual learner.
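As a rough illustration of how such a reaction-time measure might be summarized, the sketch below compares mean secondary-task reaction times in two instructional conditions against a baseline. Everything in it is an assumption made for illustration: the function name `secondary_task_load`, the use of a baseline (secondary task performed alone), and the numbers, which are made up and are not data from the cited studies.

```python
from statistics import mean


def secondary_task_load(reaction_times_ms, baseline_ms):
    """Mean slowdown on the secondary task relative to a baseline, in milliseconds.

    A larger slowdown is read here as a higher load imposed by the primary task.
    """
    return mean(reaction_times_ms) - mean(baseline_ms)


# Made-up reaction times (ms), for illustration only.
baseline = [310, 290, 300]      # secondary task performed alone
condition_a = [520, 480, 500]   # secondary task during instructional format A
condition_b = [400, 380, 420]   # secondary task during instructional format B

load_a = secondary_task_load(condition_a, baseline)
load_b = secondary_task_load(condition_b, baseline)
print(load_a, load_b)
```

With these invented values, format A slows the secondary task more than format B, which under the dual-task logic would be interpreted as format A imposing the higher cognitive load.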
Ratings of subjective mental effort associated with learning instructional
materials have been used in many studies recently, as they are easy to implement
and do not intrude on primary task performance. Furthermore, research indicates
that subjective measures of mental load are reliable and correlate highly with
objective measures (Moray, 1982; O'Donnell & Eggemeier, 1986). In addition to
the usual dependent measures such as processing time, test performance, and
practical task performance, subjective ratings of mental effort have been collected.
Participants are usually asked to estimate how easy or difficult instructions were
to understand by choosing a response option or a number on the scale, ranging
from extremely easy (1) to extremely difficult (7 or 9). The scales are usually
seven or nine point.
Measures of subjective load and test performance scores have also been
combined to generate instructional efficiency indicators calculated following Paas
and van Merriënboer's (1993) procedure. This approach allows estimation of the
relative efficiency of instructional conditions and the cognitive cost of instruction.
High efficiency occurs under conditions of low cognitive load and high-level test
performance, and low efficiency occurs under high cognitive load and low-level
test performance. Efficiency values can be calculated, for example, by converting
cognitive load and performance measures into z-scores (R and P) and combining
z-scores using the formula:
$$E = \frac{P - R}{\sqrt{2}}$$
The denominator $\sqrt{2}$ is used in this formula to allow an easy graphical
interpretation of the efficiency of instruction by representing the cognitive load
z-scores (R) and performance z-scores (P) in a cross of axes. The relative efficiency
of an instructional condition corresponding to a point (R, P) on the diagram can
then be measured as the distance from this point to the line of zero efficiency (E =
0) and calculated using the above formula. The high efficiency area (relatively
lower cognitive load with higher performance scores) with E > 0 is above the line
E = 0. The low efficiency area (higher cognitive load with lower performance
scores) with E < 0 is located below this line (Paas & van Merriënboer, 1993).
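The calculation can be sketched in a few lines of code. The example below reads the efficiency formula as E = (P - R)/√2, with P and R the z-scores of performance and mental effort. The function names, the choice of the population standard deviation for standardizing, and the sample values are assumptions made for illustration; they are not taken from the cited procedure or from any reported data.

```python
from math import sqrt
from statistics import mean, pstdev


def z_scores(values):
    """Standardize values using their own mean and population standard deviation."""
    m = mean(values)
    sd = pstdev(values)
    return [(v - m) / sd for v in values]


def efficiency(performance, effort):
    """Relative efficiency E = (P - R) / sqrt(2) for paired performance and effort scores."""
    P = z_scores(performance)
    R = z_scores(effort)
    return [(p - r) / sqrt(2) for p, r in zip(P, R)]


# Made-up scores for three learners, for illustration only.
performance = [4, 6, 8]   # test scores
effort = [7, 5, 3]        # mental effort ratings on a 9-point scale

E = efficiency(performance, effort)
print(E)
```

The first learner (low performance, high effort) falls in the low efficiency area with E < 0, the third (high performance, low effort) in the high efficiency area with E > 0, and the second lies on the line E = 0.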
Using such efficiency indicators may help to eliminate, for example, the
possibility that subjective ratings are merely measuring self-confidence or
subjective comfort levels rather than cognitive load. If learners rate the mental
effort of a task as low but perform well on tests (high efficiency), they are more
likely rating cognitive load rather than just self-confidence.
EFFECTS GENERATED BY COGNITIVE LOAD THEORY

The goal-free and worked examples effects. According to cognitive load
theory, means-ends analysis as a problem-solving strategy is associated with a
significant extraneous cognitive load that has to be eliminated or reduced in order
to facilitate learning. Evidence of interference between conventional problem
solving and schema acquisition was initially obtained in studies of solving
puzzle problems (Mawer & Sweller, 1982; Sweller, Mawer & Howe, 1982).
Confirmation was obtained from studies of mathematics and science problems
(Cooper & Sweller, 1987; Owen & Sweller, 1985; Sweller & Cooper, 1985;
Sweller, Mawer, & Ward, 1983).
Lewis and Anderson (1985) also demonstrated that conventional goal-directed
problem solving could prevent learning of essential aspects of a problem's
structure. The means-ends strategy involves different interconnected steps, such
as defining differences between problem states, finding operators to reduce those
differences, considering subgoals, etc., that might impose a significant cognitive
load. This effect may outweigh any possible benefits of learning from direct
problem solving (Gabrys et al., 1993).
Sweller and his associates assumed that extraneous cognitive load could be
reduced when novice learners' cognitive activities are directed to problem states
and their associated moves (Owen & Sweller, 1985; Sweller & Levine, 1982;
Sweller et al., 1983; Tarmizi & Sweller, 1988). The goal-free effect predicted a
reduction of extraneous cognitive load and facilitation of learning by using goal-
free or nonspecific goal problems. In such problems, the goal state is presented in
nonspecific form (e.g., "Calculate the values of as many parameters as you can"
instead of the traditional "Calculate the value of the parameter X"). A learner
concentrates on each problem state and any move that will get her or him to a new
problem state, and then the same applies to the next state, and so on. Because no
activities irrelevant to schema acquisition are involved, cognitive load is reduced
and learning is enhanced.
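The forward-working character of goal-free problem solving can be illustrated with a loose computational analogy. The sketch below is not the computational model of Sweller (1988); the function `work_forward`, the rule format, and the kinematics relations chosen are assumptions made for illustration. Starting from the givens, it repeatedly applies any relation whose inputs are already known, without ever referring to a specific goal.

```python
def work_forward(known, rules):
    """Goal-free working: derive every value reachable from the givens.

    Each rule is (target, input_names, formula). No goal is specified;
    any rule whose inputs are all known is applied.
    """
    known = dict(known)
    progress = True
    while progress:
        progress = False
        for target, inputs, formula in rules:
            if target not in known and all(name in known for name in inputs):
                known[target] = formula(*(known[name] for name in inputs))
                progress = True
    return known


# Illustrative relations for motion with constant acceleration.
rules = [
    ("avg_speed", ("s", "t"), lambda s, t: s / t),
    ("s", ("u", "a", "t"), lambda u, a, t: u * t + 0.5 * a * t ** 2),
    ("v", ("u", "a", "t"), lambda u, a, t: u + a * t),
]

# Givens: initial speed 0 m/s, acceleration 2 m/s^2, time 10 s.
result = work_forward({"u": 0, "a": 2, "t": 10}, rules)
print(result)
```

At each pass the procedure attends only to the current set of known values and the moves available from it, which mirrors the description of goal-free problems above; a means-ends solver would in addition have to track the goal and the differences between it and the current state.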
This assumption was supported by a computational model (Sweller, 1988)
and empirically confirmed in a variety of areas: puzzles, kinematics, geometry,
and trigonometry. The goal-free groups demonstrated reduced acquisition time
and errors, followed by superior performance on similar test and transfer
problems. Analysis of verbal protocols and written solutions showed that students
presented with conventional problems during the acquisition stage continued to
use the backward-working means-ends strategy on the test problems. Goal-free
groups worked forward, which provided evidence of acquired schemas (Sweller,
1989). Ayres (1993) demonstrated an increase in errors made by novices on the
subgoal stage in solving simple two-move geometry problems (with only two
unknown angles to be calculated) compared with the same problems in goal-free
presentation. This stage effect was explained by an increase of cognitive load at
the subgoal stage during application of means-ends analysis in conventional
problem solving.
A goal-free technique is highly effective for problems that have a limited
search space. For problems with a large search space, worked examples were suggested as
an alternative to conventional problem-solving techniques. A worked example
consists of a problem statement followed by all the appropriate steps to solution.
The worked examples effect predicted a reduction of extraneous cognitive load
and facilitation of learning by using relatively more worked examples instead of
solving numerous conventional problems. Studying worked examples requires the
learner to attend only to each problem state and its associated move.
Empirical evidence for the effectiveness of worked examples for learning and
their superiority over solving equivalent problems was obtained in multiple
experiments performed by Sweller and Cooper (1985), and Cooper and Sweller
(1987) using algebra transformation problems such as "for the
equation b(a + c)/e = d, express a in terms of the other variables." Reduction in
processing time during the acquisition stage in those experiments was followed by
superior test performance by worked example groups.
Similar results were obtained by Zhu and Simon (1987) who reported a
number of experiments that compared subjects learning only from worked
examples with those using a traditional lecture and problem-solving procedure.
The research demonstrated that students studying worked examples learnt more
quickly, being at least as successful as, and sometimes more successful than,
students learning by conventional methods. Zhu and Simon (1987) also reported
that the method of learning by examples had been successfully used in a Chinese
school with a class covering the three-year curriculum in algebra and geometry in
two years and at a slightly higher level of performance.
Replacing conventional problems with completion problems may also reduce
extraneous demand on working memory (completion problem effect; see van
Merriënboer, 1990; Sweller et al., 1998). A completion problem provides a
partial solution that should be completed by the learners. The partial solution
reduces the problem search space and focuses learner attention on problem states and
their associated solution steps, thus decreasing extraneous cognitive load.
Evidence of the effectiveness of worked examples and completion problems
in comparison with conventional instruction in terms of solving transfer problems
was also obtained by Paas (1992). His experiment in the area of statistics
problems demonstrated that a cognitive structure resulting from instruction
emphasizing practice with partly or completely worked-out problems is a more
efficient knowledge base for solving transfer problems than one resulting from
instruction based on conventional problem solving. Training that requires students
to study worked-out problems leads to less effort-demanding and better transfer
performance.
Smith and Goodman (1984) demonstrated that hierarchical instructions
containing explanatory schemas corresponding to steps of instruction (e.g.,
explaining what is the first subgoal, what is needed to accomplish this subgoal,
and so on) improved understanding in comparison with linear instructions that
contained only a linear sequence of steps after stating a general goal. Hierarchical
instruction groups in the experiment (assembling an electrical circuit) showed
more transfer, higher verbal recall and shorter reading times than the linear
groups. Worked-out examples based on such explanatory schemas may provide a
better instructional framework for representing chunks of steps and directing
memory search (Smith & Goodman, 1984).
Goal-free problems and worked examples both focus attention on problem
states and their associated moves and reduce cognitive load. However, Sweller &
Cooper (1985) did not observe a significant difference in performance on
transfer problems between students learning from worked examples and
conventional problem-solving practice. By simplifying the task and providing
more examples to study, Cooper & Sweller (1987) were able to obtain better
transfer for the worked examples group. Extensive practice is required to
automate problem-solving operators before any improvement can be observed for
different problems. Automation frees up cognitive capacity, allowing the trainee
to make appropriate generalizations. If transfer is an aim of instruction, an
extensive mix of worked examples and actual problem solving could be the most
effective instructional format (Cooper & Sweller, 1987; Gabrys et al., 1993).
The split attention effect. Cognitive load theory predicted that studying
worked examples could be superior to solving the equivalent problems because of
a reduction in extraneous cognitive load. Germane or instructionally productive
cognitive load caused by worked examples could be enhanced by adding
process-oriented information (the principled "why" and strategic "how" information) to
examples of complex cognitive activities with multiple solution steps (van Gog,
Paas, & van Merriënboer, 2004).
However, there are situations when worked examples themselves require
significant cognitive resources to be processed successfully. In some of these
situations, cognitive load could be reduced by restructuring examples, for
example, by breaking down explanations of complex solution procedures into
smaller elements that can be learned separately (Gerjets, Scheiter, & Catrambone,
2004). In many situations, even such smaller explanatory modules may impose
significant cognitive load. For example, in geometry, diagrams are usually
accompanied by brief textual statements and neither text nor diagrams are
intelligible in isolation. Such a worked example can be understood only by
mentally integrating corresponding statements in the text and on the diagram and
requires cognitive resources that are unrelated to learning. The imposed cognitive
load may eliminate any benefit of a worked example.
A series of experiments with circle geometry problems provided evidence for
this hypothesis (Tarmizi & Sweller, 1988). Worked examples in a conventional
format (i.e., text and diagram separate) demonstrated performance no better than
solving conventional problems. The geometry worked examples required students
to split their attention between diagram and text by searching and matching
elements from the text to the appropriate entities on the diagram, and failed to
facilitate schema acquisition and rule automation. Tarmizi and Sweller (1988)
demonstrated that the search and match process involved with the geometry
worked examples could be reduced if each textual statement was physically
located near its matching entities on the diagram. Physically integrating textual
information with the related diagram improved performance of the worked-
examples group significantly.
Similar results were obtained in experiments in kinematics (Ward & Sweller,
1990). Worked examples in kinematics usually consist of a problem statement
followed by sets of equations representing the worked problem solution. The
following example (Ward & Sweller, 1990) demonstrates a traditional worked
example:
A car moving from rest reaches a speed of 20 m/s after 10 seconds. What is
the acceleration of the car?
u = 0 m/s
v = 20 m/s
t = 10 s
v = u + at
a = (v - u)/t
a = (20 - 0)/10
a = 2 m/s²
To understand the worked example, the learner had to mentally integrate the
related sources of information and split her or his attention between the problem
information and the worked solution. An experiment conducted under normal
classroom conditions demonstrated that studying conventional kinematics worked
examples was no more effective than solving the equivalent problems. Ward and
Sweller (1990) found that the worked example effect took place when the
conventional worked examples were reformatted so that the problem solution was
integrated into the problem statement. For instance, the above example was
transformed into the following integrated format:
A car moving from rest (u) reaches a speed of 20 m/s (v) after 10 seconds (t):
[v = u + at, a = (v - u)/t = (20 - 0)/10 = 2 m/s²]. What is the acceleration of the
car?
Learners who studied integrated worked examples processed the material more quickly
and made significantly fewer errors on test items compared to the conventional
worked example group and the conventional problem-solving group.
Physical integration of related sources of information (statements, diagrams,
equations, etc.) decreases extraneous cognitive load by reducing search processes
involved with conventional split-source instructional formats. For example, in the
case of instruction on operation of the Starter circuit for switching on/off a light
by start/stop push buttons (Figure 3), an integrated instructional format is
represented in Figure 5.
Figure 5. Integrated diagram-and-text format of instruction on operation of the electrical
circuit for switching on/off a light by start/stop push buttons.
[In the figure, each of the five numbered textual explanations listed earlier is placed next to the part of the circuit diagram it describes. Diagram labels: Stop, Start, A, N, coil, switch, light.]
In general, the split-attention effect occurs when instructional material
requires learners to split their attention unnecessarily between multiple sources of
information. Sweller, Chandler, Tierney, and Cooper (1990) obtained the effect
using introductory teaching materials from coordinate geometry and computer
programming. Novice learners who studied instructions with text integrated at
appropriate points on the diagram spent less time processing the instructional
material and exhibited superior test performance with faster solution times and
fewer errors than learners who used conventional materials. Learners studying
integrated instructional formats also demonstrated superior performance on
transfer problems.
Chandler and Sweller (1991) demonstrated the effect using biology materials
in laboratory-based studies. One group received instructions on the blood flow
through the heart, lungs, and body in a conventional split-source format. Textual
explanations and the diagram depicting blood flow chains needed to be mentally
integrated in order to be understood. Another group received the same information
in an integrated format. The results of the experiment favored the integrated-
format group. Despite spending less time studying the instructions, this group
performed better than the conventional group on subsequent test problems.
Superiority of integrated instructions over the conventional instructions was also
demonstrated using introductory electrical engineering and Computer Numerical
Control (CNC) programming materials (Chandler & Sweller, 1991, 1992).
Sweller, Chandler, Tierney and Cooper (1990) extended their findings by
integrating two sets of related textual information rather than integrating text and
diagram. The instructions involved a series of commands for hand-operated
machines and comparable commands for a computer numerical controlled system.
To understand the conventional instructions, novice learners had to hold a
segment of the hand command text in working memory while searching for its
matching numerical control command. The integrated instructions had the
numerical control commands in brackets next to each hand-operated command.
The results demonstrated that the integrated group's performance on test items
was superior to that of the conventional group even though they spent less time
processing the instructional materials.
Chandler and Sweller (1992) replicated the effect with mutually referring
sources of text. They provided evidence that the manner in which experimental
papers in psychology are usually written results in split-attention between various
segments of the paper for inexperienced readers (i.e., educational psychology
students). If descriptions of an experimental group and a procedure (normally
found in a Method section) are integrated with the results associated with that
experimental group, the need for mental integration is eliminated and extraneous
cognitive load is reduced. The inexperienced students gained advantage from the
integrated version of a relatively simple report. The authors did not claim that the
traditional format was inadequate. Findings could have been different if expert
researchers, who know where to look for specific information, or reports that are
more complex, had been used.
A series of studies conducted by Mayer and his colleagues was related to the
split-attention effect (Mayer, 1997). It was found that instructions consisting of
separate text and unlabelled diagrams were less effective than diagrams that
contained labels that clearly connected text and diagram (Mayer, 1989b; Mayer &
Gallini, 1990). The labeled diagrams could be considered as a kind of physical
integration of the diagram and text, as both techniques reduce the need to search.
Research by Mayer and Anderson (1991, 1992) and Mayer and Sims (1994) on
animated instructions and the contiguity principle may be viewed as a temporal
example of the split-attention effect. The authors found that animation and related
narration were most effective when presented simultaneously rather than serially.
An integrated presentation of auditory and visual information was superior to their
successive presentation.
Thus, the split-attention effect has been tested with novice learners in a
variety of areas in both laboratory and realistic training settings, with different
types of related sources of information involved. Interestingly, physically
embedded textual narratives have been used for many years in comic books for
children, thus demonstrating their effectiveness in assisting children to
comprehend complex materials (most reading materials are cognitively
demanding for children). However, this technique was rarely used in general
instructional materials until its cognitive efficiency had been investigated and
appropriate recommendations suggested. A similar situation applies to the
redundancy effect that is considered next.
The redundancy effect. Research generated by cognitive load theory
indicates that integrated instructional formats are beneficial for learning if the
sources of mutually referring information need to be mentally integrated in order
to be understood. However, physical integration of text and diagram may not
always be appropriate. Often individual sources of information are self-contained,
i.e. provide all of the required information in isolation. For example, electrical
circuit diagrams might be intelligible without any reference to the accompanying
text. Understanding such circuits might occur without processing the textual
information. The elimination rather than integration of such redundant sources of
information could be beneficial for learning. If the redundant information is
integrated physically with essential information, learners have no choice but to
process it. This imposes an extraneous cognitive load that interferes with the
learning process.
Thus, integration of all disparate sources of information is not always
effective. Chandler and Sweller (1991) demonstrated the redundancy effect in the
areas of electrical engineering and biology. When text and diagrams did not have
to be mentally integrated in order to be understood, physically integrated
instructions were no more effective than conventional instructions. In a series of
laboratory experiments with electrical circuit instructions, novice learners who
were not specifically asked to integrate disparate sources of information required
less instruction time and performed better than learners who were specifically
instructed to mentally integrate related text and diagrams. One self-explanatory
source of information was superior to two redundant sources of information in
either a conventional or an integrated format (Chandler & Sweller, 1991).
These results were replicated in experiments with biology instructional
material. The self-contained diagram of blood flow through the human body with
arrows indicating the flow was used. Any additional statements placed on the
diagram were redundant. Results showed that physically integrating such
redundant text with the diagram interfered with learning. The integrated format
was less effective than separate diagram and text. Removing the text entirely
produced the best learning outcomes. Processing the redundant text requires
additional cognitive resources and imposes an extraneous cognitive load. If a
split-source format is used, students can reduce this load by ignoring the text
when they realize it is redundant (Chandler & Sweller, 1991). Using a paper-
folding task with primary school students, Bobis, Sweller, and Cooper (1993)
demonstrated that diagrams (rather than textual explanations) could be redundant
too.
The redundancy and split-attention effects were investigated using various
computer packages with novice learners (Sweller & Chandler, 1994; Chandler &
Sweller, 1996). From a cognitive load perspective, the conventional method of
instruction in software applications such as computer aided design/computer aided
manufacture (CAD/CAM), word processing, and spreadsheet packages might not
be efficient for learning. Instructions in conventional computer manuals require
using the computer keyboard and simultaneously paying attention to information
on the computer screen. The learners must split their attention when mentally
integrating information from the manual, screen, and keyboard. This split-
attention situation may result in a heavy extraneous cognitive load.
Cognitive load theory suggests eliminating the computer during the initial
instructional period and replacing it with diagrammatic representations of the
computer screen and keyboard with segments of textual instructions integrated at
their appropriate locations on the diagrams. Such modified integrated instructions
are intelligible without reference to the computer: the computer appears to be
redundant. A series of experiments involving an integrated manual only group, a
conventional manual plus equipment group, and an integrated manual plus
equipment group with materials in computer software applications and with
electrical engineering (electrical installation testing) instructions demonstrated all
the predicted effects with dramatic differences between the groups in both written
and practical skills (Sweller & Chandler, 1994; Chandler & Sweller, 1996).
For example, the manual on the CAD/CAM program used for the control of
industrial machinery covered such basic operations as moving the cursor, using
some basic menu functions, drawing lines, etc. The experiment included an
instructional phase followed by written and practical tests. Despite spending less
time studying their manuals, the integrated manual group was superior on most
written and practical test items. The practical task results were of special interest
in that study as the integrated manual only group had no previous experience with
the computer prior to the testing phase.
As discussed in the previous section, an additional extraneous cognitive load
caused by inappropriate design can be harmful to learning only when instructional
material is characterized by a high degree of element interactivity and
consequently may generate a heavy intrinsic cognitive load. In contrast, when
information has a low intrinsic cognitive load due to low element interactivity,
redesigning instructions to reduce extraneous cognitive load might not be as
crucial. Estimates of element interactivity require assessing the number of
elements that need to be processed concurrently, which in turn requires considering
the learner's knowledge level.
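The reasoning in this paragraph can be illustrated with a small Python sketch. It is a toy model only: the additive treatment of intrinsic and extraneous load follows the description above, but the function names, the arbitrary load units, and the capacity value are assumptions made for illustration and are not quantities reported in the cited studies.

```python
def total_load(intrinsic: float, extraneous: float) -> float:
    """Total working memory load, assuming the two sources simply add up."""
    return intrinsic + extraneous


def is_overloaded(intrinsic: float, extraneous: float, capacity: float) -> bool:
    """True when the combined load exceeds the (hypothetical) capacity."""
    return total_load(intrinsic, extraneous) > capacity


CAPACITY = 10.0     # hypothetical working memory capacity, arbitrary units
POOR_DESIGN = 4.0   # hypothetical extraneous load of a poorly designed format

# High element interactivity: the extra load pushes the total past capacity.
print(is_overloaded(intrinsic=8.0, extraneous=POOR_DESIGN, capacity=CAPACITY))

# Low element interactivity: the same poor design stays within capacity.
print(is_overloaded(intrinsic=2.0, extraneous=POOR_DESIGN, capacity=CAPACITY))
```

Under these assumed numbers, the same extraneous load matters only in the high element interactivity case, which is the pattern the paragraph describes.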
According to cognitive load theory, split-attention and redundancy effects
take place only if learning materials are characterized by a high level of element
interactivity. For example, learning to use a coordinate system in CAD/CAM
packages imposed a heavy intrinsic cognitive load on inexperienced learners
because they had to consider simultaneously different coordinates (Chandler,
Waldron, & Hesketh, 1988; Hesketh, Chandler, & Andrews, 1988). However,
such tasks as moving the cursor, using scales and grids, or using some basic menu
functions could be learned quite independently and involved little interaction
between elements.
No split-attention and redundancy effects were demonstrated by Sweller and
Chandler (1994) and Chandler and Sweller (1996) in areas of low element
interactivity, and the format of presentation was not significant when using such
materials. A modified self-contained manual format was beneficial only under
conditions of high levels of element interactivity. For the other two formats in
such conditions total cognitive load appeared to exceed the learners' available
cognitive capacity. Measures of cognitive load using the dual-task method
confirmed predicted differences in cognitive load in the Chandler and Sweller
(1996) study. The secondary task consisted of recalling the previous letter seen on
the screen of a separate computer while encoding the new letter appearing after a
tone sounded. Strong primary and secondary task effects (better test results with
better recall of letters) favored an integrated modified manual group in areas of
high element interactivity. No such effects were found in areas of low element
interactivity.
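One possible way of scoring such a secondary task is sketched below in Python. The scoring rule (proportion of tones at which the previously shown letter was correctly recalled) and the function name `secondary_task_accuracy` are assumptions made for illustration. The source describes the task but not the exact scoring procedure used in the study.

```python
def secondary_task_accuracy(shown, recalled):
    """Proportion of tones at which the previously shown letter was recalled.

    shown    -- letters in the order they appeared on the screen
    recalled -- learner's answers; recalled[i] is given when shown[i + 1] appears
    """
    if len(shown) < 2:
        raise ValueError("need at least two letters to score the task")
    if len(recalled) != len(shown) - 1:
        raise ValueError("expected one recall response per letter after the first")
    # Pair each answer with the letter that was on the screen before the tone.
    hits = sum(1 for previous, answer in zip(shown, recalled) if previous == answer)
    return hits / len(recalled)


# Hypothetical session: four letters shown, three recall responses collected.
letters = ["K", "T", "B", "M"]
answers = ["K", "T", "X"]
print(secondary_task_accuracy(letters, answers))
```

A lower accuracy on the secondary task would be read as a sign that the primary learning task left fewer cognitive resources available.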
Thus, when instructional materials are intellectually demanding, the
temporary elimination of the hardware may facilitate learning and reduce
instruction time. Elimination of the manual and placing everything on the screen
(as in computer-based training) may also be effective from the point of view of
cognitive load theory, if there are no other sources of extraneous cognitive load
(van Merriënboer & de Croock, 1992). However, in areas where motor
components and spatial-motor coordination are essential (e.g., typing or driving a
car), extensive practice with real equipment is always important (Sweller &
Chandler, 1994).
It was noted (Sweller, 1993; Sweller & Chandler, 1994) that forms of the
redundancy effect had been demonstrated on a large number of occasions in the
past. Reder and Anderson (1980) found that students could learn more from
summaries of textbooks than from the full chapter. Miller (1937) found that
presenting children with a word associated with a picture was less effective in
teaching children to read than the word alone. Saunders and Solman (1984),
Solman, Singh, and Kehoe (1992), and Wu and Solman (1994) demonstrated that
the addition of pictures to words interfered with learning. Schooler and Engstler-
Schooler (1990) found that having to verbalize a visual stimulus impaired
subsequent recognition performance. The requirement to verbalize could be
redundant and impose an extraneous cognitive load. Lesh, Behr, and Post (1987)
found that mathematical word problems become more difficult with additional
information in the form of concrete materials: processing these materials may
impose an extraneous cognitive load.
Holliday (1976) used a flow diagram to teach the nitrogen, water, oxygen and
carbon dioxide cycles to high school students. One diagram represented the
elements in the cycles as small pictures; another showed them as verbal labels.
Students studied either one of the diagrams, or one of the diagrams alongside a
text that presented the same material, or the text alone. On a multiple-choice
verbal test of comprehension, students who studied the diagram only
outperformed the other two groups. Students who were presented with text and
diagrams performed no better than those who studied just text. The advantages of
diagrams disappeared when they were used with text. Under these conditions, the
text appeared to be redundant.
The modality effect. Current theories of working memory consider capacities
to be distributed over several partly independent subsystems, for example,
separate auditory and visual modules (Baddeley, 1986; 1992; Penney, 1989;
Schneider & Detweiler, 1987). For example, Baddeley (1986) proposed a model
that includes three subsystems: a phonological loop, a visuospatial sketchpad, and
a central executive. The phonological loop processes auditory information (verbal
or written material in an auditory form), while the visuospatial sketchpad deals
with visual information such as diagrams and pictures. Penney (1989) proposed a
model of working memory (the "separate stream hypothesis") where the
processing of auditory and visually presented verbal items was carried out
independently by auditory and visual processors in working memory, and
provided a considerable body of research in support of this hypothesis.
Paivio's (1990) dual coding theory also suggests that information can be
encoded, stored and retrieved from two fundamentally distinct systems, one suited
to verbal information, the other to images. The two systems are interconnected
and may contribute additively to memory performance. If information is coded in
both the verbal and imaginal coding systems, memory for the information will be
enhanced. Alternatively, if information is coded in only one of the two systems,
the details will not be as easily recalled.
According to cognitive load theory, the split-attention effect might occur
when learners must mentally integrate two related sources of information and
this integration overburdens limited working memory capacity. When one of the
sources is presented in auditory form, there still should be mental integration of
the audio and visual information, but it may not overload working memory
capacity if working memory is enhanced by a dual-mode presentation. The dual-
mode presentation does not reduce extraneous cognitive load but rather increases
effective working memory capacity. The amount of information that can be
processed using both auditory and visual channels might exceed the processing
capacity of a single channel.
Thus, limited working memory may be effectively expanded by using more
than one sensory modality, and instructional materials with dual-mode
presentation (for example, a visual diagram accompanied by an auditory text) can
be more efficient than equivalent single modality formats. The modality effect
occurs when separate sources of non-redundant information otherwise requiring
integration are presented in alternate, auditory or visual, forms. Increasing
effective working memory by using more than one sensory modality produces a
positive effect on learning, similar to the effect of physically integrating separate
sources of information.
In a series of experiments using geometry instructional material, Mousavi,
Low, and Sweller (1995) found that a visually presented geometry diagram,
combined with auditorily presented statements, enhanced learning compared to
conventional, visual only presentations. They also used audio/visual instructions
where there was a written description of a geometry diagram rather than the
diagrammatic format. The modality effect was not just limited to diagram and
audio text, but also applicable to a written description accompanied by audio
textual statements.
Tindall-Ford, Chandler, and Sweller (1997) demonstrated that an audio
text/visual diagram (or table) format of instructions in elementary electrical
engineering was superior to purely visually based instructions. Measures of
subjective mental load and instructional efficiency estimates were used to support
the cognitive load interpretation of results. When separate instructions with low
and high element interactivity materials were compared, strong performance
differences were found in favor of an audio/visual format for the high element
interactivity instructions. There were no differences between audio/visual and a
visual only format for low element interactivity instructions. Jeung, Chandler, and
Sweller (1997) demonstrated that dual-mode presentations only enhanced learning
when an extensive visual search required for coordination of auditory and visual
messages was eliminated.
Mayer and his associates (Mayer, 1997; Mayer & Moreno, 1998; see also
Clark & Mayer, 2003; Mayer, 2001 for overviews) have conducted a large
number of experiments demonstrating the superiority of audio/visual instructions.
For example, Mayer and Anderson (1991) presented information on how a bicycle
tire pump works. There were four experimental conditions in this study. The first
group viewed an animation depicting the operation of a bicycle tire pump and
listened to simultaneous audio text, the second group was given only the audio
text without the animation, the third group was provided only the animation with
no audio component, and the fourth group received no formal training (control
group). Results of this experiment (measured by number of creative and detailed
solutions on the problem-solving test) showed that the first group outperformed
the other three groups. Further research (Mayer & Anderson, 1992; Mayer &
Sims, 1994) demonstrated that audio/visual instructions may only be superior
when the audio and visual information is presented simultaneously rather than
sequentially (the contiguity effect as an example of the temporal split-attention
effect).
Thus, according to cognitive load theory, learning might be inhibited when
learners split their attention when mentally integrating text and graphics.
However, when textual information is presented in auditory form, mental
integration may not overload working memory that is now enhanced by using
combined resources of the visual and auditory memories. Such a dual-mode
presentation might be used to circumvent cognitive load problems caused by split-
attention. Limited working memory may be effectively expanded by using more
than one sensory modality. The amount of information that can be processed
using both auditory and visual channels might exceed the processing capacity of a
single channel. It might be especially appropriate when other forms of integration
(e.g. physically integrated instructional formats) produce cluttered visual
presentations.
In fact, some techniques for effective expansion of working memory have
been traditionally used in instructional practice. For example, written reminder
notes used while performing simple arithmetic operations can be considered as a
form of external memory that actually enhances working memory capacity.
Schematic depiction of several sequential stages of a device operation in technical
manuals allows readers to free working memory from keeping states of different
parts of the device when tracing its operation. The success of dual mode
instructional presentations in traditional education may possibly be partly
attributed to the benefits of such presentations in optimizing working memory
load. Novice students usually prefer listening to oral explanations of new complex
diagram-based materials in geometry or engineering rather than reading such
explanations in textbooks.
In practice, however, many standard multimedia instructional presentations
use auditory explanations simultaneously with the same visually presented text.
From the point of view of cognitive load theory, such a dual-mode duplication of
the same information using different modes of presentation increases the risk of
overloading working memory capacity and might have a negative learning effect.
When auditory explanations are used concurrently with the same visually
presented text to explain a diagram, relating corresponding elements of visual and
auditory content of working memory may consume additional cognitive resources.
In such a situation, elimination of a redundant (duplicated) source of information
might be beneficial for learning.
Kalyuga, Chandler, & Sweller (1999) tested these suggestions with computer-
based multimedia instructions on theoretical aspects of soldering. Three
instructional formats were compared (visual text, audio text, and visual plus audio
text) using participants without any substantial knowledge of soldering. A
snapshot of the visual text instructional format is shown in Figure 6 (the fusion
diagram). The results confirmed the advantage of dual-mode presentations for
overcoming split-attention problems (audio text group outperformed visual text
group). The audio text group demonstrated a lower number of reattempts at
interactive exercises, a lower subjective rating of cognitive load and a higher test
performance score than each of the other two groups.
The results also demonstrated a disadvantage of dual-mode duplication of
information (audio text group outperformed visual text plus audio text group).
Inclusion of visually presented text simultaneously with the same text in an
auditory form imposed an additional cognitive load on those learners who
processed it. This result can be related to the work of Schooler and Engstler-
Schooler (1990) which demonstrated that verbalizations of visual stimuli impaired
subsequent recognition performance (the requirement to verbalize could be
redundant, producing cognitive overload).
Figure 6. Snapshot of the visual-only format of instruction for the fusion diagram. Adapted
from Kalyuga, Chandler, & Sweller (1999). Copyright 1999 John Wiley & Sons, Ltd.
Mayer, Heiser, & Lonn (2001) compared performance of students who
received a narrated animation explaining the formation of lightning storms and
students who received the same narrated animation along with concurrent on-
screen text. In two experiments, learners who received narration and animation
performed better on tests of retention and transfer than learners who received
animation, narration, and text. For learners who were required to coordinate and
simultaneously process written and spoken text, an excessive working memory
load could be generated.
In a series of three experiments involving a group of technical apprentices,
Kalyuga, Chandler, & Sweller (2004) compared the effects of simultaneously
presenting the same written and auditory textual information as opposed to either
temporally separating the two modes or eliminating one of the modes. The first
two experiments demonstrated that non-concurrent presentation of auditory and
visual explanations of a diagram proved superior in terms of ratings of mental
load and test scores to a concurrent presentation of the same explanations when
instruction time was constrained and system-controlled. The third experiment
demonstrated that a concurrent presentation of auditory and visual forms of the
same lengthy technical text (without the presence of diagrams) was significantly
less efficient in comparison with an auditory-only text.
The expertise reversal effect. The distinction between the split-attention and
redundancy effects might be based on the distinction between sources of
information that are intelligible or unintelligible in isolation. If a diagram and the
concepts it represents are sufficiently self-contained and intelligible in isolation,
then any text explaining the diagram is redundant and should be omitted in order
to reduce cognitive load. Alternatively, if the concepts or functions of a diagram
are not intelligible in isolation, then the diagram will require additional textual
information. This information should be integrated into the diagram, or presented
in auditory mode, in order to reduce cognitive load.
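The distinction drawn in this paragraph amounts to a simple decision rule, sketched here in Python. The function name `recommend_format`, the `audio_available` flag, and the choice to prefer audio whenever it is available are assumptions made for illustration. The text itself only states that the necessary information should be either integrated into the diagram or presented in auditory mode.

```python
def recommend_format(diagram_intelligible_alone: bool,
                     audio_available: bool = False) -> str:
    """Suggest a presentation format from the intelligibility of the diagram."""
    if diagram_intelligible_alone:
        # Any explanatory text would be redundant (redundancy effect).
        return "diagram only"
    if audio_available:
        # Assumed preference: use the auditory channel when it is an option.
        return "diagram with spoken explanation"
    # Otherwise avoid split attention by embedding the text in the diagram.
    return "diagram with integrated text"


print(recommend_format(diagram_intelligible_alone=True))
print(recommend_format(diagram_intelligible_alone=False))
print(recommend_format(diagram_intelligible_alone=False, audio_available=True))
```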
However, intelligibility of information always depends on the level of
expertise of the learner. For example, some information might not be intelligible
in isolation for less experienced learners and so require physical integration with
additional information to reduce an unnecessary working memory load. The same
information may be intelligible in isolation for more experienced learners who
have previously acquired schemas that allow all necessary inferences to be made.
If additional instructional explanations are provided for more experienced
learners, they are redundant and processing them may unnecessarily increase
cognitive load. Eliminating redundancy may be the best way to reduce cognitive
load in this situation. The split-attention effect might be replaced by the
redundancy effect as expertise develops.
Differences in learner knowledge base should be taken into account when
analyzing the sources of cognitive load. What constitutes an element and which
elements interact in learning entirely depends on a person's acquired schemas
related to material being learned. Many interacting elements for one person may
be a single element for another person with a sophisticated schema. For example,
components of the Starter in Figure 3 (two push-buttons, and a switch) can act as
many individual elements for novices, yet all these components may be
considered together as a single element (the Starter) by an experienced electrical
technician who has acquired the schema for the starter.
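The Starter example can be expressed as a short Python sketch that counts how many elements a learner has to deal with once acquired schemas are taken into account. The function `count_elements` and the representation of a schema as a named set of components are hypothetical simplifications introduced for illustration, not a measurement procedure taken from the literature.

```python
def count_elements(components, schemas):
    """Count elements to be processed, treating each fully matched schema as one chunk.

    components -- names of the individual components in the material
    schemas    -- mapping from a schema name to the components it chunks together
    """
    remaining = set(components)
    chunks = 0
    for parts in schemas.values():
        parts = set(parts)
        # A schema applies only if all of its components are present and unchunked.
        if parts and parts <= remaining:
            remaining -= parts
            chunks += 1
    return chunks + len(remaining)


starter = ["push-button 1", "push-button 2", "switch"]
novice_schemas = {}                    # no relevant schemas acquired yet
expert_schemas = {"Starter": starter}  # the Starter is a single unit for the expert

print(count_elements(starter, novice_schemas))
print(count_elements(starter, expert_schemas))
```

With these inputs the novice faces three separate elements, while the technician who holds the Starter schema faces one.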
If learners have sufficient knowledge to understand the circuit diagram, the
textual explanations might be redundant for these learners. They may prefer to
ignore the text but may have difficulty doing so if the text is integrated into the
diagram (Figure 5), resulting in a higher cognitive load. In this situation, the best
instructional format with the lowest unnecessary cognitive load for these learners
may be a diagram alone format (the redundancy effect). On the other hand, if the
circuit diagram is not intelligible in isolation, then additional necessary text
should be presented in an integrated rather than a conventional split-source format
(split-attention effect).
Participants in most previously reviewed studies that originally demonstrated
the split-attention effect (e.g., Tarmizi & Sweller, 1988; Ward & Sweller, 1990;
Sweller et al., 1990; Chandler & Sweller, 1991; 1992) did not have the higher
order schemas for chunking interacting elements into single units. For these
learners, the level of intrinsic element interactivity appeared to be high enough to
make the total cognitive load overwhelming. In such situations, reduction of
extraneous cognitive load caused by split-attention became critical.
However, it was observed at the early stages of investigating the modality
effect that differences between subjects in their domain-specific knowledge
clearly influenced the effect. For example, Mayer and Gallini (1990) and Mayer
and Sims (1994) found that only inexperienced novice students showed a strong
contiguity (split-attention) effect and benefited from instructions that coordinated
the presentation of verbal explanations and visual depictions. There were no
improvements demonstrated for high-experience learners who were able to
compensate for uncoordinated instruction by using their long-term memory
knowledge.
Kalyuga, Chandler, & Sweller (1997, 1998) demonstrated that the level of
learner expertise is a critical learning condition that determines whether split-
attention is a problem and relates split-attention to redundancy. Direct physical
integration of text and diagrams was investigated in those studies. Fragments of
textual explanations were directly embedded into electrical wiring diagrams
similar to that in Figure 5. In the first experiment, instructional information was
presented to learners with very limited experience in the domain. In this case, the
split-attention rather than the redundancy effect was obtained. Students had great
difficulty learning from a diagram alone. A diagram with associated text in a
conventional, split-attention format was less difficult, and a physically integrated
format was the most effective. Evidence that the effects were caused by cognitive
load factors came from subjective rating scales. Students found the integrated
diagram and text materials easier to process and also performed at a higher level
on the subsequent tests, resulting in substantially higher instructional efficiency
measures using Paas and van Merriënboer's (1993) metric.
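The efficiency metric combines standardized performance and standardized mental effort ratings, and is commonly stated as E = (z_performance - z_effort) / sqrt(2). The Python sketch below computes it for a small set of made-up observations. The data, the function names, and the use of the population standard deviation over the pooled sample are assumptions made for illustration. The original studies may have standardized their scores differently.

```python
import math
from statistics import mean, pstdev


def z_scores(values):
    """Standardize values against their own mean and population SD."""
    m = mean(values)
    s = pstdev(values)
    if s == 0:
        raise ValueError("z-scores are undefined when all values are equal")
    return [(v - m) / s for v in values]


def instructional_efficiency(performance, effort):
    """Return E = (z_performance - z_effort) / sqrt(2) for each observation."""
    if len(performance) != len(effort):
        raise ValueError("performance and effort must have the same length")
    z_perf = z_scores(performance)
    z_eff = z_scores(effort)
    return [(p - e) / math.sqrt(2) for p, e in zip(z_perf, z_eff)]


# Hypothetical test scores and mental effort ratings for three learners.
efficiency = instructional_efficiency(performance=[2, 4, 6], effort=[6, 4, 2])
print(efficiency)
```

High performance achieved with low rated effort gives a positive efficiency value, while low performance with high effort gives a negative one.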
Two subsequent experiments were designed to observe alterations in relative
performance between the conditions as learners' level of expertise increased.
These experiments tested the same learners who participated in the first
experiment over a sufficient period to allow a substantial development of
expertise. Experiment 2 used the same three conditions (however, with different,
more complex wiring diagrams) as were used in Experiment 1. After extensive
training in the domain, an interaction effect was obtained: the effectiveness of the
integrated diagram and text condition decreased while the effectiveness of the
diagram alone condition increased.
Experiment 3 provided additional training to the point where substantial
differences between an integrated diagram and text condition and a diagram alone
condition were obtained, providing evidence of the redundancy effect. With
experienced learners, the inclusion of the text interfered with learning. Students
found the diagram alone materials easier to process and also performed at a higher
level on the subsequent tests, resulting in substantially higher efficiency ratings
using Paas and van Merriënboer's (1993) metric. Subjective rating scales and
efficiency data confirmed that the cognitive load profile of these two conditions
was essentially the reverse of that obtained in Experiment 1 with novice learners
(Figure 7).
[Figure 7: two panels, "Performance" and "Mental effort", each comparing novices and experts for two formats: diagram with integrated text and diagram only.]
Figure 7. An interaction between instructional designs and levels of learner expertise in
Kalyuga, Chandler, & Sweller (1998).
Thus, the efficiency of an instructional design depends on levels of learner
expertise in a domain, with trainees gaining optimal benefits from different
formats at different levels of expertise. Similar patterns of results were obtained in
other studies (Kalyuga, Chandler, & Sweller, 2000; 2001; Kalyuga, Chandler,
Tuovinen, & Sweller, 2001; Tuovinen & Sweller, 1999). They will be reviewed in
more detail in the following chapter. Collectively, these studies provided evidence
for the existence of an interaction between instructional designs and levels of
learner expertise in a domain, or the expertise reversal effect (Kalyuga, 2005;
Kalyuga, Ayres, Chandler, & Sweller, 2003).
The expertise reversal effect can be related to work on aptitude-treatment
interactions (e.g., Cronbach & Snow, 1977; Lohman, 1986; Mayer, Stiehl, &
Greeno, 1975; Shute, 1992; Snow, 1989, 1994; Snow & Lohman, 1984).
Aptitude-treatment interactions (ATIs) occur when different instructional
treatments result in differential learning rates depending on student aptitudes
(knowledge, skills, learning styles, personality characteristics, etc.). In the
expertise reversal effect, prior knowledge of students is the aptitude of interest.
Procedures and techniques designed to reduce extraneous working memory
load such as integrating textual explanations into diagrams to minimize split-
attention, replacing visual text with auditory narration, or using worked examples
to increase levels of instructional guidance, were found to be more efficient for
less knowledgeable learners. With the development of expertise in a domain, such
procedures and techniques often became redundant. If more knowledgeable
learners could not avoid or ignore redundant sources of information, those sources
were hypothesized to have imposed an additional cognitive load resulting in
negative rather than positive or neutral effects. A cognitive load interpretation of
the effect was supported by subjective rating measures of mental load.
Knowledgeable learners found it more difficult to process instructional formats
and procedures involving redundant components because of additional,
unnecessary information that they had to attend to and integrate with their
available schemas to find congruity between prior knowledge and incoming
instruction.
Studies of instructional design procedures of completion (van Merriënboer,
1990; 1997), scaffolding (van Merriënboer, Kirschner & Kester, 2003), and fading
(Atkinson, Derry, Renkl & Wortham, 2000; Renkl, 1997; Renkl & Atkinson,
2003) supported the expertise reversal effect. These procedures provide novice
learners with considerable support in the form of worked examples but gradually
reduce this support as levels of expertise increase.
Research by McNamara, Kintsch, Songer, and Kintsch (1996) can also be
related to the expertise reversal effect. Using high school biology instructional
materials, McNamara et al. (1996) found that additions to the original
instructional text designed to increase text coherence benefited only low-
knowledge readers. High-knowledge readers benefited from using the original text
only, which was labeled by the authors as a minimally coherent format. These
results can be interpreted from a cognitive load perspective. It could be suggested
that the high-knowledge learners may not require any additional explanatory
information as they found the original text intelligible without additional material
and relatively low in cognitive load, thus exhibiting a redundancy effect. On the
other hand, low knowledge learners required additional information to understand
the original instructions. Thus, text coherence may be a function of expertise of
the learners. Whereas the text may be minimally coherent for low knowledge
learners and therefore require additional explanations, it may well be fully
coherent in isolation for high knowledge learners.
LEVELS OF EXPERTISE AND OPTIMIZATION OF
COGNITIVE LOAD IN INSTRUCTION
Activities that learners are involved in during instruction alter as their levels
of expertise change. In the absence of any external guidance, novices process
unfamiliar information in working memory using mostly unorganized search
procedures. Working memory must be (and it is likely it has evolved to be)
limited to reduce the number of processing paths to manageable levels. In
contrast, previously learned material, held in long-term memory by experts, is
already appropriately organized, and there are no apparent working memory limits
when experts deal with well-learned information according to their previously
acquired schemas (Sweller, 2003, 2004).
The acquisition of expertise is a complex multi-level process involving, on
the one hand, continuous construction of new schemas, during which working
memory limitations are severe, and on the other hand, the retrieval from long-term
memory and application of previously acquired schemas, during which there are
no discernible working memory limits. The coordination of the vastly different
activities of schema construction and schema application should be taken into
account as a major consumer of cognitive resources that requires an appropriate
share of working memory capacity.
Cognitive activities of novice learners are mostly directed towards
construction of new schemas, while more knowledgeable learners retrieve and
apply previously acquired schemas to the newly encountered situations within
well-learned domains. If more knowledgeable learners are presented with
instructions intended for schema construction purposes, those instructions may
conflict with currently held schemas resulting in redundancy and expertise
reversal effects. "...Less skilled students are most likely to benefit from direct
instruction in how to construct a conceptual model for the to-be-learned material,
whereas more skilled students are likely to already possess and spontaneously use
sophisticated conceptual models that may conflict with models presented during
instruction" (Mayer, 1989a, p. 44).
The simultaneous use of different (schema-based and instruction-based)
cognitive constructs when dealing with the same units of information may require
knowledgeable learners to unnecessarily expend extra cognitive resources
compared to instruction that relies more on pre-existing schemas for guidance.
Cross-referencing and integration of related redundant components that many
experienced learners may attempt to carry out might decrease cognitive resources
available for meaningful learning or even cause a cognitive overload. This
additional expenditure of cognitive resources may occur even if a learner
recognizes the redundancy and tries to ignore the redundant information.
Thus, instructional guidance, which may be essential for novices, may have
negative consequences for more experienced learners, and it may be preferable to
eliminate the instruction-based guidance for these learners. When a well-guided
instruction is beneficial for novices (resulting in better performance compared to
performance of novices who learned without such guidance) but disadvantageous
for more experienced learners (resulting in poorer performance compared to
performance of experts who learned from guidance-lean instruction), this is a case
of the expertise reversal effect.
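The definition above describes a crossover pattern across two expertise levels and two levels of guidance. A minimal sketch of that check follows; the class and function names are illustrative assumptions, the means are hypothetical, and the sketch compares raw means only, without the significance testing an actual study would require.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GroupMeans:
    """Mean test performance for one expertise level under two formats."""
    guided: float    # learned from well-guided instruction
    unguided: float  # learned from guidance-lean instruction


def is_expertise_reversal(novices: GroupMeans, experts: GroupMeans) -> bool:
    """True when guidance helps novices but hurts experts (crossover pattern)."""
    helps_novices = novices.guided > novices.unguided
    hurts_experts = experts.guided < experts.unguided
    return helps_novices and hurts_experts


# Hypothetical means, not data from any cited study.
crossover = is_expertise_reversal(GroupMeans(6.0, 4.0), GroupMeans(5.0, 6.5))
no_crossover = is_expertise_reversal(GroupMeans(6.0, 4.0), GroupMeans(6.5, 6.4))
print(crossover, no_crossover)
```

In the second hypothetical case the advantage of guidance merely shrinks for experts, so the pattern does not count as a reversal under this definition.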
For example, textual materials that were essential for novices in order to
understand wiring diagrams in Kalyuga et al. (1998) (e.g., Figure 5), became
redundant when presented to more experienced learners. Experts who acquired
considerable high-level schemas in their area of expertise may not require any
additional explanations. If explanations, nevertheless, are provided, processing
this redundant information and integrating this information with available
schemas may increase the load on limited capacity working memory.
Processing redundant information that is not necessary, because it either deals
with issues already dealt with elsewhere or is already known by the learner, may
cause an unnecessary working memory load that hinders instructional outcomes. For
experts, an instructional format with redundant material eliminated may prove
superior to a format that includes the redundant material. A minimalist
redundancy-free format may result in better learning with less cost in terms of
instruction time and cognitive effort.
Using appropriate instructional procedures and removing redundant activities

at each level of learner expertise would minimize interfering or unnecessary
cognitive load and increase proper or germane load. Such a process may be
generally considered as a procedure of optimization of cognitive load in
instruction at each stage of the acquisition of expertise in a domain. A perfectly
optimized instructional sequence is schematically represented in Figure 8, which
shows relative proportion of instruction-based and schema-based guidance at
different levels of learner expertise. At each level of expertise, instructional
guidance is provided for those (and only for those) components of information
that are unfamiliar to learners at this level of expertise and could not be
supported by previously acquired schemas.
[Figure 8 diagram: relative proportions of instruction-based and schema-based guidance across levels of learner expertise, from novices to experts.]
Figure 8. Optimized instructional sequence.
In contrast, non-optimal instruction may have gaps corresponding to

information that is supported by neither instructional guidance nor learner internal
schematic knowledge structures (Figure 9). To make sense of this information,
learners would have to apply cognitively expensive problem-solving search
processes such as means-ends analysis. Non-optimal instruction may also have
overlaps corresponding to information that is supported by both schematic
knowledge structures and external instructional guidance. These sources of
cognitive support then need to be cross-referenced and integrated by the learner in
working memory resulting in an excessive cognitive load.
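The gaps and overlaps described above can be expressed as simple set operations. The following is a minimal sketch under the assumption that a task can be broken into labeled components; the component labels and function names are hypothetical and serve only to illustrate the distinction.

```python
def classify_support(components, instructed, schema_backed):
    """Split task components by the source of guidance that covers them.

    All arguments are collections of hypothetical component labels.
    """
    components = set(components)
    instructed = set(instructed) & components
    schema_backed = set(schema_backed) & components
    return {
        # Supported by neither source: learner must fall back on search.
        "gaps": components - instructed - schema_backed,
        # Supported by both sources: learner must cross-reference them.
        "overlaps": instructed & schema_backed,
        "instruction_only": instructed - schema_backed,
        "schema_only": schema_backed - instructed,
    }


def is_optimized(components, instructed, schema_backed):
    """True when every component has exactly one source of guidance."""
    result = classify_support(components, instructed, schema_backed)
    return not result["gaps"] and not result["overlaps"]


# Hypothetical components of a task for a learner at an intermediate level.
support = classify_support(
    components={"a", "b", "c", "d"},
    instructed={"a", "b", "c"},
    schema_backed={"c"},
)
print(sorted(support["gaps"]), sorted(support["overlaps"]))
```

Here component "d" is a gap and component "c" is an overlap, so the sequence is not optimized in the sense of Figure 8.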
For example, in studies of Kalyuga, Chandler, and Sweller (2001) and
Kalyuga, Chandler, Tuovinen, and Sweller (2001), worked examples were
cognitively optimal instructional formats for novice learners because they
substituted for missing schemas by organizing information in working memory.

At intermediate levels of expertise, a mix of detailed examples for supporting
construction of higher level schemas not yet available and problem statements or
diagrams for supporting retrieval of previously acquired lower level schemas
could be optimal. At higher levels of learner knowledge, most cognitive activities
are based on using previously acquired schemas that can organize knowledge
elements in working memory. Activities designed to support construction of these
schemas at high levels of expertise might be redundant and inefficient.
Using the framework of the basic cognitive architecture (Figure 1), cognitive
structures and processes for learners with different levels of expertise learning
from instructions with different levels of guidance are graphically represented in
Figures 10 - 13. When novices without appropriate higher level schemas in a
domain learn from instruction with a high level of guidance (a cognitively optimal
situation), only some low-level schemas could be activated in long-term memory.
All necessary executive support in constructing new knowledge comes from the
external instructional guidance (red-colored components in Figure 10). If such
guidance is not available (a cognitively non-optimal situation), novice learners
have to engage in cognitively inefficient problem-solving search processes that
would result in less learning, if any (Figure 11).
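[Figure 9 diagram: instruction-based and schema-based guidance across levels of learner expertise, from novices to experts, with one region labeled "problem solving search" and another labeled "cross-referencing of schemas and instructions".]
Figure 9. Non-optimized instructional sequence.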

[Figure 10 diagram: working memory (executive function constructing integrated mental representations of a current situation or task; verbal and pictorial information sub-systems), long-term memory (low-level schemas; new knowledge), and an instructional presentation of text and pictures with instructional guidance.]
Figure 10. Cognitive structures and processes for novices learning from well-guided
instruction.
[Figure 11 diagram: working memory (executive function; verbal and pictorial information sub-systems; problem solving search), long-term memory (low-level schemas; new knowledge), and presented text and pictures.]
Figure 11. Cognitive structures and processes for novices learning from non-guided
instruction.
[Figure 12 diagram: working memory (executive function; verbal and pictorial sub-systems), long-term memory (low-level schemas; high-level schemas; new knowledge), and presented text and pictures.]
Figure 12. Cognitive structures and processes for experts learning from non-guided
instruction.
However, if instruction with a low level of guidance is presented to experienced

learners (a cognitively optimal situation), their previously acquired high-level
schemas might provide all necessary executive support in constructing new
cognitive representations (blue-colored components in Figure 12). If experienced
learners are provided with instruction with a high level of guidance (a cognitively
non-optimal situation), both external instructional guidance and previously
acquired schemas would support constructing the same components of new
knowledge. Corresponding cognitive units in working memory could partially
overlap and conflict with each other (Figure 13).
Thus, there are strong theoretical arguments based on cognitive load theory
and accumulated empirical evidence that suggest that instructional techniques that
are highly efficient with less experienced learners may lose their efficiency and
even have negative instructional consequences when used with more experienced
learners. Therefore, instructional techniques and procedures may need to change
radically as learners acquire more expertise in a domain. Accordingly,
optimization of cognitive load in instruction should involve not only presenting
appropriate information at the appropriate time during the continuing process of
acquiring expertise, but also the timely removal of redundant information as
levels of learner expertise increase. Specific examples and empirical
research-based principles of optimizing cognitive load for more advanced learners
will be described in the following chapter.
[Figure 13 diagram: working memory (executive function; verbal and pictorial sub-systems), long-term memory (low-level schemas; high-level schemas; new knowledge), and presented text and pictures with instructional guidance.]
Figure 13. Cognitive structures and processes for experts learning from well-guided
instruction.
Chapter 4

COGNITIVE LOAD PRINCIPLES IN INSTRUCTIONAL DESIGN FOR ADVANCED LEARNERS

ELIMINATING REDUNDANCY IN MULTIMEDIA INSTRUCTION

Multimedia instruction refers to the instructional presentations that use both
text (printed, on-screen, and/or spoken) and pictures (still or animated) (Mayer,
2001; 2005). It is assumed that information processing and the learning process
are facilitated when the text is accompanied by pictures (Hegarty & Just, 1989;
1993; Levin & Mayer, 1993; Mayer, 1989b, 1993; Mayer & Anderson, 1992;
Mayer & Sims, 1994). Comparative studies of textual and diagrammatic
representations by Larkin and Simon (1987) showed that diagrams could be better
representations not because they contain more information, but because they
provide indexing of this information in a manner that is more cognitively
efficient. In diagrams, information is organized by location. Most of the
information needed at any time is presented at a single location, each element may
be located beside any number of other elements, and little search is required. In a
text, the information is indexed by the position, with each element neighboring
only the next element in the list. Diagrams group together all information that is
used, thus avoiding large amounts of search for the elements (Larkin & Simon,
1987).
Studies in various domains have demonstrated that the instructional
advantages of diagrams depend on student characteristics such as general ability,
spatial reasoning, verbal ability, vocabulary, age, and gender (Winn, 1987). A
general pattern is that high-ability students are usually in less need of explicit
descriptions of the elements and their relations in graphic form than low-ability
students. The presence of a diagram may help the less able students but not the
high-ability students. Winn (1987) also noted that because there are limits to the
amount of information about elements and their relationships that low-ability
students can take in, the graphics might impose additional processing load for the
low-ability students. However, he did not study effects of restructuring
instructional materials to reduce this processing load, for example, effects of
integration of textual instructions at their appropriate locations on the diagram.
Prior knowledge has also been considered as an important factor contributing
to individual differences in the effect of instruction based on text and visual
displays (Schnotz, 2002). For example, when using graphics, more knowledgeable
learners (university students) concentrated on information that was relevant to the
construction of a mental model without calling upon textual information (Schnotz,
Picard, & Hron, 1993).
A number of studies of individual differences in learning from text and
graphics demonstrated that the instructional advantages of diagrams depended on
student domain-specific knowledge and experience. Acquired schemas allow
more knowledgeable learners to avoid processing overwhelming amounts of
information and reduce load on limited working memory. For example,
experiments by Hegarty and Just (1989) demonstrated that high-mechanical-
ability learners comprehended both a text alone and a diagram alone at a higher
level than low-mechanical-ability learners. It was assumed that high-mechanical-
ability learners were able to locate the relevant information in a diagram and were
less dependent on an accompanying text. The way a text and a diagram are
processed by high- and low-mechanical-ability learners depends on required
cognitive effort.
High-mechanical-ability learners might require less effort to construct a
mental representation from either medium alone. Since switching between
processing a text and processing a diagram requires considerable additional
cognitive effort, these learners are able to reduce it by switching less often
between the two media than low-mechanical-ability learners. The high-
mechanical-ability learners inspected the diagram rarely. They were able to hold
its representation in working memory because it contained fewer chunks for them.
Available schemas for mechanical systems made it less necessary for high-
mechanical-ability students to inspect the diagram in order to construct a
representation (Hegarty, Just, & Morrison, 1988; Hegarty & Just, 1993).
Lacking proper schemas, the low-mechanical-ability learners had to look at
the diagram each time a new piece of textual information referred to the diagram.
These learners switched often between the text and diagrams. Mental integration
of text and graphics occurred in small units that were manageable within the
capacity of learners' working memory. Later these units were combined at a
higher level, with the diagram used as an external representation that helped the
learners to free up resources necessary for integration (Hegarty & Just, 1993).
Glenberg and Langston (1992) and Kieras (1992) also noted the role of a
diagram as an external memory aid that frees up working memory resources of
low-ability students while processing a text and a diagram.
There are other forms of graphic representation of information that have been
studied extensively, such as various types of cognitive maps (two-dimensional
node-link diagrams), for example knowledge maps (Lambiotte, Skaggs, &
Dansereau, 1993), concept maps (Novak & Gowin, 1984), or semantic maps
(Johnson, Pittelman, & Heimlich, 1986). Evidence has also been obtained that
using cognitive maps is more beneficial to students with lower prior knowledge or
verbal ability (Lambiotte & Dansereau, 1992; Rewey, Dansereau, & Peel, 1991).
Cognitive maps may interfere with learning and would not be of much help when
learners have already constructed schemas of the instructional material.
When learners process both text and pictures, they have to mentally integrate
verbal and pictorial representations in order to achieve understanding. The
previous chapter described studies that demonstrated that when text and pictures
are not synchronized in space (located separately) or time (presented after or
before each other), the integration process may increase working memory load
and inhibit learning due to cross-referencing different representations. Physically
integrating verbal and pictorial representations may eliminate this split-attention
effect (Chandler & Sweller, 1991; 1992; 1996; Mayer & Anderson, 1991; 1992;
Mayer & Gallini, 1990; Sweller, Chandler, Tierney, & Cooper, 1990; Tarmizi &
Sweller, 1988; Ward & Sweller, 1990). Sections of written text could be
embedded directly in the diagram in close proximity to relevant components of
the diagram, and segments of narrated text could be presented simultaneously
with the diagram (or relevant animation frames).
When instructing learners who are more experienced in a specific domain,
there could be a situation when the source of information (textual or pictorial) that
is essential for a novice learner may be redundant for a more knowledgeable
person. In the physically integrated format, processing the redundant information
and integrating that information with the learner's schemas could be difficult to
avoid. Attending to and integrating redundant information with existing schemas
requires a share of cognitive resources that becomes unavailable for the
construction and refinement of new schemas. Therefore, in the case of more
advanced learners, elimination of redundant verbal or pictorial information might
be the optimal format of instruction.
Research by Mayer and Gallini (1990) and Mayer, Steinhoff, Bower, and
Mars (1995) on learning from text and illustrations demonstrated that
well-designed text-with-pictures instructional formats were more helpful for
low-knowledge learners than for high-knowledge learners. With increases in
learners' knowledge in a domain, beneficial effects of such presentations
disappeared. For
example, these studies found that differences between subjects in their domain-
specific knowledge clearly influenced the contiguity (or split-attention) effect
when learning from text and pictures. Coordination of words and pictures
improved problem-solving transfer for low-experience learners but not for high-
experience learners. Supposedly, learners with a high level of domain-specific
knowledge were able to compensate for uncoordinated instruction by retrieving
relevant knowledge from long-term memory. Median effect size differences (the
effect size for the high-knowledge learners subtracted from the effect size for the
low-knowledge learners) were 0.60 for retention questions and 0.80 for transfer
questions (Mayer, 2001).
In experiments demonstrating the expertise reversal effect, Kalyuga et al.
(1998) found that advanced electrical trainees learned relatively new instances of
wiring diagrams in familiar domains significantly better from the circuit diagrams
alone than from diagrams with embedded detailed textual explanations.
Conservative estimates of effect sizes (using higher standard deviation values)
were 0.55 for questions about circuit operation and troubleshooting, and 0.65 for
diagram faultfinding questions. The advanced trainees also reported less mental
effort when studying the diagram-only formats. The integrated textual
explanations were clearly redundant for these learners and these explanations
could not be avoided without a substantial cognitive effort.
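The effect sizes above are described only as conservative estimates that use the higher standard deviation. A minimal sketch of one reading of that description is a standardized mean difference divided by the larger of the two group standard deviations. The exact formula used in the cited studies is not stated here, so this choice is an assumption, and the input values are hypothetical.

```python
def conservative_effect_size(mean_a, sd_a, mean_b, sd_b):
    """Standardized mean difference using the larger group SD as denominator.

    Dividing by the larger SD can only shrink the estimate relative to
    dividing by the smaller one, which is why it is the cautious choice.
    """
    larger_sd = max(sd_a, sd_b)
    if larger_sd <= 0:
        raise ValueError("standard deviations must be positive")
    return (mean_a - mean_b) / larger_sd


# Hypothetical test scores: diagram-only group vs. diagram-with-text group.
d = conservative_effect_size(mean_a=7.0, sd_a=2.0, mean_b=5.5, sd_b=2.5)
print(round(d, 2))
```

With these hypothetical inputs the estimate is 0.6; dividing by the smaller standard deviation (2.0) would instead give 0.75.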
Dual-modality (e.g., combined auditory and visual) presentations have been
shown to be an effective alternative to direct physical integration of text and
diagrams in dealing with split-attention situations. Working memory capacity
could be effectively increased by presenting a visual diagram with spoken rather
than written explanations. According to the modality effect, novice learners can
integrate textual explanations and pictures more effectively when the text is
narrated rather than presented in an on-screen form (Mayer, 1997; Mayer &
Moreno, 1998; Mousavi, Low, & Sweller, 1995; Tindall-Ford, Chandler, &
Sweller, 1997). (Note, though, that presenting the same text simultaneously in
written and spoken form may still generate an excessive working memory load;
Kalyuga, Chandler, & Sweller, 2004.)
For high-knowledge learners, however, narrated explanations may become
redundant and reduce learning efficiency. For example, when training
inexperienced apprentices of manufacturing companies in reading different
cutting speed charts (nomograms) used to determine the appropriate number of

revolutions per minute to run specific types of cutting machines, Kalyuga et al.
(2000) demonstrated that replacing visual texts with corresponding auditory
explanations was beneficial for the novice learners (modality effect). An on-
screen animated diagram combined with simultaneously narrated detailed
explanations on how to use the diagram (see Figure 14 for a snapshot of the
instructional presentation) was the most efficient instructional format. When a
novice trainee clicked on a particular step-heading button, an auditory narration of
an explanation of this step was delivered through headphones instead of being
displayed as an identical visual text next to the diagram. Diagram-only
presentation (Figure 15) was the least efficient instructional format for these
learners.
Figure 14. A snapshot of the multimedia instructional format for a cutting speed
nomogram. Adapted from Kalyuga, Chandler, & Sweller (2000). Copyright 2000 by the
American Psychological Association, Inc.
When learners became much more experienced in using these diagrams, the
advantage of auditory explanations on how to use a relatively new type of
diagram disappeared, while the efficiency of the diagram-alone presentations
increased. After additional training, when the trainees became more advanced in
the domain, a substantial advantage of an animated diagram-only presentation

over the diagram-with-audio-text condition was obtained. A conservative
estimation of the effect size was 0.62 for questions requiring application of
learned procedural steps for using a nomogram in different new task situations.
Subjective mental effort ratings supported a cognitive load interpretation of the
results (Figure 16).
Figure 15. A snapshot of the diagram-only instructional format for a cutting speed
nomogram. Adapted from Kalyuga, Chandler, & Sweller (2000). Copyright 2000 by the
American Psychological Association, Inc.
USING PROBLEM-BASED AND EXPLORATORY LEARNING ENVIRONMENTS
Problem-solving and exploratory (or discovery) learning environments are
relatively unguided forms of instruction. Such instructions could be very
cognitively demanding for novice learners because of a heavy working memory
load and might result in poor learning outcomes. Studies reviewed in the previous
chapter demonstrated that using appropriately designed worked examples may
eliminate this source of cognitive overload. However, as learner experience in a

domain increases, solving problems or exploring relatively new tasks become
more knowledge-based activities due to acquired domain-specific schemas.
Processing a redundant worked example that fully describes a solution path and
integrating this description with corresponding previously acquired solution
schemas may impose a greater working memory load than just practicing in
problem solving or learning in an exploratory environment. For more advanced
learners, such practice or exploration activities may adequately facilitate further
schema refinement and automation.
[Figure 16 chart: performance and mental effort panels comparing the diagram with auditory text and diagram-only formats.]
Figure 16. An interaction between instructional designs and levels of learner expertise in
Kalyuga, Chandler, & Sweller (2000).
Interactions between levels of learner expertise and levels of instructional

guidance were investigated in a series of longitudinal studies designed in
accordance with the general experimental sequence represented by Figure 17.
Actual experiments usually included more than two stages with intensive training
sessions conducted between the stages to increase learner experience in a specific
domain. The selected experimental domains were narrow enough to allow
substantial increase in learner expertise in a relatively limited amount of time (a
few weeks or months). On the other hand, the domains were expandable enough
to allow a gradual increase in complexity of the tasks faced by the learners.
Using this experimental design pattern, Kalyuga, Chandler, Tuovinen, and
Sweller (2001) demonstrated that the superiority of computer-based worked
examples of domain-specific procedures disappeared as trainees acquired more
experience in a domain. In the first experiment, worked examples providing full
instructional guidance and a problem-solving environment providing limited
instructional guidance were compared (a) with inexperienced learners, (b) after
two consecutive training sessions designed to increase the level of learner
experience, and (c) after two further consecutive training sessions. The task
domain was writing simple programmable logic controller (PLC) programs for
relay circuits of different levels of complexity. PLCs are usually used to automate
aspects of production line processing in manufacturing. The levels of task
difficulty were controlled by varying the number of elements in the circuits.
[Figure 17 diagram: Stage 1 (novices) and Stage 2 (experts), separated by intensive training sessions. At each stage, a full guidance format and a limited guidance format are compared using a performance test and a mental effort rating.]
Figure 17. Experimental sequence for studying interactions between levels of learner
expertise and levels of instructional guidance.
In the problem-solving procedure, participants were presented a series of

relay circuits, displayed one at a time, and required to compose a program for
each circuit by dragging components of the program (commands and circuit
element numbers) to appropriate positions in the program table (see Figure 18 for
an example of a simple circuit and Figure 19 for a more complex circuit). Trainees
were given three attempts to get a correct answer, with 3 min allowed for each
attempt. An attempt ended when a student clicked on a "Check" button to allow
the software to check the correctness of the answer. If the student failed to obtain
the correct answer during those three attempts, the correct steps in programming
the circuit were provided. At the beginning, simple circuits containing only three
elements were presented. The following sets of circuits contained increasingly
large numbers of elements. The procedure for the worked-example group was
identical to the problem-solving procedure except that it included examples of
relay circuits with the programming steps embedded in these circuits (Figure 20).
In this group, participants were requested to follow mentally all the steps
according to a numbered sequence.
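The problem-solving procedure described above (up to three attempts, 3 min per attempt, correct steps shown after a third failure) can be sketched as a small scoring routine. The program representation, the command mnemonics, and the treatment of an over-time attempt as a failed attempt are all assumptions made for illustration; the original software is not described at this level of detail.

```python
from typing import NamedTuple, Sequence, Tuple

MAX_ATTEMPTS = 3
ATTEMPT_LIMIT_SECONDS = 3 * 60

# A program is modeled as (command, circuit element) rows; hypothetical format.
Program = Sequence[Tuple[str, str]]


class TrialResult(NamedTuple):
    solved: bool
    attempts_used: int
    solution_shown: bool


def run_trial(correct: Program,
              attempts: Sequence[Tuple[Program, float]]) -> TrialResult:
    """Score one circuit given (submitted program, seconds used) attempts."""
    used = 0
    for submitted, seconds in attempts[:MAX_ATTEMPTS]:
        used += 1
        in_time = seconds <= ATTEMPT_LIMIT_SECONDS  # assumption: late = failed
        if in_time and list(submitted) == list(correct):
            return TrialResult(True, used, False)
    # The correct steps are provided only after all three attempts fail.
    return TrialResult(False, used, used == MAX_ATTEMPTS)


# Hypothetical three-element circuit and learner submissions.
target = [("LD", "X1"), ("AND", "X2"), ("OUT", "Y1")]
wrong = [("LD", "X1"), ("OR", "X2"), ("OUT", "Y1")]
print(run_trial(target, [(wrong, 95.0), (target, 140.0)]))
print(run_trial(target, [(wrong, 95.0), (wrong, 60.0), (wrong, 30.0)]))
```

The first call records success on the second attempt; the second records three failures, after which the solution would be displayed.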
Figure 18. A snapshot of the problem-solving practice in PLC programming (simple circuit).
Figure 19. A snapshot of the problem-solving practice in PLC programming (complex circuit).
Figure 20. A snapshot of a worked example of writing PLC programs.
As learner experience in the domain increased, the relative improvement in

performance of the problem-solving group was superior to that of the worked-example
group. Nevertheless, a redundancy effect with a statistically significant superior
performance by the problem-solving group was not obtained. It is possible that a
redundancy effect could have occurred if more training had been provided to
learners. However, because the experiments took place in natural training
environments (apprentice training centers), the length of trainees' exposure to
specific training materials was limited. It was expected that a redundancy effect
might occur if the learners were trained in a related but less complex domain.
In Experiment 2, the worked-examples and problem-solving instructional
approaches again were compared first with less experienced learners and then
after two consecutive training sessions. The task domain was writing Boolean
switching equations for relay circuits of gradually increasing levels of complexity.
By varying the number of elements in the circuits, it was possible to gradually
increase the level of task difficulty throughout the experiment and observe
continuous development of learner experience in the domain. Understanding the
basics of PLC programming learned in Experiment 1 was helpful (similar relay
circuits were used in both domains) but not sufficient for writing switching
equations. Therefore, at the beginning of Experiment 2, the materials were more
familiar to the participants than the materials at the beginning of Experiment 1.
For the problem-solving procedure, participants were shown a series of relay
circuits and requested to type in a Boolean equation for each circuit (Figure 21).
For the worked-example procedure, examples of the same relay circuits with the
corresponding Boolean equation for each circuit were presented to the learners
(Figure 22). An expected redundancy effect with advanced learners was obtained
in this experiment. Because the learners were sufficiently knowledgeable at the
beginning of the experiment, worked examples were of no advantage in
comparison with the problem-solving procedure. With additional training, worked
examples became redundant resulting in a negative effect compared with
problem-solving practice.
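For readers unfamiliar with the task domain, a Boolean switching equation expresses when a relay circuit conducts: contacts in series combine with AND, and parallel branches combine with OR. The sketch below illustrates this general idea only. The nested-tuple circuit representation, the input names, and the AND/OR notation are assumptions, since the notation used in the experimental materials is not reproduced here, and normally closed contacts are omitted for brevity.

```python
def equation(circuit, top=True):
    """Render a circuit as a switching equation string."""
    if isinstance(circuit, str):
        return circuit  # a single input contact
    kind, *branches = circuit
    if kind == "series":
        return " AND ".join(equation(b, top=False) for b in branches)
    if kind == "parallel":
        text = " OR ".join(equation(b, top=False) for b in branches)
        return text if top else f"({text})"
    raise ValueError(f"unknown connection type: {kind!r}")


def conducts(circuit, inputs):
    """Evaluate whether the circuit conducts for given contact states."""
    if isinstance(circuit, str):
        return bool(inputs[circuit])
    kind, *branches = circuit
    if kind == "series":
        return all(conducts(b, inputs) for b in branches)
    if kind == "parallel":
        return any(conducts(b, inputs) for b in branches)
    raise ValueError(f"unknown connection type: {kind!r}")


# Hypothetical circuit: contact A in series with parallel contacts B and C.
relay = ("series", "A", ("parallel", "B", "C"))
print(equation(relay))
print(conducts(relay, {"A": True, "B": False, "C": True}))
```

Adding elements to such a circuit lengthens the equation and increases the number of interacting elements a learner must hold in mind, which is how task difficulty was varied in these experiments.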
Figure 21. A snapshot of the problem-solving practice in writing switching equations.
Thus, in Experiment 1, a worked example effect was obtained with

inexperienced learners but the superiority of worked examples disappeared with
training. In Experiment 2, there was no worked example effect prior to extensive
training sessions but eventually, with sufficient experience, learning relatively
new tasks in the familiar domain was facilitated more by problem-solving practice
than by studying worked examples (Figure 23). A conservative estimate of the
effect size was 0.75 for questions requiring trainees to write switching equations
for circuits with a larger number of components than that used during instruction.
Exploratory (or discovery) learning environments usually provide even less
instructional guidance than problem-solving practice, because the learners have to
formulate the problem on their own before attempting to solve it. This form of
instruction may impose a heavy working memory load on novice learners for
exactly the same reasons as problem-solving practice. On the other hand, it could
also be more beneficial than direct forms of instruction when used with advanced
learners for whom the source of potential excessive cognitive load is eliminated
with acquisition of appropriate schemas in their knowledge base.
Figure 22. A snapshot of the worked example on writing switching equations.
Figure 23. An interaction between instructional designs (worked examples vs. problem
solving) and levels of learner expertise in Kalyuga, Chandler, Tuovinen, & Sweller (2001).
Figure 24. An interactive screen-based template. Adapted from Kalyuga, Chandler, and
Sweller (2001). Copyright 2000 Taylor & Francis.
Figure 25. A problem presented to learners after an acceptable circuit had been
constructed.
Using the experimental sequence design represented in Figure 17, Kalyuga,

Chandler, and Sweller (2001) compared worked examples-based instruction on
how to construct switching equations for relay circuits with an exploratory
learning environment. In the exploratory environment, learners were first required
to construct their own circuits using an interactive screen-based template (Figure
24). Clicking on any thin contour gray line outside of symbols of input elements
highlighted that line. Clicking on any contour symbol highlighted that input
element. Clicking again on any highlighted element (a line or a symbol)
eliminated highlighting. After an acceptable circuit had been constructed, the
learners were invited to write a switching equation for this circuit (Figure 25). If a
participant was repeatedly incorrect in her or his answers, the correct equation
was provided.
When the knowledge level of trainees was raised as a consequence of
specifically designed computer-based training sessions using tasks with gradually
increasing levels of difficulty, the exploratory group demonstrated better results
than the worked examples group (Figure 26). A conservative estimate of the effect
size was 0.33 for questions requiring trainees to select the correct switching
equations for circuits with a larger number of components than that used during
instruction. Subjective measures of mental effort supported the cognitive load
interpretation of the effect.
[Figure 26 chart: performance and mental effort panels comparing worked examples and exploratory learning.]
Figure 26. An interaction between instructional designs (worked examples vs. exploratory
learning) and levels of learner expertise in Kalyuga, Chandler, & Sweller (2001).
It should be noted that in this study, two levels of tasks were involved: simple
tasks with few input elements and a very limited number of possible options to
explore, and complex tasks with numerous options to explore. The exploratory
learning was more beneficial than direct instruction for advanced learners only for
the complex tasks. There were no differences between the procedures for the
simple tasks. Similarly to other cognitive load effects, this effect occurred only
when structurally complex instructional materials with high element interactivity
were used. Learning from such materials involved many interacting elements of
information that had to be processed simultaneously in working memory
potentially resulting in a heavy cognitive load. For relatively simple circuits,
cognitive load was much lower and within the limits of working memory for
either instructional format.
GRADUALLY REDUCING LEVELS OF INSTRUCTIONAL GUIDANCE
Fully guided direct instruction is best represented by worked examples with
detailed explanations of problem solution steps. For novice learners, properly
designed worked examples have repeatedly proved more beneficial than, for
example, learning by problem solving (Carroll, 1994; Cooper &
Sweller, 1987; Paas, 1992; Paas & van Merriënboer, 1994a, 1994b; Quilici &
Mayer, 1996; Rieber & Parmley, 1995; Sweller & Cooper, 1985; Trafton &
Reiser, 1993). The studies reviewed in the previous section demonstrated that
providing examples of worked-out solutions, while reducing cognitive load for
novices, was redundant for more experienced learners. Additional cognitive
resources were required by advanced learners to integrate the instructional
guidance with available schemas that provided essentially the same guidance.
Minimal guidance formats (problem-solving practice and exploratory learning
environments) were cognitively optimal for these learners.
Because the acquisition of expertise is a gradual process, the switch from
fully guided instructional procedures to unguided problem-solving practice or
exploration needs to be designed as a continuing process. Faded worked examples
and completion assignments using the completion strategy (van Merriënboer,
1990; van Merriënboer et al., 2003) represent a suitable approach for instructional
implementation of such gradual change in guidance. The completion strategy is
based on a sequence of instructional procedures from fully worked-out examples
with complete task solutions to conventional problems. Completion assignments
contain a problem description, an incomplete worked-out solution, and tasks to
complete.
In a series of studies, Atkinson et al. (2000), Renkl (1997), and Renkl,
Atkinson, and Maier (2000) demonstrated that the detailed worked examples,
most effective for novices, should be gradually faded out with increased levels of
learner knowledge, to be eventually replaced by problems. Renkl, Atkinson,
Maier, and Staley (2002) and Renkl and Atkinson (2003) demonstrated the
advantage of gradually reducing guidance with increases in expertise in
comparison with an abrupt switch from worked examples to problems. In faded
worked examples, as levels of learner knowledge in a domain increase, parts of
worked examples are progressively replaced with problem-solving steps (Renkl,
Atkinson, & Große, 2004).
The method can be illustrated by an example from a computer-based tutor in
solving elementary algebra equations (Kalyuga & Sweller, 2004). The tutor was
designed as a series of worked examples, completion assignments, and
conventional problems. The allocation of learners to appropriate completion
assignments or stages of the faded worked examples was based on the outcomes
of rapid diagnostic tests that are described in the second part of the book. The
learners' progress through the stages was also monitored by the diagnostic tests,
and instruction was accordingly tailored to changing levels of expertise.
Least experienced learners were initially presented with a series of fully
worked-out examples (Figure 27), each followed by a problem-solving exercise.
Depending on the results of a diagnostic test at the end of this phase, if necessary,
some additional instructional materials were provided before proceeding to the
next training stage. These instructions were designed as a set of shortened worked
examples indicating only major steps without detailed explanations of
intermediate procedures.
Figure 27. A fully worked-out example used in the computer-based algebra tutor.
Figure 28. A faded worked example used in the computer-based algebra tutor.
The second stage contained completion assignments (or faded worked
examples) with the explanations of the last procedural step omitted. Learners were
asked to complete the solution themselves (Figure 28). Each of the following
training stages was similar to the previous one, except that a lower level of
instructional guidance was provided to learners. In completion assignments,
explanations of progressively more procedural steps were eliminated. The final
stage contained only problem-solving exercises without any explanations
provided.
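The staged reduction of guidance described above can be sketched in code. The following Python fragment is only an illustration under stated assumptions: the `Step` class, the `fade_example` function, the sample equation, and the rule of withholding exactly one more trailing step per stage are hypothetical and are not taken from the tutor itself.

```python
from dataclasses import dataclass


@dataclass
class Step:
    result: str        # state of the equation after the step
    explanation: str   # guidance shown to the learner


def fade_example(steps, stage):
    """Return (steps shown, number of steps left to the learner).

    Stage 1 is a fully worked-out example. Assumption: each later stage
    withholds one more trailing step, until only the problem remains.
    """
    if stage < 1:
        raise ValueError("stage must be 1 or greater")
    omitted = min(stage - 1, len(steps))
    shown = steps[:len(steps) - omitted]
    return shown, omitted


# Hypothetical worked example for the equation 3x + 2 = 11.
example = [
    Step("3x = 11 - 2", "Subtract 2 from both sides."),
    Step("3x = 9", "Simplify the right-hand side."),
    Step("x = 3", "Divide both sides by 3."),
]

for stage in range(1, 5):
    shown, left = fade_example(example, stage)
    print(f"Stage {stage}: {len(shown)} steps shown, {left} left to the learner")
```

In an adaptive system of the kind described, the stage number would presumably be set by the outcome of a diagnostic test rather than by a fixed loop.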
Another procedure that can be used to replace worked examples when
instructing more knowledgeable learners was suggested by Cooper, Tindall-Ford,
Chandler, and Sweller (2001). They found that imagining procedures and
concepts might produce better instructional outcomes than simple studying of
worked examples. Students were asked to imagine computer-presented worked
examples on how to use a spreadsheet application (consisting of a set of diagrams
with embedded textual explanations of sequential steps) rather than simply study
the examples. High-knowledge students who had already acquired (at least,
partially) schemas that allowed them to incorporate the interacting elements and
support constructing relatively new representations, found the imagining
technique more beneficial for learning compared with studying worked-out
examples. Effect sizes in two sets of experiments were 1.24 and 0.75 for tests
consisting of both similar and transfer problems.
For low-knowledge students, on the other hand, the imagining procedure had
a negative effect because these students had to process all components of
instruction as individual elements in limited working memory. Worked examples
effectively guided low-knowledge learners in constructing new schemas of
complex procedures that contained many interacting elements. More advanced
learners already had such schemas and studying the worked examples was a
redundant activity for them. Thus, as learners acquire more expertise in a domain,
studying worked examples in similar and repetitive situations could be gradually
replaced with imagining corresponding procedures. This technique could provide
additional practice for more advanced learners leading to a higher degree of
schema automation. Similar results were obtained by Ginns, Chandler, and Sweller
(2003).
A gradual reduction of levels of instructional guidance in exploratory learning
environments can be accomplished by providing learners with less specified task
goals or subgoals as the learner's familiarity with the domain increases. If the
learning goal in an exploratory environment is well specified, an ordered
sequential structure of learning steps could be constructed, with the level of detail
adjusted according to the levels of learner knowledge in the domain. Such
instructional procedures could be suitable for relatively less advanced learners
when lower level schemas are targeted by instruction. This procedure assumes
systematic processing of all relevant knowledge components and acquisition of
corresponding sub-schemas. To avoid working memory overload, all the relevant
sub-schemas should be acquired one-by-one in advance to be readily available for
retrieval when higher-level schematic knowledge structures are developed.
In the case of higher-level schemas and poorly specified learning goals, it
could be practically impossible to precisely structure appropriate ordered
sequences of sub-tasks due to a large number of possible options and paths to
explore. A learner (even a relatively advanced one) could be lost searching for
relevant subgoals. This search might cause a heavy working memory load and
consume cognitive resources that would become unavailable for constructing
relevant higher order schemas. Therefore, an approach based on the acquisition of
all potentially relevant sub-schemas before constructing higher order schemas (to
reduce associated cognitive load) might require an excessive amount of time and
effort to produce required learning results.
On the other hand, irregular or random processing of lower level components,
even though low in cognitive load, is also an unrealistic and unreliable approach
with a low likelihood of producing required schematic knowledge structures in a
reasonable amount of time. A more suitable exploratory approach for relatively
advanced learners could be based on traversing an appropriately constructed
information space along several well-defined overlapping lines of representation.

Instructional benefits of such multiple traversing might be partly due to an
effective reduction of the number of relevant sub-schemas, which should be
acquired before the required higher level schema could be developed, to a few
overlapping contexts covering the information space.
This approach is related to studies of Spiro and Jehng (1990) in cognitive
flexibility. It was suggested that instruction in complex and ill-structured domains
(e.g., literature) could be improved by using forms of nonlinear (or random
access) learning that would allow exploring the domain by revisiting the same
content material in a variety of different contexts. Spiro and Jehng (1990)
assumed that a nonlinear multidimensional traversal of a complex instructional
subject could help learners achieve cognitive flexibility, that is, the ability to
restructure one's knowledge for adaptation to a specific situation. This technique
could be used in the design of cognitively efficient exploratory environments for
advanced learners not only in ill-structured domains, but also in complex and
well-structured domains in the case of poorly specified learning goals.
When we suggest that learners explore several dimensions in an appropriately
structured information space, we actually provide them with some overlapping
instructional subgoals, thus preventing irrelevant activities that might overload
working memory. Such a partially directed way of exploring the material by
advanced learners could be more efficient for the acquisition of complex schemas
than a prescribed linear sequence of learning or a fully unguided exploration. It
could also be a means of a gradual reduction of instructional guidance. Providing
pre-defined overlapping subgoals for advanced students who learn from complex
exploratory environments might have the same cognitive load consequences as
eliminating specific goals for novice learners in problem-solving situations in
well-structured domains (goal-free effect; see Chapter 3). Both techniques could
reduce cognitive load irrelevant to schema acquisition and facilitate learning.
As an example of multidimensional representations in a complex information
space, consider the schematic structure of knowledge about technical objects
described in Chapter 1 (Figure 2). The interconnected components, aspects, and
levels of description of technical objects could be used as a general framework for
constructing multidimensional representations of information in this area. It could
be suitable for exploring by multiple traversing to facilitate the process of schema
acquisition by advanced learners. In such an environment, a learner would be able
to attend to any technical component, any aspect or level of its description, at any
time and from any other location in the multidimensional space according to her
or his needs, level of understanding, and preliminary knowledge.
The possibility of an immediate access to information about functional,
operational, and structural aspects of any technical component with different
levels of specificity could be effectively realized in hypermedia learning
environments by establishing an appropriate network of hyperlinks between
different components, aspects, and levels of description. These hyperlinks would
make any required modules of information easily accessible from any other aspect
or level of description of any related object by single clicks on appropriate buttons
or areas.
For example, in a simple prototype of the suggested exploratory learning
environment developed using the Authorware software, a network of hyperlinks
between different components, aspects, and levels of description was created
using navigational facilities of the authoring tool. The navigation panel, in
addition to the set of standard pre-installed functions (back to home page,
previous page, next page, word search, etc.), included a choice of functional
(purpose), operational (operation), and structural (components) aspects of
described objects, as well as two levels of the description specificity (overview
and details).
Different technical objects could be selected by clicking on their screen
images. Having selected an object and a level of description, learners could
inspect the functional, operational, and structural aspects of the object at the given
level of specificity. Alternatively, having selected, for example, the functional
aspect of description and a level of specificity, they could inspect the functional
characteristics of different objects at the given level of specificity. Finally, having
selected an object and an aspect of its description, the learners could inspect
different levels of specificity of required information (zooming in or zooming
out). Thus, the model could effectively provide a framework for multiple
traversing of overlapping multidimensional representations of the technical
information space.
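The three ways of traversing the space described above (fixing two coordinates and varying the third) can be sketched as follows. This is a minimal illustration, not the Authorware prototype: the object names `relay` and `switch`, the `space` dictionary, and the `traverse` function are assumptions introduced only for the example.

```python
from itertools import product

ASPECTS = ("functional", "operational", "structural")
LEVELS = ("overview", "details")
OBJECTS = ("relay", "switch")  # hypothetical technical objects

# Each module of information is addressed by (object, aspect, level).
space = {
    key: f"{key[0]}: {key[1]} description ({key[2]})"
    for key in product(OBJECTS, ASPECTS, LEVELS)
}


def traverse(space, obj=None, aspect=None, level=None):
    """Return the modules that match the fixed coordinates.

    A coordinate left as None is the dimension being explored.
    """
    fixed = (obj, aspect, level)
    return [
        text
        for key, text in space.items()
        if all(f is None or f == k for f, k in zip(fixed, key))
    ]


# Fix an object and a level, inspect all three aspects.
print(traverse(space, obj="relay", level="overview"))
# Fix an aspect and a level, compare different objects.
print(traverse(space, aspect="functional", level="details"))
# Fix an object and an aspect, zoom in or out across levels.
print(traverse(space, obj="switch", aspect="structural"))
```

Each call corresponds to one line of traversal through the same content, so the same module is reachable from several overlapping contexts.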
A specific way of traversing selected by a learner may depend on her or his
specific level of expertise. Experts are able to efficiently interconnect different
parts of their knowledge and switch between different levels of representations.
Paris (1989) analyzed expert-novice differences in processing technical
information and found that most technical texts fell into two groups. Texts for
experts (e.g., adult technical encyclopedias, car manuals for mechanics) were
organized around object subparts and their properties. Texts for novices (e.g.,
junior encyclopedias, car manuals for novices) focused on information about
processes that allowed technical objects to perform their functions. The suggested
user-model-based tailoring system was based on the assumption that the choice of
strategy should be based on the user's level of expertise (Paris, 1989). Likewise,
specific ways of traversing multidimensional exploratory learning environments
may also need to be tailored to specific levels of learner expertise.
SUMMARY: TOWARD A COGNITIVELY
EFFICIENT INSTRUCTIONAL TECHNOLOGY
FOR ADVANCED LEARNERS
Recent studies in cognition and instruction have provided a basis for the
design and development of cognitively guided instructional systems. Such
systems not only achieve desired instructional effects, but achieve them efficiently
with optimal expenditure of resources (e.g., instruction time and mental effort).
Cognitive efficiency is becoming an important feature of contemporary
instructional systems. The change in research focus from the cognitive
characteristics of tasks and learners to cognitively efficient ways of structuring
and presenting instructional information is a shift from cognitive science toward a
cognitive technology of instruction (Sweller, 1989). The studies reviewed and
described in this part of the book represent an example of this move.
Human cognitive capacity is limited: we can process only a very limited
amount of information at any one time. The basic model of human cognitive
architecture assumes a working memory with a limited capacity for maintaining
unfamiliar information in an active state and a long-term memory with virtually
unlimited storage capacity and duration. Working memory is the major human
cognitive processor involved in constructing and integrating mental
representations and in short-term maintenance of the relevant information.
Long-term memory stores our organized knowledge base in the form of schemas. A
schema contains information about some class of structures or objects and is
directly related to our cognitive performance.
Schemas are the basic units of knowledge representation that allow us to treat
elements of incoming information in terms of larger higher-level chunks, thus
reducing capacity demands on working memory. The main difference between
experts and novices is in the way their knowledge base is structured and used.
Working memory limits are of much less concern to experts who have their
knowledge in the area of expertise well organized and stored in long-term
memory. These knowledge structures significantly influence the content and
characteristics of working memory and cause systematic differences among
individuals in their working memory capacity for specific tasks.
After sufficient practice, schemas become automated, and we are able to
access our long-term knowledge base rapidly in an automatic manner. Rather than
involving lengthy attention-demanding search, automated schemas require less
working memory resources and allow information processing to occur with
minimal mental effort. Cognitive mechanisms of schema acquisition and
automation are foundations of our intellectual abilities and skilled performance.
Well learned and developed schematic knowledge structures are major aspects of
competent performance that allow efficient use of basic information processing
features of human cognitive architecture.
Studies of expert performance in a variety of domains indicate that experts
can be characterized by efficient representations of problem situations in working
memory, extensive domain-specific knowledge schemas in long-term memory,
and efficient chunking mechanisms of memory retrieval. Expert problem solving in
knowledge-rich domains can be viewed as finding and adopting appropriate
schemas in long-term memory. When solving specific problems, schema-driven
experts spend more time on planning their steps, apply forward-working
approaches, and use more efficient strategies of search. Experts in structurally
complex domains possess multi-level hierarchical schemas representing classes of
objects and situations. Constructing such schemas requires significant cognitive
effort and begins with simplified representations. Experts integrate various levels
of knowledge (intuitions, practical knowledge, theoretical knowledge; local and
general knowledge) and switch between these levels while solving problems.
Novice students cannot learn expert schemas directly. The instructional
design process should be based on cognitive models of transition between
different levels of expertise. Students' understanding of instructional materials is
based on their available schemas. Resolving the conflict between a learner's
available schemas and conceptual models presented during instruction may
require a significant mental effort and cause a negative learning effect. The
backward problem-solving search strategies used by novices may also prevent
learning due to working memory overload. Cognitive load is a major factor of
complex cognitive performance and expertise acquisition.
Many instructional materials and techniques may be ineffective because they
ignore limitations of the human cognitive processing system and impose a heavy
cognitive load. Cognitive load theory is based on the assumption that a person has
a limited processing capacity, and proper allocation of cognitive resources is
critical to learning. Schema acquisition and transfer from consciously controlled
to automatic cognitive processing are the major learning mechanisms that reduce
the burden on working memory. Using limited cognitive resources on activities
not directly related to schema construction and automation (e.g., integration of
information separated over distance or time, or processing redundant information)
may inhibit learning.
Learning materials with a high degree of element interactivity may impose a
heavy intrinsic cognitive load on working memory. In this case, an appropriate
instructional design that reduces extraneous cognitive load might be critical for
efficient learning. Studies generated by cognitive load theory in realistic training
settings and laboratory environments have demonstrated that learning can be
significantly facilitated by restructuring instructional designs in a way that
emphasizes procedures and activities directed towards schema acquisition and
automation and places the primary cognitive burden on long-term memory
schematic knowledge structures. For example, extraneous cognitive load for
novice learners could be reduced when goal-free problems or worked examples
are used instead of conventional problem solving, and when split-attention and
redundancy situations are eliminated.
The split-attention effect occurs when instructional material requires learners
to unnecessarily split their attention between multiple sources of information.
Physical integration of the elements of information reduces extraneous cognitive
load and enhances learning. Split attention may also be eliminated if the
information is presented in a partly audio and partly visual format because
working memory capacity is enhanced under dual-modality conditions. If
individual sources of information are self-contained, integration of the redundant
information with essential information may impose an extraneous cognitive load
that would interfere with learning. In this situation, the elimination rather than
integration of redundant sources of information is beneficial for learning. All
these effects should be of concern only when material has an intrinsically high
level of element interactivity for the learner.
Evaluation of cognitive load that might be imposed on prospective learners
has to be an important part of the instructional design process. Cognitive load
theory emphasizes that the influence of extraneous cognitive load on learning
depends on the schemas that have been previously acquired by the learner and the
level of automation of operations involved in the processing of instructional
material. Novice learners require considerable assistance to understand new
concepts. Therefore, introductory materials should include plenty of explanations
and details, and they should be presented in a way that reduces unnecessary
cognitive overload. Expertise in a domain decreases some of the limitations of
working memory by enabling the use of organized schematic knowledge
structures, stored in long-term memory, to process information more efficiently.
However, in many instructional situations, expertise may also trigger additional
cognitive load when learners are required to process redundant information. There
is strong evidence that instructional techniques that are highly efficient with less
experienced learners may lose their efficiency when used with more advanced
learners (the expertise reversal effect).
As levels of learner expertise in a domain increase, the relative changes in
efficiency of instructional formats and procedures could be caused by the
variations in working memory load involved in relating schema-based and
instruction-based sources of cognitive support when constructing integrated
mental representations of corresponding situations or tasks. A cognitive load
explanation of the expertise reversal effect is based on the need for experts to
cross-reference and integrate knowledge-based and redundant instruction-based
cognitive structures dealing with the same units of information.
An expertise reversal may be expected in situations when highly guided and
integrated instructional presentations designed to assist novice learners are used
with more advanced learners. Inappropriately used instructional formats and
procedures could be very inefficient with advanced learners and may require an
unnecessary additional expenditure of cognitive resources and instructional time.
To be efficient, instructional techniques and procedures need to change
significantly as learners acquire more expertise in a domain. Instructional
implications of these findings for advanced learners are summarized below.
Reducing extraneous and increasing germane cognitive load. In general,
extraneous cognitive overload could be avoided by reducing diversion of
cognitive resources on procedures and tasks that are not directly associated with
learning (or schema acquisition). For example, eliminating the need to devote
cognitive resources to searching and locating an appropriate fragment in a
diagram and text or attending to unnecessary details could improve the learning
efficiency for both novice and advanced learners. Computer-based instructional
systems may also relieve part of the learner's cognitive load by carrying out some
necessary subtasks, for instance, an information search, engaging on-line tools,
referencing a dictionary, locating appropriate diagrams, simulating physical
dismantling of equipment, etc. (Lajoie, 1993).
There are instructional design principles and techniques that are specific for
advanced learners only. Some of these principles are the reverse of those intended for
novice learners. For example, eliminating components of multimedia
presentations (e.g., auditory explanations of a diagram) is recommended where
they are redundant for advanced learners, even though this elimination may
transform the multimedia instruction into a simple single-media format (e.g.,
diagram-only visual presentation). Exploratory and problem-based learning
environments with minimal instructional guidance are recommended at higher
levels of expertise, even though they are strongly discouraged when instructing
novice learners. When dealing with learners at intermediate levels of expertise, a
gradual reduction of instructional guidance is recommended. As learner levels of
competency in a domain increase, completion tasks, faded worked examples, or
suggested lines of exploration could be used to gradually change the levels of
instructional guidance.
It is reasonable to suggest that similar reversal effects could be demonstrated
with other cognitive load reduction principles and methods as learners become
more advanced in a domain. Further comprehensive studies are required of the
effects of differing levels of learner expertise on alternative instructional
procedures and techniques with various instructional materials and learning
environments.
Developing learner skills in managing cognitive resources. A suitable
instructional design may not be the only means of optimizing learners' cognitive
resources. It could be supplemented by students' metacognitive resource-allocation
skills. For example, students could be advised to learn how to ignore
redundant instructional explanations if they are sufficiently experienced in a
domain. Advanced learners may spontaneously develop appropriate
resource-management skills as adaptive learning strategies (at least, the author
spontaneously learned a long time ago to ignore on-screen text in the Top Ten
section of David Letterman's Late Night Show). Learners' metacognitive
skills in managing cognitive load were not discussed in this book, as such skills
need more focused research.
Implementing cognitive design principles in computer-based learning
environments. In the paper-based format, some cognitive load reduction
techniques may produce rather cluttered instructional presentations. For example,
embedding full or partial textual explanations into a diagram may obscure or
distort components of the diagram. Consequently, unnecessary search processes
may not be reduced to a degree that would facilitate learning. Computer-based
multimedia instructional systems represent the best environment for the
implementation of the described cognitive load design principles and techniques.
In dynamic computer-based presentations, only relations and links
corresponding to selected elements of the text or diagrams could be displayed on
the screen when needed. For instance, arrow-directed embedded textual
explanations with color-coded elements or pop-up on-demand windows could be
shown on the screen when learners click on selected elements. In multimedia
systems, combining on-screen diagrams with auditory on-demand explanations
could be superior to conventional visual formats. Additionally, because the
elimination of any redundant sources of information could be beneficial for
advanced learners, experts can avoid verbal explanations that are unnecessary for
them by turning off the auditory mode.
Building learner-adapted instructional presentations. Advanced students,
in contrast to complete novices, have been involved in learning a specific domain
for some time. Determining the amount of instructional details or the level of
instructional guidance relatively advanced students need is a difficult task. A
commonsense assumption that additional details would not harm learning may be
wrong. Instruction for more advanced students should represent a compromise. It
should provide sufficient details enabling students to comprehend the material and
omit redundant explanations that may create cognitive overload and hinder
learning. A major instructional implication of the expertise reversal effect is that
instructional design should be tailored to the intended learners' particular levels of
knowledge and skills related to the instructional area of study. Thus, cognitively
oriented instructional materials should be directed towards certain groups of
prospective learners, or materials should be adaptable to the levels of learner
expertise.
Computer-based instructional systems might include several different
interaction modes presenting the same information in different ways to different
learners. Moreover, the same information could be presented in different ways to
the same learner at different times taking into account the development of her or
his expertise. In learner-tailored instructional systems, an appropriate assignment
of the learner to an instructional format could be guided by tracing the individual's
learning performance (e.g., the number of reattempts during training exercises) or
using appropriate assessment data.
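As a rough illustration of such performance-based assignment, the sketch below maps the number of reattempts in recent training exercises to an instructional format. The function name `select_format` and both thresholds are hypothetical choices made for the example; they are not empirical values from the studies reviewed here.

```python
def select_format(reattempts, low=1, high=3):
    """Choose an instructional format from the number of reattempts.

    Assumption: more reattempts indicate lower expertise and therefore
    a need for more guidance. The thresholds are illustrative only.
    """
    if reattempts < 0:
        raise ValueError("reattempts cannot be negative")
    if reattempts >= high:
        return "worked examples"          # full guidance
    if reattempts > low:
        return "completion assignments"   # partial guidance
    return "problem solving"              # minimal guidance


for n in (0, 2, 4):
    print(n, "reattempts ->", select_format(n))
```

A real system would probably combine several indicators (for example, diagnostic test results as well as reattempts) before changing the format.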
The empirical evidence described in this part of the book indicates that
instructional designs and procedures that are cognitively optimal for less
knowledgeable learners may not be optimal for more advanced learners.
Instructional designers or instructors need to evaluate accurately the learner levels
of expertise to design or select optimal instructional procedures and formats.
Frequently, learners need to be assessed in real time during an instructional
session in order to adjust the design of further instruction appropriately.
Traditional testing procedures may not be suitable for this purpose. The following
chapters describe a cognitive load approach to the development of rapid schema-
based tests of learner expertise. The proposed methods of cognitive diagnosis will
be based on contemporary knowledge of human cognitive architecture and will be
further used as means of optimizing cognitive load in learner-tailored computer-
based learning environments.