How does content analysis work?

Ezzy (2002: 83) suggests that content analysis starts with a sample of texts (the units), defines the units of analysis (e.g. words, sentences) and the categories to be used for analysis, reviews the texts in order to code them and place them into categories, and then counts and logs the occurrences of words, codes and categories. From here statistical analysis and quantitative methods are applied, leading to an interpretation of the results. Put simply, content analysis involves coding, categorizing (creating meaningful categories into which the units of analysis, i.e. words, phrases, sentences etc., can be placed), comparing (categories and making links between them), and concluding (drawing theoretical conclusions from the text).
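
The sequence described here (code the units, place the codes into categories, then count and log the occurrences) can be sketched in a few lines of Python. This is an illustrative sketch only: the coded units, the code labels and the code-to-category mapping are all invented for the example and are not taken from any study.

```python
from collections import Counter

# Hypothetical units of analysis (sentences) that a researcher has already coded by hand.
coded_units = [
    ("Marking takes up every evening.", "WORKLOAD"),
    ("The head never consults us.", "MANAGEMENT"),
    ("I talk it over with colleagues.", "SUPPORT"),
    ("There are too many meetings.", "WORKLOAD"),
]

# Hypothetical mapping that places each code into a broader category.
code_to_category = {
    "WORKLOAD": "causes",
    "MANAGEMENT": "causes",
    "SUPPORT": "coping",
}


def count_codes_and_categories(units, mapping):
    """Count the occurrences of each code and of each category."""
    code_counts = Counter(code for _, code in units)
    category_counts = Counter(mapping[code] for _, code in units)
    return code_counts, category_counts


code_counts, category_counts = count_codes_and_categories(coded_units, code_to_category)
for category, n in category_counts.most_common():
    print(f"{category}: {n}")
```

Counting at both levels keeps the specific codes available for later comparison while still allowing the broader categories to be reported.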
Anderson and Arsenault (1998: 102) indicate the
quantitative nature of content analysis when they state
that at its simplest level, content analysis involves
counting concepts, words or occurrences in documents
and reporting them in tabular form. This succinct
statement catches essential features of the process of
content analysis:

breaking down text into units of analysis
undertaking statistical analysis of the units
presenting the analysis in as economical a form as possible.
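
At its simplest, the three features listed above might look like the following Python sketch, which breaks texts into words, counts them and reports the counts in tabular form. The two sample documents are invented, and the choice of the word as the unit of analysis (lower-cased, punctuation ignored) is an assumption made only for the illustration.

```python
import re
from collections import Counter


def tabulate_words(documents):
    """Break documents into words, count them, and return (word, count, percentage) rows."""
    words = []
    for text in documents:
        # Assumed unit of analysis: a lower-cased word, ignoring punctuation.
        words.extend(re.findall(r"[a-z']+", text.lower()))
    counts = Counter(words)
    total = sum(counts.values())
    return [(word, n, 100.0 * n / total) for word, n in counts.most_common()]


# Hypothetical two-document sample.
sample = ["Stress causes stress.", "Overload causes stress"]
table = tabulate_words(sample)

print(f"{'word':<10}{'count':>6}{'percent':>9}")
for word, n, pct in table:
    print(f"{word:<10}{n:>6}{pct:>9.1f}")
```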

This masks some other important features of content analysis, including, for example, examination of the interconnectedness of units of analysis (categories), the emergent nature of themes and the testing, development and generation of theory. The whole process of content analysis can follow eleven steps.

Step 1: Define the research questions to be addressed by the content analysis

This will also include what one wants from the texts to be content-analysed. The research questions will be informed by, indeed may be derived from, the theory to be tested.

Step 2: Define the population from which units of text are to be sampled

The population here refers not only to people but also, and mainly, to text, the domains of the analysis. For example, is it to be newspapers, programmes, interview transcripts, textbooks, conversations, public domain documents, examination scripts, emails, online conversations and so on?

Step 3: Define the sample to be included

Here the rules for sampling people can apply equally well to documents. One has to decide whether to opt for a probability or non-probability sample of documents, a stratified sample (and, if so, the kind of strata to be used), random sampling, convenience sampling, domain sampling, cluster sampling, purposive, systematic, time sampling, snowball and so on (see Chapter 4). Robson (1993: 275-9) indicates the careful delineation of the sampling strategy here, for example, such-and-such a set of documents, such-and-such a time frame (e.g. of newspapers), such-and-such a number of television programmes or interviews. The key issues of sampling apply to the sampling of texts: representativeness, access, size of the sample and generalizability of the results.

Krippendorp (2004: 145) indicates that there may be nested recording units, where one unit is nested within another, for example, with regard to newspapers that have been sampled it may be thus:

the issues of a newspaper sampled; the articles in an issue of a newspaper sampled; the paragraphs in an article in an issue of a newspaper sampled; the propositions constituting a paragraph in an article in an issue of a newspaper sampled.
(Krippendorp 2004: 145)

This is the equivalent of stage sampling, discussed in Chapter 4.

Step 4: Define the context of the generation of the document

This will examine, for example: how the material was generated (Flick 1998: 193); who was involved; who was present; where the documents come from; how the material was recorded and/or edited; whether the person was willing to, able to, and did tell the truth; whether the data are accurately reported (Robson 1993: 273); whether the data are corroborated; the authenticity and credibility of the documents; the context of the generation of the document; the selection and evaluation of the evidence contained in the document.

Step 5: Define the units of analysis

This can be at very many levels, for example, a word, phrase, sentence, paragraph, whole text, people and themes. Robson (1993: 276) includes here, for newspaper analysis, the number of stories on a topic, column inches, size of headline, number of stories on a page, position of stories within a newspaper, the number and type of pictures. His suggestions indicate the careful thought that needs to go into the selection of the units of analysis. Different levels of analysis will raise different issues of reliability, and these are discussed later. It is assumed that the units of analysis will be classifiable into the same category as text with the same or similar meaning in the context of the text itself (semantic validity) (Krippendorp 2004: 296), although this can be problematic (discussed later). The description of units of analysis will also include the units of measurement and enumeration.

The coding unit defines the smallest element of material that can be analysed, while the contextual unit defines the largest textual unit that may appear in a single category.

Krippendorp (2004: 99-101) distinguishes three kinds of units. Sampling units are those units that are included in, or excluded from, an analysis; they are units of selection. Recording/coding units are units that are contained within sampling units and are smaller than sampling units, thereby avoiding the complexity that characterises sampling units; they are units of description. Context units are units of textual matter that set limits on the information to be considered in the description of recording units; they are units that delineate the scope of information that coders
CONTENT ANALYSIS AND GROUNDED THEORY

need to consult in characterising the recording units (Krippendorp 2004: 101, 103).

Krippendorp (2004) continues by suggesting a further five kinds of sampling units: physical (e.g. time, place, size); syntactical (words, grammar, sentences, paragraphs, chapters, series etc.); categorical (members of a category have something in common); propositional (delineating particular constructions or propositions); and thematic (putting texts into themes and combinations of categories). The issue of categories signals the next step. The criterion here is that each unit of analysis (category: conceptual, actual, classification element, cluster, issue) should be as discrete as possible while retaining fidelity to the integrity of the whole, i.e. that each unit must be a fair rather than a distorted representation of the context and other data. The creation of units of analysis can be done by ascribing codes to the data (Miles and Huberman 1984). This is akin to the process of unitizing (Lincoln and Guba 1985: 203).

Step 6: Decide the codes to be used in the analysis

Codes can be at different levels of specificity and generality when defining content and concepts. There may be some codes which subsume others, thereby creating a hierarchy of subsumption (subordination and superordination), in effect creating a tree diagram of codes. Some codes are very general; others are more specific. Codes are astringent, pulling together a wealth of material into some order and structure. They keep words as words; they maintain context specificity. Codes may be descriptive and might include (Bogdan and Biklen 1992: 167-72): situation codes; perspectives held by subjects; ways of thinking about people and objects; process codes; activity codes; event codes; strategy codes; relationship and social structure codes; methods codes. However, to be faithful to the data, the codes themselves derive from the data responsively rather than being created pre-ordinately. Hence the researcher will go through the data ascribing codes to each piece of datum. A code is a word or abbreviation sufficiently close to that which it is describing for the researcher to see at a glance what it means (in this respect it is unlike a number). For example, the code 'trust' might refer to a person's trustworthiness; the code 'power' might refer to the status or power of the person in the group.

Miles and Huberman (1984) advise that codes should be kept as discrete as possible and that coding should start earlier rather than later as late coding enfeebles the analysis, although there is a risk that early coding might influence too strongly any later codes. It is possible, they suggest, for as many as ninety codes to be held in the working memory while going through data, although clearly, there is a process of iteration and reiteration whereby some codes that are used in the early stages of coding might be modified subsequently and vice versa, necessitating the researcher to go through a data set more than once to ensure consistency, refinement, modification and exhaustiveness of coding (some codes might become redundant, others might need to be broken down into finer codes). By coding up the data the researcher is able to detect frequencies (which codes are occurring most commonly) and patterns (which codes occur together).

Hammersley and Atkinson (1983: 177-8) propose that the first activity here is to read and reread the data to become thoroughly familiar with them, noting also any interesting patterns, any surprising, puzzling or unexpected features, any apparent inconsistencies or contradictions (e.g. between groups, within and between individuals and groups, between what people say and what they do).

Step 7: Construct the categories for analysis

Categories are the main groupings of constructs or key features of the text, showing links between units of analysis. For example, a text concerning teacher stress could have groupings such as causes of teacher stress, the nature of teacher stress, ways of coping with stress and the effects of stress. The researcher will have to decide whether to have mutually exclusive categories (preferable
but difficult), how broad or narrow each category will be, the order or level of generality of a category (some categories may be very general and subsume other more specific categories, in which case analysis should only operate at the same level of each category rather than having the same analysis which combines and uses different levels of categories). Categories are inferred by the researcher, whereas specific words or units of analysis are less inferential; the more one moves towards inference, the more reliability may be compromised, and the more the researcher's agenda may impose itself on the data.

Categories will need to be exhaustive in order to address content validity; indeed Robson (1993: 277) argues that a content analysis is no better than its system of categories and that these can include: subject matter; direction (how a matter is treated: positively or negatively); values; goals; method used to achieve goals; traits (characteristics used to describe people); actors (who is being discussed); authority (in whose name the statements are being made); location; conflict (sources and levels); and endings (how conflicts are resolved).

This stage (i.e. constructing the categories) is sometimes termed the creation of a domain analysis. This involves grouping the units into domains, clusters, groups, patterns, themes and coherent sets to form domains. A domain is any symbolic category that includes other categories (Spradley 1979: 100). At this stage it might be useful for the researcher to recode the data into domain codes, or to review the codes used to see how they naturally fall into clusters, perhaps creating overarching codes for each cluster. Hammersley and Atkinson (1983) show how items can be assigned to more than one category and, indeed, see this as desirable as it maintains the richness of the data. This is akin to the process of categorization (Lincoln and Guba 1985), putting unitized data to provide descriptive and inferential information. Unitization is the process of putting data into meaning units for analysis, examining data, and identifying what those units are. A meaning unit is simply a piece of datum which the researcher considers to be important; it may be as small as a word or phrase, or as large as a paragraph, groups of paragraphs, or, indeed, a whole text, provided that it has meaning in itself.

Spradley (1979) suggests that establishing domains can be achieved by four analytic tasks:

selecting a sample of verbatim interview and field notes
looking for the names of things
identifying possible terms from the sample
searching through additional notes for other items to include.

He identifies six steps to achieve these tasks:

select a single semantic relationship
prepare a domain analysis sheet
select a sample of statements from respondents
search for possible cover terms and include those that fit the semantic relationship identified
formulate structural questions for each domain identified
list all the hypothesized domains.

Domain analysis, then, strives to discover relationships between symbols (Spradley 1979: 157).

Like codes, categories can be at different levels of specificity and generality. Some categories are general and overarching; others are less so. Typically codes are much more specific than categories. This indicates the difference between nodes and codes. A code is a label for a piece of text; a node is a category into which different codes fall or are collected. A node can be a concept, idea, process, group of people, place or, indeed, any other grouping that the researcher wishes it to be; it is an organizing category. Whereas codes describe specific textual moments, nodes draw together codes into a categorical framework, making connections between coded segments and concepts. It is rather like saying that a text can be regarded as a book, with the chapters being the nodes and the paragraphs being the codes, or the content pages being the nodes and the index being the codes. Nodes can be related in several ways, for example: one concept can define
another; they can be logically related; and they can be empirically related (found to accompany each other) (Krippendorp 2004: 296).

One has to be aware that the construction of codes and categories might steer the research and its findings, i.e. that the researcher may enter too far into the research process. For example, a researcher may have been examining the extra-curricular activities of a school and discovered that the benefits of these are to be found in non-cognitive and non-academic spheres rather than in academic spheres, but this may be fallacious. It could be that it was the codes and categories themselves rather than the data in the minds of the respondents that caused this separation of cognitive/academic spheres and issues from the non-cognitive/non-academic, and that if the researcher had specifically asked about or established codes and categories which established the connection between the academic and non-academic, then the researcher would have found more than he or she did. This is the danger of using codes and categories to predefine the data analysis.

Step 8: Conduct the coding and categorizing of the data

Once the codes and categories have been decided, the analysis can be undertaken. This concerns the actual ascription of codes and categories to the text. Coding has been defined by Kerlinger (1970) as the translation of question responses and respondent information to specific categories for the purpose of analysis. As we have seen, many questions are precoded, that is, each response can be immediately and directly converted into a score in an objective way. Rating scales and checklists are examples of precoded questions. Coding is the ascription of a category label to a piece of data, with the label decided either in advance or in response to the data that have been collected.

Mayring (2004: 268-9) suggests that summarizing content analysis reduces the material to manageable proportions while maintaining fidelity to essential contents, and that inductive category formation proceeds through summarizing content analysis by inductively generating categories from the text material. This is in contrast to explicit content analysis, the opposite of summarizing content analysis, which seeks to add in further information in the search for intelligible text analysis and category location. The former reduces contextual detail, the latter retains it. Structuring content analysis filters out parts of the text in order to construct a cross-section of the material using specified pre-ordinate criteria.

It is important to decide whether to code simply for the existence or the incidence of the concept. This is important, as it would mean that, in the case of the former (existence), the frequency of a concept would be lost, and frequency may give an indication of the significance of a concept in the text. Further, the coding will need to decide whether it should code only the exact words or those with a similar meaning. The former will probably result in significant data loss, as words are not often repeated in comparison to the concepts that they signify; the latter may risk losing the nuanced sensitivity of particular words and phrases. Indeed some speechmakers may deliberately use ambiguous words or those with more than one meaning.

In coding a piece of transcription the researcher goes through the data systematically, typically line by line, and writes a descriptive code by the side of each piece of datum, for example:

Text                                                      Code
The students will undertake problem-solving in science    PROB
I prefer to teach mixed ability classes                   MIXABIL

One can see that the codes here are abbreviations, enabling the researcher to understand immediately the issue that they denote because they resemble that issue (rather than, for example, ascribing a number as a code for each piece of datum, where the number provides no clue as to what the datum or category concerns). Where they are not abbreviations, Miles and Huberman (1994) suggest that the coding label should bear sufficient resemblance to the original data so that the researcher can know, by looking at the code, what the original piece of datum concerned.
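
The two coding decisions discussed in this step (existence versus incidence, and exact words versus words of similar meaning) can be made concrete with a short Python sketch. Everything in it is hypothetical: the passage, the function name `code_concept`, and in particular the list of supposedly similar terms, which in practice would be a judgement for the researcher to make and defend.

```python
import re


def code_concept(text, terms):
    """Return (existence, incidence) of a concept in a text.

    existence: True if any of the terms occurs at least once.
    incidence: the total number of occurrences of the terms.
    """
    words = re.findall(r"[a-z]+", text.lower())
    incidence = sum(1 for word in words if word in terms)
    return incidence > 0, incidence


# Hypothetical passage and hypothetical term lists for one concept.
passage = "The pressure is constant. Stress builds, and the strain shows. More stress follows."
exact_terms = {"stress"}
similar_terms = {"stress", "pressure", "strain"}  # assumed near-synonyms, for illustration only

exact = code_concept(passage, exact_terms)
similar = code_concept(passage, similar_terms)
print("exact words only:", exact)
print("similar meanings:", similar)
```

Coding for existence alone would record the same result under both term lists, whereas coding for incidence shows how far the count depends on whether similar meanings are admitted.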

There are several computer packages that can help the coder here (e.g. ETHNOGRAPH, N-Vivo), though they require the original transcript to be entered onto the computer. One such, Code-A-Text, is particularly useful for analysing dialogues both quantitatively and qualitatively (the system also accepts sound and video input).

Having performed the first round of coding, the researcher is able to detect patterns and themes and to begin to make generalizations (e.g. by counting the frequencies of codes). The researcher can also group codes into more general clusters, each with a code, i.e. begin the move towards factoring the data.

Miles and Huberman (1994) suggest that it is possible to keep as many as ninety codes in the working memory at any one time, though they make the point that data might be recoded on a second or third reading, as codes that were used early on might have to be refined in light of codes that are used later, either to make the codes more discriminating or to conflate codes that are unnecessarily specific. Codes, they argue, should enable the researcher to catch the complexity and comprehensiveness of the data.

Perhaps the biggest problem concerns the coding and scoring of open-ended questions. Two solutions are possible here. Even though a response is open-ended, an interviewer, for example, may precode the interview schedule so that while an interviewee is responding freely, the interviewer is assigning the content of the responses, or parts of it, to predetermined coding categories. Classifications of this kind may be developed during pilot studies. Alternatively, data may be postcoded. Having recorded the interviewee's response, for example, either by summarizing it during or after the interview itself, or verbatim by tape-recorder, the researcher may subject it to content analysis and apply to it one of the available scoring procedures (scaling, scoring, rank scoring, response counting, etc.).

Step 9: Conduct the data analysis

Once the data have been coded and categorized, the researcher can count the frequency of each code or word in the text, and the number of words in each category. This is the process of retrieval, which may be in multiple modes, for example words, codes, nodes and categories. Some words may be in more than one category, for example where one category is an overarching category and another is a subcategory. To ensure reliability, Weber (1990: 21-4) suggests that it is advisable at first to work on small samples of text rather than the whole text, to test out the coding and categorization, and make amendments where necessary. The complete texts should be analysed, as this preserves their semantic coherence.

Words and single codes on their own have limited power, and so it is important to move to associations between words and codes, i.e. to look at categories and relationships between categories. Establishing relationships and linkages between the domains ensures that the data, their richness and context-groundedness are retained. Linkages can be found by identifying confirming cases, by seeking underlying associations (LeCompte and Preissle 1993: 246) and connections between data subsets.

Weber (1990: 54) suggests that it is preferable to retrieve text based on categories rather than single words, as categories tend to retrieve more than single words, drawing on synonyms and conceptually close meanings. One can make category counts as well as word counts. Indeed, one can specify at what level the counting can be conducted, for example, words, phrases, codes, categories and themes.

The implication here is that the frequency of words, codes, nodes and categories provides an indication of their significance. This may or may not be true, since subsequent mentions of a word or category may be difficult in certain texts (e.g. speeches). Frequency does not equal importance, and not saying something (withholding comment) may be as important as saying something. Content analysis analyses only what is present rather than what is missing or unsaid (Anderson and Arsenault 1998: 104). Further, as Weber (1990) says:

pronouns may replace nouns the further on one goes through a passage; continuing raising of the issue may
cause redundancy as it may be counter-productive repetition; constraints on text length may inhibit reference to the theme; some topics may require much more effort to raise than others.
(Weber 1990: 73)

The researcher can summarize the inferences from the text, look for patterns, regularities and relationships between segments of the text, and test hypotheses. The summarizing of categories and data is an explicit aim of statistical techniques, for these permit trends, frequencies, priorities and relationships to be calculated. At the stage of data analysis there are several approaches and methods that can be used. Krippendorp (2004: 48-53) suggests that these can include:

extrapolations: trends, patterns and differences
standards: evaluations and judgements
indices: e.g. of relationships, frequencies of occurrence and co-occurrence, number of favourable and unfavourable items
linguistic re-presentations.

Once frequencies have been calculated, statistical analysis can proceed, using, for example:

factor analysis: to group the kinds of response
tabulation: of frequencies and percentages
cross-tabulation: presenting a matrix where the words or codes are the column headings and the nominal variables (e.g. the newspaper, the year, the gender) are the row headings
correlation: to identify the strength and direction of association between words, between codes and between categories
graphical representation: for example to report the incidence of particular words, concepts, categories over time or over texts
regression: to determine the value of one variable/word/code/category in relationship to another (a form of association that gives exact values and the gradient or slope of the goodness-of-fit line of relationship, the regression line)
multiple regression: to calculate the weighting of independent on dependent variables
structural equation modelling and LISREL analysis: to determine the multiple directions of causality and the weightings of different associations in a pathway analysis of causal relations
dendrograms: tree diagrams to show the relationship and connection between categories and codes, codes and nodes.

The calculation and presentation of statistics is discussed in Chapters 24-26. At this stage the argument here suggests that what starts as qualitative data (words) can be converted into numerical data for analysis.

If a less quantitative form of analysis is required then this does not preclude a qualitative version of the statistical procedures indicated here. For example, one can establish linkages and relationships between concepts and categories, examining their strength and direction (how strongly they are associated and whether the association is positive or negative respectively). Many computer packages will perform the qualitative equivalent of statistical procedures.

It is also useful to try to pursue the identification of core categories (see the later discussion of grounded theory). A core category is that which has the greatest explanatory potential and to which the other categories and subcategories seem to be repeatedly and closely related (Strauss 1987: 11). Robson (1993: 401) suggests that drawing conclusions from qualitative data can be undertaken by counting, patterning (noting recurrent themes or patterns), clustering (of people, issues, events etc. which have similar features), relating variables, building causal networks, and relating findings to theoretical frameworks.

While conducting qualitative data analysis using numerical approaches or paradigms may be criticized for being positivistic, one should note that one of the founders of grounded theory (Glaser 1996) is on record as saying that not only did grounded theory develop out of a desire to apply a quantitative paradigm to qualitative data, but also paradigmal purity was unacceptable in the real world of qualitative data analysis, in which fitness for purpose should be the guide. Further, one can note that Miles and Huberman (1984) strongly advocate the graphic display of data as
an economical means of reducing qualitative data. Such graphics might serve both to indicate causal relationships and simply to summarize data.

Step 10: Summarizing

By this stage the investigator will be in a position to write a summary of the main features of the situation that have been researched so far. The summary will identify key factors, key issues, key concepts and key areas for subsequent investigation. It is a watershed stage during the data collection, as it pinpoints major themes, issues and problems that have arisen, so far, from the data (responsively) and suggests avenues for further investigation. The concepts used will be a combination of those derived from the data themselves and those inferred by the researcher (Hammersley and Atkinson 1983: 178).

At this point, the researcher will have gone through the preliminary stages of theory generation. Patton (1980) sets these out for qualitative data:

finding a focus for the research and analysis
organizing, processing, ordering and checking data
writing a qualitative description or analysis
inductively developing categories, typologies and labels
analysing the categories to identify where further clarification and cross-clarification are needed
expressing and typifying these categories through metaphors (see also Pitman and Maxwell 1992: 747)
making inferences and speculations about relationships, causes and effects.

Bogdan and Biklen (1992: 154-63) identify several important factors that researchers need to address at this stage, including forcing oneself to take decisions that will focus and narrow the study and decide what kind of study it will be; developing analytical questions; using previous observational data to inform subsequent data collection; writing reflexive notes and memos about observations, ideas, what is being learned; trying out ideas with subjects; analysing relevant literature while conducting the field research; generating concepts, metaphors and analogies and visual devices to clarify the research.

Step 11: Making speculative inferences

This is an important stage, for it moves the research from description to inference. It requires the researcher, on the basis of the evidence, to posit some explanations for the situation, some key elements and possibly even their causes. It is the process of hypothesis generation or the setting of working hypotheses that feeds into theory generation.

The stage of theory generation is linked to grounded theory, and we turn to this later in the chapter. Here we provide an example of content analysis that does not use statistical analysis but which nevertheless demonstrates the systematic approach to analysing data that is at the heart of content analysis.

A worked example of content analysis

In this example the researcher has already transcribed data concerning stress in the workplace from, let us say, a limited number of accounts and interviews with a few teachers, and these have already been summarized into key points. It is imagined that each account or interview has been written up onto a separate file (e.g. computer file), and now they are all being put together into a single data set for analysis. What we have are already-interpreted, rather than verbatim, data.

Stage 1: Extract the interpretive comments that have been written on the data

By the side of each, a code/category/descriptor word has been inserted (in capital letters), i.e. the summary data have already been collected together into 33 summary sentences.

Stress is caused by deflated expectation, i.e. stress is caused by annoyance with other people
not pulling their weight or not behaving as desired, or teachers letting themselves down. CAUSE
Stress is caused by having to make greater demands on personal time to meet professional concerns. So, no personal time/space is a cause of stress. Stress is caused by having to compromise one's plans/desires. CAUSE
Stress comes from having to manage several demands simultaneously, CAUSE, but the very fact that they are simultaneous means that they can't be managed at once, so stress is built into the problem of coping: it's an insoluble situation. NATURE
Stress from one source brings additional stress which leads to loss of sleep, a sign that things are reaching a breaking point. OUTCOME
Stress is a function of the importance attached to activities/issues by the person involved. NATURE
Stress is caused when one's own integrity/values are not only challenged but also called into question. CAUSE
Stress comes from frustration: frustration leads to stress leads to frustration leads to stress etc., a vicious circle. NATURE
When the best laid plans go wrong this can be stressful. CAUSE
The vicious circle of stress induces sleep irregularity which, in turn, induces stress. NATURE
Reducing stress often works on symptoms rather than causes; this may be the only thing possible, CAUSE, given that the stressors will not go away, but it allows the stress to fester. CAUSE
The effects of stress are physical which, in turn, causes more stress, another vicious circle. OUTCOMES
Stress comes from lowering enthusiasm/commitment/aspiration/expectation. CAUSE
Pressure of work lowers aspiration which lowers stress. CAUSE
Stress reduction is achieved through companionship. HANDLING
Stress is caused by things out of one's control. CAUSE
Stress comes through handling troublesome students. CAUSE
Stress occurs because of a failure of management/leadership. CAUSE
Stress comes through absence of fulfilment. CAUSE
Stress rarely happens on its own, it is usually in combination; like a rolling snowball, it is cumulative. NATURE
Stress is caused by worsening professional conditions that are out of the control of the participant. CAUSE
Stress comes through loss of control and autonomy. CAUSE
Stress through worsening professional conditions is exponential in its effects. NATURE
Stress is caused when professional standards are felt to be compromised. CAUSE
Stress occurs because matters are not resolved. CAUSE
Stress comes through professional compromise which is out of an individual's control. CAUSE
The rate of stress is a function of its size: a big bomb causes instant damage. NATURE
Stress is caused by having no escape valve; it is bottled up and causes more stress, like a kettle with no escape valve, it will stress the metal and then blow up. CAUSE
Stress comes through overload and frustration, a loss of control. Stress occurs when people cannot control the circumstances with which they have to work. CAUSE
Stress occurs through overload. CAUSE
Stress comes from seeing one's former work being undone by others' incompetence. CAUSE
Stress occurs because nothing has been possible to reduce the level of stress. So, if the boil of stress is not lanced, it grows and grows. CAUSE NATURE
Stress can be handled through relaxation and exercise. HANDLING
Trying to relieve stress through self-damaging behaviour includes taking alcohol and smoking. HANDLING NATURE
Stress is a function of the importance attached to activities by the participants involved. NATURE
The closer the relationship to people who cause stress, the greater the stress. NATURE

The data have been coded very coarsely, in terms of three or four main categories. It may have been possible to have coded the data far more specifically, e.g. each specific cause has its code, indeed one school of thought would argue that it is important to generate the specific codes first. One can code for words (and, thereafter, the frequency of words) or meanings; it is sometimes dangerous to go for words rather than meanings, as people say the same things in different ways.

Stage 2: Sort data into key headings/areas

The codes that have been used fall into four main areas:

causes of stress

nature of stress

outcomes of stress

handling stress.

Stage 3: List the topics within each key area/heading and put frequencies in which items are mentioned

For each main area the relevant data are presented together, and a tally mark (/) is placed against the number of times that the issue has been mentioned by the teachers.

Causes of stress

deflated expectation/aspiration /
annoyance /
others not pulling weight /
others letting themselves down /
professional demands, e.g. troublesome students /
demands on personal time from professional tasks /
difficulties of the job /
loss of personal time and space /
compromising oneself or one's professional standards and integrity ///
plans go wrong /
stress itself causes more stress /
inability to reduce causes of stress /
lowering enthusiasm/commitment/aspiration /
pressure of work /
things out of one's control //
failure of management or leadership /
absence of fulfilment /
worsening professional conditions /
loss of control and autonomy //
inability to resolve situation /
having no escape valve /
overload at work /
seeing one's work undone by others /

Nature of stress

Stress is a function of the importance attached to activities-issues by the participants. /
Stress is inbuilt when too many simultaneous demands are made, i.e. it is insoluble. /
It is cumulative (like a snowball) until it reaches a breaking point. /
Stress is a vicious circle. //
The effects of stress are exponential. /
The rate of stress is a function of its size. /
If stress has no escape valve then that causes more stress. //
Handling stress can lead to self-damaging behaviour (smoking or alcohol). /
Stress is a function of the importance attached to activities-issues by the participants. /
The closer the relationship to people who cause stress, the greater the stress. /

Outcomes of stress

loss of sleep or physical reaction //
effects of stress themselves causing more stress /
self-damaging behaviour /

Handling stress

physical action or exercise /
companionship /
alcohol and smoking /
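
The tallying carried out in Stages 1 to 3 can be sketched in code. What follows is only a minimal illustration, assuming that each summary sentence has been stored together with the code or codes assigned to it. The names `coded_sentences`, `tally_codes` and `tally_marks` are hypothetical, and only a handful of the summary sentences are reproduced, some in abbreviated form.

```python
from collections import Counter

# Illustrative subset of the coded summary sentences (some abbreviated).
# Each entry pairs a sentence with the code(s) attached to it.
coded_sentences = [
    ("Stress comes through handling troublesome students.", ["CAUSE"]),
    ("Stress occurs through overload.", ["CAUSE"]),
    ("Stress is a vicious circle.", ["NATURE"]),
    ("Stress from one source brings additional stress and loss of sleep.", ["OUTCOME"]),
    ("Stress can be handled through relaxation and exercise.", ["HANDLING"]),
    ("Relieving stress through alcohol and smoking.", ["HANDLING", "NATURE"]),
]


def tally_codes(sentences):
    """Count how many times each code has been applied across all sentences."""
    counts = Counter()
    for _text, codes in sentences:
        counts.update(codes)
    return counts


def tally_marks(count):
    """Render a frequency as tally marks, e.g. 3 becomes '///'."""
    return "/" * count


counts = tally_codes(coded_sentences)
for code in ["CAUSE", "NATURE", "OUTCOME", "HANDLING"]:
    print(f"{code:<9}{tally_marks(counts[code])}")
```

A sentence that carries two codes contributes one tally mark to each of them, which mirrors the way some of the summary sentences above are marked with more than one code.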

Stage 4: Go through the list generated in stage 3 and put the issues into groups (avoiding category overlap)

Here the grouped data are reanalysed and re-presented according to possible groupings of issues under the four main headings (causes, nature, outcomes and handling of stress).

Causes of stress

Personal factors

deflated expectation or aspiration /
annoyance /
demands on personal time from professional tasks /
loss of personal time and space /
stress itself causes more stress /
inability to reduce causes of stress /
lowering enthusiasm, commitment or aspiration /
things out of one's control //
absence of fulfilment /
loss of control and autonomy //
inability to resolve situation /
having no escape valve /

Interpersonal factors

annoyance /
others not pulling weight /
others letting themselves down /
compromising oneself or one's professional standards and integrity ///
seeing one's work undone by others /

Management

pressure of work /
things out of one's control //
failure of management or leadership /
worsening professional conditions /
seeing one's work undone by others /

Professional matters

others not pulling weight /
professional demands, e.g. troublesome students /
demands on personal time from professional tasks /
difficulties of the job /
compromising oneself or one's professional standards and integrity ///
plans go wrong /
pressure of work /
worsening professional conditions /
loss of control and autonomy //
overload at work /

Nature of stress

Objective

It is a function of the importance attached to activities-issues by the participants. /
Stress is inbuilt when too many simultaneous demands are made, i.e. it is insoluble. /
It is cumulative (like a snowball) until it reaches a breaking point. /
Stress is a vicious circle. //
The effects of stress are exponential. /
The rate of stress is a function of its size. /
If stress has no escape valve then that causes more stress. //
Handling stress can lead to self-damaging behaviour (smoking or alcohol). /

Subjective

Stress is a function of the importance attached to activities-issues by the participants. /
The closer the relationship to people who cause stress, the greater the stress. /

Outcomes of stress

Physiological

loss of sleep /

Physical

physical reactions //
increased smoking /
increased alcohol /

Psychological

annoyance /
Handling stress

Physical

physical action or exercise /

Social

social solidarity, particularly with close people ///
companionship /

Stage 5: Comment on the groups or results in stage 4 and review their messages

Once the previous stage has been completed, the researcher is then in a position to draw attention to general and specific points, for example:

There is a huge number of causes of stress (give numbers).

There are very few outlets for stress, so it is inevitable, perhaps, that stress will accumulate.

Causes of stress are more rooted in personal factors than any others: management, professional etc. (give frequencies here).

The demands of the job tend to cause less stress than other factors (e.g. management), i.e. people go into the job knowing what to expect, but the problem lies elsewhere, with management (give frequencies).

Loss of control is a significant factor (give frequencies).

Challenges to people and personal integrity/self-esteem are very stressful (give frequencies).

The nature of stress is complex, with several interacting components (give frequencies).

Stress is omnipresent.

Not dealing with stress compounds the problem; dealing with stress compounds the problem.

The subjective aspects of the nature of stress are as important as its objective nature (give frequencies).

The outcomes of stress tend to be personal rather than outside the person (e.g. systemic or system-disturbing) (give frequencies).

The outcomes of stress are almost exclusively negative rather than positive (give frequencies).

The outcomes of stress tend to be felt non-cognitively, e.g. emotionally and psychologically, rather than cognitively (give frequencies).

There are few ways of handling stress (frequencies), i.e. opportunities for stress reduction are limited.

The stages of this analysed example embody several of the issues raised in the preceding discussion of content analysis, although the example here does not undertake word counts or statistical analysis, and, being fair to content analysis, this could (some would argue even should) be a further kind of analysis. What has happened in this analysis raises several important issues:

The researcher has looked within and across categories and groupings for patterns, themes, generalizations, as well as exceptions, unusual observations etc.

The researcher has had to decide whether frequencies are important, or whether an issue is important even if it is mentioned only once or a few times.

The researcher has looked for, and reported, disconfirming as well as confirming evidence for statements.

The final stage of the analysis is that of theory generation, to account for what is being explained about stress. It might also be important, in further analysis, to try to find causal relationships here: what causes what and the directions of causality; it may also be useful to construct diagrams (with arrows) to show the directions, strength and positive/negative nature of stress.

Computer usage in content analysis

LeCompte and Preissle (1993) provide a summary of ways in which information technology can be utilized in supporting qualitative research (see also Tesch 1990). As can be seen from the list below, its uses are diverse. Data have to be processed, and as word data are laborious to process, and as several powerful packages for data

analysis and processing exist, researchers will find it useful to make full use of computing facilities. These can be used to do the following (LeCompte and Preissle 1993: 280–1):

store and check (e.g. proofread) data
collate and segment data and to make numerous copies of data
enable memoing to take place, together with details of the circumstances in which the memos were written
conduct a search for words or phrases in the data and to retrieve text
attach identification labels to units of text (e.g. questionnaire responses), so that subsequent sorting can be undertaken
annotate and append text
partition data into units that have been determined either by the researcher or in response to the natural language itself
enable preliminary coding of data to be undertaken
sort, resort, collate, classify and reclassify pieces of data to facilitate constant comparison and to refine schemas of classification
code memos and bring them into the same schema of classification
assemble, reassemble and recall data into categories
undertake frequency counts (e.g. of words, phrases, codes)
cross-check data to see if they can be coded into more than one category, enabling linkages between categories to be discovered
establish the incidence of data that are contained in more than one category
retrieve coded and noded data segments from subsets (e.g. by sex) in order to compare and contrast data
search for pieces of data that appear in a certain (e.g. chronological) sequence
establish linkages between coding categories
display relationships of categories (e.g. hierarchical, temporal, relational, subsumptive, superordinate)
quote data in the final report.

Kelle (1995) suggests that computers are particularly effective at coping with the often-encountered problem of data overload and retrieval in qualitative research. Computers, it is argued, enable the researcher to use codes, memos, hypertext systems, selective retrieval, co-occurring codes, and to perform quantitative counts of qualitative data types (see also Seidel and Kelle 1995). In turn, this enables linkages of elements to be undertaken, the building of networks and, ultimately, theory generation to be undertaken (Seidel and Kelle 1995). Indeed Lonkila (1995) indicates how computers can assist in the generation of grounded theory through coding, constant comparison, linkages, memoing, annotations and appending, use of diagrams, verification and, ultimately, theory building. In this process Kelle and Laurie (1995: 27) suggest that computer-aided methods can enhance validity (by the management of samples) and reliability (by retrieving all the data on a given topic, thereby ensuring trustworthiness of the data).

A major feature of computer use is in the coding and compilation of data (for example, Kelle 1995: 62–104). Lonkila (1995) identifies several kinds of codes. Open coding generates categories and defines their properties and dimensions. Axial coding works within one category, making connections between subgroups of that category and between one category and another. This might be in terms of the phenomena that are being studied, the causal conditions that lead to the phenomena, the context of the phenomena and their intervening conditions, and the actions and interactions of, and consequences for, the actors in situations. Selective coding identifies the core categories of text data, integrating them to form a theory. Seidel and Kelle (1995) suggest that codes can denote a text, passage or fact, and can be used to construct data networks.

There are several computer packages for qualitative data (see Kelle 1995), for example: AQUAD; ATLAS/ti; HyperQuad2; HyperRESEARCH; Hypersoft; Kwaliton; Martin; MAXqda; WINMAX; QSR.NUD.IST; Nvivo; QUALPRO; Textbase Alpha, ETHNOGRAPH, ATLAS.ti, Code-A-Text, Decision Explorer,
Diction. Some of these are reviewed by Prein et al. (1995: 190–209). These do not actually perform the analysis (in contrast to packages for quantitative data analysis) but facilitate and assist it. As Kelle (2004: 277) remarks, they do not analyse text so much as organize and structure text for subsequent analysis.

These programs have the attraction of coping with large quantities of text-based material rapidly and without any risk of human error in computation and retrieval, and releasing researchers from some mechanical tasks. With respect to words, phrases, codes, nodes and categories they can:

search for and return text, codes, nodes and categories
filter text
return counts
present the grouped data according to the selection criterion desired, both within and across texts
perform the qualitative equivalent of statistical analyses, such as:
  Boolean searches (intersections of text which have been coded by more than one code or node, using 'and', 'not' and 'or'; looking for overlaps and co-occurrences)
  proximity searches (looking at clustering of data and related contextual data either side of a node or code)
  restrictions, trees, cross-tabs (including and excluding documents for searching, looking for codes subsumed by a particular node, and looking for nodes which subsume others)
construct dendrograms (tree structures) of related nodes and codes
present data in sequences and locate the text in surrounding material in order to provide the necessary context
select text on combined criteria (e.g. joint occurrences, collocations)
enable analyses of similarities, differences and relationships between texts and passages of text
annotate text and enable memos to be written about text.

Additionally, dictionaries and concordances of terms can be employed to facilitate coding, searching, retrieval and presentation.

Since the rules for coding and categories are public and rule-governed, computer analysis can be particularly useful for searching, retrieving and grouping text, both in terms of specific words and in terms of words with similar meanings. Single words and word counts can overlook the importance of context. Hence computer software packages have been developed that look at Key-Words-In-Context. Most software packages have advanced functions for memoing, i.e. writing commentaries to accompany text that are not part of the original text but which may or may not be marked as incorporated material into the textual analysis. Additionally many software packages include an annotation function, which lets the researcher annotate and append text, and the annotation is kept in the text but marked as an annotation.

Computers do not do away with the human touch, as humans are still needed to decide and generate the codes and categories, to verify and interpret the data. Similarly there are strict limits to algorithmic interpretations of texts (Kelle 2004: 277), as texts contain more than that which can be examined mechanically. Further, Kelle (2004) suggests that there may be problems where assumptions behind the software may not accord with those of the researchers or correspond to the researchers' purposes, and that the software does not enable the range and richness of analytic techniques that are associated with qualitative research. Kelle (2004) argues that software may be more closely aligned to the technique of grounded theory than to other techniques (e.g. hermeneutics, discourse analysis) (Coffey et al. 1996), that it may drive the analysis rather than vice versa (Fielding and Lee 1998), and that it has a preoccupation with coding categories (Seidel and Kelle 1995). One could also argue that software does not give the same added value that one finds in quantitative data analysis, in that the textual input is a highly laborious process and that it does not perform the analysis but only supports the researcher doing the analysis by organizing data and recording codes and nodes etc.
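
Two of the facilities described above, Key-Words-In-Context and Boolean 'and' searches over coded text, can be sketched briefly. This is an illustrative sketch only and does not show how any of the named packages works; the function names `kwic` and `boolean_and`, the sample text and the coded segments are all assumptions made for the example.

```python
import re


def kwic(text, keyword, width=3):
    """Return (left context, matched word, right context) for each hit.

    The text is lower-cased and split into word tokens; `width` is the
    number of words of context kept on either side of the keyword.
    """
    tokens = re.findall(r"\w+", text.lower())
    hits = []
    for i, token in enumerate(tokens):
        if token == keyword.lower():
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            hits.append((left, token, right))
    return hits


def boolean_and(segments, *codes):
    """Return the text of segments that carry every one of the given codes."""
    wanted = set(codes)
    return [text for text, seg_codes in segments if wanted <= set(seg_codes)]


# Hypothetical sample text: the same word used in two different senses.
sample = "They stayed after school to play sports. The impressionist school favoured light."
for left, word, right in kwic(sample, "school", width=2):
    print(f"{left:>20} [{word}] {right}")

# Hypothetical coded segments: (text, set of codes attached to it).
segments = [
    ("Stress occurs through overload.", {"CAUSE"}),
    ("Relieving stress through alcohol and smoking.", {"HANDLING", "NATURE"}),
]
print(boolean_and(segments, "HANDLING", "NATURE"))
```

Showing the words either side of each hit lets the researcher see at a glance that the two occurrences are used in different senses, which a bare word count would conceal.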

Reliability in content analysis

There are several issues to be addressed in considering the reliability of texts and their content analysis, indeed, in analysing qualitative data using a variety of means, for example:

Witting and unwitting evidence (Robson 1993: 273): witting evidence is that which was intended to be imparted; unwitting evidence is that which can be inferred from the text, and which may not be intended by the imparter.

The text may not have been written with the researcher in mind and may have been written for a very different purpose from that of the research (a common matter in documentary research); hence the researcher will need to know or be able to infer the intentions of the text.

The documents may be limited, selective, partial, biased, non-neutral and incomplete because they were intended for a purpose other than that of research (an issue of validity as well as of reliability).

It may be difficult to infer the direction of causality in the documents: they may have been the cause or the consequence of a particular situation.

Classification of text may be inconsistent (a problem sometimes mitigated by computer analysis), because of human error, coder variability (within and between coders), and ambiguity in the coding rules (Weber 1990: 17).

Texts may not be corroborated or able to be corroborated.

Words are inherently ambiguous and polyvalent (the problem of homographs): for example, what does the word 'school' mean: a building; a group of people; a particular movement of artists (e.g. the impressionist school); a department (a medical school); a noun; a verb (to drill, to induct, to educate, to train, to control, to attend an institution); a period of instructional time ('they stayed after school to play sports'); a modifier (e.g. a school day); a sphere of activity (e.g. the school of hard knocks); a collection of people adhering to a particular set of principles (e.g. the utilitarian school); a style of life (e.g. a gentleman from the old school); a group assembled for a particular purpose (e.g. a gambling school), and so on. This is a particular problem for computer programs which may analyse words devoid of their meaning.

Coding and categorizing may lose the nuanced richness of specific words and their connotations.

Category definitions and themes may be ambiguous, as they are inferential.

Some words may be included in the same overall category but they may have more or less significance in that category (and a system of weighting the words may be unreliable).

Words that are grouped together into a similar category may have different connotations and their usage may be more nuanced than the categories recognize.

Categories may reflect the researcher's agenda and imposition of meaning more than the text may sustain or the producers of the text (e.g. interviewees) may have intended.

Aggregation may compromise reliability. Whereas sentences, phrases and words and whole documents may have the highest reliability in analysis, paragraphs and larger but incomplete portions of text have lower reliability (Weber 1990: 39).

A document may deliberately exclude something from mention, overstate an issue or understate an issue (Weber 1990: 73).

At a wider level, the limits of content analysis are suggested by Ezzy (2002: 84), who argues that, due to the pre-ordinate nature of coding and categorizing, content analysis is useful for testing or confirming a pre-existing theory rather than for building a new one, though this perhaps understates the ways in which content analysis can be used to generate new theory, not least through a grounded theory approach (discussed later). In many cases content analysts know in advance what they are looking for in text, and perhaps what the categories for analysis will be. Ezzy (2002: 85) suggests that this restricts the extent to which the analytical categories can be responsive to the data, thereby confining the data analysis to the agenda of the researcher rather than the 'other'. In this way it enables pre-existing theory to be tested. Indeed Mayring (2004: 269) argues that if the research question is very open or if the study is exploratory, then more open procedures than content analysis, e.g. grounded theory, may be preferable.

However, although inductive approaches may be ruled out of the early stages of a content analysis, this does not keep them out of the later stages, as themes and interpretations may emerge inductively from the data and the researcher, rather than only or necessarily from the categories or pre-existing theories themselves. Hence to suggest that content analysis denies induction or is confined to the testing of pre-existing theory (Ezzy 2002: 85) is uncharitable; it is to misrepresent the flexibility of content analysis. Indeed Flick (1998) suggests that pre-existing categories may need to be modified if they do not fit the data.
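
The reliability issues listed in this section include coder variability between coders. One common way of quantifying agreement between two coders is Cohen's kappa, which corrects the observed agreement for the agreement expected by chance. Kappa is not mentioned in the text and is brought in here purely as an illustration; the function name `cohen_kappa` and the two sets of codes are hypothetical.

```python
from collections import Counter


def cohen_kappa(codes_a, codes_b):
    """Cohen's kappa for two coders who have each coded the same units.

    kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed proportion
    of agreement and p_e is the agreement expected by chance.
    """
    if len(codes_a) != len(codes_b) or not codes_a:
        raise ValueError("both coders must code the same non-empty set of units")
    n = len(codes_a)
    p_o = sum(a == b for a, b in zip(codes_a, codes_b)) / n
    freq_a = Counter(codes_a)
    freq_b = Counter(codes_b)
    p_e = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if p_e == 1.0:
        # Both coders used a single identical code throughout.
        return 1.0
    return (p_o - p_e) / (1 - p_e)


# Hypothetical codes given by two coders to the same six summary sentences.
coder_a = ["CAUSE", "CAUSE", "NATURE", "OUTCOME", "CAUSE", "HANDLING"]
coder_b = ["CAUSE", "NATURE", "NATURE", "OUTCOME", "CAUSE", "HANDLING"]

kappa = cohen_kappa(coder_a, coder_b)
print(f"kappa = {kappa:.2f}")
```

Here the coders agree on five of the six sentences, but because some of that agreement would be expected by chance, kappa comes out lower than the raw proportion of agreement.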
