
BASIC MEASURES OF OCCURRENCE IN EPIDEMIOLOGICAL RESEARCH

Case definition

The first step in epidemiological research is the definition of the disease or condition of interest. That is,
epidemiologists first ask the question 'When is a case a case?' To give an ophthalmic example, in order
to count the number of people in a population with glaucoma, the criteria for who is eligible to be
included as a case must be clearly defined, and this would typically be in terms of visual field and
features of the optic disc. Changing the definition of glaucoma, perhaps by changing the cut-off for
severity of the visual field loss, will change the number of people counted as cases of glaucoma in a
population. The need for a clear 'case' definition is not limited to an adverse event
such as occurrence of disease, but applies equally well to favourable outcomes such as 'cure', to the
occurrence of other types of event, or to a health-related state. For simplicity's sake we will introduce the
epidemiological concepts in terms of measurements of magnitude and causes of disease, rather than
another health state.

Measures of occurrence

The hallmark of epidemiological investigations is the measurement of the frequency of occurrence of an
event (e.g. disease) in a defined population. It is generally not feasible to measure the occurrence of
disease in the entire population and so it is usual in epidemiological research to assess disease
occurrence in a sample of the population and then extrapolate these findings to the entire population. A
'population' can mean the whole population of a country or region, or a group defined by a common
characteristic (e.g. people with diabetes or smokers).

The two basic occurrence measures are prevalence and incidence. Prevalence assesses the occurrence
of existing disease in a population, while incidence focuses on the occurrence of new cases of a disease.
These will be explained in detail below. An important purpose of measuring the occurrence of disease,
whether through prevalence or incidence, is to assess the magnitude of disease in a population, in order
to uncover epidemics, and plan and monitor health services. Measurement of disease frequency also
allows us to compare the occurrence of disease in different groups (e.g. compare the occurrence of
cataract in people with diabetes to those without diabetes) in order to identify predictors of disease. In
the comparison of two populations or sub-groups, the ratio of the incidences or of the prevalences gives
a relative measure of effect (such as relative risk, RR), and the difference in incidence or in prevalence
gives an absolute measure of effect (such as attributable risk, AR). The following sections describe these
measures briefly, and also consider the concept of risk in epidemiology and how it relates to incidence.

Readers interested in more detailed consideration and advanced statistical analysis of aspects of
occurrence and effect measures should refer to general texts in epidemiology such as those by
Kleinbaum et al. and Rothman et al.4

2.2.1 Prevalence
The prevalence of a disease in a population is the proportion of individuals in that population who have
the disease at a given time. Prevalence (P) is a proportion which can have values 0 to 1 (often expressed
as a percentage), and is said to be dimensionless.

Prevalence of a disease in a defined population (P) = D/N

where:

D is the number of cases with the disease in the defined population at a point in time, and

N is the total number of individuals in the defined population at that time.

For instance, the National Blindness and Visual Impairment Survey conducted in Nigeria during 2005-2007
examined 13,599 people aged 40 years and above. Among those examined, 569 were 'cases' of
blindness (case definition: presenting visual acuity <20/400 or 3/60 in the better eye).

This gives a prevalence of 569 divided by 13,599, which is 0.042 or 4.2%.
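
The calculation above can be written as a short Python sketch. The function name `prevalence` and its argument names are illustrative choices, not part of the survey's own methods; the two input figures are the ones quoted in the text.

```python
def prevalence(cases: int, population: int) -> float:
    """Point prevalence P = D / N, a proportion between 0 and 1."""
    if population <= 0:
        raise ValueError("population must be positive")
    if not 0 <= cases <= population:
        raise ValueError("cases must lie between 0 and population")
    return cases / population


# Figures quoted in the text for the Nigeria survey (people aged 40 years and above).
p_nigeria = prevalence(cases=569, population=13_599)
print(f"Prevalence: {p_nigeria:.3f} ({p_nigeria:.1%})")
```

The range checks simply enforce that D cannot exceed N, which is what makes prevalence a proportion rather than a rate.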

Prevalence is thus a population measure, concerning the magnitude of disease at a given time. It is
important to state the 'point' in time when the prevalence was estimated.

Moreover, the population examined must also be described in terms of demographic or geographical
features: the prevalence of blindness in the whole population would be lower than the prevalence
in those aged 40 years and above. Therefore, the correct description for the Nigeria survey is that the
prevalence of blindness in the sample was 4.2% among people aged 40 years and above, examined in
2005-2007.

Prevalence (P) for a large population is usually estimated by examining a sample drawn from it
and extrapolating back to the large population. Although the estimated 'case' pool gives a static view of
the magnitude of the health problem at a point in time, the pool itself has a dynamic nature. The number
of 'cases' in the population pool changes with time as new (i.e. incident) 'cases' enter it and others
leave it through death or cure (e.g. by successful treatment). The dynamics of prevalence over time are
therefore determined by the occurrence of new cases and the duration of disease (determined by the
death rate among cases and/or the cure rate). Assessment of the population need for health services
requires knowledge and consideration of the measures that determine the pool, as well as a static view
of the pool at the start of the time period through which the population need is to be projected.
Prevalence is determined by incidence: the incidence of new cases entering the pool per unit time, and
the incidence of death and of reversal, removing cases from the pool per unit time. Epidemiological
models have been developed to simulate the dynamics of disease in large populations, projecting the
changes in the 'case' pool over time under various scenarios of service provision, level of intervention,
and changes in demography and risk factor status. Examples include models for onchocerciasis control
(see Chapter 18) and for cataract. Simplistic approaches to needs assessment tend to focus on
prevalence (backlog) without taking proper account of the time dimension, the incidence and their root
determinants (demographic and risk factors). Such approaches may be misleading, and have no place in
epidemiology.
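
To make the idea of a dynamic 'case' pool concrete, the following Python sketch projects the pool forward year by year. It is a deliberately crude illustration and not one of the published models mentioned above: all rates and starting numbers are hypothetical, rates are assumed constant, cured people are assumed to become susceptible again, and births, migration and deaths among non-cases are ignored.

```python
def project_case_pool(cases, at_risk, incidence, death_rate, cure_rate, years):
    """Return the size of the prevalent case pool at the start and after each year.

    incidence  : new cases per person at risk per year (hypothetical)
    death_rate : deaths per prevalent case per year (hypothetical)
    cure_rate  : cures per prevalent case per year (hypothetical)
    """
    history = [cases]
    for _ in range(years):
        new_cases = incidence * at_risk   # inflow to the pool
        deaths = death_rate * cases       # outflow: cases who die
        cures = cure_rate * cases         # outflow: cases who are cured
        cases = cases + new_cases - deaths - cures
        # Assumption: cured people return to the at-risk group.
        at_risk = at_risk - new_cases + cures
        history.append(cases)
    return history


pool = project_case_pool(cases=1_000, at_risk=99_000, incidence=0.002,
                         death_rate=0.05, cure_rate=0.10, years=5)
print([round(n) for n in pool])
```

Even in this toy version the pool changes only when inflow (incident cases) and outflow (death and cure) are out of balance, which is why a single prevalence figure cannot by itself describe future need.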

In addition to its utility in planning, provision and monitoring of health services and needs
assessment, prevalence data from cross-sectional studies (see Chapter 3) have also been used to compare
various sub-groups of the population, defined by level of exposure to a suspected determinant of the
disease. Examples include comparison of cataract prevalence in sub-groups with various levels of
exposure to nutritional factors, or comparing the prevalence among men and women. If the suspected
exposure definitely preceded the onset of the disease, as for genes, gender, ethnicity or blood group,
then these comparisons can give insights into the development and predictors of the disease. However,
this is more difficult for potential exposures that change over time, such as ageing, weight or blood
pressure, because both exposure and disease are measured at the same point in time and so it cannot be
determined whether either is the causal agent. This latter point also holds true for potential
confounders of the association between exposure and disease (e.g. smoking confounding the association
between ethnicity and cataract), which need to be taken into account during the analyses. Such
comparative analyses, however, have been helpful in generating new aetiological hypotheses (concerning
possible determinants of disease), which could then be tested through more appropriate studies (such as
case-control or cohort studies, discussed in Chapters 4 and 5 respectively), designed to generate
comparative measures of occurrence based on incidence.

2.2.2 Incidence

Incidence refers to the occurrence of new cases in the population at risk, as opposed to prevalence,
which measures the frequency of existing cases in the total population of cases and non-cases. There
are two main measures of incidence, described in detail below: cumulative incidence (which is the
proportion of the population that is disease free at baseline that develops a disease over a given time
period) and incidence density or incidence rate (which is the number of cases that occur in the
population per unit of person-time at risk and is a rate as it has a time dimension). Both measures
concern a change of status in individuals who are candidates for such change; e.g. occurrence of
primary open-angle glaucoma in individuals who do not have the disorder and who have at least one eye
intact which may develop the disorder, that is, who are 'at risk' or susceptible. As described for
prevalence, incidence figures are meaningful only when a clear 'case' definition is specified, together
with the population 'at risk', and the time interval or unit.

Cumulative Incidence (CI)

The cumulative incidence (CI) of a disease in a population is the number of new 'cases' that occur
over a specified period of time, expressed as a proportion of the population that is disease free at
baseline (i.e. excluding prevalent cases). Cumulative incidence, like prevalence, is a proportion which
can have values 0 to 1 (often expressed as a percentage) and is said to be dimensionless. The CI is
directly interpretable as the risk of the event in the period t because it is the probability of the
event occurring in that time period.

CI in time t in a defined population = D_new/N_free

where:

t is the time period,

D_new is the number of new cases occurring in time t in the defined population, and

N_free is the number of individuals free of the disease at baseline in the defined population (i.e.
individuals at risk).

For instance, in the Wisconsin Epidemiologic Study of Diabetic Retinopathy (see Chapter 21), 610 people
with young onset insulin-taking diabetes and 652 with older onset diabetes but no macular oedema were
followed for four years. During the follow-up there were 50 new cases of macular oedema among the
younger onset diabetics and 34 among the older onset diabetics. The cumulative incidence of macular
oedema is therefore 50 divided by 610 in the young onset group, which is 0.082 or 8.2%, and 34 divided
by 652 in the older onset group, which is 0.052 or 5.2% over a four-year period.
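
The two cumulative incidence figures above can be reproduced with a few lines of Python. This is a minimal sketch: the function and variable names are illustrative, and the counts are those quoted in the text.

```python
def cumulative_incidence(new_cases: int, at_risk_at_baseline: int) -> float:
    """CI = new cases during follow-up / people disease free at baseline."""
    if at_risk_at_baseline <= 0:
        raise ValueError("at_risk_at_baseline must be positive")
    return new_cases / at_risk_at_baseline


# Four-year follow-up figures quoted in the text.
ci_younger = cumulative_incidence(new_cases=50, at_risk_at_baseline=610)
ci_older = cumulative_incidence(new_cases=34, at_risk_at_baseline=652)

print(f"Younger onset: {ci_younger:.3f} ({ci_younger:.1%}) over four years")
print(f"Older onset:   {ci_older:.3f} ({ci_older:.1%}) over four years")
```

Note that the time period is not part of the arithmetic, so it has to be reported alongside the result for the figure to be interpretable.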

CI gives important information for planning and provision of services, and needs assessment, as it can
be used to derive the number of new cases that are expected in a given time period in a given
population. The main utility of CI, however, is in an aetiological study where the objective is to
compare risk between groups that have various levels of exposure to potential risk factors, such as
comparing risk of disease in smokers to that in non-smokers. CI is also a simple and highly
interpretable measure of the risk of cure (better termed the probability of cure) for the treatment
groups in a clinical trial.

The main problem with CI is that, in most circumstances, disease occurrence is a rare event and so long
follow-up is often required to assess CI meaningfully. Long follow-up (one year or more) means that
some members of the cohort are bound to be lost to follow-up. Such loss, if non-trivial, may introduce
an unknown amount of bias into the CI estimate, as loss to follow-up may be related to the participants'
risk of becoming a case.

Incidence Rate or Density (IR)

The incidence rate (IR) uses person-time at risk as the denominator rather than the number of people at
risk. Imagine we are following a group of 1,000 people who are disease free at baseline for a given
period of time to assess the incidence of cataract blindness. During the follow-up people may: (a)
develop a cataract, (b) be lost to follow-up, (c) die or develop another condition so that they are no
longer at risk of cataract (e.g. have eyes removed) or (d) remain disease free until the end of
follow-up. Person-time of follow-up across the 1,000 people is accrued from the start of follow-up
until one of the four events occurs. The total person-time at risk is the sum of the person-time at
risk for each subject in the study. The incidence rate is the number of new cases that occur during
follow-up divided by the total amount of person-time of follow-up of the participants. Note that once a
person becomes a case, he/she is no longer at risk of the disease and moves out of the denominator.
Incidence rate is expressed as cases per unit of person-time (e.g. 5 per 100,000 person-years).


Incidence Rate (IR) = number of new cases/total person-time at risk

As an example, 297,756 people from 2,315 villages in 11 countries were followed up during 1971-2001 to
assess the incidence of blindness as part of the Onchocerciasis Control Programme in western Africa. In
total, 367,788 person-years of follow-up were accumulated and 200 people became blind as a result of
onchocerciasis. The incidence rate of blindness due to onchocerciasis in this population was 200
divided by 367,788 person-years, which is 0.00054 cases per person-year of follow-up, or 5.4 cases per
10,000 person-years of follow-up.
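
The person-time bookkeeping described above can be sketched in Python. The five follow-up records in `follow_up` are invented purely to show how person-time is summed across individuals; only the final calculation uses the figures quoted in the text.

```python
def incidence_rate(new_cases: int, person_time: float) -> float:
    """IR = new cases / total person-time at risk."""
    if person_time <= 0:
        raise ValueError("person_time must be positive")
    return new_cases / person_time


# Hypothetical records: (years at risk, became a case?). Each person stops
# contributing time when they become a case, are lost, die, or the study ends.
follow_up = [(5.0, False), (2.5, True), (1.0, False), (4.0, True), (5.0, False)]
person_years = sum(years for years, _ in follow_up)
cases = sum(1 for _, became_case in follow_up if became_case)
ir_small = incidence_rate(cases, person_years)
print(f"Toy cohort: {cases} cases / {person_years} person-years = {ir_small:.3f} per person-year")

# Figures quoted in the text for the Onchocerciasis Control Programme cohort.
ir_ocp = incidence_rate(new_cases=200, person_time=367_788)
print(f"{ir_ocp * 10_000:.1f} cases per 10,000 person-years")
```

The multiplier (here 10,000) is only a presentation choice; the underlying quantity is always cases per unit of person-time.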

Note that IR has a time dimension and is a measure of speed of occurrence or rate. It is an
instantaneous rate of occurrence and in practice is difficult to conceptualize and interpret. The IR,
however, can be estimated from longitudinal studies that have variable follow-up periods and losses, as
people only contribute person-time during the period that they are actually at risk of being recorded
as a case in the study. It forms a valid and useful comparative measure in aetiological studies and
trials, in spite of the fact that it has no direct interpretation as risk. In the example given, the
incidence rate was 5.4 per 10,000 person-years at risk, and although it is tempting to interpret this
to mean 54 new cases per 100,000 persons per year, in fact the '100,000 person-years' cannot be
meaningfully segregated into units of persons and of years.

Prevalence and incidence are distinct, but related, concepts. If the prevalence is relatively low
(<10%) then prevalence approximates the incidence multiplied by the average duration of the disease.
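
A short Python sketch shows why the approximation is restricted to low prevalence. It assumes a steady state (constant incidence rate and duration, stable population), under which the prevalence odds P/(1-P) equal the incidence rate multiplied by the mean duration; the function names and the input values are hypothetical.

```python
def approx_prevalence(incidence_rate: float, mean_duration: float) -> float:
    """Low-prevalence approximation: P is roughly IR x D."""
    return incidence_rate * mean_duration


def steady_state_prevalence(incidence_rate: float, mean_duration: float) -> float:
    """Steady-state relation P / (1 - P) = IR x D, solved for P."""
    odds = incidence_rate * mean_duration
    return odds / (1 + odds)


# Hypothetical inputs: cases per person-year and mean duration in years.
for ir, duration in [(0.005, 10), (0.05, 10)]:
    approx = approx_prevalence(ir, duration)
    exact = steady_state_prevalence(ir, duration)
    print(f"IR={ir}, D={duration}: approx={approx:.3f}, steady-state={exact:.3f}")
```

With the first pair of inputs the two values are close (about 5%), whereas with the second the simple product overstates the steady-state prevalence considerably.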

2.3 MEASURES OF EFFECT

The strength of the association between risk of disease and a potential causative factor can be
evaluated by comparing the occurrence measure (e.g. incidence or prevalence) for a group that is exposed
to a possible causative factor to that of a group that is unexposed (or less exposed). The main effect
measures are discussed in Chapters 3-5, under the appropriate study methods. Here, an outline
description is given to familiarize the reader with the basic concepts.

2.3.1 Difference (absolute) measure

Consider the comparison of two risks R0 and R1, where R0 is a baseline risk in those not exposed (or
least exposed) to the factor of interest, and R1 is the risk in those exposed. For instance, in the
Blue Mountains Eye Study, a cohort of people were followed for ten years, and the risk of nuclear
cataract among non-diabetics (R0) was 35% while the risk of nuclear cataract among diabetics (R1) was
54%.10 To compare the risk among the exposed and unexposed we can simply calculate the difference
R1 - R0, which is an absolute measure of effect, commonly called attributable risk (AR). For the Blue
Mountains Eye Study the attributable risk of nuclear cataract related to diabetes was 54% - 35% = 19%.
From a public health perspective, the magnitude of AR indicates the amount of risk in the exposed group
that could be avoided if the group became unexposed, assuming the putative factor is responsible for
the difference. A large AR, however, may simply reflect a large R0 and R1. Similarly, a small AR may
arise when R0 and R1 are small, even when R1 is several times larger than R0. Thus the AR is not a good
measure of the strength of association between an exposure and a disease in the context of aetiological
research.

2.3.2 Relative measures

Relative risk (RR)

The natural solution to estimating the association between an exposure and a disease is to calculate a
relative measure that is independent of the magnitude of R0. In epidemiological studies the most common
relative measure is the relative risk, or R1/R0. This can also be called the risk ratio, rate ratio,
odds ratio or incidence rate ratio (when R0 and R1 are incidence density measures). Relative risk is a
good indicator of the strength of association; for example, a value of 1.0 (or close to unity)
indicates no association, 2.0 suggests doubling of risk in the exposed, and 0.5 may indicate halving of
the risk in the exposed. For the above example, the relative risk for the association between diabetes
and nuclear cataract is 54%/35% = 1.5. This means that the risk of nuclear cataract was 50% higher (1.5
times as high) among people with diabetes compared with people without diabetes during ten years of
follow-up. It is also possible to calculate the prevalence ratio, by dividing the prevalence of disease
in the exposed group by the prevalence in the unexposed group.
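
The contrast between the absolute and the relative measure can be illustrated with a short Python sketch. The function names are illustrative; the first pair of risks is taken from the Blue Mountains Eye Study figures quoted above, while the second pair is hypothetical and chosen only to show a small difference alongside a large ratio.

```python
def attributable_risk(r1: float, r0: float) -> float:
    """Absolute measure: AR = R1 - R0."""
    return r1 - r0


def relative_risk(r1: float, r0: float) -> float:
    """Relative measure: RR = R1 / R0."""
    if r0 <= 0:
        raise ValueError("r0 must be positive")
    return r1 / r0


# Ten-year risks quoted in the text (diabetics vs non-diabetics).
ar_bmes = attributable_risk(r1=0.54, r0=0.35)
rr_bmes = relative_risk(r1=0.54, r0=0.35)
print(f"AR = {ar_bmes:.2f}, RR = {rr_bmes:.1f}")

# Hypothetical rare disease: tiny absolute difference, large relative risk.
ar_rare = attributable_risk(r1=0.005, r0=0.001)
rr_rare = relative_risk(r1=0.005, r0=0.001)
print(f"AR = {ar_rare:.3f}, RR = {rr_rare:.1f}")
```

Reporting both measures side by side avoids the misreading that either a small AR or a large RR would invite on its own.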

Population attributable risk percent (PAR%)

A population is made up of people who are exposed and those who are unexposed, and these people may
have a different risk of disease. The risk in the total population is therefore determined by the risk
in those who are exposed and those who are unexposed, as well as the proportion of subjects who are
exposed (i.e. the prevalence of exposure). Although a large relative risk may indicate that the
exposure is of aetiological importance, this does not necessarily mean that the exposure is of public
health concern; for example, when the disease or the exposure is extremely rare. For a holistic
appreciation of the importance of an exposure (detrimental or beneficial) at the population level, it
is necessary to quantify both relative risk and one of its derivatives. Population attributable risk
percent (PAR%) gives the proportion of all 'cases' (occurring in a general population) that is
attributable to the exposure, by subtracting the risk in the unexposed group from the risk in the total
population. That is:

Population attributable risk percent (PAR%) = 100 x (RT - R0)/RT

where RT is the risk in the total population and R0 is the risk in the unexposed group.

In terms of the diabetes and nuclear cataract example, if the risk of nuclear cataract in the total
population was 36% then the PAR% = (36% - 35%)/36% = 0.03 or 3%, suggesting that 3% of nuclear cataract
could be avoided if diabetes were eliminated from the population. This has meaning only when a causal
relationship is assumed between a detrimental exposure (diabetes) and the 'case' status (nuclear
cataract). Another form of the equation (which is mathematically the same as the above), provided by
Miettinen, shows the relation between PAR% and RR:

PAR% = P(RR-1)/RR

where P is the prevalence (%) of exposure among the 'cases'.

The utility of PAR% is clear in so far as it gives an impression of the amount of the disease problem
that might 'disappear' if exposure to the risk factor at issue is removed. This could be valuable in
the planning of a preventive intervention.
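
The two forms of the PAR% equation can be checked against each other in Python. This is a sketch under stated assumptions: the risks of 35%, 54% and 36% are those used in the text, but the prevalence of exposure is not given there, so the code derives the value implied by those three risks. Function names are illustrative, and the exposure prevalence among cases is passed as a proportion (0 to 1) rather than a percentage.

```python
def par_percent(risk_total: float, risk_unexposed: float) -> float:
    """PAR% = 100 x (risk in total population - risk in unexposed) / risk in total population."""
    return 100 * (risk_total - risk_unexposed) / risk_total


def par_percent_from_rr(prop_cases_exposed: float, rr: float) -> float:
    """Second form: PAR% = P x (RR - 1) / RR, with P the proportion of cases exposed."""
    return 100 * prop_cases_exposed * (rr - 1) / rr


r0, r1, r_total = 0.35, 0.54, 0.36            # risks used in the text's example

par_direct = par_percent(r_total, r0)

# Exposure prevalence implied by the three risks (an inference, not a reported figure).
prop_exposed = (r_total - r0) / (r1 - r0)
prop_cases_exposed = prop_exposed * r1 / r_total
par_from_rr = par_percent_from_rr(prop_cases_exposed, rr=r1 / r0)

print(f"PAR% (direct form): {par_direct:.1f}")
print(f"PAR% (RR form):     {par_from_rr:.1f}")
```

Both routes give the same value, which rounds to the 3% quoted in the text.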

Odds ratio (OR)

When separate estimates for R1 and R0 cannot be made (e.g. in case-control studies, see Chapter 4), the
risk ratio or rate ratio cannot be estimated directly. Usually in such situations, the study design
ordains that a sample of 'cases' (individuals with the disease or condition of interest) is obtained
and compared with a contemporaneous sample of non-cases. In the simplest form, the data from such
studies may be summarized in a 2 by 2 contingency table (Table 2.1).

Table 2.1 Summary of data (in the simplest form) from a case-control or a cross-sectional study.
Individuals are classified according to disease status (cases and non-cases) and also by status of
exposure to the factor of interest. The cell frequencies are denoted by the letters a, b, c and d.

             Exposed    Not exposed
Cases        a          b
Non-cases    c          d

The odds of being 'exposed' can be computed for the 'cases' and the 'non-cases' as a/b and c/d,
respectively. The exposure Odds Ratio (OR) is expressed as:

OR = (a/b)/(c/d) = ad/bc

From the diabetes and nuclear cataract example, the odds of exposure to diabetes among the cases of
cataract was 37/415 and among the people without cataract was 32/775. This gave an OR of
(37/415)/(32/775) = 0.089/0.041 = 2.2. This indicates that the odds of exposure to diabetes is 2.2
times as high among cases with cataract as among people without cataract.
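
The odds ratio calculation can be written in Python using the cell labels of Table 2.1. This is a minimal sketch: the function name is illustrative, and the four counts are the ones quoted in the example above.

```python
def odds_ratio(a: int, b: int, c: int, d: int) -> float:
    """Exposure odds ratio from a 2 by 2 table.

    a: exposed cases        b: unexposed cases
    c: exposed non-cases    d: unexposed non-cases
    """
    if b == 0 or c == 0:
        raise ValueError("b and c must be non-zero")
    return (a * d) / (b * c)


# Counts from the diabetes and nuclear cataract example in the text.
a, b, c, d = 37, 415, 32, 775
odds_cases = a / b          # odds of exposure among cases
odds_non_cases = c / d      # odds of exposure among non-cases
or_cataract = odds_ratio(a, b, c, d)

print(f"Odds among cases:     {odds_cases:.3f}")
print(f"Odds among non-cases: {odds_non_cases:.3f}")
print(f"OR = {or_cataract:.1f}")
```

Using the cross-product ad/bc rather than dividing two rounded odds avoids carrying rounding error into the final ratio.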

The OR obtained from well-designed case-control studies is generally believed to give a good indirect
estimate of incidence rate ratio (IRR), particularly when the incidence of the 'case' status is low
(e.g. for rare diseases). Regardless of its validity as an indirect measure of IRR, the OR remains a
valid measure of the strength of association between exposure and 'case' status. Like IRR, greater
deviations from unity indicate stronger associations. The OR has become a very popular choice, even
when direct estimation of IRR may be possible, due to the ready availability of robust statistical
procedures for OR analysis, such as the logistic regression model. Thus the OR is frequently used in
the analysis of data that come not only from case-control studies, but also from clinical trials, and
from cross-sectional studies (prevalence data).

2.4 NOTES ON ESTIMATION OF EFFECT MEASURES AND HYPOTHESIS TESTING

Most published reports of epidemiological studies include estimation of the risk ratio, rate ratio or
odds ratio after adjustment for other factors besides the exposure of interest that may influence the
association (these are called confounders). For such an approach, multiple regression analysis is
undertaken using statistical tools such as the logistic regression model. These models are used to test
hypotheses and to estimate the pertinent effect measure, with adjustments for the effect of other
(extraneous) exposure factors (possible confounders). The analyses also quantify the level of precision
for the estimate, usually reported as confidence limits. Estimation of the effect measure in practice
is thus considerably more complex than the above-mentioned equations would suggest. The objectives and
the type of epidemiological study that generates the data largely determine the choice of effect
measure and the analysis tool. The following section outlines the commonly used study designs and
related objectives.

2.5 TYPES AND OBJECTIVES OF EPIDEMIOLOGICAL RESEARCH

The diversity of methods used in epidemiological research may be grouped into two primary types. The
first is experimental studies, where the researcher assigns the exposure to the factor of interest to
participants, and this includes randomized clinical trials, randomized screening trials, field trials
and community intervention trials. The second type is observational studies, where the exposure is not
under the control of the investigator (i.e. exposure is 'observed' rather than assigned by the
investigator). Observational studies include cross-sectional studies, case-control studies,
longitudinal cohort studies, and their variants.

2.5.1 Randomized experimental studies

The gold standard of randomized experimental studies is the randomized controlled trial (RCT) (see
Chapter 7). In this type of study, the researchers randomly allocate participants to 'exposure'
(therapeutic or preventive intervention) or not (control). The 'unexposed' or control group may receive
nothing, a placebo or the standard treatment. Allocation is random so that if the groups are
sufficiently large they should be similar with respect to all extraneous factors, known and unknown,
which might influence the outcome. For instance, they will be similar in terms of age structure, health
conditions and body size. This means that any difference in outcome between the comparison groups could
be attributed to the exposure to 'treatment' under study rather than any other difference between the
two groups. The intervention and control groups are then followed over time to assess disease
incidence; the incidence in the intervention group is compared with that in the control group to assess
the effect of the intervention on the disease risk. Participants and researchers are typically masked
to the exposure status of the participants, meaning that they do not know to which group the
participant has been assigned, and this is done in order to reduce bias in reporting and assessment of
disease.
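
The random allocation step described above can be sketched in Python. This shows only the simplest scheme, a single shuffle split into two arms; the function name, the seed and the participant identifiers are hypothetical, and it is not a description of how any particular trial was randomized.

```python
import random


def randomize(participant_ids, seed=None):
    """Randomly split participants into two arms of (near-)equal size."""
    rng = random.Random(seed)          # seeded generator so the allocation can be reproduced
    shuffled = list(participant_ids)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


# Twenty hypothetical participant identifiers.
intervention, control = randomize(range(1, 21), seed=2024)
print("Intervention:", sorted(intervention))
print("Control:     ", sorted(control))
```

Because allocation depends only on the random shuffle, neither the investigator nor the participant can influence which arm a person enters, which is the property that balances known and unknown extraneous factors in large groups.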

As an example, a randomized clinical trial was undertaken to assess whether beta carotene
supplementation reduced the incidence of age-related maculopathy (ARM). The investigators randomly
allocated 22,071 male doctors to receive beta carotene or placebo. They were treated and followed up
for 12 years and neither the investigators nor the participants knew to which treatment arm they had
been assigned. At the end of follow-up there were 162 cases of ARM in the beta carotene group and 170
cases in the placebo group to give a relative risk of 0.96 (95% CI 0.78-1.20), showing no protective
effect of supplementation on the development of ARM.

(For explanation of 95% CI (95% confidence intervals) please see box in Chapter 1).

Variants on the randomized clinical trial model exist, such as the randomized screening trial (see
Chapter 8). Another variant is the field trial, where individuals in a population are randomized to a
preventive intervention (e.g. a vaccine) or no intervention (or placebo), and the incidence of disease
is compared in the two groups. Such trials have been used for diseases that are of great public health
concern. Some of the largest experimental studies have been of this type; for example, the Salk vaccine
trial for poliomyelitis, involving more than one million children. Community intervention trials are
similar to field trials but in this design convenient clusters of people (communities) rather than
individuals are randomized to the preventive intervention. An example is the first community-based
vitamin A trial (see Chapter 14b).

The main limitations of randomized experimental studies arise from two sources. First, for practical
reasons, strict eligibility criteria have to be used in selection of subjects, so that the observations
are often made on a highly selected sample. Consequently the inferences may be limited to small, and
sometimes peculiar, populations rather than to a large population of general interest. Second,
randomization may be unethical if one of the interventions or treatments is regarded as more beneficial
by clinicians or by the patients. Moreover, it may not be feasible to randomize exposure to certain
biological, behavioural or other psychosocial factors (e.g. smoking, stress or alcoholism). The high
cost of experimental studies in some situations is an additional disadvantage; for example, in the UK,
the cost of a major ophthalmic clinical trial has been in excess of £2,000 per randomized individual.
Much higher costs have been incurred in the USA. Such studies require considerable justification. In
view of these limitations, most epidemiological investigations of aetiological factors are
observational in design.

2.5.2 Observational studies — basic designs

In observational studies the investigators observe the events as they unfold naturally. There are three
main basic types of observational study: cross-sectional studies or surveys (see Chapter 3),
case-control studies (see Chapter 4) and cohort (longitudinal, follow-up) studies (see Chapter 5).

Cross-sectional studies

In a cross-sectional study the investigators carefully sample people from the population. They then
examine and/or interview the participants to assess whether or not they have the disease and
exposure(s) of interest. This allows assessment of the prevalence of disease (and exposure), so that
the magnitude of disease in the population can be estimated. The prevalence can also be compared in
different groups (e.g. those exposed and those unexposed) to explore whether there may be an
association between the exposure and the disease. As an example, in the national survey of blindness
conducted in Nigeria described above, the overall prevalence of blindness was 4.2% (95% confidence
intervals: 3.8 to 4.6%). The prevalence of blindness was higher among people who were illiterate (5.8%)
compared with those who could read and write easily (1.5%) to give a prevalence ratio of 3.9,
suggesting that socio-economic status may play a role in the incidence or persistence (duration) of
blindness. However, with cross-sectional surveys we must emphasize the adage that 'association does not
equal causation'. We do not know whether the exposure or disease came first, or if both are caused by a
third factor.


Cohort studies

Cohort studies allow us to measure disease incidence, rather than focusing on prevalence as in the
cross-sectional studies. To conduct a cohort study, a group of people free from the disease of interest
are selected (i.e. prevalent cases are excluded). The participants are then examined or interviewed and
are categorized as 'exposed' or 'unexposed' with respect to the risk factor of interest. The
participants are then followed over time and the number of incident cases of disease that arise is
assessed. This allows the investigators to calculate the incidence of disease (whether cumulative
incidence or incidence rate). The incidence can be calculated separately in the exposed and unexposed
groups so that the relative risk (CIR or IRR) can be estimated.

The Copenhagen City Eye Study is a long-running cohort study. During 1986-1988, 946 volunteers aged
60-80 living in Copenhagen were examined and participated in the study. The cohort was followed up 14
years later, when 359 (97% of survivors) were re-examined and 301 included in the analyses (81%). At
follow-up 163 of the subjects had ARM to give a risk (cumulative incidence) of ARM of 163/301 = 0.54 =
54%. Risk of ARM was higher among people who had a family history of ARM (24/30 = 80%) compared with
the risk in people without a family history (139/271 = 51%) to give a risk ratio of 1.56 (confidence
intervals 1.26-1.93). These results indicate that having a family history of ARM may increase the risk
of developing ARM by 56%.
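
The risk ratio in this example, together with an approximate confidence interval, can be computed in Python. The text does not say how the published interval was derived; the sketch below assumes the common large-sample method based on the standard error of the log risk ratio, which is an assumption on our part. The function name and the default multiplier of 1.96 (for a 95% interval) are illustrative.

```python
import math


def risk_ratio_with_ci(cases_exp, n_exp, cases_unexp, n_unexp, z=1.96):
    """Risk ratio with an approximate confidence interval (log method).

    Assumes the large-sample standard error of ln(RR):
    sqrt(1/a - 1/n1 + 1/c - 1/n0).
    """
    rr = (cases_exp / n_exp) / (cases_unexp / n_unexp)
    se_log_rr = math.sqrt(1 / cases_exp - 1 / n_exp + 1 / cases_unexp - 1 / n_unexp)
    lower = math.exp(math.log(rr) - z * se_log_rr)
    upper = math.exp(math.log(rr) + z * se_log_rr)
    return rr, lower, upper


# Counts quoted in the text: family history (24/30) vs no family history (139/271).
rr, lower, upper = risk_ratio_with_ci(24, 30, 139, 271)
print(f"RR = {rr:.2f} (95% CI {lower:.2f}-{upper:.2f})")
```

With these counts the assumed method gives the same risk ratio and interval, to two decimal places, as the figures quoted above.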

This example illustrates the problem of loss to follow-up, which is often encountered by cohort
studies. Not all of the initial cohort were included in the final analyses, and it is not known what
happened to those lost to follow-up in terms of ARM. Although the majority were lost through death, it
is not known whether or not they developed ARM before they died. This means that the risk measured may
not be the 'true' risk in that study population. Another problem with cohort studies is that they
usually either require a large sample or long follow-up to accumulate enough incident cases of disease
to have sufficient power to make meaningful inferences (for further discussion of 'power' see also
Chapter 7). This makes cohort studies expensive and time consuming, which is why they are rare in the
ophthalmic literature. The strengths of cohort studies are that incidence can be estimated and that the
researchers are relatively confident that the exposure preceded the disease.

Case-control studies

The third type of observational study that is fre-

quently used is the case-control study, which is

used to study the aetiology of disease. Case-control

studies are conducted by recruiting people who

have the disease of interest (cases) as well as people

without the disease (controls) to allow comparison.

The cases and controls are then interviewed or

examined to assess whether they have the exposure

of interest. The odds of being exposed are then

compared for cases and controls to see if there is an

association between exposure and being a case.

Ideally, cases are 'incident cases', so that they are

newly diagnosed and have not had time to change

their exposure status as a result of their diagnosis.

The controls are selected from the same population

that gave rise to the cases and represent the exposure distribution in the source population.
Case-control studies cannot be used to estimate the burden of


disease, since the ratio of cases to controls is deter-

mined by the investigators.

A case-control study was undertaken to investi-

gate the association between childbearing and risk

of cataract in young women. The cases selected

were women aged 35-45 attending an eye hospital

in central India with bilateral 'senile' cataract.

Controls were selected among women of the same

age with clear lenses attending the hospital with

other complaints. Cases and controls were inter-

viewed about their history of pregnancy and child-

birth. The investigators found that the cases had

statistically significantly more live births than the

control subjects. Compared with women who had

had 1-3 children, the odds of a case having 4-6

children was 1.9 times higher (95% CI 1.1-3.1)

than for controls, and the odds of a case having

7-11 children was 4.6 times (2.0-10.6) higher

than for controls. These results suggest that in

central India there is an association between high

parity and having a cataract.
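
As a rough illustration of how such an odds ratio is obtained, the sketch below computes an odds ratio and an approximate 95% confidence interval from a 2x2 table. The counts are hypothetical (the text does not report the underlying table), and the function name `odds_ratio` and the log-based interval are assumptions made for this example.

```python
import math

def odds_ratio(exposed_cases, unexposed_cases,
               exposed_controls, unexposed_controls, z=1.96):
    """Odds ratio with a large-sample (log-based) confidence interval."""
    or_ = (exposed_cases * unexposed_controls) / (unexposed_cases * exposed_controls)
    # Standard error of log(OR) from the four cell counts
    se = math.sqrt(1 / exposed_cases + 1 / unexposed_cases
                   + 1 / exposed_controls + 1 / unexposed_controls)
    lower = math.exp(math.log(or_) - z * se)
    upper = math.exp(math.log(or_) + z * se)
    return or_, lower, upper

# Hypothetical counts, NOT from the study:
# "exposed" = 4-6 children, "unexposed" = 1-3 children (the reference group)
or_, lower, upper = odds_ratio(exposed_cases=40, unexposed_cases=60,
                               exposed_controls=25, unexposed_controls=75)
print(f"OR = {or_:.1f} ({lower:.2f}-{upper:.2f})")
```

With these made-up counts the odds of exposure among cases are twice those among controls. An interval that excludes 1 would be read as a statistically significant association.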

There are many advantages to case-control

studies. They are relatively quick and cheap to carry out, and can be used to investigate rare diseases.
Recall bias is a problem, since cases and

controls may report exposure history differently

because of their case status. Avoiding bias due to

selection of inappropriate controls is also a major

challenge.

Nested case-control studies combine some of


the advantageous features of case-control and

cohort studies. For example, in one common form

all disease 'cases' occurring in a given population

are identified through a registry. A sample of non-

cases (sometimes matched for age etc.) is also

drawn from the same population. The past expo-

sure status is then ascertained for both the cases

and the sample of non-cases. Here, the number of

non-cases that have to be investigated for past

exposure is only a small fraction of the numbers in

a longitudinal cohort study (particularly when the

disease is rare). An important advantage over a

case-control study design is that a measure of dis-

ease frequency can be estimated for the population.

2.5.3 Variants of the basic observational studies

There are also variants of the basic observational

designs that arise from a diversity of design options

which are available in observational studies. The

more pertinent variants are briefly described below.

Ecological studies

In the ecological study design the exposure status

data are not available for individuals but are

obtained as an average for groups. The unit of

observation is thus the group rather than an indi-

vidual. The groups are commonly defined geo-


graphically (e.g. as a whole country or region) but

could be defined by other factors, such as socio-

economic, occupational and demographic factors.

A proxy measure is often used to describe the

exposure status of the group as individual measures

are not obtained. For instance, average per capita

gross domestic product for the country could be

used as the measure of socio-economic status.

Comparisons are made between the groups in

respect of frequency of disease occurrence.

Examples in eye research include older studies of

the relationship between ultraviolet radiation from the sun and frequency of cataract. In one such

study in Australia, various geographical zones were

defined according to average levels of ambient

ultraviolet radiation. The zones were then com-

pared with respect to the population prevalence of

cataract. It is now well known that inferences

from ecological data may be misleading. The prob-

lem, named 'ecologic fallacy', arises because there

are often insufficient data on other pertinent

exposures to allow control of confounding in the

analysis (i.e. third factors that could explain the

association between the exposure and the disease).

In addition, the exposure and disease status of indi-

viduals is not known, and so even if an association

between the exposure and disease exists at the pop-

ulation level, it may not exist at the individual level.

Self-controlled case series


A self-controlled case series can be used to assess

the effect of a transient exposure on disease risk.

A patient who has developed a disease is inter-

viewed about his/her exposure pattern during a

specific period of time. This time is divided into

'risk' periods and 'control' periods. For instance,

imagine that an investigator wishes to assess the

relationship between administration of dilatation

drops and the development of acute angle-closure
glaucoma. The investigator would interview cases
of acute angle-closure glaucoma about whether or

not they had received dilatation drops during the

24 hours prior to disease onset. The period in the

two hours before disease onset could be defined as

the 'risk' period, while the previous 22 hours are

the 'control' period. The investigator can then

assess whether the onset of disease is associated

with administration of drops. This study design

includes only cases, and the cases act as their own

controls, therefore any potential association

between the exposure and disease cannot be attrib-

uted to differences in age or other risk factors

(i.e. confounding is removed).
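
The comparison of 'risk' and 'control' periods can be sketched crudely as a ratio of exposure rates per hour in the two windows. This is only a simplified illustration, not the formal estimator used in published self-controlled analyses; the counts and the function name `exposure_rate_ratio` are hypothetical.

```python
def exposure_rate_ratio(exposed_in_risk, exposed_in_control,
                        risk_hours=2, control_hours=22):
    """Ratio of exposures per hour in the risk window to those in the control window."""
    rate_risk = exposed_in_risk / risk_hours
    rate_control = exposed_in_control / control_hours
    return rate_risk / rate_control

# Hypothetical data: 50 cases received drops once in the 24 hours before onset,
# 15 of them in the 2-hour risk period and 35 in the 22-hour control period
ratio = exposure_rate_ratio(15, 35)
print(f"Exposure rate ratio: {ratio:.1f}")  # about 4.7

# If exposure were unrelated to onset, drops would fall in each window in
# proportion to its length, giving a ratio of 1
print(exposure_rate_ratio(2, 22))  # 1.0
```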

Space/time cluster studies

Space clustering studies share the features of eco-

logical studies in so far as exposure data are not

available for individuals and clusters or groups are

compared. In addition, the grouping need not be


according to levels of exposure to a particular puta-

tive risk factor. The studies are designed to detect

clustering in space, that is, a non-uniform distribution
of 'cases' over the total study area, beyond the

level of clustering that might be expected from the

population distribution and chance. The studies

have been used to implicate or assess general envi-

ronmental influences in the disease aetiology.

Examples include international comparisons of dis-

ease risk or prevalence.

Time clustering studies are designed to detect

non-uniform distribution of the occurrence of

'cases' over a time period for a defined population.

The main objectives are to identify secular trends in

large populations, including cyclic fluctuations,

and to explore local epidemics. A study of optic

neuritis in Sweden is an example of the method in

ophthalmology. The clustering (in time, and separately
in space) of incident cases over a six-year

period among the population of Stockholm

County was investigated. Only a seasonal variation

emerged: highest incidence in spring, and lowest in

winter.

Time clustering designs are sometimes com-

bined with space clustering designs to detect clus-

tering in time and space. Clustering of 'cases' in

both time and space suggests the involvement of

infectious agents in causation of the disease.


Genetic studies

There is enormous interest in identifying genetic

risk factors for disease (Chapter 6). This can be

investigated through cross-sectional surveys,

cohort studies or case-control studies by including

genotype as the 'exposure' and assessing its role in

the aetiology of disease. There are also specific

study designs used for genetic studies: twin studies,

familial aggregation studies and pedigree studies.

Twin studies are generally very effective in providing
evidence for the 'genetic effect' in the aetiology
of disease. The sample of twin pairs is grouped
according to a two-way classification:
(i) as monozygotic ('identical') or dizygotic ('non-identical'),
and (ii) as concordant or discordant

with respect to the presence or absence of disease

(i.e. same or different disease status). Significantly greater concordance among the monozygotic pairs

is regarded as evidence for the genetic role in cau-

sation of the disease, because members of a

monozygotic pair share all of their genes, whereas

members of a dizygotic pair differ in respect of

some genes. The measure used to summarize the

results is the 'heritability' proportion, which

estimates the percentage of the total variance in

disease status attributable to genetic factors

(although an environmental component is usually

needed for the disease to become manifest). A

recent example of a twin study in ophthalmology

reports the heritability for cataract.


In familial aggregation studies, the variability of

the trait or disease (prevalent or incident) within fam-

ilies is compared to the variability between families. A

large between-family variance (relative to within-

family variance) indicates a correspondingly high

degree of familial aggregation. A US study (1997)

of diabetic retinopathy and nephropathy may serve
as an example of the method. The study involved

patients from the Diabetes Control and Compli-

cations Trial (DCCT) and their first-degree rela-

tives. The interclass correlation was computed from

'severity-of-retinopathy' scores for all family members.
Significant interclass correlation levels were
found for parent-offspring, mother-child and
father-child relationships, thus providing evidence

that the severity of diabetic retinopathy is influenced

by familial (possibly genetic) factors.

Pedigree studies involve investigation of dis-

ease or traits in large families of at least three gen-

erations, in order to classify members according to

their genotype and disease status, so that a specific

genetic mechanism may be identified in relation to

other factors in the development of disease, such as

diet. Well-designed studies of this type allow esti-

mation of heritability for the population and also

provide a description of the mode of inheritance.

The main difficulty is obtaining sufficient data

from each pedigree.


2.5.4 Limitations of observational studies

There is a general limitation with all these observa-

tional studies. They utilize experiments arising from

accidental circumstances. Assessments of outcome

are made in a sample of individuals whose exposure

status is determined by an interacting mixture of nat-

ural, political, economic, cultural, social and behav-

ioural forces. Observational studies can be conducted

in populations where these 'natural' forces have cre-

ated a marked (easily measurable) variation in the

level of exposure to the suspect factor(s). These influ-

ences, however, do not allocate exposure randomly.

In a 'natural' non-experimental setting, a person

exposed to the putative harmful agent of interest is

often also more likely to have a multitude of other

exposures that might enhance the risk of the disease

under study; for example, persons 'exposed' to alco-

hol abuse are more likely to be heavy smokers; and

children exposed to poor personal hygiene in a rural

trachoma-endemic community tend to be also more

exposed to conditions of malnutrition, overcrowding

and poverty, with all the associated complex of risk

factors. Clearly, in assessing the effect of any particular
exposure on risk of disease or disability, the influence
of all other pertinent extraneous factors should

be taken into account (adjusted for), as fully and as

simultaneously as possible. Such adjustment is com-

monly referred to as the control of 'confounding',


which is elaborated upon in Chapter 4. The problem

is made more challenging when many of the extra-

neous factors act in synergy with the study exposure,

and/or with one another, to influence the study out-

come in a complex way. Many of the recent advances

in epidemiology have been made through the development
and refinement of study designs and of statistical
models to resolve these complex problems,

which are often shared by experimental studies.

2.6 FUTURE DIRECTION OF OPHTHALMIC EPIDEMIOLOGY: CONCLUDING NOTES

In this book, many experimental and observational

studies are discussed in the context of the epidemi-

ology of the major eye disorders. The structure is

traditional insofar as various aetiological factors are

considered together, in relation to a particular dis-

ease. This disease-oriented approach is in keeping

with the process of research in ophthalmic epidemi-

ology, where most studies are initiated or led

by disease experts. Recent advances in molecular

biology and genetics open new, exciting avenues of research, including the highly challenging study of
gene-environment interactions in disease causation, where modern epidemiology and the related
advanced statistical methods have a central role. The approach in eye research, already
disease-oriented, becomes even more profoundly so as
genetic and environmental risk factors are considered
in relation to a specific disease. The tendency is
already apparent in the more recent studies of
cataract aetiology.


An alternative or complementary approach would

be to consider all disease or health outcomes of

exposure to a particular risk factor or protective fac-

tor. Such exposure-oriented research should lead to

a more holistic appreciation of the importance for

public health of the exposure factor. The approach

would require the working collaboration of disease

experts and exposure experts (e.g. nutritionists),

that is, epidemiologists with special interest in public
health aspects of a particular exposure or risk factor
complex. The methodology would focus on

major cohort follow-up studies and large-scale randomized
trials with long-term outcome assessment.

Potential candidates for this research approach may

include nutritional/dietary factors, tobacco use,

alcohol consumption and hormone replacement

therapy, with a wide spectrum of ocular and other

health indicators included as outcomes.

Earlier, mention was made of some work in con-

struction of epidemiological models for cataract and for onchocerciasis, and the utility of such mod-

els (see Section 2.2.1). As more information is har-

vested from future research, epidemiological

models could be developed to study the population dynamics of the major eye disorders (specifically

and collectively), in relation to their determinants and their management and control.

1.1 INTRODUCTION: THE SCOPE OF OPHTHALMIC EPIDEMIOLOGY

Why are ophthalmic conditions important?

Surveys consistently reveal that loss of sight ranks


second or third in people's perceptions after the

most feared conditions of cognitive loss or can-

cer. Although a good quality of life is

usually achieved by people without high quality

vision, normal vision nevertheless enables indi-

viduals to participate more fully in the myriad of

activities that make up daily living and opportu-

nities for recreation. Prevention of visual loss, or

restoration of vision loss, has become an important
priority for public health; it requires a multidisciplinary
approach engaging as stakeholders researchers, eye care workers, the designers and

managers of programmes of prevention and

funders.

Why is epidemiology important? Epidemiology,

in its broadest sense, is the study of the distribution

and determinants of disease. First, epidemiological

research contributes to the body of scientific

knowledge about diseases of the eye and provides

clues to their likely aetiological pathways, which

may not emerge from laboratory research or obser-

vation in the clinic.

Second, it contributes to the public health

approach to blindness prevention by identifying

the magnitude of the problem, and describing

the main causes and distribution of eye diseases

that result in impairment and disability. It also

identifies differences between populations that may be modifiable.

Third, epidemiological research also contri-

butes to blindness prevention through understand-


ing the determinants of disease in human

populations. Establishing an association between

exposure to a risk factor or causative agent, and the

presence or absence of disease, requires an analyti-

cal study design. This may be either a case-control

study (Chapter 4), or a longitudinal (cohort) study

(Chapter 5), which may, in turn, be retrospective

or prospective.

Finally, epidemiology evaluates preventive measures,
treatments or other interventions to reduce the impact of vision loss, by conducting clinical
trials that require randomization of populations or persons, together with standardized assessments of
outcomes (Chapter 7).

Thus the epidemiology of eye disease has made,

and continues to make, critical contributions to

both scientific knowledge and blindness preven-

tion. In this book, we review these contributions, and suggest new avenues for research that will fur-

ther enhance eye health. In this first chapter we begin by describing the research that documents the
magnitude of the problem of visual loss in the world, and its main causes.

1.2 THE MAGNITUDE OF GLOBAL BLINDNESS

1.2.1 Summary of global blindness

The most recent global estimates by the Global

Databank of the World Health Organization

Programme for the Prevention of Blindness

and Deafness were published in 2004, based on

the data available to 2002. The number of visu-

ally impaired people (best corrected vision

<6/18, 20/60, 0.33) was 161 million, of whom


37 million were blind (<3/60, 20/400, 0.05).

The history of the sequence of estimates by

the World Health Organization (WHO) of the

burden of global blindness is outlined in Section

1.5.1 below, and summarized in Fig. 1.1.

In the lists of figures for prevalence and

causes of blindness by country in Tables 1.7,

1.8 and 1.9, we have added those surveys

published since 2002 which meet the criteria

described in the text. In the past there have been

few firm data on the prevalence of blindness, and

even now the results of individual surveys must

be interpreted with care if extrapolated to a

region.

1.2.2 Summary of global causes

The most recent estimates from the WHO on the

causes of blindness need to be examined in two ways

because the proportions vary depending on the def-

inition of blindness used. When the estimates were

first published in 2004, the conventional definition

of best corrected vision was used. With this definition,
cataract was responsible for nearly half the

global blindness (47.8%); and glaucoma (12.3%)

became the second most important cause. However,

if presenting vision were to be considered (as in the

revised WHO definition described below), uncorrected
refractive errors would be the second commonest
cause (18%). The comparison between the

causes based on presenting and best corrected vision

is shown in Figs. 1.2 and 1.3.

1.3.4 Population-based sample surveys

The only secure sources of prevalence data are population-based prevalence surveys conducted
according to strict criteria. The principles and some practical aspects are described in Chapter 3.

While still being population-based, rapid assess-

ment techniques have also been improved over the last 15 years. Employed initially for assessment of
cataract blindness (Rapid Assessment of Cataract Surgical Services — RACSS), these methods are now
being increasingly used for gathering evidence of other causes of visual impairments, both at local and
national level (Rapid Assessment of Avoidable Blindness — RAAB).

Because data are now available from a large

number of surveys, it becomes important to iden-

tify those which can stand the test of scientific

rigour. In this chapter we have required some basic

criteria, as follows, before including the results in

Tables 1.7, 1.8 and 1.9.

1. The survey must have identified the population

to which it refers and selected a sample using

random sampling methods, so that the results can be generalized to the wider population in that area.

2. If it is to be generalizable, the survey should

include a representative population and not be

limited only to specific groups in, for example,

areas endemic for onchocerciasis, leprosy villages

or trachoma-endemic populations. This will

allow the findings to be extrapolated to a wider

population.
3. The sample must be large enough to provide confidence

intervals (95%) on the prevalence measures. If

blindness is relatively rare, the estimate will be

uninterpretable unless the sample size is large.

4. Clear-cut descriptions should be provided of

the sampling process, definitions used, enumer-

ation procedures adopted, clinical examination

protocol followed and methods of data analysis.
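
The third criterion can be illustrated by computing a 95% confidence interval for the same prevalence at different sample sizes. The sketch below uses the Wilson score interval, chosen here only as one reasonable method for small proportions; the 1% prevalence, the sample sizes and the function name `wilson_interval` are hypothetical and not drawn from any survey in the tables.

```python
import math

def wilson_interval(cases, n, z=1.96):
    """Wilson score confidence interval for a proportion (cases out of n)."""
    p = cases / n
    denom = 1 + z ** 2 / n
    centre = (p + z ** 2 / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z ** 2 / (4 * n ** 2)) / denom
    return centre - half_width, centre + half_width

# Hypothetical prevalence of blindness of 1% observed in samples of increasing size
for n in (200, 2000, 20000):
    cases = round(0.01 * n)
    lower, upper = wilson_interval(cases, n)
    print(f"n = {n:>5}: {cases}/{n}, 95% CI {lower:.2%} to {upper:.2%}")
```

The interval narrows markedly as the sample grows, which is why a rare outcome such as blindness needs a large sample before the estimate becomes interpretable.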

We have included the most recent data available for a country and have arbitrarily taken 1990 as the
cut-off year. Most earlier surveys were reported in the second edition of this book. With increased

access to services, older figures may no longer be valid for a particular country or region today. Only if
no studies since 1989 were available from a par-

ticular country have we retained the older data, so that the readers can understand at least the extent
of the problem in the past.

1.4 Definitions

1.4.1 Definition: magnitude

Magnitude is the size of the problem of a

condition or disease. The two measures used are

prevalence and incidence (which will be

described in more detail in Chapter 2 under

2.2.1). Prevalence is a static measure that provides a snapshot of the disease in the population at a
particular point in time, and includes all cases of disease regardless of duration.

Prevalence is a proportion and relates to a

defined population.

This crude prevalence measure can be made

more specific by giving the prevalence in a partic-

ular age group; for example, the number of peo-


ple with blindness aged 50-59 years as a

proportion of the total defined population aged

50-59 years. This is called the age-specific prevalence.
For meaningful comparison between studies
in different populations, or in the same

population at different times, prevalence can be adjusted to a reference population with a known age
and sex structure.
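
One common way to make this adjustment is direct standardization: each age-specific prevalence is weighted by the size of the corresponding age group in the reference population. The sketch below assumes that method and adjusts for age only; all figures and the function name `directly_standardized_prevalence` are hypothetical.

```python
def directly_standardized_prevalence(age_specific_prevalence, reference_population):
    """Weight each age-specific prevalence by the reference population in that age group."""
    total = sum(reference_population.values())
    weighted = sum(age_specific_prevalence[group] * reference_population[group]
                   for group in reference_population)
    return weighted / total

# Hypothetical age-specific prevalences of blindness from a survey
survey_prevalence = {"50-59": 0.01, "60-69": 0.03, "70+": 0.08}
# Hypothetical reference population counts for the same age groups
reference = {"50-59": 5000, "60-69": 3000, "70+": 2000}

adjusted = directly_standardized_prevalence(survey_prevalence, reference)
print(f"Age-standardized prevalence: {adjusted:.1%}")  # 3.0%
```

Two surveys standardized to the same reference population can then be compared without differences in age structure distorting the comparison.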

Incidence is a more dynamic measure of the

magnitude of the problem, and measures the
number of new cases of disease in the population
at risk (that is, the population that does not

already have the disease) over a defined period

of time. Incidence can be expressed in two main
ways: either as a proportion or as a rate (the
incidence rate, explained further in Chapter 2 under 2.2.2). Incidence expressed in the

simplest way, as a proportion, is known as

cumulative incidence (CI).

$$\mathrm{CI} = \frac{\text{number of new cases occurring in a given time}}{\text{total number of population at risk at the beginning of the period}}$$

1.4.2 Definition: blindness, low vision and visual impairment

Blindness and low vision are conventionally measured
and defined in terms of visual acuity (VA)

and of reduction of visual field. The equivalence

between the different notations used in different


countries for visual acuity is given in Table 1.1.

Each country has its own definition of blind-

ness for legal and social purposes. Because there

are wide variations in these requirements, and so

that global comparisons could be made, a WHO

study group in 1972 recommended a standard-

ized method of testing and a uniform definition

of blindness and visual impairment. This was
incorporated into the International Statistical Classification of Diseases (ICD).
Blindness was defined internationally as a VA of less than 3/60 (20/400, 0.05) in the better
eye with best possible correction, or a visual field loss in each eye to less than 10° from fixation.
This corresponded to categories of visual
impairment 3, 4 and 5 in ICD-10.

Low vision was defined as VA of less than 6/18

(20/60, 0.3) but equal to or better than 3/60 in

the better eye with best possible correction
(corresponding to visual impairment categories 1
and 2 in ICD-10). Category 1 was visual impairment
less than 6/18 to 6/60, and category 2 was
severe visual impairment less than 6/60 to 3/60.
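
The acuity thresholds above can be written as a small classification rule. The sketch below is an illustration only: it takes a best-corrected Snellen acuity for the better eye, ignores the visual field criterion, and does not subdivide blindness into categories 3, 4 and 5 because their limits are not given here. The function name `who_category` and the wording of the labels are assumptions.

```python
from fractions import Fraction

def who_category(numerator, denominator):
    """Classify a best-corrected Snellen acuity in the better eye, e.g. (6, 18) for 6/18."""
    va = Fraction(numerator, denominator)  # exact arithmetic avoids rounding at the thresholds
    if va < Fraction(3, 60):
        return "blindness"
    if va < Fraction(6, 60):
        return "low vision, category 2 (severe visual impairment)"
    if va < Fraction(6, 18):
        return "low vision, category 1 (visual impairment)"
    return "not visually impaired by this definition"

for snellen in [(6, 12), (6, 18), (6, 24), (6, 60), (3, 60), (1, 60)]:
    print(f"{snellen[0]}/{snellen[1]}: {who_category(*snellen)}")
```

Because the comparison uses fractions, the same function also accepts the 20-foot notation, for example `who_category(20, 400)` for 20/400.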
