BASICS IN
Foreword
Waris Qidwai
The views and opinions expressed in this book are solely those of the original contributor(s)/author(s)
and do not necessarily represent those of editor(s) of the book.
All rights reserved. No part of this publication may be reproduced, stored or transmitted in any form
or by any means, electronic, mechanical, photocopying, recording or otherwise, without the prior
permission in writing of the publishers.
All brand names and product names used in this book are trade names, service marks, trademarks
or registered trademarks of their respective owners. The publisher is not associated with any product
or vendor mentioned in this book.
Medical knowledge and practice change constantly. This book is designed to provide accurate,
authoritative information about the subject matter in question. However, readers are advised to
check the most current information available on procedures included and check information from the
manufacturer of each product to be administered, to verify the recommended dose, formula, method
and duration of administration, adverse effects and contraindications. It is the responsibility of the
practitioner to take all appropriate safety precautions. Neither the publisher nor the author(s)/editor(s)
assume any liability for any injury and/or damage to persons or property arising from or related to use
of material in this book.
This book is sold on the understanding that the publisher is not engaged in providing professional
medical services. If such advice or services are required, the services of a competent medical
professional should be sought.
Every effort has been made where necessary to contact holders of copyright to obtain permission to
reproduce copyright material. If any have been inadvertently overlooked, the publisher will be pleased
to make the necessary arrangements at the first opportunity.
Inquiries for bulk sales may be solicited at: jaypee@jaypeebrothers.com
Dedicated to
Foreword
Waris Qidwai
Preface
Basics in Epidemiology and Biostatistics introduces medical and dental
students, postgraduates, researchers and clinicians to the study of
statistics applied to medicine. We have drawn on our experience in medicine
and statistics to develop a comprehensive text covering the traditional
topics of biostatistics and epidemiology. Particular emphasis is given to
study design and the interpretation of the results of medical research.
For more than a decade, we have been giving lectures at various
undergraduate and postgraduate institutes. Students find these lectures
worthwhile for understanding the basic concepts of biostatistics and
epidemiology. We realized that by writing a book, we could reach a large
number of students and faculty members in remote areas who were otherwise
not accessible to us. We hope that anyone interested in research will find
the book helpful.
We have tried to explain all statistical concepts in simple terms. No special
background knowledge is required to understand the text. An effort has been
made to cover all the fundamental concepts and important terms in the book.
Simple Text
The book is written in a simple and easy-to-understand manner. The
information given in the book is relevant to the needs of junior and
early-stage researchers and is presented in a schematic pattern. This is
necessary because a learner must understand the prerequisite information
before moving on to the more advanced concepts in basic epidemiology
and biostatistics. All the information has therefore been presented in a
schematic and synchronized way so that the reader can grasp it easily.
Waqar H Kazmi
Farida Habib Khan
Acknowledgments
Contents

1. Introduction to Research
   What is Research?
   Types of Research
   Steps to Conduct Research
   Selection of Research Topic
   Scale for Rating Research Topics
   Resources of Literature Search

2. Study Designs
   Definition
   Types of Epidemiological Study Designs
   Descriptive Observational Studies
   Analytical or Comparative Studies
   Analytical Observational Studies
   Registries
   Interventional/Experimental Studies
   Blinding
   Consent Form
   Intent to Treat Analysis
   Quasi-experimental Studies
   Clinical Trials and their Phases
   Research Questions and Study Types
   Meta-analysis

3. Sampling Procedure
   Population
   Reasons for Sampling
   Sampling Techniques

5. Biostatistics: Basic

   Point Estimate
   Interval Estimate
   Hypothesis Testing
   Introduction to the Scale of Probability
   Test of Hypothesis
   Decision Errors

8. Measures of Association

   Introduction
   Bias
   Control of Bias
   Confounding
   Effect Modifiers

   Sample Size
   Sample Size for Single Proportion
   Sample Size for Single Group Mean
   Sample Size for Two Proportions
   Sample Size for Two Group Means
   Sample Size for Sensitivity and Specificity
   Suggested Websites for Sample Size Calculator

11. Screening

   Methodology
   Plan for Analysis of Results
   Title/Topic
   Introduction

Index
CHAPTER 1
Introduction to Research
WHAT IS RESEARCH?
TYPES OF RESEARCH
Qualitative Research
This type of research is context based. It is an inquiry whose goal
is to understand a social or human problem so as to build up a
complex and holistic picture of the phenomena of interest. The
researcher interprets the perspectives or information obtained
from the subjects.
Quantitative Research
Purpose
Identifying study
Collection of data
Analyzing data
availability of resources for implementing the recommendations.
The opinion of the relevant stakeholders (i.e. potential clients
and of the responsible staff) will influence the implementation of
recommendations as well.
Low (2)
Low (3)
CHAPTER 2
Study Designs
DEFINITION
Flow chart 2.1 Types of epidemiological study designs
Case Report
It is a report of a single case of disease, usually with an unexpected
presentation, which typically describes the findings, clinical course
and prognosis of the case. Writing a case report is like writing a
good clinical history of a patient that includes presenting features,
clinical signs, lab investigations, and the diagnosis reached after
excluding a list of differential diagnoses. A classical example of a
case report from history is that of a congenital anomaly affecting limbs
and digits, reported from Germany in late 1959 (the thalidomide tragedy).
The world had never heard of or seen such a congenital anomaly before.
These are the types of cases that should be presented as a case report.

Table 2.1: Baseline characteristics of patients with chronic kidney disease
(hypothetical table of a descriptive study design)
Patient characteristics          Mean ± SD or %
Age (years)
Male gender
Race
  Caucasians
  African-American
  Asians
  Others
Insurance
  Private
  HMO
  Medicare
  Medical aid
  None
Comorbidity index
  Zero
  One
  Two
  Three
Cause of CRI
  Diabetes mellitus
  Hypertension
  GN/PKD/IN
  Other
Laboratory values
Case Series
When several unusual cases with similar conditions are described
in a published report, it is called a case series. A case series does
not include a control group. After the first case report of the
thalidomide tragedy, a case series was published in 1961. Thalidomide
was used for nausea and vomiting in pregnancy in that era, so more
such malformed children were soon identified, forming the basis for
a case series.
It was then quite easy to identify thalidomide as the exposure,
because all the mothers with the outcome (malformed children)
had used this drug.
Cross-sectional Studies
Flow chart 2.2 Design of a cross-sectional study
Advantages
Easy to perform
Prevalence/frequency of the disease can be calculated
Inexpensive as compared to analytical studies
Useful for evaluating diagnostic procedures, e.g. comparing two
diagnostic or treatment modalities, or the usefulness of a new
diagnostic procedure
Disadvantages
The data about both the exposure to risk factors and the presence
or absence of disease are collected simultaneously, hence it is
difficult to determine temporal relationship of a presumed cause
and effect.
Nonresponse bias (in surveys): it is difficult to obtain sufficiently
large response rates, as some people are too busy or reluctant to
participate.
A hypothesis can be generated, but it is a weak hypothesis that
needs to be tested by conducting a further analytical study.
If so, what is the strength of association between the exposure/
risk factor and the outcome/disease under study?
To ascertain whether the association between the exposure and
the outcome is not due to chance. This is assessed with a test of
significance, the result of which is reported as the p-value.
Advantages
Disadvantages
Recall bias is the main problem, as cases are more likely than
controls to recall past exposure. Similarly, if the researcher is
working with geriatric patients, recall can be problematic in both
cases and controls, as the respondents might not have good memory
due to old age. For example, in a study looking at the association
between being a cigarette smoker for ten years and the development
of lung cancer, some participants may have difficulty recalling
whether or not they have smoked for ten years.
Selection bias is another problem if the cases and controls are not
properly selected. Here are two examples of selection bias, from
studies carried out at leading tertiary care centers by eminent
researchers of the time.
Study 1
In 1929, Raymond Pearl at Johns Hopkins, Baltimore conducted a
study to test the hypothesis that tuberculosis (TB) protected against
cancer. He selected 816 cases of cancer from 7500 consecutive
autopsies. He also selected 816 controls from the others on whom
autopsies had been carried out at Johns Hopkins. Of the 816 cases
(with cancer), 6.6% had TB. Of the 816 controls (without cancer), 16.3%
had TB. From the finding that the prevalence of TB was considerably
higher in the control group, Pearl concluded that TB was protective
against cancer. In fact, at the time of this study, TB was one of the
major reasons for hospitalization at Johns Hopkins Hospital. Pearl
thought that the control group's rate of TB would represent the level
of TB in the general population; but because of the way he selected
the controls, they came from a pool that was heavily weighted with
TB. He should have compared the patients with cancer to a group of
patients admitted for some specific diagnosis other than cancer. The
way the controls are selected is a major determinant of whether a
conclusion is valid or not.
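The size of the apparent protective association in Pearl's data can be illustrated with an odds ratio computed from the two reported percentages. This is only a sketch: the odds ratio is not calculated in the text above, and the function name is an assumption made for illustration.

```python
def odds_ratio(p_exposed_cases, p_exposed_controls):
    """Odds ratio from the proportion exposed among cases and among controls."""
    odds_cases = p_exposed_cases / (1 - p_exposed_cases)
    odds_controls = p_exposed_controls / (1 - p_exposed_controls)
    return odds_cases / odds_controls


# Proportions with TB reported in the text: 6.6% of cases, 16.3% of controls.
pearl_or = odds_ratio(0.066, 0.163)
print(f"Odds ratio for TB among cancer cases versus controls: {pearl_or:.2f}")
```

An odds ratio well below 1 looks protective, but as explained above it reflects a control pool heavily weighted with TB rather than a real effect.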
Study 2
Coffee-drinking and Cancer of the Pancreas in Women. The cases
(patients with cancer of the pancreas) were white cancer patients
from 11 Boston and Rhode Island hospitals. The controls were
recruited from the gastrointestinal clinics of the same hospitals.
MacMahon found that coffee consumption was greater in cases
than in controls. However, the controls were patients who had reduced
their coffee consumption on their physicians' advice, so the controls'
level of coffee consumption was not representative of the general
population. When a difference in exposure is observed between
cases and controls we must ask: is the level of exposure observed
in the controls really the level expected in the general population?
In both studies (1 and 2), the researchers drew erroneous conclusions
about the association between an exposure and an outcome because of
improper selection of controls.
Cohort Studies
Cohort means a group of people sharing the same attribute, e.g. all
those who are exposed to the use of tobacco as compared to those
not exposed to the use of tobacco.
In a cohort study design, the two groups are made on the basis of
exposure (i.e. smokers and nonsmokers). These groups are followed
for a specific period of time for the outcome of interest. This study
design is preferred if the researcher aims to determine the incidence
and the risk factors associated with the disease.
There are two types of cohort studies:
1. Prospective Cohort Study or Concurrent Cohort Study
2. Retrospective Cohort Study or Historical Cohort Study
of interest. The subjects are then followed into the future in order
to record the development of an outcome of interest. The follow-up
can be conducted by mail questionnaires, by phone interviews, via
the Internet, or in person with interviews, physical examinations,
and laboratory or imaging tests. For example, if a researcher
investigating the association between cigarette smoking for ten years
or more and lung cancer chooses a prospective cohort design, the
study would start in the year 2013 and end in 2023 (Flow chart 2.4).
The Framingham Heart Study is a good example of a large prospective
cohort study. It is an ongoing study to identify the risk factors
associated with heart disease.
Advantages
Multiple outcomes of a single exposure can be detected
Incidence rates are calculated
It helps in calculating the relative risk and the attributable risk
Temporal association is best studied in a prospective cohort study
It allows the assessment of a dose-response relationship
It helps to accept or to refute the hypothesis with a high degree of
validity
Complete control over the data.
Flow chart 2.4 Prospective cohort study
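As an illustration of how incidence, relative risk and attributable risk follow from cohort data, the sketch below uses hypothetical counts for smokers and nonsmokers. The counts and the function name are assumptions, and attributable risk is taken here as the difference in incidence between the exposed and unexposed groups.

```python
def cohort_measures(exposed_cases, exposed_total, unexposed_cases, unexposed_total):
    """Return incidence in each group, relative risk and attributable risk."""
    incidence_exposed = exposed_cases / exposed_total
    incidence_unexposed = unexposed_cases / unexposed_total
    relative_risk = incidence_exposed / incidence_unexposed
    # Attributable risk as the risk difference (an assumption of this sketch).
    attributable_risk = incidence_exposed - incidence_unexposed
    return incidence_exposed, incidence_unexposed, relative_risk, attributable_risk


# Hypothetical cohort: 30 of 1000 smokers and 5 of 1000 nonsmokers
# develop the outcome during follow-up.
inc_exp, inc_unexp, rr, ar = cohort_measures(30, 1000, 5, 1000)
print(f"Incidence (exposed):   {inc_exp:.3f}")
print(f"Incidence (unexposed): {inc_unexp:.3f}")
print(f"Relative risk:         {rr:.1f}")
print(f"Attributable risk:     {ar:.3f}")
```

With these made-up counts the exposed group has six times the risk of the unexposed group, and 25 extra cases per 1000 people are attributable to the exposure.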
Disadvantages
Expensive
Time consuming
Strict follow-up is required
Not suitable for diseases that have a long incubation period
Not suitable for rare diseases
Attrition (loss to follow-up) due to migration or death of the
respondents.
Advantages
Less expensive
Less time consuming
Follow-up data are obtained from records, so follow-up time is
saved
The other advantages of cohort studies also apply.
Disadvantages
REGISTRIES
In the developed world, researchers have collected data pertaining to
specific diseases in registries such as the United States Renal Data
System (USRDS) for patients with end-stage renal disease (ESRD). The
USRDS has data on all dialysis patients being dialyzed in any of the
50 states of the US. Any patient who initiates dialysis is immediately
registered in this database, and the entire follow-up, including
clinical characteristics, laboratory results and medicines, is then
recorded continuously for as long as the patient is alive, until death
or kidney transplantation. A researcher interested in the risk factors
associated with ESRD may wish to study patients who initiated dialysis
from 2001 to 2006. Data from this registry may be used to conduct a
retrospective cohort study.
Data from registries are ideal for retrospective cohort studies.
Clinicians of every specialty should be encouraged to conduct chart
audits to collect data retrospectively on diseases of interest to them.
Unfortunately, hospital records are not well maintained in low-resource
settings and, hence, it is difficult to create registries. In the
developed world, the majority of the studies done are retrospective
cohort studies using registries. We can follow in these footsteps by
improving our inpatient record systems.
INTERVENTIONAL/EXPERIMENTAL STUDIES
Here intervention or some action is involved such as deliberate
application of a drug in the experimental (study) group and
no intervention in the control group. Later, the outcome of the
experiment is compared in both the groups (Flow chart 2.6).
Thus it differs from the observational analytical study designs
in that here the experiment is directly under the control of the
investigator whereas in the observational analytical studies, the
investigator takes no action, just observes.
There are three key components of an experimental study design:
(1) a pre-post test design, (2) a treatment group and a control group,
and (3) random assignment of study participants.
A pre-post test design requires the collection of data on study
participants' level of performance before the intervention is given
(pre-), and the collection of the same data on similar participants
after the intervention is given (post-). This design is the best way to
ensure that the intervention had a causal effect.
Flow chart 2.6 Sketch of experimental study design
Characteristics                        Surgical therapy group   Medical therapy group   p-value
                                       (N = 1140)               (N = 1130)
Age (years)                            61.4 ± 10.0              61.7 ± 9.6              0.54
Sex, no. (%)                                                                            0.95
  Male                                 974 (85.4)               964 (85.3)
  Female                               165 (14.5)               165 (14.6)
Race, no. (%)                                                                           0.64
  White                                984 (86.3)               972 (86.0)
  Black                                55 (4.8)                 55 (4.9)
  Hispanic                             66 (5.8)                 56 (5.0)
  Others                               34 (3.0)                 46 (4.1)
Clinical
Angina (CCS class), no. (%)                                                             0.24
  [...]                                132 (11.6)               146 (12.9)
  [...]                                338 (29.6)               339 (30.0)
  II                                   407 (35.7)               423 (37.4)
  III                                  259 (22.7)               219 (19.4)
  Missing data                         3 (<1)                   2 (<1)
Duration of angina (months)
  Median                                                                                0.53
  [...]                                                                                 0.83
History, no. (%)
  Diabetes                             365 (32.0)               395 (35.0)              0.12
  Hypertension                         755 (66.2)               763 (67.5)              0.53
  [...]                                56 (4.9)                 51 (4.5)                0.59
  Cerebrovascular disease              99 (8.7)                 100 (8.8)               0.83
  Myocardial infarction                435 (38.2)               437 (38.7)              0.80
  Previous (PCI)*                      173 (15.2)               183 (16.2)              0.49
  [...]                                124 (10.9)               124 (11.0)              0.94
Stress test
  [...]                                968 (84.9)               974 (86.2)              0.84
  [...]                                552 (57.0)               550 (56.5)
  Duration of treadmill test (minutes) 6.9 ± 2.6                6.8 ± 2.2               0.43
  Pharmacologic stress, no. (%)        415 (42.9)               425 (43.6)
  Echocardiography, no. (%)            61 (5.4)                 52 (4.6)
  [...]                                683 (70.6)               705 (72.2)              0.59
  [...]                                152 (22.2)               159 (22.6)              0.09
  [...]                                441 (66.0)               481 (68.2)              0.09
BLINDING
Blinding represents an important, distinct aspect of randomized
controlled trials. The term blinding refers to keeping trial participants,
investigators or assessors (those collecting outcome data) unaware
of an assigned intervention. Blinding is of three types:
Single-blind
Here the participants do not know whether they are assigned to the
study group or the control group. That is, they do not know whether
they are getting the new drug under investigation or the old
conventional drug. Only the investigator knows who is getting which
drug. This type of trial helps to overcome subject variation.
Double-blind
Here neither the investigator (doctor) nor the participant (patient)
knows the group allocation and the treatment received. However, the
statistician knows it. The drug is coded before it is handed over to
the doctor. This is the type of trial usually used in practice.
Triple-blind
It goes one step further. All the participants, the doctor and the
statistician are unaware (blind) of the group allocation. Only the
principal investigator is aware of the group allocation and the
treatment allocation.
CONSENT FORM
Since these studies involve human subjects, there are always
ethical issues that cannot be overlooked. Approval from the Ethical
Review Board (ERB) is mandatory. Consent forms are always
required and are scrutinized in detail by the ERB.
QUASI-EXPERIMENTAL STUDIES
In a quasi-experimental study, one characteristic of a true
experiment is missing, either randomization or the use of a separate
control group. A quasi-experimental study, however, always
includes the manipulation of an independent variable, which is the
intervention.
One of the most common quasi-experimental designs uses two
(or more) groups, one of which serves as a control group. Both groups
are observed before as well as after the intervention, to test whether
the intervention has made any difference.
Preclinical Phase
The drug is developed and evaluated in cells and animals to assess
its potential effect on the human body.
Phase I Trial
These trials are conducted to determine the recommended dose, the
side effects and the manner in which the drug is processed by the
body. Here just 10 to 20 healthy volunteers are recruited.
Phase II Trial
These are controlled clinical studies conducted to evaluate the
effectiveness of the drug or treatment in a larger group of people
(100 to 300). These trials further evaluate its safety and determine
the common short-term side effects and risks.
Phase IV Trial
This includes post-marketing studies to gather additional
information, including the drug's risks, benefits, optimal use and
long-term side effects.
Post-marketing Surveillance
These involve observational studies such as case reports, cohort
studies or case-control studies. Their purpose is to assess drug safety
under the conditions of use in general practice, as opposed to the
conditions under which the drug was tested in phase III trials.
META-ANALYSIS
A meta-analysis is a particular type of systematic review that
focuses on the numerical results. The main aim of meta-analysis
is to combine the results from individual studies to produce, if
appropriate, an estimate of the overall or average effect of interest
(e.g., the relative risk). The direction and magnitude of this average
effect, together with a consideration of the associated confidence
interval and hypothesis test result, can be used to make decisions
about the therapy under investigation and the management of
patients.
Figure 2.2 shows a meta-analysis comparing two interventions for a
certain outcome. Studies A [RR = 0.65 (CI = 0.1 to 0.7); p-value = 0.01]
and E [RR = 0.7 (CI = 0.1 to 0.4); p-value = 0.0001] show that group A
is better, while study H [RR = 1.5 (CI = 1.2 to 2.0); p-value = 0.001]
shows that group B is better. The overall effect size is not
significant [RR = 0.75 (95% CI = 0.3 to 1.1); p-value = 0.32].
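To illustrate how individual study results can be combined into an overall effect, the sketch below pools relative risks with fixed-effect, inverse-variance weighting on the log scale. This is one common pooling approach and is an assumption here, since the text does not say which method produced the overall estimate. The three studies and the function name are hypothetical.

```python
import math


def pool_relative_risks(studies, z=1.96):
    """Fixed-effect inverse-variance pooling of relative risks.

    studies: list of (rr, ci_low, ci_high) tuples with 95% confidence limits.
    Returns the pooled RR with its 95% confidence limits.
    """
    weights = []
    weighted_logs = []
    for rr, ci_low, ci_high in studies:
        log_rr = math.log(rr)
        # Standard error recovered from the width of the CI on the log scale.
        se = (math.log(ci_high) - math.log(ci_low)) / (2 * z)
        weight = 1 / se ** 2
        weights.append(weight)
        weighted_logs.append(weight * log_rr)
    pooled_log = sum(weighted_logs) / sum(weights)
    pooled_se = math.sqrt(1 / sum(weights))
    return (
        math.exp(pooled_log),
        math.exp(pooled_log - z * pooled_se),
        math.exp(pooled_log + z * pooled_se),
    )


# Hypothetical studies: (RR, lower 95% limit, upper 95% limit).
hypothetical_studies = [(0.80, 0.60, 1.07), (0.70, 0.50, 0.98), (1.10, 0.80, 1.51)]
pooled_rr, pooled_low, pooled_high = pool_relative_risks(hypothetical_studies)
print(f"Pooled RR = {pooled_rr:.2f} (95% CI {pooled_low:.2f} to {pooled_high:.2f})")
```

Studies with narrower confidence intervals receive more weight, and a pooled interval that includes 1 indicates no significant overall effect.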
Statistical Approach
We decide on the effect of interest and, if the raw data are available,
evaluate it for each study. However, in practice, we may have to
extract these effects from published results. For example, if the
outcome in a clinical trial comparing two treatments is numerical,
the effect may be the difference in treatment means. A zero difference
implies no treatment effect. Similarly, if the outcome is binary (e.g.
died/survived), we consider the risks of the outcome (e.g. death) in
the treatment groups. The effect may be the difference in the risks
or their ratio, the RR. If the difference in risks equals zero or RR = 1,
then there is no treatment effect.

State of knowledge of the problem / Type of study

Knowing that a problem exists but knowing little about its
characteristics or possible causes
    Exploratory studies, or descriptive studies:
    Descriptive case studies
    Cross-sectional studies

Suspecting that certain factors contribute to the problem
    Analytical (comparative) studies:
    Cross-sectional comparative studies
    Case-control studies
    Cohort studies

Having established that certain factors are associated with the
problem: desiring to establish the extent to which a particular factor
causes or contributes to the problem
    Cohort studies

Having sufficient knowledge about cause(s) to develop and assess an
intervention that would prevent, control or solve the problem
    Experimental or quasi-experimental studies
CHAPTER 3
Sampling Procedure
POPULATION
SAMPLING TECHNIQUES
Broadly, there are two types of sampling techniques (Table 3.1):
1. Probability sampling techniques.
2. Nonprobability sampling techniques.
In a probability sampling technique, each participant in a study
population has an equal (or at least a known) chance of being
selected. The method protects the research from bias and ensures
Probability sampling
3. Cluster sampling
Nonprobability sampling
1. Consecutive sampling
2. Convenience sampling
3. Purposive sampling
4. Quota sampling
5. Snowball sampling
For example, suppose 100 participants are available for recruitment
into a study, and 25 of them have to be selected (the sample size). The
participants to be recruited will be selected randomly by drawing
chits bearing the names/ID numbers of the 100 individuals. At each
draw, every individual remaining in the sampling frame has an equal
probability of being selected (i.e. when the first participant is to
be selected the probability is 1/100 for all participants, for the
second participant it is 1/99 for all remaining participants, for the
third it is 1/98, and so on). Thus each participant has an equal
probability of being selected for the study.
The recommended way to select a simple random sample is to
use a table of random numbers or a computer-generated list of
random numbers. For this approach each participant should have
an identification number (ID), and the list of ID numbers is called
a sampling frame.
The steps of simple random sampling are as follows:
1. Prepare the sampling frame (assign a number to each element) of
   the whole population [participants are numbered from 1 to 100].
2. Determine the sample size [estimated sample size is 25].
3. Randomly select the elements [any 25 numbers are picked from
   1 to 100].
OR
If using a computer-generated list to randomly select the
participants:
1. Enter the lowest ID number (in this case 001).
2. Enter the highest ID number (in this case 100).
3. Enter the estimated sample size as 25.
4. The computer randomization software will generate a
   table of randomly selected participants/ID numbers (Fig. 3.3).
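The computer-generated selection described above can be sketched with Python's standard `random` module. The function name and the fixed seed are assumptions made for illustration; any randomization software that draws without replacement would serve the same purpose.

```python
import random


def simple_random_sample(lowest_id, highest_id, sample_size, seed=None):
    """Draw a simple random sample of ID numbers without replacement."""
    rng = random.Random(seed)  # a seed makes the draw reproducible
    sampling_frame = list(range(lowest_id, highest_id + 1))
    return sorted(rng.sample(sampling_frame, sample_size))


# Sampling frame of IDs 001 to 100, estimated sample size 25.
selected_ids = simple_random_sample(1, 100, 25, seed=2013)
print([f"{participant_id:03d}" for participant_id in selected_ids])
```

Because the draw is made without replacement, no participant can be selected twice.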
[Fig. 3.3: grid of participant ID numbers 001 to 100]
was 25, hence 100/25 would be 4 and so every 4th X-ray should be
selected.
The first element is selected randomly from the 1st to the kth element
(i.e. in the above example from 1 to 4). Then every kth element is
selected until the researcher achieves the required sample size. For
example, in Figure 3.5 the second individual in the study population
is selected at random and then every fourth individual is selected
(i.e. 6th, 10th, 14th, etc.).
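The selection of every kth element can be sketched as follows. The function name is hypothetical, and the sketch assumes the population size is a whole multiple of the sample size, as in the example of 100 and 25.

```python
import random


def systematic_sample(population_size, sample_size, start=None, seed=None):
    """Select every kth element after a random start between 1 and k."""
    k = population_size // sample_size  # sampling interval, 100 // 25 = 4
    if start is None:
        start = random.Random(seed).randint(1, k)
    return [start + i * k for i in range(sample_size)]


# Reproduce the example in the text: the start happens to be the 2nd individual.
example_ids = systematic_sample(100, 25, start=2)
print(example_ids[:4])
```

With a start of 2 and an interval of 4, the selection begins with the 2nd, 6th, 10th and 14th individuals, matching the example above.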
[Figure 3.5: grid of participant ID numbers 001 to 100]
Cluster Sampling
In the cluster sampling technique, a subgroup of the population is
used as the sampling unit instead of individuals. It is a probability
sampling technique, employed when the researcher aims to select
participants from a large geographical area, e.g. a country, province,
state or city (Flow chart 3.1). Suppose the city of Karachi consists of
18 towns and each town consists of 10 union councils. Initially, 5 towns are
Flow chart 3.1 Cluster random sampling technique
Convenience Sampling
Convenience sampling is presumed to be the most commonly used
technique in clinical research. It involves the selection of subjects
that are conveniently accessible to the researcher. Suppose, a
Purposive Sampling
Snowball Sampling
The snowball sampling method is employed when study participants
are difficult to identify, access or locate. The method is commonly
employed to recruit participants from hard-to-reach groups (e.g. sex
workers, IV drug users, etc.). The sample is built through chain
referrals. Suppose you are investigating the knowledge about
Flow chart 3.2 Snowball sampling technique
Quota Sampling
Quota sampling is a nonprobability sampling method that
ensures that a certain number of study participants from different
subgroups constitute the sample, so that all these characteristics are
represented. Suppose you aim to assess the quality of life among
dialysis patients but you think that socioeconomic status has a
strong effect on quality of life in these patients. You therefore decide
to include 25% of respondents from each socioeconomic group (i.e.
upper, middle, lower middle and lower). If the estimated sample
size is 200, each socioeconomic group will include 50 participants.
Thus the population is first divided into different strata and then
any nonprobability sampling technique is applied to select
participants within each stratum.
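The quota arithmetic in this example can be written out as a short sketch. The function name is an assumption; the group names, shares and sample size are those given in the text.

```python
def quota_allocation(sample_size, shares):
    """Number of participants to recruit from each subgroup."""
    return {group: round(sample_size * share) for group, share in shares.items()}


# 25% of a sample of 200 from each socioeconomic group.
shares = {"upper": 0.25, "middle": 0.25, "lower middle": 0.25, "lower": 0.25}
quotas = quota_allocation(200, shares)
for group, quota in quotas.items():
    print(f"{group}: {quota} participants")
```

Recruitment within each group then continues, by any nonprobability technique, until that group's quota is filled.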
CHAPTER
Type of Variables
Dependent and Independent Variables
Because health systems research often looks for causal explanations,
it is important to distinguish between dependent and independent
variables.
The variable that is used to describe or measure the problem
under study is called the dependent variable. It represents the
output or effect, or is tested to see if it is the effect. A dependent
variable is also known as a response variable, outcome variable
or output variable.
The variables that are used to describe or explain the differences
in the dependent variable, or to cause changes in it, are called the
independent (exposure) variables. They represent the inputs or
causes, or are tested to see if they are the cause. An independent
variable is also known as a predictor variable, explanatory variable
or exposure variable.
For example, in a study of the relationship between smoking and
lung cancer, suffering from lung cancer (with the values yes or no)
would be the dependent variable and smoking (varying from not
Types of Data
* Mutually exclusive means both events cannot occur at the same time (i.e. tossing a
coin will result in either head or tail).
Categories
Male, female
Nominal data: In nominal data, the variables are divided into more
than two mutually exclusive categories. These categories however,
cannot be ordered one above another (as they are not greater or less
than each other).
Example: Nominal data
                     Categories
Marital status       Single, married, widowed, separated and divorced
Employment status    Unemployed, self-employed, public employee and Govt. employee
Ordinal data: In ordinal data, the variables are also divided into more
than two mutually exclusive categories, but they can be ordered one
above another, from lowest to highest or vice versa.
Example: Ordinal data
                           Categories
Level of knowledge         Good, average, poor
Level of blood pressure    High, moderate, low
child is equal with respect to providing one counting unit. There are
no intermediate values between each number.
A continuous variable is one in which there are no gaps in the values
of the variable: there are an unlimited number of possible values
between any two adjacent values on the scale. Thus, if the variable
is height measured in inches, then 4 and 5 inches are two adjacent
values of the variable. However, there can be an infinite number of
intermediate values, such as 4.5 and 4.7 inches. Variables such
as these are known as continuous variables (the values can
occur in fractions or decimals).
Frequency Tables
             Frequency    Relative     Cumulative
             (n = 100)    frequency    relative frequency
Below 100    15           0.15         0.15
100–120      25           0.25         0.40
121–140      20           0.20         0.60
141–160      30           0.30         0.90
Above 160    10           0.10         1.00
Total        100          1.00
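The relative and cumulative relative frequencies in the table above can be reproduced with a few lines of Python. This is a minimal sketch: the function name `frequency_table` is illustrative, and the class labels are simply copied from the table.

```python
# Class labels and counts copied from the frequency table above.
classes = ["Below 100", "100-120", "121-140", "141-160", "Above 160"]
frequencies = [15, 25, 20, 30, 10]


def frequency_table(counts):
    """Return (relative, cumulative relative) frequencies rounded to 2 places."""
    total = sum(counts)
    relative = [round(c / total, 2) for c in counts]
    cumulative = []
    running = 0
    for c in counts:
        running += c  # running count up to and including this class
        cumulative.append(round(running / total, 2))
    return relative, cumulative


relative, cumulative = frequency_table(frequencies)
for label, f, r, c in zip(classes, frequencies, relative, cumulative):
    print(f"{label:<10} {f:>4} {r:>6.2f} {c:>6.2f}")
```

Each relative frequency is the class count divided by n, and the cumulative column is the running total divided by n, so the last cumulative value is always 1.00.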
             Frequency    Percentage
Vomiting     30           30.0%
Fever        25           25.0%
Dyspepsia    20           20.0%
Nausea       15           15.0%
Headache     10           10.0%
Total        100          100.0%
Graphs
Another way to summarize and display data is through the use of
graphs or pictorial representations, so that the data are easier to
interpret. Graphs should be designed so that they convey at a single
glance the general patterns in a set of data.
Types of Graphs
Bar charts
Pie charts
Histograms
Line graphs
Scatter plots
Bar Charts
Bar charts are used for binary, nominal and ordinal data (categorical)
and consist of nonadjacent bars. The bars can be vertical or
horizontal.
Pie Charts
Pie charts can also be used to display binary, nominal and ordinal
data (categorical). A pie chart consists of a circular region partitioned
into sections, with each section representing a part or a percentage
of the whole.
Example: Data regarding knowledge of research ethics were
collected from 150 postgraduate trainees. The survey
showed that 60 (40%) of the respondents were male and 90 (60%)
were female. The data are represented in Figure 4.2.
Histograms
A histogram depicts a frequency distribution for quantitative data. It
consists of a series of adjacent bars (Fig. 4.3).
Histograms are constructed to represent continuous or
quantitative data. Ideally, every quantitative variable should be
normally distributed (bell-shaped curve).
Line Graphs
A line graph (also called time series plot) is appropriate for
representing data that vary continuously. It shows a trend of variable
over time. To construct a time series plot, time is placed on a
horizontal axis and the variable being measured on a vertical axis,
with points being connected using line segments (Fig. 4.4).
Example: The population statistics of the US for the years 1860–1950
are as in Table 4.3:
Year    Population (in millions)
1860    31.4
1870    39.8
1880    50.2
1890    62.9
1900    76.0
1910    92.0
1920    105.7
1930    122.8
1940    131.7
1950    151.1
Scatter Plots
A scatter plot represents the relationship between two continuous
variables.
Example: Suppose, a researcher wishes to identify whether studying
for longer hours will lead to better scores. A collection of data is given
in Table 4.4.
Based on the data in Table 4.4, a scatter plot has been constructed as
shown in Figure 4.5. (Note: When constructing a scatter plot, do not
connect the dots).
Table 4.4: Data on studying hours and corresponding scores
Participant No.    Study hours    Score
1.                 3              80
2.                 5              90
3.                 2              75
4.                 6              80
5.                 7              90
6.                 1              50
7.                 2              65
8.                 7              85
9.                 1              40
10.                7              100
Figure 4.5 Scatter plot of students test scores and hours of study
CHAPTER
Biostatistics: Basic
The mean weight would equal (110 + 110 + 140 + 150 + 160)/5 =
670/5 = 134 pounds.
The median value would be 140 pounds; since 140 pounds is the
middle weight.
The most frequent value is 110 (it occurs twice), so the mode of the
data set is 110 pounds.
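As a quick check of the three measures of central tendency for this data set, the following sketch uses Python's standard `statistics` module. The variable names are illustrative; the five weights are the ones used in the example.

```python
import statistics

# The five weights (in pounds) from the example above.
weights = [110, 110, 140, 150, 160]

mean_weight = statistics.mean(weights)      # sum of values / number of values
median_weight = statistics.median(weights)  # middle value of the sorted data
mode_weight = statistics.mode(weights)      # most frequently occurring value

print(mean_weight, median_weight, mode_weight)
```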
MEASURES OF VARIATION
For instance, the women's weights in the above example were 110,
110, 140, 150 and 160 pounds, and the mean weight was 134 pounds.
Variance ($S^2$) = $\sum (x_i - \bar{x})^2 / (n - 1)$
Where $x_i$ = Individual sample observation
$\bar{x}$ = Sample mean
$n$ = Total sample size
$\sum$ = sum of the squared differences between each individual sample
observation and the sample mean
Example:
$S^2 = [(110-134)^2 + (110-134)^2 + (140-134)^2 + (150-134)^2 + (160-134)^2]/(5-1)$
$S^2 = [(-24)^2 + (-24)^2 + (6)^2 + (16)^2 + (26)^2]/(5-1)$
$S^2 = [576 + 576 + 36 + 256 + 676]/4 = 2120/4$
$S^2 = 530$
Standard deviation is the square root of the variance ($S^2$). It is a
measure that describes how much individual measurements differ,
on average, from the mean.
$SD = \sqrt{S^2}$
$SD = \sqrt{530} = 23.02$
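The same arithmetic can be checked in Python. This sketch computes the sample variance once by hand (dividing by n - 1) and once with the standard `statistics` module; the variable names are illustrative.

```python
import math
import statistics

weights = [110, 110, 140, 150, 160]
n = len(weights)
mean_weight = sum(weights) / n

# Sample variance by hand: sum of squared deviations divided by (n - 1).
squared_deviations = [(x - mean_weight) ** 2 for x in weights]
variance_by_hand = sum(squared_deviations) / (n - 1)
sd_by_hand = math.sqrt(variance_by_hand)

# The statistics module uses the same (n - 1) denominator.
variance_lib = statistics.variance(weights)
sd_lib = statistics.stdev(weights)

value_range = max(weights) - min(weights)

print(variance_by_hand, round(sd_by_hand, 2), value_range)
```

Note that `statistics.pvariance` and `statistics.pstdev` divide by n instead and would give smaller values; they describe a whole population rather than a sample.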
The same results can easily be obtained by SPSS (statistical
package).
Below is the SPSS output showing central tendency and variation
of above data set.
N (Number of observations)
Mean                  134
Median                140.00
Mode                  110.00
Standard deviation    23.02
Variance              530.00
Range                 50.00
NORMAL DISTRIBUTION
Figures 5.1A and B Proportion of cases under portion of the normal curve
CHAPTER
Estimation and
Hypothesis Testing
POINT ESTIMATE
INTERVAL ESTIMATE
HYPOTHESIS TESTING
What is a Hypothesis?
Hypothesis is a testable theory. Hypothesis testing is the method
of testing whether claims or hypothesis regarding a population are
TEST OF HYPOTHESIS
Directional
Nondirectional
A directional hypothesis is one in which the researcher is able to
explicitly state the direction of the relation between the populations.
DECISION ERRORS
Truth in the    Decision
Population      Retain the null hypothesis    Reject the null hypothesis
True            Correct (1 - α)               Type I error (α)
False           Type II error (β)             Correct (1 - β), Power
                                             95% CI        p-value
                                     1.03    1.02, 1.04    <0.0001
                                     1.12    0.97, 1.31    0.14
                                     1.51    1.27, 1.78    <0.0001
                                     1.04    1.03, 1.05    <0.0001
                                     1.11    1.03, 1.18    0.02
                                     1.12    1.04, 1.19    0.01
Ambulate independently (ref = no)    0.48    0.39, 0.58    <0.0001
LR (ref = ER)                        1.66    1.30, 2.07    <0.0001
The six steps for solving hypothesis-testing problems are as
follows:
1. State the hypothesis and identify the claim
2. Choose a significance level α
3. Find the critical value(s)
4. Compute the test value
5. Make the decision to reject or not to reject the null hypothesis
6. State the appropriate conclusion.
Solution
$z = \frac{7 - 12}{2} = -5/2 = -2.5$
Interpretation
The z-scores that statisticians use as the 2 standard deviation cut-off
points are -1.96 and +1.96. Any z-score less than -1.96 and/or greater
than +1.96 will fall in the region of rejection. In the above study, -2.5
is smaller than -1.96, so the CKD sample mean falls within the region
of rejection of the population mean (Fig. 6.4).
Hence, we reject the null hypothesis.
Figure 6.3 Critical regions (the two tails) for rejecting the null hypothesis
(α = 0.025)
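The decision rule above can be sketched in Python. The values 7, 12 and 2 are taken from the solution as printed; the text that labels them did not survive, so treating them as sample mean, population mean and standard error is an assumption. The function name `two_tailed_decision` is illustrative.

```python
from statistics import NormalDist


def two_tailed_decision(z, alpha=0.05):
    """Return (critical value, reject?) for a two-tailed z-test."""
    # Upper critical value that leaves alpha/2 in each tail of the curve.
    critical = NormalDist().inv_cdf(1 - alpha / 2)
    return critical, abs(z) > critical


# Assumed roles: (sample mean - population mean) / standard error.
z = (7 - 12) / 2

critical, reject = two_tailed_decision(z)
print(f"z = {z}, critical value = +/-{critical:.2f}, reject H0: {reject}")
```

With alpha = 0.05 each tail holds 0.025, which is where the cut-off points of -1.96 and +1.96 come from.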
CHAPTER
= 150/50
Female : Male = 3:1
Proportion
It is a type of ratio in which those who are included in the numerator
must also be included in the denominator.
Proportion = 150/1500 = 1/10
Rate
Important Point
Prevalence
Prevalence =
Point Prevalence
Point prevalence =
Period Prevalence
Period prevalence is the total number of cases (diseased) at any
point during a specified period of time divided by the population at
risk midway through the period.
Incidence
Incidence quantifies the number of new events or cases of disease
that develop in a population of individuals at risk during a specified
time interval.
Decreased by
In-migration of cases
Out-migration of cases
Follow-up period: Jan 2008 to Jan 2012
Years at risk for the five persons: 5 years, 4.5 years, 2.5 years,
3.5 years and 2.5 years
Total: 18 years
Morbidity Rate
It is the incidence rate of nonfatal cases in the total population at risk
during a specified period of time. For example, the morbidity rate
of tuberculosis (TB) in the US in 1982 can be calculated by dividing
the number of nonfatal cases newly reported during that year by the
total US mid-year population.
Mortality Rate
It expresses the incidence of deaths in a particular population during
a period of time. It is calculated by dividing the number of fatalities
during that period by the total population. This can be further
divided into cause specific mortality rate, age specific mortality rate
or sex specific mortality rate, etc.
CHAPTER
Measures of Association
               Degree of association
1.0            Perfect
0.7 to 1.0     Strong
0.4 to 0.7     Moderate
0.2 to 0.4     Weak
0.01 to 0.2    Negligible
0.0            No association
10
Machine owned
(in months)
10
12
Hours exercised
10
If you display these data pairs as points in a scatter plot (Fig. 8.1),
you can see a definite trend. The points appear to form a line
that slants from the upper left to the lower right. As you move along
that line from left to right, the values on the vertical axis (hours of
exercise) decrease, while the values on the horizontal axis (months
owned) increase. Another way to express this is to say that the two
variables are inversely related: the more months the machine was
owned, the less the person tends to exercise. Thus, there is
a correlation between these two continuous variables, and the
correlation is negative.
Example 2: Now consider the data table below which contains
measurements on two continuous variables for ten people; the
number of months the person has owned an exercise machine and
their cardiovascular fitness (measured on a scale from 1 to 12, higher
scores showing better cardiovascular fitness).
Person
10
Machine owned
(in months)
10
12
Cardiovascular
fitness (score from
1 to 12)
11
Figure 8.2 Scatter plot of two continuous variables (Months exercise machine
owned and cardiovascular fitness score) showing positive correlation
10
Machine owned
(in months)
10
12
Height (meters)
1.3
1.8
1.5
1.9
1.3
1.9 1.4
1.8
1.5
the closeness of the data points to the perfect line. Figure 8.5 shows
a stronger correlation than Figure 8.4.
Table 8.2: SPSS output (labeled as coefficient) for linear regression
Coefficients
Model                          Unstandardized coefficients    Standardized coefficients    t         Significance    95% confidence interval for B
                               B         Std. error           Beta                                                   Lower bound    Upper bound
1 (Constant)                   19.932    2.07                                              9.61      0               15.801         24.064
  dialysis duration in months  -0.061    0.023                -0.300                       -2.728    0.008           -0.106         -0.017
where
Y is the predicted value of the dependent variable
a is the intercept (in this case, it is 19.932)
b is the slope or the gradient of the regression line (in this case, it
is -0.061)
X is the independent or explanatory variable
$Y = 19.932 - 0.061X$
$Y = 19.932 - 0.061(16)$
$Y = 19.932 - 0.976$
$Y = 18.956$

$Y = 19.932 - 0.061X$
$Y = 19.932 - 0.061(17)$
$Y = 19.932 - 1.037$
$Y = 18.895$
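The two worked predictions can be reproduced with a short function. This is a minimal sketch that hard-codes the intercept and slope quoted in the text; the names `INTERCEPT`, `SLOPE` and `predict` are illustrative.

```python
# Coefficients quoted in the text for the fitted regression line.
INTERCEPT = 19.932   # a
SLOPE = -0.061       # b, change in Y per additional month on dialysis


def predict(x):
    """Predicted value of the dependent variable for a given X (months)."""
    return INTERCEPT + SLOPE * x


for months in (16, 17):
    print(months, round(predict(months), 3))
```

The difference between the two predictions, 18.956 - 18.895 = 0.061, is the slope: each extra month changes the predicted value by b.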
Relative risk
The relative risk (or risk ratio) is defined as the ratio of the incidence of
disease in the exposed group divided by the corresponding incidence
of disease in the nonexposed group. Relative risk can be calculated in
cohort studies such as the Framingham Heart Study where subjects
              Diseased    Nondiseased    Total
Exposed       a           b              a + b
Nonexposed    c           d              c + d

Relative risk (RR) = $\frac{a/(a+b)}{c/(c+d)}$
Risk factor    Disease status                Total
               CHD present    CHD absent
Smoker         112 (a)        176 (b)        288 (a + b)
Nonsmoker      88 (c)         224 (d)        312 (c + d)
Interpretation of RR
As compared to nonsmokers, the smokers have a 1.38 times greater
risk of developing CHD.
Alternative explanation: Compared to nonsmokers, the smokers
have a 38 percent greater risk of developing CHD.
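The relative risk of 1.38 quoted in the interpretation follows from the smoker/nonsmoker table. A minimal sketch of the calculation, with an illustrative function name and the cell letters a, b, c, d used as in the table:

```python
def relative_risk(a, b, c, d):
    """RR = incidence in exposed / incidence in nonexposed.

    a, b: exposed with and without disease
    c, d: nonexposed with and without disease
    """
    incidence_exposed = a / (a + b)
    incidence_nonexposed = c / (c + d)
    return incidence_exposed / incidence_nonexposed


# Counts from the CHD table: smokers 112/176, nonsmokers 88/224.
rr = relative_risk(112, 176, 88, 224)
print(round(rr, 2))                 # relative risk
print(round((rr - 1) * 100))        # percent excess risk in smokers
```

The incidence is 112/288 (about 38.9%) in smokers and 88/312 (about 28.2%) in nonsmokers, and their ratio gives the RR.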
Odds Ratio
In research the word risk is used for the development of a
disease or outcome, e.g. the risk of developing CHD. In a case-control
study, the cases and controls are defined on the basis of the
outcome/disease, i.e. those who have CHD are the cases, and those
who do not have CHD are the controls. Since the study starts with the
disease/outcome, researchers want to use a different word
when looking at the prevalence of the exposure in those who had the
disease versus those who did not have the disease. The researchers
prefer to use the word odds for an exposure rather than risk.
Odds ratio = $\frac{\text{Odds of exposure in the cases}}{\text{Odds of exposure in the controls}}$
              Case    Control    Total
Exposed       a       b          a + b
Nonexposed    c       d          c + d

Odds ratio = $\frac{a/c}{b/d}$
                                      Breast cancer          Total
                                      Yes        No
Exposed (oral contraceptive users)    140 (a)    370 (b)     510
Nonexposed                            40 (c)     234 (d)     274

Odds ratio = $\frac{a/c}{b/d} = \frac{140/40}{370/234} \approx 2.2$
Interpretation of OR
Compared to the controls (those who did not have Ca breast), the
odds of being an oral contraceptive user were 2.2 times greater in those
who had Ca breast (cases).
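The odds ratio of 2.2 can be verified directly from the breast cancer table. A minimal sketch, with an illustrative function name and the same a, b, c, d cell layout as the table:

```python
def odds_ratio(a, b, c, d):
    """OR = (odds of exposure in cases) / (odds of exposure in controls).

    a, c: exposed and nonexposed cases
    b, d: exposed and nonexposed controls
    """
    odds_cases = a / c
    odds_controls = b / d
    return odds_cases / odds_controls


# Counts from the oral contraceptive / breast cancer table.
or_value = odds_ratio(140, 370, 40, 234)
print(round(or_value, 1))
```

Algebraically (a/c)/(b/d) equals the cross-product ratio ad/bc, which is often the quicker form to compute by hand.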
CHAPTER
Factors Affecting
Study Outcomes
INTRODUCTION
Results of an epidemiological study may reflect the true effect of an
exposure(s) on the development of the outcome under investigation,
but it must always be considered that the results may in fact be due to an
alternative explanation. Such alternative explanations may be on
account of the effects of chance (random error), bias or confounding,
which may produce spurious results, leading the researcher to
believe in the existence of a valid statistical association when one does
not exist, or alternatively in the absence of an association when one is
truly present.
Observational studies are more susceptible to the effects of chance,
bias and confounding, so appropriate steps must be taken at both
the design and analysis stages so that their effects can be minimized.
BIAS
Any systematic error that results in an incorrect estimate of the
association between an exposure and the disease/outcome is
called a bias. It is usually introduced by the researcher due to
nonstandardized measuring techniques.
Types of Bias
More than 50 types of bias are identified in epidemiological studies,
but for simplicity, they are broadly grouped into two categories:
1. Selection bias
2. Information bias
Selection Bias
CONTROL OF BIAS
Control of bias is mostly done at the design phase of the study.
Following are some means to ensure the same.
CONFOUNDING
The concept of confounding is a central one in the interpretation
of any epidemiological study. It can be thought of as a mixing of the
effect of the exposure under study on the outcome with that of an
extraneous factor, the confounder. To be deemed a confounder, this
external factor or variable must be associated with the exposure and,
independently of the exposure, must be a risk factor for the disease.
Confounding can lead to an over- or underestimation
of the true association between exposure and outcome.
Example 1: In a study conducted to determine the association
between smoking and myocardial infarction (MI), age can be a
confounder as it is associated with both exposure and outcome
independently.
         MI +    MI -
Yes      29      135
No       205     1607
Total    234     1742
= 1.68
Recent OC use    MI +    MI -    Estimated age-specific relative risk
Yes              4       62      7.2
No               2       224
Yes              9       33      8.9
No               12      390
Yes              4       26      1.5
No               33      330
Yes              6       9       3.7
No               65      362
Yes              6       5       3.9
No               93      301
Total            234     1742
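To see how the crude estimate compares with the age-specific ones, the sketch below recomputes both as cross-product ratios. Two assumptions are made here: the estimates are treated as odds ratios (ad/bc), which reproduces the printed values, and the first stratum's count of 4 OC users with MI is inferred from the column total of 234 because that cell did not survive in the text. The age-group labels are also missing, so the strata are simply listed in table order.

```python
def cross_product_ratio(a, b, c, d):
    """(a*d)/(b*c) for a 2x2 table.

    a, b: OC users with and without MI
    c, d: nonusers with and without MI
    """
    return (a * d) / (b * c)


# Crude (unstratified) table.
crude = (29, 135, 205, 1607)

# Age strata in table order; the 4 in the first row is inferred (see above).
strata = [
    (4, 62, 2, 224),
    (9, 33, 12, 390),
    (4, 26, 33, 330),
    (6, 9, 65, 362),
    (6, 5, 93, 301),
]

crude_estimate = cross_product_ratio(*crude)
stratum_estimates = [cross_product_ratio(*s) for s in strata]

print(round(crude_estimate, 2))
print([round(x, 1) for x in stratum_estimates])
```

Most of the age-specific estimates are well above the crude value of 1.68, which is the pattern expected when age distorts the crude association.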
EFFECT MODIFIERS
Effect modifiers are variables that bring about a change in the
magnitude of an effect. Unlike confounder, effect modifier does not
CHAPTER
10
SAMPLE SIZE
The sample size calculation depends on:
Type of study
Magnitude of the outcome of interest derived from previous
studies
Type of statistical analysis required (comparing means or
proportions)
Level of significance/power.
Figure 10.1 Sample size calculation and formula for single proportion
When the above values are entered into the WHO sample size
calculator, the estimated sample size is calculated (Fig. 10.1).
The estimated sample size is 369. Thus, at least
369 participants must be recruited in the study to determine the
prevalence of CKD at a confidence level of 95 percent, with a
precision of 5 percent.
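For readers without the calculator, the usual formula for estimating a single proportion, n = z² p(1 - p) / d², can be sketched as follows. The anticipated prevalence is not stated in the surviving text; the value p = 0.40 used below is an assumption chosen because it reproduces the quoted sample size of 369, and the function name is illustrative.

```python
import math
from statistics import NormalDist


def sample_size_single_proportion(p, precision, confidence=0.95):
    """n = z^2 * p * (1 - p) / d^2, rounded up to a whole participant."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # about 1.96 for 95%
    n = (z ** 2) * p * (1 - p) / (precision ** 2)
    return math.ceil(n)


# Assumed anticipated prevalence of 40% (hypothetical), precision of 5%.
n_required = sample_size_single_proportion(p=0.40, precision=0.05)
print(n_required)
```

Because p(1 - p) is largest at p = 0.5, using 50% when no prior estimate exists gives the most conservative (largest) sample size.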
Figure 10.2 Sample size calculation and formula for single group mean
Figure 10.3 Sample size calculation and formula for two proportions
Confidence level                          95%
Power                                     80%
Ratio of sample size (Group 2/Group 1)    1

                      Group 1    Group 2
Mean                  135        130
Standard deviation    16         17
Variance              256        289
Mean difference: (135 - 130) = 5

Sample size of group 1    172
Sample size of group 2    172
Total sample size         344
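The figure of 172 per group is consistent with the standard formula for comparing two means with equal group sizes, n = (z for confidence + z for power)² × (SD1² + SD2²) / (mean difference)². The sketch below assumes that this is the formula behind the calculator output, which the text does not confirm; the function name is illustrative.

```python
import math
from statistics import NormalDist


def sample_size_two_means(sd1, sd2, mean_difference,
                          confidence=0.95, power=0.80):
    """Per-group sample size for comparing two means (1:1 allocation)."""
    z_alpha = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # two-sided
    z_beta = NormalDist().inv_cdf(power)
    n = ((z_alpha + z_beta) ** 2) * (sd1 ** 2 + sd2 ** 2) / mean_difference ** 2
    return math.ceil(n)


# Inputs from the example: SDs 16 and 17, means 135 and 130.
n_group = sample_size_two_means(sd1=16, sd2=17, mean_difference=135 - 130)
print(n_group, 2 * n_group)
```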
Expected Sensitivity    0.97
Expected Specificity    0.94
Expected Prevalence     0.15
Desired Precision       0.05
Confidence level        95%
To achieve the precision of 0.05 for Sensitivity, we need a total sample size of 347.
This is preferable as it will give a precision of 0.05 or less for both sensitivity and specificity.
With this sample size, the precision for Specificity will be 0.027.
CHAPTER
11
Screening
Validity (Accuracy)
The term validity refers to the extent to which a test accurately measures
what it is intended to measure. In other words, validity expresses the
                 Disease                       No disease                       Total
Test Positive    *True Positives (a)           #False Positives (b)
Test Negative    False Negatives (c)           ~True Negatives (d)
Total            Total with Disease (a + c)    Total without Disease (b + d)    Total Screened (a + b + c + d)
*True positives = number of individuals with disease and a positive screening test
(a); #False positives = number of individuals without disease but have a positive
screening test (b); False negatives = number of individuals with disease but have a
negative screening test (c); ~True negatives = number of individuals without disease
and a negative screening test (d)
screened population. Sixty percent sensitivity means that 60 percent
of the diseased people screened by the test will give a true-positive
result and the remaining 40 percent a false-negative result. Thus,
sensitivity is expressed as the proportion of those with disease
correctly identified by a positive screening test result.
Sensitivity = $\frac{\text{Number of true positives}}{\text{Total with disease}} = \frac{a}{a + c}$
when expressed in percent = $\frac{a}{a + c} \times 100$
Specificity (in percent) = $\frac{d}{b + d} \times 100$
PREDICTIVE VALUES
Positive predictive value (in percent) = $\frac{a}{a + b} \times 100$
Negative predictive value describes the probability of not having
the disease given a negative screening test result in the screened
population. Thus, it is expressed as the proportion of those without
disease among all screening test negatives. The negative predictive
value of mammography, for example, will tell a woman the probability
that she truly does not have breast cancer if the mammogram is
negative.
Negative predictive value (NPV) = $\frac{\text{Number of true negatives}}{\text{Total test negatives}} = \frac{d}{c + d}$
when expressed in percent = $\frac{d}{c + d} \times 100$
Example
A new ELISA (antibody test) is developed to diagnose HIV infections.
Serum from 80 patients that were positive by Western Blot (the Gold
Standard assay) was tested, and 60 were found to be positive by the
new ELISA screening test. The manufacturers then used the new
ELISA to test serum from 120 study participants that were negative
by Western Blot (the Gold Standard assay); 70 were found to be
negative by the new test.
ELISA Test    HIV                                                Total
              Infected             Non-infected
Positive      60 (a = TP)          50 (b = FP)                   a + b = 110 (Total test positive)
Negative      20 (c = FN)          70 (d = TN)                   c + d = 90 (Total test negative)
Total         80 (a + c)           120 (b + d)                   a + b + c + d = 200
              Total infected       Total not infected            Total screened
Sensitivity = $\frac{a}{a + c} \times 100 = \frac{60}{80} \times 100 = 75\%$, i.e. the new ELISA test is
75 percent sensitive in correctly identifying HIV infection.

Specificity = $\frac{d}{d + b} \times 100 = \frac{70}{120} \times 100 = 58\%$, i.e. the new ELISA test is
58 percent specific in detecting non-HIV-infected persons.

PPV = $\frac{a}{a + b} \times 100 = \frac{60}{110} \times 100 = 55\%$, i.e. with ELISA, the new
screening technique for HIV, 55 percent of persons who test positive
are actually suffering from HIV.

NPV = $\frac{d}{c + d} \times 100 = \frac{70}{90} \times 100 = 78\%$, i.e. with ELISA, the new
screening technique for HIV, 78 percent of persons who test negative
are actually free from HIV.
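The four measures for the ELISA example can be computed together from the 2 × 2 counts. This is a minimal sketch; the function name `screening_measures` is illustrative, and the cell letters follow the chapter's convention (a = true positives, b = false positives, c = false negatives, d = true negatives).

```python
def screening_measures(a, b, c, d):
    """Return sensitivity, specificity, PPV and NPV as percentages."""
    return {
        "sensitivity": a / (a + c) * 100,  # true positives / all diseased
        "specificity": d / (b + d) * 100,  # true negatives / all nondiseased
        "ppv": a / (a + b) * 100,          # true positives / all test positives
        "npv": d / (c + d) * 100,          # true negatives / all test negatives
    }


# ELISA example: 80 infected (60 test positive), 120 non-infected (70 test negative).
elisa = screening_measures(a=60, b=50, c=20, d=70)
for name, value in elisa.items():
    print(f"{name}: {value:.1f}%")
```

Sensitivity and specificity are calculated down the columns of the table (by true disease status), whereas the predictive values are calculated across the rows (by test result).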
                   Disease positive    Disease negative    Total
Test (Positive)    99                  495                 594
Test (Negative)    1                   9405                9406
Total              100                 9900                10,000

PPV = $\frac{99}{594} \times 100 = 17\%$; NPV = $\frac{9405}{9406} \times 100 = 99.99\%$
However, with the same sensitivity, specificity and population
size, what will be the effect on the test's positive predictive value
(PPV) if the prevalence changes? See Example 1b.
Example 1b: In a population of 10,000 with a disease prevalence of
5%; Sensitivity = 99%; Specificity = 95% with test A.

Disease prevalence: 5%
                   Disease positive    Disease negative    Total
Test (Positive)    495                 475                 970
Test (Negative)    5                   9025                9030
Total              500                 9500                10,000

PPV = $\frac{495}{970} \times 100 = 51.03\%$; NPV = $\frac{9025}{9030} \times 100 = 99.94\%$
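The effect of prevalence on the predictive values can be explored by rebuilding the 2 × 2 table from sensitivity, specificity and prevalence. This is a minimal sketch; the function name is illustrative, and the 1% prevalence used for the first call is inferred from the first table's totals (100 diseased out of 10,000).

```python
def predictive_values(sensitivity, specificity, prevalence, population=10_000):
    """Return (PPV, NPV) in percent for a test applied to a population."""
    diseased = population * prevalence
    nondiseased = population - diseased
    true_pos = sensitivity * diseased
    false_neg = diseased - true_pos
    true_neg = specificity * nondiseased
    false_pos = nondiseased - true_neg
    ppv = true_pos / (true_pos + false_pos) * 100
    npv = true_neg / (true_neg + false_neg) * 100
    return ppv, npv


# Same test (sensitivity 99%, specificity 95%) at two prevalences.
ppv_1, npv_1 = predictive_values(0.99, 0.95, prevalence=0.01)
ppv_5, npv_5 = predictive_values(0.99, 0.95, prevalence=0.05)
print(round(ppv_1, 2), round(npv_1, 2))
print(round(ppv_5, 2), round(npv_5, 2))
```

With the test itself unchanged, raising the prevalence from 1% to 5% roughly triples the PPV, while the NPV barely moves.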
CHAPTER
12
UNPAIRED SAMPLES
In unpaired samples, there is no relation between subjects in group
1 and subjects in group 2 (two independent groups). Suppose data
are collected on ICT skills, comparing medical versus engineering
students. These are two independent groups. Whenever you are
comparing the mean of a continuous variable in two independent groups
(e.g. medical students and engineering students), an independent
sample t-test will be applied.
PAIRED SAMPLES
Flow charts 12.1 and 12.2 give different choices of tests for
qualitative and quantitative data.
Nonparametric Tests
Participants ID
Placebo
Drug
Difference
(Placebo-Drug)
13
16
-3
19
11
Figures 12.1A to D (A) Neither valid nor reliable. The research method does
not measure the research outcome (not valid) and repeated attempts are unfocused; (B) Reliable but not valid. The research method does not measure
the research outcome (not valid), but repeated attempts get almost the same
(wrong) results; (C) Fairly valid but not very reliable. The research method
measures the research outcomes fairly closely, but repeated attempts have
very scattered results (not reliable); (D) Valid and reliable. The research
method precisely measures the research outcomes, and repeated attempts
produce similar results
CHAPTER
13
Overview of Data
Collection Techniques
Observing
It is a technique that involves systematically selecting, watching
and recording behavior and characteristics of living beings, objects
or phenomena. Observations can be open (e.g. observing a health
worker during his/her routine activities) or concealed (e.g. mystery
Interviewing
Here there is oral questioning of respondents. Answers to the
questions posed during an interview are either written down or
recorded by a tape recorder, or both techniques could be used.
The unstructured method of asking questions is used. This method
is frequently used in exploratory studies where the investigator has,
as yet, little understanding of the problem, or if the topic is sensitive.
Questionnaire
Types of Questions
Mail
Telephone
Via computer
Interviewer.
[ ] 3–4
[ ] 5–6
[ ] 7 or more
If more details are required pertaining to a question, then the
filter/skip technique should be used to save time and allow
respondents to avoid irrelevant questions.
Example: Have you ever been told that you have hypertension?
Yes
No
If yes, proceed to the next question
How long back were you told that you have hypertension?
Always choose an appropriate means of measurement e.g. score/
scales.
Example: Two words that are often used inappropriately are
"frequently" and "regularly". A poorly designed question might read,
"I frequently engage in exercise", and offer a Likert scale giving
responses from "strongly agree" through to "strongly disagree".
But "frequently" implies frequency, so a frequency-based rating
scale (with options such as at least once a day, twice a week, and
so on) would be more appropriate.
Sensitive questions should be left for the end.
Using a previously validated and published questionnaire will
save your time and resources, so if similar research instruments
are available it may be a good idea to review and borrow questions.
Always try to ensure that if questions are to be asked in any
language besides English, they are also written in that language.
Projective Techniques
When a researcher uses projective techniques, he asks an informant
to react to some kind of visual or verbal stimulus.
For example, the presentation of a hypothetical question or
an incomplete sentence or case/study to an informant (story with
a gap). The researcher then asks the informant to complete the
sentence in writing such as;
CHAPTER
14
When making a plan for data processing and analysis the following
issues should be considered:
Sorting data
Performing quality-control checks
Data processing
Data analysis.
Sorting Data
For example:
Yes (or positive response)    code-Y or 1
No (or negative response)     code-N or 2
Do not know                   code-D or 8
No response/unknown           code-U or 9
Smokers       51
Nonsmokers    93
Total         144
If numbers are large enough it is better to calculate the frequency
distribution in percentages (relative frequencies): 51/144 × 100 =
35 percent are smokers and 93/144 × 100 = 65 percent are nonsmokers.
This makes it easier to compare groups than when only absolute
numbers are given. In other words, percentages standardize the
data.
Divide the range into three to five categories. You can either aim
at having a reasonable number in each category (e.g. 0–2 km,
3–4 km, 5–9 km, 10+ km for home-clinic distance) or you can
define the categories in such a way that they are each equal in size
(e.g. 20–29 years, 30–39 years, 40–49 years, etc.).
Construct a table indicating how data are grouped and count the
number of observations in each group.
Cross-tabulations: Further analysis of the data usually requires
the combination of information on two or more variables in order
to describe the problem or to arrive at possible explanations for it.
For this purpose it is necessary to design cross-tabulations.
Depending on the objectives and the type of study, two major
kinds of cross-tabulations may be required:
1. Descriptive cross-tabulations that aim at describing the
problem under study.
2. Analytic cross-tabulations in which groups are compared in
order to determine differences, or which focus on exploring
relationships between variables.
A descriptive cross-tabulation (Table 14.1) would, for example,
relate smoking behavior to sex or occupational background:
The males appear to be smoking more (43%) than females (28%).
Table 14.1: Smoking by sex
Sex        Smoking     Not smoking    Total
Males      31 (43%)    41 (57%)       72 (100%)
Females    20 (28%)    52 (72%)       72 (100%)
Total      51 (35%)    93 (65%)       144 (100%)
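The row percentages in Table 14.1 can be derived from the raw counts as sketched below. The dictionary layout and the function name `row_percentages` are illustrative choices, not part of the original analysis plan.

```python
# Raw counts from Table 14.1: (smoking, not smoking) for each sex.
counts = {
    "Males": (31, 41),
    "Females": (20, 52),
}


def row_percentages(table):
    """Convert each row of counts to whole-number percentages of its row total."""
    result = {}
    for group, row in table.items():
        row_total = sum(row)
        result[group] = tuple(round(n / row_total * 100) for n in row)
    return result


percentages = row_percentages(counts)

# Column totals give the overall smoking prevalence in the sample.
total_smoking = sum(row[0] for row in counts.values())
total_all = sum(sum(row) for row in counts.values())
overall_smoking_pct = round(total_smoking / total_all * 100)

print(percentages, overall_smoking_pct)
```

Row percentages are the natural choice here because the comparison of interest is between sexes: each row is standardized to 100% so that 43% of males can be compared directly with 28% of females.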
               Cough        No cough      Total
Smoking        10 (77%)     41 (32%)      51 (35%)
Not smoking    3 (23%)      90 (68%)      93 (65%)
Total          13 (100%)    131 (100%)    144 (100%)
CHAPTER
15
Synopsis Writing
METHODOLOGY
Operational definitions
Type of study and general design
TITLE/TOPIC
INTRODUCTION
An introduction is the most important part of the research protocol,
and it should open strongly, like a clap of thunder, to grab the
reader's attention. It is here that one tries to let the reviewer know
that the research is going to be different from what other people have
done. In the case of a research protocol, the
onus is on the researcher to tell the reviewer how important the
study is going to be. To explain: if the reviewer comes
from a specialty different from the researcher's, the former might not
Third Paragraph
The third paragraph should point out the existing gaps in scientific
knowledge and how the present study will contribute to fill in the
gaps.
A template of a third paragraph: Using the same study as an example,
this is how the researchers made their point. The outcome mortality
has been a controversial issue among LR vs ER in end stage renal
disease (ESRD) patients. The previous studies on this subject have
been single center studies with a sample size of a few hundred
patients only.
Our study will be carried out on a generalizable United States
population of about 350,000 dialysis patients recruited from all
states of the US. Our study will also use a novel statistical
technique called propensity score analysis (PS analysis). PS analysis
is a proxy for randomization. Thus using a PS analysis will make the
study as good as a randomized controlled clinical trial. Hence, our
study is going to make an effort to settle this controversy regarding
LR vs ER in a robust fashion.
Fourth Paragraph
The fourth paragraph should give details about the rationale of the
study planned. Thus a clear emphasis must be made why this study
is important.
A template of a fourth paragraph: The outcome (i.e. mortality)
associated with late vs early referral has been a controversial subject
and has generated immense debate among the researchers. There
is lack of consensus among researchers whether late referral is
Research Objectives
A research objective is a statement that clearly depicts the goal to be
achieved by a research project. In other words, the objectives of a
research project summarize what a study plans to achieve.
The formulation of objectives will help you to:
Focus the study (narrowing it down to essentials)
Avoid the collection of data which are not strictly necessary for
understanding and solving the problem you have identified (to
establish the limits of the study)
Organize the study in clearly defined parts or phases.
Properly formulated, specific objectives will facilitate the
development of your research methodology and will help to orient
the collection, analysis, interpretation and utilization of data.
Objectives should be stated using action verbs that are specific
enough to be measured:
Examples: To determine, To compare, To verify, To calculate,
To describe, etc.
Do not use vague nonaction verbs such as:
To appreciate, To understand, To believe
An objective is a statement of what the researcher intends to determine
and should be stated in clear, measurable terms. While developing
a research protocol, a researcher must ensure that the research
objective matches the hypothesis and the data analysis plan.
Moreover, a researcher can have as many objectives as he or she feels
the study can feasibly achieve.
Given below is an example of specific aims/objectives mentioned
for a study looking at the impact of socioeconomic factors on
Operational Definition
Specific Aim 2: Determine the influence of complications of CKD
(anemia, malnutrition, hyperlipidemia, abnormal calcium-phosphorus
metabolism), comorbid conditions (hypertension, diabetes,
cardiovascular), and socioeconomic factors (decreased access to care)
on mortality among KTR
Hypothesis/Rationale (for Specific Aim 1): The prevalence of
complications of CKD (anemia, malnutrition, hyperlipidemia, abnormal
calcium-phosphorus metabolism) and comorbid conditions (diabetes,
hypertension, cardiovascular disease) is high among kidney transplant
recipients
Hypothesis/Rationale (for Specific Aim 2): Complications of CKD
(anemia, malnutrition, hyperlipidemia, abnormal calcium-phosphorus
metabolism), comorbid conditions (hypertension, diabetes,
cardiovascular), and decreased access to care are associated with
increased mortality among KTR
Statistical analysis (for Specific Aim 1): Descriptive statistics
and/or frequency distributions of continuous variables and of
categorical variables will be obtained. The prevalence of
Statistical analysis (for Specific Aim 2): Descriptive statistics
and/or the proportion of deaths among kidney transplant recipients
will be determined for overall deaths and by specific causes
Contd...
Sampling Method
Duration of Study
It is also important to state clearly the time period during which the
data will be collected. For example, all participants who attend the
outpatient diabetic clinics of XYZ hospital from 1st January 2012 to
31st December 2013 will be included in the study.
Software
The sample size calculation was done using the WHO software for
Sample Size Calculation edited by Lemeshow S and Lwanga SK.
Reference Study
The reference study used for this sample size calculation is:
Charit, Virchow Klinikum et al. Betel quid chewing, oral cancer
and other oral mucosal diseases in Vietnam. J Oral Pathol Med.
2008 Oct;37(9):511-4. Epub 2008 Jul 8. The values obtained from the
reference study are P1 = 0.30 (30% of the controls in the reference
study were consuming betel quid, a substance similar to ghutka) and
P2 = 0.70 (70% of the cases in the reference study were consuming
betel quid). These two numbers, 30 percent and 70 percent, were
entered into the WHO sample size software.
According to the proportion of exposures in cases and controls
in the above study, the sample size calculated is 38 (Fig. 15.1). The
results of the study are valid as confirmed by sample size calculation
using WHO software for sample size calculation.
Although the calculated sample size according to the WHO
software is 38 cases and 38 controls.
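The calculation described above can be sketched in a few lines of Python. This is only an illustration, not the WHO software itself: the function name is hypothetical, it uses the common pooled-variance formula for comparing two proportions, and the significance level (two-sided 0.05) and power (95%) are assumptions, because the text does not state which settings were used.

```python
import math
from statistics import NormalDist


def sample_size_two_proportions(p1, p2, alpha=0.05, power=0.95):
    """Per-group sample size for detecting a difference between two proportions.

    p1, p2 : expected proportion exposed among controls and among cases.
    alpha  : two-sided significance level (assumed, not given in the text).
    power  : desired power, 1 - beta (assumed, not given in the text).
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for alpha
    z_beta = NormalDist().inv_cdf(power)           # critical value for power
    p_bar = (p1 + p2) / 2                          # pooled proportion
    numerator = (
        z_alpha * math.sqrt(2 * p_bar * (1 - p_bar))
        + z_beta * math.sqrt(p1 * (1 - p1) + p2 * (1 - p2))
    ) ** 2
    # Round up: a fraction of a participant cannot be recruited.
    return math.ceil(numerator / (p1 - p2) ** 2)


# P1 = 0.30 (controls) and P2 = 0.70 (cases), as in the reference study.
print(sample_size_two_proportions(0.30, 0.70))
```

Under these assumed settings the sketch returns 38 per group, the same figure reported above. Lowering the assumed power to 80% gives 24 per group, which shows how strongly the result depends on settings that should always be stated in the synopsis.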
Ethical Concerns
Ethical concerns are of paramount importance for any research.
The researcher must obtain informed consent in the local
language from all the participants. The purpose of the research,
the intervention to be given, potential benefits and harms, voluntary
participation, healthcare cost, etc. must be explained in detail to
all study participants. It is also important to protect the rights of
vulnerable groups (e.g. children, mentally ill people). If children
are to be included in the study, consent from a guardian is essential.
A translated version of the informed consent form must be attached
as an appendix. It is the duty of the researcher to ensure that the
anonymity of the participants is maintained throughout the
research. Moreover, the confidentiality of participants' responses must
also be maintained during the research. The researcher must make
sure that appropriate data protection policies are adopted, so that no
unauthorized person has access to confidential data collected from
study participants. Finally, the researcher must ensure that the
study is conducted in accordance with the Declaration of Helsinki,
and, if deemed necessary, approval from the local
ethical review board should be obtained. All these details must be
included in the ethical consideration portion of the methodology.
Data Analysis
Descriptive Analysis
The data analysis usually begins with the descriptive analysis, which
describes the characteristics of the population/sample being studied.
The descriptive analysis is usually presented in research studies as
shown in Table 15.2.
A widely accepted wording of the descriptive analysis plan, if the
study describes a single sample/population, is given here:
A descriptive statistical analysis of continuous and categorical
variables will be performed. Data on continuous variables will be
presented as mean ± SD and data on categorical variables will be
presented as proportions.
Please note that there is no p-value column in Table 15.2, as no
comparison is being made.
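As a minimal sketch of such a descriptive analysis, the snippet below summarizes one continuous and one categorical variable using only the Python standard library. The function names and the data values are hypothetical and serve purely as an illustration.

```python
from collections import Counter
from statistics import mean, stdev


def describe_continuous(values):
    """Summarize a continuous variable as 'mean ± SD' (sample SD)."""
    return f"{mean(values):.1f} ± {stdev(values):.1f}"


def describe_categorical(values):
    """Summarize a categorical variable as the percentage in each category."""
    counts = Counter(values)
    total = len(values)
    return {level: round(100 * n / total, 1) for level, n in counts.items()}


# Hypothetical data for five participants.
hematocrit = [33.0, 36.5, 30.2, 38.1, 34.7]
gender = ["Male", "Female", "Male", "Male", "Female"]

print(describe_continuous(hematocrit))   # continuous: mean ± SD
print(describe_categorical(gender))      # categorical: % per category
```

Each summary would fill one row of a descriptive table: continuous variables in the "mean ± SD" form and categorical variables as percentages.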
If a comparison is to be made between two groups, then the values
of each variable must be calculated in both groups, with a p-value
indicating whether the groups differ (Table 15.3).
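For a categorical variable, one common way to obtain such a p-value is Pearson's chi-square test on a 2 x 2 table. The text does not name a specific test, so this choice is an assumption, as are the function name and the counts in the sketch below. No continuity correction is applied.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square test (1 degree of freedom) for a 2 x 2 table.

    Table layout (counts):
                    with factor   without factor
        group 1          a              b
        group 2          c              d
    Returns (chi-square statistic, two-sided p-value).
    Assumes no row or column total is zero.
    """
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # For 1 degree of freedom, P(X > chi2) equals erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical counts: 30 of 100 in group 1 and 45 of 100 in group 2
# have the characteristic of interest.
chi2, p = chi_square_2x2(30, 70, 45, 55)
print(f"chi-square = {chi2:.2f}, p = {p:.3f}")
```

Continuous variables would need a different test (for example, a t-test), and small cell counts call for an exact test instead. The statistical package used for the study will normally provide all of these.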
Ideally, a statistical analysis should include various types of
analyses, such as cross-tabulations, linear regression, multivariate
regression analysis, and survival analysis. New researchers are
strongly encouraged to include these types of analysis to add
depth and rigor to the research. Examples of some of the analyses
mentioned above are given here.
Variable                                        Mean ± SD or %
Gender
    Male
Race
    Caucasians
    African-American
    Asian
    Other
Insurance
    Private
    Health Maintenance Organization (HMO)
    Medicare
    Medicaid
    None
Comorbidity index
    Zero
    One
    Two
    Three
Cause of CRI
    Diabetes mellitus
    Hypertension
    GN/PKD/IN
    Other
Laboratory values
    Serum creatinine (mg/dL)
    GFR (mL/min/1.73 m²)
    BUN (mg/dL)
    Serum albumin (g/dL)
    Hematocrit (Hct) (%)
Creatinine categories: Cr (2-3) mg/dL, Cr (3-4) mg/dL, Cr (4-5) mg/dL, Cr (>5) mg/dL
Hemoglobin categories: Hb >12 g/dL, Hb 11-12 g/dL, Hb 10-11 g/dL, Hb <10 g/dL
Reported values: OR (95% CI), p-value
synopsis stage. The researcher so far does not have the data, but
he has in mind how the associations between these two continuous
variables should look (Figs 15.4 and 15.5). A true association
between the continuous variables hematocrit and GFR, and hematocrit
and creatinine, can be seen in Figures 15.3A and B, which are from a
published study by Kazmi et al.
BIBLIOGRAPHY
CHAPTER
16
Dissertation Writing
TITLE
It should highlight the key features of the study.
TABLE OF CONTENTS
Include headings and subheadings with their page numbers.
TITLE PAGE
It includes complete title of the manuscript, the name of the authors
with their highest qualifications, the department or institution to
which they are attached, address for correspondence with telephone
numbers and fax number, if possible.
ABSTRACT
Structured: All original articles should have a structured abstract.
The word limit usually ranges from 150 to 250 words. The abstract
should have headings of objective, study design, settings, subjects,
interventions (if applicable), main outcome measures, results and
conclusions.
Keywords: Below the abstract, give a few keywords, not more than
ten. These keywords are used in cross-indexing the article
and are usually published with the abstract. Use terms from the
Medical Subject Headings (MeSH) list of Index Medicus,
e.g. glomerulonephritis, paraplegia, infertility. If MeSH terms are
not yet available for recently introduced terms, the terms in current
use may be used. Keywords are included with the structured abstract.
INTRODUCTION
It includes:
Importance of the subject (what is known).
Limitation of previous studies/gray areas/controversies (what is
unknown).
Justification of your study/rationale (based on the above aspects
e.g., gaps in knowledge).
Any special strength of your study.
HYPOTHESIS
It is an expected relationship between the exposure and the outcome.
STUDY OBJECTIVE
Formulate your objective(s) clearly. Remember: quality thoughts
precede quality results.
RESULTS
Firstly, the demographic profile is shown (e.g. if the study is done
on human subjects, show the different age groups, common areas of
DISCUSSION
It should emphasize the salient features of the present findings.
Variations from, or similarities with, the results of previous
similar studies, both national and international, should be compared,
with references. The detailed data should not be repeated in the
discussion. It must be mentioned whether the hypothesis in the
article was rejected, or could not be rejected. It is important to
remember to discuss in the discussion section only the points you
have highlighted in the results. The second-last paragraph highlights
the limitations of your study. It is a good idea to mention your
limitations before they are pointed out to you by the reviewer. The
conclusions of your study must be based on what you have observed
in your results.
OPTIONAL COMPONENTS
REFERENCES
Citations in the text should be serially numbered. List
the references in Vancouver style.
ANNEXES
Annexes should be added if they increase the understanding or evaluation
of the study. All annexes should be serially numbered and referred
to at appropriate places in the body of the dissertation.
BIBLIOGRAPHY
CHAPTER
17
Reference Writing
Put a comma and 1 space between each name. The last author
must have a full-stop after his or her initial(s).
Format of name(s) of author(s): Surname (1 space) initial(s) (no
spaces or punctuation between initials) (full-stop
OR, if further names follow, comma, 1 space).
Example: Halpern SD, Ubel PA, Caplan AL. Solid-organ
transplantation in HIV-infected patients. N Engl J Med.
2002;347(4):284-7.
As an option, if a journal carries continuous pagination
throughout a volume (as many medical journals do) the month
and issue number may be omitted.
Example: Halpern SD, Ubel PA, Caplan AL. Solid-organ
transplantation in HIV-infected patients. N Engl J Med.
2002;347:284-7.
More than six authors
Example: Rose ME, Huerbin MB, Melick J, Marion DW, Palmer
AM, Schiding JK, et al. Regulation of interstitial excitatory
amino acid concentrations after cortical contusion injury.
Brain Res. 2002;935(1-2):40-6.
Organization as author
Example: Diabetes Prevention Program Research Group.
Hypertension, insulin, and proinsulin in participants with
impaired glucose tolerance. Hypertension. 2002;40(5):679-86.
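The journal-article pattern shown in these examples can be sketched as a small Python helper. It is an illustrative sketch only: the function names are hypothetical, and it covers just the cases shown above (authors, title, abbreviated journal name, year, volume, optional issue, pages), with the first six authors listed followed by "et al." when there are more.

```python
def format_authors(authors, limit=6):
    """Join author names with ', ' and add 'et al' after the first six."""
    text = ", ".join(authors[:limit])
    if len(authors) > limit:
        text += ", et al"
    return text + "."  # full-stop after the last author (or after 'et al')


def format_journal_reference(authors, title, journal, year, volume, pages,
                             issue=None):
    """Build 'Authors. Title. Journal. Year;Volume(Issue):Pages.'"""
    # The issue may be omitted when the journal paginates continuously.
    volume_part = f"{volume}({issue})" if issue else str(volume)
    return (f"{format_authors(authors)} {title}. {journal}. "
            f"{year};{volume_part}:{pages}.")


print(format_journal_reference(
    ["Halpern SD", "Ubel PA", "Caplan AL"],
    "Solid-organ transplantation in HIV-infected patients",
    "N Engl J Med", 2002, 347, "284-7", issue=4,
))
```

Calling the same function without `issue` reproduces the shorter form used for journals with continuous pagination. An organization as author can be passed as a single-item list.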
JOURNAL TITLE
Volume Number
If the journal has continuous page numbering throughout a volume,
the month/day and issue information can be omitted.
Format volume of publication: Volume number (no space) issue
number in brackets (colon, no space) OR volume number (colon,
no space).
Example: 4(3):
Page Numbers
Format of page number: Page numbers (full-stop).
Example: pp. 122-9.
Example: pp. 1129-57.
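The examples in this chapter shorten the last page of a range by dropping the leading digits it shares with the first page (122-9 rather than 122-129). The sketch below implements that convention as inferred from the examples. The rule itself is not spelled out in the text, so treat it as an assumption, and the function name is hypothetical.

```python
def abbreviate_pages(first, last):
    """Return a page range with the repeated leading digits of the last page dropped."""
    a, b = str(first), str(last)
    if len(a) != len(b):
        # Different number of digits: nothing can be dropped safely.
        return f"{a}-{b}"
    i = 0
    # Skip shared leading digits, but always keep at least one digit.
    while i < len(a) - 1 and a[i] == b[i]:
        i += 1
    return f"{a}-{b[i:]}"


print(abbreviate_pages(122, 129))
print(abbreviate_pages(1129, 1157))
```

With the ranges used in this chapter the helper gives 122-9 and 1129-57, matching the examples above.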
OTHER AUTHORS
More than six authors: Give the first six names in full and add et
al. The authors are listed in the order in which they appear on the
title page.
Editor(s): Follow the same methods used with authors but use the
word "editor" or "editors" in full after the name(s). The word "editor"
or "editors" must be in lower case. (Do not confuse with "edn", used
for edition).
Example: Millares M, editor. Applied drug information:
strategies for information management. Vancouver, WA:
Applied Therapeutics, Inc.; 1998.
Sponsored by institution, corporation or other organization
(including Pamphlet)
Example: Australian Pharmaceutical Advisory Council.
Integrated best practice model for medication management
in residential aged care facilities. Canberra: Australian
Government Publishing Service; 1997.
Chapter or part of a book to which a number of authors have
contributed.
Format of book chapter: Author(s)/editor(s) of chapter. Title
of chapter. In: author(s)/editor(s) of book. Title of book. City of
publication (State or country of publication): Publisher; year.
pages of book chapter.
Example: Porter RJ, Meldrum BS. Antiepileptic drugs. In:
Katzung BG, editor. Basic and clinical pharmacology. Norwalk,
CN: Appleton and Lange; 1995. pp. 361-80.
DISSERTATION REFERENCE
Example: Borkowski MM. Infant sleep and feeding: a telephone
survey of Hispanic Americans [dissertation]. Mount Pleasant (MI):
Central Michigan University; 2002.
BIBLIOGRAPHY
1. International Committee of Medical Journal Editors. Uniform
requirements for manuscripts submitted to biomedical journals: sample
references [monograph on the Internet]. Bethesda (MD): National
Library of Medicine (US); 2003 [cited 10 Aug. 2008]. Available from:
URL: http://www.nlm.nih.gov/bsd/uniform_requirements.html.
2. Uniform requirements for manuscripts submitted to biomedical
journals. International Committee of Medical Journal Editors. CMAJ.
1995;152(9):1459-73.
CHAPTER
18
IMPORTANT NOTES
Studies should not be done at patients' expense.
If any new or additional tests are to be done as a requirement of
the study, their cost should be borne by the study.
If a new treatment is compared with an existing, established one,
or two treatment modalities are being evaluated and compared, the
cost of treatment or the difference in cost of treatment should be
borne by the study. In addition, any expected or unexpected
complication arising as a result of the new treatment should also be
supported by the study.
Studies which are unlikely to produce any significant results
because of faulty design are often considered unethical, as
such studies waste time and resources. These should
be avoided unless there is a strong justification.
BIBLIOGRAPHY
1. Agard E, Finkelstein D, Wallach E. Cultural Diversity and Informed
Consent. The Journal of Clinical Ethics. 1998;9(2):173-6.
2. Sugarman J, Popkin B, Fortney J, Rivera R. International Perspectives
on Protecting Human Research Subjects. Crystal City, VA: National
Bioethics Advisory Commission Draft, 2000.
3. World Health Organization and Council for International Organizations
of Medical Sciences (WHO-CIOMS). International Ethical Guidelines
for Biomedical Research Involving Human Subjects. Author, Geneva,
1993.
CHAPTER
19
Consent to Participate
in Research (Sample)
PROCEDURES
Confidentiality
IDENTIFICATION OF INVESTIGATORS
If you have any questions or concerns about this research, please
contact: identify research personnel: Principal Investigator, Faculty
Sponsor (if student is the PI), Co-Investigator(s), if any. Include
day phone numbers, addresses, and email addresses for all listed
________________________________________
Printed Name of Subject

________________________________________    ________________________________________
Signature of Subject                        Date

________________________________________    ________________________________________
Signature of Witness                        Date
BIBLIOGRAPHY
1. www.uoguelph.ca/research/forms/.../sample%20consent%20form.doc