
Medical Statistics

Part I: Medical Research


Definition of Research:
Research is the systematic collection, presentation, analysis and interpretation of data to
answer a certain question or to solve a problem. So before data collection begins, the
study problem or question and the objectives should be clear.
Steps of medical research: (Simplified in the following figure)
Study Problem:
An unusual health-related situation that we need to know more about.
Study Goal (study question):
Focuses on the reason for conducting the study; it may also state a hypothesis.
Study Objectives:
After identifying your goal, precise objectives should be stated.
Stating the objectives specifies what will be done in the study, where and when.
Study Design:
Which type of study is used?
Sample Type and Size.
Sources of Data & Tools for Data Collection.
Statistical Analysis and Presentation:
After data collection, statistical analysis and presentation take place
to satisfy the specific objectives. Interpretation follows statistical analysis.

Chapter 1: Study Design in Medical Research


Types of Epidemiological Studies:
I- Observational Studies :
A. Descriptive studies:
Population-based: Correlation studies
Individual-based :
- Case report and case series
- Cross-sectional studies
B. Analytic studies:
Case Control Study
Cohort Study
II- Experimental Studies (Interventional studies):
Clinical Trials
Community Trials
I- Observational Studies:
A. Descriptive Studies: Hypothesis generating

These studies usually explore the frequency (prevalence) and describe the pattern (i.e.
distribution according to person, time and place) of the disease in the community. This
helps us to develop hypotheses about risk factors of the disease.
Cross Sectional Survey:
In this type, the health status of individuals is assessed with respect to the presence or
absence of both exposure and disease at the same point in time (a cross-section of the
population). For this reason you cannot determine whether exposure really preceded disease
or not, e.g. assessing the presence of obesity in relation to diabetes mellitus.

Screening Tests
Definition:
Application of tests, examinations, or other procedures which can be applied rapidly to sort
out apparently well persons who probably have a disease from those who probably do not.
Why use screening tests:
The disease is an important public health problem.
Early detection of disease.
Early treatment that helps rapid cure.
Criteria of screening test:
Simple, easy to conduct.
Not invasive.
Not painful.
Not time consuming.
Cheap.
Valid (sensitive and specific) and accurate.
Reliable (gives similar results whenever repeated).
Validity: the degree to which a test is capable of differentiating between the presence and
absence of the disease concerned.
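As a rough illustration of validity, the sketch below computes sensitivity (the proportion of diseased persons the test correctly labels positive) and specificity (the proportion of disease-free persons it correctly labels negative). The function name and all counts are hypothetical, chosen only for illustration; they do not come from any real screening programme.

```python
def sensitivity_specificity(true_pos, false_neg, true_neg, false_pos):
    """Return (sensitivity, specificity) as proportions between 0 and 1."""
    # Sensitivity: diseased persons correctly detected / all diseased persons
    sensitivity = true_pos / (true_pos + false_neg)
    # Specificity: healthy persons correctly cleared / all healthy persons
    specificity = true_neg / (true_neg + false_pos)
    return sensitivity, specificity


# Hypothetical screening results (assumed numbers, for illustration only)
sens, spec = sensitivity_specificity(true_pos=45, false_neg=5,
                                     true_neg=180, false_pos=20)
print(f"Sensitivity = {sens:.0%}")
print(f"Specificity = {spec:.0%}")
```

A test with high sensitivity misses few diseased persons, while a test with high specificity wrongly labels few healthy persons as diseased.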

Example:

Screening of neonates for congenital hypothyroidism by measuring the TSH level (done in
Egypt, in primary health care (PHC) facilities).

Advantages of Descriptive Studies:


1. Studies could be conducted with the least resources (personnel and equipment).
2. They give a general overall picture of the problem (prevalence rate).
3. Very quick, inexpensive.

Disadvantages of Descriptive Studies:


1- Impossible to calculate the rate of new disease occurrence (incidence).
2- Cannot be used to establish a relation between exposure to risk factors and disease.

B. Analytical studies: Hypothesis testing


They study the determinants of the disease: why (cause or risk factors) and how the
disease occurs.

Forms of analytical studies:


1. Case-control study:
It is an observational design comparing exposures in disease cases versus healthy controls
from the same population. Exposure data are collected retrospectively.

Advantages of Case-Control Study:


Quick, inexpensive.
Well-suited to the evaluation of diseases with long latency period.
Useful in rare diseases.
Examine multiple etiologic factors for a single disease.

Disadvantages of Case-Control Study:


Not useful for rare exposures.
Incidence rates cannot be estimated.
Prone to selection bias and recall bias.

2. Cohort Study:
It is a prospective study, i.e. it follows up participants to record the future incidence of a disease.
It involves:
i. Study cohort:
Individuals exposed to a certain factor that may be associated with a disease e.g. smoking
and lung cancer.
ii. Control cohort:
A group of individuals not exposed to the studied factor.
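A cohort study compares the incidence of disease in the two cohorts, and one common summary of that comparison is the relative risk. The following is a minimal sketch, assuming the relative risk is computed as incidence in the study (exposed) cohort divided by incidence in the control (unexposed) cohort. The function name and the counts are hypothetical.

```python
def relative_risk(exposed_cases, exposed_total, unexposed_cases, unexposed_total):
    """Relative risk = incidence among exposed / incidence among unexposed."""
    incidence_exposed = exposed_cases / exposed_total
    incidence_unexposed = unexposed_cases / unexposed_total
    return incidence_exposed / incidence_unexposed


# Hypothetical follow-up results (assumed numbers, for illustration only):
# 30 new cases among 1000 smokers, 10 new cases among 1000 non-smokers
rr = relative_risk(exposed_cases=30, exposed_total=1000,
                   unexposed_cases=10, unexposed_total=1000)
print(f"Relative risk = {rr:.1f}")
```

A relative risk above 1 suggests that the disease is more frequent among the exposed; a value near 1 suggests no association.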

Advantages of Cohort Study:

Describe the natural history
Temporal sequence
Study rare exposure
Multiple outcome
Calculate relative risk

Disadvantages of Cohort Study:

Large number
Long term of follow up
Loss to follow up
Expensive
Change of exposure during the study

II- Experimental (Interventional) Studies:


Experimental studies in epidemiology usually take the form of clinical trials and
community intervention trials. The objective of most clinical trials is to test the possible
effect, that is, the efficacy, of a therapeutic or preventive treatment such as a new drug,
physical therapy or dietary regimen for either treating or preventing the occurrence of a
disease.
The objective of most community intervention trials is to assess the effectiveness of a
prevention program.

Chapter 2: Sampling in Medical Research


A sample is a part of a whole population selected in order to gain information about
the whole population, so the sample should be representative if its results are to be
generalized to the whole population.

Benefits of Sampling:
- Saves effort, money and time.
- Testing every unit can be destructive.
3 Steps to Sampling:
- Identify the population.
- Determine the required sample size.
- Select the sample.

Sampling Techniques:

I- Random (probability) Sample :


Random in statistics means:
All units of the population are known and available for sampling.
All units have an equal chance of being included in the sample (probability).
Unit: the element of interest (person, house, or place).

Types of Random Sample:


1. Simple Random Sample:
2. Cluster Random Sample:
3. Stratified Random Sample:
4. Systematic Random Sample:
5. Multi-stage Sampling:
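The notes only name these techniques, so the sketch below illustrates three of them using their usual textbook definitions: simple random (every unit drawn at random), systematic (random start, then every k-th unit) and stratified (a random draw from each stratum, here with the same fraction per stratum). The function names, the population of 100 numbered units and the strata are all hypothetical.

```python
import random


def simple_random_sample(units, n, seed=None):
    """Draw n units at random, without replacement."""
    rng = random.Random(seed)
    return rng.sample(units, n)


def systematic_sample(units, n, seed=None):
    """Pick a random start, then take every k-th unit, with k = len(units) // n."""
    rng = random.Random(seed)
    k = len(units) // n
    start = rng.randrange(k)
    return [units[start + i * k] for i in range(n)]


def stratified_sample(strata, fraction, seed=None):
    """Draw the same fraction at random from every stratum."""
    rng = random.Random(seed)
    sample = []
    for members in strata.values():
        size = round(len(members) * fraction)
        sample.extend(rng.sample(members, size))
    return sample


# Hypothetical population: units numbered 1..100
population = list(range(1, 101))
srs = simple_random_sample(population, 10, seed=1)
sys_sample = systematic_sample(population, 10, seed=1)

# Hypothetical strata: 60 males (units 1..60) and 40 females (units 61..100)
strata = {"male": list(range(1, 61)), "female": list(range(61, 101))}
strat = stratified_sample(strata, 0.1, seed=1)

print(srs)
print(sys_sample)
print(strat)
```

The stratified draw keeps the 60:40 split of the population in the sample (6 males and 4 females), which a simple random sample of 10 units does not guarantee.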

II- Non-probability Sample (Non-random Sample):


Convenience Sampling: (haphazard or accidental)
Purposive Sampling:

Factors affecting sample size:


Importance of the study (a more important study needs a larger sample).
Variables of the study (the more variables, the larger the sample size).
Magnitude of the problem (inversely affects sample size).
Facilities, 3M (man, money and material).
Statistical analysis and power.


Chapter 3: Sources of Data


Definitions:
Data: Measurement with precise definition.
Information: Translation of the measurement into meaningful knowledge.
e.g. Ahmed's temperature is 37 °C (by mouth): data
     Ahmed's temperature is normal: information

Sources of Data:
1. Census Data: It is usually taken every ten years to:
Enumerate the population.
Know socio-demographic characteristics of people.
Calculate vital statistics (morbidity, mortality, and fertility indices).
2. Records:
a. Records of health offices :
These are records for registration of births, deaths, occurrence of infectious diseases
and immunization of newborns and children.
b. Annual statistical reports:
These reports are published by the Ministry of Health (MOH) and the
World Health Organization (WHO).
c. Case records: These are usually reported by hospitals and outpatient clinics.
3. Survey: data collection for a specific health problem by conducting special studies.

4. Others: Focus Group Discussion, observation, and in-depth Interview.

Data Collection Tools:


1. Questionnaire.
2. Observation checklist.
3. Data collection forms.
4. Other data collection tools.
Photography / Video: provides visually represented information
Maps and drawing.


Part II: Statistical Management of Data


Statistical management of data contains both statistical presentation and analysis of data. It
will take place after data collection to satisfy the specific objectives.

Variable:
Definition:
A variable is a characteristic with different measurements (values) that may vary from object
to object; each measurement gives a different picture of the disease. Variables are better
expressed as data.
e.g. Age is a variable that has different measurements.
Sex is a variable that has 2 measurements: Male and Female.
Types of variables (data):
Quantitative
Qualitative
I- Quantitative (Numerical):
Measurements are expressed in numbers.
A- Discrete variables: variable expressed as a whole number with no fraction.
e.g. - Number of children in family.
- Number of pregnancies.
- Pulse rate.

B- Continuous: the value changes continuously and fractions may be present.
e.g. Height, Weight, and Age.

II- Qualitative (categorical):


Measurements are expressed as descriptions.
A- Nominal: no special arrangement.
e.g. - Sex (Male and Female), Presence of Hypertension (Yes or No) (Dichotomous).
- Blood group (ABO) and race (White, Black, and Hispanic) (Multichotomous).

B- Ordinal: data can be arranged in order.


e.g. grade of disease (mild, moderate, and severe).

Chapter 1: Presentation of data


(Descriptive statistics)
Descriptive statistics refer to statistical techniques used to summarize and describe the
main features of data.
Aim of presentation:
Find out the commonest value.
Find out group variation.
Identify the odd value (strange value).

Ways of data presentation


Tables, Graphs, Parameters


I- Presentation of quantitative (Numerical) variables:


Table:
Frequency distribution and relative frequency table:

Weight interval    Frequency    Relative frequency (%)
10-19              5            8.8
20-29              19           33.3
30-39              10           17.5
40-49              13           22.8
50-59              4            7.0
60-69              4            7.0
70-79              2            3.5
Total              57           100.0

Graph: Histogram and the Frequency Polygon:

Histogram

Frequency polygon

If a smooth line is drawn through the points instead of a polygon, it is known as a
frequency curve.

Frequency curve
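The relative frequencies in the weight table above can be recomputed from the frequencies alone. The short sketch below does this, assuming the weight intervals are 10-19, 20-29 and so on (the separators are not legible in the source), and prints the table as plain text.

```python
# Frequencies taken from the weight table; interval labels are assumed
intervals = ["10-19", "20-29", "30-39", "40-49", "50-59", "60-69", "70-79"]
frequencies = [5, 19, 10, 13, 4, 4, 2]

total = sum(frequencies)
# Relative frequency (%) = frequency / total * 100, rounded to one decimal
relative = [round(100 * f / total, 1) for f in frequencies]

print(f"{'Weight interval':<18}{'Frequency':>10}{'Relative (%)':>14}")
for interval, f, r in zip(intervals, frequencies, relative):
    print(f"{interval:<18}{f:>10}{r:>14.1f}")
print(f"{'Total':<18}{total:>10}{100.0:>14.1f}")
```

The same frequencies are what a histogram or frequency polygon would plot on the vertical axis against the weight intervals.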


Parameters:
In quantitative variables we use two concomitant measures to summarize the data,
measures of central tendency (the middle) and measures of dispersion (variability).
I- Measures of Central Tendency: mean and median only
a. Mid Range:
The value that lies midway between the highest (maximum) and lowest
(minimum) values.

Mid range = (Maximum + Minimum) / 2


b. The Mean:

$$\bar{X} = \frac{\sum X}{n}$$

Where $\sum$ (Greek letter sigma) means to add, X represents the individual observations,
and n is the number of observations.
c. The Median:
The median is the middle value in a set of data arranged in order of magnitude. It
divides the data into 2 equal groups above and below the median value.
d. The Mode:
The mode is the value that occurs most frequently.
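The four measures of central tendency can be checked on a small data set. The sketch below uses Python's standard `statistics` module; the list of weights is hypothetical and chosen only so the arithmetic is easy to follow.

```python
import statistics

# Hypothetical weights in kg (assumed values, for illustration only)
weights = [52, 60, 60, 65, 71, 74, 80]

mid_range = (max(weights) + min(weights)) / 2   # (80 + 52) / 2
mean = statistics.mean(weights)                 # sum of values / number of values
median = statistics.median(weights)             # middle value of the sorted data
mode = statistics.mode(weights)                 # most frequent value

print("Mid range:", mid_range)
print("Mean:", mean)
print("Median:", median)
print("Mode:", mode)
```

Here the mean and the mid range are both 66, the median is 65 and the mode is 60, which shows that the measures need not coincide even in a small data set.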


II- Measures of Dispersion: SD and percentile (interquartile range)


a. Minimum and Maximum:
These are the lowest and highest values of the data.
b. The Range:

Range = Maximum - Minimum


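c. The Standard Deviation:
The standard deviation is a measure of the average spread of the observations
about the mean. The standard deviation is symbolized as SD, or simply s, and its
formula is:

$$s = \sqrt{\frac{\sum (X - \bar{X})^2}{n - 1}}$$

where X represents the individual observations, $\bar{X}$ is their mean, and n is the
number of observations.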

d. Percentiles, Deciles and Inter-quartile Range:

Percentiles divide the data into 100 equal parts, each part representing 1 % of all
values. The 90th percentile is the value with 90 % of all values below it and 10 % above it;
the 50th percentile equals the median of the data set. Deciles divide the data into 10
equal parts, while quartiles divide the data into four equal parts. The interquartile
range (IQR) contains the middle 50 % of the values. It is obtained as Q3 - Q1 (i.e. the
75th percentile minus the 25th percentile).
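The measures of dispersion can be illustrated in the same way. The sketch below uses the standard `statistics` module on a hypothetical list of weights. `statistics.stdev` computes the sample standard deviation (dividing by n - 1), and `statistics.quantiles` uses one of several quartile conventions, so other software may report slightly different quartiles for the same data.

```python
import statistics

# Hypothetical weights in kg (assumed values, for illustration only)
weights = [52, 60, 60, 65, 71, 74, 80]

data_range = max(weights) - min(weights)         # Range = Maximum - Minimum
sd = statistics.stdev(weights)                   # sample SD (n - 1 in the denominator)
q1, q2, q3 = statistics.quantiles(weights, n=4)  # the three quartile cut points
iqr = q3 - q1                                    # interquartile range = Q3 - Q1

print("Range:", data_range)
print(f"SD: {sd:.2f}")
print("Q1, median, Q3:", q1, q2, q3)
print("IQR:", iqr)
```

The second quartile returned by `quantiles` is the median, matching the statement that the 50th percentile equals the median.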


II- Presentation of qualitative (categorical) variables:


Table:
1. Simple Table: "simple frequency & relative frequency table":

Sex       Number    Percent (%)
Male      42        70
Female    18        30
Total     60        100

Sex distribution of a set of out-patients in hospital A.

2. Contingency Table or cross tabulation of 2 variables:

Sex       Diseased    Not diseased    Total
Male      12          30              42
Female    8           10              18
Total     20          40              60

Frequency distribution of disease status according to out-patients' sex.
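A cross tabulation like the one above is built by counting how many records fall into each combination of the two variables. The sketch below shows one way to do this with `collections.Counter`; the individual records are hypothetical and were constructed only to reproduce the counts in the table.

```python
from collections import Counter

# Hypothetical patient records (sex, disease status) matching the table counts
records = (
    [("Male", "Diseased")] * 12
    + [("Male", "Not diseased")] * 30
    + [("Female", "Diseased")] * 8
    + [("Female", "Not diseased")] * 10
)

cells = Counter(records)                                 # count per (sex, status) cell
row_totals = Counter(sex for sex, _ in records)          # totals per sex
col_totals = Counter(status for _, status in records)    # totals per disease status

print(f"{'Sex':<10}{'Diseased':>10}{'Not diseased':>14}{'Total':>8}")
for sex in ("Male", "Female"):
    diseased = cells[(sex, "Diseased")]
    not_diseased = cells[(sex, "Not diseased")]
    print(f"{sex:<10}{diseased:>10}{not_diseased:>14}{row_totals[sex]:>8}")
print(f"{'Total':<10}{col_totals['Diseased']:>10}"
      f"{col_totals['Not diseased']:>14}{len(records):>8}")
```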

Graph:
Two types of graphs can be used:

a. Bar chart:

Bar Chart
b. Pie Chart:

Pie chart


Parameters:
- Proportion = (part / total) × 100
- Ratio = part / part
- Rate is a proportion that has a relation to time.

Exercise: There are 20 students in the first round, 8 of them are females.

What parameters can be calculated?
Proportion of females = (8 / 20) × 100 = 40 %
Male to female ratio = 12 / 8 = 1.5
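The exercise can be checked with a few lines of arithmetic. In this sketch the number of males is derived as the total minus the number of females, and the male to female ratio is taken as males divided by females; the variable names are illustrative.

```python
total_students = 20
females = 8
males = total_students - females   # 12 males

# Proportion = part / total * 100
proportion_females = females / total_students * 100

# Ratio = part / part (males relative to females)
male_to_female_ratio = males / females

print(f"Proportion of females = {proportion_females:.0f} %")
print(f"Male to female ratio = {male_to_female_ratio}")
```

Reversing the parts gives the female to male ratio, 8 / 12, which is about 0.67, so the order of the two parts must always be stated.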


Presentation of Data

Quantitative data:
  Tables:
    Frequency distribution
    Relative frequency table
  Graphs:
    Histogram
    Frequency polygon
    Frequency curve
  Parameters:
    Of central tendency: Midrange, Mean, Median, Mode
    Of dispersion: Range, SD, Percentile, deciles & interquartile range

Qualitative data:
  Tables:
    Simple frequency
    Relative frequency
    Cross tabulation
  Graphs:
    Bar chart
    Pie diagram
  Parameters:
    Ratio
    Proportion

Figure: Summary of the main methods for presentation of data


Chapter 3: Statistical analysis of data


(Inferential statistics)

Inferential statistics means the use of statistics to draw conclusions about a population
on the basis of the results obtained from a sample drawn from that population. The 2 common
methods used are estimation of parameters and hypothesis testing.

Hypothesis Testing:
To answer a statistical question, we should translate it into a hypothesis to be subjected to
a test. Depending on the result of the test we accept or reject the hypothesis. The hypothesis
being tested is known as the null hypothesis (H0); for every null hypothesis there is an
alternative hypothesis (HA).
Statistical tests used in testing a hypothesis:
The type of statistical test to be used depends on the type of data, how the data are
distributed, and the objectives of the study.
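As an example of a test for qualitative data, the sketch below computes a chi-square statistic by hand for the sex by disease-status cross tabulation given earlier in these notes. It assumes the plain Pearson chi-square without continuity correction and compares the result with 3.841, the standard critical value for 1 degree of freedom at the 0.05 significance level. The null hypothesis here is that disease status is independent of sex.

```python
# Observed counts: rows = Male, Female; columns = Diseased, Not diseased
observed = [[12, 30],
            [8, 10]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

chi_square = 0.0
for i, row in enumerate(observed):
    for j, obs in enumerate(row):
        # Expected count if sex and disease status were independent
        expected = row_totals[i] * col_totals[j] / grand_total
        chi_square += (obs - expected) ** 2 / expected

# Critical value of chi-square for 1 degree of freedom at alpha = 0.05
CRITICAL_VALUE_DF1_ALPHA_05 = 3.841
reject_null = chi_square > CRITICAL_VALUE_DF1_ALPHA_05

print(f"Chi-square = {chi_square:.3f}")
print("Reject H0" if reject_null else "Do not reject H0")
```

With these counts the statistic is about 1.43, below the critical value, so the null hypothesis of no association between sex and disease status is not rejected.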


Analytic tests of significance

Quantitative data:
  Parametric (data obtained from normal distribution):
    2 groups: Student t-test
    > 2 groups: ANOVA test
  Non-parametric (data not obtained from normal distribution):
    2 groups: Mann-Whitney test
    > 2 groups: Kruskal-Wallis test

Qualitative data:
  Parametric: Z-test
  Non-parametric: Chi-square

Figure: Summary of some analytic tests of significance for different types of data
