
[Downloaded free from http://www.ijph.in on Monday, August 11, 2014, IP: 36.76.177.94]

Development and Validation of Risk Scoring System for Prediction of Cancer Cervix
*V. Patil (Gawande)1, S. N. Wahab2, S. Zodpey3, N. D. Vasudeo4
Summary
A hospital-based, group-matched case-control study was conducted to devise a risk scoring
system for the prediction of cancer cervix at the Gynecology Clinic, Government Medical
College Hospital, Nagpur, India. The study consisted of 230 cases of cancer cervix
(histopathologically confirmed) and an equal number of controls, group matched for age. The
risk factors considered were illiteracy, long duration of married life (>25 years), early menarche
(<13 years), marital status (widow, separated, divorcee), multiparity (>3), h/o abortion, h/o
tobacco use, h/o passive smoking, poor genital hygiene, (grade III & IV) and low socioeconomic
status. Statistical analysis included unconditional multiple logistic regression analysis and
Receiver Operating Characteristic (ROC) curve analysis. The overall predictive accuracy was
calculated by the Wilcoxon statistic as an equivalent of the area under the ROC curve. Five risk
factors (illiteracy, poor genital hygiene, long duration of married life, multiparity and early
menarche) were identified to be significantly associated with cancer cervix. These factors were
given statistical weights of 13, 10, 7, 7 and 5 respectively. A total score of 21 was found to be the
best cut off for prediction, and the overall predictive accuracy of the risk scoring system was
calculated to be 0.74 (0.67 - 0.81). If further validation using other data sets gives consistent
results, this additive risk scoring system can be used for reducing the cost of universal screening
by subjecting only high-risk subjects to the laboratory screening procedure (Pap smear) in a
population setting.
Key Words: Cancer, cervix, case-control study, risk scoring system.

Introduction

It is generally acknowledged that cancer cervix is a multifactorial disease. A number of risk
factors have been associated with cervical cancer, namely: illiteracy, low socioeconomic
status (SES), long duration of married life, early menarche, marital status, early marriage,
early first childbirth, age at last childbirth (LCB), multiparity, abortion, multiple sexual
partners, late menopause, genital infection, poor genital hygiene, tobacco use, passive
smoking and contraceptive use 2,3,4,5,6,7. However, their relative contribution to the outcome
of cancer cervix reportedly varies.1 The combined effect of these risk factors can help in
better prediction of cancer cervix as compared to the individual effects.
Currently, Pap smear is routinely practiced to detect cervical cancer in hospital settings
and in a few community settings. Considering the population size and illiteracy status of this
country, this technique has certain limitations, particularly in community settings, i.e.
availability of skilled manpower, equipment and acceptability.
Since known risk factors are recognized to be associated with cancer cervix, an alternative
to universal screening by Pap smear can be the development of a risk scoring system for
identifying the population at risk and directing early detection efforts to this group3. This
study was therefore undertaken to develop and validate a risk scoring system for prediction
of cancer cervix, based on the presence or absence of some risk factors.

Materials and methods


The present hospital-based, group-matched, case-control study was carried out at the
Gynecology Clinic, Govt. Medical College Hospital, Nagpur during 1995-1996.

1 Clinical Epidemiology Research and Training Unit, Boston University School of Medicine
2,3,4 Department of Preventive & Social Medicine, Govt. Medical College & Hospital, Nagpur, India
*Corresponding author: E-mail: vaishali@bu.edu


Sample size8: The sample size was calculated based on the estimates of relative risk for poor
genital hygiene as an important risk factor 2 for cancer cervix as 2.5, prevalence of this factor
in the control population to be 0.1304, alpha error of 0.05 and power of 90 percent. The sample
size was estimated to be 230 cases and an equal number of controls.

Cases: A total of 230 incident cases confirmed by histopathology (stage 1 onwards)9 and
admitted to the study hospital were included in the present study.

Controls: An equal number of controls were selected from female patients admitted to the
study hospital for conditions other than gynecological cancers and showing Pap smears within
normal limits (Bethesda System) 10. The controls were group matched (frequency matching)
for five-year class intervals.

Risk factors: The present study included illiteracy, long duration of married life (>25 years),
early menarche (<13 years), marital status (widow, separated, divorcee), multiparity (>3),
H/o abortion, H/o tobacco use, H/o passive smoking, poor genital hygiene, (grade III & IV) 2-7
and low socioeconomic status as risk factors for cancer cervix. The interview technique was
used as a tool for data collection.

Statistical analysis 8: To investigate the significance of the included risk factors,
unconditional multiple logistic regression (MLR) analysis was carried out by using the MULTI-R
statistical software package. The risk factors identified to be significant at an alpha of 0.1
were included in the final model of logistic regression analysis. However, the level of
significance was fixed at an alpha of 0.05 for judging the significance of risk factors in the
final model. The significant risk factors in the final model were then given a statistical weight.
The weight for each factor was calculated by using the following linear transform on the
regression coefficient of the variable in the final unconditional logistic regression model.

Statistical weight = Round [B x 10]

This transform was necessary to make this additive risk scoring system easy to use. All the 460
study subjects were then scored individually using the developed risk scoring system. The
prediction accuracies, sensitivities, specificities and Cohen's kappa values of the risk scoring
system at various cut offs of the total score were calculated11. The overall predictive accuracy
was calculated as an equivalent of the Wilcoxon statistic as described by Hanley and McNeil12.
The best cut off for the total score was obtained graphically by plotting the Receiver Operating
Characteristic (ROC) curve.

Table-1: Results of unconditional multiple logistic regression analysis and statistical weights for the significant risk factors

| Risk factors | Regression coefficient | Odds ratio | 95% CI for OR | Statistical weights |
|---|---|---|---|---|
| Full Model | | | | |
| Illiteracy* | 1.2534 | 3.502 | 2.207-5.556 | |
| Long duration of married life | 0.8075 | 2.242 | 1.439-3.494 | |
| Early menarche | 0.5299 | 1.699 | 1.063-2.715 | |
| Marital status | 0.0447 | 1.499 | 1.848-2.649 | |
| H/o genital infection | 0.3093 | 0.734 | 0.453-1.190 | |
| Multiparity* | 0.7630 | 2.145 | 1.354-3.398 | |
| H/o abortion | 0.0673 | 1.070 | 0.660-1.732 | |
| H/o tobacco use | 0.1410 | 1.151 | 0.708-1.873 | |
| H/o passive smoking | 0.2291 | 1.258 | 0.811-1.954 | |
| Poor genital hygiene* | 0.9795 | 2.663 | 1.693-4.190 | |
| Low SES | 0.3382 | 1.402 | 0.889-2.212 | |
| Final Model | | | | |
| Illiteracy* | 1.3834 | 3.989 | 2.569-6.193 | 13 |
| Poor genital hygiene* | 1.0976 | 2.997 | 1.940-4.630 | 10 |
| Long duration of married life* | 0.7596 | 2.137 | 1.388-3.392 | 7 |
| Multiparity* | 0.7546 | 2.127 | 1.357-3.332 | 7 |
| Early menarche | 0.5552 | 1.742 | 1.103-2.752 | 5 |

Table-2: Classification of study subjects by total risk score

| Score | Cases (n=230) | Controls (n=230) |
|---|---|---|
| 0-7 | 18 | 72 |
| 8-14 | 20 | 46 |
| 15-21 | 25 | 34 |
| 22-28 | 50 | 40 |
| 29-35 | 56 | 20 |
| 36-42 | 61 | 18 |

Indian Journal of Public Health, Vol.XXXXX No.1
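The weighting and additive scoring steps described in the statistical analysis can be sketched in a few lines of Python. This is a minimal illustration and not the authors' MULTI-R workflow: the function and variable names are hypothetical, and the coefficients are the final-model values printed in Table-1. Truncating B x 10 to a whole number is an assumption made here because it reproduces the printed weights (13, 10, 7, 7, 5); rounding to the nearest integer would instead give 14, 11, 8, 8 and 6.

```python
import math

# Final-model regression coefficients (B) as printed in Table-1.
FINAL_MODEL_COEFFICIENTS = {
    "illiteracy": 1.3834,
    "poor_genital_hygiene": 1.0976,
    "long_duration_of_married_life": 0.7596,
    "multiparity": 0.7546,
    "early_menarche": 0.5552,
}


def statistical_weight(coefficient: float) -> int:
    """Scale a coefficient by 10 and truncate to a whole number.

    Assumption: truncation (floor) is used because it matches the
    weights printed in Table-1; the paper writes the step as Round[B x 10].
    """
    return math.floor(coefficient * 10)


# Hypothetical lookup table of weights derived from the coefficients.
WEIGHTS = {
    factor: statistical_weight(b)
    for factor, b in FINAL_MODEL_COEFFICIENTS.items()
}


def total_risk_score(factors_present) -> int:
    """Additive score: sum the weights of the risk factors that are present."""
    return sum(WEIGHTS[factor] for factor in factors_present)


if __name__ == "__main__":
    print(WEIGHTS)
    # Example subject: illiterate, multiparous, early menarche.
    print(total_risk_score(["illiteracy", "multiparity", "early_menarche"]))
```

A subject with none of the five factors scores 0 and a subject with all five scores 42, which matches the range of total scores used in Table-2.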

Results
Table-1 describes the results of unconditional logistic regression analysis in the full and
final models. Of the total 11 risk factors included, five risk factors (illiteracy, long
duration of married life, early menarche, multiparity and poor genital hygiene) were
identified to be significant in the full model. The results of the final model confirmed the
significance of these five risk factors in the outcome of cancer cervix. Using the
aforementioned linear transform, statistical weights of 13, 7, 5, 7, and 10 respectively were
attributed to these risk factors (Table-1). Table-2 shows the classification of cases and
controls by the total risk score categories, using a total risk score of 7 as the class interval.
It is apparent that most of the controls were clustered in the categories with a total score of
21 or less, while the opposite was true of cases. This finding was also confirmed graphically by
the ROC curve (Fig 1). Table-3 depicts the performance characteristics of the risk scoring
system at various cut offs. It can be seen that Cohen's kappa was maximum at the total risk
score of 21. The overall predictive accuracy of the risk scoring system, as calculated by the
Wilcoxon statistic, was 0.7432 (95% CI 0.6670 - 0.8193), with a standard error of 0.03809.

January-March, 2006
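The performance characteristics and the overall predictive accuracy can be recomputed from the grouped counts in Table-2. The sketch below is illustrative only, and its function names are hypothetical. It assumes score bands of 0-7, 8-14, 15-21, 22-28, 29-35 and 36-42, and treats a score above the cut off as test-positive. The area under the ROC curve is computed as the Wilcoxon (Mann-Whitney) statistic on the grouped scores, with case-control pairs falling in the same band counted as one half. Whether the authors worked from grouped or individual scores is not stated, so small differences from the published figures are possible.

```python
# Counts per total-score band as printed in Table-2, lowest band first.
BAND_UPPER = [7, 14, 21, 28, 35, 42]
CASES = [18, 20, 25, 50, 56, 61]
CONTROLS = [72, 46, 34, 40, 20, 18]


def performance_at_cutoff(band_index):
    """Sensitivity, specificity, positive predictivity and Cohen's kappa
    when scores above BAND_UPPER[band_index] are called positive."""
    tp = sum(CASES[band_index + 1:])      # cases above the cut off
    fn = sum(CASES[:band_index + 1])      # cases at or below the cut off
    fp = sum(CONTROLS[band_index + 1:])   # controls above the cut off
    tn = sum(CONTROLS[:band_index + 1])   # controls at or below the cut off
    n = tp + fn + fp + tn

    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    positive_predictivity = tp / (tp + fp)

    # Cohen's kappa: observed agreement corrected for chance agreement.
    observed = (tp + tn) / n
    expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / n ** 2
    kappa = (observed - expected) / (1 - expected)
    return sensitivity, specificity, positive_predictivity, kappa


def wilcoxon_auc(cases, controls):
    """Probability that a random case falls in a higher band than a random
    control, with ties (same band) counted as one half."""
    controls_below = 0
    concordant = 0.0
    for n_cases, n_controls in zip(cases, controls):
        concordant += n_cases * (controls_below + 0.5 * n_controls)
        controls_below += n_controls
    return concordant / (sum(cases) * sum(controls))


if __name__ == "__main__":
    for i in range(len(BAND_UPPER) - 1):
        sens, spec, ppv, kappa = performance_at_cutoff(i)
        print(f">{BAND_UPPER[i]}: sens={sens:.2f} spec={spec:.2f} "
              f"ppv={ppv:.2f} kappa={kappa:.2f}")
    print(f"AUC (Wilcoxon) = {wilcoxon_auc(CASES, CONTROLS):.4f}")
```

With the Table-2 counts, a cut off of >21 gives a sensitivity of about 0.73, a specificity of about 0.66, a positive predictivity of about 0.68 and a kappa of about 0.39, and the grouped Wilcoxon statistic comes to about 0.743. The kappa values computed this way agree with those printed in Table-3 to within 0.02.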

Discussion
Looking at the demographic characteristics of the study subjects, it is found that illiteracy,
long duration of married life, early menarche, multiparity and poor genital hygiene were
significantly associated with cancer cervix. The contribution of these risk factors to the
outcome of cervical cancer is also recognized and endorsed by other investigators 2-7.
These five factors represent the main contributory factors leading to or associated with the
cancer cervix outcome. This was also confirmed by the overall predictive accuracy of the risk
scoring system, which can predict an individual to be at higher risk of developing cancer
cervix by taking into account only these five risk factors with a probability of 0.7432. It
appears that illiteracy, poor genital hygiene, long duration of married life, multiparity and
early menarche exerted their respective effects in that order. The fact that a total score of 21
or more was the point from which there was higher risk of cancer cervix implies that either
illiteracy or poor genital hygiene must be present to render an individual at higher risk of
developing cancer cervix: these two factors together already score 23 (13 + 10), while any
other combination needs a minimum of 3 factors, including one of these two, to reach the
cut off. As the number of risk factors present goes on increasing, so does the risk of acquiring
cervical cancer.

Table-3: Performance characteristics of the risk scoring system

| Cut-off for total score | Sensitivity | Specificity | Positive predictivity | Cohen's kappa |
|---|---|---|---|---|
| >7 | 0.92 | 0.31 | 0.57 | 0.22 |
| >14 | 0.83 | 0.51 | 0.63 | 0.34 |
| >21 | 0.73 | 0.66 | 0.68 | 0.38 |
| >28 | 0.51 | 0.83 | 0.75 | 0.34 |
| >35 | 0.27 | 0.92 | 0.77 | 0.18 |
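Which combinations of factors reach the cut off can be checked by enumerating them directly from the published weights. The sketch below is a hypothetical illustration: the names are not from the paper, and it assumes that a score above 21 counts as high risk. No combination of these weights sums to exactly 21, so reading the cut off as "21 or more" gives the same result.

```python
from itertools import combinations

# Statistical weights from the final model in Table-1.
WEIGHTS = {
    "illiteracy": 13,
    "poor genital hygiene": 10,
    "long duration of married life": 7,
    "multiparity": 7,
    "early menarche": 5,
}
CUT_OFF = 21  # assumption: total scores above this value are high risk


def high_risk_combinations(weights, cut_off):
    """List every combination of factors whose total score exceeds cut_off."""
    hits = []
    for size in range(1, len(weights) + 1):
        for combo in combinations(weights, size):
            score = sum(weights[factor] for factor in combo)
            if score > cut_off:
                hits.append((combo, score))
    return hits


if __name__ == "__main__":
    for combo, score in high_risk_combinations(WEIGHTS, CUT_OFF):
        print(score, ", ".join(combo))
```

Under these weights, every combination that exceeds the cut off contains illiteracy or poor genital hygiene. The only two-factor combination that qualifies is illiteracy together with poor genital hygiene (score 23), and the three remaining factors on their own total 19.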
Though the present study developed an additive risk scoring system, the predictive accuracy
of which was 0.7432, this method of validation, as discussed by Herman et al13, is back
validation, where the same data from which the risk scoring system was developed have been
used for validation. Obviously, it may lead to an over-optimistic estimate of the predictive
accuracy. The moderate predictive accuracy found in this study may also be attributable to
the statistical weighting system, which was based on a robust unconditional multiple logistic
regression model.
The obvious advantages of this risk scoring system are its simplicity, noninvasiveness,
economy, ease of administration in field studies and moderate predictive accuracy. Moreover,
it is based on a few hypothesized risk factors. Whether these results can be readily
generalized depends mainly on the demographic characteristics of the population under
consideration, and future longitudinal, population-based, heterogeneous epidemiological
studies are needed to validate this risk scoring system. If the results of future studies confirm
the findings of this study, then this risk scoring system can be used to screen and identify
high-risk subjects who can further be screened by a suitable laboratory screening procedure,
i.e. Pap smear, thereby reducing the cost of screening for cancer cervix, particularly when
dealing with a large population size.
The inherent quality of such a risk scoring system is that it may help in the primary
prevention of cancer cervix. The four risk factors included in the risk scoring system other
than early menarche are modifiable risk factors, and therefore the implication of the risk
scoring system is underscored.
