Data Analysis & Decision Model
Cancer Mortality Analysis Report

Prepared by:
Aakash Parwani
Sumit Sameriya
Kshitij Tiwari

Contents
Abstract
Introduction
Literature Review
Factors
  Sex
  State
  Age Group
  Cancer Sites
  Ethnicity
Results & analysis
  Correlation Analysis
    Deaths & Age Group
    Deaths & Cancer Sites
    Deaths & State
    Deaths & Ethnicity
  Multivariate Analysis
    Multivariate analysis techniques normally used for
    With Multivariate Analysis you can
    Analysis
Conclusion
Future Enhancement
Acknowledgement
References

Abstract
Cancer is a major public health problem in the United States and many other parts of the
world. It is currently the second leading cause of death in the United States, and is
expected to surpass heart disease as the leading cause of death in the next few years.
In this report, we concentrate on the main factors related to cancer deaths, that is, the
factors for which analysis gives some evidence of an association with cancer death. We use
regression analysis to find the best-fitting relationship between the independent variables and
the dependent variable, which gives enough evidence to judge whether the relationship is of
practical use. For this analysis, we consider data from 1999 to 2012. We support the analysis
with charts that illustrate the conclusions. Along the way, we show cancer deaths by age
group, sex, state and cancer site, which are our independent variables.

Introduction
Cancer is a disease in which body tissue is destroyed by abnormal cells present in the body.
Symptoms of cancer include unexplained weight loss, abnormal bleeding, a new lump, a
prolonged cough and a change in bowel movements. While all of these symptoms are
indicators of cancer, they can also have other causes. The number of people living beyond a
cancer diagnosis reached about 14.5 million in 2014 and is expected to rise to almost
19 million by 2024. National expenditures for cancer care in the United States totaled almost
$125 billion in 2010 and could reach $156 billion in 2020 (Murphy, 2013).
Deaths due to cancer are increasing in the USA. To analyze the cancer situation, the
American Cancer Society was established in May 1913. It documents cancer deaths over the
years through facts and figures. Its main focus is on the new cases and the deaths expected
in each year, which together give the total contemporary cancer burden. It examines deaths
in past years and provides the trend for the coming years, along with newly recognized
symptoms. Over the years the society has helped people survive and recover from cancer
(Howlader, 2014).
Organizations in the USA have compiled facts and figures on cancer deaths, which have
helped in forecasting the total number of future deaths. There are many types of cancer,
classified by the body part affected, such as the brain, chest, respiratory system, eye and
anus. Some of the cancers that have caused the most deaths over the years are male and
female breast cancer, digestive system cancer, respiratory system cancer and male genital
system cancer. From 1999 to 2012, respiratory system cancer and male and female breast
cancer caused more than 50 thousand deaths and were the leading causes of death among
cancer types. We discuss a few of these cancers below (Copeland, 2014).
Breast cancer is a group of cancer cells that starts growing in the breast. It can also spread
to the rest of the body. Breast cancer is the second most common cancer in women. The
American Cancer Society estimated that 231,840 new cases would occur in women in 2015,
of which 40,290 were expected to result in death. It also estimated around 2,350 new cases
of breast cancer in men in 2015, with about 440 deaths. Male breast cancer can sometimes be
caused by inherited gene changes. Radiation exposure and a family history of breast cancer
also increase the chances of breast cancer in men (Institute, 2005).
Respiratory cancer is a malignancy of any tissue making up the respiratory system. The
respiratory system includes all the organs involved in breathing, such as the lungs, bronchi
and throat. It supplies oxygen to the blood, which carries the oxygen to the rest of the body.
The system includes the mouth, nose, trachea, lungs and diaphragm. Cancer can affect any of
the respiratory organs. Symptoms of respiratory cancer include syncope and shortness of
breath (dyspnoea).
In the United States, colon cancer is the most common type of digestive cancer.
Pancreatic cancer is becoming more common; warning signs are not specific but include a
strong family history, smoking, abdominal pain, weight loss, and, occasionally, yellowing of
the skin (jaundice). Cancer of the esophagus is also increasing in the United States;
symptoms include difficulty swallowing and weight loss. Risk factors include smoking,
alcohol use, and, in a small percentage of patients, chronic heartburn. Gastric (stomach)
cancer is much less common in the United States than in some other parts of the world, but
risk factors include infection with Helicobacter pylori and eating a lot of foods with nitrites
or nitrates (Hamilton, 2000).
Cancer can develop anywhere in the GI tract, but the fastest-growing GI cancer is
esophageal cancer. There are two types of esophageal cancer: squamous cell carcinoma and
adenocarcinoma. Squamous cell carcinoma is associated with smoking and alcohol intake,
while adenocarcinoma is associated with long-standing reflux. Esophageal cancer can develop
in people who have had reflux for a long time, which can lead to Barrett's esophagus.
Barrett's esophagus is a precancerous condition in which the lining of the esophagus changes
to accommodate the increased acid. Because esophageal cancer is the fastest-growing cancer,
patients with long-standing heartburn are frequently screened for Barrett's esophagus.
Although there are usually no symptoms of early cancer, the most common complaint in
esophageal cancer is trouble swallowing. Anyone who has difficulty swallowing should see a
gastroenterologist (Odze, 2004).

Literature Review
The primary PubMed search produced 500 citations, of which 263 met the inclusion criteria. The
first published study that met our inclusion criteria was published in 1975 by SEER, which is an
authoritative source of information on cancer incidence, mortality and survival in the United
States.
All the reports available from SEER are in the public domain and can be used for analysis.
SEER provided information on the mortality rate for 19 age groups; regression was then applied
to that data to find the relationship between the age group of patients and the mortality
rate. This regression yielded an R-Square of 72%.
The second published study that met our criteria was published by the CDC (Centers for Disease
Control and Prevention) and covered cancer rates by U.S. state and sex, using data from 1999
to 2012. To understand the relationship, multiple linear regression was applied with mortality
rate as the dependent variable and sex, age group and state as the independent variables.
This regression analysis yielded an R-Square of 80%.
To understand which type of cancer is most frequent among U.S. citizens, we drew on the CDC.
According to recent studies, leukemia and lymphoma are the most frequent cancer sites among
U.S. citizens. A dedicated organization, the Leukemia & Lymphoma Society, was established to
fight these cancers.

Factors

Sex
Sex is one of the independent variables in our analysis. Cancer deaths differ by sex: men
have higher cancer death rates than women. Research shows that cancer deaths are more
frequent in men and that survival is worse once the disease occurs in them. Cook published
a study in Cancer Epidemiology, Biomarkers & Prevention which argues that if the causes of
the sex difference in cancer incidence can be identified, preventive measures can be taken to
reduce the burden of cancer on both men and women.
The number of new cases of cancer (cancer incidence) is 454.8 per 100,000 men and women
per year (based on 2008-2012 cases). The number of cancer deaths (cancer mortality) is 171.2
per 100,000 men and women per year (based on 2008-2012 deaths).
Cook and his research team analyzed in depth U.S. data from a large database that held
statistics on approximately 35 cancers by sex and age from 1977 to 2006. They found that
cancer deaths are higher in men than in women (Hankey, 2002).
The researchers also examined the 5-year survival rate of people with different types of
cancer and found that men had markedly worse survival than women. They also suggested that
future research should focus on the factors behind the higher rate of cancer diagnosis among men.
In childhood cancer, males are again at higher risk than females. The sex differential in the
incidence of childhood cancer is well established and consistent worldwide (Ashley, 1969;
Greenberg and Shuster, 1985; Linet and Devesa, 1991; Pearce and Parker, 2001; Cartwright et
al., 2002; Desandes et al., 2004). The M:F ratio for all childhood cancers is around 1.2.
Exceptions to the male excess in childhood cancer include infant leukemia, thyroid carcinoma,
malignant melanoma, and alveolar soft part sarcoma. As in adults, NHL (non-Hodgkin
lymphoma) shows a consistent male excess in all age groups during childhood and adolescence
(range = 1.7 to 3.2), while Hodgkin lymphoma (HL) shows an interesting age-dependent
variation in its M:F ratio (Ries et al., 1999). The overall incidence of HL in children is
greater in females than in males (M:F ratio = 0.8), yet the sex distribution is age-dependent,
with the striking male excess at younger ages, when the disease is rare, reversing in
teenagers, when it becomes more common (Spitz et al., 1986). The SEER data for the 1990 to
1995 period show M:F ratios of 5.3 (<5 years), 1.4 (5 to 9 years), 1.1 (10 to 14 years), and
0.8 (15 to 19 years; Ries et al., 1999) (Ebru, 2012).
Cancer               Male to Female Death Ratio
Lip Cancer           5.51 to 1
Larynx               5.37 to 1
Hypopharynx          4.71 to 1
Esophagus            4.08 to 1
Bladder Cancer       3.36 to 1
Lung Cancer          2.31 to 1
Colorectal Cancer    1.42 to 1
Pancreatic Cancer    1.37 to 1
Leukemia             1.75 to 1
Bile Duct Cancer     2.23 to 1

State
California, and perhaps soon New York, may be facing a cancer epidemic of greater magnitude
than currently exists. This is of major concern to parents and children across the US because
"as California goes, so goes the nation." Bills are being introduced and signed throughout the
United States that remove medical choice and informed consent. Beginning in 2016, California
will require all public school children to be vaccinated in order to attend school, and the
concern raised is that vaccines contain ingredients linked to cancer. In New York, a similar
bill has been introduced.
Perhaps the most obvious and heated point for families facing this mandate is that it is
unknown whether the vaccines required for children cause cancer. The vaccine package
inserts (paper or online PDF) for each vaccine state the following:
"This vaccine has not been evaluated for its carcinogenic or mutagenic potential or
impairment of fertility."
Vaccines are said to contain a number of hazardous ingredients, or adjuvants, each regarded
as carcinogenic in its own right. The ingredients named are aluminum, formaldehyde and
mercury.
A recent study published in Molecular Carcinogenesis showed consistently elevated risks for
pancreatic cancer in people working in the aluminum production and metalworking
industries. A recent study published in the Journal of Applied Toxicology found that the
effect of aluminum on cell proliferation and cell senescence is strikingly similar to that of
activated oncogenes in human mammary epithelial tissue.
Formaldehyde exposure is a special concern for children and the elderly. Children may
become sensitive to formaldehyde more easily, which may make it more likely that they will
become sick (Pickle, 2007).

A study in the Scandinavian Journal of Work, Environment and Health concluded that methyl
mercury chloride causes kidney tumors in male mice and that mercury chloride has shown
some carcinogenic activity in male rats. The study also stated that epidemiological data
indicate the possibility of a risk of lung, kidney, and central nervous system tumors.

Age Group
Cancer can occur at any point in life, and age has a very important effect on cancer deaths.
The Age Group category in the following analysis is the age at which the cancer patient
died. The chart in Fig 2 gives a good view of how cancer deaths vary with age. The number
of deaths observed in cancer patients less than 1 year old is very low, and the 65-69 and
70-74 age groups show the highest numbers of deaths. The point to notice here is that cancer
death rates are higher among middle-aged and elderly populations (Ghosh, 2012).
As the population ages, many diseases that predominantly affect older people will become
more common. In addition, many conditions that affect the elderly will occur in combination,
thereby complicating care for a specific condition (5,6). Advancing age is a high risk factor
for cancer, with persons over 65 accounting for 60% of newly diagnosed malignancies and 70%
of all cancer deaths (7,8). The age-adjusted cancer incidence rate is 2151/100,000 population
for those over 65 compared with 208/100,000 for those under 65 (7,8). Similarly, the
age-adjusted cancer death rate for those over 65 is 1068/100,000 compared with 67/100,000 for
those under 65 (7,8). Thus, the incidence of cancer in those over 65 is 10 times greater than
in those younger than 65, and the cancer death rate is 16 times greater in patients over 65
than in younger patients. More than 70% of the mortality associated with many cancers,
including prostate, bladder, colon, uterus, pancreas, stomach, rectum and lung, occurs in
patients 65 and older (7,8). Even with a progressive decrease in the cancer incidence and
death rate, aging of the population will be accompanied by a marked increase in the total
number of patients with cancer and in the need for physicians and caregivers to have special
expertise in both oncology and geriatrics.
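The two ratios quoted above can be checked directly from the age-adjusted rates. The short sketch below uses only the four figures given in the text; the variable names are ours and purely illustrative.

```python
# Age-adjusted rates per 100,000 population, as quoted in the text.
incidence_65_plus = 2151
incidence_under_65 = 208
mortality_65_plus = 1068
mortality_under_65 = 67

# Ratio of the rate in the older group to the rate in the younger group.
incidence_ratio = incidence_65_plus / incidence_under_65
mortality_ratio = mortality_65_plus / mortality_under_65

print(f"Incidence ratio (65+ vs under 65): {incidence_ratio:.1f}")
print(f"Mortality ratio (65+ vs under 65): {mortality_ratio:.1f}")
```

The computed ratios are about 10.3 and 15.9, consistent with the rounded values of 10 and 16 given in the text.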
The graying demographics of the United States, and the fact that cancer incidence in humans
rises exponentially in the last decades of life, suggest that cancer may soon replace heart
disease as the leading cause of death in this country. These demographics raise fundamental
challenges to be met by American medicine. Besides demonstrating the urgency of planning to
manage the expanded burden of cancer, these data give rise to various questions regarding
the relationship of aging to cancer (Robert, 2006).

Chart: Death With Age Groups (vertical axis: 0 to 600,000)

Fig 2. Illustrates how the cancer death rate varies with age.


Cancer Sites
Another important factor that influences cancer deaths is the cancer site. Certain parts of
the body are more prone to cancer than others. The cancer site in the following study is the
area of the body that was affected by the cancer that caused the death of the patient.
The table below shows the estimated new cases and deaths for the year 2015. From the table,
we can conclude that lung and bronchus cancer, which is a respiratory cancer, and breast
cancer have the most new cases and deaths in 2015.
In the Correlation Analysis section we try to find any relationship between deaths and cancer
site using linear regression (Walter, 2013).

Between 2010 and 2020, we expect the number of new cancer cases in the United States to go
up about 24% in men, to more than 1 million cases per year, and about 21% in women, to more
than 900,000 cases per year.
The kinds of cancer we expect to increase the most are:
Melanoma (the deadliest kind of skin cancer) in white men and women.
Prostate, kidney, liver, and bladder cancers in men.
Lung, breast, uterine, and thyroid cancers in women.
Over the next decade, we expect cancer incidence rates to stay about the same but the number
of new cancer cases to go up, mostly because of an aging white population and a growing
black population. Because cancer patients overall are living longer, the number of cancer
survivors is expected to go up from about 11.7 million in 2007 to 18 million by 2020.
Cigarette smoking is linked to many kinds of cancer, especially lung cancer. In the United
States, smoking has declined since the first Surgeon General's Report on Smoking and Health
was published in 1964. Accordingly, new cases of lung cancer have gone down since the
mid-1980s in men and the late 1990s in women, faster in men than in women. The number of new
lung cancer cases in men is expected to stay about the same between 2010 and 2020, but more
than 10,000 additional new lung cancer cases are expected to be found in women each year by
2020.
Overweight and obesity raise the risk for female breast, colorectal, esophageal, uterine,
pancreatic, and kidney cancers. After increases over the past few decades, around 66% of
adults and 33% of children are now overweight or obese. Except for breast and colorectal
cancers, the number of weight-related cancers is expected to go up 30% to 40% by 2020.

Cancers caused by infections are also expected to increase. New cases of liver cancer are
expected to go up more than 50%, likely the result of the increase in hepatitis infections,
especially in people born between 1945 and 1965. Oral cancers in white men are expected to
increase by about 30%, likely the result of more human papillomavirus (HPV) infection
(Thompson, 2015).

Ethnicity
Another useful explanatory variable is the ethnicity of the population. From the data
collected over the years, we describe the relation between ethnicity and the cancer death
rate in the USA. Ethnicity in the study is categorized mainly into non-Hispanic white,
American Indian/Alaska Native, Hispanic, and Asian/Pacific Islander.
The table below shows incidence and death rates by site, race, and ethnicity, United States,
2007 to 2011 (Anderson, 2013).

From 1999 to 2012, the rate of people dying from cancer varied depending on their race and
ethnicity. The graph below shows that in 2012, among men, black men were more likely to die
of cancer than any other group, followed by white, Hispanic, American Indian/Alaska Native,
and Asian/Pacific Islander men. Among women, black women were more likely to die of cancer
than any other group, followed by white, American Indian/Alaska Native, Hispanic, and
Asian/Pacific Islander women.

Findings

Below, we summarize a few findings from the literature, organized as background, methods,
results, conclusions and impact:
Background: An association between Jewish ethnicity and pancreatic cancer risk was
suggested by analyses comparing pancreatic cancer death rates between Jews and non-Jews in
New York in the 1950s. These analyses lacked information on potential confounding variables,
and the association between Jewish ethnicity and pancreatic cancer has not been examined in
any contemporary U.S. population or in any cohort study (Jacobs, 2009).
Methods: The study examined the association between Jewish ethnicity and pancreatic cancer
mortality among approximately 1 million participants in the Cancer Prevention Study II
cohort. Participants completed a questionnaire at enrollment in 1982 that included
information on religion, smoking, obesity, and diabetes. During follow-up through 2006, there
were 6,727 pancreatic cancer deaths, including 480 among Jewish participants. Proportional
hazards modeling was used to calculate multivariable rate ratios (RR) (Jemal, 2009).
Results: After adjusting for age, sex, smoking, body mass index, and diabetes, pancreatic
cancer mortality was higher among Jewish participants than among non-Jewish whites
(RR = 1.43; 95% CI, 1.30 to 1.57). In analyses by birthplace, RRs were 1.59 (95% CI, 1.31 to
1.93) for North American-born Jews with North American-born parents, 1.43 (95% CI, 1.27 to
1.61) for North American-born Jews with 1 or more parents born outside North America, and
1.03 (95% CI, 0.73 to 1.44) for Jews born outside North America (P for heterogeneity = 0.07)
(Coughlin, 2000).
Conclusions: These results support a higher risk of developing pancreatic cancer among U.S.
Jews that is not explained by established risk factors.
Impact: Future studies may clarify the role of specific environmental or genetic factors
responsible for the higher risk among U.S. Jews. Cancer Epidemiol Biomarkers.

Results & analysis

Correlation Analysis

In correlation analysis, we estimate a sample correlation coefficient, more specifically the
Pearson product-moment correlation coefficient. The sample correlation coefficient, denoted
r, ranges between -1 and +1 and quantifies the direction and strength of the linear
association between the two variables. The correlation between two variables can be positive
(i.e., higher levels of one variable are associated with higher levels of the other) or
negative (i.e., higher levels of one variable are associated with lower levels of the other).
The sign of the correlation coefficient indicates the direction of the association. The
magnitude of the correlation coefficient indicates the strength of the association.
For example, a correlation of r = 0.9 suggests a strong, positive association between two
variables, whereas a correlation of r = -0.2 suggests a weak, negative association. A
correlation close to zero suggests no linear association between two continuous variables
(Mukaka, 2012).
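As a minimal sketch of how r is computed, the function below implements the standard sample Pearson formula in plain Python. The input values are hypothetical and chosen only for illustration; they are not taken from the report's data set, and the names pearson_r, age_midpoint and deaths are our own.

```python
import math


def pearson_r(x, y):
    """Sample Pearson product-moment correlation coefficient of x and y."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length, at least 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of cross-products and sums of squares of the deviations.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    # Undefined (division by zero) if either variable is constant.
    return sxy / math.sqrt(sxx * syy)


# Hypothetical illustrative values, not the report's data.
age_midpoint = [30, 40, 50, 60, 70]
deaths = [12, 25, 60, 110, 170]

r = pearson_r(age_midpoint, deaths)
print(f"r = {r:.3f}")
```

The sign of the returned value gives the direction of the association and its magnitude gives the strength, as described above.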
We present separate analyses of the relationship between the dependent variable, the number
of deaths due to cancer, and each independent variable: Age Group, State, Ethnicity and
Cancer Sites. These relationships are estimated from data for the years 1999 to 2012
(Midthune, 2000).

Deaths & Age Group

To understand the relationship between these two variables better, we use linear regression
with ANOVA; Fig 3 illustrates the results.

Fig 3. Linear regression results using PHSTAT add-in.

From the above statistics we can say that cancer is primarily a disease of older people, with
mortality rates increasing with age for most cancers.
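The regression in Fig 3 was produced with the PHSTAT add-in, and its output is not reproduced here. As a rough sketch of the same kind of calculation, assuming a single numeric predictor (for example an age-group index), the code below fits a least-squares line and splits the total variation into explained and residual parts, which is the decomposition an ANOVA table reports. The data values and all names are hypothetical.

```python
def simple_linear_regression(x, y):
    """Fit y = intercept + slope * x by ordinary least squares."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    fitted = [intercept + slope * a for a in x]
    ssr = sum((f - mean_y) ** 2 for f in fitted)        # explained by the line
    sse = sum((b - f) ** 2 for b, f in zip(y, fitted))  # left in the residuals
    sst = sum((b - mean_y) ** 2 for b in y)             # total variation
    return {
        "intercept": intercept,
        "slope": slope,
        "ssr": ssr,
        "sse": sse,
        "sst": sst,
        "r_squared": ssr / sst,
    }


# Hypothetical example: age-group index (1 = youngest) against death counts.
age_group_index = [1, 2, 3, 4, 5, 6]
deaths = [40, 55, 90, 150, 260, 410]

result = simple_linear_regression(age_group_index, deaths)
print(f"slope = {result['slope']:.1f}, R-Square = {result['r_squared']:.2f}")
```

For a single predictor, the R-Square returned here equals the square of the Pearson correlation coefficient between the two variables.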

Deaths & Cancer Sites

The chart in Fig 4 gives a good idea of how the death count varied across cancer sites
between 1999 and 2012. Cancer of the digestive system caused the highest number of deaths
(716,791), and cancer of the eye and orbit caused the fewest (4,516).

Chart: Deaths & Cancer Sites (vertical axis: 0 to 800,000)

Fig 4. Illustrates how the cancer death rate varies with different cancer sites.

From Fig 5 we can identify the most frequent cancer sites: the data suggest that lymphomas
and leukemia are the most frequent, while eye and orbit is the least frequent cancer site
(Ward, 2012).

Chart: Cancer sites with Frequency (vertical axis: frequency, 0 to 1,000)

Fig 5. Cancer sites with frequency.

Fig 6. Linear regression results using PHSTAT add-in.

Deaths & State


Rates of dying from cancer vary from state to state. In the following maps, the U.S. states
are divided into groups based on the rates at which people died from cancer through 2012,
which is the most recent year with numbers available.

For now we have analyzed deaths in only two states, New York and California. From Fig 7 we
can see that California recorded more cancer deaths (2,042,179) than New York (1,686,003)
(Kramer, 1989).

Chart: State & Deaths (categories: California, New York; vertical axis: 0 to 2,500,000)

Fig 7. Graph to illustrate state-wise rate of dying from cancer.

Deaths & Ethnicity

From 1999 to 2012, the rate of people dying from cancer has varied, depending on their race
and ethnicity. The graph in Fig 8 below shows that in 2012, among men, black men were more
likely to die of cancer than any other group, followed by white, Hispanic, American
Indian/Alaska Native, and Asian/Pacific Islander men.

Fig 8. Graph to illustrate the rate of dying from cancer by race and ethnicity among men.

From Fig 9 we can say that black women were more likely to die of cancer than any other
group, followed by white, American Indian/Alaska Native, Hispanic, and Asian/Pacific Islander
women (Izmirlian, 2013).

Fig 9. Graph to illustrate the rate of dying from cancer by race and ethnicity among women.

To better understand the relation between deaths and ethnicity, we performed linear
regression using PHSTAT, with ethnicity taken as a categorical variable; Fig 10 shows the
results. However, the value of R-Square from this analysis is only 11%.

Fig 10. Linear regression results using PHSTAT add-in.
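
When a single categorical factor such as ethnicity is the only predictor, an ordinary least-squares fit on dummy variables predicts each observation by the mean of its group, so R-Square is the share of the total variation that lies between groups. The sketch below illustrates this with hypothetical group labels and values; it is not the PHSTAT calculation behind Fig 10, and the function name is our own.

```python
from collections import defaultdict


def r_squared_categorical(groups, y):
    """R-Square of an OLS fit of y on dummy variables for one categorical factor.

    With an intercept and one dummy per non-reference category, the fitted
    value for every observation is the mean of its group, so no matrix
    solve is needed.
    """
    by_group = defaultdict(list)
    for g, value in zip(groups, y):
        by_group[g].append(value)
    group_mean = {g: sum(v) / len(v) for g, v in by_group.items()}
    grand_mean = sum(y) / len(y)
    # Residual variation: spread of observations around their own group mean.
    sse = sum((value - group_mean[g]) ** 2 for g, value in zip(groups, y))
    # Total variation: spread of observations around the overall mean.
    sst = sum((value - grand_mean) ** 2 for value in y)
    return 1 - sse / sst


# Hypothetical labels and death counts, for illustration only.
labels = ["A", "A", "B", "B", "C", "C"]
counts = [10, 30, 15, 35, 20, 40]

print(f"R-Square = {r_squared_categorical(labels, counts):.2f}")
```

A low value, like the 11% reported above, indicates that most of the variation in deaths lies within the ethnicity groups rather than between them.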

Multivariate Analysis
Multivariate data analysis refers to any statistical technique used to analyze data that
arises from more than one variable. This essentially models reality, where every situation,
product, or decision involves more than a single variable. The information age has resulted
in masses of data in every field. Despite the quantity of data available, obtaining a clear
picture of what is going on and making intelligent decisions is a challenge. When the
available data is stored in database tables containing rows and columns, multivariate
analysis can be used to process the data in a meaningful way.
Multivariate analysis techniques normally used for

Consumer and market research
Quality control and quality assurance across a range of industries, such as food and
beverage, paint, pharmaceuticals, chemicals, energy, telecommunications, and so forth
Process optimization and process control
Research and development

With Multivariate Analysis you can

Obtain a summary or an overview of a table. This analysis is often called Principal
Components Analysis or Factor Analysis. In the overview, it is possible to identify the
dominant patterns in the data, such as groups, outliers, trends, and so on.
Analyze groups in the table, how these groups differ, and to which group individual table
rows belong. This kind of analysis is called Classification and Discriminant Analysis.
Find relationships between columns in data tables, for instance relationships between
process operation conditions and product quality. The objective is to use one set of
variables (columns) to predict another, for the purpose of optimization, and to find out
which columns are important in the relationship. The corresponding analysis is called
Multiple Regression Analysis or Partial Least Squares (PLS), depending on the size of the
data table (Wakkee, 2014).

Analysis

After completing the correlation analysis of cancer mortality with individual variables, it
is time to perform multiple linear regression of cancer deaths on State, Sex Code and Age
Group. Below is the result of the multiple linear regression performed using PHSTAT
(Rencher, 2002).

Fig 11. Multiple Regression results using PHSTAT add-in.
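
Fig 11 was produced with PHSTAT. For readers who want to reproduce a comparable fit in code, the following is a minimal sketch of multiple linear regression with dummy-coded categorical predictors, written in plain Python. It assumes that State and Sex are dummy coded and that Age Group is entered as a numeric index, which may differ from the coding used in the report. The eight rows are synthetic and constructed to be exactly linear so that the recovered coefficients can be checked by eye; all helper names are hypothetical.

```python
def dummy_columns(values):
    """One 0/1 column per category, leaving out the first (reference) level."""
    levels = sorted(set(values))
    return [[1.0 if v == level else 0.0 for v in values] for level in levels[1:]]


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_ols(columns, y):
    """Least-squares coefficients (intercept first) and R-Square."""
    cols = [[1.0] * len(y)] + columns  # prepend the intercept column
    # Normal equations: (X'X) beta = X'y.
    xtx = [[sum(p * q for p, q in zip(ci, cj)) for cj in cols] for ci in cols]
    xty = [sum(p * q for p, q in zip(ci, y)) for ci in cols]
    beta = solve(xtx, xty)
    fitted = [sum(b * col[i] for b, col in zip(beta, cols)) for i in range(len(y))]
    mean_y = sum(y) / len(y)
    sse = sum((obs - f) ** 2 for obs, f in zip(y, fitted))
    sst = sum((obs - mean_y) ** 2 for obs in y)
    return beta, 1 - sse / sst


# Synthetic rows: deaths = 10 + 5 * (state is NY) + 3 * (sex is M) + 20 * age index.
state = ["CA", "CA", "CA", "CA", "NY", "NY", "NY", "NY"]
sex = ["F", "M", "F", "M", "F", "M", "F", "M"]
age_index = [1.0, 1.0, 2.0, 2.0, 1.0, 1.0, 2.0, 2.0]
deaths = [30.0, 33.0, 50.0, 53.0, 35.0, 38.0, 55.0, 58.0]

predictors = dummy_columns(state) + dummy_columns(sex) + [age_index]
beta, r_squared = fit_ols(predictors, deaths)
print([round(b, 4) for b in beta], round(r_squared, 4))
```

Because the synthetic data are exactly linear, the fit should recover coefficients close to 10, 5, 3 and 20 with an R-Square close to 1. Real data would give a lower R-Square, such as the 80% reported in this report.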

Conclusion
From the above single-variable and multivariable analyses we can draw a few valuable conclusions:
1) More than three-quarters (78%) of cancer deaths occur in people aged 65 years and over, and
more than half (52%) occur in those aged 75 years and over. The linear regression model
(Deaths & Age Group) showed an accuracy of 73% on a data set of 9,000.
2) Of the two states analyzed, California recorded the higher number of cancer deaths. The
multiple linear regression model produced a result of 80%, which supports this conclusion.

Future Enhancement

We have tried hard to dig out some valuable facts about cancer, but data analysis is a field
that keeps producing interesting findings the deeper one delves into it. The same happened
with our team: in the current analysis we focused mainly on how the cancer death rate depends
on different factors (Mariotto, 2008). In future, we as a team have decided to analyze two
interesting questions that bear on the survival of cancer patients:
1.) How might physical activity affect cancer survivorship?
2.) How might healthy eating affect cancer survivorship?

Acknowledgement

We would like to thank Professor John Wang for providing valuable suggestions for our analysis.

References
Anderson, M. (2013). Adult stature and risk of cancer at different anatomic sites in a
cohort of postmenopausal women. Cancer Epidemiol Biomarkers Prev.
CDC. (n.d.). Retrieved from Wonder.cdc.gov: http://wonder.cdc.gov/CancerMortv2012.html
Copeland, G. (2014). Registry-Specific Cancer Incidence in the United States and
Canada. Springfield, IL: North American Association of Central Cancer
Registries Inc.
Coughlin, S. (2000). Predictors of pancreatic cancer mortality among a large cohort
of United States adults.
Ebru, K. (2012). Gender Differences in Cancer Susceptibility: An Inadequately
Addressed Issue.
Ghosh, K. (2012). Predicting US- and state-level cancer counts for the current
calendar year: Part II: evaluation of spatiotemporal projection methods for
incidence.
Hamilton, S. (2000). World Health Organization Classification of Tumors: Pathology
and Genetics of Tumors of the Digestive System.
Hankey, B. (2002). Impact of reporting delay and reporting error on cancer
incidence rates and trends.
Howlader, N. (2014). Bethesda, MD: National Cancer Institute.
Institute, N. C. (2005). MD: Statistical Research and Applications Branch.
Izmirlian, G. (2013). The National Lung Screening Trial: results stratified by
demographics, smoking history, and lung cancer histology.
Jacobs, E. (2009). Family history of various cancers and pancreatic cancer mortality
in a large cohort.
Jemal, A. (2009). Cancer statistics.
Kramer, B. (1989). The role of prostate-specific antigen (PSA) testing patterns in the
recent prostate cancer incidence decline in the United States.
Mariotto, A. (2008). Quantifying the role of PSA screening in the US prostate cancer
mortality decline.
Midthune, D. (2000). Permutation tests for joinpoint regression with applications to
cancer rates.
Mukaka, M. (2012). A guide to appropriate use of Correlation coefficient in medical
research.
Murphy, S. (2013). Deaths: Final Data for 2010. National Vital Statistics Reports.
Odze, R. (2004). Surgical Pathology of the GI Tract, Liver, Biliary Tract, and
Pancreas.
Pickle, L. (2007). A new method of estimating United States and state-level cancer
incidence counts for the current calendar year.
Rencher, A. (2002). Methods of Multivariate Analysis.
Robert, M. (2006). Cancer in the Elderly.
Thompson, T. (2015). Meeting the Healthy People 2020 objectives to reduce cancer
mortality. Preventing Chronic Disease.
Wakkee, M. (2014). Journal of Investigative Dermatology - Multivariable Analysis.
Walter, R. (2013). Height as an explanatory factor for sex differences in human
cancer.
Ward, E. (2012). Trends in colorectal cancer incidence rates in the United States by
tumor location and stage.
