
Candidate name: Felix Dyrek Candidate number: 001528-031

Investigation
Math Studies

Working Title:

The correlation between lung cancer incidents and the mean amount of smoked cigarettes

Candidate Name: Felix Dyrek

Candidate number: 001528-031

School name: Kolegium Europejskie

School number: 001528

Assignment supervisor: Katarzyna Nosalska


Introduction:
The aim of my investigation is to find out whether there is a correlation between the total number
of lung cancer incidents and the tobacco consumption of men and women in 6 countries.

Hypothesis:
My hypothesis is that the rate of lung cancer incidents is higher in countries with higher
tobacco consumption than in countries with lower tobacco consumption.

Method:
In order to investigate the correlation between lung cancer incidents and tobacco consumption,
I collected data from the tobacco industry and various (lung) cancer institutions. The next step
was to verify the collected data and compile statistics from it. To calculate the correlation,
the following mathematical methods are used:
• The Pearson correlation coefficient
• The χ² test


Raw Data:

Table 1 : Lung Cancer incident rates per 100000 people

Country     Lung cancer incidents (total)   Female lung cancer incidents   Male lung cancer incidents
China       93                              27                             66
Japan       60                              13                             47
Thailand    87                              37                             50
Sweden      35                              13                             22
Poland      85                              21                             64
UK          73                              22                             51

Chart 1: Lung Cancer incident rates per 100000 people

[Chart: Total, Female and Male lung cancer incident rates (vertical axis from 0 to 100) for China, Japan, Thailand, Sweden, Poland and the UK]


Table 2: Adult smokers in total

Country     Adult smokers in total   Adult smokers (per 100000 people)   Adult smokers (%)
China       462,800,000              35600                               35.6
Japan       42,132,466               33100                               33.1
UK          12,721,380               21000                               21
Sweden      1,939,183                19000                               19
Poland      13,149,988               34500                               34.5
Thailand    15,019,407               23400                               23.4

Chart 2: Percentage of adult smokers in each country

[Chart: Adult Smokers; legend: China, Japan, UK, Sweden, Poland, Thailand]


Table 3: Average amount of cigarettes smoked per year

Country     Cigarettes smoked per person (per year)   Cigarettes smoked per 100000 people (per year)
China       1791                                      179,100,000
Japan       3023                                      302,300,000
UK          2232                                      223,200,000
Sweden      1202                                      120,200,000
Poland      2061                                      206,100,000
Thailand    1067                                      106,700,000

Chart 3: Average number of cigarettes smoked per person per year in each country

[Chart: Average amount of smoked cigarettes per person / year; legend: China, Japan, UK, Sweden, Poland, Thailand]


Calculations

The Pearson Correlation coefficient

The Pearson correlation coefficient is used to determine whether there is a correlation between
the lung cancer incidents and the average number of cigarettes smoked. Table 4 lists the 6
countries so that they can be compared. First, the researched data for lung cancer incidents (x)
and the number of cigarettes smoked per 100000 people per year (y) are entered. Next, x and y are
multiplied to obtain xy. Then x and y are each squared to obtain the last two columns. Finally,
each column is summed.

Table 4

x = lung cancer incidents (per 100000 people)
y = cigarettes smoked per year (per 100000 people)

No.     Country     x     y            xy            x²      y²
1       China       93    179100000    16656300000   8649    32076810000000000
2       Japan       60    302300000    18138000000   3600    91385290000000000
3       UK          73    223200000    16293600000   5329    49818240000000000
4       Sweden      35    120200000    4207000000    1225    14448040000000000
5       Poland      85    206100000    17518500000   7225    42477210000000000
6       Thailand    87    106700000    9282900000    7569    11384890000000000
Total   n = 6       433   1137600000   82096300000   33597   241590480000000000


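The last step is to insert the sums from Table 4 into the formula for the Pearson correlation
coefficient and evaluate it. With n = 6, the means are x̄ = 433 ÷ 6 ≈ 72.17 and
ȳ = 1137600000 ÷ 6 = 189600000. The unrounded value of x̄ is used, because the numerator is a
small difference between two large numbers.

$$r = \frac{\sum xy - n\bar{x}\bar{y}}{\sqrt{\sum x^2 - n\bar{x}^2}\,\sqrt{\sum y^2 - n\bar{y}^2}}$$

$$r = \frac{82096300000 - 6 \times \frac{433}{6} \times 189600000}{\sqrt{33597 - 6 \times \left(\frac{433}{6}\right)^2}\,\sqrt{241590480000000000 - 6 \times 189600000^2}}$$

$$r = \frac{82096300000 - 82096800000}{\sqrt{2348.83}\,\sqrt{25901520000000000}}$$

$$r = \frac{-500000}{48.46 \times 1.609 \times 10^{8}} \approx -6.4 \times 10^{-5}$$

Since r is very close to 0, there is no linear correlation between the lung cancer incidents and
the number of cigarettes smoked in these 6 countries.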
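
The Pearson calculation can be checked with a short script. The following is a minimal sketch in Python, not part of the original investigation: the function name `pearson_r` is an assumption made for illustration, and the data are the x and y columns of Table 4. It uses the same sums-of-products form of the formula and keeps the means unrounded.

```python
from math import sqrt

# Data from Table 4: x = total lung cancer incidents per 100000 people,
# y = cigarettes smoked per year per 100000 people.
x = [93, 60, 73, 35, 85, 87]
y = [179100000, 302300000, 223200000, 120200000, 206100000, 106700000]

def pearson_r(xs, ys):
    """Pearson correlation coefficient (hypothetical helper for this check)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Numerator: sum of xy minus n times the product of the means.
    sxy = sum(a * b for a, b in zip(xs, ys)) - n * mean_x * mean_y
    # Denominator terms: sum of squares minus n times the squared mean.
    sxx = sum(a * a for a in xs) - n * mean_x ** 2
    syy = sum(b * b for b in ys) - n * mean_y ** 2
    return sxy / (sqrt(sxx) * sqrt(syy))

r = pearson_r(x, y)
print(f"r = {r:.5f}")
```

With the Table 4 data the result is a very small negative number, roughly −0.00006, which is consistent with the statement that there is no correlation. A correlation coefficient always lies between −1 and 1, so any result outside that range points to an arithmetic slip.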


The χ² Test

$$\chi^2_{calc} = \sum \frac{(f_o - f_e)^2}{f_e}$$

Where:

$f_o$ is an observed frequency

$f_e$ is an expected frequency

Observed Value Table (fo): The values are taken from Table 1. Each entry is the average of the
female or male lung cancer incidents over the three European countries (Sweden, Poland, UK) or
the three Asian countries (China, Japan, Thailand).

          Female lung cancer incidents   Male lung cancer incidents   Sum
Europe    18.67                          45.67                        64.34
Asia      25.67                          54.33                        80
Sum       44.34                          100                          144.34
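
The averages in the Observed Value Table can be reproduced from Table 1. The sketch below is an illustration only: the names `table1`, `regions` and `regional_means` are assumptions, and the grouping of countries into Europe and Asia follows the description in the text.

```python
# Female and male lung cancer incidents per 100000 people, from Table 1.
table1 = {
    "China": (27, 66), "Japan": (13, 47), "Thailand": (37, 50),
    "Sweden": (13, 22), "Poland": (21, 64), "UK": (22, 51),
}
regions = {
    "Europe": ["Sweden", "Poland", "UK"],
    "Asia": ["China", "Japan", "Thailand"],
}

def regional_means(data, groups):
    """Average the (female, male) rates over the countries in each region."""
    means = {}
    for region, countries in groups.items():
        female = sum(data[c][0] for c in countries) / len(countries)
        male = sum(data[c][1] for c in countries) / len(countries)
        means[region] = (round(female, 2), round(male, 2))
    return means

observed = regional_means(table1, regions)
print(observed)  # {'Europe': (18.67, 45.67), 'Asia': (25.67, 54.33)}
```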

Calculation Table: The calculation table is used to convert the observed values into expected
values, which are needed for the χ² test. Here w and x are the row sums, y and z are the column
sums, and n is the overall total.

       S1      S2      Sum
R1     wy÷n    wz÷n    w
R2     xy÷n    xz÷n    x
Sum    y       z       n


Expected Value Table (fe): This table shows the expected lung cancer incidents for Europe and
Asia. It is based on my data for the 6 countries (China, Japan, Thailand, Sweden, Poland and the
United Kingdom), grouped by continent and divided into female and male groups.

          Female lung cancer incidents   Male lung cancer incidents   Sum
Europe    19.76                          44.58                        64.34
Asia      24.58                          55.42                        80
Sum       44.34                          100                          144.34
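
The rule in the calculation table (row sum × column sum ÷ n) can be written as a short function. This is a minimal sketch, assuming the observed values are stored as a list of rows (Europe, Asia) with columns (female, male). The name `expected_values` is hypothetical.

```python
# Observed values (rows: Europe, Asia; columns: female, male).
observed = [[18.67, 45.67], [25.67, 54.33]]

def expected_values(table):
    """Expected frequency for each cell: row sum * column sum / overall total."""
    row_sums = [sum(row) for row in table]
    col_sums = [sum(col) for col in zip(*table)]
    n = sum(row_sums)
    return [[r * c / n for c in col_sums] for r in row_sums]

expected = expected_values(observed)
for row in expected:
    print([round(v, 2) for v in row])
# [19.76, 44.58]
# [24.58, 55.42]
```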

Now I am going to calculate the χ² value in order to see how closely the observed values agree
with the expected values for male and female lung cancer incidents within Europe and Asia.

χ² Calculations:

fo       fe       fo−fe    (fo−fe)²   (fo−fe)²÷fe
18.67    19.76    -1.09    1.1881     0.0601
45.67    44.58    1.09     1.1881     0.0267
25.67    24.58    1.09     1.1881     0.0483
54.33    55.42    -1.09    1.1881     0.0214
Total                                 0.1565

So, χ² = 0.1565
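
The sum in the last column can be checked in a few lines. The sketch below is illustrative only: the function name `chi_squared` is an assumption, and the fo and fe values are those of the two tables above.

```python
# Observed (fo) and expected (fe) values, in the same order as the table.
fo = [18.67, 45.67, 25.67, 54.33]
fe = [19.76, 44.58, 24.58, 55.42]

def chi_squared(observed, expected):
    """Sum of (fo - fe)^2 / fe over all cells."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

chi2 = chi_squared(fo, fe)
print(round(chi2, 3))
```

Because the script does not round the individual terms to four decimal places before adding them, its result can differ from 0.1565 in the last digit.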

The χ² value is small, which shows that the observed values are close to the expected values.


Degrees of freedom

The next step is to find the degrees of freedom (df) and to use a table of critical values to
interpret the χ² value which I have just obtained.

The χ² distribution depends on the number of degrees of freedom, where df = (r − 1)(c − 1), r is
the number of rows and c is the number of columns.

For my table:

df = (r − 1)(c − 1)
df = (2 − 1)(2 − 1)
df = 1 × 1 = 1
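
One way to interpret the χ² value with df = 1 is sketched below. The 5% significance level is an assumption, since the investigation does not state one; 3.841 is the standard tabulated critical value for df = 1 at that level. The helper names are hypothetical. For one degree of freedom the upper-tail probability can be computed with the standard library, because a χ² variable with df = 1 is the square of a standard normal variable.

```python
from math import erfc, sqrt

def degrees_of_freedom(rows, cols):
    """df = (r - 1)(c - 1) for a table with r rows and c columns."""
    return (rows - 1) * (cols - 1)

def p_value_df1(chi2):
    """Upper-tail probability P(X > chi2) for a chi-squared variable with df = 1."""
    return erfc(sqrt(chi2 / 2))

df = degrees_of_freedom(2, 2)
chi2 = 0.1565          # value obtained in the chi-squared calculation
critical_5pct = 3.841  # assumed 5% level; tabulated critical value for df = 1
p = p_value_df1(chi2)
print(df, round(p, 2), chi2 < critical_5pct)
```

Under these assumptions the χ² value is far below the critical value and the p-value is about 0.69, so the data give no evidence that the split between female and male incidents differs between Europe and Asia.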


Conclusion and Evaluation

Based on the results which I have obtained during my research, it can be concluded that there is
no direct correlation between the number of cigarettes smoked and the lung cancer incidents in
the 6 countries. My hypothesis is therefore not supported by my data. Various other factors can
result in lung cancer, such as second-hand smoke, car exhaust, and alpha, beta and gamma
radiation. As these external factors can increase the chance of lung cancer, my data does not
give a complete picture. Thus lung cancer incidents do not depend purely on the number of
cigarettes consumed, even though it is known that excessive cigarette consumption may cause lung
cancer. The investigation could therefore be improved by including more external factors such as
the ones mentioned above.

Name: Felix Dyrek
