
MALAYSIA HIGHER SCHOOL CERTIFICATE (STPM)

954/4 MATHEMATICS (T) (PAPER 4)


THIRD TERM: STATISTICS






ASSIGNMENT C: MATHEMATICAL INVESTIGATION
(2012/2013)







STATISTICAL INFERENCES ON THE DISTRIBUTION OF DIGIT IN
RANDOM NUMBERS






By






Stephen, P. Y. Bong
(September 2013)







MALAYSIA EXAMINATION COUNCIL
Statistical Inferences on the Distribution of Digit in Random Numbers
by Stephen, P. Y. Bong (September 2013)


DECLARATION

I hereby declare that this report entitled Statistical Inferences on the Distribution of Digit
in Random Numbers is the result of my own work except for quotations and citations
which have been duly acknowledged.

Name : Stephen, P. Y. Bong
Email : stephenbongpy@gmail.com
Date : 21 September 2013




TABLE OF CONTENTS

DECLARATION
LIST OF FIGURES
LIST OF TABLES
ABSTRACT
1.0 INTRODUCTION
    1.1 Aims and Objectives
    1.2 Outline of Report
2.0 METHODOLOGY
    2.1 The Generation of Thirty 3-Digit Numbers by Drawing of Poker Cards from French Deck
    2.2 The Generation of One Hundred 3-Digit Numbers by Computer Algebra System (CAS) Wolfram Mathematica 7.0
    2.3 The Revise and Comparison of Probabilities Obtained by Simple Experiment (Drawing of Cards from French Deck) by Interval Estimation
    2.4 The Sampling of Fifty 3-Digit Random Numbers by Random Number Tables
    2.5 The Sampling of Sixty Four 3-Digit Numbers Generated by the Function f(n)
3.0 RESULT ANALYSIS AND DISCUSSIONS
    3.1 The Analysis of Thirty 3-Digit Numbers Obtained from Simple Experiment (Drawing of Cards from French Deck)
    3.2 The Analysis of One Hundred 3-Digit Numbers that are Randomly Generated by Wolfram Mathematica 7.0
    3.3 The Analysis of Fifty 3-Digit Numbers by Random Number Tables
    3.4 The Analysis of Sixty Four 3-Digit Random Numbers Generated by the Function f(n)
4.0 CONCLUSIONS
5.0 REFERENCES
APPENDICES
    Appendix 1 Computational Codes
    Appendix 2 List of Statistical Tables
    Appendix 3 Screenshots of Random Numbers Generated from Wolfram Mathematica 7.0

LIST OF FIGURES

Figure 1: French Deck used in the simple experiment to generate thirty 3-digit numbers
Figure 2: The computer algebra system (CAS) Wolfram Mathematica 7.0 used in this work to generate random numbers
Figure 3: An illustration of the selection of random numbers from a portion of Random Number Tables
Figure 4: The screenshot of random generation of one hundred 3-digit numbers ranged from 0 to 999 by Wolfram Mathematica 7.0
Figure 5: The screenshot of random generation of sixty four real numbers ranged from 0 to 1 by Wolfram Mathematica 7.0

LIST OF TABLES

Table 1: Confidence intervals for the population proportions, p (Crawshaw & Chambers 2002)
Table 2: Outcomes obtained from the drawing of cards from French Deck
Table 3: Frequencies and probabilities of 3 different digits, 2 same digits, and 3 similar digits
Table 4: The frequencies and proportions computed based on the sample of one hundred 3-digit random numbers
Table 5: The symmetric 90% and 95% confidence intervals for the probabilities that a 3-digit number has three different digits, two same digits and three identical digits
Table 6: Frequencies and probabilities correspond to each respective category in a sample of fifty 3-digit random numbers
Table 7: The observed frequency and the expected frequency for Chi-square test
Table 8: Processing of raw numbers into a sample of sixty four 3-digit random numbers
Table 9: Random Number Tables
Table 10: Percentage Points of the $\chi^2$ Distribution
Table 11: The Upper Tail Probabilities $\Phi(z)$ of the Standard Normal Distribution X ~ N(0, 1)



ABSTRACT

The information perceived and processed by the human cognitive system may be simple
or complex, clear or distorted, and complete or filled with gaps. Because the options available
are often subject to uncertainty and unpredictable natural variation, subjective probability has
been extensively employed in decision making. The outcomes and results are often validated
with inferential statistics, in which conclusions are drawn from the estimation of population
parameters and the testing of hypotheses. Accordingly, a mathematical investigation of the
distribution of digits in the one thousand 3-digit numbers ranging from 0 to 999 is conducted.
Because the population is large, a census of every member would be tedious. Hence, samples
of sizes 30, 100, 50 and 64 are obtained through a simple experiment, a number generator and
random sampling. The probabilities obtained from the analyses of the samples closely
resemble one another, with discrepancies not exceeding 0.15. In addition, the one-tail test and
the Chi-square goodness of fit test support the suggested probabilities of 0.73 and 0.7 that a
3-digit number consists of 3 distinct digits. As a result, to obtain a sample that indicates the
population parameters more accurately, a larger sample size in the range 100 < n < 400 is
recommended.




















1.0 INTRODUCTION

The information perceived and processed by the human cognitive system may be simple
or complex, clear or distorted, and complete or filled with gaps (Chai 2012). Kahneman and
Tversky (1972) noted that subjective probability has been extensively employed in decision
making, because the options available carry uncertainties and variations. As a result,
statistical inferences such as estimation and hypothesis testing (Soon & Lau 2013) can be
used to examine the associated attributes and outcomes (Wallsten & Diederich 2011). As
mentioned by Machina and Schmeidler (1992), although theorists and philosophers such as
Koopman (1940) and Ramsey (1931) have proposed definitions of probability, a unique and
concise definition that describes subjective probability explicitly is still lacking. In a nutshell,
according to Anscombe and Aumann (1963), the term subjective probability can be
interpreted as a person's preferences, in so far as these preferences satisfy certain
consistency assumptions. Put simply, it is the likelihood that a particular outcome will occur
based on individual judgment or degree of belief (Clemen 1997). Besides, according to
Kyburg (1978), there is no standard mathematical equation or formula for computing the
subjective probability of an event, as it is prone to bias.

Apart from branches of psychological science such as cognition and decision making,
the rapid development of mathematical statistics has also led subjective probability to be used
extensively in real-life and industrial applications. The biotechnology industry and forensic
science are typical fields in which subjective probability is pervasive. For instance, the
research conducted by Biedermann and colleagues (2013) reveals that law and its interface
with forensic science represent an illustrative example of an area of application where
decision-making plays a central role. Probabilistic inference is part of this framework, but
only as a preliminary step based on one's beliefs. Subsequently, a decision is made and a
conclusion or verdict is drawn. Besides, Velasco (2012) and Weir (2013) illustrate in their
research that subjective probability is widely applicable to the processing of deoxyribonucleic
acid (DNA) and gene expression. This is because gene expression would be indeterministic if
the question of whether a gene will be expressed in a given cell in a given time frame is not
determined even by the exact state of all the cellular and environmental components at a given
time. Some widespread real-life situations in which subjective probability is involved are
presented in the subsequent paragraph.

The probability that a pair of Aces is drawn from a French Deck in a game of
Blackjack (often referred to as Twenty-One) is 7.51%, under the constraint that the dealer
stands on all 17s (Blackjack Info 2013; Maskalevich 2011). This is owing to the chance that
cards of odd rank drawn from the four French suits could sum to twenty-one in the game
(Christenstock 2013). In addition, the chance that an election candidate from a political party
wins in a general election and becomes a Member of Parliament (MP) is 33%, owing to the
number of political parties contesting the general election (Jacob 2011). Apart from that, one
could also deduce that the probability that one of the columns in a structure will buckle is
0.23% (Hibbeler 2009), since parameters such as the settlement of soil (Whitlow 2000), the
compression properties of different alloy steels (Benham et al. 1987), and the vibration
induced by wind loads (Thomson & Dahleh 1998) also strongly affect the occurrence of
buckling. Moreover, one could oppose the hypothesis put forward by an experienced engineer
that a tiny grain of sand takes 10 minutes to reach ground level from the top of a building
under construction. This is because the induced drag and the viscosity of air at 1 atmosphere
cause the terminal velocity to vary, which might prolong the settling time (Wong 2012).

Based on the four real-life situations mentioned above, it can be seen that experience
and personal degree of belief play a dominant role in the determination of subjective
probability. Therefore, Crawshaw and Chambers (2002) noted that techniques of statistical
inference such as sampling and estimation as well as the testing of hypotheses are frequently
used to validate the outcomes and results obtained. Consequently, in order to understand the
fundamental mechanics behind subjective probability, the primary intention of this work is a
mathematical investigation of the distribution of digits in the one thousand 3-digit numbers
ranging from 0 to 999. The distribution of digits in random numbers is considered in this work
because a random number is in fact a string of the 10 digits (0-9) arranged in an irregular way.
As mentioned by Bafna and Kumar (2012), although the term random in random number
reflects the irregular arrangement of digits, random numbers cannot be truly random, since
they are generated by a fixed algorithm or programming code. An example is the linear
congruential method for generating pseudo-random numbers, which is based on the linear
recurrence relation (Marsaglia & Zaman 1993) $x_{i+1} \equiv a x_i + b \pmod{M}$, where a
and b are constants, M is the modulus (a positive integer, so that the scaled value $x_i / M$
lies in the interval from 0 to 1), and $i \in [0, \infty)$. As a result, analysing the distribution of
digits in random numbers is an interesting and enjoyable mathematical investigation.
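The linear congruential recurrence mentioned above can be illustrated with a short Python sketch. The constants a, b and M below are hypothetical textbook-style values chosen only for illustration; they are not taken from this report or from Wolfram Mathematica, and the scaling to 3-digit numbers is likewise an assumption.

```python
M = 2**31  # hypothetical modulus (assumption, not from the report)


def lcg(seed, a=1103515245, b=12345, m=M):
    """Yield pseudo-random integers in [0, m) from x_{i+1} = (a*x_i + b) mod m."""
    x = seed
    while True:
        x = (a * x + b) % m
        yield x


def three_digit_numbers(seed, count, m=M):
    """Scale each output to [0, 1) and keep the first three decimal digits."""
    gen = lcg(seed, m=m)
    return [f"{(next(gen) * 1000) // m:03d}" for _ in range(count)]


# The same seed always reproduces the same sequence, which is why such
# numbers are called pseudo-random rather than truly random.
print(three_digit_numbers(seed=1, count=5))
```

Because the whole sequence is fixed by the seed, two runs with the same seed give identical digits, which is the sense in which the numbers are not truly random.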

1.1 Aims and Objectives

As mentioned above, subjective probability is permeated with uncertainty and depends
strongly on past experience and personal degree of belief. A sound understanding of the
fundamentals of inferential statistics, from which conclusions and verdicts can be drawn, is
therefore indispensable. Hence, the specific aims of this work and their corresponding
objectives are listed below, with each objective indented under its aim:

- To visualize the pattern of arrangement of digits in random numbers.

    - To investigate the distribution of digits based on sample statistics corresponding
      to a population of one thousand 3-digit numbers ranging from 0 to 999.
    - To observe the frequency distribution of the samples of random numbers obtained,
      classified into 3 distinct digits, 3 identical digits, and 2 similar digits.

- To make a statistical inference on the population of one thousand 3-digit numbers
  based on the statistics of samples.

    - To determine the sample statistics which accurately reflect the characteristics
      of the entire population (one thousand 3-digit numbers) through interval
      estimation and hypothesis testing.

1.2 Outline of Report

The methodologies and approaches employed in this work are reviewed in Section 2.0,
whereas Section 3.0 presents the results and a discussion of the outcomes obtained. Finally,
the conclusions of the entire assignment are presented in Section 4.0.





2.0 METHODOLOGY

As mentioned in Section 1.0, past experience and one's degree of belief are
determining factors that can drastically change the outcomes and results of subjective
probability. To increase their credibility, the results obtained are often validated with
methodologies such as interval estimation and hypothesis testing. Such techniques of
statistical inference are adopted out of considerations of time and cost (Soon & Lau 2013).
Apart from that, as mentioned by Crawshaw and Chambers (2002), a census would be a
tedious survey if the targeted population is large. On the other hand, although sampling and
estimation offer these advantages, biases such as uncertainties and natural variations could
significantly affect how well a sample reflects the population parameters. Hence, in order to
obtain good samples that represent the entire population, the methods used to obtain the
samples in this work, and the approaches taken in the analyses of the outcomes and results,
are presented in the subsequent subsections.

2.1 The Generation of Thirty 3-Digit Numbers by Drawing of Poker Cards from
French Deck

A deck of poker cards (often referred to as a French Deck, see Figure 1 below)
consists of 52 cards made up of 13 distinct ranks in each of the four French suits, namely
Diamond (♦), Club (♣), Heart (♥), and Spade (♠). By taking each 10 as 0, each Ace as 1, and
each 2, 3, 4, 5, 6, 7, 8 and 9 as the integers from 2 to 9 respectively, a 3-digit number is
obtained by drawing three cards one by one, without replacement. If a J, Q, K or Joker is
drawn, it is disregarded and drawing continues until one of the cards mentioned in the
preceding sentence is obtained. To ensure that all the cards are randomly distributed, the
cards are shuffled before each subsequent draw. The outcomes are then tabulated in
Table 2 (see Section 3.1). Based on the outcomes tabulated in Table 2, the frequencies of
numbers with 3 different digits, 2 similar digits, and 3 identical digits are counted. Lastly, the
probability of each of these categories is determined by Eq. (1), and a deduction on the
nature of the probabilities obtained is made.

\[ P(X = x) = \frac{n(X = x)}{n(S)} \]
Eq. (1)

where X represents the number of identical digits (X = 0, 2 and 3), and
S is the sample space, n(S) = 30.



Figure 1: French Deck used in the simple experiment to generate thirty 3-digit numbers

2.2 The Generation of One Hundred 3-Digit Numbers by Computer Algebra System
(CAS) Wolfram Mathematica 7.0

In order to clearly illustrate the distribution of digits in random numbers, and to
obtain a result that gives a more accurate indication of the population characteristics being
studied, a sample of one hundred 3-digit numbers is generated randomly with the computer
algebra system (CAS) Wolfram Mathematica 7.0, as shown in Figure 2. The computational
code used to generate the one hundred 3-digit random numbers can be found in Appendix 1.
Wolfram Mathematica 7.0 is adopted as the random number generator in this work because
the software is available at Swinburne University of Technology (Sarawak Campus). Besides,
it is a popular and powerful computer algebra system that is extensively used by researchers
and statisticians, which gives confidence in the randomness of the integers generated. The
outcomes are then tabulated in Table 4 (see Section 3.2) and the frequency of each case
(distribution of digits) is counted by tallying. With the frequency of each case known, the
proportions (or relative frequencies of occurrence) are determined. Since the sample size is
100 (n ≥ 30), the sampling distribution of the proportion can be approximated by a normal
distribution by the Central Limit Theorem (Crawshaw & Chambers 2002), as shown in Eq. (2).

\[ P_s \sim N\left(p, \frac{pq}{n}\right) \]
Eq. (2)


where p is the proportion of successes in the population,
q = 1 − p, and n is the sample size.
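The sampling step of this subsection can be sketched in Python. This is not the Mathematica code of Appendix 1; the function names, the seed and the use of Python's `random` module are assumptions made only to show how one hundred 3-digit numbers and their category proportions might be produced.

```python
import random
from collections import Counter


def identical_digit_count(number):
    """Return X for a 3-character digit string: 0, 2 or 3 identical digits."""
    return {3: 0, 2: 2, 1: 3}[len(set(number))]


def sample_proportions(n=100, seed=2013):
    """Draw n numbers from 000 to 999 and return them with their proportions."""
    rng = random.Random(seed)  # hypothetical seed, for repeatability only
    sample = [f"{rng.randrange(1000):03d}" for _ in range(n)]
    counts = Counter(identical_digit_count(s) for s in sample)
    proportions = {x: counts.get(x, 0) / n for x in (0, 2, 3)}
    return sample, proportions


sample, proportions = sample_proportions()
print(proportions)
```

Leading zeros are kept by formatting each integer as a 3-character string, so that a value such as 2 is treated as the 3-digit number 002.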


Figure 2: The computer algebra system (CAS) - Wolfram Mathematica 7.0 used in this work
to generate random numbers

Usually, a continuity correction of $\frac{1}{2n}$ is applied when the normal
approximation to the binomial distribution is used. However, when a significance test is
carried out through the confidence interval approach, it is reasonable to fix the confidence
coefficient in advance at a conventional value such as 90% or 95%, and the approximate
limits obtained with the continuity correction tend to be conservative (Mendenhall et al.
2013). The continuity correction is therefore neglected, and interval estimation is conducted
with symmetric 90% and 95% confidence intervals, which provide a range of values that has
a stated probability of containing the population parameter (Elsevier Inc. 2012). Apart from
that, the results of the interval estimation are used to revise the probabilities computed from
the sample of thirty 3-digit numbers obtained by drawing poker cards. The confidence limits
are determined by Eq. (3) as follows. Lastly, a comment on the population proportion is
made.


\[ \left( \underbrace{p_s - z_{\alpha/2}\sqrt{\frac{p_s(1 - p_s)}{n}}}_{\text{Lower Confidence Limit (LCL)}},\ \underbrace{p_s + z_{\alpha/2}\sqrt{\frac{p_s(1 - p_s)}{n}}}_{\text{Upper Confidence Limit (UCL)}} \right) \]
Eq. (3)
where $z_{\alpha/2}$ is the critical z-value of the confidence interval. Table 1 below summarizes the confidence intervals for the population proportion, p.

Table 1: Confidence intervals for the population proportions, p (Crawshaw & Chambers 2002)

Confidence level                90%                                     95%                                     99%
------------------------------  --------------------------------------  --------------------------------------  --------------------------------------
Significance level, $\alpha$    10% = 0.10                              5% = 0.05                               1% = 0.01
Upper tail probability          0.05                                    0.025                                   0.005
Lower tail probability          0.95                                    0.975                                   0.995
Critical z-value                $z_{0.05} = \Phi^{-1}(0.95) = 1.645$    $z_{0.025} = \Phi^{-1}(0.975) = 1.96$   $z_{0.005} = \Phi^{-1}(0.995) = 2.576$
Confidence interval (LCL, UCL)  $p_s \pm 1.645\sqrt{p_s q_s / n}$       $p_s \pm 1.96\sqrt{p_s q_s / n}$        $p_s \pm 2.576\sqrt{p_s q_s / n}$
Width                           $2 \times 1.645\sqrt{p_s q_s / n}$      $2 \times 1.96\sqrt{p_s q_s / n}$       $2 \times 2.576\sqrt{p_s q_s / n}$

In each column the critical z-value follows from $P(Z < z_{\alpha/2}) = 1 - \alpha/2$, so that
$\Phi(z_{\alpha/2}) = 1 - \alpha/2$ and $z_{\alpha/2} = \Phi^{-1}(1 - \alpha/2)$ (see Appendix 2 for the
complete table). Here $q_s = 1 - p_s$.
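The critical z-values in Table 1 can be cross-checked numerically. The sketch below uses the inverse normal CDF from Python's standard library instead of the printed table in Appendix 2; the function name `critical_z` is an assumption introduced for illustration.

```python
from statistics import NormalDist


def critical_z(confidence):
    """Return z such that P(-z < Z < z) equals the given confidence level."""
    alpha = 1 - confidence
    # The upper tail holds alpha/2, so the lower tail probability is 1 - alpha/2.
    return NormalDist().inv_cdf(1 - alpha / 2)


for level in (0.90, 0.95, 0.99):
    print(f"{level:.0%}: z = {critical_z(level):.3f}")
```

To three decimal places the values agree with the 1.645, 1.96 and 2.576 quoted in Table 1.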

2.3 The Revise and Comparison of Probabilities Obtained by Simple Experiment
(Drawing of Cards from French Deck) by Interval Estimation

In order to revise and compare the probabilities obtained by drawing poker cards
from the French Deck, the standard error of the sample of size 30 is computed. In addition,
the probability distribution is compared with the results of the interval estimation tabulated
in Table 5. If the revised probabilities fall outside the confidence limits, the experiment is
repeated until the sample statistics fit the symmetric 90% and 95% confidence intervals.

2.4 The Sampling of Fifty 3-Digit Random Numbers by Random Number Tables

A sample of fifty 3-digit random numbers is obtained by simple random sampling.
The numbers are taken from Random Number Tables (see Appendix 2), so that the
randomness of the sample can be assured. By selecting the 5th, 10th, 15th, 20th, and 25th
rows, a sample of fifty 3-digit random numbers is obtained. Numbers that fall outside the
range are disregarded. As an illustration, the selection of random numbers from a portion of
the Random Number Tables is depicted in Figure 3. According to Figure 3, the random
numbers selected from the 10th row are listed as follows:

10th Row: 28 55 53 09 48 86 28 30 02 35 71 30 32 06 47

Thus, the 10 random numbers after processing are:

{285, 553, 094, 886, 283, 002, 357, 130, 320, 647}
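The regrouping of a table row into 3-digit numbers can be sketched as follows. The helper name `group_digits` is hypothetical; the input is the 10th row quoted above.

```python
def group_digits(row, width=3):
    """Join the digit pairs of a table row and regroup them into blocks of `width`."""
    digits = "".join(row.split())
    usable = len(digits) - len(digits) % width  # drop an incomplete trailing block
    return [digits[i:i + width] for i in range(0, usable, width)]


row_10 = "28 55 53 09 48 86 28 30 02 35 71 30 32 06 47"
print(group_digits(row_10))
# ['285', '553', '094', '886', '283', '002', '357', '130', '320', '647']
```

Keeping the blocks as strings preserves leading zeros, so 094 and 002 remain 3-digit numbers.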

Once the sample of fifty 3-digit random numbers is obtained by the method described
above, the data are tabulated in Table 6 (see Section 3.3), and the frequency and the
proportion of each category are determined. With both known, a table of observed and
expected frequencies is constructed so that a Chi-square test can be conducted to determine
whether the sample of fifty 3-digit numbers fits the distribution of the revised probabilities.
The test statistic, often termed the Pearson Chi-square statistic, is given by Eq. (4), with the
following null and alternative hypotheses:

$H_0$: The distribution of digits obtained from the sample of fifty 3-digit numbers fits the
distribution of the revised probabilities.

$H_1$: The distribution of digits obtained from the sample of fifty 3-digit numbers does
not fit the distribution of the revised probabilities.


\[ \chi^2 = \sum_i \frac{(O_i - E_i)^2}{E_i} \]
Eq. (4)

where $O_i$ is the observed frequency (the frequency computed directly from the sample data
by the tally method), and
$E_i$ is the expected frequency, which is determined by multiplying the proportion
by the sample size, $E_i = np$ with n = 50.

Since the population proportions, p, are known, the number of degrees of freedom is
given by ν = n − 1 (Crawshaw & Chambers 2002), where n = 3 because there are three
categories of data. Thus, the Chi-square goodness of fit test is performed at the 5%
significance level with 2 degrees of freedom, which corresponds to a critical value of 5.991
(see Appendix 2). If $\chi^2$ is greater than 5.991, there is sufficient evidence to reject the null
hypothesis. On the contrary, if $\chi^2$ is less than 5.991, the null hypothesis is not rejected, and it
can be concluded that the distribution of digits obtained from the sample of fifty 3-digit
numbers fits the distribution of the revised probabilities.
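The goodness of fit computation can be sketched in Python. The observed counts below are hypothetical and serve only to exercise the formula, and the proportions from Table 3 are used as a stand-in for the revised probabilities; both are assumptions, not results of this report. The null hypothesis is rejected only when the statistic exceeds the critical value of 5.991.

```python
CRITICAL_VALUE = 5.991  # chi-square, 2 degrees of freedom, 5% level (from the text)


def chi_square_statistic(observed, proportions):
    """Pearson statistic: sum of (O_i - E_i)^2 / E_i with E_i = n * p_i."""
    n = sum(observed)
    expected = [n * p for p in proportions]
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Stand-in proportions for X = 0, 2, 3 (taken from Table 3; an assumption here).
proportions = [21 / 30, 8 / 30, 1 / 30]
# Hypothetical observed frequencies for a sample of fifty numbers.
observed = [36, 13, 1]

statistic = chi_square_statistic(observed, proportions)
reject_h0 = statistic > CRITICAL_VALUE
print(f"chi-square = {statistic:.4f}, reject H0: {reject_h0}")
```

With these illustrative counts the statistic is far below 5.991, so the null hypothesis would not be rejected.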


Figure 3: An illustration of the selection of random numbers from a portion of Random
Number Tables

2.5 The Sampling of Sixty Four 3-Digit Numbers Generated by the Function
5
1000 n

A sample of sixty-four real numbers ranging from 0 to 1 is generated randomly with
Wolfram Mathematica 7.0. The numbers generated are then substituted into the
function ( ) | |
5
1000 , : 0,1 f n n n = e . The computations are carried out by the aid of
Microsoft Excel 2013, and only the integer part of each result is taken into consideration.
The computational codes are attached in Appendix 1. The probability of getting a 3-digit
number with three different digits is expected to differ from the revised probability. However,
the deviation should not exceed 0.1; therefore, the suggested probability of obtaining a 3-digit
number with three distinct digits is 0.73. Consequently, the null and alternative hypotheses are
stated as follows:

$H_0$: P(X = 0) = 0.73
$H_1$: P(X = 0) > 0.73

Since the primary intention of the hypothesis test conducted here is to determine whether the
suggested probability is accepted, it is a one-tail test with α = 0.05 and α = 0.10, that is, at
significance levels of 5% and 10% respectively.
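A one-tail test of this kind can be sketched as follows. The sketch assumes the usual normal-approximation test statistic for a proportion, $z = (p_s - p_0)/\sqrt{p_0 q_0 / n}$, which is not spelled out in this section, and the count of 48 successes out of 64 is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def one_tail_z_test(successes, n, p0, alpha):
    """Upper-tail test of H0: p = p0 against H1: p > p0 (normal approximation)."""
    p_s = successes / n
    z = (p_s - p0) / sqrt(p0 * (1 - p0) / n)
    z_critical = NormalDist().inv_cdf(1 - alpha)
    return z, z_critical, z > z_critical


# Hypothetical sample: 48 of 64 numbers have three distinct digits.
for alpha in (0.05, 0.10):
    z, z_critical, reject = one_tail_z_test(48, 64, p0=0.73, alpha=alpha)
    print(f"alpha = {alpha}: z = {z:.4f}, critical = {z_critical:.3f}, reject H0: {reject}")
```

For these illustrative figures the test statistic lies below both critical values, so the suggested probability of 0.73 would not be rejected at either level.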


3.0 RESULT ANALYSIS AND DISCUSSIONS

The computations and analysis of results in the following subsections are based on the
methodologies proposed in Section 2.0.

3.1 The Analysis of Thirty 3-Digit Numbers Obtained from Simple Experiment
(Drawing of Cards from French Deck)

The outcomes obtained from the drawing of poker cards from French Deck are
tabulated in Table 2 below.

Table 2: Outcomes obtained from the drawing of cards from French Deck

No.  1st Draw  2nd Draw  3rd Draw  3-Digit Number    No.  1st Draw  2nd Draw  3rd Draw  3-Digit Number
---  --------  --------  --------  --------------    ---  --------  --------  --------  --------------
1    4         5         5         455               16   7         4         2         742
2    0         5         7         057               17   9         0         4         904
3    6         6         6         666               18   5         4         2         542
4    6         8         8         688               19   8         4         8         848
5    3         5         5         355               20   7         7         5         775
6    4         1         2         412               21   1         8         5         185
7    8         8         5         885               22   9         0         0         900
8    3         2         0         320               23   1         6         4         164
9    7         2         5         725               24   4         8         5         485
10   3         8         9         389               25   9         3         8         938
11   6         1         4         614               26   6         4         5         645
12   6         8         5         685               27   1         5         5         155
13   7         0         1         701               28   6         2         3         623
14   9         1         8         918               29   7         5         6         756
15   7         3         8         738               30   0         3         8         038

Based on the outcomes tabulated in Table 2 above, the frequencies and probabilities for 3
different digits, 2 same digits and 3 identical digits are computed and tabulated in Table 3.

Table 3: Frequencies and probabilities of 3 different digits, 2 same digits, and 3 similar digits

X = x  Outcomes corresponding to X = x                                Frequency, f  Probability, P(X = x)
-----  -------------------------------------------------------------  ------------  ---------------------
0      (6,1,4), (1,8,5), (0,5,7), (6,8,5), (7,0,1), (1,6,4), (9,1,8),  21            0.7
       (4,8,5), (7,3,8), (9,3,8), (4,1,2), (7,4,2), (6,4,5), (9,0,4),
       (3,2,0), (5,4,2), (6,2,3), (7,2,5), (7,5,6), (3,8,9), (0,3,8)
2      (4,5,5), (9,0,0), (6,8,8), (3,5,5), (8,8,5), (1,5,5), (8,4,8),  8             0.266666667
       (7,7,5)
3      (6,6,6)                                                        1             0.033333333
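The tally in Table 3 can be reproduced from the thirty outcomes of Table 2 with a short Python sketch of Eq. (1). The names `TABLE_2` and `identical_digit_count` are introduced here only for illustration.

```python
from collections import Counter

# The thirty 3-digit outcomes of Table 2, in order, with leading zeros kept.
TABLE_2 = [
    "455", "057", "666", "688", "355", "412", "885", "320", "725", "389",
    "614", "685", "701", "918", "738", "742", "904", "542", "848", "775",
    "185", "900", "164", "485", "938", "645", "155", "623", "756", "038",
]


def identical_digit_count(number):
    """Return X: 0 for three different digits, 2 for two same, 3 for three same."""
    return {3: 0, 2: 2, 1: 3}[len(set(number))]


counts = Counter(identical_digit_count(number) for number in TABLE_2)
# Eq. (1): P(X = x) = n(X = x) / n(S), with n(S) = 30.
probabilities = {x: counts[x] / len(TABLE_2) for x in (0, 2, 3)}
print(counts, probabilities)
```

The counts come out as 21, 8 and 1 for X = 0, 2 and 3, matching the frequencies in Table 3.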
According to the frequencies and probabilities tabulated in Table 3, the probability of
getting a number with three identical digits is extremely low (P(X = 3) = 0.0333) compared
with the probabilities of the other two categories. In contrast, the results also reveal that a
number with three different digits is the most likely outcome.

3.2 The Analysis of One Hundred 3-Digit Numbers that are Randomly Generated by
Wolfram Mathematica 7.0

The sample of one hundred 3-digit numbers randomly generated with Wolfram
Mathematica 7.0 is listed below:

765 905 863 718 518 307 166
002 156 727 556 699 813 056
216 091 078 096 777 996 790
362 171 666 294 856 819 951
324 404 562 775 626 255 518
177 980 760 795 392 917 916
407 281 396 245 042 820 812
743 810 953 228 037 168 861
388 393 695 869 802 405 409
991 118 768 976 080 809 388

Based on the data listed above, the frequencies, f, and proportions, $p_s$, for each case are counted
and tabulated in Table 4 as follows:

Table 4: The frequencies and proportions computed based on the sample of one hundred 3-digit
random numbers

X = x   Frequencies, f   Proportions, $p_s$
0       69               0.69
2       29               0.29
3       2                0.02

With both the frequencies and proportions known, the symmetric 90% and 95% confidence
intervals for the probabilities that a 3-digit number has three different digits, two same digits
and three identical digits are determined and tabulated in Table 5.




Table 5: The symmetric 90% and 95% confidence intervals for the probabilities that a 3-digit number has three different digits, two same digits
and three identical digits

Symmetric 90% Confidence Interval
Category                     Proportion, $p_s$   Confidence Interval
3 Different Digits (X = 0)   0.69                $0.69 \pm 1.645\sqrt{\frac{(0.69)(0.31)}{100}} = (0.6139,\ 0.7661)$
2 Same Digits (X = 2)        0.29                $0.29 \pm 1.645\sqrt{\frac{(0.29)(0.71)}{100}} = (0.2154,\ 0.3646)$
3 Same Digits (X = 3)        0.02                $0.02 \pm 1.645\sqrt{\frac{(0.02)(0.98)}{100}} = (-0.003,\ 0.04303)$

Symmetric 95% Confidence Interval
Category                     Proportion, $p_s$   Confidence Interval
3 Different Digits (X = 0)   0.69                $0.69 \pm 1.96\sqrt{\frac{(0.69)(0.31)}{100}} = (0.5994,\ 0.7806)$
2 Same Digits (X = 2)        0.29                $0.29 \pm 1.96\sqrt{\frac{(0.29)(0.71)}{100}} = (0.2011,\ 0.3789)$
3 Same Digits (X = 3)        0.02                $0.02 \pm 1.96\sqrt{\frac{(0.02)(0.98)}{100}} = (-0.0074,\ 0.0474)$
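
The limits in Table 5 can be reproduced with a short Python sketch of the symmetric interval $p_s \pm z\sqrt{p_s(1 - p_s)/n}$. The function name `wald_interval` is an assumption introduced here for illustration; the z-values 1.645 and 1.96 and the proportions are those used in the table.

```python
from math import sqrt


def wald_interval(p_s: float, n: int, z: float) -> tuple[float, float]:
    """Symmetric confidence interval p_s +/- z * sqrt(p_s * (1 - p_s) / n)."""
    half_width = z * sqrt(p_s * (1 - p_s) / n)
    return p_s - half_width, p_s + half_width


n = 100                                   # sample size of Section 3.2
proportions = {0: 0.69, 2: 0.29, 3: 0.02}  # Table 4

for level, z in (("90%", 1.645), ("95%", 1.96)):
    for x, p_s in proportions.items():
        lower, upper = wald_interval(p_s, n, z)
        print(f"{level} CI for X = {x}: ({lower:.4f}, {upper:.4f})")
```

For X = 3 the lower limit comes out negative at both confidence levels, which is the behaviour discussed in the text that follows.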



The results tabulated in Table 3 and Table 5 show a close resemblance between the proportions
obtained from two samples of different sizes. For instance, the proportions of 3-digit numbers
with three distinct digits in the two samples (0.70 and 0.69) differ by only 0.01, which is
relatively small. Hence, it can be said that a larger sample gives a more accurate indication
of the population parameter (Soon & Lau 2013). Apart from that, although the sample proportion
of 3-digit numbers with three identical digits is 0.02, the lower confidence limits of the
symmetric 90% and 95% confidence intervals are -0.003 and -0.0074 respectively. These limits
are negative, which is impossible for a probability. They arise because, for this category,
$np_s = 100 \times 0.02 = 2$ is less than 5, so the normal approximation is poor. The negative
values also imply that, although a wider interval is more likely to include the true population
parameter, it may lead to discrepancies in the results as well (Crawshaw & Chambers 2002).
Therefore, in order to overcome this drawback of interval estimation, a larger sample of size
100 < n < 400 is recommended.

3.3 The Analysis of Fifty 3-Digit Numbers by Random Number Tables

Based on the symmetric 90% and 95% confidence intervals in Table 5, none of the probabilities
in Table 3 lies outside the confidence limits. Thus, the subjective probability obtained by
drawing poker cards from a French deck in Section 3.1 is acceptable. Consequently, the
following null and alternative hypotheses are formulated.

$H_0$: The distribution of digits obtained from the sample of fifty 3-digit numbers fits the
distribution of the revised probability.

$H_1$: The distribution of digits obtained from the sample of fifty 3-digit numbers does
not fit the distribution of the revised probability.

According to the approaches proposed in Section 2.4, fifty 3-digit random numbers are
selected from Random Number Tables and listed below:

555 956 356 438 548 246 223 162 430 990
576 086 324 409 472 796 544 917 460 962
378 594 351 283 395 008 304 234 079 688
311 693 324 350 278 987 192 015 370 049
299 498 942 468 496 910 825 375 919 330

Hence, the frequencies and probabilities corresponding to the respective categories are computed
and tabulated in Table 6 below.

Table 6: Frequencies and probabilities corresponding to each category in a sample of
fifty 3-digit random numbers
X = x Frequencies, f Probabilities, P(X = x)
0 40 0.8
2 9 0.18
3 1 0.02

With the observed frequencies in Table 6 known, the expected frequencies are calculated and
tabulated in Table 7.

Table 7: The observed frequency and the expected frequency for Chi-square test

$X = x_i$   Observed Frequency, $O_i$   Revised Probability, $p_i$   Expected Frequency, $E_i = np_i$   $O_i - E_i$   $\frac{(O_i - E_i)^2}{E_i}$
0           40                          0.7                          35                                 5             0.7143
2           9                           0.2667                       13.335                             -4.335        1.4092
3           1                           0.0333                       1.665                              -0.665        0.2656
Total       $\sum O_i = 50$                                          $\sum E_i = 50$                                  $\sum \frac{(O_i - E_i)^2}{E_i} = 2.3891$



As mentioned in Section 2.4, the upper critical value for the Chi-square goodness-of-fit test
at the 5% significance level with 2 degrees of freedom is $\chi^2_{0.05,2} = 5.991$ (see
Appendix 2), which is much greater than the test value of $\chi^2 = 2.3891$. Therefore, there
is insufficient evidence to reject the null hypothesis. It can be concluded that the
distribution of digits obtained from the sample of fifty 3-digit numbers fits the distribution
of the revised probability.
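
The arithmetic of Table 7 and the decision above can be checked with the following Python sketch. It is an illustration only: the variable names are assumptions, and the critical value 5.991 is taken from the text rather than computed.

```python
# Observed frequencies (Table 6) and revised probabilities (Table 7), keyed by X.
observed = {0: 40, 2: 9, 3: 1}
revised_p = {0: 0.7, 2: 0.2667, 3: 0.0333}

n = sum(observed.values())                              # sample size, 50
expected = {x: n * p for x, p in revised_p.items()}     # E_i = n * p_i

# Chi-square goodness-of-fit statistic: sum of (O_i - E_i)^2 / E_i.
chi_square = sum((observed[x] - expected[x]) ** 2 / expected[x] for x in observed)

critical_value = 5.991   # chi-square, 5% level, 2 degrees of freedom (from the text)
reject_h0 = chi_square > critical_value

print(f"chi-square = {chi_square:.4f}, reject H0: {reject_h0}")
```

The degrees of freedom are the number of categories minus one, that is 3 - 1 = 2.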

3.4 The Analysis of Sixty Four 3-Digit Random Numbers Generated by the Function
$1000\sqrt[5]{n}$

The sixty-four real numbers ranging from 0 to 1 that were randomly generated by Wolfram
Mathematica 7.0 are listed below:

0.708477 0.295863 0.971974 0.099977 0.962334 0.630794 0.379602 0.900808
0.217191 0.914257 0.156891 0.405242 0.23194 0.252731 0.369021 0.630475
0.845501 0.307553 0.397659 0.211588 0.993561 0.394024 0.297604 0.17321
0.428942 0.179826 0.291179 0.271801 0.250618 0.022988 0.760863 0.013218
0.904237 0.497101 0.001202 0.869522 0.927179 0.410847 0.135921 0.837393
0.760437 0.253564 0.639631 0.981092 0.06209 0.578899 0.994646 0.773488
0.669724 0.980035 0.180364 0.008676 0.898413 0.776303 0.951459 0.435168
0.367291 0.138426 0.519327 0.411859 0.848798 0.819825 0.566698 0.132321


With the aid of Microsoft Excel 2013, the raw numbers listed above are substituted into the
function $f(n) = 1000\sqrt[5]{n}$, and the results are tabulated in Table 8.

Table 8: Processing of raw numbers into a sample of sixty four 3-digit random numbers
No.  Raw Random Number, n  f(n)  int(f(n))  No.  Raw Random Number, n  f(n)  int(f(n))
1 0.708477 933.3943 933 33 0.962334 992.3507 992
2 0.217191 736.8318 736 34 0.23194 746.5779 746
3 0.845501 966.9919 966 35 0.993561 998.7089 998
4 0.428942 844.2668 844 36 0.250618 758.2326 758
5 0.904237 980.0686 980 37 0.927179 984.992 984
6 0.760437 946.7006 946 38 0.0620896 573.5929 573
7 0.669724 922.9521 922 39 0.898413 978.8028 978
8 0.367291 818.4687 818 40 0.848798 967.7449 967
9 0.295863 783.8232 783 41 0.630794 911.9636 911
10 0.914257 982.2311 982 42 0.252731 759.5069 759
11 0.307553 789.9216 789 43 0.394024 830.0505 830
12 0.179826 709.5296 709 44 0.0229878 470.2183 470
13 0.497101 869.5387 869 45 0.410847 837.0204 837
14 0.253564 760.0069 760 46 0.578899 896.4386 896
15 0.980035 995.9747 995 47 0.776303 950.6185 950
16 0.138426 673.3541 673 48 0.819825 961.0461 961
17 0.971974 994.3309 994 49 0.379602 823.8833 823
18 0.156891 690.4299 690 50 0.369021 819.2382 819
19 0.397659 831.5764 831 51 0.297604 784.7435 784
20 0.291179 781.3255 781 52 0.760863 946.8066 946
21 0.00120195 260.6017 260 53 0.135921 670.8993 670
22 0.639631 914.5046 914 54 0.994646 998.9269 998
23 0.180364 709.9536 709 55 0.951459 990.0976 990
24 0.519327 877.1789 877 56 0.566698 892.6277 892
25 0.0999769 630.9282 630 57 0.900808 979.3241 979
26 0.405242 834.724 834 58 0.630475 911.8714 911
27 0.211588 732.9903 732 59 0.17321 704.2301 704
28 0.271801 770.6376 770 60 0.0132181 420.9532 420
29 0.869522 972.425 972 61 0.837393 965.1301 965
30 0.981092 996.1895 996 62 0.773488 949.9281 949
31 0.00867574 386.9558 386 63 0.435168 846.7036 846
32 0.411859 837.4323 837 64 0.132321 667.3071 667
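
The processing step of Table 8 can be sketched in Python as below. The report itself used Microsoft Excel, so this is an assumed equivalent; the function name `three_digit_from_uniform` is introduced here for illustration only.

```python
def three_digit_from_uniform(n: float) -> int:
    """Integer part of f(n) = 1000 * n**(1/5) for a raw random number 0 <= n < 1."""
    return int(1000 * n ** (1 / 5))


# Four of the raw random numbers from Table 8 (entries 1, 2, 4 and 21).
raw_numbers = [0.708477, 0.217191, 0.428942, 0.00120195]
processed = [three_digit_from_uniform(n) for n in raw_numbers]

for n, value in zip(raw_numbers, processed):
    print(f"n = {n}: f(n) = {1000 * n ** (1 / 5):.4f}, int(f(n)) = {value}")
```

Because the fifth root pushes values towards 1, most of the processed numbers fall in the upper part of the range 0 to 999, as the 8-by-8 arrangement of the results also shows.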


For ease of visualization, the processed random numbers in Table 8 are further
arranged into an 8-by-8 array as follows:

933 783 994 630 992 911 823 979
736 982 690 834 746 759 819 911
966 789 831 732 998 830 784 704
844 709 781 770 758 470 946 420
980 869 260 972 984 837 670 965
946 760 914 996 573 896 998 949
922 995 709 386 978 950 990 846
818 673 877 837 967 961 892 667

Based on the 8-by-8 array listed above, the number of 3-digit numbers with three
different digits is 45. In order to determine whether the probability that a number has three
different digits is greater than the probability suggested in Section 2.5, which is p = 0.73,
hypothesis tests of $H_0: p = 0.73$ against $H_1: p > 0.73$ are conducted at the 10% and 5%
significance levels.

If $H_0$ is true, then $p = 0.73$. So, $X \sim B(64, 0.73)$. Since $n = 64$ (> 30),
$np = 64 \times 0.73 = 46.72$ (> 5) and $nq = 64 \times 0.27 = 17.28$ (> 5), the normal
approximation to the binomial distribution is employed with a continuity correction of 0.5.

Thus, the observed value 45 is corrected to $X = 44.5$, and

$X \sim N(64 \times 0.73,\ 64 \times 0.73 \times 0.27)$
$X \sim N(46.72,\ 12.61)$

Then,

$Z = \frac{X - 46.72}{\sqrt{12.61}}$

Test statistic:

$Z = \frac{44.5 - 46.72}{\sqrt{12.61}} = -0.6252$

For the hypothesis test at the 5% significance level, the critical region is $Z > 1.645$.
Since $Z_{\text{observed}} = -0.6252$ (< 1.645), the sample value 45 is not in the critical
region. Thus, there is insufficient evidence to reject $H_0$, and the suggested probability of
0.73 is retained.

Likewise, for the test at the 10% significance level, the upper critical point beyond
which the null hypothesis would be rejected is 1.282. Again, $Z_{\text{observed}} < 1.282$;
therefore, the suggested probability remains justified.
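
The one-tailed test above can be reproduced with the following Python sketch. It is an illustrative check with assumed variable names; the critical values are obtained from `statistics.NormalDist` in the standard library instead of the printed tables, so they carry more decimal places than 1.645 and 1.282.

```python
from math import sqrt
from statistics import NormalDist

n, p0 = 64, 0.73          # sample size and probability under H0
observed_count = 45       # numbers with three different digits in the 8-by-8 array

mean = n * p0                      # 46.72
variance = n * p0 * (1 - p0)       # about 12.61

# Upper-tailed test with continuity correction: P(X >= 45) uses 44.5.
z_observed = (observed_count - 0.5 - mean) / sqrt(variance)

# Reject H0 at level alpha only if z_observed exceeds the upper critical point.
decisions = {alpha: z_observed > NormalDist().inv_cdf(1 - alpha)
             for alpha in (0.05, 0.10)}

print(f"z = {z_observed:.4f}")
for alpha, reject in decisions.items():
    print(f"alpha = {alpha:.2f}: reject H0 = {reject}")
```

Since the test statistic is negative, the sample proportion is below 0.73 and no upper-tailed test could reject $H_0$ at any conventional significance level.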

4.0 CONCLUSIONS

The studies conducted above show that subjective probability does exist in real-life
applications, since it is significantly affected by past experience and one's
degree of belief. Although its determination does not rest on any standard mathematical
formulation, its outcome can be validated with the aid of inferential statistics,
in which interval estimation and hypothesis testing are conducted and inferences are made.
According to the statistical analyses conducted on the various samples above, it can be
concluded that the probability that a 3-digit number has three distinct digits is the highest
of the three categories, lying in the range 0.65 < P(X = 0) < 0.75. This agrees with a census
of the population of one thousand 3-digit numbers ranging from 000 to 999, in which the
probability of getting a number with three different digits also lies in the range from 0.65
to 0.75. Besides, the results obtained from the Chi-square and one-tailed hypothesis tests are
consistent with the confidence limits computed in Section 3.2. In conclusion, the inferences
drawn from the statistical analyses on the distribution of digits in random numbers are
accepted and properly justified.
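
The census mentioned above can be carried out directly, since the population has only one thousand members. The Python sketch below is an illustration added here, with assumed variable names; it counts each category over 000 to 999.

```python
from collections import Counter

# Count, for every number from 000 to 999, how many distinct digits it has.
distinct_counts = Counter(len(set(f"{number:03d}")) for number in range(1000))

# Map to the report's X values: 3 distinct digits -> X = 0, 2 -> X = 2, 1 -> X = 3.
census = {0: distinct_counts[3], 2: distinct_counts[2], 3: distinct_counts[1]}
exact_probabilities = {x: count / 1000 for x, count in census.items()}

for x, count in census.items():
    print(f"X = {x}: {count} numbers, P(X = {x}) = {exact_probabilities[x]}")
```

The count for three different digits is 10 × 9 × 8 = 720, so the population probability is 0.72, which lies inside the range 0.65 to 0.75 quoted in the conclusion.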

















5.0 REFERENCES

Anscombe, FJ & Aumann, RJ 1963, A Definition of Subjective Probability, The Annals of
Mathematical Statistics, vol. 34, No. 1, pp. 199-205, JSTOR, viewed 9 September
2013.

Bafna, A & Kumar, S 2012, ProActive Approach for Generating Random Passwords for
Information Protection, 2nd International Conference on Computer, Communication,
Control and Information Technology (C3IT-2012), vol. 4, pp. 129-133.

Benham, PP, Crawford, RJ & Armstrong, CG 1987, Mechanics of Engineering Materials, 2nd
edn., Pearson Longman Group Limited, China.

Biedermann, A, Garbolino, P & Taroni, F 2013, The subjectivist interpretation of probability
and the problem of individualization in forensic science, Science & Justice, vol. 53,
pp. 192-200, Elsevier ScienceDirect, viewed 9 September 2013.

Blackjack Info 2013, Blackjack Dealer Outcome Probabilities, BlackjackInfo.com
<www.blackjackinfo.com/bjtourn-dealercharts.php>, viewed 9 September 2013.

Chai, A 2012, Decision Making in HES3360 Human Factors [slide], Swinburne University of
Technology (Sarawak Campus), Kuching, Sarawak.

Christenstock 2013, Blackjack Using the Probability Theory to Increase your Odds,
Hubpages Inc., <christenstock.hubpages.com/hub/BlackJack---Using-The-Probability-
Theory>, viewed 9 September 2013.

Clemen, RT 1997, CHAPTER 8 Subjective Probability in Making Hard Decisions: An
Introduction to Decision Analysis (Business Statistics), 2nd edn., Duxbury.

Crawshaw, J & Chambers, J 2002, A CONCISE COURSE IN ADVANCED LEVEL
STATISTICS With Worked Examples, 4th edn., Nelson Thornes Ltd., United Kingdom.

Elsevier Inc. 2012, CHAPTER 11 RANDOM NUMBERS in Probability and Random
Processes, Academic Press.

Hibbeler, RC 2009, Structural Analysis, 7th edn., Pearson Prentice Hall, Singapore.

Jacob, SM 2011, Probability and Statistics: Sampling and Estimation in HMS211 Engineering
Mathematics 3A [slide], Swinburne University of Technology (Sarawak Campus),
Kuching, Sarawak.

Johnson, R 2005, Chapter 6 Sampling Distribution in Miller & Freund's Probability and
Statistics for Engineers, 7th edn., Pearson Prentice Hall, United States of America.

Kahneman, D & Tversky, A 1972, Subjective Probability: A Judgement of
Representativeness, Cognitive Psychology, vol. 3, pp. 430-454.

Kyburg, H 1978, SUBJECTIVE PROBABILITY: CRITICISMS, REFLECTIONS, AND
PROBLEMS, Journal of Philosophical Logic, vol. 7, pp. 157-180, D. Reidel
Publishing Company, Dordrecht, Holland.

Machina, MJ & Schmeidler, D 1992, A MORE ROBUST DEFINITION OF SUBJECTIVE
PROBABILITY, Econometrica, vol. 60, pp. 745-780.

Marsaglia, G & Zaman, A 1993, Monkey Tests for Random Number Generators, Computers &
Mathematics with Applications, vol. 26, pp. 1-10, Elsevier ScienceDirect, viewed 9
September 2013.

Maskalevich, T 2011, Probability involving sampling without replacement and dependent
trials in Math 728 Lesson Plan.

Soon, CL & Lau, TK 2013, CHAPTER 16 Sampling and Estimation in Pre-U Text STPM
Mathematics (T) Third Term, Pearson Malaysia Sdn. Bhd.

Thomson, WT & Dahleh, MD 1998, Theory of Vibration with Applications, 5th edn., Pearson
Prentice-Hall, Inc., United States of America.

Velasco, JD 2012, Objective and subjective probability in gene expression, Progress in
Biophysics and Molecular Biology, vol. 110, pp. 5-10, Elsevier ScienceDirect, viewed
9 September 2013.

Wallsten, TS & Diederich, A 2001, Understanding pooled subjective probability estimates,
Mathematical Social Sciences, vol. 41, pp. 1-18, Elsevier ScienceDirect, viewed 9
September 2013.

Weir, BS 2013, DNA Statistical Probability in Encyclopedia of Forensic Sciences, 2nd edn.,
University of Washington, Seattle, WA, USA, Elsevier Ltd.

Whitlow, R 2000, Basic Soil Mechanics, 4th edn., Pearson Prentice Hall, Malaysia.

Wolfram 2013, Why Mathematica? Wolfram Mathematica 9, Wolfram Inc.,
<www.wolfram.com>, viewed 13 September 2013.

Wong, BT 2012, Surface Resistance Part II in HES5340 Fluid Mechanics 2 [slide],
Swinburne University of Technology (Sarawak Campus), Kuching, Sarawak.










APPENDICES

Appendix 1 Computational Codes

1. The computational code used to generate one hundred 3-digit random numbers ranging
from 0 to 999 in Wolfram Mathematica 7.0 is:

RandomInteger[999,100]

The first argument in the square brackets gives the upper limit of the range, which
runs from 0 to 999 in this work, while the second argument is the number of random
integers to be generated, which is set to 100. The general
code is: RandomInteger[domain,n]

2. The computational code used to generate sixty-four random real numbers ranging from 0
to 1 in Wolfram Mathematica 7.0 is:

RandomReal[1,64]

3. The integer parts of the sixty-four real numbers generated by the function
$f(n) = 1000\sqrt[5]{n}$ are obtained by the following formula in Microsoft Excel:

$\operatorname{int}(f(n)) =$
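
For readers without Mathematica or Excel, the three steps listed above can be approximated in Python. This is a rough analogue and not the code used in the report: it relies on Python's standard `random` module, the seed 2013 is an arbitrary choice made here for repeatability, and the generated values will differ from those in the report.

```python
import random

rng = random.Random(2013)   # arbitrary seed, chosen only so that reruns give the same values

# Analogue of RandomInteger[999,100]: one hundred integers from 0 to 999 inclusive.
random_integers = [rng.randint(0, 999) for _ in range(100)]

# Analogue of RandomReal[1,64]: sixty-four real numbers in the interval [0, 1).
random_reals = [rng.random() for _ in range(64)]

# Integer part of f(n) = 1000 * n**(1/5) for each real number.
processed = [int(1000 * n ** (1 / 5)) for n in random_reals]

# Print the integers as zero-padded 3-digit numbers, ten per row.
for start in range(0, 100, 10):
    print(" ".join(f"{k:03d}" for k in random_integers[start:start + 10]))
```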


























Appendix 2 List of Statistical Tables

Table 9: Random Number Tables

Table 10: Percentage Points of the $\chi^2$ Distribution

Table 11: The Upper Tail Probabilities ( ) z u of the Standard Normal Distribution X~N(0, 1)

Appendix 3 Screenshots of Random Numbers Generated from Wolfram Mathematica
7.0


Figure 4: The screenshot of random generation of one hundred 3-digit numbers ranged from 0
to 999 by Wolfram Mathematica 7.0


Figure 5: The screenshot of random generation of sixty four real numbers ranged from 0 to 1
by Wolfram Mathematica 7.0
