NPTEL

Course On

STRUCTURAL RELIABILITY
Module # 02 Lecture 6
Course Format: Web

Instructor: Dr. Arunasis Chakraborty Department of Civil Engineering Indian Institute of Technology Guwahati

6. Lecture 06: Hypothesis Testing

Tests of Distributions

The data available to us generally comes from experimental observations or recorded signals, and it is not bound to follow any mathematical probability distribution model exactly. It may, however, approximately match a defined probability distribution. Data analysis is therefore required to check whether the data follows an assumed distribution and, if so, at what level of significance. The following sections discuss two popular distribution tests: the Chi Square Test and the Kolmogorov-Smirnov Test.

Chi Square Test

The Chi square test is used to analyze whether a given random sample follows a theoretically defined probability distribution. The basic idea is to evaluate the cumulative error between the probability density of the random sample (its histogram) and the theoretical probability distribution. The steps for conducting the Chi square test, for an assumed probability distribution and its statistical parameters, are stated below.

Step 1. First, the null and alternative hypotheses are formulated based on the probability distribution and statistical parameters of the random data. The two hypotheses can be written as

$$H_0 : X \sim f_X(x) \tag{2.6.1}$$

$$H_A : X \nsim f_X(x) \tag{2.6.2}$$

where $H_0$ and $H_A$ denote the null and alternative hypotheses, respectively, $X$ is the random variable and $f_X(x)$ is the assumed probability distribution. Rejection of the null hypothesis means that the given data does not follow the assumed distribution, the assumed parameters, or both.

Step 2. The null hypothesis is defined by assuming an appropriate model for the observed data. The model must be specified by a probability distribution and the corresponding statistical parameters; it is the basis for estimating the expected frequencies. Under the null hypothesis, the Chi square test compares the observed frequencies with the frequencies expected as per the assumed model. The test statistic is expressed as

$$\chi^2 = \sum_{i=1}^{k} \frac{(O_i - E_i)^2}{E_i} \tag{2.6.3}$$

where $\chi^2$ follows the chi square distribution, $O_i$ and $E_i$ are the observed frequency of the given sample and the expected frequency of the assumed model in the $i$th interval, respectively, and $k$ is the total number of intervals of the histogram.

Step 3. The level of significance is selected based on the importance or priority of the data. Usually, a level of significance of 5% is selected for general data. This value can be reduced for higher priority or critical data.

Step 4. A histogram of the observed data is formed to evaluate the observed frequencies. Similarly, the expected frequencies are evaluated from the assumed probability distribution and its parameters.

Step 5. Acceptance or rejection of the null hypothesis depends on the degrees of freedom of the chi square distribution, the level of significance and the computed chi square value. The degrees of freedom and the level of significance together define a region of rejection: the hypothesis is rejected if the computed $\chi^2$ exceeds a critical $\chi^2$ value. The degrees of freedom are defined as $(k - m)$, where $m$ is the number of quantities estimated from the given sample for use in calculating the expected frequencies. Generally, these quantities are the number of observations, the mean and the standard deviation of the sample data; hence, a total of 3 is subtracted (i.e., $k - 3$). Note that if only the number of observations is used, the degrees of freedom increase to $(k - 1)$. For given degrees of freedom and level of significance, the critical $\chi^2$ value can be read from a $\chi^2$ value table. This is illustrated in the following example.

Step 6. If the null hypothesis is rejected, a different model is assumed and the Chi square test is conducted again from Step 1.
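The steps above can be sketched in a few lines of Python. This is a minimal illustration, not the lecture's own implementation: the helper names `expected_frequencies` and `chi_square_statistic`, the uniform model and the observed counts are all hypothetical and chosen only so that the arithmetic is easy to follow. Expected frequencies are taken as the sample size times the model probability of each interval.

```python
def expected_frequencies(n, cdf, edges):
    """Expected count per interval: n * [F(b) - F(a)] for consecutive edges a, b."""
    return [n * (cdf(b) - cdf(a)) for a, b in zip(edges[:-1], edges[1:])]


def chi_square_statistic(observed, expected):
    """Test statistic of Eq. 2.6.3: sum of (O_i - E_i)^2 / E_i over all intervals."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical illustration (not the lecture data): 40 observations tested
# against a uniform model on [0, 1] with four equal intervals.
n = 40
edges = [0.0, 0.25, 0.5, 0.75, 1.0]
uniform_cdf = lambda x: min(max(x, 0.0), 1.0)

expected = expected_frequencies(n, uniform_cdf, edges)
observed = [12, 8, 11, 9]  # assumed histogram counts
chi2 = chi_square_statistic(observed, expected)

# Only the number of observations is taken from the sample here, so dof = k - 1.
dof = len(observed) - 1
print(expected, chi2, dof)
```

The computed statistic is then compared with the critical value for the chosen level of significance and degrees of freedom, as described in Step 5.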

Example

Ex # 01. A series of random data of sample size 40, obtained from an experiment, is given below. The assumed probability distribution model is the exponential distribution, and the levels of significance are 5% and 1%. Check whether the null hypothesis is rejected or accepted by conducting the Chi square test.

0.2049  0.0434  0.8633  0.0989  0.0357  0.0880  2.0637  1.8476
0.2329  0.0906  0.0298  0.0414  0.4583  0.0438  0.4220  2.3275
0.7228  3.3323  1.2783  0.2228  0.1635  0.6035  1.9527  0.0683
0.3875  1.2840  0.2774  3.0754  0.2969  2.3317  0.9359  2.8821
0.4224  2.4319  1.7650  1.1098  0.3481  3.3258  3.4473  0.1206

Solu. Initially, the null and alternative hypotheses are defined as

$$H_0 : X \sim f_X(x), \qquad H_A : X \nsim f_X(x)$$

where $f_X(x)$ is the probability density function [$\lambda \exp(-\lambda x)$] of the exponential distribution. Before forming a histogram, one must find the mean, the standard deviation, the number of intervals and the interval class. The mean of the observed data is evaluated as

$$\mu = \frac{1}{n} \sum_{i=1}^{n} x_i = 1.0420$$

For the exponential distribution the standard deviation is equal to the mean. The parameter $\lambda$ is evaluated from the mean as

$$\lambda = \frac{1}{\mu} = 0.9597$$

The number of intervals can be evaluated from the sample size $n$ as shown below

$$k = 1 + 3.3 \log_{10} n = 1 + 3.3 \log_{10} 40 = 6.2868 \approx 7$$
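The preliminary quantities can be checked with a short script. This is a sketch using only the Python standard library; the variable names are illustrative, and rounding the number of intervals up to the next integer is an assumption that matches the value 7 used in the text.

```python
import math

# Sample of Ex # 01 (40 values, in the order given in the example).
data = [
    0.2049, 0.0434, 0.8633, 0.0989, 0.0357, 0.0880, 2.0637, 1.8476,
    0.2329, 0.0906, 0.0298, 0.0414, 0.4583, 0.0438, 0.4220, 2.3275,
    0.7228, 3.3323, 1.2783, 0.2228, 0.1635, 0.6035, 1.9527, 0.0683,
    0.3875, 1.2840, 0.2774, 3.0754, 0.2969, 2.3317, 0.9359, 2.8821,
    0.4224, 2.4319, 1.7650, 1.1098, 0.3481, 3.3258, 3.4473, 0.1206,
]

n = len(data)
mean = sum(data) / n               # sample mean
rate = 1.0 / mean                  # exponential parameter, lambda = 1 / mean
k_raw = 1 + 3.3 * math.log10(n)    # number of intervals before rounding
k = math.ceil(k_raw)               # rounded up (assumed convention)
class_width = (max(data) - min(data)) / k

print(f"n = {n}, mean = {mean:.4f}, lambda = {rate:.4f}")
print(f"k = {k_raw:.4f} -> {k}, class width = {class_width:.4f}")
```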

The interval class is selected from the difference between the minimum value 0.0298 and the maximum value 3.4473 of the above sample, divided by the number of intervals. Thus, the interval class comes out to be

$$\text{Class} = \frac{3.4473 - 0.0298}{7} = 0.4882 \approx 0.5$$

Now, the histogram for the observed data is formed as shown in the table below, where $O_i$ is the observed frequency of the class $a < x < b$ and $E_i$ is the corresponding expected frequency of the assumed exponential model.

Class interval    O_i    E_i        (O_i - E_i)^2 / E_i
< 0.5             21     16.9249    0.9812
0.5 - 1.0          4     10.0522    3.6439
1.0 - 1.5          3      5.9703    1.4778
1.5 - 2.0          3      3.5459    0.0840
2.0 - 2.5          4      2.1060    1.7033
2.5 - 3.0          1      1.2508    0.0503
> 3.0              4      1.8295    2.5751
Total             40                10.5155

The sum of the last column gives $\chi^2 = 10.5155$.
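The observed frequencies and the tabulated statistic can be reproduced with the sketch below. Two things are assumptions of this illustration: the class edges are taken as multiples of the 0.5 class width starting at zero, with the last class open-ended, and the expected frequencies are copied from the table rather than recomputed.

```python
import bisect

# Sample of Ex # 01.
data = [
    0.2049, 0.0434, 0.8633, 0.0989, 0.0357, 0.0880, 2.0637, 1.8476,
    0.2329, 0.0906, 0.0298, 0.0414, 0.4583, 0.0438, 0.4220, 2.3275,
    0.7228, 3.3323, 1.2783, 0.2228, 0.1635, 0.6035, 1.9527, 0.0683,
    0.3875, 1.2840, 0.2774, 3.0754, 0.2969, 2.3317, 0.9359, 2.8821,
    0.4224, 2.4319, 1.7650, 1.1098, 0.3481, 3.3258, 3.4473, 0.1206,
]

# Lower edge of each class; the last class (3.0 and above) is open-ended.
lower_edges = [0.0, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0]

observed = [0] * len(lower_edges)
for x in data:
    # Index of the last class whose lower edge does not exceed x.
    observed[bisect.bisect_right(lower_edges, x) - 1] += 1

# Expected frequencies as listed in the table (copied, not recomputed).
expected_tabulated = [16.9249, 10.0522, 5.9703, 3.5459, 2.1060, 1.2508, 1.8295]

chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected_tabulated))
print(observed, round(chi2, 4))
```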

Now, the degrees of freedom for this example are evaluated as $7 - 3 = 4$. For this value and the chosen level of significance, the critical $\chi^2$ value is 9.488 (for $\alpha = 5\%$) and 13.277 (for $\alpha = 1\%$). Since the computed value $\chi^2 = 10.5155$ exceeds 9.488 but not 13.277, the null hypothesis is rejected at the 5% level of significance, whereas at the 1% level of significance it is accepted.
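The table lookup can be cross-checked without statistical libraries. The sketch below relies on a standard closed form that is not part of the lecture: for an even number of degrees of freedom $\nu = 2m$, the upper tail probability of the chi square distribution is $\exp(-x/2) \sum_{j=0}^{m-1} (x/2)^j / j!$. The function name `chi2_upper_tail_even_dof` is hypothetical, and the statistic 10.5155 is taken from the example.

```python
import math


def chi2_upper_tail_even_dof(x, dof):
    """P(chi-square with `dof` degrees of freedom > x); valid for even dof only."""
    if dof <= 0 or dof % 2 != 0:
        raise ValueError("closed form used here requires a positive even dof")
    half = x / 2.0
    return math.exp(-half) * sum(half ** j / math.factorial(j) for j in range(dof // 2))


chi2_observed = 10.5155   # statistic from the example
dof = 4                   # 7 intervals minus 3

p_value = chi2_upper_tail_even_dof(chi2_observed, dof)
reject_at_5_percent = p_value < 0.05
reject_at_1_percent = p_value < 0.01
print(p_value, reject_at_5_percent, reject_at_1_percent)
```

Comparing the tail probability with the level of significance is equivalent to comparing the statistic with the critical value from the table.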

Kolmogorov-Smirnov Test

The Chi square test considers the probability density, whereas the Kolmogorov-Smirnov (KS) test considers the cumulative distribution function. The idea behind the KS test is to determine the maximum absolute difference between the cumulative distribution of the given random data and that of the model assumed under the null hypothesis. The steps for conducting the KS test on a given random sample, with an assumed model and its parameters, are explained below.

Step 1. As in the Chi square test, the null and alternative hypotheses are formulated in terms of the probability distribution and statistical parameters of the random data. The level of significance is also selected (generally, $\alpha = 5\%$).

Step 2. The cumulative distribution of the observed sample is calculated as shown in the equation below

$$F_n(x) = \begin{cases} 0 & \text{for } x < x_1 \\ \dfrac{i}{n} & \text{for } x_i \le x < x_{i+1} \\ 1 & \text{for } x \ge x_n \end{cases} \tag{2.6.4}$$

where $x_i$ is the random data placed in ascending order, $n$ is the sample size and $i$ ranges over $1, 2, \ldots, n$.

Step 3. The cumulative distribution of the random sample as per the assumed probability distribution and its parameters, i.e. $F_X(x_i)$, is calculated.

Step 4. Finally, the maximum absolute difference between the observed and expected cumulative distribution functions is evaluated as shown below

$$\mathrm{KS} = \max\left\{ |F_X(x_1) - F_n(x_0)|,\ |F_X(x_1) - F_n(x_1)|,\ |F_X(x_2) - F_n(x_1)|,\ |F_X(x_2) - F_n(x_2)|,\ \ldots,\ |F_X(x_n) - F_n(x_{n-1})|,\ |F_X(x_n) - F_n(x_n)| \right\} \tag{2.6.5}$$

with $F_n(x_0) = 0$.

Step 5. The critical KS value for the given $\alpha$ and $n$ is obtained from a KS value table, for comparison with the observed KS value evaluated as per Eq. 2.6.5.

Step 6. As in the Chi square test, the null hypothesis is rejected if the computed KS value is greater than the critical value from Step 5.
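A minimal sketch of Steps 2 to 4 is given below. The function name `ks_statistic` and the four-point sample tested against a uniform model are hypothetical and serve only to show the mechanics: at each ordered point the model CDF is compared with the empirical CDF just before and just after the jump.

```python
def ks_statistic(sample, cdf):
    """Maximum absolute difference between the empirical CDF and a model CDF.

    At each ordered value x_i the model CDF is compared with both
    (i - 1) / n and i / n, i.e. with the empirical CDF on either side
    of its jump at x_i.
    """
    xs = sorted(sample)
    n = len(xs)
    d_max = 0.0
    for i, x in enumerate(xs, start=1):
        f_model = cdf(x)
        d_max = max(d_max, abs((i - 1) / n - f_model), abs(i / n - f_model))
    return d_max


# Hypothetical illustration: four points tested against a uniform model on [0, 1].
sample = [0.7, 0.1, 0.9, 0.4]
uniform_cdf = lambda x: min(max(x, 0.0), 1.0)

d = ks_statistic(sample, uniform_cdf)
print(d)
```

For another model, only the `cdf` argument changes; for an exponential model it would be a function returning `1 - exp(-rate * x)`.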

Example

Ex # 02. Considering the data of Ex # 01, check whether the null hypothesis is rejected or accepted by conducting the Kolmogorov-Smirnov test. For ease, the random data is arranged in ascending order.

0.0298  0.0357  0.0414  0.0434  0.0438  0.0683  0.0880  0.0906
0.0989  0.1206  0.1635  0.2049  0.2228  0.2329  0.2774  0.2969
0.3481  0.3875  0.4220  0.4224  0.4583  0.6035  0.7228  0.8633
0.9359  1.1098  1.2783  1.2840  1.7650  1.8476  1.9527  2.0637
2.3275  2.3317  2.4319  2.8821  3.0754  3.3258  3.3323  3.4473

Solu. The null and alternative hypotheses, the mean of the observed data and the parameter $\lambda$ are taken from Ex # 01. To perform the KS test, one has to evaluate the cumulative distribution of the observed sample as per Eq. 2.6.4, i.e. $F_n(x_i)$, and the cumulative distribution of the assumed model, $F_X(x_i)$. Both are given in the table below, together with the absolute differences $D_i^- = |F_n(x_{i-1}) - F_X(x_i)|$ and $D_i^+ = |F_n(x_i) - F_X(x_i)|$.

Rank   x_i      F_n(x_i)   F_X(x_i)   D_i^-    D_i^+
 1     0.0298   0.0250     0.0306     0.0306   0.0056
 2     0.0357   0.0500     0.0365     0.0115   0.0135
 3     0.0414   0.0750     0.0422     0.0078   0.0328
 4     0.0434   0.1000     0.0442     0.0308   0.0558
 5     0.0438   0.1250     0.0446     0.0554   0.0804
 6     0.0683   0.1500     0.0687     0.0563   0.0813
 7     0.0880   0.1750     0.0876     0.0624   0.0874
 8     0.0906   0.2000     0.0901     0.0849   0.1099
 9     0.0989   0.2250     0.0979     0.1021   0.1271
10     0.1206   0.2500     0.1181     0.1069   0.1319
11     0.1635   0.2750     0.1566     0.0934   0.1184
12     0.2049   0.3000     0.1922     0.0828   0.1078
13     0.2228   0.3250     0.2072     0.0928   0.1178
14     0.2329   0.3500     0.2155     0.1095   0.1345
15     0.2774   0.3750     0.2510     0.0990   0.1240
16     0.2969   0.4000     0.2661     0.1089   0.1339
17     0.3481   0.4250     0.3042     0.0958   0.1208
18     0.3875   0.4500     0.3322     0.0928   0.1178
19     0.4220   0.4750     0.3558     0.0942   0.1192
20     0.4224   0.5000     0.3560     0.1190   0.1440
21     0.4583   0.5250     0.3797     0.1203   0.1453
22     0.6035   0.5500     0.4668     0.0582   0.0832
23     0.7228   0.5750     0.5291     0.0209   0.0459
24     0.8633   0.6000     0.5932     0.0182   0.0068
25     0.9359   0.6250     0.6229     0.0229   0.0021
26     1.1098   0.6500     0.6854     0.0604   0.0354
27     1.2783   0.6750     0.7360     0.0860   0.0610
28     1.2840   0.7000     0.7376     0.0626   0.0376
29     1.7650   0.7250     0.8410     0.1410   0.1160
30     1.8476   0.7500     0.8541     0.1291   0.1041
31     1.9527   0.7750     0.8693     0.1193   0.0943
32     2.0637   0.8000     0.8835     0.1085   0.0835
33     2.3275   0.8250     0.9115     0.1115   0.0865
34     2.3317   0.8500     0.9119     0.0869   0.0619
35     2.4319   0.8750     0.9207     0.0707   0.0457
36     2.8821   0.9000     0.9504     0.0754   0.0504
37     3.0754   0.9250     0.9594     0.0594   0.0344
38     3.3258   0.9500     0.9687     0.0437   0.0187
39     3.3323   0.9750     0.9689     0.0189   0.0061
40     3.4473   1.0000     0.9725     0.0025   0.0275

$$\max_i \left( |F_n(x_{i-1}) - F_X(x_i)|,\ |F_n(x_i) - F_X(x_i)| \right) = 0.1453$$

The KS value observed from the random data is 0.1453. The critical KS values for sample size $n = 40$ and the levels of significance 5% and 1% are evaluated as

$$\mathrm{KS}_{5\%} = \frac{1.36}{\sqrt{n}} = 0.2150, \qquad \mathrm{KS}_{1\%} = \frac{1.63}{\sqrt{n}} = 0.2577, \qquad \text{for } n > 35$$

Thus, according to the KS test, the null hypothesis is accepted in both cases, as the observed value is less than the critical values.
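The comparison with the critical values can be sketched as follows. The coefficients 1.36 and 1.63 and the observed value 0.1453 are taken from the example; the function name `ks_critical_large_n` is hypothetical, and the guard on the sample size simply mirrors the condition n > 35 stated in the text.

```python
import math


def ks_critical_large_n(n, coefficient):
    """Large-sample critical KS value, coefficient / sqrt(n), used for n > 35."""
    if n <= 35:
        raise ValueError("this approximation is used in the text only for n > 35")
    return coefficient / math.sqrt(n)


n = 40
ks_observed = 0.1453  # maximum difference reported in the example

critical_5 = ks_critical_large_n(n, 1.36)  # 5% level of significance
critical_1 = ks_critical_large_n(n, 1.63)  # 1% level of significance

reject_at_5_percent = ks_observed > critical_5
reject_at_1_percent = ks_observed > critical_1
print(round(critical_5, 4), round(critical_1, 4), reject_at_5_percent, reject_at_1_percent)
```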
