
Application of Statistical Concepts in the Determination of Weight Variation in Samples

Raffi R. Isah, Institute of Chemistry, University of the Philippines, Diliman, Quezon City 1101, Philippines. Date Performed: June 20, 2012; Date Submitted: June 29, 2012

The experiment aims to apply statistical concepts in quantitative analysis. Ten samples of 1-peso coins were weighed and subjected to different statistical parameters, and two data sets that differ in their number of values were prepared. Through these parameters, different conclusions can be made. Measures of precision and accuracy were evaluated, and the mean is a central element in discussing both. The pooled standard deviation also sharpens the results, since different sets are compared. Confidence limits were found to be narrower when more samples were measured, consistent with the rule that fewer measurements make results less reliable. Q-tests were performed to decide whether or not to reject a given value; no values were rejected, supporting the precision of the data. Normal distribution graphs showed that the data were distributed symmetrically around the mean.

Introduction

Quantitative analysis in chemistry deals with the determination and measurement of the analyte(s) in a given sample. Hence, it is important that the results yielded by a scientist be reliable enough that the interpretations drawn from them are useful. However, every experiment carries errors and uncertainties, whether from the instrument itself or from the experimenter, that are impossible to eradicate completely. Statistical analysis is therefore integrated into analytical chemistry to analyse data and to serve as a tool for drawing sharp conclusions.

It is noted that every experiment allows only a limited number of measurements (a sample) rather than a whole population, which would require an infinite number of measurements. Therefore, the population mean, μ, cannot be experimentally determined, but the sample mean (or simply mean), x̄, can be determined and is considered experimentally to be the true value, since by confidence intervals both are expected to lie in the same interval of high probability.

The mean is a measure of central tendency, and it is usually more reliable than the individual values. It is used to analyse variations in the values. The mean is given by the equation

x̄ = (x₁ + x₂ + x₃ + … + xₙ)/n    (1)

where x̄ is the mean, xᵢ represents each value in the sample, and n is the total number of values in the sample.

The median, M, is also used as a measure of central tendency. With the values of a sample arranged in increasing or decreasing order, the median is the middle value when the number of values is odd, and the average of the middle pair when it is even.

The mean is used in measures of precision. Precision is the reproducibility of the values of a sample: if the values are close enough to each other (regardless of how close they are to the true value), they are precise. Several terms are used to describe the precision of a set of values. In this experiment, the measures used are the standard deviation, relative standard deviation (in ppt), range, relative range (in ppt), and pooled standard deviation (used among samples).

The standard deviation, s, measures how much the measurements vary from the mean; hence, the lower the standard deviation, the closer the measurements are to the mean. It is given by the equation

s = √[ Σᵢ (xᵢ − x̄)² / (n − 1) ]    (2)

where s is the standard deviation, x̄ is the mean, and n is the number of values in the sample.

The variance, a measure of how far the values are scattered, is simply the square of the standard deviation:

s² = Σᵢ (xᵢ − x̄)² / (n − 1)    (3)

where s² is the variance, x̄ is the mean, and n is the number of values in the sample.

The relative standard deviation (RSD) expresses the standard deviation relative to the mean rather than in absolute terms and is given by (in ppt)

RSD = (s / |x̄|) × 1000    (4)

The range, R, is the difference between the highest and the lowest value in the set:

R = xₘₐₓ − xₘᵢₙ    (5)

The relative range (in ppt) is given by the equation

RR = (R / |x̄|) × 1000    (6)

The pooled standard deviation, s_pooled, is used when there is more than one subset of data (in this case, two sets) of the same population, to get a better estimate of the population standard deviation (since s alone is only a rough estimate of it). The equation is

s_pooled = √[ (Σᵢ (xᵢ − x̄₁)² + Σⱼ (xⱼ − x̄₂)² + Σₖ (xₖ − x̄₃)² + …) / (n₁ + n₂ + n₃ + … − Nₛ) ]    (7)

where n₁ is the number of values in set 1, n₂ is the number of values in set 2, and so on, and Nₛ is the number of sets pooled.

In terms of accuracy, on the other hand, the parameters used measure the closeness of the experimental values to the true or accepted value and are reported as errors. These parameters are the absolute error and the relative error.

Figure 1. Accuracy vs Precision

The absolute error, E, is the difference between the observed and the true value and is given by the equation

E = xᵢ − xₜ    (8)

where xᵢ is the experimental value and xₜ is the true/accepted value.

The relative error is a more useful tool than the absolute error since it is a relative quantity. Its equation (in ppt) is

Eᵣ = ((xᵢ − xₜ) / xₜ) × 1000    (9)

Inaccurate results are explained by experimental errors, of which there are three types: random (indeterminate), systematic (determinate), and gross errors. Random errors show up in the scatter of the values; hence, the less precise the values are, the more random error is present. Systematic errors are the reason the mean of a set differs from the true value; examples are instrumental, method, and personal errors. Gross errors, which also arise from human error, are the reason for outliers, values that appear too different from the other values.

As said earlier, the population mean cannot be determined since it would involve infinite measurements, so the sample mean is taken as the accepted value. The confidence interval, bounded by its confidence limits, indicates that the accepted mean and the population mean lie within a region of high probability, hence making them approximately equal. The formula for the confidence limits is

CL = x̄ ± ts/√n    (10)
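The precision measures of equations (1)–(7) and the confidence half-width of equation (10) can be sketched in a few lines of Python. This is an illustrative sketch, not part of the original report; the `weights` list assumes the ten coin masses reported in Table 1, and all function names are ours.

```python
import math

# Ten 1-peso coin masses in grams (the values reported in Table 1)
weights = [5.3599, 5.4370, 5.3179, 5.4484, 5.4056,
           5.4592, 5.4291, 5.4420, 5.4454, 5.3655]

def mean(xs):                       # equation (1)
    return sum(xs) / len(xs)

def std_dev(xs):                    # equation (2), n - 1 in the denominator
    m = mean(xs)
    return math.sqrt(sum((x - m) ** 2 for x in xs) / (len(xs) - 1))

def rsd_ppt(xs):                    # equation (4)
    return std_dev(xs) / abs(mean(xs)) * 1000

def data_range(xs):                 # equation (5)
    return max(xs) - min(xs)

def rel_range_ppt(xs):              # equation (6)
    return data_range(xs) / abs(mean(xs)) * 1000

def pooled_std_dev(*sets):          # equation (7)
    ss = sum(sum((x - mean(s)) ** 2 for x in s) for s in sets)
    dof = sum(len(s) for s in sets) - len(sets)
    return math.sqrt(ss / dof)

def confidence_half_width(xs, t):   # equation (10), t taken from Table 4
    return t * std_dev(xs) / math.sqrt(len(xs))

print(round(mean(weights), 4))        # 5.411
print(round(data_range(weights), 4))  # 0.1413
```

Note that `std_dev` follows equation (2) exactly (denominator n − 1); a denominator of n, as in the population formula, would give slightly smaller values.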

where t is the tabulated value (refer to the Appendix) and (n − 1) is the degrees of freedom of the system.

Experimental Details

Ten (10) random samples of 1-peso coins were prepared for weighing using the analytical balance (no. 11). In general, an analytical balance is an instrument used to determine the mass of a sample. The analytical balance used can weigh samples ranging from 0.0001 g to 200 g and has an instrument error of ±0.0002 g.

Figure 2. A typical electronic analytical balance.

The method used for weighing each coin was weighing by difference. First, the watch glass with the 10 coins was weighed; the balance was then tared so that when a coin was removed, the reading reflected the weight of the coin removed, since it was subtracted from the preceding weight. The process was repeated successively for each coin. It should also be noted that forceps should be used when removing a coin from the watch glass inside the analytical balance: an analytical balance is very sensitive, and any moisture or dirt from bare hands may add to the recorded weight of the coin.

Results and Discussion

The following data were recorded after weighing each of the ten 1-peso coins:

Table 1. Weight of Coins
Coin    Weight (g), ±0.0002 g
1       5.3599
2       5.4370
3       5.3179
4       5.4484
5       5.4056
6       5.4592
7       5.4291
8       5.4420
9       5.4454
10      5.3655

The data were divided into two sets: data set 1 included coins 1-10, while data set 2 included coins 1-6.
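The bookkeeping behind weighing by difference can be sketched as a toy model. The three masses and the re-taring loop below are illustrative assumptions, not recorded balance output: the balance is tared at the current load, so once a coin is lifted off, the magnitude of the reading equals the mass of the coin removed.

```python
# Toy model of weighing by difference with hypothetical coin masses (g).
coins_on_glass = [5.3599, 5.4370, 5.3179]

tare = sum(coins_on_glass)              # balance tared at the full load
readings = []
for coin in list(coins_on_glass):       # lift the coins off one at a time
    coins_on_glass.remove(coin)
    load = sum(coins_on_glass)
    readings.append(abs(load - tare))   # display = load minus tare point
    tare = load                         # re-tare before the next removal

print(readings)  # recovers the individual coin masses
```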

In order to weed out questionable values, a Q-test was performed on each. The Q-test is a statistical test that determines whether a value, possibly an outlier, should be retained or rejected for further calculations. The formula is

Q_exp = |x_q − x_n| / R    (11)

where x_q is the questionable value, x_n is the value numerically nearest to x_q, and R is the range of the set. Q_exp is then compared with the critical value Q_crit (see Appendix). If Q_exp > Q_crit, the corresponding value should be rejected; otherwise, it is retained. After the Q-test was applied to each value, none exceeded Q_crit = 0.468, so all values were retained for further calculations.

The following statistical parameters were computed for data set 1:

Table 2. Statistical Parameters of Data Set 1
Parameter                            Value
Mean                                 5.4110
Standard deviation                   0.044998
Relative standard deviation (ppt)    8.3160
Range                                0.14130
Relative range (ppt)                 26.113
Confidence limits (95%)              5.4110 ± 0.0322
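As an illustration, equation (11) can be applied to the two extreme values of data set 1 in a short sketch. The data and the critical value 0.468 are those reported in the text; the function and variable names are ours.

```python
# Dixon's Q-test (equation 11) on the extreme values of data set 1.
# Q_CRIT = 0.468 is the 95% critical value for n = 10 (Table 5).
weights = [5.3599, 5.4370, 5.3179, 5.4484, 5.4056,
           5.4592, 5.4291, 5.4420, 5.4454, 5.3655]
Q_CRIT = 0.468

def q_exp(values, suspect):
    """|x_q - x_n| / R, with x_n the value numerically nearest x_q."""
    r = max(values) - min(values)
    nearest = min((x for x in values if x != suspect),
                  key=lambda x: abs(x - suspect))
    return abs(suspect - nearest) / r

for suspect in (min(weights), max(weights)):
    q = q_exp(weights, suspect)
    verdict = "reject" if q > Q_CRIT else "retain"
    print(f"{suspect}: Q_exp = {q:.3f} -> {verdict}")
```

Both extremes give Q_exp well below 0.468 (about 0.297 for 5.3179 and 0.076 for 5.4592), matching the report's conclusion that no value is rejected.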

Figure 3. Normal Distribution Curve of Data Set 1

The mean and standard deviation of data set 1 were interpreted through a normal distribution curve. It can be noted that the two halves of the graph are symmetrical to each other; therefore, a normal distribution is achieved. Also, the 95% interval occupies most of the area under the curve, so there is a high probability, given by the confidence limits, that the true value lies in the same interval as the estimated mean.

Data set 2, on the other hand, yielded results similar to those of set 1. No value was removed by the Q-test.

Table 3. Statistical Parameters of Data Set 2
Parameter                            Value
Mean                                 5.4047
Standard deviation                   0.050770
Relative standard deviation (ppt)    9.3937
Range                                0.14130
Relative range (ppt)                 26.144
Confidence limits (95%)              5.4047 ± 0.0533

Figure 4. Normal Distribution Curve of Data Set 2

The curve of data set 2 is broader than that of set 1, consistent with its larger standard deviation. Moreover, the confidence interval of data set 2 is wider, so its mean and the true value may lie farther apart, decreasing accuracy. Set 2 also contained only 6 values, which makes it less suitable for approximation. It should be noted that the difference between the population mean and the sample mean decreases as the number of measurements increases.

Conclusion

The purpose of the experiment was to determine the weight of each 1-peso coin. Based on the gathered data, it can be concluded that the values recorded were
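The narrowing of the confidence interval with sample size can be checked numerically. This sketch uses the n − 1 (sample) standard deviation throughout, so its half-widths differ slightly from the tabulated ones, but the trend between the two sets is the same.

```python
import math

# 1-peso coin masses from Table 1; set 1 = all ten coins, set 2 = first six
coins = [5.3599, 5.4370, 5.3179, 5.4484, 5.4056,
         5.4592, 5.4291, 5.4420, 5.4454, 5.3655]
T95 = {9: 2.26, 5: 2.57}      # t values from Table 4, keyed by n - 1

def half_width(xs):
    """95% confidence half-width t*s/sqrt(n) of equation (10)."""
    n = len(xs)
    m = sum(xs) / n
    s = math.sqrt(sum((x - m) ** 2 for x in xs) / (n - 1))
    return T95[n - 1] * s / math.sqrt(n)

set1, set2 = coins, coins[:6]
print(round(half_width(set1), 4))  # ~0.034 for n = 10
print(round(half_width(set2), 4))  # ~0.058 for n = 6: wider interval
```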

precise, since the standard deviation was found to be very close to zero; the values were therefore close to the mean. Also, no values were rejected by the Q-test, further supporting the closeness of the values. Another conclusion is that as the number of measurements increases, the confidence interval narrows: the values of t and s/√n decrease, and since the confidence limits are proportional to them, the interval shrinks, which means there is a high probability that the population mean and the sample mean lie in the same interval or are equal.

References

(1) Crouch, S. R.; Holler, J. F.; Skoog, D. A.; West, D. M. Fundamentals of Analytical Chemistry, 8th ed.; Cengage: Singapore, 2010; pp 23, 32, 90-96, 126, 142-168.
(2) Mettler Balance. N.d. Photograph. Web. 28 June 2012. <http://daphne.palomar.edu/mettlerbalance/>.
(3) Workholding and Consistency with the Leigh FMT Frame Mortise & Tenon Jig. N.d. Photograph. Sandal Woods. Web. 28 June 2012. <http://sandalwoodsblog.com/2009/09/06/workholding-and-consistency-with-the-leigh-fmt/>.

Appendices

Table 4. Values of t for the 95% Confidence Level
n − 1    t(95%)
1        12.7
2        4.3
3        3.18
4        2.78
5        2.57
6        2.45
7        2.36
8        2.31
9        2.26
10       2.23

Table 5. Critical Values of the Rejection Quotient Q at the 95% Confidence Level
N        Q(95%)
3        0.970
4        0.829
5        0.710
6        0.625
7        0.568
8        0.526
9        0.493
10       0.468

Sample Calculations (data set 2):

1. Mean
x̄ = (x₁ + x₂ + x₃ + … + xₙ)/n
  = (5.3599 + 5.4370 + 5.3179 + 5.4484 + 5.4056 + 5.4592)/6
  = 5.4047

2. Standard deviation
s = √[ Σᵢ (xᵢ − x̄)² / (n − 1) ]
  = √{ [(5.3599 − 5.4047)² + (5.4370 − 5.4047)² + … + (5.4592 − 5.4047)²] / 5 }
  = 0.050770

3. Relative standard deviation (ppt)
RSD = (s / |x̄|) × 1000 = (0.050770 / 5.4047) × 1000 = 9.3937 ppt

4. Range
R = xₘₐₓ − xₘᵢₙ = 5.4592 − 5.3179 = 0.14130

5. Relative range (ppt)
RR = (R / |x̄|) × 1000 = (0.14130 / 5.4047) × 1000 = 26.144 ppt

6. Confidence limits
CL = x̄ ± ts/√n = 5.4047 ± (2.57 × 0.050770)/√6 = 5.4047 ± 0.0533
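A few of the sample calculations above can be cross-checked with Python's standard statistics module; this sketch verifies the mean, range, and relative range for data set 2.

```python
import statistics

# Data set 2: coins 1-6 from Table 1
set2 = [5.3599, 5.4370, 5.3179, 5.4484, 5.4056, 5.4592]

mean = statistics.fmean(set2)            # equation (1)
r = max(set2) - min(set2)                # equation (5)
rel_range_ppt = r / abs(mean) * 1000     # equation (6)

print(round(mean, 4))            # 5.4047
print(round(r, 4))               # 0.1413
print(round(rel_range_ppt, 3))   # 26.144
```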
