Application of Statistical Concepts in the Determination of Weight Variation in Samples
Robles, Jeffrey
Department of Chemical Engineering, College of Engineering
University of the Philippines, Diliman, Quezon City
Date Due: July 4, 2012; Date Submitted: July 4, 2012
Keywords: statistics, standard deviation, uncertainty, accuracy, precision
METHODOLOGY
Ten 1-peso coins were used as samples in the experiment. Forceps were used in handling the samples to prevent the accumulation of moisture, which can affect the measured weight. The weight of each coin was taken on an analytical balance and subjected to further calculations. The weights of the first six samples constitute Data Set 1 (DS1), while all ten masses constitute Data Set 2 (DS2).
RESULTS AND DISCUSSION
One of the objectives of this experiment is to observe that errors in an experiment are inevitable; it is therefore important to be aware of them. There are three basic types of error: determinate (systematic), random (indeterminate), and gross errors. Determinate errors can be attributed to known causes and are often reproducible; they can be methodic, operative, or instrumental. An example of this type is the possible failure of the analytical balance used in the experiment. Gross errors, meanwhile, occur only occasionally, mostly due to human error, and are severe enough that the process should be repeated to obtain accurate results. Handling the samples with bare hands rather than with forceps is a possible source of gross error in the experiment. Lastly, random errors cannot be ascribed to any definite cause; they most often arise from the limitations of measurement. One possible source of random error in the experiment is the fluctuation of temperature in the balance room brought about by people who come and go. Random errors are always present in an experiment and cannot be eliminated.
Another objective of the experiment is to apply statistical concepts in treating random errors. One such concept is the normal distribution, or normal error curve, and its parametric properties. The normal error curve, or Gaussian curve, is a mound-shaped curve representing the normal probability distribution of data around the mean, defined by the equation

y = (1 / (σ√(2π))) e^(−(x − µ)² / 2σ²)   (1)

where y is the relative frequency, x is the measured value, µ is the population mean, σ is the population standard deviation, and e and π are mathematical constants. This describes the distribution of any data population provided that determinate errors are absent. In a normal error curve, the mean occurs at the central point of maximum frequency, resulting in a symmetrical distribution of positive and negative deviations around the maximum. Other properties of a normal distribution include statistical parameters such as the sample mean, range, and standard deviation.
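As a quick numerical illustration of Eq. (1), the curve can be evaluated directly; this is only a sketch, and the mean and standard deviation below are placeholder values of the same magnitude as the coin data, not results from the experiment:

```python
from math import exp, pi, sqrt

def gaussian(x, mu, sigma):
    """Normal error curve of Eq. (1): relative frequency y of a value x,
    given population mean mu and population standard deviation sigma."""
    return exp(-((x - mu) ** 2) / (2 * sigma ** 2)) / (sigma * sqrt(2 * pi))

mu, sigma = 5.39, 0.054  # illustrative values only

# the curve attains its maximum at the mean and is symmetric about it
left = gaussian(mu - 0.05, mu, sigma)
right = gaussian(mu + 0.05, mu, sigma)
peak = gaussian(mu, mu, sigma)
print(abs(left - right) < 1e-12, peak > left)  # True True
```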
The results tabulated in Table A of the attached result sheet reveal apparent weights that vary considerably from the others (outliers). For Data Set 1, the suspected outliers were 5.3090 and 5.4680, while those for Data Set 2 were 5.3090 and 5.9581. These values were then subjected to a test concerning their rejection or retention, called the Q-test. The Q-test is a statistical test that decides the acceptance of results, particularly the outliers, at a certain confidence level, say 95%. The experimental value Q_exp, given by the equation

Q_exp = |x_q − x_n| / R   (2)

where x_q is the suspected value, x_n is its nearest-neighbour value, and R is the sample range (x_highest − x_lowest), should be less than the tabulated value of Q, Q_tab, for the value to be accepted. Otherwise, the value is rejected and should not be included in further calculations.
The only rejected result among the ten is the last entry in Data Set 2. Its Q_exp, 0.6859, is greater than the Q_tab at the 95% confidence level, 0.466. Rejection by the Q-test suggests that the suspect value was subject to some error. One possible source of error can be traced not to the experimenter but to the sample itself: all other samples were minted in 2004 or later, while the rejected one was minted in 2001. According to the Bangko Sentral ng Pilipinas (BSP), 1-peso coins minted in the late 1990s and early 2000s (6.1 g) are heavier than the present coins (5.35 g).
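The Q-test decision for the rejected coin can be sketched in a few lines, using the values reported in the result sheet (suspect 5.9581, nearest neighbour 5.5129, lowest retained value 5.3090, Q_tab = 0.466):

```python
def q_exp(suspect, nearest, lowest, highest):
    """Eq. (2): gap to the nearest neighbour divided by the sample range."""
    return abs(suspect - nearest) / (highest - lowest)

# values for Data Set 2; the suspect is also the highest value here
q = q_exp(suspect=5.9581, nearest=5.5129, lowest=5.3090, highest=5.9581)
q_tab = 0.466  # tabulated Q at 95% confidence

print(round(q, 4), q > q_tab)  # 0.6859 True -> reject
```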
The sample standard deviation s, given by the equation

s = √( Σ(x_i − x̄)² / (n − 1) )   (3)

where x̄ is the sample mean (x̄ = Σx_i / n) and n is the number of samples, expresses the degree of dispersion of values from the mean. The term n − 1 refers to the number of degrees of freedom, the number of individual results that could be allowed to vary while x̄ and s are held constant. With n − 1 in the equation instead of n, s becomes an unbiased estimate of the population standard deviation σ. The standard deviation thus provides a measure of the dispersion or spread of results and gauges the precision of the analysis: the higher the s, the more widely the results are spread; the lower the s, the more tightly the results cluster around the mean. The sample standard deviation, as a measure of precision, is related to the sample range R; as R gets narrower, s becomes smaller. The R and s values of Data Set 1 are 0.1590 and 0.05403, respectively, while those of Data Set 2 are 0.2039 and 0.05991, respectively. With a lower s and a narrower R, DS1 is said to be more precise than DS2, since the values in DS1 lie nearer the sample mean x̄. Another way of expressing the standard deviation is as a ratio of s to x̄, known as the Relative Standard Deviation (RSD), which is usually expressed in parts per thousand (ppt).
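For Data Set 1, these statistics can be reproduced with the Python standard library; the six weights below are those listed in the sample calculations of the Appendix:

```python
import statistics

ds1 = [5.3708, 5.4265, 5.3090, 5.4680, 5.3794, 5.3778]

mean = statistics.fmean(ds1)
s = statistics.stdev(ds1)   # sample std dev: n - 1 in the denominator, Eq. (3)
r = max(ds1) - min(ds1)     # sample range R
rsd_ppt = s / mean * 1000   # relative standard deviation in ppt

print(round(mean, 4), round(s, 5), round(rsd_ppt, 2))  # 5.3886 0.05403 10.03
```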
The confidence interval at a given confidence level (CL), given by

CL = x̄ ± ts / √n   (4)

where t (Student's t) is the tabulated value for n − 1 degrees of freedom at a certain level of probability, say 95%, is the interval within which the population mean µ is expected to be found. At 95% probability, it can be inferred that if the experiment were repeated 100 times, thus obtaining 100 confidence intervals, 95 of them would contain µ. The narrower the interval and the higher the level of probability, the more accurate x̄ is in estimating µ. In the experiment, the confidence intervals were 5.3319 to 5.4453 (for DS1) and 5.3588 to 5.4510 (for DS2). Having quite narrow intervals, it can be concluded that the results are of high accuracy. Furthermore, the small differences between the confidence limits of the two sets (0.0269 at the lower bound and 0.0057 at the upper bound) indicate that the results are precise.
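A minimal sketch of the Eq. (4) interval for Data Set 1, using the tabulated Student's t of 2.57 for five degrees of freedom at 95% probability (as in the Appendix):

```python
from math import sqrt

mean, s, n = 5.3886, 0.05403, 6
t = 2.57  # tabulated Student's t, 95% probability, n - 1 = 5 degrees of freedom

half_width = t * s / sqrt(n)                       # ts / sqrt(n) in Eq. (4)
lower, upper = mean - half_width, mean + half_width

print(round(lower, 4), round(upper, 4))  # 5.3319 5.4453
```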
From (4), the importance of a reliable s is evident: the less reliable s is, the less precise the confidence interval and, consequently, the analysis. It is therefore important to obtain a precise value of s. In the experiment, one way of producing a more reliable standard deviation is to increase the number of coins weighed, provided that no further errors are committed. Another is to pool the standard deviations of the two data sets, assuming that the two sets have the same source(s) of error(s). By pooling s, a better estimate of the "true" standard deviation is obtained. In general,

s_pooled = √( [Σ(x_i − x̄_1)² + Σ(x_j − x̄_2)² + ⋯] / (n_1 + n_2 + ⋯ − n_t) )   (5)

where n_1, n_2, … are the numbers of samples in each set and n_t is the number of data sets, which indicates the total number of degrees of freedom lost. Using (5), the s_pooled of the experiment is 0.05772. This is taken as the standard deviation of the analysis.
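The pooled value can be reproduced from the two sample standard deviations via the equivalent two-set form of Eq. (5), since each sum of squared deviations equals (n − 1)s²; n = 9 is assumed for Data Set 2 after rejecting the outlier:

```python
from math import sqrt

s1, n1 = 0.05403, 6   # Data Set 1
s2, n2 = 0.05991, 9   # Data Set 2, outlier rejected

# two-set form of Eq. (5): sum of squared deviations of set k is (nk - 1) * sk^2
s_pooled = sqrt(((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2))

print(round(s_pooled, 5))  # 0.05772
```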
The sample mean x̄ is the average value of a finite number of samples. Assuming that only few errors were present, x̄ is taken to be equal to the "true" weight of a 1-peso coin. The sample means of Data Sets 1 and 2 are 5.3886 ± 0.0002 and 5.4049 ± 0.0003, respectively. Using the t-test (6) and F-test (7) given below, these means are compared to determine whether or not they are significantly different.
t = |x̄_1 − x̄_2| / ( s_pooled √(1/n_1 + 1/n_2) )   (6)

F = s_1² / s_2²,  where s_1 > s_2   (7)
The t-test and F-test are applied in much the same way as the Q-test; only the tabulated values differ. If the s values of DS1 and DS2 were equal, there would be no need for the F-test; and if the F-test found the s values to be significantly different, the t-test could not be applied. Since the s values of the two sets differ, the F-test is applied first.
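Both comparisons can be sketched from the report's summary statistics. Because the inputs here are already rounded, the t statistic comes out near 0.54 rather than the report's 0.572, but the conclusion is unchanged:

```python
from math import sqrt

x1, s1, n1 = 5.3886, 0.05403, 6   # Data Set 1
x2, s2, n2 = 5.4049, 0.05991, 9   # Data Set 2, outlier rejected
s_pooled = 0.05772

# Eq. (7): ratio of the larger variance to the smaller
f_exp = max(s1, s2) ** 2 / min(s1, s2) ** 2

# Eq. (6): comparison of two means using the pooled standard deviation
t_exp = abs(x1 - x2) / (s_pooled * sqrt(1 / n1 + 1 / n2))

f_tab, t_tab = 4.77, 1.77  # tabulated values at 95% probability
print(round(f_exp, 2), f_exp < f_tab)  # 1.23 True
print(t_exp < t_tab)                   # True -> means not significantly different
```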
Table 1. t-test and F-test results at 95% probability

Test   Tabulated value   Experimental value   Remark      Conclusion
F      4.77              1.23                 exp < tab   Not significantly different
t      1.77              0.572                exp < tab   Not significantly different
From Table 1, it is concluded that the two means are not significantly different. Thus, the null hypothesis that the means are identical is accepted, and the obtained results are reproducible and precise, excepting, of course, the rejected value.
The mean of the analysis, either 5.3886 g or 5.4049 g (since the two are not significantly different), deviates by 0.72% from the true mass of 1-peso coins, which is 5.35 g. This small deviation demonstrates the accuracy of the analysis.
CONCLUSION
Statistical tools are efficient instruments for analyzing and interpreting results. Parameters such as the mean, standard deviation, and confidence interval help determine the degree of precision, reproducibility, and accuracy of an analysis, as well as treat random errors, which remain in the process no matter what is done. With such parameters, the results of this experiment were analyzed and found to be precise and accurate.
Generally, the precision of the results of an experiment increases as the number of samples examined increases. However, in this experiment the precision of Data Set 1 is higher than that of Data Set 2 despite its fewer samples, perhaps because of significant variation among the samples themselves. It is therefore recommended to use more samples in the analysis, provided that they are of the same quality (dirt-free, same year of production, etc.) before weighing.
REFERENCES
Skoog, D., et al. Fundamentals of Analytical Chemistry, Eighth Edition; Raffles Corporate Center: Pasig City, 2010; pp. 91–146.

Day, R. A., Jr.; Underwood, A. L. Quantitative Analysis, Sixth Edition; Prentice-Hall, Inc.: New Jersey, 1991; pp. 7–25.

Mendenhall, W.; Beaver, R.; Beaver, B. Introduction to Probability and Statistics, Eleventh Edition; Brooks/Cole: CA, 2003; pp. 206–208, 363–366, 401–407, Appendix Tables 1 & 6.

http://www.bsp.gov.ph/ (accessed July 3, 2012).
APPENDIX
SAMPLE CALCULATIONS:

For Data Set 1

Sample mean:
x̄ = Σx_i / n
  = [(5.3708 ± 0.0001) + (5.4265 ± 0.0001) + (5.3090 ± 0.0001) + (5.4680 ± 0.0001) + (5.3794 ± 0.0001) + (5.3778 ± 0.0001)] / 6
  = (32.3315 ± √(6(0.0001)²)) / 6
  = 5.3886 ± 5.3886 (√(6(0.0001)²) / 32.3315)
  = 5.3886 ± 0.0002

Standard deviation:
s = √( Σ(x_i − x̄)² / (n − 1) )
  = √( [(5.3708 − 5.3886)² + (5.4265 − 5.3886)² + (5.3090 − 5.3886)² + (5.4680 − 5.3886)² + (5.3794 − 5.3886)² + (5.3778 − 5.3886)²] / (6 − 1) )
s = 0.05403

Relative standard deviation:
RSD = (s / x̄) × 1000 ppt = (0.05403 / 5.3886) × 1000 ppt = 10.03 ppt

Range:
R = x_highest − x_lowest = (5.4680 ± 0.0001) − (5.3090 ± 0.0001)
  = 0.1590 ± √(2(0.0001)²)
  = 0.1590 ± 0.0001

Relative range:
RR = (R / x̄) × 1000 ppt = [(0.1590 ± 0.0001) / (5.3886 ± 0.0002)] × 1000 ppt
   = 29.51 ± 29.51 √( (0.0001 / 0.1590)² + (0.0002 / 5.3886)² )
   = 29.51 ± 0.03

Confidence limits:
CL = x̄ ± ts / √n = 5.3886 ± (2.57)(0.05403) / √6 = 5.3886 ± 0.0567

Q-Test (for Data Set 2)

Suspect value: 5.9581; Q_tab: 0.466
Q_exp = |x_q − x_n| / R = |5.9581 − 5.5129| / (5.9581 − 5.3090) = 0.6859
Since Q_exp > Q_tab, the suspect value is rejected.