UNIVERSITY OF PRETORIA
October 2009
_________________________________________________________________________
Executive summary
The most essential task of any production company is to be able to manage production with
reliable methods developed using realistic data, thus creating a platform for continuous
improvement using forecasting methods that can predict the future given past data. The
objective of this paper is to study key performance indicators using statistical tools and
develop a model that can be used to identify a combination of key performance indicators
that are critical in terms of reaching production targets.
highlight relevant tools, techniques and methods that can be used to develop an analysis
model that will reach the objectives of the project. This should aid Anglo Coal South Africa in
accelerated decision making capabilities utilising proposed solution strategies to achieve
stated business objectives.
ii
Acknowledgements
I wish to express my gratitude to people who through their contribution made this project
possible.
Thank you Mr. Derek Laing from Anglo Coal South Africa, Mr. Zaais Van Zyl, Mr. Gary Smith
and Mrs. Lauren Theron from Mining Consulting Services.
Additionally I would like to acknowledge the contribution of the following people:
Mr Olufemi Adentunji, for the reviews and provision of theoretical guidance. Andile Ndiweni
and Takalani Mudau for guidance and support.
iii
Contents
1. Introduction ..................................................................................................................... 1
1.1Mining Process............................................................................................................. 2
1.2 Pick Six ....................................................................................................................... 3
1.3 Problem Statement...................................................................................................... 6
1.4 Research design and methodology ............................................................................. 7
iv
5. Recommendations ........................................................................................................ 27
5.1 Future work ............................................................................................................... 27
List of figure
1.1: Shift breakdown........2
1.2: Components of a shift time.....2
1.3: Mining process, cutting cycle flow chart...........3
1.4: KPI tree...4
1.5: Production process...5
2.1: Least squares estimation......10
3.1: Studentised residual graph of the final model: Vlaaklagte-Simunye Section.......23
vi
List of table
3.1: Regression terminology..................................................................................................16
3.2: Pick six model symbols.......17
4.1: Significant parameter estimates.................................................................................... 25
vii
Acronyms
ACSA
CM
Continuous miner
KPI
SAS
Definitions
Continuous miner: remote-controlled machine with a large rotating steel drum equipped
with tungsten carbide teeth that scrapes coal from the seam.
Production time:
Shuttle car:
Sump:
primary cutting motion of moving cutter head of CM into the coal face.
Tram:
viii
Part I
Introduction
Business organisations make use of key performance indicators (KPIs) to measure the
success of production management and to define a suitable set of performance indicators for
monitoring ongoing performance. KPIs are quantifiable measurements
critical success factors of an organisation thus Anglo Coal South Africa (ACSA) has
identified and defined six key performance indicators referred to as the Pick Six to manage
underground coal mining operations.
ACSA is a division of Anglo American and one of South Africas largest coal producers. It
operates four underground and five open cut mines in Mpumalanga, Witbank Coalfield. It is
well known for producing thermal and metallurgical coal to customers in the inland domestic
markets as well as the export market. ACSAs coal demand has grown tremendously
inducing a need to improve production managing methods that can help with managing coal
operations, budgeting processes and forecasting tools to predict and lessen the risk of not
meeting coal demand.
A standard model has been built to assist in managing coal production using the Pick Six
and fixed mine parameters. With the aid of a third-party data capturing and management
company, Mining Consulting Services, data used to manage coal production is captured
daily using data cards that are attached to each operating Continuous miner during a
scheduled shift. A face shift report document is prepared daily using the mining data from
the previous day illustrating actual key performance indicators relative to the benchmarked
KPIs, required production tonnes, production time etc.
The reports are used as
a check tool to measure actual coal against budgeted coal targets and
illustrate graphically the actual coal tonnes per day against the Pick Six values
Day shift
06:00-15:00
1-1:30 hour
travel time
30-40 minutes
pre and post
meeting
Afternoon
shift
Night shift
14:00-23:00
1-1:30 hour
travel time
30-40 minutes
pre and post
meeting
Figure 1.2:
1. Components of a shift time
22:00
22:00-07:00
1-1:30
1:30 hour
travel time
30--40 minutes
pre and post
meeting
During a mining operation, a continuous miner(CM) trams coal using the cutter head that is
designed to mitigate methane gas level by absorbing in the fumes and using water sprays to
put out fire sparks. The mining process is illustrated graphically in figure 1.3.
shift patterns
maintenance allowances
pillar centers
size of coal
haulers
Productive times
Load time
Away time
Tram time per meter cut
Pick Six
Unproductive times
Operational critical success factors for Anglo Coal for underground operations are to;
Critical factors are only achievable if the KPIs benchmarked values are reliable and their
effect on the coal production tonnes is well understood. The three past projects defined and
reviewed the KPIs to ensure that Anglo Coal has a coal managing standard that can be
deployed in all the underground mines. [3][6][11] The KPIs are explained as follows;
First sump late
It is the time difference between the scheduled start of a shift and the first sump that
the CM makes. It decreases the face time available during a shift by the same
amount of minutes the shift started late.
*Measurement unit: Minutes
Load time
The total time a CM takes to fill up a shuttle car or battery loader with coal. It makes
up the production process together with the away time as depicted in figure 1.5. It is
calculated by the following formula per shift;
Production cycle
Load
Away
Production cycle
Load
Away
Production cycle
Load
Away
Production cycle
Load
Away
CM
Time
Downtime
Total non productive time recorded in minutes during a shift that minimises the overall
production time. The breakdowns are categorised as follows;
Engineering breakdowns
Operational breakdowns
Other
production time
!
"
!#
$%&&
( "
*"
" +
, -.+ '
"
Part II
Literature review
The problem should be defined in such a way that each KPI can be analysed both
individually and collectively with the other KPIs. As mentioned before, each KPI is used to
calculate the production time and thus there exists a relationship between the KPIs and the
actual coal tonnes mined. These relationships will not be considered imperative for
technique selection, however they will only be considered where applicable.
Both of these goals require that the pattern of the observed time series data be
identified and more or less formally described. Interpretation and integration can be made
once a pattern has been identified with the other data i.e., use it in the theory of the
investigated phenomenon, e.g., seasonal death rates. Extrapolation of the identified pattern
can be made as to predict future events.
2.2.1 Smoothing
Smoothing method uses the averaging methods and exponential smoothing methods to
reveal clearly the underlying trend, seasonal and cyclic components from data sets. The
methodology behind smoothing methods is taking the averages of the data sets and
analyzing the data using the following;
The above mentioned can be derived from a built forecasting model that when analysed, the
Flaw of averages is sometimes checked for in the model. What the Flaw of averages
states is that plans based on the assumption that average conditions will occur are usually
wrong.
/0 ( /1 21 ( 3 ( / 4 24 ( 5
6789:
Where
y
= dependent variable
= independent variable
= y-intercept of the line, i.e. point at which the line cuts through the y-axis
= slope of the line, i.e. amount of increase in the mean of y for every 1-unit
increase in x.
,. . . ,
This means the distance between the fitted line which consists of the predicted
values and
the actual Y values denoted by asterisks in figure 2.1 is being minimised. This distance is
referred to as the residual or prediction error.
error
For a test to be accurate, certain assumptions have to be satisfied [1]
Figure 2.1:
2. Least squares estimation
10
'-+
-<+ " -
=6>? ( >@:A
678$:
C #" -- ! -+
-<+ " -
ED ( >@:A
=6>
?
678F:
-<+ " -
=6>?
?
ED :A
>
678G:
Emphasis is put on the fact that the predictive models are used for predicting the responses
and also on trying to understand the underlying relationship between the explanatory
variables.
11
The methodology followed in generating a mathematical model fits perfectly with the
methodology planned for developing the solution strategy for the KPI identification process.
However, problems arise when generating the algorithm as known KPI production tonnes
relationships will be used thus the productive KPIs that have a direct relationship to the
production tones will be favored.
12
For an example, the first sump late, last sump early and downtime KPIs are merely
subtracted from the face time unlike the other KPIs that are included in the production time
calculations used to calculate the actual coal produced in a shift/day.
properties of these objects that can be examined or altered; in the coal production
model the value of the KPIs can be altered.
An excel model can be embedded with calculations and measured values to output the
desired end goals. Using this tool all the requirements can be met using defined formulations
with the use of macros to show each KPI effect on the total coal tonnes. Each KPI factor can
be calculated which can easily be viewed on the excel spreadsheet. Further formulation can
be made using the KPI factors to select a combination of KPI that are critical in order to
reach budgeted coal production tones.
13
2.7 Conclusion
From the different techniques studied, it was deemed multiple regression analysis is the
most appropriate as it can yield the accurate required solution; however multiple regression
has been extensively used in medical fields than engineering fields when dealing with
variable selection studies. The other methods give rise to more problems as the solution
formulation gets complicated due to extensive computations that defeats the purpose of the
study.
Statistical Analysis Software (SAS) was the prominent in terms of usage and simplicity
understanding. The partial ordinary least square regression analysis methods will be used as
it is considered to yield accurate solution outcomes.
14
Part III
Multiple regression analysis
Multiple regression analysis is widely used in the psychology, dietary, population and
financial fields for prediction purposes. The relationship between the predictor variables and
response variable will be analysed as to determine which predictor variable/s has the most
impact on the response variable.
The following are the model assumptions as explained by Mendenhall for a regression
model [7]
Assumption 1: The mean of the probability distribution of
is normal
Assumption 4: The errors associated with any two different observations are independent i.e.
the error associated with one y has no effect on the errors associated with
other y values.
A test for residual normality is crucial before any inferences can be made from the model.
The p-value [table 3.1] can only be used if the residual normality test is satisfied, if not
transformations are made and explained before using the p-value for parameter selection.
Test for normality
H0 : Residual come from a normal distribution
H1 : Residual deviate from a normal distribution
Shapiro-Wilk test
n
50
Reject H0 if W Is small
p-value = p(w)< 0,05 : deviate from normal
>0,05: H0 is not rejected
15
A Shapiro-Wilk test is used only when the number of observation is less or equal to 50. The
rule of thumb is used for the confidence level i.e.
failure to comply with normality tests. Data transformations are performed on the y values as
to make them nearly satisfy the assumptions, and for the latter reason, to achieve a model
that provides a better approximation to E(y).
3.1 Terminology
The basic terminology presented in part II; section 2.2 will be used with the additional
terminology explained in table 3.1 throughout the rest of the document.
Table 3.1: Regression terminology
Least square:
R-square:
P value:
16
1,
. . .,
6.
and its
variance
Step 5: Evaluation of the utility of the model
Step 6: Validation of assumptions on
Step 7: If the model is deemed adequate, estimate the mean value of Y , identify the critical
KPI and make other inferences.
Regression algorithm
>
J9K L K%M
/0 ( = /? 2? ( 5?
?I1
6$89:
In this formulation, the dependent variable is denoted by Y, which is the production tonnes
and the independent variables by 2? which are the Pick six as described in table 3.1.
Table 3.2 Pick six model symbols
Term
Description
X1
X2
load time
X3
away time
X4
X5
Downtime
X6
17
SAS algorithm
The model explained here is for one mine section, a couple of other models were run as to
accomplish a general KPI combination that can be used for all Anglo Coal underground
mining processes.(Addendum A and B)
Input data: Simunye Section
options nodate pageno=1;
title' Vlaklaagte: Simunye Section';
data vlakkies;
label y = 'Actual Production tonnes'
x1 = 'Load time sec'
x2 = 'Away time sec'
x3 = 'FS late min'
x4 = 'Tram time min'
x5 = 'Downtime min'
x6 = 'LS early min';
input y x1 x2 x3 x4 x5 x6;
datalines;
59457 63
72
22
304
13
54847 67
70
26
187
13
51350 67
74
25
155
16
60042 67
76
24
142
20
60898 66
71
20
143
51956 70
72
25
148
46085 67
70
25
203
45962 63
84
28
199
58854 67
71
27
167
52279 88
68
22
187
12
56704 66
66
24
201
19
68864 63
70
20
215
11
18
Data points for each mine section were extracted from an excel spreadsheet and defined as
variables as in line 4(input) of the model algorithm. For this algorithm Simunyes data was
tested using 12 data points.
The full model was ran first and further parameter selection steps were made as to obtain
parameter estimates that were significant. For this model, x2 was eliminated first, followed by
x6, x5 then x4 using the outputted p-value. If the p-value for any predictor variable was
significantly large, the variable was eliminated from the model. This was done till the
remaining p-values for each predictor variable fell within the 95% confidence interval.
Check if errors are normally distributed
goptions reset=all;
symbol i=none value=dot cv=green height=0.7;
proc gplot data=a1;
plot studres1*pred1 / vref=0;
run;
In order to use the t-test for the stepwise elimination process, errors have to be normally
distributed with their mean equal to zero as mentioned earlier.
19
This checks for how y and xs are related i.e. for example if there is a negative relationship
between y and one of the x values, take for instance the away time, the expected actual
production tonnes will decrease as the away time increases.
Check for correlation between the x variables
proc corr data=vlakkies;
var x1 x2 x3 x4 x5 x6;
run;
This was used to check for relationships between the xs themselves. If highly correlated
predictor variables are present in the model, the results may be confusing. In some cases
the t test on the individual parameter estimates may be non-significant even though the Ftest for the overall model adequacy is significant and the parameter estimates may have
signs opposite from what is expected. Variation inflation factors aid in determining whether a
serious multicollinearity problem exists.
Data Normality Test
proc univariate data=vlakkies normal;
var y x1 x2 x3 x4 x5 x6;
histogram y x1 x2 x3 x4 x5 x6 /normal;
run;
The data normality test was to check if some KPIs data were normally distributed. The
sample size was too small, i.e. less than 50 datasets thus it was infrequent for normality to
be observed in the datasets.
20
This ensures that the errors are normally distributed as to satisfy the error normality
assumptions mentioned before. From the graph it can be seen that some of the errors do
come from a normal distribution or are normally distributed. (Refer to figure 3.1)
21
parameters. If parameter estimates fell within the significance level, the null
hypothesis was rejected, stating that the parameters were significant from zero and if
not, stepwise regression was used to eliminate the parameters that were not
significant. The model was run until we had only significant parameters in the model.
This can be viewed in the iteration and output step of the SAS algorithm. X6 was
eliminated first followed by x5 and x4 consecutively.
F-test: The test is done on the full model using the F test for all the predictor variables
except for the intercept values as they have no significance on the solution strategy.
The null hypothesis can be rejected if the p-value is less than the
value.
H1 :
22
Studentised residual graph of the final model: As explained before, residuals are values
between the observed responses from the predicted responses. The studentised
residual graph plots the scatter points within the 95% confidence level. In order for
the model not to be rejected, the scatter points should not show any trend.
Studentized residual of the final model
Studentized Residual
2
-1
-2
48000
50000
52000
54000
56000
58000
60000
62000
64000
66000
Predicted Value of y
Figure 3.1 Studentised residual graph of the final model: Vlaaklagte, Simunye Section
23
Part IV
Computational Results
4.1 Model outputs
Out of the five mine sections models, 5 x 6 parameter estimate outputs were studied as to
identify the critical KPIs. As mentioned before the parameter estimate,
, is the amount of
increase or decrease in the mean of y for every 1-unit increase in x. Monthly data was used
for all the models except for Greenside as it had weekly data. The five model outputs give
explanation as per data set inputted in each mine sections model and the output is
demonstrated using the general regression equation (4.1).
)
/N ( /1 O1 ( /A OA ( /P OP ( /Q OQ ( /R OR ( /H OH ( 5
6F89:
Vlaaklagte
Simunye Section
)
9FFGG%
G&S8F7G O1 ( %%%8&&GOA
7F9G89FFOP
9&TFU OQ
TG8%UU OR ( FSG8FF OH
Mangwapa Section
)
7SSGGT
99T$O1
GUG8%&$OA
ST8TS OP
77F89%9OQ ( T9&F89%9 OR
F$G8G$F OH
The parameter estimate for Simunye Sections x1 which is the load time can be interpreted
as for every one unit increase in the load time, the coal production tonnes will decrease by
508.425 units and for x6 which is the last sump early can be interpreted as for every one unit
increase in the last sump early, the coal production tonnes will increase by 485.44 units.
The parameter estimates for Vlaaklagte KPIs cannot be used to derive a general set of
combinations as the estimates differ tremendously. The R2 for the Simunye Section model
was 73% and for Mangwapa Section model, 78% which is significantly high, however the
significant parameter estimates for each mine differ.
24
Greenside
George Section
)
7$%T8%$S
98G7%O1
S8$$%OA ( 9F8S$TOP
9%$8T7TOQ
78F%7OR
%8%&&OH
Vumangara Section
)
GUFS8FU7 ( G8%9&O1
$%8%FFOA
9G877$OP
77S8&GF OQ
&8FGF OR
$8S$%OH
The parameter estimates for Greenside differ slightly as compared to the Vlaaklagte
parameter estimates, however the R2 for both models were significantly low, 45% and 48%
for George and Vumagara respectively. If the model is not good enough, no KPI combination
can be derived from such a model.
New Denmard Central
Simunye Section
)
TFG79
9&$8$ O1
U8S$TOA
$9S8GF%OP
GSS%8SF OQ ( F98G7S OR
$%G89&T OH
The R2 for the New Denmark section was 55% and the adjusted R2 has a negative value
which shows that the model contains terms that do not help in predicting the response.
All the parameter estimates where tested for significance and the significant parameter
estimates are as shown in the table 4.1.
Table 4.1: Significant parameter estimates
Section Name
1
Greenside: Vumagara
NDC_Central :Simunye
Vlaaklagte: Simunye
Greenside: George
Vlaaklgate: Mangwapa
1
1
1
1
1
25
From the table, no parameter estimate(s) dominate the others. This can be due to the reality
that all the mines differ due to the following;
shift patterns
maintenance allowances
pillar centers
size of coal
haulers
From the five models, only three KPIs dominated and this cannot be deemed legitimate as
the sample size of the models is minute. It should not be ignored that the five models did not
yield favorable outputs thus impeding further investigation due to the sample size being too
small &/other unknown factors.
26
Part IV
Recommendations
The model outputs did not show any trend that could be used to select a general KPI
combination. The data samples when tested using the normality test did not yield favorable
results thus suggesting the small sample size might be a problem. The parameter estimates
told a different story for each mine section even though some of the models were significant
and had more or less the same data points as the others.
However the model outputs gave ample information on the effect they have on the actual
production tonnes. As mentioned under future work, further regression analysis
methodologies can be carried out as to improve the solution integrity with the addition of a
bigger sample size.
27
Bibliography
[1] Berger, D.E. (2003). Introduction to Multiple Regression. Claremont Graduate
University. Pages1-6.
[2] Bise, C.J., and Albert, E.K. Comparison of model and simulation techniques for
production analysis in underground coal mines-document summary. Available online
from http://www.onemine.org/(Retrived 23 May 2009)
[3] Denkhaus, M. Eardley, M. Mufamadi, C. and Raphulu, R.(2007).Review of the
Current Approach to Measurement and Reporting of Continuous Miner Relocation
Time as a Key Performance Indicator. Presentation.
[4] Giltow, H.S., Oppenheim, A.J, and Levine, D.M.(2005). Quality management,3rd
edition. Singapore: McGraw Hill.
[5] Johnson, R.A,(2005).Probability and statistics for engineers.7TH edition. Pages 101146 and 226-281. United States of America: Prentice Hall.
[6] Khumalo, S. (2008). Determining the Suitability of the Current Functional Production
KPIs at Greenside Colliery in Managing Production Section Downtime and To
Identify Critical Areas of Lost Time.
[7] Mendenhall, W, Sincich, T. A Second course in Statistics: Regression Analysis. 6th
edition, pages 90-741. New Jersey: Prentice-Hall.
[8] Montgomery, D.C, Peck, E.A, Vinig, G.G. Introduction to linear regression analysis.
Pages 169-217.
[9] Nass,
P.
(1999).
Multivariate
analysis
methods.
Available
online
from
28
Website.(2009)
Time
Series
Analysis.
Available
online
from
29
Addendum
Addendum A: Sections SAS models
A 1: Vlaaklagte: Mangwapa section
A 2: Vlaaklagte: Mangwapa Section
A 3: Greenside: George Section
A 4: Greenside: Vumagara Section
A 5: New Denmark Colliery: Simunye Section
30
304
187
155
142
143
148
203
199
167
187
201
215
13
13
16
20
8
7
5
8
6
12
19
11
y
y
y
y
=
=
=
=
x1
x1
x1
x1
x3
x3
x3
x3
x4 x5 x6;
x4 x5;
x4;
;
goptions reset=all;
symbol i=none value=dot cv=green height=0.7;
proc gplot data=a1;
plot studres1*pred1 / vref=0;
run;
proc corr data=vlakkies;
var x1 x2 x3 x4 x5 x6;
with y;
run;
proc corr data=vlakkies;
1
var x1 x2 x3 x4 x5 x6;
run;
proc univariate data=vlakkies normal;
var y x1 x2 x3 x4 x5 x6;
histogram y x1 x2 x3 x4 x5 x6 /normal;
run;
**********Checking errors for normality for the fitted model*************;
proc reg data=vlakkies;
model y = x1 x3 ;
output out=a p=pred r=res student=studres;
run;
proc print data=a;
run;
goptions reset=all;
symbol i=none value=dot cv=red height=0.7;
proc gplot data=a;
plot studres*pred / vref=0;
title3 'Studentized residual of the final model';
run;
y
y
y
y
y
=
=
=
=
=
2
2
3
3
2
2
2
2
2
2
2
2
2
155
147
224
174
139
165
157
208
134
141
229
201
175
3
12
16
17
21
9
14
16
16
10
17
25
16
x1 x2 x3 x5 x6;
x1 x2 x3 x6;
x1 x3 x6;
x1 x6;
x6;
run;
goptions reset=all;
symbol i=none value=dot cv=green height=0.7;
proc gplot data=a1;
plot studres1*pred1 / vref=0;
title3 'Studentized residual of the full model';
run;
proc corr data=vlakkies1
var x1 x2 x3 x4 x5 x6;
with y;
run;
proc corr data=vlakkies1;
var x1 x2 x3 x4 x5 x6;
run;
3
x3 x4 x5 x6 y;
63
57
65
64
62
62
65
61
64
56
58
62
68
65
58
64
61
64
59
58
65
64
57
64
64
60
62
62
56
61
68
3
2
2
2
3
3
2
2
3
3
2
2
2
2
3
3
3
3
3
3
3
3
3
3
3
3
3
2
3
3
3
175
163
125
138
208
69
95
106
50
59
52
199
105
98
154
142
126
159
226
50
118
138
20
81
203
330
195
138
115
178
155
30
16
45
32
26
21
10
26
13
11
11
11
10
22
19
24
27
21
24
23
16
20
19
17
17
22
17
19
20
21
22
2104
1857
1903
1886
1382
1850
2563
2082
2299
2206
1894
1782
2003
2249
1815
1355
1722
1704
1500
2276
2138
1944
1376
1886
1368
1314
1703
1643
1761
1565
1593
y
y
y
y
y
=
=
=
=
=
x2
x2
x2
x3
x5
x3
x3
x3
x5
;
x4 x5 x6 ;
x5 x6 ;
x5 ;
;
goptions reset=all;
symbol i=none value=dot cv=blue height=0.7;
proc gplot data = a;
plot studres*pred / vref = 0;
run;
proc corr data=greenside1;
var x1 x2 x3 x4 x5 x6;
with y;
run;
proc corr data=greenside1;
var x1 x2 x3 x4 x5 x6;
run;
proc univariate data=greenside1 normal;
var y x1 x2 x3 x4 x5 x6;
histogram y x1 x2 x3 x4 x5 x6 /normal;
run;
**********Checking errors for normality for the fitted model*************;
proc reg data = greenside1;
model y = x5;
output out = a1 r = res1 p = pred1 student = studres1;
title1 ' Fitted regression model';
run;
goptions reset=all;
symbol i=none value=dot cv=red height=0.7;
proc gplot data = a1;
plot studres1*pred1 / vref=0;
title3 'Studentized residual of the final model';
run;
x3 x4 x5 x6 y;
88
99
98
70
68
68
70
59
67
67
67
64
68
65
65
60
62
57
58
60
62
65
57
65
61
58
58
63
63
68
61
62
59
67
68
63
55
60
63
61
63
67
63
2.5
2.5
2.5
2.5
2.5
2.5
2.5
2.5
2.5
2.5
2.5
2.5
2.5
2.5
2.5
2.5
2.5
2.5
2.5
2
2
2
2
3
3
3
3
2
2
2
2
2
3
3
3
3
3
3
3
2
2
2
2
221
202
169
183
172
228
158
174
273
141
176
42
159
163
165
105
169
101
166
143
144
126
188
149
222
76
188
103
183
135
129
145
240
235
130
215
220
260
40
223
141
99
134
15
16
18
14
21
19
22
7
23
23
19
16
50
28
18
4
10
25
28
14
45
71
19
47
14
22
23
30
22
26
31
26
25
27
20
24
22
30
25
18
18
19
26
1539
1652
1605
1377
1895
1544
1887
1997
1674
1650
1686
1405
1494
1741
1635
2170
1977
2079
1908
1955
2295
2087
1902
2116
1680
2434
2117
3212
2421
2078
2455
2112
2137
1600
1915
1770
1688
1827
1566
2271
2229
1931
1811
7
6
6
26
3
9
15
14
;
69
66
67
64
61
64
64
69
64
60
57
58
60
60
2
2
3
2
3
2
3
138
183
153
173
134
214
131
18
23
11
19
19
20
20
1844
2171
1898
2169
2012
2290
2068
greenside;
x4;
r = res1 p = pred1 student = studres1;
regression model';
goptions reset=all;
symbol i=none value=dot cv=red height=0.7;
proc gplot data = a1;
plot studres1*pred1 / vref=0;
title3 'Studentized residual of the final model';
run;
267
222
236
215
232
214
260
243
159
200
173
26
21
8
8
24
21
9
4
5
6
13
y
y
y
y
=
=
=
=
x1
x1
x1
x4
x3
x4
x4
x6
x4 x5 x6;
x5 x6 ;
x6;
;
goptions reset=all;
symbol i=none value=dot cv=green height=0.7;
proc gplot data = a;
plot studres*pred / vref = 0;
run;
proc corr data=ndc;
var x1 x2 x3 x4 x5 x6;
with y;
run;
proc corr data=ndc;
var x1 x2 x3 x4 x5 x6;
run;
10
*Only one model output is included under this section, the rest of the output
models are saved in the attached disc
11
B 1: Vlaaklagte-Simunye Section
!
#
#
$
$
"
"
"
" '"
" "
$
"
/3./20+-+&
//+31
.2 +,03
$$
"
(
8
!:
(
:
"
"
,-+.-+, &.1,-31/
-0+013-33
8
9
9&
9,
99/
9+
&
&
+
/
%
%
"
/01&-,1+
&/.++1 0
5 )
! 6 5 )
* (
&2&,
320&00
32-33.
"
"
--//+
5/312-&-1+
+++233-+.
5&- /2 -- &
5 30-.
50/2+.1.-1/2-,..1
12
32 .10
* 7 7
,..,.
&+&2,&3.&
00.2 31+0
1. 21/&3&
0+-&2& -00
/-2+,3&1
,+1200&--
,2+&
5 2.321/
5&20
5 25 2,.
2,&
323 /&
32 3,
32-, 0
323-&32& 1+
32&&-/
32&-/&
/
!
#
#
$
$
"
"
%
%
&
.
4
"
" '"
" "
"
$
$
"
&+//,+,..
& &--33
-0+013-33
-1--20-/ +
//+31
120 &&.
$$
&
&
,&0+1 ..
&,-0 //+
5 )
! 6 5 )
"
* (
/2++
323&/0
32//+.
32-/1/
"
"
7 7
8
9
9,
&&++3
5,,12./.,,
5 1,/20.3,
"
(
& 1/
&&320003.
/0+2.,33&
/2+
5 2/5,2 1
32333,
32 /.
323
4
;
"
"
9&
9,
"
#
9
9&
9,
99/
9+
&
&
&
&
&
&
&
9-
9/
9+
"
%
//+31
+021,,,,
0&233333
&-233333
,2,,,,,
102/1,,,
2/3333
+/1+2+1//1
-2/0&+/
&2//132-.&,0
--2+. 0.
-2.+,/3
'#8 ! 8!
<!
++0&.1
1 -233333
1+-233333
&11233333
-3233333
&&/
,1233333
-/.+&
+,233333
++233333
&3233333
,233333
-&233333
/233333
"=
"
13
9
+11+11233333
1-233333
&1233333
-233333
,3-233333
&3233333
:
'
4
$$
%
""
&
//+312 ++0
+/1,2/. 0
32&133,,&.
,20/1- 3
21,.&/&,
;
"
>
4
14
" "
"
%
"
&
++0&.1
-,,-,+0&20
32 / &1.-1
-0+013-33
.332/ .30
? "
"
"
//+312 0
//00/2/3
2
"
+/1-,,-,+0,
&&.3&
13.0
" " $
"
"
A"
5;
%5
5%
" 5
55555
&.2&/.-+
+
,.
" " $
55
"
"
>
4
!
3@3
;
%
" "
;5 )
!5 )
* 7 7
*@ 7 7
*@ 7 7
555
B2333
32333/
32333/
55555
32./+0+32 &0/ ,
323,&3/0
32&--000
'#8 ! 8!
"
" $
555555
555555
B ;
*
* ;5 )
* !5 )
320,+1
*32 /33
*32&/33
*32&/33
"
"
//+312 0
+/1,2/.
""5 $5(
"
>
4
!
" " $
5555
%5
5%
" 5
%
" "
;5 )
!5 )
"
"
55555
555555
32 &0/ , 0
323,&3/+..
32&--000-.
'#8 ! 8!
9
<
555555
*
* ;5 )
* !5 )
"
*32 /3
*32&/3
*32&/3
"
#
:
'
4
$$
%
""
&
+021,,,,,,
+2+1//0.&,
&21+1+-1//031
.21//1.30+
;
"
>
4
15
" "
"
%
"
&
1 --2+.+.+.0
.2 -/&1 /+
-. 2+++++0
2.&..+3-.
? "
"
+021,,,,
+0233333
+0233333
"
16
"
+2+1//1
--2+.+.0
&/233333
&2/3333
" " $
"
3@3
"
A"
,/2 -0/&
+
,.
" " $
"
>
4
!
;
%
" "
;5 )
!5 )
* 7 7
*@ 7 7
*@ 7 7
"
555
B2333
32333/
32333/
55555
32+3 .-&
32,1&.,&
32,/&,/+
2. 1 &
555555
B ;
*
* ;5 )
* !5 )
'#8 ! 8!
"
" $
555555
55
5;
%5
5%
" 5
55555
32333
B323 33
B3233/3
B3233/3
"
"
+021,,,,
+2+1//0.
""5 $5(
"
>
4
!
" " $
5555
%5
5%
" 5
%
" "
;5 )
!5 )
"
"
55555
555555
32,1&., /1
32,/&,//+2. 1 &3,.
'#8 ! 8!
9& <!:
555555
*
* ;5 )
* !5 )
"
B323 3
B3233/
B3233/
"
#
:
'
4
$$
%
""
&
0&
-2/0&+-/.203-+/-0/
+&-,1
+2,/31.0 -
? "
;
"
>
4
"
17
"
%
"
" "
"
"
&
1+&32.3.3.3.
-2 /./-+,
&,3
2,&333. 1
0&233333
0 233333
03233333
18
-2/0&+/
&32.3.3.
1233333
,233333
" " $
3@3
"
"
A"
5;
%5
5%
" 5
55555
/-2/-/31
+
,.
" " $
55
"
"
>
4
!
;
%
" "
;5 )
!5 )
* 7 7
*@ 7 7
*@ 7 7
555
321-+0&&
32&/
32 ,31-0
320+-,-/
555555
B ;
*
* ;5 )
* !5 )
B2333
32333/
32333/
55555
'#8 ! 8!
"
" $
555555
323,,/
323,0+
323,1,
323,/
9&
"
"
0&
-2/0&+-+
""5 $5(
"
>
4
!
" " $
5555
%5
5%
" 5
%
" "
;5 )
!5 )
"
"
55555
555555
32&/333333
32 ,31-0 3
320+-,--0/
555555
*
* ;5 )
* !5 )
'#8 ! 8!
9, <(
323,1
323,1
323,/
"
#
:
'
4
&
&&2//1-31+
532&0,+30+
+.132++33,/1
%
""
$$
? "
;
"
>
4
"
" "
"
19
"
"
%
"
&
&11
+2/-/-/-//
5320&, -1
0&
320,1/-1./
&-233333
&-2/3333
&/233333
20
&2//1+2/-/-/
1233333
,2/3333
" " $
"
"
A"
5;
%5
5%
" 5
55555
,&2-.+ /
+
,.
" " $
55
"
"
>
4
!
3@3
;
%
" "
;5 )
!5 )
* 7 7
*@ 7 7
*@ 7 7
555
B ;
*
* ;5 )
* !5 )
B2333
32333/
32333/
55555
32.-- 0/
32 ++++0
323/ 30,
32,3.+&
'#8 ! 8!
"
" $
555555
555555
32//-3
*32 /33
*32&/33
*32&/33
9,
"
"
&&2//1-3.
""5 $5(
"
>
4
!
" " $
5555
%5
5%
" 5
%
" "
;5 )
!5 )
"
#
55555
32 ++++++0
323/ 30,,1
32,3.+ ./1
21
"
555555
*
* ;5 )
* !5 )
555555
*32 /3
*32&/3
*32&/3
'#8 ! 8!
9- <
=
"
#
%
""
:
'
4
&
,2,,,,,,,,
32-.&,+/.+
321 &-3,1,+
-2003.01.
$$
? "
>
4
"
32-.&,0
32&-&-&
233333
233333
" " $
"
A"
>
4
!
5;
%5
5%
" 5
3@3
" " $
55
"
;
%
" "
55555
&,2-/&31
+
,.
"
"
&
-3
32&-&-&-&5 2+/
&2+++++++0
32 -& ,,1
"
" "
"
,2,,,,,,
,2333333
,2333333
"
"
%
"
;5 )
!5 )
* 7 7
*@ 7 7
*@ 7 7
555
32+313.
32- 0-+/
32- -0./
&2,&,/.
22
555555
B2333
32333/
32333/
55555
B ;
*
* ;5 )
* !5 )
555555
32333
B323 33
B3233/3
B3233/3
'#8 ! 8!
"
" $
9-
"
"
,2,,,,,,
32-.&,++
""5 $5(
"
" " $
5555
>
4
!
%5
5%
" 5
"
%
" "
"
55555
555555
32- 0-+-03
32- -0./-1
&2,&,/1..1
;5 )
!5 )
'#8 ! 8!
9/ <
555555
*
* ;5 )
* !5 )
B323 3
B3233/
B3233/
"
#
%
""
:
'
4
&
102/1,,,,
--2+. 01.+
2+3.1-.,+
---&&
&,21&/3,&&
$$
? "
>
4
"
--2+. 0.
..0
+&233333
/32/3333
" " $
"
A"
>
4
!
5;
%5
5%
" 5
3@3
" " $
55
"
;
%
" "
55555
-2/,.0+
+
,.
"
"
&
&&/
..02,/+3+
,20&,,/& .032. +0
&2.3 -31-
"
" "
"
102/1,,
1023333
1023333
"
"
%
"
;5 )
!5 )
* 7 7
*@ 7 7
*@ 7 7
555
321,.&.0
32 .1,.0
3231+-.+
32+- .+1
23
555555
B2333
32333/
32333/
55555
B ;
*
* ;5 )
* !5 )
555555
323&0
*32 /33
32 //,
3230-0
'#8 ! 8!
"
" $
9/
"
"
102/1,,
--2+. 0.
""5 $5(
" " $
"
>
4
!
5555
%5
5%
" 5
"
%
" "
"
55555
555555
32 .1,.0,/
3231+-.//1
32+- .+1 .
;5 )
!5 )
555555
*
* ;5 )
* !5 )
'#8 ! 8!
9+ <
*32 /3
32 //
3230/
=
"
:
'
4
&
2/
-2.+,/3, +
32-+3,,-,&
1/1
-,2 +31.0
%
""
$$
? "
>
4
"
"
5
A"
-2.+,/3
&-2+,+,+
/233333
0233333
)
"
"
" " $
"
"
8
"
&
,1
&-2+,+,+,+
5321+& /-&
&0
2-,&1,..-
"
" "
"
2/3333
2/3333
1233333
"
%
"
$ &
" :
3@3
"
123&+3 .
+
,.
24
55555
* 7 7
*@ 7 7
*@ 7 7
555555
B2333
32333/
32333/
$ &2
" " $
"
55
5;
%5
5%
" 5
>
4
!
"
&
,
/
+
0
1
.
3
&
/.-/0
/-1-0
/ ,/3
+33-&
+31.1
/ ./+
-+31/
-/.+&
/11//&&0.
/+03+11+-
#
"
;
%
" "
;5 )
!5 )
555
55555
32.,+0
32 0+,30
323-,+&0
32,3&1-&
9&
9,
9-
9/
9+
+,
+0
+0
+0
++
03
+0
+,
+0
11
++
+,
0&
03
00+
0
0&
03
10
+1
++
03
&&
&+
&/
&&3
&/
&/
&1
&0
&&
&&3
,
,
,
,
,
,
,
,
,310
//
-&
-,
-1
&3,
..
+0
10
&3
& /
,
,
+
&3
1
0
/
1
+
&
.
B ;
*
* ;5 )
* !5 )
555555
32-/+0
*32 /33
*32&/33
*32&/33
"
+3. 123/
/&& .23/
/-3/-21//1.32+,
+,/0&20/
/,3,02.+
/-3/-21-..3,2,
/3,1,2&+
/&---230
/+&&.2/.
+-/1.2+,
"
5 -+ 23/
&+&02./
5&03-21- / 2,0
5&+0-20/
5 31 2.+
50.+.215,.- 2,
1-03205 +/230
-0-2-&0-2,0
"
532,,0.32/1-//
532/100321./+.
532+0,13
532&,+1
5 20, 01
532..&1+
2.+/&.
532 & 132 3&+0
2 &1/0
-1
-2
48000
50000
52000
54000
56000
58000
Predicted Value of y
25
60000
62000
64000
66000