Introduction
Solution
Experiment 1: Correction of mechanically limited, highly biased and irregular distortion
Discussion
Limitations
1 INTRODUCTION
When people provide nominal data there can be biases and omissions that make the data difficult to
work with. Filling in the incomplete data allows an analyst to use their normal methodology in
further assessment, instead of having to create new practices and programs that can handle
irregularly distorted data.
This algorithm is not limited to data provided by people; it can also be applied to sensors, pathing
algorithms, control systems and government department spending analyses.
2 OVERVIEW OF ALGORITHM
This algorithm uses the apparent bias of a person in the data they have provided to fill in data that
they have omitted or to correct for wildly uncharacteristic values.
3 SOLUTION
The data is first set up as a simple graphical example.
Christopher Lindfield
Page 1 of 5
The mean, variance and standard deviation are now determined.
\[ \mu = E[X] \]
\[ \sigma^2 = E[(X - \mu)^2] \]
\[ \sigma = \sqrt{E[(X - \mu)^2]} \]
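As a minimal sketch of this step, the statistics can be computed for each variable across the five nominators. The rows below are the N1 to N5 values listed in the experiment data; the function name `row_statistics` is illustrative only. Using the sample (n - 1) forms from Python's `statistics` module appears to reproduce the Var and std dev columns reported there.

```python
from statistics import mean, variance, stdev

# One row per variable, one value per nominator (N1..N5), taken from the experiment data.
rows = [
    [1, 1, 1, 2, 3],
    [1, 1, 2, 3, 4],
    [1, 2, 3, 4, 5],
    [2, 3, 4, 5, 6],
    [3, 4, 5, 6, 7],
    [4, 5, 6, 7, 8],
    [5, 6, 7, 8, 9],
    [5, 7, 8, 9, 10],
]

def row_statistics(values):
    """Return (mean, sample variance, sample standard deviation) of one row."""
    return mean(values), variance(values), stdev(values)

stats = [row_statistics(r) for r in rows]
for mu, var, sd in stats:
    print(f"mean={mu:.2f}  var={var:.2f}  std dev={sd:.2f}")
```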
The nominator bias is then determined. Note that probability functions are superior for generating
general curves but do not allow discrete inputs without curve estimation. The discrete statistics
allow for continuous inputs using quantisation and weightings determined from integration.
The mean bias determines how much higher or lower this nominator is, relative to the other
nominators.
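\[ b_{ij} = \frac{x_{ij} - \mu_i}{\sigma_i} \]
\[ \bar{b}_j = \frac{1}{m} \sum_{i=1}^{m} b_{ij} \]
where x_{ij} is the value nominator j gave for variable i, \mu_i and \sigma_i are the mean and
standard deviation for variable i, and m is the number of variables.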
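The bias step can be sketched as follows. This is an illustration under a stated assumption, not a confirmed definition: the bias of a value is taken to be its deviation from the variable's mean, measured in standard deviations, and a nominator's mean bias is the average of those biases over all variables. The function names and the example data are hypothetical.

```python
from statistics import mean, stdev

def standardised_biases(row):
    """Assumed bias of each value: deviation from the row mean in standard deviations.

    Requires at least two values that are not all equal (non-zero standard deviation).
    """
    mu, sd = mean(row), stdev(row)
    return [(x - mu) / sd for x in row]

def mean_bias_per_nominator(rows):
    """Average each nominator's bias over all variables (one column per nominator)."""
    per_row = [standardised_biases(r) for r in rows]
    return [mean(col) for col in zip(*per_row)]

# Hypothetical data: three variables (rows) scored by three nominators (columns).
example = [
    [1, 2, 3],
    [2, 4, 6],
    [4, 5, 6],
]
biases = mean_bias_per_nominator(example)
print(biases)
```

In this hypothetical data the first nominator is consistently low and the third consistently high, which is what the per-nominator averages pick up.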
Using the profile bias of the nominator from the total analysis at the beginning is important at the
individual variable level as there may be uncharacteristic local bias within a structure, which is
substantially decreased by including the global bias.
These expected means are then averaged to get the reference mean that would be expected if the
distribution is usual.
New data is generated using this expected mean so that analysis can be done as if the data set was
complete.
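\[ \hat{x}_{ij} = \hat{\mu}_i + \bar{b}_j \, \sigma_i \]
where \hat{\mu}_i is the expected mean for variable i, \bar{b}_j is the mean bias of nominator j
and \sigma_i is the standard deviation for variable i.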
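A minimal sketch of the fill-in step, under the assumption that a value is modelled as the reference mean plus the nominator's bias times the standard deviation. Each known value then implies an expected mean once its nominator's bias is removed, and those expected means are averaged. The function `fill_row`, the use of `None` for omitted values and the example numbers are all hypothetical.

```python
from statistics import mean, stdev

def fill_row(row, nominator_bias):
    """Fill omitted entries (None) in one variable's row.

    Assumes value = reference mean + nominator bias * standard deviation.
    Needs at least two known values so that a standard deviation exists.
    """
    known = [(x, b) for x, b in zip(row, nominator_bias) if x is not None]
    sd = stdev([x for x, _ in known])
    # Each known value implies an expected mean once its nominator's bias is removed.
    expected_means = [x - b * sd for x, b in known]
    reference_mean = mean(expected_means)
    return [x if x is not None else reference_mean + b * sd
            for x, b in zip(row, nominator_bias)]

# Hypothetical biases for three nominators, and a row where the third omitted a value.
bias = [-1.0, 0.0, 1.0]
filled = fill_row([4.0, 6.0, None], bias)
print(filled)
```

Known values are left unchanged here; only the omitted entry is generated, so later analysis can treat the row as complete.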
4 EXPERIMENT 1: CORRECTION OF MECHANICALLY LIMITED, HIGHLY BIASED AND IRREGULAR DISTORTION

N1      N2      N3      N4      N5      Var     Std dev
1.00    1.00    1.00    2.00    3.00    0.80    0.89
1.00    1.00    2.00    3.00    4.00    1.70    1.30
1.00    2.00    3.00    4.00    5.00    2.50    1.58
2.00    3.00    4.00    5.00    6.00    2.50    1.58
3.00    4.00    5.00    6.00    7.00    2.50    1.58
4.00    5.00    6.00    7.00    8.00    2.50    1.58
5.00    6.00    7.00    8.00    9.00    2.50    1.58
5.00    7.00    8.00    9.00    10.00   3.70    1.92
Characteristic bias

N1      N2      N3      N4      N5
-0.52   -0.52   -0.52   0.60    1.71
-0.86   -0.86   -0.09   0.68    1.44
-1.21   -0.58   0.05    0.69    1.32
-1.11   -0.47   0.16    0.79    1.42
-1.00   -0.37   0.26    0.90    1.53
-0.90   -0.26   0.37    1.00    1.63
-0.79   -0.16   0.47    1.11    1.74
-1.10   -0.06   0.46    0.98    1.50

Mean bias per nominator:
-0.94   -0.41   0.15    0.84    1.54
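As a quick check on how the last five values relate to the rest of the characteristic bias data, the sketch below averages each nominator's eight biases. The grouping of the values into columns N1 to N5 is an assumption based on the order in which they appear; under that assumption the averages agree with the five reported figures to within rounding.

```python
from statistics import mean

# Per-variable biases grouped by nominator (grouping assumed from the listing order).
bias_by_nominator = {
    "N1": [-0.52, -0.86, -1.21, -1.11, -1.00, -0.90, -0.79, -1.10],
    "N2": [-0.52, -0.86, -0.58, -0.47, -0.37, -0.26, -0.16, -0.06],
    "N3": [-0.52, -0.09, 0.05, 0.16, 0.26, 0.37, 0.47, 0.46],
    "N4": [0.60, 0.68, 0.69, 0.79, 0.90, 1.00, 1.11, 0.98],
    "N5": [1.71, 1.44, 1.32, 1.42, 1.53, 1.63, 1.74, 1.50],
}

# Mean bias of each nominator across all eight variables.
characteristic_bias = {name: mean(values) for name, values in bias_by_nominator.items()}

for name, value in characteristic_bias.items():
    print(f"{name}: {value:.3f}")
```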
New mean    Characteristic values
            N1      N2      N3      N4      N5
1.68        0.63    1.10    1.60    2.22    2.84
2.42        0.90    1.58    2.31    3.21    4.12
3.29        1.44    2.27    3.15    4.25    5.35
4.12        2.27    3.10    3.98    5.08    6.18
4.96        3.10    3.93    4.81    5.91    7.01
5.79        3.94    4.77    5.65    6.75    7.85
6.62        4.77    5.60    6.48    7.58    8.68
7.57        5.32    6.33    7.40    8.74    10.07
-0.60
-0.15
-0.05
0.00
0.04
0.06
0.07
0.08
-0.11
-0.07
-0.06
-0.02
0.01
0.04
0.05
0.03
0.05
-0.03
-0.07
-0.03
0.00
0.02
0.04
-0.01
-0.10
-0.58
-0.13
-0.03
0.02
0.05
0.07
0.10
The error is 10%, so the reliability of this approach is approximately 90%.
5 DISCUSSION
From table 7.4 it can be seen that the algorithm corrected the significantly distorted data by up to
60% yet left the undistorted data almost untouched with a correction of ~4%.
The actual error of the analysis was 10%.
Why is this valuable in general?
This algorithm can also be used when exact relationships are not known, but suspected, between
entities such as cities or industries. It does not require any consideration of weighting between
indirectly related variables as this does not affect the bias.
6 LIMITATIONS
The algorithm as described assumes that the profiles of the nominators are consistent across
structures. This need not be the case: treating each structure separately and weighting each one
when determining the global bias allows the algorithm to be used with inconsistent biases across
multiple structures.
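One way this weighting could look, as a sketch only: the global bias of a nominator is taken as a weighted average of the biases measured in each structure. The weighted-average form, the function name and the example weights are assumptions for illustration; how the weights themselves should be chosen is not specified here.

```python
def weighted_global_bias(structure_biases, weights):
    """Combine one nominator's per-structure biases into a single global bias.

    Assumes a plain weighted average; weights must be non-negative and not all zero.
    """
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(b * w for b, w in zip(structure_biases, weights)) / total

# Hypothetical nominator: strongly low in one structure, slightly high in another,
# with the first structure given three times the weight of the second.
global_bias = weighted_global_bias([-1.0, 0.5], [3, 1])
print(global_bias)
```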