Introduction to Bootstrapping
James Guszcza, FCAS, MAAA
Deloitte Consulting
CAS Predictive Modeling Seminar
Chicago, September 2005

What's it all about?

- Actuaries compute point estimates of statistics all the time:
  - Loss ratio/claim frequency for a population
  - Outstanding losses
  - Correlation between variables
  - GLM parameter estimates
- A point estimate tells us what the data indicates.
- But how can we measure our confidence in this indication?

More Concisely

- A point estimate says: "what do you think?"
- The variability of the point estimate says: "how sure are you?"
- Traditional approaches:
  - Credibility theory
  - Use distributional assumptions to construct confidence intervals
- Is there an easier and more flexible way?

Enter the Bootstrap

- In the late 1970s the statistician Brad Efron made an ingenious suggestion.
- Most (sometimes all) of what we know about the true probability distribution comes from the data.
- So let's treat the data as a proxy for the true distribution.
- We draw multiple samples from this proxy; this is called "resampling."
- And we compute the statistic of interest on each of the resulting pseudo-datasets.

Philosophy

- "[Bootstrapping] requires very little in the way of modeling, assumptions, or analysis, and can be applied in an automatic way to any situation, no matter how complicated. An important theme is the substitution of raw computing power for theoretical analysis."
  --Efron and Gong, 1983
- Bootstrapping fits very nicely into the data mining paradigm.

The Basic Idea

Theoretical Picture

- Any actual sample of data was drawn from the unknown "true distribution in the sky."
- We use the actual data to make inferences about the true parameters (θ).
- Each hypothetical sample, Sample i = (Y_i1, Y_i2, ..., Y_ik), is a sample that might have been drawn; each yields its own estimate Y-bar_i.
- The distribution of our estimator (Y-bar) depends on both the true distribution and the size (k) of our sample.

[Diagram: the true distribution generates Samples 1 through N, each of size k, each producing an estimate Y-bar_1, ..., Y-bar_N.]

The Basic Idea

The Bootstrapping Process

- Treat the actual sample (Y_1, Y_2, ..., Y_k) as a proxy for the true distribution.
- Sample with replacement from your actual distribution N times.
- Compute the statistic of interest on each re-sample: Re-sample i = (Y*_i1, Y*_i2, ..., Y*_ik) yields Y-bar*_i.
- {Y-bar*} constitutes an estimate of the distribution of Y-bar.

[Diagram: the actual sample generates Re-samples 1 through N, each of size k, each producing an estimate Y-bar*_1, ..., Y-bar*_N.]

Sampling With Replacement

- In fact, there is a chance of (1 - 1/500)^500 ≈ 1/e ≈ .368 that any one of the original data points won't appear at all if we sample with replacement 500 times.
- So any data point is included with probability ≈ .632.
- Intuitively, we treat the original sample as the "true population in the sky."
- Each resample simulates the process of taking a sample from the true distribution.
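
A quick R check of this arithmetic (a sketch; n = 500 matches the example, but any sample size works):

n <- 500
(1 - 1/n)^n    # probability a given point is never drawn: ~ 0.368
exp(-1)        # the limiting value 1/e
# simulated probability that point #1 appears in a resample: ~ 0.632
mean(replicate(2000, 1 %in% sample(1:n, n, replace = TRUE)))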

Theoretical vs. Empirical

- Graph on left: Y-bar calculated from a number of samples from the true distribution.
- Graph on right: {Y-bar*} calculated in each of 1000 resamples from the empirical distribution.
- Analogy: θ : Y-bar :: Y-bar : Y-bar*

[Figures: "true distribution (Y-bar)" with ybar running roughly from 70 to 120, and "bootstrap distribution (Y-bar*)" with y.star.bar running roughly from 98.5 to 101.0]

Summary

- The empirical distribution (your data) serves as a proxy for the true distribution.
- Resampling means (repeatedly) sampling with replacement.
- Resampling the data is analogous to the process of drawing the data from the true distribution.
- We can resample multiple times:
  - Compute the statistic of interest T on each resample.
  - We get an estimate of the distribution of T.

Motivating Example

- Let's look at a simple case where we all know the answer in advance.
- Pull 500 draws from the n(5000,100) distribution.
- The sample mean ≈ 5000 is a point estimate of the true mean μ.
- But how sure are we of this estimate?
- From theory, we know that: s.d.(X-bar) = σ/√N = 100/√500 ≈ 4.47

Raw data summary:
statistic   value
#obs        500
mean        4995.79
sd          98.78
2.5%ile     4812.30
97.5%ile    5195.58
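
A sketch of the setup in R (the seed is an assumption; the deck does not fix one, so exact values will differ):

set.seed(1)   # assumption: no seed is given in the deck
x <- rnorm(500, mean = 5000, sd = 100)
c(n = length(x), mean = mean(x), sd = sd(x))
quantile(x, probs = c(0.025, 0.975))
100 / sqrt(500)   # theoretical s.d. of the sample mean, ~ 4.47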

Visualizing the Raw Data

- 500 draws from n(5000,100).
- Look at summary statistics, histogram, probability density estimate, QQ-plot.
- Looks pretty normal.

[Figures: histogram/density of the n(5000,100) data and Normal Q-Q plot; summary table as on the previous slide]

Sampling With Replacement

- Now let's use resampling to estimate the s.d. of the sample mean (4.47).
- Draw a data point at random from the data set. Then throw it back in.
- Draw a second data point. Then throw it back in.
- Keep going until we've got 500 data points.
- You might call this a "pseudo data set."
- This is not merely re-sorting the data: some of the original data points will appear more than once; others won't appear at all.

Resampling

- Sample with replacement 500 data points from the original dataset S.
- Call this S*_1.
- Now do this 999 more times!
- S*_1, S*_2, ..., S*_1000
- Compute X-bar on each of these 1000 samples.

R Code

# Draw the raw data, then bootstrap the mean and s.d. R times
norm.data <- rnorm(500, mean = 5000, sd = 100)

boots <- function(data, R) {
  # results go to the globals b.avg and b.sd so later snippets can use them
  b.avg <<- c(); b.sd <<- c()
  for (b in 1:R) {
    # one re-sample: n draws from the data, with replacement
    ystar <- sample(data, length(data), replace = TRUE)
    b.avg <<- c(b.avg, mean(ystar))  # statistic of interest: the mean
    b.sd  <<- c(b.sd, sd(ystar))     # also track each pseudo-dataset's s.d.
  }
}

boots(norm.data, 1000)

Results

- From theory we know that X-bar ~ n(5000, 4.47).
- Bootstrapping estimates this pretty well!
- And we get an estimate of the whole distribution, not just a confidence interval.

X-bar: theory vs. bootstrap
statistic   theory    bootstrap
#samples    1,000     1,000
mean        5000.00   4995.98
s.d.        4.47      4.43
2.5%ile     4991.23   4987.60
97.5%ile    5008.77   5004.82

Raw data summary: #obs 500; mean 4995.79; sd 98.78; 2.5%ile 4705.08; 97.5%ile 5259.27.

[Figure: histogram and Normal Q-Q plot of the bootstrap X-bar data, roughly 4985 to 5010]

Two Ways of Looking at a Confidence Interval

- Approximate normality assumption:
  - X-bar ± 2*(bootstrap dist s.d.)
- Percentile method:
  - Just take the desired percentiles of the bootstrap histogram.
  - More reliable in cases of asymmetric bootstrap histograms.

mean(norm.data) - 2 * sd(b.avg)
[1] 4986.926
mean(norm.data) + 2 * sd(b.avg)
[1] 5004.661
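
For comparison, the percentile method is a one-liner, reading the interval straight off the bootstrap distribution (using the b.avg vector produced by boots() above):

quantile(b.avg, probs = c(0.025, 0.975))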

And a Bonus

- Note that we can calculate both the mean and standard deviation of each pseudo-dataset.
- This enables us to estimate the correlation between the mean and s.d.
- The normal distribution is not skewed, so the mean and s.d. are uncorrelated.
- Our bootstrapping experiment confirms this.

[Figure: scatterplot of sample.sd (roughly 90 to 110) against sample.mean (roughly 4985 to 5010); no visible relationship]
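
A sketch of that check, again using the b.avg and b.sd vectors produced by boots():

cor(b.avg, b.sd)   # expect a value near 0 for normal data
plot(b.avg, b.sd, xlab = "sample.mean", ylab = "sample.sd")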

More Interesting Examples

- We've seen that bootstrapping replicates a result we know to be true from theory.
- Often in the real world we either don't know the true distributional properties of a random variable, or are too busy to find out.
- This is when bootstrapping really comes in handy.

Severity Data

- 2700 size-of-loss data points.
- Let's estimate the distributions of the sample mean & 75th %ile.
- Gamma? Lognormal? Don't need to know.
- Mean = 3052, Median = 1136

[Figure: severity distribution density; x-axis from 0 to 50,000]
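
A sketch of the two bootstraps; sev is a stand-in name for the 2700 severity values, which are not included in the deck:

# 'sev' is a hypothetical vector holding the 2700 size-of-loss points
boot.mean <- replicate(1000, mean(sample(sev, length(sev), replace = TRUE)))
boot.p75  <- replicate(1000,
                       quantile(sample(sev, length(sev), replace = TRUE), 0.75))
quantile(boot.mean, probs = c(0.025, 0.975))
quantile(boot.p75,  probs = c(0.025, 0.975))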

Bootstrapping Sample Avg, 75th %ile

[Figures: histograms and Normal Q-Q plots of the bootstrap distribution of the severity sample average and of the severity 75th %ile; both run roughly from 2800 to 3400 and look approximately normal]

What about the 90th %ile?

- So far so good: bootstrapping shows that many of our sample statistics (even average severity!) are approximately normally distributed.
- But this breaks down if our statistic is not a smooth function of the data.
- Often in loss reserving we want to focus our attention way out in the tail; the 90th %ile is an example.

[Figure: bootstrap distribution of the severity 90th %ile, roughly 7000 to 9000; histogram and Normal Q-Q plot show a lumpy, non-normal shape]
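
The same bootstrap sketch works; the histogram comes out lumpy because a high quantile jumps between a few order statistics (sev again stands in for the severity data):

boot.p90 <- replicate(1000,
                      quantile(sample(sev, length(sev), replace = TRUE), 0.90))
hist(boot.p90)     # expect a discrete, non-normal-looking histogram
qqnorm(boot.p90)   # expect visible departures from the straight line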

Variance Related to the Mean

- As with the normal example, we can calculate both the sample average and s.d. on each pseudo-dataset.
- This time (as one would expect) the variance is a function of the mean.

[Figure: scatterplot of sample.sd (roughly 5000 to 6000) against sample.mean (roughly 2800 to 3400); clear positive relationship]

Bootstrapping a Correlation Coefficient #1

- About 700 data points.
- Credit on a scale of 1-100; 1 is worst, 100 is best.
- Age and credit are linearly related (see plot).
- R² ≈ .08; ρ ≈ .28
- Older people tend to have better credit.
- What is the confidence interval around ρ?

[Figure: plot of age vs. credit]
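
A sketch of the case-resampling bootstrap for ρ; age and credit are stand-ins for the ~700-point dataset. The key point is to resample rows, so each (age, credit) pair stays together:

boot.rho <- replicate(1000, {
  i <- sample(length(age), replace = TRUE)  # resample case indices
  cor(age[i], credit[i])                    # recompute rho on the pseudo-dataset
})
sd(boot.rho)
quantile(boot.rho, probs = c(0.025, 0.975))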

Bootstrapping a Correlation Coefficient #1

- ρ appears normally distributed.
- ρ ≈ .28; s.d.(ρ) ≈ .028
- Both confidence interval calculations agree fairly well:

> quantile(boot.avg, probs=c(.025,.975))
     2.5%     97.5%
0.2247719 0.3334889
> rho - 2*sd(boot.avg); rho + 2*sd(boot.avg)
0.2250254 0.3354617

[Figure: histogram and Normal Q-Q plot of the bootstrap distribution of the correlation coefficient, roughly 0.20 to 0.35]

Bootstrapping a Correlation Coefficient #2

- Let's try a different example.
- 1300 zip-code level data points.
- Variables: population density, median #vehicles/HH.
- R² ≈ .50; ρ ≈ -.70

[Figure: median #vehicles vs. pop density (0 to 30,000), with regression line and loess line]

Bootstrapping a Correlation Coefficient #2

- ρ is more skew.
- ρ ≈ -.70
- 95% conf interval: (-.75, -.67)
- Not symmetric around ρ.
- The effect becomes more pronounced the higher the value of |ρ|.

[Figure: histogram and Normal Q-Q plot of the bootstrap distribution of the correlation coefficient, roughly -0.75 to -0.65; visibly skewed]

Bootstrapping Loss Ratio

- Now for what we've all been waiting for...
- The total loss ratio of a segment of business is our favorite point estimate.
- Its variability depends on many things:
  - Size of book
  - Loss distribution
  - Accuracy of rating plan
  - Consistency of underwriting
- How could we hope to write down the true probability distribution?
- Bootstrapping to the rescue...

Bootstrapping Loss Ratio & Frequency

- 50,000 insurance policies.
- Severity dist from the previous example.
- LR = .79
- Claim frequency = .08
- Let's build confidence intervals around these two point estimates (see the sketch below).
- We will resample the data 500 times:
  - Compute total LR and freq on each sample.
  - Plot the histogram.
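
A sketch of the LR bootstrap; loss and premium are hypothetical policy-level vectors of length 50,000:

boot.lr <- replicate(500, {
  i <- sample(length(loss), replace = TRUE)  # resample whole policies
  sum(loss[i]) / sum(premium[i])             # total LR = total loss / total premium
})
hist(boot.lr)
quantile(boot.lr, probs = c(0.025, 0.975))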

Results: Distribution of Total LR

- A little skew, but somewhat close to normal.
- LR ≈ .79
- s.d.(LR) ≈ .05
- conf interval ≈ ±.1
- Confidence interval calculations disagree a bit:

> quantile(boot.avg, probs=c(.025,.975))
     2.5%     97.5%
0.6974607 0.8829664
> lr - 2*sd(boot.avg); lr + 2*sd(boot.avg)
0.6897653 0.8888983

[Figure: histogram and Normal Q-Q plot of the bootstrap total LR, roughly 0.7 to 1.0]

Dependence on Sample Size

- Let's take a sub-sample of 10,000 policies.
- How does this affect the variability of LR?
- Again re-sample 500 times.
- Skewness and variance increase considerably:
  - LR: .79 → .78
  - s.d.(LR): .05 → .13

[Figure: histogram and Normal Q-Q plot of the bootstrap total LR for the sub-sample, roughly 0.6 to 1.4]

Distribution of Capped LR

- Capped LR is analogous to the trimmed mean from robust statistics:
  - Removes the leverage of a few large data points.
  - Closer to frequency.
- Here we cap policy-level losses at $30,000.
  - Affects 50 out of 2700 claims.
- The distribution is less skew, close to normal.
- The s.d. is cut in half! .05 → .025

[Figure: histogram and Normal Q-Q plot of the bootstrap LR with losses capped at $30K, roughly 0.55 to 0.70]

Results: Distribution of Frequency

- Much less variance than LR; very close to normal.
- freq ≈ .08
- s.d.(freq) ≈ .0017
- Confidence interval calculations match very well:

> quantile(boot.avg, probs=c(.025,.975))
      2.5%      97.5%
0.07734336 0.08391072
> lr - 2*sd(boot.avg); lr + 2*sd(boot.avg)
0.07719618 0.08388898

[Figure: histogram and Normal Q-Q plot of the bootstrap total frequency, roughly 0.074 to 0.086]

When are LRs statistically different?

- Example: Divide our 50,000 policies into two sub-segments: {clean drivers, other}.
  - LR_tot = .79
  - LR_clean = .58; LRR_clean = -27%
  - LR_other = .84; LRR_other = +6%
- Clean drivers appear to have a 30% lower LR than non-clean drivers.
- How sure are we of this indication?
- Let's use bootstrapping.

Bootstrapping the difference in LRs

- Simultaneously re-sample the two segments 500 times.
- At each iteration, calculate LR*_c, LR*_o, (LR*_c - LR*_o), and (LR*_c / LR*_o).
- Analyze the resulting empirical distributions (a sketch follows):
  - What is the average difference in loss ratios?
  - What percent of the time is the difference in loss ratios greater than x%?
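
A sketch of the simultaneous (per-segment) resampling; clean and other are hypothetical data frames of policies with loss and premium columns:

lr <- function(d) sum(d$loss) / sum(d$premium)
boot.diff <- replicate(500, {
  c.star <- clean[sample(nrow(clean), replace = TRUE), ]  # resample within segment
  o.star <- other[sample(nrow(other), replace = TRUE), ]
  lr(o.star) - lr(c.star)
})
mean(boot.diff)          # average difference in loss ratios
mean(boot.diff > 0.10)   # fraction of resamples where the gap exceeds 10 points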

LR distributions of the sub-populations

[Figures: histograms and Normal Q-Q plots of "LR: clean driving record" (roughly 0.4 to 1.0) and "LR: non-clean record" (roughly 0.70 to 1.05)]

LRR distributions of the sub-populations

[Figures: histograms and Normal Q-Q plots of "LRR: clean driving record" (roughly 0.5 to 1.1) and "LRR: non-clean record" (roughly 1.00 to 1.10)]

Distribution of LRR Differences

[Figures: histograms and Normal Q-Q plots of "LRR_other - LRR_clean" (roughly 0.0 to 0.6) and "LRR_other / LRR_clean" (roughly 1.0 to 2.5)]

Final Example: Loss Reserve Variability

- A major issue in the loss reserving community is reserve variability.
- Bootstrapping is a natural way to tackle this problem.
- Goal: the predictive variance of your estimate of outstanding losses.
- It is hard to find an analytic formula for the variability of these o/s losses.
- Approach here: bootstrap cases, not residuals.

Bootstrapping Reserves

- S = database of 5000 claims.
- Sample with replacement all policies in S:
  - Call this S*_1.
  - Same size as S.
- Now do this 499 more times!
- S*_1, S*_2, ..., S*_500
- Estimate o/s reserves on each sample (sketch below).
- Get a distribution of reserve estimates.
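
In outline (a sketch only; claims is a hypothetical claim-level data frame and estimate.reserves() a hypothetical wrapper around your reserving method):

boot.reserves <- replicate(500, {
  s.star <- claims[sample(nrow(claims), replace = TRUE), ]  # one resampled database
  estimate.reserves(s.star)                                 # re-run the reserving method
})
quantile(boot.reserves, probs = c(0.025, 0.975))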

Simulated Loss Data

- Simulate a database of 5000 claims: 500 claims/year, 10 years.
- Each of the 5000 claims was drawn from a lognormal distribution with parameters μ = 8, σ = 1.3.
- Build in loss development patterns: L_{i+j} = L_i * (link + ε), where ε is a random error term.
- See the CLRS presentation (2005) for more details.
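
A minimal sketch of that simulation; the seed, link ratio, and noise scale are assumptions, since the deck does not give them:

set.seed(1)
L.i  <- rlnorm(500 * 10, meanlog = 8, sdlog = 1.3)  # 500 claims/year for 10 years
link <- 1.25                                        # hypothetical link ratio
eps  <- rnorm(length(L.i), sd = 0.05)               # the random error term
L.next <- L.i * (link + eps)                        # L_{i+j} = L_i * (link + eps)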

Bootstrapping Reserves

- Compute our reserve estimate on each S*_k.
- These 500 reserve estimates constitute an estimate of the distribution of outstanding losses.
- Notice that we did this by resampling our original dataset S of claims.
- Note: this bootstrapping method differs from other analyses, which bootstrap the residuals of a model.
  - Those methods rely on the assumption that your model is correct.

Distribution of Outstanding Losses

- Blue bars: the bootstrapped distribution.
- Dotted line: kernel density estimate of the distribution.
- Pink line: superimposed normal.

[Figure: "total reserves - all 10 years"; x-axis roughly 19,000 to 25,000]

Distribution of Outstanding Losses

- Mean: $21.751M; Median: $21.746M
- σ: $0.982M; σ/mean ≈ 4.5%
- The simulated dist of outstanding losses appears normal.
- 95% confidence interval: (19.8M, 23.7M)
- Note: the 2.5 and 97.5 %iles of the bootstrapping distribution roughly agree with $21.75M ± 2σ.

[Figure: "total reserves - all 10 years" with the 95% confidence interval marked; x-axis roughly 19,000 to 25,000]

Distribution of Outstanding Losses

- We can examine a QQ plot to verify that the distribution of o/s losses is approximately normal.
- However, the tails are somewhat heavier than normal.
- Remember, this is just simulated data!
- Real-life results have been consistent with these results.

[Figure: histogram and Normal Q-Q plot of "total reserves - all 10 years", roughly 19,000 to 25,000]

References

- Davison and Hinkley, Bootstrap Methods and their Application.
- Efron and Tibshirani, An Introduction to the Bootstrap.
- Efron and Gong, "A Leisurely Look at the Bootstrap," American Statistician, 1983.
- Efron and Tibshirani, "Bootstrap Methods for Standard Errors," Statistical Science, 1986.
- Derrig, Ostaszewski, and Rempala, "Applications of Resampling Methods in Actuarial Practice," PCAS, 2000.
