
AIC Supplement, Fitting Loss Distributions
HCM, 9/27/16
Copyright 2017 by Howard C. Mahler.


A short study note that replaces Section 16.5.3, Score-Based Approaches, in the Loss Models
textbook. This study note updates Section 16.5.3 by including material on the Akaike Information
Criterion (AIC), along with updated examples illustrating how to use the AIC. This note is effective
with the October 2016 exam administration. Download the note from the SOA webpage.1

AIC and BIC are each methods of comparing models fit via maximum likelihood.
In each case, a larger value is better.
AIC (Akaike Information Criterion):
The Akaike Information Criterion (AIC) is used to compare a set of models all fit via maximum
likelihood to the same data. The model with the largest AIC is preferred. For a particular model:
AIC = maximum loglikelihood - number of parameters = ln[L] - r.2
r is the number of parameters fit via maximum likelihood.
For example, assume we have three models fit to the same data:

Model #    Number of Parameters    Loglikelihood    AIC
   1                4                 -302.7        -302.7 - 4 = -306.7
   2                5                 -301.2        -301.2 - 5 = -306.2
   3                6                 -300.4        -300.4 - 6 = -306.4

We prefer Model #2, since it has the largest AIC.
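
A minimal Python sketch of the comparison above, using the Loss Models definition AIC = ln[L] - r;
the dictionary layout and the helper function aic below are illustrative, not notation from the study note:

    # Loglikelihoods and parameter counts from the three-model example above.
    models = {1: {"params": 4, "loglik": -302.7},
              2: {"params": 5, "loglik": -301.2},
              3: {"params": 6, "loglik": -300.4}}

    def aic(loglik, num_params):
        # Loss Models definition: loglikelihood minus number of parameters; larger is better.
        return loglik - num_params

    scores = {m: aic(v["loglik"], v["params"]) for m, v in models.items()}
    best = max(scores, key=scores.get)
    print(best)  # 2: Model #2 has the largest AIC, about -306.2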

1 https://www.soa.org/education/exam-req/edu-exam-c-detail.aspx
  Then click on the Oct. 2016 syllabus. The syllabus has a link to the new study note.
2 This is the definition in Loss Models.
  Most other textbooks define AIC = (-2)(maximum loglikelihood) + 2(number of parameters).
  In that case, the smaller AIC would instead be preferred. Which model is preferred is the same,
  regardless of which definition of AIC one uses.


BIC (Bayesian Information Criterion):


In Loss Models, BIC is just another name for the Schwarz Bayesian Criterion (SBC).3
The Bayesian Information Criterion can also be used to compare a set of models all fit via
maximum likelihood to the same data. The model with the largest BIC is preferred.
For a particular model:
BIC = maximum loglikelihood - (number of parameters / 2) ln(number of data points) =
ln[L] - (r/2) ln[n].4
n is the number of data points, and r is the number of parameters fit via maximum likelihood.
For example, assume we have three models fit to the same 100 data points:

Model #    Number of Parameters    Loglikelihood    BIC
   1                4                 -533.6        -533.6 - (4/2) ln[100] = -542.8
   2                5                 -530.1        -530.1 - (5/2) ln[100] = -541.6
   3                6                 -527.3        -527.3 - (6/2) ln[100] = -541.1

We prefer Model #3, since it has the largest BIC.
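
A minimal Python sketch of the BIC comparison above, using the Loss Models definition
BIC = ln[L] - (r/2) ln[n] with n = 100; the layout mirrors the AIC sketch and is illustrative only:

    import math

    n = 100  # number of data points
    models = {1: {"params": 4, "loglik": -533.6},
              2: {"params": 5, "loglik": -530.1},
              3: {"params": 6, "loglik": -527.3}}

    def bic(loglik, num_params, num_points):
        # Loss Models definition: ln[L] - (r/2) ln[n]; larger is better.
        return loglik - (num_params / 2) * math.log(num_points)

    scores = {m: bic(v["loglik"], v["params"], n) for m, v in models.items()}
    best = max(scores, key=scores.get)
    print(best)  # 3: Model #3 has the largest BIC, about -541.1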


The penalty for AIC is r, while the penalty for BIC is (r/2) ln[n]. For large data sets, the penalty for
BIC is larger.5 Thus, BIC prefers simpler models than does AIC.
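To see why, note that the BIC penalty exceeds the AIC penalty when (r/2) ln[n] > r, which is
equivalent to ln[n] > 2, i.e. n > e^2 ≈ 7.4. Thus for any data set of at least 8 points, BIC penalizes
each additional parameter more heavily than AIC does.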

3 The Schwarz Bayesian Criterion is covered in Mahler's Guide to Fitting Loss Distributions.
4 This is the definition in Loss Models.
  Most other textbooks define BIC = (-2)(maximum loglikelihood) + (number of parameters) ln(number of data points).
  In that case, the smaller BIC would instead be preferred. Which model is preferred is the same,
  regardless of which definition of BIC one uses.
5 For n ≥ 8, the penalty for BIC is larger than for AIC.

Problems:
Supp.1 (3 points) Five Models have been fit to the same set of 200 observations.
Model    Number of Fitted Parameters    LogLikelihood
  A                  3                     -359.17
  B                  4                     -357.84
  C                  5                     -356.42
  D                  6                     -354.63
  E                  7                     -353.85
Which model has the best AIC (Akaike Information Criterion)?
Supp.2 (3 points) Five different Models have been fit to the same set of 400 observations.
Model    Number of Fitted Parameters    LogLikelihood
  A                  1                     -730.18
  B                  2                     -726.24
  C                  3                     -723.56
  D                  4                     -721.02
  E                  5                     -717.50
Which model has the best BIC (Bayesian Information Criterion)?
Supp.3 (3 points) Use the following information on two Models fit to the same 100 data points:
Number of Fitted Parameters    Loglikelihood
             1                    -321.06
             2                    -319.83
(a) Based on AIC (Akaike Information Criterion), which model is preferred?
(b) Based on BIC (Bayesian Information Criterion), which model is preferred?
Supp.4 (3 points) Five Models have been fit to the same set of 200 observations.
Model    Number of Fitted Parameters    LogLikelihood
  A                  1                     -213.6
  B                  2                     -212.3
  C                  3                     -211.5
  D                  4                     -210.7
  E                  5                     -209.4
Which model has the best AIC (Akaike Information Criterion)?



Supp.5 (3 points) Five Models have been fit to the same set of 60 observations.
Model    Number of Fitted Parameters    LogLikelihood
  A                  2                     -220.18
  B                  3                     -217.40
  C                  4                     -214.92
  D                  5                     -213.25
  E                  6                     -211.03
Which model has the best BIC (Bayesian Information Criterion)?
Supp.6 (3 points) You are given:
(i) Sample size = 100
(ii) The negative loglikelihoods associated with five models are:
Model                   Number of Parameters    Negative Loglikelihood
Generalized Pareto                3                     219.1
Burr                              3                     219.2
Pareto                            2                     221.2
Lognormal                         2                     221.4
Inverse Exponential               1                     224.2
Which of the following is the best model, using the Akaike Information Criterion?
(A) Generalized Pareto  (B) Burr  (C) Pareto  (D) Lognormal  (E) Inverse Exponential
Comment: Data taken from 4, 11/00, Q.10.


Solutions to Problems:
Supp.1. D. AIC = maximum loglikelihood - number of parameters.
For example, AIC = -359.17 - 3 = -362.17.
Model    Number of Parameters    Loglikelihood      AIC
  A                3                -359.17       -362.17
  B                4                -357.84       -361.84
  C                5                -356.42       -361.42
  D                6                -354.63       -360.63
  E                7                -353.85       -360.85

Since AIC is largest for model D, model D is preferred.


Supp.2. B. BIC = maximum loglikelihood - (number of parameters / 2) ln[400].
For example, BIC = -730.18 - (1/2) ln[400] = -733.18.
Model    Number of Parameters    Loglikelihood      BIC
  A                1                -730.18       -733.18
  B                2                -726.24       -732.23
  C                3                -723.56       -732.55
  D                4                -721.02       -733.00
  E                5                -717.50       -732.48

Since BIC is largest for model B, model B is preferred.


Supp.3. (a) AIC = maximum loglikelihood - number of parameters.
For the first model, AIC = -321.06 - 1 = -322.06.
For the second model, AIC = -319.83 - 2 = -321.83.
Since AIC is larger for the second model, the second model is preferred.
(b) BIC = maximum loglikelihood - (number of parameters / 2) ln(number of data points).
For the first model, BIC = -321.06 - (1/2) ln[100] = -323.36.
For the second model, BIC = -319.83 - (2/2) ln[100] = -324.44.
Since BIC is larger for the first model, the first model is preferred.
Comment: An example where AIC and BIC lead to different conclusions.
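A quick Python check of this example, assuming the same Loss Models definitions of AIC and BIC
used above (the variable names are illustrative):

    import math

    n = 100
    models = {1: {"params": 1, "loglik": -321.06},
              2: {"params": 2, "loglik": -319.83}}

    aic = {m: v["loglik"] - v["params"] for m, v in models.items()}
    bic = {m: v["loglik"] - (v["params"] / 2) * math.log(n) for m, v in models.items()}

    print(max(aic, key=aic.get))  # 2: AIC prefers the two-parameter model
    print(max(bic, key=bic.get))  # 1: BIC prefers the one-parameter model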
Supp.4. B. AIC = maximum loglikelihood - number of parameters.
For example, AIC = -212.3 - 2 = -214.3.
Model    Number of Parameters    Loglikelihood     AIC
  A                1                -213.6       -214.6
  B                2                -212.3       -214.3
  C                3                -211.5       -214.5
  D                4                -210.7       -214.7
  E                5                -209.4       -214.4

Since AIC is largest for model B, model B is preferred.


Supp.5. C. BIC = maximum loglikelihood - (number of parameters / 2) ln[60].


For example, BIC = -220.18 - (2/2) ln[60] = -224.27.
Model    Number of Parameters    Loglikelihood      BIC
  A                2                -220.18       -224.27
  B                3                -217.40       -223.54
  C                4                -214.92       -223.11
  D                5                -213.25       -223.49
  E                6                -211.03       -223.31

Since BIC is largest for model C, model C is preferred.


Supp.6. A. AIC = maximum loglikelihood - number of parameters.
For example, AIC = -219.1 - 3 = -222.1.
Model                     Number of Parameters    Loglikelihood     AIC
A. Generalized Pareto               3                -219.1       -222.1
B. Burr                             3                -219.2       -222.2
C. Pareto                           2                -221.2       -223.2
D. Lognormal                        2                -221.4       -223.4
E. Inverse Exponential              1                -224.2       -225.2

Since AIC is largest for model A, model A is preferred.
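
A quick Python check of this solution; since the problem gives negative loglikelihoods, they are
negated before applying AIC = ln[L] - r (the dictionary below is illustrative):

    # (number of parameters, negative loglikelihood) for each model in Supp.6.
    models = {"Generalized Pareto": (3, 219.1), "Burr": (3, 219.2),
              "Pareto": (2, 221.2), "Lognormal": (2, 221.4),
              "Inverse Exponential": (1, 224.2)}

    aic = {name: -neg_loglik - r for name, (r, neg_loglik) in models.items()}
    print(max(aic, key=aic.get))  # Generalized Pareto, answer (A)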

