
INSTITUTE OF ACTUARIES OF INDIA

EXAMINATIONS
9th November 2010
Subject CT6 – Statistical Methods
Time allowed: Three Hours (10.00 – 13.00 Hrs.)

Total Marks: 100

INSTRUCTIONS TO THE CANDIDATES

1. Please read the instructions on the front page of answer booklet and instructions to
examinees sent along with hall ticket carefully and follow without exception

2. Mark allocations are shown in brackets.

3. Attempt all questions, beginning your answer to each question on a separate sheet.
However, answers to objective type questions could be written on the same sheet.

4. In addition to this paper you will be provided with graph paper, if required.

AT THE END OF THE EXAMINATION

Please return your answer book and this question paper to the supervisor separately.
IAI CT6 1110

Q. 1) A veteran actuary believes that the claims from a particular type of policy follow the Burr
distribution with parameters α = 2, λ = 1000 and γ = 0.75. As per his recommendation,
the insurance company has set a deductible such that 25% of the losses result in no claim
to the insurer.

(i) Calculate the size of the deductible. (3)


(ii) An actuarial trainee suspects that the deductible set by the veteran actuary is based
on more of surmise than data. She has access to data on 1250 claims (net of
deductible). Continuing with the assumption of the Burr distribution for the original
claims, she wishes to estimate its parameters from the available data, by using the
method of maximum likelihood. Give an expression for the probability density
function of the observed data (net of deductible), and the likelihood function that
has to be maximized. (4)
(iii) Give an expression for the maximum likelihood estimate (MLE) of the true fraction
of the losses that result in no claim to the insurer, in terms of the MLE of the
parameters. (2)
[9]

Q. 2) The annual number of claims on a particular risk has the Binomial distribution with
maximum claim number 10 and average claim number θ. The prior density of the
parameter θ is
$$\frac{1}{10} \times \frac{(n_1 + n_2 + 1)!}{n_1!\, n_2!} \left(\frac{\theta}{10}\right)^{n_1} \left(1 - \frac{\theta}{10}\right)^{n_2},$$
where n1 and n2 are known positive integers. The number of claims in the years 2007, 2008 and 2009 were X1, X2 and X3,
respectively.
(i) Determine the prior mean of θ. (2)
(ii) Determine the maximum likelihood estimator of θ. (2)
(iii) Determine the Bayes estimate of the number of claims in the year 2010, under the
squared error loss function. (4)
(iv) Show that the estimator of part (iii) has the form of a credibility estimate, and
identify the credibility factor. (2)
(v) Determine the credibility estimator of θ under EBCT Model 1 and compare with
the result of part (iii). (6)

[16]

Q. 3) The aggregate claims process for a risk is a compound Poisson process with rate λ = 50
per annum. Individual claim amounts are Rs. 2500 with probability 0.25, Rs. 5000 with
probability 0.5, or Rs. 7500 with probability 0.25. The premium loading is 10%. Let S
denote the aggregate annual claim amount.
(i) Calculate the mean and variance of S. (2)
(ii) Using a normal approximation to the distribution of S, calculate the initial surplus
required in order that the probability of ruin at the end of the first year is 0.05. (3)
(iii) A reinsurer offers to sell to the insurer proportional reinsurance for 25% of the
claims, for premium loading 15%. If this offer is accepted, calculate the modified
initial surplus required in order that the probability of ruin at the end of the first
year is 0.05. (4)

[9]

Q. 4) The cumulative incurred claims (in thousands of rupees) on a portfolio of insurance
policies are as given in the following table.

Accident    Development Year
Year            0        1        2        3
2006        2,463    2,749    3,529    3,980
2007        3,013    3,278    3,608
2008        3,321    3,716
2009        3,953

The earned premium for the year 2009 is Rs. 6,472,000, while the paid claims are
Rs. 1,731,000.

(i) Assuming that the Ultimate Loss Ratio is 88%, calculate the reserve needed for 2009
using the Bornhuetter-Ferguson (basic) method. (8)

(ii) State the assumptions underlying the use of the above method. (3)
[11]

Q. 5) Consider the autoregressive process given by

$$Y_t = \alpha Y_{t-2} + Z_t,$$

$Z_t$ being white noise with mean zero and variance $\sigma^2$.

(i) What is the range of values of the real valued parameter α so that the process is
stationary? (2)

(ii) Obtain a representation of $Y_t$ as $\sum_{j=0}^{\infty} a_j Z_{t-j}$, by specifying a0, a1,… explicitly. (3)
(iii) Using part (ii) or otherwise, find an expression for the variance of $Y_t$ in terms of α
and $\sigma^2$. (2)
(iv) Compare the result of part (iii) with the variance of an AR(1) process and explain any
similarity or dissimilarity. (2)
[9]

Q. 6) The sample ACF and PACF values at lags 1 to 10 of a time series of length 500, are as
given below.
Lag 1 2 3 4 5 6 7 8 9 10
SACF -0.7793 0.6180 -0.4824 0.386 -0.341 0.3172 -0.2989 0.2728 -0.2181 0.163
SPACF -0.7793 0.0275 0.0188 0.0232 -0.084 0.0538 -0.0289 0.0004 0.0616 -0.0301

(i) Determine through a statistical test whether the time series can be regarded as white
noise. (5)

(ii) Indicate, with reasons, if an AR(p) or an MA(q) model may be appropriate for this
time series, and if so, what could be the model order. (4)
[9]


Q. 7) List six perils that are typically insured against under a household building policy. [3]

Q. 8) A claim analyst of a health insurance company examines data on a portfolio of health


insurance policies. He plans to use a generalized linear model for the claim amounts,
involving the following rating factors.
SA : Sum assured (x), a continuous variable.
AG : Age group, a factor with 10 levels.
OC : Occupation, a factor with 6 levels.
A preliminary analysis produces the following summary for the models considered by the
analyst.

Model               Linear predictor   No of parameters   Scaled deviance
SA                  α + βx             2                  238.4
SA + AG             ?                  ?                  206.7
SA + AG + SA * AG   ?                  ?                  178.3
SA * AG + OC        ?                  ?                  166.2
SA * AG * OC        ?                  ?                  58.9

(i) Complete the table by filling in the cells with question marks. (4)
(ii) On the basis of the scaled deviance, which model should the analyst choose? (3)
(iii) What further considerations should be given before the analyst makes a
recommendation about the choice of the model? (2)
[9]

Q. 9) An actuary uses the following algorithm to generate pseudorandom numbers X from the
Poisson distribution with mean λ.
Step 1: Input lambda.
Step 2: Set X=0; Z=0.
Step 3: Set Y = Random sample from the uniform U(0,1) distribution.
Step 4: Increment Z by the amount –ln(Y)/lambda.
(ln is the log function).
Step 5: If Z<1, then increment X by 1; GO TO Step 3.
Step 6: Output X.
Step 7: GO TO Step 2 for generating the next value of X.
(i) By analysing the above algorithm, show that it generates the value X = 0 with the
correct probability. (3)
(ii) If five successive random samples from the uniform distribution, generated in Step
3, happen to be 0.564, 0.505, 0.756, 0.610 and 0.046, and λ = 2, follow the above
algorithm to generate as many samples of X as this information permits. (4)
(iii) Using the uniformly distributed random samples and the value of λ given in part (ii),
generate samples (as many as possible) from the Poisson distribution, using the
inverse distribution transform method. (5)
[12]

Q. 10) (i) State the individual risk model, with a clear description of the assumptions. (5)

(ii) How is this model different from the collective risk model? (3)
[8]

Q. 11) The owner of a personal computer has to decide whether to sign an Annual Maintenance
Contract (AMC) or to pay for repair separately on each occasion of computer fault. The
AMC costs Rs. 1000, and provides for an unlimited count of repair services. In the
absence of the AMC, the servicing agency charges Rs. 300 for each repair. The owner
assumes the probability distribution of the annual number of faults as follows.
Number of faults 0 1 2 3 4 5 More than 5
Probability 0.1 0.1 0.2 0.3 0.2 0.1 0

(i) Form the loss matrix for the owner of the computer in respect of the above decision. (2)

(ii) What is the minimax decision? (1)

(iii) What is the Bayes decision? (2)


[5]
************************

Institute of Actuaries of India

Subject CT6 – Statistical Methods

November 2010 Examinations

INDICATIVE SOLUTIONS

Introduction

The indicative solution has been written by the Examiners with the aim of helping candidates.
The solutions given are only indicative. It is recognised that other points could also be valid
answers, and the examiners have given credit for any alternative approach or interpretation which they
consider to be reasonable.
Question 1

(i) The deductible D satisfies the equation
$$1 - \left(\frac{\lambda}{\lambda + D^{\gamma}}\right)^{\alpha} = 0.25,$$
i.e.,
$$1 - \left(\frac{1000}{1000 + D^{0.75}}\right)^{2} = 0.25.$$
Thus,
$$\left(\frac{1000}{1000 + D^{0.75}}\right)^{2} = 0.75,$$
$$\frac{1000}{1000 + D^{0.75}} = \sqrt{0.75} = 0.8660,$$
$$1 + \frac{D^{0.75}}{1000} = \frac{1}{0.8660},$$
$$\frac{D^{0.75}}{1000} = \frac{1}{0.8660} - 1 = 0.1547,$$
$$D = 154.7^{1/0.75} = 830.5.$$
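The arithmetic above can be reproduced with a short Python sketch. It assumes the Burr distribution function $F(x) = 1 - (\lambda/(\lambda + x^{\gamma}))^{\alpha}$ used in this solution; the function names are illustrative only.

```python
# Parameters of the Burr distribution as given in the question.
alpha, lam, gamma = 2.0, 1000.0, 0.75
target = 0.25  # fraction of losses that should fall below the deductible


def burr_cdf(x, alpha, lam, gamma):
    """Burr distribution function F(x) = 1 - (lam / (lam + x**gamma))**alpha."""
    return 1.0 - (lam / (lam + x ** gamma)) ** alpha


def burr_quantile(p, alpha, lam, gamma):
    """Solve F(x) = p for x by inverting the formula above."""
    return (lam * ((1.0 - p) ** (-1.0 / alpha) - 1.0)) ** (1.0 / gamma)


deductible = burr_quantile(target, alpha, lam, gamma)
print(round(deductible, 1))
```

Evaluating `burr_cdf(deductible, alpha, lam, gamma)` gives back 0.25, which is a quick check on the inversion.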

(ii) The density of the losses resulting in claims is the following truncated version of the Burr
density.
$$\frac{f(x)}{1 - F(D)} = \frac{\alpha\gamma\lambda^{\alpha} x^{\gamma-1} \big/ (\lambda + x^{\gamma})^{\alpha+1}}{\left(\dfrac{\lambda}{\lambda + D^{\gamma}}\right)^{\alpha}} = \frac{\alpha\gamma(\lambda + D^{\gamma})^{\alpha} x^{\gamma-1}}{(\lambda + x^{\gamma})^{\alpha+1}}, \quad x > D.$$
The probability density function of the claims (net of deductible) is the density of
Y = X − D, where X has the above density. Thus,
$$f_Y(y) = \frac{\alpha\gamma(\lambda + D^{\gamma})^{\alpha}(y + D)^{\gamma-1}}{\{\lambda + (y + D)^{\gamma}\}^{\alpha+1}}, \quad y > 0.$$

If the claims data (net of deductible) are $Y_1, Y_2, \ldots, Y_{1250}$, then the likelihood function to be
maximized with respect to the parameters α, γ and λ is
$$L(\alpha, \gamma, \lambda) = \prod_{i=1}^{1250} f_Y(Y_i) = \prod_{i=1}^{1250} \frac{\alpha\gamma(\lambda + D^{\gamma})^{\alpha}(Y_i + D)^{\gamma-1}}{\{\lambda + (Y_i + D)^{\gamma}\}^{\alpha+1}}.$$
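As an illustration of how the likelihood in part (ii) could be coded before handing it to a numerical optimiser, here is a minimal Python sketch. The function names are hypothetical, and the survival function of the net claims, $\{(\lambda + D^{\gamma})/(\lambda + (y + D)^{\gamma})\}^{\alpha}$, is included only as a consistency check on the density; no data or fitted values are implied.

```python
from math import exp, log


def log_density_net(y, alpha, gamma, lam, d):
    """log f_Y(y): log-density of a claim net of the deductible d, for y > 0."""
    return (log(alpha) + log(gamma)
            + alpha * log(lam + d ** gamma)
            + (gamma - 1.0) * log(y + d)
            - (alpha + 1.0) * log(lam + (y + d) ** gamma))


def density_net(y, alpha, gamma, lam, d):
    """f_Y(y), obtained by exponentiating the log-density."""
    return exp(log_density_net(y, alpha, gamma, lam, d))


def log_likelihood(ys, alpha, gamma, lam, d):
    """Sum of log f_Y over the observed net claims; this is log L(alpha, gamma, lam)."""
    return sum(log_density_net(y, alpha, gamma, lam, d) for y in ys)


def survival_net(y, alpha, gamma, lam, d):
    """P(Y > y) for the net claim; minus its derivative should equal f_Y(y)."""
    return ((lam + d ** gamma) / (lam + (y + d) ** gamma)) ** alpha
```

Working with the log-likelihood rather than the product itself avoids numerical underflow when 1250 densities are multiplied together.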

(iii) The MLE of the true fraction of the losses that result in no claim to the insurer is
$$F(D)\Big|_{\alpha=\hat{\alpha},\,\gamma=\hat{\gamma},\,\lambda=\hat{\lambda}} = 1 - \left(\frac{\hat{\lambda}}{\hat{\lambda} + D^{\hat{\gamma}}}\right)^{\hat{\alpha}},$$
where $\hat{\alpha}$, $\hat{\gamma}$ and $\hat{\lambda}$ are the respective MLEs of the parameters α, γ and λ, obtained by
maximizing the likelihood described in part (ii), and D = 830.5, as determined from part
(i).
[9]

Question 2
(i) Let p = θ/10. Then the prior mean is E(θ) = 10 E(p), where the density of p is
$$\frac{(n_1 + n_2 + 1)!}{n_1!\, n_2!}\, p^{n_1}(1 - p)^{n_2}.$$
$$E(p) = \int_0^1 p \times \frac{(n_1 + n_2 + 1)!}{n_1!\, n_2!}\, p^{n_1}(1 - p)^{n_2}\, dp = \int_0^1 \frac{n_1 + 1}{n_1 + n_2 + 2} \times \frac{(n_1 + n_2 + 2)!}{(n_1 + 1)!\, n_2!}\, p^{n_1+1}(1 - p)^{n_2}\, dp = \frac{n_1 + 1}{n_1 + n_2 + 2}.$$
The integral is evaluated by making use of the fact that
$$\frac{(n_1 + n_2 + 2)!}{(n_1 + 1)!\, n_2!}\, p^{n_1+1}(1 - p)^{n_2}$$
is a probability density function (similar to the above density of p, but n1 is replaced by n1+1).
Thus, the prior mean is
$$10\,E(p) = \frac{10(n_1 + 1)}{n_1 + n_2 + 2}.$$

(ii) The likelihood is
$$\prod_{i=1}^{3} \binom{10}{X_i} \left(\frac{\theta}{10}\right)^{X_i} \left(1 - \frac{\theta}{10}\right)^{10 - X_i}.$$
The log-likelihood is a constant plus
$$(X_1 + X_2 + X_3)\ln\left(\frac{\theta}{10}\right) + (30 - X_1 - X_2 - X_3)\ln\left(1 - \frac{\theta}{10}\right).$$
The MLE of θ, obtained by maximizing the log-likelihood, is
$$\frac{X_1 + X_2 + X_3}{3} = \bar{X}.$$
(iii) The Bayes estimator of θ under the squared error loss function is the mean of the
posterior distribution.
The prior density of θ is
$$\frac{1}{10} \times \frac{(n_1 + n_2 + 1)!}{n_1!\, n_2!} \times \left(\frac{\theta}{10}\right)^{n_1} \left(1 - \frac{\theta}{10}\right)^{n_2}.$$
The likelihood is as given in part (ii).
Therefore, the posterior density of θ is proportional to
$$\frac{1}{10} \times \frac{(n_1 + n_2 + 1)!}{n_1!\, n_2!} \times \left(\frac{\theta}{10}\right)^{n_1} \left(1 - \frac{\theta}{10}\right)^{n_2} \times \prod_{i=1}^{3} \binom{10}{X_i} \left(\frac{\theta}{10}\right)^{X_i} \left(1 - \frac{\theta}{10}\right)^{10 - X_i},$$
i.e., proportional to
$$\left(\frac{\theta}{10}\right)^{n_1 + X_1 + X_2 + X_3} \left(1 - \frac{\theta}{10}\right)^{n_2 + 30 - X_1 - X_2 - X_3}.$$
Comparison with the prior density reveals that the posterior density is
$$\frac{1}{10} \times \frac{(n_1 + n_2 + 31)!}{(n_1 + X_1 + X_2 + X_3)!\,(n_2 + 30 - X_1 - X_2 - X_3)!} \left(\frac{\theta}{10}\right)^{n_1 + X_1 + X_2 + X_3} \left(1 - \frac{\theta}{10}\right)^{n_2 + 30 - X_1 - X_2 - X_3},$$
which is similar to the prior density, but has different parameters. By comparing this
density with the prior and the prior mean computed in part (i), we conclude that the
posterior mean is
$$\frac{10(n_1 + X_1 + X_2 + X_3 + 1)}{n_1 + n_2 + 32}.$$
(iv) The posterior mean can be written as
$$\frac{30}{n_1 + n_2 + 32} \times \frac{X_1 + X_2 + X_3}{3} + \frac{n_1 + n_2 + 2}{n_1 + n_2 + 32} \times \frac{10(n_1 + 1)}{n_1 + n_2 + 2},$$
which is of the form $Z \times \bar{X} + (1 - Z) \times E(\theta)$, with credibility factor
$$Z = \frac{30}{n_1 + n_2 + 32}.$$

(v) $m(\theta) = E(X \mid \theta) = \theta = 10p$. Note that
$$E(p) = \frac{n_1 + 1}{n_1 + n_2 + 2},$$
$$E(p^2) = \int_0^1 p^2 \times \frac{(n_1 + n_2 + 1)!}{n_1!\, n_2!}\, p^{n_1}(1 - p)^{n_2}\, dp = \int_0^1 \frac{(n_1 + 2)(n_1 + 1)}{(n_1 + n_2 + 3)(n_1 + n_2 + 2)} \times \frac{(n_1 + n_2 + 3)!}{(n_1 + 2)!\, n_2!}\, p^{n_1+2}(1 - p)^{n_2}\, dp = \frac{(n_1 + 2)(n_1 + 1)}{(n_1 + n_2 + 3)(n_1 + n_2 + 2)}.$$
Hence,
$$V(p) = E(p^2) - [E(p)]^2 = \frac{(n_1 + 2)(n_1 + 1)}{(n_1 + n_2 + 3)(n_1 + n_2 + 2)} - \frac{(n_1 + 1)^2}{(n_1 + n_2 + 2)^2}$$
$$= \frac{n_1 + 1}{n_1 + n_2 + 2} \times \frac{(n_1 + 2)(n_1 + n_2 + 2) - (n_1 + 1)(n_1 + n_2 + 3)}{(n_1 + n_2 + 2)(n_1 + n_2 + 3)} = \frac{(n_1 + 1)(n_2 + 1)}{(n_1 + n_2 + 2)^2 (n_1 + n_2 + 3)}.$$
It follows that
$$E(m(\theta)) = \frac{10(n_1 + 1)}{n_1 + n_2 + 2}, \qquad V(m(\theta)) = \frac{100(n_1 + 1)(n_2 + 1)}{(n_1 + n_2 + 2)^2 (n_1 + n_2 + 3)}.$$
Further, $s^2(\theta) = V(X \mid \theta) = \theta\left(1 - \frac{\theta}{10}\right)$, and
$$E(s^2(\theta)) = 10 \times \frac{n_1 + 1}{n_1 + n_2 + 2} - 10 \times \frac{(n_1 + 2)(n_1 + 1)}{(n_1 + n_2 + 3)(n_1 + n_2 + 2)} = 10 \times \frac{(n_1 + 1)(n_2 + 1)}{(n_1 + n_2 + 2)(n_1 + n_2 + 3)}.$$
The EBCT Model 1 credibility factor is
$$Z = \frac{n}{n + \dfrac{E(s^2(\theta))}{V(m(\theta))}} = \frac{3}{3 + \dfrac{10(n_1 + 1)(n_2 + 1) \big/ \{(n_1 + n_2 + 2)(n_1 + n_2 + 3)\}}{100(n_1 + 1)(n_2 + 1) \big/ \{(n_1 + n_2 + 2)^2 (n_1 + n_2 + 3)\}}} = \frac{3}{3 + \dfrac{n_1 + n_2 + 2}{10}} = \frac{30}{n_1 + n_2 + 32},$$
and the corresponding credibility estimate of θ is
$$\frac{30}{n_1 + n_2 + 32} \times \frac{X_1 + X_2 + X_3}{3} + \frac{n_1 + n_2 + 2}{n_1 + n_2 + 32} \times \frac{10(n_1 + 1)}{n_1 + n_2 + 2},$$
same as in part (iii).
[16]
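The agreement between the Bayes estimate of part (iii) and the EBCT Model 1 estimate of part (v) can be checked numerically. The following Python sketch uses exact rational arithmetic; the function names and the trial values n1 = 3, n2 = 5 and claims (2, 4, 3) are illustrative assumptions and are not part of the question.

```python
from fractions import Fraction


def bayes_estimate(n1, n2, xs):
    """Posterior mean 10(n1 + sum(x) + 1) / (n1 + n2 + 32) for three years of data."""
    assert len(xs) == 3  # the constant 32 = 2 + 3 * 10 is specific to three years
    return Fraction(10 * (n1 + sum(xs) + 1), n1 + n2 + 32)


def credibility_estimate(n1, n2, xs):
    """EBCT Model 1 estimate Z * xbar + (1 - Z) * E[m(theta)], with its factor Z."""
    n = len(xs)
    total = n1 + n2
    prior_mean = Fraction(10 * (n1 + 1), total + 2)                      # E[m(theta)]
    e_s2 = Fraction(10 * (n1 + 1) * (n2 + 1), (total + 2) * (total + 3))  # E[s^2(theta)]
    v_m = Fraction(100 * (n1 + 1) * (n2 + 1), (total + 2) ** 2 * (total + 3))  # V[m(theta)]
    z = n / (n + e_s2 / v_m)
    return z, z * Fraction(sum(xs), n) + (1 - z) * prior_mean


# Hypothetical trial values, used only to exercise the two formulas.
z, cred = credibility_estimate(3, 5, (2, 4, 3))
bayes = bayes_estimate(3, 5, (2, 4, 3))
print(z, cred, bayes)
```

Because `Fraction` keeps every intermediate quantity exact, equality of the two estimates can be tested directly rather than up to a floating-point tolerance.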

Question 3

(i) E(S) = 50 × (0.25 × 2500 + 0.5 × 5000 + 0.25 × 7500) = 250,000.

V(S) = 50 × (0.25 × 2500² + 0.5 × 5000² + 0.25 × 7500²) = 37,500².

(ii) We have to find U such that

$$P(U + (1 + 0.1)E(S) < S) = 0.05,$$

i.e.,
$$P\left(\frac{S - E(S)}{\sqrt{V(S)}} > \frac{U + (1 + 0.1)E(S) - E(S)}{\sqrt{V(S)}}\right) = 0.05,$$
i.e.,
$$\frac{U + (1 + 0.1)E(S) - E(S)}{\sqrt{V(S)}} = 1.645,$$
i.e.,
$$\frac{U + 25{,}000}{37{,}500} = 1.645, \quad \text{i.e., } U = 36{,}682.$$

(iii) We have to find $U_R$ such that

$$P(U_R + (1 + 0.1)E(S) - (1 + 0.15) \times 0.25\,E(S) < (1 - 0.25)S) = 0.05,$$

i.e.,
$$P\left(\frac{0.75S - 0.75E(S)}{0.75\sqrt{V(S)}} > \frac{U_R + (1 + 0.1)E(S) - (1 + 0.15) \times 0.25\,E(S) - 0.75E(S)}{0.75\sqrt{V(S)}}\right) = 0.05,$$
i.e.,
$$\frac{U_R + (1 + 0.1)E(S) - (1 + 0.15) \times 0.25\,E(S) - 0.75E(S)}{0.75\sqrt{V(S)}} = 1.645,$$
i.e.,
$$\frac{U_R + 15{,}625}{28{,}125} = 1.645, \quad \text{i.e., } U_R = 30{,}637.$$
[9]
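The calculations of Question 3 can be reproduced with the following Python sketch. It is only an illustration: the variable names are hypothetical, and the 95% point of the standard normal distribution is taken at full precision from the standard library (about 1.6449) rather than as the rounded value 1.645, so the printed figures may differ by a rupee or so from hand calculations.

```python
from math import sqrt
from statistics import NormalDist

rate = 50                                       # Poisson rate per annum
amounts = {2500: 0.25, 5000: 0.5, 7500: 0.25}   # claim size -> probability

m1 = sum(x * p for x, p in amounts.items())      # E[X]
m2 = sum(x * x * p for x, p in amounts.items())  # E[X^2]
mean_s = rate * m1                               # E(S) for a compound Poisson sum
sd_s = sqrt(rate * m2)                           # sqrt(V(S))

z = NormalDist().inv_cdf(0.95)                   # upper 5% point of N(0, 1)

# (ii) No reinsurance: ruin at year end if U + premium < S.
premium = 1.10 * mean_s
u = z * sd_s - (premium - mean_s)

# (iii) 25% proportional reinsurance with a 15% loading on the ceded premium.
retained = 0.75
reinsurance_premium = 1.15 * (1 - retained) * mean_s
net_premium = premium - reinsurance_premium
u_r = z * retained * sd_s - (net_premium - retained * mean_s)

print(round(mean_s), round(sd_s), round(u), round(u_r))
```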

Question 4

(i) The development factors are

$$f_{01} = \frac{2749 + 3278 + 3716}{2463 + 3013 + 3321} = 1.1075, \quad f_{12} = \frac{3529 + 3608}{2749 + 3278} = 1.1842, \quad f_{23} = \frac{3980}{3529} = 1.1278.$$

Calculation of reserve for the year 2009:


The cumulative development factor applicable to the year 2009 is
f = f 01 × f 12 × f 23 = 1.4791 .
1 − 1 / f = 0.3239 .
The earned premium is Rs. 6,472,000 (given).
The assumed ultimate loss ratio is 0.88.
The emerging liability is 0.3239 × 6,472,000 × 0.88 = Rs. 1,845,000 (to the nearest thousand).
The reported liability is Rs. 3,953,000 (from table).
The ultimate liability is Rs. 1,845,000 + Rs. 3,953,000 = Rs. 5,798,000.
Paid claims are Rs. 1,731,000 (given).
Reserve needed = Rs. 5,798,000 – Rs. 1,731,000 = Rs. 4,067,000.
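The Bornhuetter-Ferguson steps above can be sketched in Python as follows. Amounts are in thousands of rupees, the names are illustrative, and no intermediate rounding is applied, so the final figure can differ from a hand calculation by about one thousand rupees depending on how the intermediate results are rounded.

```python
# Cumulative incurred claims by accident year (thousands of rupees).
triangle = {
    2006: [2463, 2749, 3529, 3980],
    2007: [3013, 3278, 3608],
    2008: [3321, 3716],
    2009: [3953],
}


def development_factor(triangle, j):
    """Chain-ladder factor from development year j to j + 1."""
    rows = [row for row in triangle.values() if len(row) > j + 1]
    return sum(row[j + 1] for row in rows) / sum(row[j] for row in rows)


factors = [development_factor(triangle, j) for j in range(3)]

# Cumulative factor that takes 2009 from development year 0 to ultimate.
f = 1.0
for factor in factors:
    f *= factor

earned_premium = 6472   # given
loss_ratio = 0.88       # assumed ultimate loss ratio
paid = 1731             # given

emerging = (1 - 1 / f) * earned_premium * loss_ratio  # claims still to emerge
ultimate = triangle[2009][0] + emerging               # reported plus emerging
reserve = ultimate - paid

print([round(x, 4) for x in factors], round(f, 4), round(reserve, 1))
```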

(ii) The assumptions are:
• Payments from each accident year develop in the same way, i.e., for each accident
year, the amount of claims paid in each development year is a constant proportion
of the total claims paid from that accident year.
• Weighted average of the past inflation would be repeated in the future (this
assumption holds trivially if the rate of inflation is constant).
• The estimated loss ratio is appropriate.
[11]
Question 5

(i) The characteristic polynomial is $(1 - \alpha B^2)$, and the magnitude of its roots is $1/\sqrt{|\alpha|}$. For
stationarity of the process, these roots should have magnitude greater than 1, which can
happen if and only if $|\alpha| < 1$.
(ii) $(1 - \alpha B^2) Y_t = Z_t$.

Hence,
$$Y_t = (1 - \alpha B^2)^{-1} Z_t = \{1 + \alpha B^2 + (\alpha B^2)^2 + (\alpha B^2)^3 + \cdots\} Z_t = \sum_{j=0}^{\infty} \alpha^j Z_{t-2j},$$
i.e.,
$$a_j = \begin{cases} \alpha^{j/2}, & \text{for } j \text{ even}, \\ 0, & \text{for } j \text{ odd}. \end{cases}$$
(iii)
$$V(Y_t) = V\left(\sum_{j=0}^{\infty} \alpha^j Z_{t-2j}\right) = \sum_{j=0}^{\infty} V(\alpha^j Z_{t-2j}) = \sum_{j=0}^{\infty} \alpha^{2j} V(Z_{t-2j}) = \sum_{j=0}^{\infty} \alpha^{2j} \sigma^2 = \frac{\sigma^2}{1 - \alpha^2}.$$
(iv) The above expression is identical with that of the variance of an AR(1) process with
parameter α .
This is more than a coincidence. The odd and even samples of the given time series
separate into two completely independent sequences, each being an AR(1) process with
parameter α .
[9]
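The variance formula of part (iii) can be checked numerically by summing the squared coefficients of the representation in part (ii). The Python sketch below is only an illustration; the function name and the trial values α = 0.6 and σ² = 1 are assumptions chosen for the check.

```python
def ma_coefficients(alpha, n_terms):
    """First n_terms coefficients a_j: alpha**(j/2) for even j, 0 for odd j."""
    return [alpha ** (j // 2) if j % 2 == 0 else 0.0 for j in range(n_terms)]


alpha, sigma2 = 0.6, 1.0   # hypothetical values with |alpha| < 1

coeffs = ma_coefficients(alpha, 400)

# Var(Y_t) = sigma^2 * sum of a_j^2, since the Z's are uncorrelated.
variance_from_series = sigma2 * sum(a * a for a in coeffs)
variance_closed_form = sigma2 / (1 - alpha ** 2)

print(variance_from_series, variance_closed_form)
```

Truncating the series at 400 terms is harmless here because the omitted terms are of order 0.6 to the power 400.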

Question 6

(i) The portmanteau test statistic computed from m sample ACF values, r1,…,rm, is
$$n(n + 2)\sum_{k=1}^{m} \frac{r_k^2}{n - k},$$
n being the sample size. If the time series indeed comes from white
noise, then this statistic has the chi-square distribution with m degrees of freedom. This
holds for any fixed m.
In this case, n = 500, and we can choose m = 10.
The portmanteau test statistic turns out to be 922.4.
This is a very large value in relation to the null distribution of $\chi^2_{10}$, indicating rejection of
the null hypothesis of white noise.
[A properly justified use of the portmanteau test for smaller values of m should also fetch
full credit. The correct values of the statistic for m = 9,…,1 are 908.8, 884.4, 846.5, 801.0,
749.9, 690.9, 615.5, 498.0 and 305.5, respectively. All the tests are highly significant.]
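The statistic quoted above (the Ljung-Box form of the portmanteau statistic) can be recomputed from the sample ACF values given in the question. The Python sketch below is illustrative; the function name is hypothetical, and the 5% critical value 18.307 for 10 degrees of freedom is taken from standard chi-square tables rather than from the paper.

```python
# Sample ACF values at lags 1 to 10, as given in the question.
sacf = [-0.7793, 0.6180, -0.4824, 0.386, -0.341,
        0.3172, -0.2989, 0.2728, -0.2181, 0.163]
n = 500  # length of the time series


def portmanteau(r, n, m):
    """n (n + 2) * sum over k = 1..m of r_k^2 / (n - k)."""
    return n * (n + 2) * sum(r[k - 1] ** 2 / (n - k) for k in range(1, m + 1))


q10 = portmanteau(sacf, n, 10)
chi2_crit_10 = 18.307        # upper 5% point of chi-square with 10 df (tables)
reject_white_noise = q10 > chi2_crit_10

# Approximate significance threshold for individual sample ACF / PACF values.
threshold = 2 / n ** 0.5

print(round(q10, 1), reject_white_noise, round(threshold, 4))
```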

(ii) The ACF sequence becomes zero after lag q in the case of an MA(q) time series.
Likewise, the PACF sequence becomes zero after lag p in the case of an AR(p) process.
One can check whether sample ACF or PACF values are significantly different from 0,
by comparing their absolute value with the threshold $2/\sqrt{n}$.
In the present case, the threshold is 0.0894.
The sample ACF sequence has many values which are larger than this value in
magnitude. Therefore, an MA(q) model does not appear to be appropriate.
On the other hand, the sample PACF sequence has smaller absolute values after lag 1.
Therefore, an AR(1) model is indicated.
[9]
Question 7
Typical perils are:
• Fire
• Explosion
• Lightning
• Theft
• Storm
• Flood
• Earthquake [3]
[Any six from the above list will suffice. Credit should be given for other reasonable answers.]

Question 8

(i) The completed table is as follows (the index i indicates the age group and has values from
1 to 10; the index j indicates the occupation category and has values from 1 to 6).

Model               Linear predictor     No of parameters   Scaled deviance
SA                  α + βx               2                  238.4
SA + AG             α + βx + γ_i         11                 206.7
SA + AG + SA * AG   α_i + β_i x          20                 178.3
SA * AG + OC        α_i + β_i x + θ_j    25                 166.2
SA * AG * OC        α_ij + β_ij x        120                58.9

(ii) The differences in deviance and degree of freedom in models of successively higher
complexity are as follows.

Model               Degree of   Scaled     Change in degree   Change in
                    freedom     deviance   of freedom         scaled deviance
SA                  2           238.4
SA + AG             11          206.7      9                  31.7
SA + AG + SA * AG   20          178.3      9                  28.4
SA * AG + OC        25          166.2      5                  12.1
SA * AG * OC        120         58.9       95                 107.3

Going by the thumb-rule reduction threshold of two per degree of freedom, each of the
successive levels of model sophistication, except for the last one, is justified. Therefore,
the chosen model is SA * AG + OC.
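The thumb-rule comparison can be written out as a short Python sketch. It applies the rule exactly as stated above (keep the more complex model only if the scaled deviance falls by more than two per extra parameter); the variable names are illustrative.

```python
# (model, number of parameters, scaled deviance), in order of increasing complexity.
models = [
    ("SA", 2, 238.4),
    ("SA + AG", 11, 206.7),
    ("SA + AG + SA * AG", 20, 178.3),
    ("SA * AG + OC", 25, 166.2),
    ("SA * AG * OC", 120, 58.9),
]

chosen = models[0][0]
for (_, p0, d0), (name, p1, d1) in zip(models, models[1:]):
    extra_parameters = p1 - p0
    deviance_drop = d0 - d1
    if deviance_drop > 2 * extra_parameters:
        chosen = name   # the extra complexity is justified by the thumb rule
    else:
        break           # stop at the first step that is not justified

print(chosen)
```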

(iii) One needs to analyse the residuals in order to check for violation of model assumptions
before making recommendation about the appropriate choice of the model. A goodness of
fit test may also be conducted.
[9]

Question 9

(i) The output X is equal to zero if and only if the first generated value of Z happens to be
greater than or equal to 1. The probability of this event is
$$P(X = 0) = P(Z \ge 1) = P(-\ln Y \ge \lambda) = P(Y \le e^{-\lambda}) = e^{-\lambda},$$
which is the correct value of probability of a sample from the Poisson distribution (with
mean λ) being equal to 0.

(ii) The successively assigned values of different variables are as follows (X is shown as it
stands when Step 4 is carried out, i.e., before any increment in Step 5).

Y       −ln(Y)/λ   Z        X   Output
0.564   0.2864     0.2864   0   -
0.505   0.3416     0.6279   1   -
0.756   0.1399     0.7678   2   -
0.610   0.2471     1.0150   3   X = 3
0.046   1.5395     1.5395   0   X = 0

Thus, the generated values are 3 and 0.
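The trace above can be reproduced by coding the algorithm of the question directly. The Python sketch below follows Steps 2 to 7, except that the uniform samples are supplied as a list instead of being drawn at random; the function name is illustrative.

```python
from math import log


def poisson_from_uniforms(uniforms, lam):
    """Run the algorithm of the question on a fixed list of U(0, 1) samples."""
    outputs = []
    x, z = 0, 0.0                 # Step 2
    for y in uniforms:            # Step 3 (sample supplied, not generated)
        z += -log(y) / lam        # Step 4
        if z < 1:                 # Step 5
            x += 1
        else:
            outputs.append(x)     # Step 6
            x, z = 0, 0.0         # Step 7: back to Step 2
    return outputs


samples = poisson_from_uniforms([0.564, 0.505, 0.756, 0.610, 0.046], 2)
print(samples)
```

Any uniform samples left over after the last output are simply discarded, since they do not complete another value of X.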

(iii) The following table gives some values of the probability function and the cumulative
distribution function for the Poisson distribution with mean λ = 2 .

x P( X = x ) P( X ≤ x )
0 0.1353 0.1353
1 0.2707 0.4060
2 0.2707 0.6767
3 0.1804 0.8571
4 0.0902 0.9473
5 0.0361 0.9834
6 0.0120 0.9955

Accordingly, X would be assigned values according to the following rule.

Range of generated sample from U (0,1) Assigned value of X


(0, 0.1353) 0
(0.1353,0.4060) 1
(0.4060,0.6767) 2
(0.6767,0.8571) 3
(0.8571,0.9473) 4
(0.9473,0.9834) 5
(0.9834,0.9955) 6

Obviously the above tables are incomplete, but these are adequate for the purpose, since
the given random numbers are within the intervals listed here.
As per the above table, the five random numbers correspond to the following values of X:
2, 2, 3, 2, 0.
[12]
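The inverse distribution transform method of part (iii) can also be sketched in Python. Instead of tabulating the cumulative probabilities, the illustrative function below accumulates the Poisson probabilities until the cumulative value reaches the uniform sample; the function name is hypothetical.

```python
from math import exp


def poisson_inverse_transform(u, lam):
    """Smallest x with P(X <= x) >= u, for X ~ Poisson(lam) and 0 < u < 1."""
    x = 0
    pmf = exp(-lam)        # P(X = 0)
    cdf = pmf
    while u > cdf:
        x += 1
        pmf *= lam / x     # P(X = x) from P(X = x - 1)
        cdf += pmf
    return x


uniforms = [0.564, 0.505, 0.756, 0.610, 0.046]
values = [poisson_inverse_transform(u, 2) for u in uniforms]
print(values)
```

Unlike the algorithm of part (ii), this method uses exactly one uniform sample per generated value, which is why five values are obtained here.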
Question 10

(i) The individual risk model is a model for the risk arising from a portfolio of a fixed
number of individual risks. According to this model, the aggregate claims from the
portfolio is
S = Y1 + Y2 + … + Yn,
where n is the number of risks, and for i = 1, 2, …, n, Yi is the claim amount under the ith risk.

The assumptions underlying this model are:
• the risks are independent,
• the number of risks does not change over the period of insurance cover,
• the number of claims from each risk is either 0 or 1.
The claim numbers and claim sizes for different risks are not assumed to have identical
distribution.
(ii) This model differs from the collective risk model in three major ways.
• In the individual risk model, the number of risks is fixed, and stays the same over the
period of cover, whereas in the collective risk model, this number may be random,
and may also vary.
• Unlike the case of the individual risk model, there is no restriction on the number of
claims arising from individual risks under the collective risk model.
• In the individual risk model, the individual risks are assumed to be independent, while
in the collective risk model, the individual claim amounts are assumed to be
independent.
[8]

Question 11

(i) Nature’s choices in the present case are the number of faults, θ . The losses associated
with the decisions to sign the AMC (d1) or not to sign it (d2) are as given below.
                        θ
         0      1      2      3      4      5
d1    1000   1000   1000   1000   1000   1000
d2       0    300    600    900   1200   1500

(ii) The maximum losses under d1 and d2 are Rs. 1000 and Rs. 1500, respectively. The
minimax decision is to sign the AMC (d1).
(iii) The average loss under d1 is Rs. 1000.
The average loss under d2 is
0.1 × 0 + 0.1 × 300 + 0.2 × 600 + 0.3 × 900 + 0.2 × 1200 + 0.1 × 1500 = Rs. 810.
Therefore, the Bayes’ decision is not to sign the AMC (d2).
[5]
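The loss matrix and the two decision rules of Question 11 can be laid out in a few lines of Python. This is a sketch for illustration only; the dictionary layout and names are assumptions.

```python
# Owner's probability distribution for the annual number of faults.
fault_probs = {0: 0.1, 1: 0.1, 2: 0.2, 3: 0.3, 4: 0.2, 5: 0.1}

# Loss matrix: d1 = sign the AMC (flat Rs. 1000), d2 = pay Rs. 300 per repair.
losses = {
    "d1": {k: 1000 for k in fault_probs},
    "d2": {k: 300 * k for k in fault_probs},
}

# Minimax: choose the decision whose worst-case loss is smallest.
max_loss = {d: max(row.values()) for d, row in losses.items()}
minimax_decision = min(max_loss, key=max_loss.get)

# Bayes: choose the decision whose expected loss is smallest.
expected_loss = {
    d: sum(fault_probs[k] * row[k] for k in fault_probs)
    for d, row in losses.items()
}
bayes_decision = min(expected_loss, key=expected_loss.get)

print(minimax_decision, bayes_decision)
```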

[Total Marks 100]

******************

INSTITUTE OF ACTUARIES OF INDIA

EXAMINATIONS
8th November 2010
Subject CT3 – Probability & Mathematical Statistics
Time allowed: Three Hours (15.00 – 18.00 Hrs)

Total Marks: 100

INSTRUCTIONS TO THE CANDIDATES

1. Please read the instructions on the front page of answer booklet and instructions to examinees sent
along with hall ticket carefully and follow without exception

2. Mark allocations are shown in brackets.

3. Attempt all questions, beginning your answer to each question on a separate sheet.
However, answers to objective type questions could be written on the same sheet.

4. In addition to this paper you will be provided with graph paper, if required.

AT THE END OF THE EXAMINATION

Please return your answer book and this question paper to the supervisor separately.
IAI CT3 1110
Q. 1) Puneet, the owner of Hard Rock Café is interested in how much people spend at the cafe. He
examines 20 randomly selected bill receipts and writes down the following data (in Rs.):
1500 2000 2500 2500 3500 2500 3500 3000 4000 4000 2000 2500 1500 1500 3500 2500
3500 4500 6000 6000
(a) Determine the mean and standard deviation of the sample.
(3)
A mathematician came up with bounds on how much of the data must lie close to the mean. In
particular, for any positive k, at least 1 – 1/k^3 proportion of the data lies within the interval given
by k times the standard deviation around the mean.

(b) For k = 2, determine the upper and lower bounds for the data. (1)

(c) Using your answer to part (b), comment on whether the mathematician’s theorem is
correct in light of the given data. (2)
[6]

Q. 2) It is desired to simulate an observation of the random variable X with probability density


function
f(x) = 1/k; 0≤x≤k
0; otherwise
A random number r is generated from the uniform distribution over [0, 1]. The following values
are then calculated
x1 = rk ; x2 = k(1-r) ; x3 = r/k
State, with reason, which of the above values are valid simulated observations of X? [2]

Q. 3) Suppose the probability of an individual being born on any particular day of the year is given by
1/365.
(a) What is the probability that 2 people meeting at random have the same birthday? (2)

(b) Suppose now that a group has 3 individuals. What is the probability that at least two of
these individuals will share a birthday? What if the group has 4 individuals? (4)

(c) Show that a group must have 15 individuals such that the probability of finding at least 2
people with the same birthday is 25%. (3)
[9]

Q. 4) For the random variable X, you are given:
E[X] = θ, θ > 0
Var[X] = θ²/25
$\hat{\theta} = \dfrac{kX}{k + 1}$; k > 0, where $\hat{\theta}$ is the estimate of θ
$\mathrm{MSE}(\hat{\theta}) = 2[\mathrm{bias}(\hat{\theta})]^2$

Find k. [4]

Q.5) Vivek’s company owns a factory. It buys insurance to protect itself against major repair costs.
Profit equals revenues, less the sum of insurance premiums, retained major repair costs, and all
other expenses. Company will pay a dividend equal to the profit, if it is positive.

You are given:

(i) Revenue from the factory is 1.70.

(ii) The distribution of major repair costs (k) for the factory is

K Probability
0 0.4
1 0.3
2 0.2
3 0.1

(iii) The insurance policy pays the major repair costs in excess of that factory’s deductible of 1
(i.e. claims will be payable after deducting 1 provided claims are greater than 1, else nil).
The insurance premium is 110% of the expected claims for the insurance company.

(iv) All other expenses are 20% of revenues.

Show that the expected dividend is equal to 0.368.


[4]

Q.6) Let mutually stochastically independent random variables, each


with common pdf . If Y is the minimum of these n variables, find
the CDF and the pdf of Y. [4]

Q.7) The scores on the final exam in Varun’s risk management class have a normal distribution with
mean θ and standard deviation equal to 8. θ is a normally distributed random variable with
mean equal to 75 and variance equal to 36.

Each year, Varun chooses a student at random and pays the student equal to the student’s score.

However, if the student fails the exam (i.e. score ≤ 65), then there is no payment.

Calculate the conditional probability that the payment is less than 90, given that there is a
payment. [5]

Q.8) A random variable X follows a “triangular” distribution specified by the density

f(x) = 4x       : 0 < x < 1/2
       4(1 – x) : 1/2 < x < 1
       0        : otherwise

(a) Show that the moment generating function of X is given by:

$$M_X(t) = \frac{4}{t^2}\left(e^{t/2} - 1\right)^2$$
(5)

(b) Let X1 and X2 be uniform independent random variables on the interval (0,1), i.e. each with
density

f(x) = 1 : 0<x<1
0 : otherwise

and moment generating function $M(t) = \dfrac{e^t - 1}{t}$.

(i) Use the definition and properties of moment generating functions to derive the moment
generating function of Y = (X1 + X2)/2. (2)

(ii) Hence, comment on the distribution of the mean of two random samples from the uniform
distribution on the interval (0, 1). (1)
[8]

Q. 9) Suppose that the joint probability density function of the bivariate random variable (X,Y) is given
by:

(a) Work out E(XY ) (3)

(b) Work out the marginal density functions and and hence E(X) and E(Y). (3)

(c) Attempt to prove or disprove the following statement:

"In this example, the variables X and Y are independent if and only if they are uncorrelated." (4)
[10]

Q. 10) S1, S2, … is a sequence of independent and identically distributed random variables, each with
mean 5 and variance 25. Sn represents aggregate claims from a risk in year n. The insurer
intends to calculate the annual risk premium, ∏, for this risk such that:

Pr [ Sn > ∏ ] = 0.01

(i) Assuming Sn has an exponential distribution, show that ∏ = 23.03. (2)

(ii) Calculate the value of ∏ assuming Sn has a lognormal distribution. (4)


(iii) Assuming that Sn has an exponential distribution, calculate the value of:
(4)
P[ (S1 ≤ 23.03) (S1 + S2 ≤ 46.06) ]
[10]

Q. 11) You are given the following random sample of 30 auto claims:
54 140 230 560 600 1,100 1,500 1,800 1,920 2,000
2,450 2,500 2,580 2,910 3,800 3,800 3,810 3,870 4,000 4,800
7,200 7,390 11,750 12,000 15,000 25,000 30,000 32,300 35,000 55,000
Test the hypothesis that auto claims follow a Continuous Distribution Function F(x) with the
following percentiles:
x 310 500 2,498 4,876 7,498 12,930
F(x) 0.16 0.27 0.44 0.65 0.83 0.95

Group the data using the largest number of groups such that the expected number of claims in
each group is at least 5.
Using the chi-square goodness-of-fit test, determine the results of the test at 5% level of
significance. [5]

Q. 12) Let be a sample from the random variable X with pdf

, 0 < x < 1, and 0 < <

(4)
(a) Find the maximum likelihood estimator of .
(4)
(b) Show that the above estimate is an unbiased estimator of .
[8]

Q. 13) Sachin, owns a towing company which provides all towing services to members of the City
Automobile Club.
For a particular towing, the information is as below

Towing Distance Towing Cost Frequency


0 - 9.99 km 80 50%
10 - 29.99 km 100 40%
30+ km 160 10%

The automobile owner must pay 10% of the cost and the remainder is paid by the City
Automobile Club. The number of towings has a Poisson distribution with mean of 1000 per year.
The number of towings and the costs of individual towings are all mutually independent.

Using the normal approximation for the distribution of aggregate towing costs, show that the
probability that the City Automobile Club pays more than 90,000 in any given year is 10%. [5]

Q. 14) Rajeev compared protein intake among three groups of women:

- women eating a standard Indian diet (STD)

- women eating a lacto-vegetarian diet (LAC) and

- women eating a strict vegetarian diet (VEG)

The mean and standard deviation of protein intake as well as the group sizes are presented in
the table below.

Group Mean Standard Deviation Number of women in group


STD 75 9 10
LAC 57 13 10
VEG 47 17 6


(a) Perform an overall F test to determine whether there is a significant difference in mean
protein intake between the three groups stating both the null and alternative hypotheses. (5)

(b) Obtain 95% confidence intervals for each of the group means. Which groups appear
different from one another? (3)

(c) Repeat part (b) using pair-wise confidence intervals and hypothesis tests for whether there
are differences in mean protein intake. (7)
[15]

Q.15) Anand obtains cash from an ATM (cash machine) for his girlfriend. He suspects that the rate at
which she spends cash is affected by the amount of cash he withdrew at his previous visit to an
ATM.

To investigate this, he deliberately varies the amounts he withdraws. For the next 10
withdrawals, he records, for each visit to an ATM, the amount x (in Rs.) withdrawn, and the
number of hours, y, until his next visit to an ATM.

Withdrawal 1 2 3 4 5 6 7 8 9 10

x 40 10 100 110 120 150 20 90 80 130

y 56 62 195 240 170 270 48 196 214 286

(a) Calculate the equation of the regression line of y on x (4)

(b) Interpret, in context of the question, the gradient of the regression line (1)
[5]

*****************************
