
project computer class Risk Theory 2017–2018; week 2


Bailey-Simon tariff method—optimization in R



R – an open source, community developed statistical computing system – has acquired significant capability in optimization and nonlinear parameter estimation. Indeed, R is believed to be the leading package for research with statistics, and its optimization capabilities reflect the interests of its creators and users. Thus maximum likelihood computations have particular tools, but there are also a number of function optimization tools that can be used to construct customized computations. Having tools as well for mathematical programming tasks, R is a strong general-purpose tool for scientific computation. It is also quite adaptable and is sometimes used as a user interface to tools written in other languages.

In this project, we study the problem of constructing a tariff in non-life insurance. We have a portfolio of risks split up by two risk factors, gender i = 1, 2 and region of residence j = 1, 2, 3. The available data are the numbers of policies n_ij in each class (i, j), as well as the average of the claims r_ij filed.



We want to construct a tariff for insuring these policies, charging a premium of the form x_i y_j, for certain numbers x_1, x_2 and y_1, y_2, y_3. It can be seen as a basic premium per region, multiplied by a surcharge for gender, or vice versa. We want to find numbers x_i and y_j such that the values x_i y_j are in some sense ‘close’ to the observed risk premiums r_ij.

We define ‘close’ as having a small value of a quantity equal to or resembling (o − e)², with o the observed values r_ij in each cell and e the premiums x_i y_j to be charged.





In insurance applications, the rule is that the deviation of o from its mean value will tend to be larger if this mean is large. So if we look at the unmodified sum-of-squares Σ(o − e)², the large values of o and e will have a large influence on the fitted values of x_i and y_j. Now assume that e is the mean value of o, and also a constant times its variance, such as when the observations are Poisson variables or multiples of those. Then if we look at Σ(o − e)²/e instead, a scaled version of the sum-of-squares, all terms in the sum can be seen to have equal mean. See also MART, Ch. 9.3. Such quantities are known as chi-square statistics, and minimizing them to get estimates is called minimum chi-square estimation.
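
To see this scaling at work, here is a small simulation sketch (the sample size and seed are arbitrary choices, not part of the handout): for Poisson observations o with mean e, the scaled squared deviation (o − e)²/e has mean close to 1 regardless of how large e is.

set.seed(1)
for (e in c(5, 50, 500)) {
  o <- rpois(1e5, lambda = e)                       ## Poisson observations with mean e
  cat("e =", e, "  mean of (o-e)^2/e:", mean((o - e)^2 / e), "\n")
}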

By asymptotic normality, the quantity Σ_{i,j} (O_ij − Ê[O_ij])² / Ê[O_ij] approximately has a χ² distribution, where the number of degrees of freedom k is the number of cells minus the number of parameters estimated.



The purpose of this project is to learn about

• using matrices, functions, iteration in R;


1 Reading the data


Our data are average claims and the numbers of policies in a portfolio split up by gender, row-wise, and by three regions of residence, column-wise:



Gender |  Region 1           |  Region 2           |  Region 3
       |  Policies   Claims  |  Policies   Claims  |  Policies   Claims
  1    |       800      550  |      2400      364  |      1200      455
  2    |      3200      625  |      1600      455  |       800      518
First we read the data and tell R to store them in 2 × 3 matrices:

n <- rbind(c(800, 2400, 1200),
           c(3200, 1600, 800))    ## nr. of policies
r <- rbind(c(550, 364, 455),
           c(625, 455, 518))      ## average claim amounts
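
Optionally (a sketch, not part of the handout), the matrices can be given row and column labels following the table above, which makes printed output easier to read:

dimnames(n) <- dimnames(r) <- list(Gender = c("1", "2"),
                                   Region = c("1", "2", "3"))
n   ## policies per cell
r   ## average claim amount per cell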


2 Using the general purpose routine optim()

Our first plan of action to find optimal premiums is simply to invoke R’s general purpose multivariate optimization routine optim(). The first argument of optim() is a vector of length equal to the number of parameters, initially set to some starting values. The second is a function to be minimized. Further, optional arguments may instruct optim to use a specific optimization technique or to stop only after many iterations.



The first two elements of the parameter vector pars for the function dist1 defined below represent x_1, x_2; the last three are y_1, y_2, y_3. Using the fact that the outer product x ⊗ y is a matrix with (i, j) element x_i y_j, a straightforward function to compute the ‘distance’ d(x, y) = Σ (o − e)²/e is as follows:

dist1 <- function (pars) {
  x <- pars[1:2]
  y <- pars[3:5]
  f <- x %o% y                ## %o% gives the outer product, so f[i,j] = x[i]*y[j]
  sum(n * (r - f) ^ 2 / f)    ## i.e., sum{(o-e)^2/e}
}
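
As a quick toy check of the outer-product operator used above (the values are arbitrary):

c(1, 2) %o% c(10, 20, 30)
##      [,1] [,2] [,3]
## [1,]   10   20   30
## [2,]   20   40   60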



Note that o_ij here is actually n_ij r_ij, and e_ij is n_ij f_ij, which explains the n appearing in the numerator for (o − e)²/e in the final line of the script. Since x_i y_j is to be used as a premium for cell (i, j), all parameters x_i, y_j must be positive. If the parameters are not constrained to be positive, it is easy to see that the distance function has minimal value −∞, obtained when any f-value tends to −∞. Later on we will see some ways to find the solution we need, forcing optim to scan only the positive region.
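
An illustrative check (not from the handout): evaluating dist1 at a point with negative x-values gives negative fitted values, and the ‘distance’ becomes hugely negative, showing why positivity of the parameters matters.

dist1(c(-1, -1, 1, 1, 1))   ## all fitted values equal -1; result is a very large negative number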


We minimize this function dist1 by calling optim(), taking all starting values of the parameters equal to one. With this choice, the initial fitted values are also all equal to 1, so we start the search at a point very far removed from the minimum over the positive parameter region.




oo <- optim(rep(1, 5), dist1)
dist1(rep(1, 5))    ## 2,594,325,600
oo$value            ## 2554.5
oo$convergence      ## 1
oo$counts           ## 502 NA

Instead of the outer product, we might have used for-loops to generate the matrix of cross-products x_i y_j. But unless the result in the next step of a for-loop depends on the result of the previous step(s), such as in recursive algorithms, it is better to use vector operations whenever possible. First of all, they are easier to read, and second, they are very much faster.

The result of the optimization is a list with various fields, and we store it in the object oo. To see what these fields contain, do ?optim in R or in RStudio to access the help file for optim. From it, you learn that the par field of the object oo contains the optimal parameter values found, while oo$value gives the optimal function value. From the convergence field, you see that in spite of 502 function evaluations, convergence has not occurred. Possibly the fact that there are many equivalent solutions caused the problem with convergence. Indeed, any solution x_i, y_j is equivalent to αx_i, y_j/α, since both lead to the same fitted values x_i y_j, and therefore to the same distance.
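
A small sketch of this scale invariance (assuming the object oo from the optim call above): doubling the x-parameters and halving the y-parameters leaves the fitted values x_i y_j, and hence the distance, unchanged.

p <- oo$par
dist1(p)
dist1(c(2 * p[1:2], p[3:5] / 2))   ## same value as dist1(p)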

There are a number of ways to fix the problem of non-convergence:

• increase the number of function evaluations allowed (the default is about 500);

• find a better starting value than taking all f[i,j] = x_i y_j values to be equal to 1;

• try a method of optimization other than optim’s default;

• use a method specific to this distance function (see Section 6).


Increasing the number of iterations
The iteration above stops after (about) 500 function evaluations. In the help-file ?optim you can see what to do to get more iterations. We increase the limit to 2000, and inspect the resulting object oo:


oo <- optim(rep(1, 5), dist1, control = list(maxit = 2000))
oo$value          ## 2160.3
oo$convergence    ## 0
oo$counts         ## 530 NA


You see a 0 resulting for the convergence field, meaning convergence has occurred.


Verifying if the solution found is in fact optimal
In view of the difficulty we experienced in finding the best solution, caused in part because we started far from the minimum and close to the border of the feasible region, we are still suspicious of the outcome found. It may very well be that we have not landed in one of the global minima, but in a critical point that is in fact a saddle point. That means that though decrease is not possible by changing only one of the parameters, it might be that if we look in another direction, lower values of the objective function can be reached. To check if a critical point of a one-dimensional function is in fact a minimum, you look at the sign of the second derivative. In case of n-dimensional functions, to check if a solution to the normal equations ∇f(x) = (∂f/∂x_1, …, ∂f/∂x_n) = 0 is a minimum, you can look at the Hessian. The Hessian is the matrix ∇²f(x) of second partial derivatives. R gives us the possibility to calculate a numerical approximation to the Hessian after having run optim.

Consider the Taylor expansion (x and y are arbitrary n-vectors; (·, ·) is the inner product):

f(y) ≈ f(x) + (∇f(x), y − x) + ½ (y − x)^T ∇²f(x) (y − x).

Suppose x is a critical point, so the gradient ∇f(x) = 0. If ∇²f(x) is not positive definite, there exists a vector u with u^T ∇²f(x) u < 0. Then f(x) cannot be minimal, because for a small enough real number η, taking y = x + ηu we will have

f(x) > f(x) + (η²/2) u^T ∇²f(x) u ≈ f(x + ηu).
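
A toy illustration of this criterion (not part of the assignment): for g(x, y) = x² − y² the origin is a critical point, but the numerical Hessian there has one negative eigenvalue, so the origin is a saddle point rather than a minimum.

g <- function(p) p[1]^2 - p[2]^2
eigen(optimHess(c(0, 0), g))$values   ## approximately 2 and -2: not positive definite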


So for the critical point found to be a minimum, the Hessian must be positive definite. One way to check that is to find the eigenvalues of that matrix and verify that they are all positive. We can run optimHess to find the approximate Hessian at the optimum found by optim, or, as we do here, we can add an argument hessian = TRUE to make optim output this matrix in the field oo$hessian. To find its eigenvalues, we use eigen().

Consider the following script and its output reproduced in comments:


oo <- optim(rep(1, 5), dist1, hessian = TRUE, control = list(maxit = 2000))
all(eigen(optimHess(oo$par, dist1))$values > 0)   ## is FALSE
## Apparently the routine stopped with 'convergence' in a saddle point.
oo <- optim(oo$par, dist1, hessian = TRUE, control = list(maxit = 2000))
all(eigen(optimHess(oo$par, dist1))$values > 0)   ## is now TRUE
oo$value          ## 2132.833
oo$convergence    ## 0
oo$counts         ## 164 NA


After the first call of optim, the approximate Hessian has negative eigenvalues, indicating that we ended up in a saddle point. When rerunning optim starting from the saddle point, we escape from it; this is possible because in its initial phase, optim takes larger steps than near the end. Since all eigenvalues of the Hessian are now positive, we are no longer in a saddle point but in a minimum, although possibly only a local one.
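
One possible way to automate this restart (a sketch, not the handout's prescription): keep rerunning optim from its last endpoint until it both reports convergence and yields a positive definite approximate Hessian.

oo <- optim(rep(1, 5), dist1, control = list(maxit = 2000))
repeat {
  oo <- optim(oo$par, dist1, hessian = TRUE, control = list(maxit = 2000))
  ## stop once convergence is reported and all Hessian eigenvalues are positive
  if (oo$convergence == 0 && all(eigen(oo$hessian)$values > 0)) break
}
oo$value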



Q: Which elements of the Hessian of the function f(x, y) = Σ_{i,j} n_ij (r_ij − x_i y_j)² / (x_i y_j) are zero? Check with the values in the approximate Hessian in oo$hessian.
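
To check your answer numerically (a small sketch using the oo object from the previous run), print the approximate Hessian rounded to a few decimals and look for the (near-)zero entries:

round(oo$hessian, 3)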


Improving the starting value

