White Paper for Optimal Probability Density Function Models

Aditi707 · Aug 2, 2013

Description
In probability theory, a probability density function (pdf), or density of a continuous random variable, is a function that describes the relative likelihood for this random variable to take on a given value. The probability for the random variable to fall within a particular region is given by the integral of this variable’s density over the region.

Optimal Probability Density Function Models for Stochastic Model Outcomes

arametric Model Fitting on Tail Distributions

Y. ChuehH1L, W.D.Curtis Department of Mathematics, Central Washington University, Ellensburg, WA
{chueh, curtiswd}@cwu.edu
(1) corresponding author

1.

Abstract

This paper documents a Mathematica-related development activity in the category of applied mathematics for the field of Actuarial Science. This is an on-going research project sponsored by a

2

Proc. IMS2004

Central Washington University Seed Grant subsequent to the papers [Longley-Cook 2003 and Chueh 2002] on efficient stochastic modeling techniques. Efficient stochastic modeling for the insurance business can be achieved by effective sampling of stochastic economic (risk) scenarios. The run-time can be significantly reduced and the model outcome distribution can be preserved on the critical tails. However, there is a lack of a well-known software applications to help modelers analyze the model outcome distribution. This research will apply Mathematica as the main programming and computational resource to develop a user-friendly Mathematica application AMOOF (Actuarial Model Outcome Optimal Fit) that finds the best parametric probability density model for given stochastic model data. The AMOOF application will carry out a series of parametric probability model fitting processes and then create mathematical reports for actuaries. Unlike other commercial statistical data fitting packages, a comprehensive inventory of special and mixed distributions will be used to enhance fitting accuracy on the tails. The project will utilize Mathematica’s capabilities in calculus, graphing, and optimization and, when complete, will be valuable for actuarial modeling that is critical for a company's capital determination, reserve setting, risk analysis, and product pricing.

2.

Introduction

We are developing a Mathematica application called the Actuarial Model Outcome Optimal Fit (AMOOF). This software is designed for actuarial model outcome analysis. Actuarial models apply stochastic risk scenarios to project profitability and financial strength for the insurance business. The resulting stochastic modeling outcome distributions are critical to an insurance company’s decision making on planning, pricing, reserving,and valuation. AMOOF will analyze the outcome distributions and find the optimal parametric probability models for the model outcome data. The parametric probability density models allow further analysis, given changes in business or contract assumptions. There will be numerical and graphical reports on the findings to assist model applications. The currently available statistical software can fit data to a variety of standard probability distributions, but not those inverse, transformed, or mixed distributions commonly used/required in the actuarial field. There is a lack of a software packages to analyze actuarial model outcomes on distribution tails, even though the tail distributions are what really matter for the insurance business. AMOOF explores all (or most) of the probability models used in actuarial science and determines the optimal one that fits the data well, especially at the tails. This is a challenging task not only because the optimization process is complex, but also because the long tails and extreme data values can cause problems in computation. The computer program will provide the user with an actuarial report on how well the resulting model fits the input data, as well as statistics for risk management and capital requirements. The program, once complete, can be used in future actuarial research and by practicing actuaries in insurance companies. Actuaries can use

Proc. IMS2004

3

the program to conduct analysis on a variety of insurance model outcomes that might have been too time-prohibitive or onerous to do in the past.

3.
3.1

Overview of AMOOF
Program functionality

è A Mathematica program which finds the parametric distribution (i.e. a probability density function (PDF)) that best fits the user’s stochastic actuarial model outcomes. è Can import data from the user’s text files and export the results as graphic files to a report or presentation. è Tests all the continuous distribution families for more than 22 different distribution types. è Displays results as presentation-quality graphs and a full statistical report. è Automatic goodness-of-fit testing and CTE90 (CTE% conditional tail expectations/percentiles) calculation and comparisons. è On-line Help and tutorial.

4

Proc. IMS2004

3.2

Features of AMOOF

AMOOF will accept data in a variety of formats, including: 1. data points with equal probabilities; 2. data points with unequal probabilities. Analysis of large data sets will be supported. Input data for AMOOF can be read from a text file, entered by hand, or cut and pasted from another Windows application. Because the majority of insurance model outcomes are of the continuous type, and discrete outcomes such as claim frequency can be handled by a different approach, this project focuses on continuous probability distributions only. AMOOF tests more than twenty-two continuous distributions to determine which distribution best fits the data. AMOOF may skip some distributions that are obviously not good fits by checking the skewness. AMOOF determines the optimal parameters for selected distributions, performing three standard tests to determine the goodness-of-fit: chi-square, Anderson-Darling, and Kolmogorov-Smirnov. When AMOOF completes the process, the optimal distributions are listed in order of goodness-of-fit. The PDF's available in AMOOF include the following continuous distributions, selected from a well-known actuarial textbook [Klugman 1998]. The Appendix gives the PDF's for each of them. 1. Transformed Beta 2. Generalized Pareto 3. Burr 4. Inverse Burr 5. Pareto 6. Inverse Pareto 7. Loglogistic 8. Paralogistic 9. Inverse Paralogistic 10. Transformed Gamma 11. Inverse Transformed Gamma 12. Gamma 13. Inverse Gamma 14. Weibull 15. Inverse Weibull 16. Exponential 17. Inverse Exponential

Proc. IMS2004

5

18. Lognormal 19. Inverse Gaussian 20. Single-parameter Pareto 21. Generalized beta 22. Beta 23. Mixed (1-22)

4.

How AMOOF Works

AMOOF's goal is to find the distribution that best fits the input data. AMOOF does not produce an absolute answer: it identifies a density function that most likely produced the input data from a stochastic model. For a given distribution candidate for the input data set, AMOOF looks for the parameters of the density function that optimize the value of the likelihood function of the data set, a measurement of the probability that the input data was produced by the given distribution. AMOOF allows results to be evaluated both quantitatively and qualitatively, by examining both comparison graphs and statistics. Before maximizing the likelihood function of the input data, initial guesses of parameter values are produced using the method of moments. These guesses are equal to the parametric moments and sample moments. The goodness-of-fit measures how well a parametric PDF fits the data set and is calculated for each PDF. All fitted PDF’s are compared and the one with the lowest goodness-of-fit value is considered the best fit. AMOOF provides the information needed to decide which fit is the best, and whether that fit is good enough to use. All results, including graphs, statistics, and distribution functions, can easily be transferred to other programs for further analysis and presentation. The AMOOF Report has a number of features to display results and perform advanced analyses as follows. 1. Graphics: High resolution graphics are used to present results. Comparison, difference, PP,and QQ graphs all lead to a powerful visual presentation of results. 2. Detailed Statistics: AMOOF can generate a full statistical report. Detailed statistics (mean, variance, skewness, etc.) are reported for the input data and all results. Conditional tail expectations can be calculated and presented.

6

Proc. IMS2004

3. Confidence Levels: AMOOF generates Confidence Levels and Critical Values for each distribution (PDF) tested. These results are available in the statistics report. These statistics can be used to determine how good a fit AMOOF has found. If a particular distribution is rejected for all available confidence levels, this is indicated in the report. 4. The user can also have AMOOF compare the data set to a specific distribution by specifying the parameters.

5.

Programming

The main challenge of this project is to write Mathematica code to create an effective user interface, import data sets, choose appropriate probability models, optimize the likelihood functions, conduct statistical tests, perform calculus operations, and create numerical and graphical reports for statistical or actuarial use. The authors wrote the following code example to illustrate the basic data fitting process. Two probability density functions/models were used to fit a given data set. Both Gamma and mixed Gamma density functions were fitted to a data set of one hundred values. To fit the best model, the likelihood functions were defined and maximized. This was followed by Kolmogorov-Smirnov tests to check the goodness of fit. The CTE code (not in the code example) involves computing improper integrals that can be done only in Mathematica if best accuracy is required. Spreadsheet programs and numerical recipes can calculate CTE with less accuracy for some distribution models. Many of the tasks outlined in this paper can be carried out using a variety of statistical software and VBA programming in Excel. As part of the project the authors would like to compare Mathematica’s Optimizer with Excel’s add-in Solver tool, which has been extensively used in the financial industry.

6.

Code Example

Below is a sample data set of 100 values. 19.78, 20.762, 23.058, 24.365, 27.216, 28.595, 29.569, 31.7, 32.438, 32.466, 33.649, 37.072, 38.599, 38.818, 38.964, 42.403, 42.959, 44.655, 46.935, 47.015, 47.078, 47.156, 47.787, 48.02, 49.448, 49.597, 49.612, 51.256, 51.632, 53.547, 54.179, 55.874, 55.91, 55.984, 56.526, 59.556, 61.071, 61.941, 61.951, 65.231, 65.354, 66.349, 72.411, 75.999, 76.355, 79.237, 79.891, 84.059, 101.718, 104.379, 158.74, 160.442, 162.186, 176.292, 190.798, 221.739, 230.269, 230.711, 231.018, 231.021, 235.92, 241.679, 243.443, 243.989, 244.53,

Proc. IMS2004

7

244.694, 257.412, 257.475, 261.791, 271.918, 277.635, 279.834, 281.53, 282.391, 285.835, 290.84, 296.917, 301.806, 309.584, 320.709, 323.789, 328.277, 332.701, 345.071, 346.511, 349.923, 358.246, 359.094, 369.45, 371.69, 375.842, 377.236, 386.892, 391.95, 399.141, 407.217, 409.285, 458.163, 471.869, 486.527 We will fit a Gamma distribution to this data set, optimizing the Gamma distribution parameters a and q using the method of maximum likelihood. Then we will fit a mixed probability density function consisting of a convex combination of two Gamma densities. This combination is optimized using the method of maximum likelihood. We then apply the Kolmogorov-Smirnov test to see which distribution best fits the data. The Gamma PDF and the Gamma cumulative distribution function are shown below with the data. H L
?H L

In[5]:=

gammaPDF@?_, ?_, x_ D :=

x Gamma@?D

x ?

?

x ?

gammaCDF@?_, ?_, u_ D := IntegrateA

data = 819.78, 20.762, 23.058, 24.365, 27.216, 28.595, 29.569, 31.7, 32.438, 32.466, 33.649, 37.072, 38.599, 38.818, 38.964, 42.403, 42.959, 44.655, 46.935, 47.015, 47.078, 47.156, 47.787, 48.02, 49.448, 49.597, 49.612, 51.256, 51.632, 53.547, 54.179, 55.874, 55.91, 55.984, 56.526, 59.556, 61.071, 61.941, 61.951, 65.231, 65.354, 66.349, 72.411, 75.999, 76.355, 79.237, 79.891, 84.059, 101.718, 104.379, 158.74, 160.442, 162.186, 176.292, 190.798, 221.739, 230.269, 230.711, 231.018, 231.021, 235.92, 241.679, 243.443, 243.989, 244.53, 244.694, 257.412, 257.475, 261.791, 271.918, 277.635, 279.834, 281.53, 282.391, 285.835, 290.84, 296.917, 301.806, 309.584, 320.709, 323.789, 328.277, 332.701, 345.071, 346.511, 349.923, 358.246, 359.094, 369.45, 371.69, 375.842, 377.236, 386.892, 391.95, 399.141, 407.217, 409.285, 458.163, 471.869, 486.527<; dLen = Length@dataD; dMax = Max@dataD; dUpper = HQuotient@dMax, 10D + 1L 10;

x Gamma@?D

H

x ?

L

?

?H

x ?

L

, 8x, 0, u<E;

Given a data array 8x1 , x2 , ... , xN < and a probability density function f (x, q) depending on a N parameter (or vector of parameters) q, the likelihood function is LHqL = ¤k =1 f H xk , q L. The N log-likelihood function is lnH LHqLL = ?k=1 lnH f H xk LL. The method of maximum likelihood chooses the parameters q to maximize the log-likelihood function.

8

Proc. IMS2004

In[15]:=

logLikelihoodGamma@?_, ?_ D := Plus @@ HLog@gammaPDF@?, ?, #DD & ê@ dataL

Next we form a mixed distribution by taking a convex combination of two Gamma distributions (pœ (0,1)).
In[12]:=

mixedGammaPDF@?1_, ?1_, ?2_, ?2_, p_, x_ D := p gammaPDF@?1, ?1, xD + H1 ? pL gammaPDF@?2, ?2, xD

logLikelihoodMixedGamma@?1_, ?1_, ?2_, ?2_, p_ D := Plus @@ HLog@mixedGammaPDF@?1, ?1, ?2, ?2, p, #DD & ê@ dataL Then we find the optimal Maximum Likelihood parameter values for the Gamma distribution.
In[18]:=

mixedGammaCDF@?1_, ?1_, ?2_, ?2_, p_, x_ D := p gammaCDF@?1, ?1, xD + H1 ? pL gammaCDF@?2, ?2, xD

solGamma = NMaximize@8logLikelihoodGamma@?, ?D, ? > 0, ? > 0<, 8?, ?<, AccuracyGoal ?> 5, PrecisionGoal ?> 5, MaxIterations ?> 500D 8? 614.307, 8? ? 1.38601, ? ? 127.287<<

Out[18]=

Similarly we find the optimal Maximum Likelihood parameter values for the mixed Gamma distribution.
In[19]:=

solMixedGamma = NMaximize@8logLikelihoodMixedGamma@?1, ?1, ?2, ?2, pD, ?1 > 0, ?2 > 0, ?1 > 0, ?2 > 0, p > 0, p < 1<, 8?1, ?1, ?2, ?2, p<, AccuracyGoal ?> 5, PrecisionGoal ?> 5, MaxIterations ?> 500D 8? 574.517, 8p ? 0.499756, ?1 ? 7.32034, ?2 ? 13.7522, ?1 ? 7.02438, ?2 ? 21.8901<<

Out[19]=

Using the Kolmogorov-Smirnov test, we evaluate how well the data fit the distribution, first for the single Gamma distribution and then for the mixed distribution.

Proc. IMS2004

9

In[100]:=

cdflistGamma = HgammaCDF@?, ?, #D & ê@ data L ê. solGamma@@2DD; y= pValGamma = NestWhile@H2 H? 1L ^ Hj + 1L Exp@? 2 j ^ 2 y ^ 2D + #L &, j = 1; 0, HAbs@H? 1L ^ j ++ H#1 ? #2LD > 10 ^ H? 8LL &, 2D

è!!!!!!!!!!! dLen Max@Abs@cdflistGamma ? Range@1. ê dLen, 1, 1. ê dLenDDD;

Out[102]=

0.00458965 The very small P-value obtained is indicative of a poor fit, as is seen in the graph below (Figure 1).
In[103]:=

MultipleListPlot@Transpose@8data, cdflistGamma<D, Transpose@8data, Range@1. ê dLen, 1, 1. ê dLenD<D, PlotJoined ? True, SymbolStyle ? 8Red, Blue<, PlotLabel ?> StyleForm@ "Single ? CDF vs Sample CDF\n HK?S test P?value .0046L", FontSize ? 14, FontWeight ? BoldD, DefaultColor ? Blue, Background ? Yellow, Frame ? TrueD;

Single ? CDF vs Sample CDF HK?S test P?value .0046L
1

0.8

0.6

0.4

0.2

0 0 100 200 Figure 1: Poor fit using a single Gamma function 300 400

10

Proc. IMS2004

In[90]:=

cdflistMixedGamma = HmixedGammaCDF@?1, ?1, ?2, ?2, p, #D & ê@ data L ê. solMixedGamma@@2DD; y=

è!!!!!!!!!!! dLen Max@Abs@cdflistMixedGamma ? Range@1. ê dLen, 1, 1. ê dLenDDD;

pValMixedGamma = NestWhile@H2 H? 1L ^ Hj + 1L Exp@? 2 j ^ 2 y ^ 2D + #L &, j = 1; 0, HAbs@H? 1L ^ j ++ H#1 ? #2LD > 10 ^ H? 8LL &, 2D

Out[95]=

0.999421

The high P-value obtained shows that the mixed distribution fits the data very well.
In[104]:=

MultipleListPlot@Transpose@8data, cdflistMixedGamma<D, Transpose@8data, Range@1. ê dLen, 1, 1. ê dLenD<D, PlotJoined ? True, SymbolStyle ? 8Red, Blue<, PlotLabel ? StyleForm@"Mixed ? CDF vs Sample CDF\n HK?S test P?value .39L", FontSize ? 14, FontWeight ? BoldD, DefaultColor ? Blue, Background ? Yellow, Frame ? TrueD;

Mixed ? CDF vs Sample CDF HK?S test P?value .39L
1

0.8

0.6

0.4

0.2

0 0 100 200 Figure 2: Good fit using a pair of Gamma functions 300 400

Proc. IMS2004

11

Next, we create a histogram of the data, and superimpose upon it the PDF for both the optimal Gamma density and the optimal mixed Gamma density (Figure 3). Show@ Histogram@1 ê dUpper Count@data, x_ ê; # <= x && x < # + 5D & ê@ Range@0, dUpper, 5D, FrequencyData ? True, HistogramRange ? 80, dUpper<, HistogramCategories ? Automatic, DisplayFunction ?> IdentityD, Plot@gammaPDF@?, ?, xD ê. solGamma@@2DD, 8x, 0, dUpper<, DisplayFunction ?> Identity, PlotStyle ? [email protected]<D, PlotLabel ? StyleForm@"Single ? PDF vs Sample Histogram", FontSize ? 14, FontWeight ? BoldD, DisplayFunction ?> $DisplayFunction, DefaultColor ? Blue, Background ? Yellow, Frame ? TrueD; Show@Histogram@1 ê dUpper Count@data, x_ ê; # <= x && x < # + 5D & ê@ Range@0, dUpper, 5D, FrequencyData ? True, HistogramRange ? 80, dUpper<, HistogramCategories ? Automatic, DisplayFunction ?> IdentityD, Plot@mixedGammaPDF@?1, ?1, ?2, ?2, p, xD ê. solMixedGamma@@2DD, 8x, 0, dUpper<, DisplayFunction ?> Identity, PlotStyle ? [email protected]<D, DisplayFunction ?> $DisplayFunction, PlotLabel ? StyleForm@"Mixed ? PDF vs Sample Histogram", FontSize ? 14, FontWeight ? BoldD, DefaultColor ? Blue, Background ? Yellow, Frame ? TrueD;

In[131]:=

Single ? PDF vs Sample Histogram
0.0175 0.015 0.0125 0.01 0.0075 0.005 0.0025 0 100 200 300 400 500

12

Proc. IMS2004

Mixed ? PDF vs Sample Histogram
0.0175 0.015 0.0125 0.01 0.0075 0.005 0.0025 0 100 200 Figure 3: Superposition of the data on the optimal Gamma density and the optimal mixed Gamma density 300 400 500

As is seen in this example, a single Gamma distribution does not fit the data well. However, an optimally chosen mixed distribution which is a convex combination of two Gamma distributions fits the data very well. The use of such mixed distributions to obtain better fits is a major aspect of this project. Mathematica's powerful symbolic capabilities, optimization routines, data manipulation features, and flexible data visualization framework all combine to make it feasible to carry out calculations such as shown here.

7.

References

[Klugman 1998] Klugman, Stuart A., Panjer, H. H., and Willmot, G. E., Loss Models, Wiley-Interscience (1998). [Longley-Cook 2003] Longley-Cook, A. G., Efficient Stochastic Modeling Using Representative Scenarios: Application to Equity Risks. The Stochastic Modeling Symposium (2003). [Chueh 2002] Chueh, Y., Efficient Stochastic Modeling for Large and Consolidated Insurance Business: Interest Rate Sampling Algorithms. North American Actuarial Journal, Volume 6, no. 3 (2002).

Proc. IMS2004

13

Appendix

14

Proc. IMS2004

Continuous Probability Densities and Related Quantities
PDF : probability density function f H xL Moments : the k ? th moment is the expectation EHX k L = ‡ xk f H xL x
a ? Conditional Tail Ÿ0 x f H xL x Ÿb x f H xL x Expectations : EHX » X < aL = and EHX » X > bL = ? a Ÿb f H x L x Ÿ0 f H xL x

1. Transformed Beta

f HxL =

E HX k L =

?? ?H? + ?L H? H x ? L L ?+? ? H?H?L ?H?LL Hx HH x L ? L + 1L

?k ? H? + k ê ?L ? H? ? k ê ?L , ??? < k < ?? ? H?L ? H?L

2. Generalized Pareto

f HxL =

?@? + ?D ?? x??1 ?@?D ?@?D Hx + ?L?+?

E HX k L = 3. Burr f H xL = E HX k L = 4. Inverse Burr f H xL = E HX k L = 5. Pareto f HxL = E HX k L =

?k ? H? + kL ? H? ? kL , ?? < k < ? ? H?L ? H?L ??H
x ?

x HH

?k ? H1 + k ê ?L ? H? ? k ê ?L , ?? < k < ?? ? H?L ??H
x ?

L? + 1L
x ?

L?

?+1

x HH

?k ? H? + k ê ?L ? H1 ? k ê ?L , ??? < k < ? ? H?L

L + 1L
x ? ?

L? ?

?+1

? ?? Hx + ?L?+1

?k ? H1 + kL ? H? ? kL , ?1 < k < ? ? H?L

Proc. IMS2004

15

6. Inverse Pareto

f H xL = E HX k L =

? ? x??1 Hx + ?L?+1

?k ? H? + kL ? H1 ? kL , ?? < k < 1 ? H?L ?H
x ? ? L x L? + 1L2 ?

7. Loglogistic

f HxL =

E HX k L = ?k ? H1 + k ê ?L ? H1 ? k ê ?L, ?? < k < ? 8. Paralogistic f HxL = E HX k L = ?2 H
x ?

x HH

x HH

L + 1L
x ? ?

L?

?+1

?k ? H1 + k ê ?L ? H? ? k ê ?L , ?? < k < ?2 ? H?L ?2 H
x ?

9. Inverse Paralogistic

f H xL =

x HH

L + 1L
x ? ?

L?

2

?+1

E HX k L =

?k ? H? + k ê ?L ? H1 ? k ê ?L , ??2 < k < ? ? H?L ?H L? ? ?H ? L x ?H?L
x

10. Transformed Gamma

f HxL = E HX k L =

x ?

?

?k ? H? + k ê ?L , ??? < k ? H?L ?H
?H x L L x ?H?L ??
?

11. Inverse Transformed Gamma f HxL =
? x

?

E HX k L =

?k ? H? ? k ê ?L , ?? > k ? H?L

16

Proc. IMS2004

12. Gamma

f HxL =

H

E HX k L = H

L? ? ? x ?H?L
x ?
x

?k ? H? + kL , ? > ?k ? H?L

13. Inverse Gamma

f H xL =

E HX k L =

L ?x x ?H?L
? x ?
?

?k ? H? ? kL ,?>k ? H?L L? x
?H L?

14. Weibull

f HxL =

?H

x ?

x ?

E HX k L = ?k ? H1 + k ê ?L, k > ?? 15. Inverse Weibull f HxL = ?H L x
? ?H L

? x

? x

?

E HX k L = ?k ? H1 ? k ê ?L, k < ?

16. Exponential

f H xL =

?

x ?

?

E HX k L = ?k ? H1 + kL, k > ? 1

17. Inverse Exponential

f HxL =

?

?

? x

x2

E HX k L = ?k ? H1 ? kL, k < 1

18. Lognormal

Proc. IMS2004

17

f HxL =

?

HlogHxL?µL2 2 ?2

E HX k L = exp Hkµ + k2 ?2 ê 2L

x?

è!!!!!!! 2?

19. Inverse Gaussian

? f HxL = $%%%%%%%%%%%%%%% 2 ? x3

Hx?µL ? ?2 x µ2

2

E HX L = µ, Var HXL = 20. Single ? parameter Pareto f HxL = ? ?? x?+1 ??k ,k<? ??k

µ3 ?

E HX k L = 21. Generalized beta

E HX k L =

?Ha + bL H x L? a H1 ? H x L? L ? ? f HxL = H?HaL ?HbLL x

b?1

?

?k ? Ha + bL ? Ha + k ê ?L , k > ? a? ? HaL ? Ha + b + k ê ?L

22. Beta

23. Mixed H1 ? 22L

E HXk L =

a x b?1 ?Ha + bL H x ? L H1 ? ? L f HxL = H?HaL ?HbLL x

?k ? Ha + bL ? Ha + kL , k > ?a ? HaL ? Ha + b + kL

f HxL = p f1 HxL + H1 ? pL f2 HxL, 0 < p < 1

Null

doc_657159757.pdf

White Paper for Optimal Probability Density Function Models

Attachments