# Associations and Social Media: Managing the Risks and Liabilities

## Industrial Fire Insurance Data

In order to highlight the methodology briefly discussed in the previous sections, we first apply it to 8043 industrial fire insurance claims. We show how a tail fit and the resulting quantile estimates can be obtained. Clearly, a full analysis (as found, for instance, in Rootzén and Tajvidi 1997 for windstorm data) would require much more work.

Figure 4 contains the log histogram of the data. The right-skewness stresses the long-tailed behavior of the underlying data. A useful plot for specifying the long-tailed nature of data is the mean-excess plot given in Figure 5. In it, the mean-excess function $e(u) = E(X - u \mid X > u)$ is estimated by its empirical counterpart

$$e_n(u) = \frac{1}{\#\{1 \le i \le n : X_i > u\}} \sum_{i=1}^{n} (X_i - u)^{+}.$$

The Pareto df can be characterized by linearity (positive slope) of $e(u)$. In general, long-tailed df's exhibit an upward sloping behavior, exponential-type df's have roughly a constant mean-excess plot, whereas short-tailed data yield a plot decreasing to 0. In our case, the upward trend clearly stresses the long-tailed behavior. The increase in variability toward the upper end of the plot is characteristic of the technique, since toward the largest observation $X_{1,n}$, only a few data points go into the calculation of $e_n(u)$.

The main aim of our EVT analysis is to find a fit of the underlying df $F(x)$ (or of its tail $\bar{F}(x)$) by a generalized Pareto df, especially for the larger values of $x$. The empirical estimator $\bar{F}_n$ of the tail is given in Figure 6 on a doubly logarithmic scale. This scale is used to highlight the tail region. Here an exact Pareto df corresponds to a linear plot.

NORTH AMERICAN ACTUARIAL JOURNAL, VOLUME 3, NUMBER 2

Figure 6: Empirical Estimator of $\bar{F}$ on Doubly Logarithmic Scale

Figure 7: Maximum Likelihood Estimate of $\xi$ as a Function of the Threshold $u$ (top), Alternatively as a Function of the Number of Exceedances
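The empirical mean-excess function described above can be sketched in a few lines. This is only an illustration under stated assumptions: the function name `mean_excess`, the synthetic Pareto sample, and the tail index `alpha = 1.5` are hypothetical choices made here, and the sample is simulated rather than taken from the fire insurance claims.

```python
import numpy as np

def mean_excess(data, thresholds):
    """Empirical mean-excess function e_n(u), evaluated at each threshold u."""
    x = np.asarray(data, dtype=float)
    out = []
    for u in np.atleast_1d(thresholds):
        excesses = x[x > u] - u  # (X_i - u) for the observations with X_i > u
        # e_n(u) is undefined when no observation exceeds u
        out.append(excesses.mean() if excesses.size else np.nan)
    return np.array(out)

# Synthetic sample (NOT the fire data): classical Pareto on [1, inf).
# numpy's pareto() draws from the Lomax (Pareto II) law, so we shift by 1.
rng = np.random.default_rng(0)
alpha = 1.5  # assumed tail index, chosen for illustration only
sample = rng.pareto(alpha, size=8043) + 1.0

# Evaluate at the order statistics, dropping the largest one so that
# every threshold has at least one exceedance.
u_grid = np.sort(sample)[:-1]
e_n = mean_excess(sample, u_grid)
```

Plotting `e_n` against `u_grid` gives the mean-excess plot. For an exact Pareto df with tail index $\alpha > 1$ the theoretical mean-excess function is $e(u) = u/(\alpha - 1)$, so the simulated plot should look roughly linear with positive slope and become noisy near the largest order statistics, where few points enter each average.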
Using the theory presented in Theorems 2 and 4, a maximum-likelihood-based approach yields estimates for the parameters of the extreme value df $H_{\xi;\mu,\sigma}$ and the generalized Pareto df $G_{\xi;\beta}$. In order to start this procedure, a threshold value $u$ has to be chosen, as estimates depend on the excesses over this threshold. The estimates of the key shape parameter $\xi$ as a function of $u$ (alternatively as a function of the number of order statistics used) are given in Figure 7, together with approximate 95% confidence intervals. The picture shows a rather stable behavior for values of $u$ below 300. An estimate in the range (0.7, 0.9) results, which corresponds to an $\alpha$-value in the range (1.1, 1.4), since $\alpha = 1/\xi$. It should be remarked that the "optimal" value of the threshold $u$ to be used is difficult (if not impossible) to obtain. See Embrechts, Klüppelberg, and Mikosch (1997, p. 351) and Beirlant, Teugels, and Vynckier (1996) for some discussion. We also would like to stress that in order to produce Figure 7, a multitude of models (one for each $u$ chosen) has to be estimated. For each given $u$, a tail fit for $F_u$ and $\bar{F}$ (as in (10)) can be obtained. For the former, in the case of $u = 100$ an estimate $\hat{\xi} = 0.747$ results. A graphical representation of $F_{100}$ is given in Figure 8. Using the parameter estimates corresponding to $u = 100$ in (10), the tail fit of $\bar{F}$ on a doubly logarithmic scale is given