Research article

Longevity risk analysis: applications to the Italian regional data

  • Received: 03 March 2022 Revised: 27 March 2022 Accepted: 28 March 2022 Published: 30 March 2022
  • JEL Codes: C02, C15, C22

  • Longevity risk is the risk that members of a given population will live longer than expected. When it occurs, pension providers may have to pay pensions for longer than expected, significantly increasing their costs. While this risk is being adequately studied using the national mortality data provided by the Human Mortality Database, relatively few studies exist that analyse sub-national data. This manuscript proposes a comparative study of some stochastic mortality models to measure the longevity risk on Italian mortality data at the regional level. In particular, the use of the Lee-Carter and Li-Lee models is explored. The models are compared in fitting quality, forecasting accuracy and complexity. Numerical experiments and applications to immediate life annuity evaluation are presented.

    Citation: Salvatore Scognamiglio. Longevity risk analysis: applications to the Italian regional data[J]. Quantitative Finance and Economics, 2022, 6(1): 138-157. doi: 10.3934/QFE.2022006




    In recent decades, life expectancy has been increasing in most developed countries, mainly thanks to improvements in nutrition, hygiene, medical technology, health care and lifestyle. EUROSTAT statistics show that life expectancy at age 65 has increased over the last 30 years in almost all European countries (by 27% in Italy, 28% in Spain and 25% in Greece). These improvements, which individuals generally perceive as positive, raise retirement costs and pose significant challenges for governments as well as for individual pension funds and life insurers, as described in De Waegenaere et al. (2010). For example, consider the savings needed to finance a pension stream that pays 1 per year. The expected present value of such an annuity for an Italian individual aged 65 (with an interest rate equal to 0) increased from 17.2 in 1990 to 21.8 in 2020. Pension providers and actuaries should account for these longevity improvements in life insurance pricing and reserving to avoid underestimating their future liabilities.

    There is extensive literature on mortality forecasting, especially in the category of extrapolation methods; see for example Renshaw & Haberman (2003); Currie et al. (2004); Cairns et al. (2006, 2009). The model proposed by Lee & Carter (1992) (LC) is the best-known approach to stochastically model and forecast the mortality rates of a given population. Their model decomposes the age-time matrix of mortality rates into a bilinear combination of age and period parameters using Principal Component Analysis (PCA). Forecasting is performed by projecting the time-index component into the future with time-series models. A formal description of the LC model is presented in the next section. The literature is rich in contributions that extend it in different directions or develop different models. Brouhns et al. (2002) proposed an alternative to the Ordinary Least Squares estimation approach of the classical LC method by assuming a Poisson distribution for the number of deaths and employing maximum likelihood for parameter estimation. Renshaw & Haberman (2003) explored multi-factor extensions of the LC model, while Renshaw & Haberman (2006) suggested the incorporation of a cohort effect. Other extensions of the LC model can be found in Currie (2013); Nigri et al. (2019); Gao & Shi (2021). Another very popular mortality model is the Cairns-Blake-Dowd (CBD) model proposed in Cairns et al. (2006). It proceeds by fitting a parametric mortality model to each calendar year of mortality experience separately and then extrapolating the coefficients to future years with time series models. Many extensions of the CBD model have been proposed and investigated. Cairns et al. (2009) presented some of them, augmenting the classical CBD model with a quadratic component in the parametric model or a cohort effect. Hyndman & Ullah (2007) introduced a functional data approach in which the mortality data of each year are smoothed via constrained regression splines before fitting a model using principal components decomposition, and Hainaut & Denuit (2020) further extended the idea by suggesting a wavelet-based decomposition. Despite these numerous contributions, the LC model remains extensively used by practitioners and academics thanks to its simplicity and reasonable forecasting accuracy.

    Enchev et al. (2017) remark that the drivers of the longevity improvements mentioned above often spread quickly, and the mortality of different populations appears, in some way, correlated. For instance, adverse events such as pandemics or wars can have a transversal impact on the mortality rates of many countries. For this reason, the study of models able to describe the mortality dynamics of multiple populations has aroused interest in recent years. Populations can differ in various features such as gender, country or geographical area. One of the most straightforward approaches to multi-population mortality modelling consists of using a set of independent models. Single-population mortality models, e.g. LC models, are applied to the considered populations individually, so that each population's mortality is described by its own model. However, this approach completely ignores the dependency among the mortality of the different populations. Some authors address this issue by introducing common terms in the single-population models. A very popular model is the (Augmented) Common Factor model developed in Li & Lee (2005), which proposes a double log-bilinear mortality model augmenting common age and period effects with sub-population-specific age and period effects. An attractive property of this model is that it produces "coherent mortality forecasts", in that it ensures that long-term forecasts do not diverge among the populations. A Poisson version of the Augmented Common Factor model is proposed in Li (2013). In contrast, Kleinow (2015) relaxes the coherence assumption by imposing that only the age-specific LC parameters that modulate the period effect are common to all populations, while a different time index is fitted to each population. Other examples of multi-population mortality models can be found in Hyndman et al. (2013); Schnürch et al. (2021); Chen et al. (2015).

    While multi-population mortality models for populations belonging to different countries or genders have been extensively investigated at the country level, few studies analyse sub-national and regional mortality data. A recent application of multi-population models to United Kingdom data can be found in Chen & Millossovich (2018), while Shang & Yang (2021) analyse Australian sub-national data. One of the few studies that analyse sub-national Italian data is presented in Danesi et al. (2004), where a comparison of some single-population models is discussed. This manuscript proposes a comparative study between single-population and multi-population mortality models on Italian mortality data at the regional level. Italy represents an interesting case study for two reasons. On the one hand, the differences among the Italian regions (in terms of socio-economic development, living conditions, and historical differences), particularly between those located in the north and south of the country, represent a longstanding issue that is currently highly debated in the literature; see for example Franzini & Giannoni (2010). It is reasonable to think that these differences could induce differences in mortality among populations living in the different Italian areas. On the other hand, all the Italian regions share the same regulatory, political, and health systems, which induces some dependency structure among the regional mortality rates. The application of single-population LC models and the Li and Lee model is analysed. The comparison appears interesting since both models adopt a linear structure for the time component. However, the approach based on single-population LC models assumes total independence among the mortality of the different regions, while the Li and Lee model assumes that a single factor drives the mortality of the different regions and only short-term divergences are allowed.

    The remainder of the paper is organised as follows. Section 2 provides a formal description of the mortality models considered; Section 3 describes the numerical experiments and the results; Section 4 shows an application to life annuity pricing; and Section 5 concludes.

    This section introduces the stochastic mortality models used in this research. We denote by $X=\{x_0,x_1,\dots,x_\omega\}$ the set of age categories, by $T=\{t_0,t_1,\dots,t_n\}$ the set of calendar years considered, and by $I=\{\text{pop}_1,\text{pop}_2,\dots,\text{pop}_m\}$ the set of sub-populations considered.

    The LC model is the most popular approach to model the mortality of a single population. It specifies a log-bilinear form for the logarithm of the central death rate $\log m_{x,t}\in\mathbb{R}$ at age $x\in X$ in the year $t\in T$ in a given population:

    $$\log(m_{x,t})=\alpha_x+\beta_x\kappa_t+\epsilon_{x,t},\quad\text{with i.i.d. }\epsilon_{x,t}\sim N(0,\sigma^2_\epsilon) \tag{1}$$

    where $\alpha_x\in\mathbb{R}$ describes the average pattern of mortality for the age group; $\beta_x\in\mathbb{R}$ represents the age-specific patterns of mortality change, indicating the sensitivity of the logarithm of the force of mortality at age $x$ to variations in the time index $\kappa_t$; $\kappa_t\in\mathbb{R}$ explains the time trend of the general mortality level; and $\epsilon_{x,t}$ represents the deviation of the model from the observed log-central death rate. To avoid identifiability problems, the authors suggest imposing the following constraints

    $$\sum_{x\in X}\beta_x=1,\qquad\sum_{t\in T}\kappa_t=0 \tag{2}$$
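    The role of these constraints can be checked with a short calculation. The constants $c$ and $d$ below are introduced here only for illustration and are not part of the original notation: for any $c\in\mathbb{R}$ and any $d\neq 0$, a transformed parameter set reproduces exactly the same fitted log-rates, so the parameters are not unique unless a normalisation such as (2) is imposed.

```latex
\begin{aligned}
\tilde\alpha_x &= \alpha_x + c\,\beta_x, \qquad
\tilde\beta_x = \beta_x / d, \qquad
\tilde\kappa_t = d\,(\kappa_t - c), \\
\tilde\alpha_x + \tilde\beta_x\tilde\kappa_t
  &= \alpha_x + c\,\beta_x + \frac{\beta_x}{d}\, d\,(\kappa_t - c)
   = \alpha_x + \beta_x\kappa_t .
\end{aligned}
```

    Choosing $c$ as the mean of the $\kappa_t$ and $d$ so that the rescaled $\beta_x$ sum to one recovers the constraints in (2).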

    The estimates of the LC parameters are obtained by solving the optimisation problem

    $$\underset{(\alpha_x)_x,(\beta_x)_x,(\kappa_t)_t}{\arg\min}\ \sum_{x\in X}\sum_{t\in T}\left(\log(m_{x,t})-\alpha_x-\beta_x\kappa_t\right)^2 \tag{3}$$

    The $(\alpha_x)_x$* are estimated as the logarithm of the geometric mean of the crude mortality rates, averaged over all $t$, for each $x\in X$

    * The notation $(\alpha^{(i)}_x)_x$ indicates the curve of the $\alpha_x$ parameters across the ages $x$ for a given population $i$. The same notation is used below for the other parameters.

    $$\hat\alpha_x=\log\Big(\prod_{t\in T}(m_{x,t})^{1/|T|}\Big)$$

    while $(\kappa_t)_t$ and $(\beta_x)_x$ are estimated by applying a first-order Singular Value Decomposition (SVD) to the centred log-mortality matrix $H=\{h_{x,t}\}_{x\in X,t\in T}\in\mathbb{R}^{|X|\times|T|}$ where

    $$h_{x,t}=\log(m_{x,t})-\hat\alpha_x$$
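    The estimation steps above can be sketched in a few lines of NumPy. This is a minimal illustration on synthetic, noise-free data, not the code used in the study; the function name `fit_lee_carter` and all numerical values are assumptions made for the example.

```python
import numpy as np

def fit_lee_carter(log_m):
    """Fit log m[x, t] = alpha[x] + beta[x] * kappa[t] with a rank-1 SVD.

    log_m: array of shape (n_ages, n_years) holding log central death rates.
    """
    # alpha_x is the time average of the log rates (log of the geometric mean).
    alpha = log_m.mean(axis=1)
    # Centred matrix H with entries h[x, t] = log m[x, t] - alpha[x].
    H = log_m - alpha[:, None]
    # The leading singular triplet is the best rank-1 approximation of H.
    U, s, Vt = np.linalg.svd(H, full_matrices=False)
    beta = U[:, 0]
    kappa = s[0] * Vt[0, :]
    # Rescale so that sum(beta) = 1 (this assumes the sum is not zero).
    # kappa already sums to zero because every row of H has zero mean.
    scale = beta.sum()
    return alpha, beta / scale, kappa * scale

# Synthetic example with hypothetical parameter values.
true_alpha = np.linspace(-6.0, -2.0, 5)
true_beta = np.array([0.10, 0.15, 0.20, 0.25, 0.30])   # sums to one
true_kappa = np.linspace(3.5, -3.5, 8)                  # sums to zero
log_m = true_alpha[:, None] + np.outer(true_beta, true_kappa)

alpha, beta, kappa = fit_lee_carter(log_m)
print(np.round(beta, 4), np.round(kappa, 4))
```

    Because the synthetic rates are exactly log-bilinear, the sketch recovers the generating parameters; on real data the rank-1 term only approximates the centred matrix.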

    To forecast future mortality rates, the model assumes that the $\alpha_x$ and $\beta_x$ parameters remain constant over time and forecasts future values of $\kappa_t$ using a standard univariate time series model. Although several ARIMA($p,d,q$) models could be considered, in practice the random walk with drift model is used almost exclusively:

    $$\kappa_t=\kappa_{t-1}+\gamma+\xi_t,\quad\text{with i.i.d. }\xi_t\sim N(0,\sigma^2_\xi) \tag{4}$$

    where $\gamma\in\mathbb{R}$ is the drift. When the aim is to model and forecast the mortality of many different sub-populations $i\in I$, one could apply an LC model to each subgroup considered. In this setting, the mortality of each sub-group is described independently of the others. The model reads:

    $$\log(m^{(i)}_{x,t})=\alpha^{(i)}_x+\beta^{(i)}_x\kappa^{(i)}_t+\epsilon^{(i)}_{x,t},\quad\forall i\in I \tag{5}$$

    The model fitting is performed individually for each $i\in I$, and the population-specific time indices $\kappa^{(i)}_t$ are projected with independent ARIMA(0, 1, 0) models.
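    As an illustration of the random walk with drift in (4), the sketch below estimates the drift and the innovation standard deviation from a fitted time index and produces point forecasts with approximate 95% bands. The function names, the toy series and the decision to ignore parameter uncertainty in the bands are assumptions made for this example.

```python
import numpy as np

def fit_rwd(kappa):
    """Estimate gamma and sigma in kappa_t = kappa_{t-1} + gamma + xi_t."""
    diffs = np.diff(kappa)
    gamma = diffs.mean()            # drift: average yearly change
    sigma = diffs.std(ddof=1)       # standard deviation of the innovations
    return gamma, sigma

def forecast_rwd(kappa_last, gamma, sigma, horizon, z=1.96):
    """Point forecasts and approximate 95% bands for h = 1..horizon.

    The bands use Var(kappa_{t+h}) = h * sigma^2 and ignore the
    uncertainty in the estimated drift.
    """
    h = np.arange(1, horizon + 1)
    mean = kappa_last + gamma * h
    half_width = z * sigma * np.sqrt(h)
    return mean, mean - half_width, mean + half_width

# Hypothetical fitted time index for one region.
kappa = np.array([4.0, 3.1, 1.9, 1.2, 0.0, -1.1, -2.0])
gamma, sigma = fit_rwd(kappa)
mean, lower, upper = forecast_rwd(kappa[-1], gamma, sigma, horizon=3)
print(gamma, mean)
```

    In the independent LC setting, this step would be repeated for each region with its own drift and variance.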

    Applying independent LC models to multiple populations can produce divergent long-term predictions. However, if two or more populations share similar socioeconomic conditions, it might be reasonable to assume that the differences in mortality among them should not diverge over time.

    To avoid long-run divergence, Li & Lee (2005) proposed a model where all the populations share the parameters of the bilinear term ($\beta^{(i)}_x=B_x\in\mathbb{R}$ and $\kappa^{(i)}_t=K_t\in\mathbb{R}$, $\forall i\in I$). They define the Common Factor (CF) model as:

    $$\log m^{(i)}_{x,t}=\alpha^{(i)}_x+B_xK_t+\nu^{(i)}_{x,t},\quad\text{with i.i.d. }\nu^{(i)}_{x,t}\sim N(0,(\sigma^{(i)}_\nu)^2) \tag{6}$$

    where $K_t$ is a common risk factor shaping the mortality evolution of all populations, which is modulated by the age-specific parameter $B_x$, and $\nu^{(i)}_{x,t}$ is the normally distributed error term. While $\alpha^{(i)}_x$ is estimated separately for each individual sub-population, the estimates of $B_x$ and $K_t$ are obtained by applying the ordinary LC method to the whole group. The time-specific common factor $K_t$ is a non-stationary process, and a random walk with drift model is used to obtain forecasts:

    $$K_t=K_{t-1}+\delta+\eta_t,\quad\text{with i.i.d. }\eta_t\sim N(0,\sigma^2_\eta) \tag{7}$$
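    A minimal sketch of the CF fitting procedure is given below. It assumes that the log-rates of the whole group are available as a separate array (`log_m_total`); in the synthetic example this array is built from an average of the regional levels purely as a stand-in. The function names and all numbers are hypothetical.

```python
import numpy as np

def rank1_lc(log_m):
    """Ordinary LC fit (row means plus rank-1 SVD) with sum(beta) = 1."""
    alpha = log_m.mean(axis=1)
    U, s, Vt = np.linalg.svd(log_m - alpha[:, None], full_matrices=False)
    scale = U[:, 0].sum()
    return alpha, U[:, 0] / scale, s[0] * Vt[0] * scale

def fit_common_factor(log_m_regions, log_m_total):
    """Fit the CF model.

    log_m_regions: (n_regions, n_ages, n_years) regional log-rates.
    log_m_total:   (n_ages, n_years) log-rates of the whole group.
    """
    # Common age and period effects from the aggregate data.
    _, B, K = rank1_lc(log_m_total)
    # Region-specific level: time average of each region's own log-rates.
    alpha_i = log_m_regions.mean(axis=2)
    fitted = alpha_i[:, :, None] + np.outer(B, K)[None, :, :]
    residuals = log_m_regions - fitted
    return alpha_i, B, K, residuals

# Synthetic data: three regions sharing one common factor.
n_ages, n_years = 6, 10
true_B = np.linspace(0.05, 0.30, n_ages)
true_B = true_B / true_B.sum()
true_K = np.linspace(9.0, -9.0, n_years)
true_alpha = np.linspace(-7.0, -2.0, n_ages) + np.array([0.0, 0.1, -0.2])[:, None]
common = np.outer(true_B, true_K)
log_m_regions = true_alpha[:, :, None] + common[None, :, :]
log_m_total = true_alpha.mean(axis=0)[:, None] + common  # stand-in for aggregate data

alpha_i, B, K, residuals = fit_common_factor(log_m_regions, log_m_total)
print(np.abs(residuals).max())
```

    The residual array returned here is the quantity to which a further population-specific term can be fitted.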

    To improve fitting and forecasting, the authors suggest including in the CF model an additional bilinear term with population-specific parameters. In that case, we obtain the Augmented Common Factor (ACF) model:

    $$\log m^{(i)}_{x,t}=\alpha^{(i)}_x+B_xK_t+b^{(i)}_xk^{(i)}_t+\zeta^{(i)}_{x,t},\quad\text{with i.i.d. }\zeta^{(i)}_{x,t}\sim N(0,(\sigma^{(i)}_\zeta)^2) \tag{8}$$

    where the estimates of $b^{(i)}_x$ and $k^{(i)}_t$ are obtained by applying the first-order SVD to the residual matrix of the CF model. The sub-population-specific time component $k^{(i)}_t$ is assumed to be stationary and is described by a first-order autoregressive AR(1) model:

    $$k^{(i)}_t=\phi^{(i)}_0+\phi^{(i)}_1k^{(i)}_{t-1}+o^{(i)}_t,\quad\text{with i.i.d. }o^{(i)}_t\sim N(0,(\sigma^{(i)}_o)^2) \tag{9}$$

    where $\phi^{(i)}_0,\phi^{(i)}_1\in\mathbb{R}$, $\forall i\in I$.
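    The population-specific step of the ACF model can be sketched as follows for a single population. The example takes a CF residual matrix as input, extracts the leading singular triplet and fits the AR(1) model in (9) by ordinary least squares. Leaving the age loadings at unit norm and using `np.polyfit` for the regression are choices made for this illustration; the synthetic numbers are hypothetical.

```python
import numpy as np

def fit_specific_factor(residuals):
    """Rank-1 SVD of one population's CF residual matrix (ages x years)."""
    U, s, Vt = np.linalg.svd(residuals, full_matrices=False)
    b = U[:, 0]            # age loadings, left at unit norm
    k = s[0] * Vt[0]       # population-specific time index
    return b, k

def fit_ar1(k):
    """OLS estimate of k_t = phi0 + phi1 * k_{t-1} + o_t."""
    phi1, phi0 = np.polyfit(k[:-1], k[1:], deg=1)   # slope first, then intercept
    return phi0, phi1

def forecast_ar1(k_last, phi0, phi1, horizon):
    """Iterate the AR(1) recursion without noise to get point forecasts."""
    out, current = [], k_last
    for _ in range(horizon):
        current = phi0 + phi1 * current
        out.append(current)
    return np.array(out)

# Synthetic residual matrix with an exactly rank-one structure.
b_true = np.array([0.5, -0.2, 0.3, 0.1])
k_list = [2.0]
for _ in range(11):
    k_list.append(0.1 + 0.6 * k_list[-1])
k_true = np.array(k_list)
residuals = np.outer(b_true, k_true)

b, k = fit_specific_factor(residuals)
phi0, phi1 = fit_ar1(k)
k_forecast = forecast_ar1(k[-1], phi0, phi1, horizon=5)
print(phi1, k_forecast)
```

    When $|\phi^{(i)}_1|<1$, the point forecasts settle at $\phi^{(i)}_0/(1-\phi^{(i)}_1)$, which is what keeps the regional deviations from the common trend bounded.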

    We perform the experiments using ISTAT data, which provide the mortality of the Italian population for different ages, years and geographic regions. We consider the single-age mortality rates of the total population (male and female together) over the full time span available on the ISTAT website. In accordance with the previous notation, we set $X=\{x\in\mathbb{N}_0:0\le x\le 99\}$ and $T=\{t\in\mathbb{N}:1974\le t<2020\}$. In addition, we focus our attention on the 20 Italian regions and set $I=\{\text{Lombardia},\text{Lazio},\dots,\text{Valle d'Aosta}\}$. A graphical presentation of our dataset is given in Figure 1. It includes one subplot for each Italian region, where mortality rates are plotted in log-scale. The order of the subplots reflects the population size of the different regions: the first one refers to Lombardia, the Italian region with the largest population, while the last one refers to Valle d'Aosta, the region with the smallest population. Each curve refers to a different calendar year: the dark blue curves refer to less recent years, while the lighter ones refer to more recent calendar years. The most recent curves clearly lie below the dark ones, highlighting a progressive decline in mortality for all 20 Italian regions. Furthermore, we observe that in less populated regions the log-mortality curves exhibit some random fluctuations along the age dimension. This evidence is probably due to the law of large numbers: the estimate of mortality rates is more accurate when measured on large populations such as Lombardia and less precise when a small population such as Valle d'Aosta is considered.

    https://www.istat.it

    Figure 1.  Log-mortality rates in the Italian regions.

    This section presents the results of some numerical experiments performed on the ISTAT mortality data. The aim is to analyse and compare the LC, the CF, and the ACF models from different perspectives. The comparison is performed in terms of (i) the fitting quality, (ii) the forecasting accuracy, and (iii) the number of parameters to optimise. We split the mortality data into two parts. The first set of data consists of the mortality rates for calendar years in $T_1=\{t\in T:t\le 1999\}$. It is used to fit the models. The second one includes the mortality rates for the calendar years in $T_2=\{t\in T:t>1999\}$, and it is used to measure the forecasting accuracy of the models.

    First, we discuss the fitting of the LC, CF and ACF models and the resulting estimates. The LC model is estimated individually on the regional data following the SVD-based procedure described in the previous section. Figures 2, 3 and 4 report the estimates of the parameters $(\alpha^{(i)}_x)_x$, $(\beta^{(i)}_x)_x$ and $(\kappa^{(i)}_t)_t$ for the different Italian regions. Here too, the order of the subplots reflects the population size of the regions. Analysing Figure 2, we note that the $(\alpha^{(i)}_x)_x$ estimates exhibit the classic life table shape for all the Italian regions, and they are quite similar to one another. In addition, they seem relatively smooth over the age dimension. This result appears reasonable since this curve is estimated as the average of the observed log-mortality rates.

    Figure 2.  Estimates of $(\alpha^{(i)}_x)_x$ of the LC model for the different Italian regions.
    Figure 3.  Estimates of $(\beta^{(i)}_x)_x$ of the LC model for the different Italian regions.
    Figure 4.  Estimates of $(\kappa^{(i)}_t)_t$ of the LC model for the different Italian regions.

    On the contrary, Figure 3 shows that the $(\beta^{(i)}_x)_x$ curves present some irregular fluctuations, especially for Basilicata and Valle d'Aosta. This problem is not new to the mortality modelling literature. Delwarde et al. (2007) argue that the $(\beta^{(i)}_x)_x$ can sometimes exhibit an irregular pattern and that this is undesirable from an actuarial point of view, since it could induce erratic variations across ages in the resulting projected life tables. Interestingly, Figure 3 highlights a relationship between the population size and the fluctuations in the $(\beta^{(i)}_x)_x$ estimates. Indeed, the oscillations are visible for the regions at the bottom of Figure 3, where the low-population regions are located. One might argue that the LC model may not be adequate for modelling the mortality of regions or subpopulations whose population size is too small. The reason is the law of large numbers: the estimates of the mortality rates are less precise when the sample size decreases. This induces fluctuations in the observed mortality curves and affects the $(\beta^{(i)}_x)_x$ estimates, which appear sensitive to this phenomenon.

    We use the term "low-population regions" to refer to the regions with a small population, without considering their geographical extension, as it is outside the scope of this research.

    The $(\kappa^{(i)}_t)_t$ estimates for the different regions are presented in Figure 4. A dashed vertical line is drawn at 1999. The $(\kappa^{(i)}_t)_t$ values to the left of that line are estimated on the mortality data, while the values to the right are the projections obtained using the ARIMA(0, 1, 0) models. Figure 4 shows a decreasing trend over time, confirming that mortality is progressively declining in all Italian regions, although the drift terms $\gamma^{(i)}$, $i\in I$, appear different. The CF and ACF models are fitted following the procedure described in the original paper and using the same data considered for the LC model. First, the $(B_x)_x$ and $(K_t)_t$ parameters are estimated by applying the ordinary LC method to the aggregate mortality data. Figure 5 presents the resulting estimates. The $(B_x)_x$ curve appears relatively smooth since it is estimated on the mortality data of all regions. In addition, we observe that the common risk factor $K_t$ is downward sloping, implying a long-term trend of mortality improvement in all the Italian regions. Here too, the values to the right of the dashed line are the projections obtained via the ARIMA(0, 1, 0) model.

    Figure 5.  Estimates of $(B_x)_x$ (left) and $(K_t)_t$ (right) of the CF and ACF models.

    In this section, some comparisons among the LC, the CF and the ACF models are carried out in terms of fitting quality, forecasting accuracy, and the number of parameters to optimise. The fitting quality and the forecasting accuracy are measured in terms of the Mean Squared Error (MSE) and the Mean Absolute Error (MAE) of the predicted mortality rates with respect to the actual ones:

    $$\text{MSE}=\frac{\sum_x\sum_i\sum_t\big(\hat m^{(i)}_{x,t}-m^{(i)}_{x,t}\big)^2}{N}$$
    $$\text{MAE}=\frac{\sum_x\sum_i\sum_t\big|\hat m^{(i)}_{x,t}-m^{(i)}_{x,t}\big|}{N}$$

    where $N$ is the size of the sample considered. Although the first measure penalises large deviations more than the second one, both values should be as low as possible. Moreover, we also want the number of fitted parameters to be as low as possible. For this reason, we also analyse the number of parameters and the Bayesian Information Criterion (BIC), also known as the Schwarz criterion. The latter is a measure that considers both the fitting quality and the number of parameters required. Models with a lower BIC are generally preferred. This criterion is often used in the mortality modelling literature; see Booth et al. (2006); Apicella et al. (2019); Enchev et al. (2017).
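    The two error measures can be computed as in the short sketch below. The function name, the toy rates and the choice to return both measures multiplied by $10^4$ (so that they are expressed in units of $10^{-4}$) are assumptions made for this illustration.

```python
import numpy as np

def mse_mae(m_hat, m_obs, scale=1e4):
    """Return (MSE, MAE) of predicted vs observed rates, multiplied by `scale`.

    The inputs can have any common shape, e.g. (regions, ages, years);
    the averages run over all entries, so N is the total number of cells.
    """
    err = np.asarray(m_hat, dtype=float) - np.asarray(m_obs, dtype=float)
    mse = np.mean(err ** 2) * scale
    mae = np.mean(np.abs(err)) * scale
    return mse, mae

# Toy example with three hypothetical mortality rates.
m_obs = np.array([0.010, 0.020, 0.040])
m_hat = np.array([0.011, 0.018, 0.040])
mse, mae = mse_mae(m_hat, m_obs)
print(mse, mae)
```

    The squared error gives more weight to the larger deviation in the second cell, which is why the MSE is more sensitive to a few badly predicted rates than the MAE.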

    Table 1 reports the results of the LC, CF and ACF models. The MSE and MAE values are expressed in units of $10^{-4}$. We report in bold the best performance for each used criterion.

    Table 1.  Number of parameters, BIC, MSE and MAE for fitting and forecasting of the LC, CF and ACF models; the MSE and MAE values are in units of $10^{-4}$.

                                        Fitting              Forecasting
    Model   # Parameters   BIC         MSE      MAE         MSE      MAE
    LC      4560           99687.64    0.2169   16.2408     2.1162   39.8842
    CF      228            99480.93    0.3956   20.1524     4.099    95.0501
    ACF     4808           99718.77    0.1900   15.6569     0.7964   28.8717


    Intuitively, the CF is the most parsimonious model since it has the lowest number of parameters. The LC model is the second, while the ACF model requires the largest number of parameters. This ranking changes if we look at the fitting MSE. In particular, we observe that the ACF model obtains the best fit. The second is the LC model, while the CF model is the least accurate. The same ranking holds if we consider the MAE. This result appears reasonable since more parameters make the model more flexible, so it should better fit the data points used to optimise the parameters. From a BIC perspective, we observe that the CF model is the best, the LC model the second, and the ACF model the third. This result is probably due to the larger number of parameters required by the ACF model. However, these additional parameters produce a significant gain in forecasting accuracy. The ACF model is the most accurate, the LC model is the second, and the CF model is the least accurate. This ranking is the same for both error measures.

    A more detailed comparison of the fitting and forecasting accuracies is shown in Table 2, which reports the MSE and MAE of the three models in the different Italian regions for both fitting and forecasting. Here too, we observe that the CF model is often the least accurate in fitting and forecasting from both the MSE and MAE perspectives. The fitting performances of the LC and ACF models are quite similar: each model achieves the best performance in 50% of the regions considered. This holds for both error measures. The ACF model outperforms the LC model from a forecasting point of view. In particular, the ACF model is the best in 75% of cases (15/20) from the MSE point of view, while it has the best performance in 90% of cases (18/20) when the MAE is considered. Furthermore, we observe that the gain in forecasting performance in cases where the LC model outperforms the ACF model (Lazio, Sardegna, Abruzzo) is relatively modest. On the contrary, the gain appears significant in some regions where the ACF model beats the LC model (Lombardia, Calabria, Basilicata). The only case where the LC model outperforms the ACF model from both MAE and MSE perspectives is Toscana. We conclude that the CF model performs poorly on regional Italian data. It is necessary to include the sub-population-specific bilinear terms to obtain satisfactory fitting and forecasting performance. We therefore drop the CF model and focus on the LC and ACF models in the further comparisons.

    Table 2.  MSE and MAE for fitting and forecasting of the LC, CF and ACF models in the different Italian regions; the values are in units of $10^{-4}$.

                                       Fitting                        Forecasting
    Measure  Region                    LC       CF       ACF         LC        CF        ACF
    MSE      Lazio                     0.1570   0.2493   0.2129      0.3300    4.3693    0.3676
             Campania                  0.2504   0.3694   0.3178      0.9781    4.6704    0.4238
             Sicilia                   0.2399   0.4170   0.1117      1.0505    4.8197    0.6862
             Veneto                    0.1192   0.1884   0.1453      1.4601    3.8790    0.7292
             Emilia-Romagna            0.0853   0.1046   0.0616      0.9207    3.8268    0.5885
             Piemonte                  0.0888   0.1249   0.0906      0.9305    4.1159    0.5436
             Puglia                    0.1919   0.2047   0.1199      0.4306    4.0716    0.3730
             Toscana                   0.1161   0.1108   0.0611      0.5179    4.0900    0.6900
             Calabria                  0.2271   0.5606   0.3370      3.6670    4.0192    1.3038
             Sardegna                  0.1662   0.1865   0.1696      0.3854    3.8333    0.3954
             Liguria                   0.0578   0.0932   0.0867      0.3395    3.8239    0.4457
             Marche                    0.0940   0.1347   0.1334      0.5868    3.9673    0.5805
             Abruzzo                   0.0908   0.0845   0.0767      0.3860    4.2181    0.4180
             Friuli-Venezia Giulia     0.1282   0.2029   0.1547      0.6482    3.8721    0.5260
             Trentino Alto Adige       0.1220   0.3600   0.1314      1.8999    3.7722    0.8129
             Umbria                    0.1495   0.1947   0.1702      0.3939    3.9955    0.3943
             Basilicata                1.3531   3.3721   0.8315      23.6278   4.2491    4.2366
             Molise                    0.0908   0.0845   0.0767      0.7047    3.9790    0.6492
             Valle d'Aosta             0.5261   0.6274   0.4214      2.0250    4.6717    1.4133
    MAE      Lombardia                 12.1486  19.9305  11.7124     37.2103   92.1732   26.7736
             Lazio                     14.5028  19.4734  18.2472     24.6515   99.9781   22.3118
             Campania                  18.7160  22.9006  20.6232     35.5810   105.6370  22.8750
             Sicilia                   18.8805  25.5242  14.3848     35.0671   105.0941  28.2108
             Veneto                    13.3441  17.9538  14.7495     41.2196   91.4148   31.1028
             Emilia-Romagna            12.0344  13.4764  10.7691     34.2336   91.4306   27.3125
             Piemonte                  11.4615  13.5049  12.0035     33.6106   96.2467   27.1122
             Puglia                    16.6614  17.0389  13.1328     24.0734   95.4091   21.9900
             Toscana                   13.9102  14.1957  11.1923     25.4458   94.1629   28.8530
             Calabria                  17.4076  27.7452  21.5707     63.3229   95.8088   33.8510
             Sardegna                  14.9562  16.1873  16.2295     26.1191   92.6877   20.7598
             Liguria                   11.0432  13.0582  12.5367     26.6906   93.8239   24.2515
             Marche                    12.5060  13.3978  13.1364     26.7175   91.2184   26.5269
             Abruzzo                   12.6381  12.3112  11.4253     22.5176   96.6803   23.4488
             Friuli-Venezia Giulia     16.4157  17.1397  16.9118     31.6200   92.1787   28.2669
             Trentino Alto Adige       15.4234  25.4248  16.1150     45.9268   88.2525   32.3867
             Umbria                    16.0213  18.4215  17.6157     22.9201   93.4483   22.6744
             Basilicata                38.5176  58.6050  31.4198     164.4508  96.2181   61.1973
             Molise                    12.6381  12.3112  11.4253     28.3012   93.0878   27.3108
             Valle d'Aosta             25.5891  24.4487  17.9363     48.0054   96.0502   40.2191


    Figure 6 further compares the three methods. It shows the MSE and the MAE in the different forecast years from 2000 to 2019. Interestingly, the curves related to the ACF model are the lowest, those related to the LC model are the second, while the CF model is the least accurate in all forecast years. In the following, we focus our attention on the LC and ACF models since they perform better. Figure 7 plots the forecasting error of the LC and ACF models for all ages and calendar years, distinguished by region. The residuals are calculated as predictions minus the observed values, scaled by the estimated standard deviation of the actual values calculated at each age. The red areas in Figure 7 indicate an overestimation of mortality rates, while blue areas indicate an underestimation. Some interesting comments can be made. First, we note that some pronounced red areas are present in the heatmaps related to the LC model for ages 20–45. This result appears especially evident for Liguria, Emilia-Romagna, Lombardia and Lazio, highlighting a systematic overestimation of the mortality rates by the LC model in that age range. This effect is less pronounced in the ACF model heatmaps, suggesting that the ACF model better anticipates longevity improvements and reduces the systematic overestimation of mortality rates at those ages. For some regions, the overestimation of mortality rates by the LC model also affects very old ages; see Basilicata and Calabria. This effect also appears to be less noticeable in the forecasts produced by the ACF model.

    Figure 6.  Forecasting MAE and MSE of the LC, CF and ACF models distinguishing by year.
    Figure 7.  Residuals produced by the LC and ACF models, for each region, year and age.

    Finally, one might also notice that some oblique lines are visible in the heatmaps of both models. This effect, known as the cohort effect, refers to the mortality rates of individuals born in the same year. It probably appears because neither model includes cohort terms in its specification. Figure 8 depicts the confidence intervals at the 95% level produced by the LC and ACF models for mortality rates at age $x=65$. The mortality rates of all regions show a decreasing trend, suggesting that mortality is improving across all the Italian areas. The ACF model generally presents confidence intervals of similar width for all regions. In contrast, the LC model generates narrow confidence bands for high-population regions such as Lombardia and Lazio and wide confidence bands for low-population regions such as Valle d'Aosta and Basilicata. Furthermore, we observe that in some regions, such as Abruzzo, Molise, Puglia, Piemonte, and Veneto, the confidence intervals produced by the two models are quite similar, while they appear different for Valle d'Aosta and Basilicata.

    Figure 8.  Confidence intervals (at the 95% level) of the projected log-mortality rates at age $x=65$ produced by the LC and the ACF models for the different Italian regions.

    Figure 9 presents the projected log-mortality curves obtained via the LC and ACF models for t = 2000, 2010, 2020, 2030 and for the different Italian regions. The ACF model produces coherent forecasts, since the projected curves do not diverge in the long run. In contrast, the projections obtained via single-population LC models diverge as t increases.

    Figure 9.  Projected log-mortality curves of the LC and ACF models for t=2000,2010,2020,2030.
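
    The contrast between coherent and divergent projections can be made explicit with a short derivation. The sketch below assumes the usual Li & Lee (2005) form of the ACF model, with a common index $K_t$ and region-specific indices $\kappa_t^{(i)}$. The symbols $a_x^{(i)}$, $B_x$, $b_x^{(i)}$, the LC drifts $d^{(i)}$, and the assumption that each $\kappa_t^{(i)}$ is forecast with a mean-reverting process with long-run mean $\bar\kappa^{(i)}$ are introduced here for illustration and may differ from the specification used earlier in the paper. Expectations are conditional on the data up to the last fitted year $T$.

```latex
% Independent LC models for regions i and j, each index a random walk with drift:
\mathbb{E}\!\left[\log m^{(i)}_{x,T+h} - \log m^{(j)}_{x,T+h}\right]
  = \left(a^{(i)}_x - a^{(j)}_x\right)
  + \left(b^{(i)}_x k^{(i)}_T - b^{(j)}_x k^{(j)}_T\right)
  + h\left(b^{(i)}_x d^{(i)} - b^{(j)}_x d^{(j)}\right)

% ACF model: \log m^{(i)}_{x,t} = a^{(i)}_x + B_x K_t + b^{(i)}_x \kappa^{(i)}_t + \varepsilon^{(i)}_{x,t}
\mathbb{E}\!\left[\log m^{(i)}_{x,T+h} - \log m^{(j)}_{x,T+h}\right]
  = \left(a^{(i)}_x - a^{(j)}_x\right)
  + b^{(i)}_x\,\mathbb{E}\!\left[\kappa^{(i)}_{T+h}\right]
  - b^{(j)}_x\,\mathbb{E}\!\left[\kappa^{(j)}_{T+h}\right]
  \;\xrightarrow[h \to \infty]{}\;
  \left(a^{(i)}_x - a^{(j)}_x\right)
  + b^{(i)}_x \bar\kappa^{(i)} - b^{(j)}_x \bar\kappa^{(j)}
```

    Under these assumptions, the expected gap between two independent LC projections grows linearly in $h$ unless $b^{(i)}_x d^{(i)} = b^{(j)}_x d^{(j)}$, whereas in the ACF model the common term $B_x K_t$ cancels and the gap tends to a constant.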

    In this section, we measure the impact of using regional mortality data and the ACF model on the valuation of life annuities. The price of an immediate life annuity sold to an individual aged x in year t is given by:

    $$a_{x,t} = \sum_{k \geq 0} \left\{ \prod_{j=0}^{k} p_{x+j,t+j} \right\} (1+r)^{-(k+1)}$$

    where $p_{x,t}$ is the one-year survival probability derived from the mortality model, and r is the interest rate. The annuity value is a random variable, and simulation-based approaches are often used to compute its distribution. Here, we treat the interest rate as deterministic and mortality as stochastic. We derive the distribution of the value of an immediate life annuity with x=65 and t=1999 for each Italian region by simulating possible trajectories of future mortality according to the ACF model. The only risk source we consider is the random error in projecting the mortality indices $K_t$ and $\kappa_t^{(i)}$. A sample of 10,000 paths is generated, and the resulting annuity values are computed. The interest rate is set to zero; therefore, $a_{x,t}$ corresponds to the life expectancy at age x and time t. Figure 10 shows the simulated distributions of the annuity price for all the Italian regions. We also report the value of an immediate life annuity computed according to the LC model fitted to the national data. This value is a natural benchmark, since annuity pricing is generally performed using national data, as also discussed in Bozzo et al. (2021). It is denoted by a vertical red line in Figure 10. For this national benchmark, we use data from the Human Mortality Database (HMD; Wilmoth & Shkolnikov, 2021).§ We observe some heterogeneity in the distributions of the different regions. In some cases, the difference in (average) annuity value is almost 2 years (see Marche and Sicilia). This evidence had already been observed, in terms of average values, in Bozzo et al. (2021). In particular, Figure 10 shows that the distributions of the annuity value for some regions lie essentially below the national reference value (see Sicilia and Campania), while for others they lie above it (Marche). Some differences are also visible in the variability: Molise and Abruzzo have distributions more concentrated around the average, while Sicilia and Trentino have more dispersed distributions. In general, Figure 10 highlights a certain degree of heterogeneity across the Italian regions. This result underlines the importance of accounting for mortality differences among the Italian regions in actuarial calculations.
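
    The annuity formula above translates directly into a running product of survival probabilities and a running discount factor. The following is a minimal sketch; the function name `annuity_value` and the toy probabilities are hypothetical, and the infinite sum is truncated at the length of the supplied list (that is, at the highest age for which a survival probability is provided).

```python
def annuity_value(surv_probs, r=0.0):
    """Value of an immediate life annuity paying 1 at the end of each year.

    surv_probs[j] is the one-year survival probability p_{x+j, t+j} along
    the cohort diagonal; r is the (deterministic) annual interest rate."""
    value = 0.0
    cum_surv = 1.0   # probability of surviving k+1 years from age x
    disc = 1.0       # discount factor (1 + r)^-(k+1)
    for p in surv_probs:
        cum_surv *= p
        disc /= 1.0 + r
        value += cum_surv * disc
    return value

# Hypothetical three-year example: survival probabilities 0.9, 0.8, then 0.
toy = [0.9, 0.8, 0.0]
a_zero_rate = annuity_value(toy)          # r = 0: expected number of payments
a_discounted = annuity_value(toy, r=0.02)
```

    With r=0 the function returns the expected number of whole years survived, which is why the annuity value can be read as a life expectancy in the application above.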

    § The HMD is the most popular data source for the study of mortality and provides national-level data (rates, deaths, exposure to risk) for a large set of countries and calendar years. We calibrate the LC model using the same period and age range described above.

    Figure 10.  Random present value of an immediate life annuity with x=65 and t=1999 for each Italian region according to the ACF model.
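
    The simulation behind Figure 10 can be sketched as follows. This is an illustration under explicit assumptions rather than the procedure used in the paper: the common index $K_t$ is taken to be a random walk with drift, the regional index $\kappa_t^{(i)}$ a zero-mean AR(1) process, survival probabilities are obtained as exp(-m) from the projected central death rates, and all parameter values below are hypothetical toy numbers, not estimates from the ISTAT data.

```python
import random
from math import exp
from statistics import mean

def simulate_annuity_values(a, B, b, K0, kappa0, drift, sigma_K,
                            phi, sigma_kappa, n_paths=1000, r=0.0, seed=0):
    """Simulate annuity values for one region.

    a, B, b: lists of age parameters along the cohort diagonal
             (entry j refers to age x+j), so that
             log m_{x+j,t+j} = a[j] + B[j]*K_{t+j} + b[j]*kappa_{t+j}.
    Only the index innovations are random, as in the text."""
    rng = random.Random(seed)
    values = []
    for _ in range(n_paths):
        K, kappa = K0, kappa0
        surv, disc, value = 1.0, 1.0, 0.0
        for a_j, B_j, b_j in zip(a, B, b):
            m = exp(a_j + B_j * K + b_j * kappa)   # projected death rate
            surv *= exp(-m)                        # one-year survival (assumed)
            disc /= 1.0 + r
            value += surv * disc
            # advance both indices by one calendar year
            K += drift + rng.gauss(0.0, sigma_K)
            kappa = phi * kappa + rng.gauss(0.0, sigma_kappa)
        values.append(value)
    return values

# Hypothetical toy parameters for ages 65 to 110.
n = 46
a = [-4.0 + 0.1 * j for j in range(n)]
B = [0.02] * n
b = [0.02] * n
values = simulate_annuity_values(a, B, b, K0=0.0, kappa0=0.0, drift=-1.0,
                                 sigma_K=0.5, phi=0.9, sigma_kappa=0.3,
                                 n_paths=2000, seed=42)
avg_value = mean(values)
```

    Repeating the call with region-specific parameters and plotting a histogram of `values` for each region would give a picture analogous to Figure 10; setting both standard deviations to zero collapses the distribution to a single deterministic value.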

    This paper presents a comparison of different approaches to modelling Italian sub-national mortality data. We consider the independent modelling approach based on single-population LC models and the coherent multi-population model proposed by Li & Lee (2005). The analysis was performed on the Italian mortality data provided at the regional level by ISTAT. The tests show that, although the two models produce somewhat similar fitting performance, the ACF model produces significantly better forecasting performance. It beats the LC model in 90% of the Italian regions in terms of MAE. In addition, the ACF model appears to capture and predict longevity improvements better. The ACF model was also employed to simulate the annuity prices for an individual aged 65 in 1999 in the different Italian regions. We observed some heterogeneity among the regions. Furthermore, in some cases, the value of the annuity differed significantly from the national benchmark. The analysis of regional data could provide additional information about the heterogeneity in longevity within the national population. In particular, understanding regional mortality differences could be helpful from a longevity risk management perspective. Indeed, if an annuity portfolio is not adequately balanced between annuitants living in regions with higher life expectancy and annuitants living in regions with lower life expectancy, the use of aggregate national data could lead to misestimating future liabilities and cause financial trouble. Future research will proceed in several directions. First, we plan to investigate sub-national data of other countries, such as the United States, Hungary and France. Second, we would like to explore the application of more sophisticated single-population and multi-population mortality models (Kleinow, 2015; Hyndman et al., 2013). The use of non-linear mortality models could reveal further information on the differences in sub-national mortality data. Finally, we aim to explore the use of machine learning and deep learning techniques on sub-national mortality data, which have shown enormous potential in multi-country and large-scale mortality modelling; see for example Perla et al. (2021) and Richman & Wüthrich (2021).

    The author thanks the three anonymous referees for helpful comments that greatly improved the article.

    The author declares no conflict of interest in this paper.



    [1] Apicella G, Dacorogna M, Di Lorenzo E, et al. (2019) Improving the forecast of longevity by combining models. N Am Actuar J 23: 298–319. https://doi.org/10.1080/10920277.2018.1556701
    [2] Booth H, Hyndman RJ, Tickle L, et al. (2006) Lee-Carter mortality forecasting: a multi-country comparison of variants and extensions. Demogr Res 15: 298–319. https://doi.org/10.4054/DemRes.2006.15.9
    [3] Bozzo G, Levantesi S, Menzietti M (2021) Longevity risk and economic growth in sub-populations: evidence from Italy. Decis Econ Financ 44: 101–115. https://doi.org/10.1007/s10203-020-00275-x
    [4] Brouhns N, Denuit M, Vermunt JK (2002) A Poisson log-bilinear regression approach to the construction of projected lifetables. Insur Math Econ 31: 373–393. https://doi.org/10.1016/S0167-6687(02)00185-3
    [5] Cairns A, Blake D, Dowd K (2006) A two-factor model for stochastic mortality with parameter uncertainty: theory and calibration. J Risk Insur 73: 687–718. https://doi.org/10.1111/j.1539-6975.2006.00195.x
    [6] Cairns A, Blake D, Dowd K, et al. (2009) A quantitative comparison of stochastic mortality models using data from England and Wales and the United States. N Am Actuar J 13: 1–35. https://doi.org/10.1080/10920277.2009.10597538
    [7] Chen H, MacMinn R, Sun T (2015) Multi-population mortality models: A factor copula approach. Insur Math Econ 63: 135–146. https://doi.org/10.1016/j.insmatheco.2015.03.022
    [8] Chen RY, Millossovich P (2018) Sex-specific mortality forecasting for UK countries: a coherent approach. Eur Actuar J 8: 69–95. https://doi.org/10.1007/s13385-017-0164-0
    [9] Currie ID (2013) Smoothing constrained generalized linear models with an application to the Lee-Carter model. Stat Model 13: 69–93. https://doi.org/10.1177/1471082X12471373
    [10] Currie ID, Durban M, Eilers PHC (2018) Smoothing and forecasting mortality rates. Stat Model 4: 279–298. https://doi.org/10.1191/1471082X04st080oa
    [11] Danesi IL, Haberman S, Millossovich P (2018) Forecasting mortality in subpopulations using Lee–Carter type models: A comparison. Insur Math Econ 62: 151–161. https://doi.org/10.1016/j.insmatheco.2015.03.010
    [12] Delwarde A, Denuit M, Eilers P (2007) Smoothing the Lee–Carter and Poisson log-bilinear models for mortality forecasting: a penalized log-likelihood approach. J Popul Res 7: 29–48. https://doi.org/10.1177/1471082X0600700103
    [13] De Waegenaere A, Melenberg B, Stevens R (2010) Longevity risk. De Econ 158: 151–192. https://doi.org/10.1007/s10645-010-9143-4
    [14] Enchev V, Kleinow T, Cairns A (2017) Multi-population mortality models: fitting, forecasting and comparisons. Scand Actuar J 4: 319–342. https://doi.org/10.1080/03461238.2015.1133450
    [15] Franzini L, Giannoni M (2010) Determinants of health disparities between Italian regions. BMC Public Health 10: 1–10. https://doi.org/10.1186/1471-2458-10-296
    [16] Gao G, Shi Y (2021) Age-coherent extensions of the Lee–Carter model. Scand Actuar J 10: 998–1016. https://doi.org/10.1080/03461238.2021.1918578
    [17] Hainaut D, Denuit M (2020) Wavelet-based feature extraction for mortality projection. ASTIN B J IAA 50: 675–707. https://doi.org/10.1017/asb.2020.18
    [18] Hyndman RJ, Ullah MS (2007) Robust forecasting of mortality and fertility rates: A functional data approach. Comput Stat & Data Anal 51: 4942–4956. https://doi.org/10.1016/j.csda.2006.07.028
    [19] Hyndman R, Booth H, Yasmeen F (2013) Coherent mortality forecasting: the product-ratio method with functional time series models. Demography 50: 261–283. https://doi.org/10.1007/s13524-012-0145-5
    [20] Kleinow T (2015) A common age effect model for the mortality of multiple populations. Insur Math Econ 63: 147–152. https://doi.org/10.1016/j.insmatheco.2015.03.023
    [21] Lee RD, Carter LR (1992) Modeling and forecasting US mortality. J Am Stat Assoc 87: 659–671. https://doi.org/10.1080/01621459.1992.10475265
    [22] Li N, Lee R (2005) Coherent mortality forecasts for a group of populations: An extension of the Lee-Carter method. Demography 42: 575–594. https://doi.org/10.1353/dem.2005.0021
    [23] Li J (2013) A Poisson common factor model for projecting mortality and life expectancy jointly for females and males. Popul Stud 67: 111–126. https://doi.org/10.1080/00324728.2012.689316
    [24] Nigri A, Levantesi S, Marino M, et al. (2019) A deep learning integrated Lee–Carter model. Risks 7: 33. https://doi.org/10.3390/risks7010033
    [25] Perla F, Richman R, Scognamiglio S, et al. (2021) Time-series forecasting of mortality rates using deep learning. Scand Actuar J 2021: 1–27. https://doi.org/10.1080/03461238.2020.1867232
    [26] Renshaw A, Haberman S (2003) Lee–Carter mortality forecasting with age-specific enhancement. Insur Math Econ 33: 255–272. https://doi.org/10.1016/S0167-6687(03)00138-0
    [27] Renshaw A, Haberman S (2006) A cohort-based extension to the Lee–Carter model for mortality reduction factors. Insur Math Econ 38: 556–570. https://doi.org/10.1016/j.insmatheco.2005.12.001
    [28] Richman R, Wüthrich MV (2021) A neural network extension of the Lee–Carter model to multiple populations. Ann Actuar Sci 15: 346–366. https://doi.org/10.1017/S1748499519000071
    [29] Schnürch S, Kleinow T, Korn R (2021) Clustering-Based Extensions of the Common Age Effect Multi-Population Mortality Model. Risks 9: 45. https://doi.org/10.3390/risks9030045
    [30] Shang HL, Yang Y (2021) Forecasting Australian subnational age-specific mortality rates. J Popul Res 38: 1–24. https://doi.org/10.1007/s12546-020-09250-0
    [31] Wilmoth JR, Shkolnikov V (2021) Human Mortality Database. University of California, Berkeley (US), and Max Planck Institute for Demographic Research (Germany).
  • © 2022 the Author(s), licensee AIMS Press. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0)