Citation: Bill Huajian Yang, Jenny Yang, Haoji Yang. Modeling portfolio loss by interval distributions[J]. Big Data and Information Analytics, 2020, 5(1): 1-13. doi: 10.3934/bdia.2020001
For a continuous risk outcome
Given fixed effects
In this paper, we assume that the risk outcome
$$y = \Phi(a_0 + a_1 x_1 + \cdots + a_k x_k + b s), \tag{1.1}$$
where
Given random effect model (1.1), the expected value
We introduce a family of interval distributions based on variable transformations. Probability densities for these distributions are provided (Proposition 2.1). Parameters of model (1.1) can then be estimated by maximum likelihood approaches assuming an interval distribution. In some cases, these parameters admit an analytical solution without the need for model fitting (Proposition 4.1). We call a model with a random effect, whose parameters are estimated by maximum likelihood assuming an interval distribution, an interval distribution model.
In its simplest form, the interval distribution model
The paper is organized as follows: in section 2, we introduce a family of interval distributions. A measure for tail fatness is defined. In section 3, we show examples of interval distributions and investigate their tail behaviours. We propose in section 4 an algorithm for estimating the parameters in model (1.1).
Interval distributions introduced in this section are defined for a risk outcome over a finite open interval
Let
Let
$$\Phi: D \to (c_0, c_1) \tag{2.1}$$
be a transformation with continuous and positive derivatives
Given a continuous random variable
$$y = \Phi(a + b s), \tag{2.2}$$
where we assume that the range of variable
Proposition 2.1. Given
$$g(y,a,b) = U_1/(b U_2) \tag{2.3}$$
$$G(y,a,b) = F\left[\frac{\Phi^{-1}(y) - a}{b}\right]. \tag{2.4}$$
where
$$U_1 = f\{[\Phi^{-1}(y) - a]/b\}, \quad U_2 = \phi[\Phi^{-1}(y)] \tag{2.5}$$
Proof. A proof for the case when
$$\begin{aligned}
G(y,a,b) &= P[\Phi(a + b s) \le y] \\
&= P\{s \le [\Phi^{-1}(y) - a]/b\} \\
&= F\{[\Phi^{-1}(y) - a]/b\}.
\end{aligned}$$
By chain rule and the relationship
$$\frac{\partial \Phi^{-1}(y)}{\partial y} = \frac{1}{\phi[\Phi^{-1}(y)]}. \tag{2.6}$$
Taking the derivative of
$$\frac{\partial G(y,a,b)}{\partial y} = \frac{f\{[\Phi^{-1}(y) - a]/b\}}{b\,\phi[\Phi^{-1}(y)]} = \frac{U_1}{b U_2}.$$
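As a quick numerical illustration of (2.3) and (2.4), the sketch below evaluates the density and the distribution function and checks that a finite-difference derivative of $G$ agrees with $g$. It assumes, purely for illustration, that both $\Phi$ and $F$ are the standard normal distribution function; the function names and parameter values are hypothetical and not taken from the paper.

```python
from statistics import NormalDist

# Assumption for this sketch: Phi and F are both the standard normal CDF,
# so phi and f are both the standard normal density.
N = NormalDist()

def interval_density(y, a, b):
    """g(y, a, b) = U1 / (b * U2), following (2.3) and (2.5)."""
    z = N.inv_cdf(y)           # z = Phi^{-1}(y)
    u1 = N.pdf((z - a) / b)    # U1 = f{[Phi^{-1}(y) - a] / b}
    u2 = N.pdf(z)              # U2 = phi[Phi^{-1}(y)]
    return u1 / (b * u2)

def interval_cdf(y, a, b):
    """G(y, a, b) = F{[Phi^{-1}(y) - a] / b}, following (2.4)."""
    return N.cdf((N.inv_cdf(y) - a) / b)

# Illustrative (hypothetical) parameter values.
a, b, y, h = -1.0, 0.5, 0.2, 1e-6

# Central finite difference of G with respect to y.
finite_diff = (interval_cdf(y + h, a, b) - interval_cdf(y - h, a, b)) / (2 * h)
gap = abs(finite_diff - interval_density(y, a, b))
print(gap)
```

Other choices of $\Phi$ and $F$ can be substituted by swapping the `cdf`, `pdf`, and `inv_cdf` calls, provided $\Phi$ has a continuous and positive derivative.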
One can explore these interval distributions for their shapes, including skewness and modality. For stress testing purposes, we are more interested in the tail risk behaviour of these distributions.
Recall that, for a variable X over (−
For a risk outcome over a finite interval
We say that an interval distribution has a fat right tail if the limit
Given
Recall that, for a Beta distribution with parameters
Next, because the derivative of
$$z = \Phi^{-1}(y) \tag{2.7}$$
Then
Lemma 2.2. Given
(ⅰ)
(ⅱ) If
(ⅲ) If
Proof. The first statement follows from the relationship
$$[g(y,a,b)(y_1 - y)^{\beta}]^{-1/\beta} = \frac{[g(y,a,b)]^{-1/\beta}}{y_1 - y} = \frac{[g(\Phi(z),a,b)]^{-1/\beta}}{y_1 - \Phi(z)}. \tag{2.8}$$
By L’Hospital’s rule and taking the derivatives of the numerator and the denominator of (2.8) with respect to
For tail convexity, we say that the right tail of an interval distribution is convex if
Again, write
$$h(z,a,b) = \log[g(\Phi(z),a,b)], \tag{2.9}$$
where
$$g(y,a,b) = \exp[h(z,a,b)]. \tag{2.10}$$
By (2.9), (2.10), using (2.6) and the relationship
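$$\begin{aligned}
g'_y &= [h'_z(z)/\phi(z)] \exp[h(\Phi^{-1}(y),a,b)], \\
g''_{yy} &= \left[\frac{h''_{zz}(z)}{\phi^2(z)} - \frac{h'_z(z)\,\phi'_z(z)}{\phi^3(z)} + \frac{h'_z(z)\,h'_z(z)}{\phi^2(z)}\right] \exp[h(\Phi^{-1}(y),a,b)].
\end{aligned} \tag{2.11}$$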
The following lemma, which follows from (2.11), is useful for checking tail convexity.
Lemma 2.3. Suppose
In this section, we focus on the case where
One can explore a wide range of densities with different choices for
A.
B.
C.
D.
Densities for cases A, B, C, and D are given respectively in (3.3) (section 3.1), (A.1), (A.3), and (A.5) (Appendix A). The study of tail behaviour is summarized in Propositions 3.3 and 3.5 and Remark 3.6. Sketches of density plots are provided in Appendix B for distributions A, B, and C.
Using the notations of section 2, we have
By (2.5), we have
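$$\begin{aligned}
\log\left(\frac{U_1}{U_2}\right) &= \frac{-z^2 + 2az - a^2 + b^2 z^2}{2b^2} &\quad (3.1) \\
&= \frac{-(1-b^2)\left(z - \frac{a}{1-b^2}\right)^2 + \frac{b^2}{1-b^2}a^2}{2b^2}. &\quad (3.2)
\end{aligned}$$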
Therefore, we have
$$g(y,a,b) = \frac{1}{b}\exp\left\{\frac{-(1-b^2)\left(z - \frac{a}{1-b^2}\right)^2 + \frac{b^2}{1-b^2}a^2}{2b^2}\right\}. \tag{3.3}$$
Again, using the notations of section 2, we have
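$$g(y,p,\rho) = \sqrt{\frac{1-\rho}{\rho}}\exp\left\{-\frac{1}{2\rho}\left[\sqrt{1-\rho}\,\Phi^{-1}(y) - \Phi^{-1}(p)\right]^2 + \frac{1}{2}\left[\Phi^{-1}(y)\right]^2\right\}, \tag{3.4}$$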
where
Proposition 3.1. Density (3.3) is equivalent to (3.4) under the relationships:
$$a = \frac{\Phi^{-1}(p)}{\sqrt{1-\rho}} \quad \text{and} \quad b = \sqrt{\frac{\rho}{1-\rho}}. \tag{3.5}$$
Proof. A similar proof can be found in [19]. By (3.4), we have
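$$\begin{aligned}
g(y,p,\rho) &= \sqrt{\frac{1-\rho}{\rho}}\exp\left\{-\frac{1-\rho}{2\rho}\left[\Phi^{-1}(y) - \frac{\Phi^{-1}(p)}{\sqrt{1-\rho}}\right]^2 + \frac{1}{2}\left[\Phi^{-1}(y)\right]^2\right\} \\
&= \frac{1}{b}\exp\left\{-\frac{1}{2}\left[\frac{\Phi^{-1}(y) - a}{b}\right]^2\right\}\exp\left\{\frac{1}{2}\left[\Phi^{-1}(y)\right]^2\right\} \\
&= U_1/(b U_2) = g(y,a,b).
\end{aligned}$$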
The following relationships are implied by (3.5):
$$\rho = \frac{b^2}{1+b^2}, \tag{3.6}$$
$$a = \Phi^{-1}(p)\sqrt{1+b^2}. \tag{3.7}$$
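The equivalence in Proposition 3.1 can be checked numerically. The sketch below evaluates (3.3) and (3.4) at a few points after mapping $(p, \rho)$ to $(a, b)$ by (3.5), and also confirms (3.6). The function names and the values of $p$ and $\rho$ are illustrative assumptions, not taken from the paper; note that (3.3) as written requires $b \neq 1$, that is, $\rho \neq 1/2$.

```python
from math import exp, sqrt
from statistics import NormalDist

N = NormalDist()  # standard normal, used for Phi and its inverse

def density_ab(y, a, b):
    """Density (3.3), written in terms of z = Phi^{-1}(y); requires b != 1."""
    z = N.inv_cdf(y)
    c = 1.0 - b * b
    numerator = -c * (z - a / c) ** 2 + (b * b / c) * a * a
    return exp(numerator / (2.0 * b * b)) / b

def density_p_rho(y, p, rho):
    """Density (3.4), parameterized by p and rho."""
    z = N.inv_cdf(y)
    q = sqrt(1.0 - rho) * z - N.inv_cdf(p)
    return sqrt((1.0 - rho) / rho) * exp(-q * q / (2.0 * rho) + 0.5 * z * z)

def to_ab(p, rho):
    """Map (p, rho) to (a, b) using (3.5)."""
    return N.inv_cdf(p) / sqrt(1.0 - rho), sqrt(rho / (1.0 - rho))

# Illustrative (hypothetical) parameter values.
p, rho = 0.02, 0.15
a, b = to_ab(p, rho)

# Largest relative difference between the two forms over a few test points.
points = (0.001, 0.01, 0.05, 0.2, 0.6)
max_rel_gap = max(
    abs(density_ab(y, a, b) - density_p_rho(y, p, rho)) / density_p_rho(y, p, rho)
    for y in points
)
rho_from_b = b * b / (1.0 + b * b)  # relationship (3.6)
print(max_rel_gap, abs(rho - rho_from_b))
```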
Remark 3.2. The mode of
$$\frac{\sqrt{1-\rho}}{1-2\rho}\Phi^{-1}(p) = \frac{\sqrt{1+b^2}}{1-b^2}\Phi^{-1}(p) = \frac{a}{1-b^2}.$$
This means
Proposition 3.3. The following statements hold for
(ⅰ)
(ⅱ)
(ⅲ) If
Proof. For statement (ⅰ), we have
Consider statement (ⅱ). First by (3.3), if
$$[g(\Phi(z),a,b)]^{-1/\beta} = b^{1/\beta}\exp\left(-\frac{(b^2-1)z^2 + 2az - a^2}{2\beta b^2}\right) \tag{3.8}$$
By taking the derivative of (3.8) with respect to
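$$-\left\{\partial [g(\Phi(z),a,b)]^{-1/\beta}/\partial z\right\}/\phi(z) = \sqrt{2\pi}\, b^{1/\beta}\, \frac{(b^2-1)z + a}{\beta b^2}\exp\left(-\frac{(b^2-1)z^2 + 2az - a^2}{2\beta b^2} + \frac{z^2}{2}\right). \tag{3.9}$$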
Thus
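$$\left\{\partial [g(\Phi(z),a,b)]^{-1/\beta}/\partial z\right\}/\phi(z) = -\sqrt{2\pi}\, b^{1/\beta}\, \frac{(b^2-1)z + a}{\beta b^2}\exp\left(-\frac{(b^2-1)z^2 + 2az - a^2}{2\beta b^2} + \frac{z^2}{2}\right). \tag{3.10}$$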
Thus
For statement (ⅲ), we use Lemma 2.3. By (2.9) and using (3.2), we have
$$h(z,a,b) = \log\left(\frac{U_1}{b U_2}\right) = \frac{-(1-b^2)\left(z - \frac{a}{1-b^2}\right)^2 + \frac{b^2}{1-b^2}a^2}{2b^2} - \log(b).$$
When
Remark 3.4. Assume
$$\lim_{z \to +\infty} -\left\{\partial [g(\Phi(z),a,b)]^{-1/\beta}/\partial z\right\}/\phi(z)$$
is
For these distributions, we again focus on their tail behaviours. A proof for the next proposition can be found in Appendix A.
Proposition 3.5. The following statements hold:
(a) Density
(b) The tailed index of
Remark 3.6. Among distributions A, B, C, and Beta distribution, distribution B gets the highest tailed index of 1, independent of the choices of
In this section, we assume that
First, we consider a simple case, where risk outcome
$$y = \Phi(v + b s), \tag{4.1}$$
where
Given a sample
$$LL = \sum_{i=1}^{n}\left\{\log f\left(\frac{z_i - v_i}{b}\right) - \log\phi(z_i) - \log b\right\}, \tag{4.2}$$
where
Recall that the least squares estimators of
$$SS = \sum_{i=1}^{n}(z_i - v_i)^2 \tag{4.3}$$
has a closed form solution given by the transpose of
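$$X = \begin{bmatrix} 1 & x_{11} & \cdots & x_{k1} \\ 1 & x_{12} & \cdots & x_{k2} \\ \vdots & \vdots & & \vdots \\ 1 & x_{1n} & \cdots & x_{kn} \end{bmatrix}, \quad Z = \begin{bmatrix} z_1 \\ z_2 \\ \vdots \\ z_n \end{bmatrix}.$$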
The next proposition shows there exists an analytical solution for the parameters of model (4.1).
Proposition 4.1. Given a sample
Proof. Dropping off the constant term from (4.2) and noting
$$LL = -\frac{1}{2b^2}\sum_{i=1}^{n}(z_i - v_i)^2 - n\log b. \tag{4.4}$$
Hence the maximum likelihood estimates
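To illustrate how (4.4) can be maximized without iterative fitting, the sketch below simulates data from model (4.1) with a single covariate and recovers the parameters in closed form: for any fixed $b$, (4.4) is maximized over the coefficients by minimizing $SS$ in (4.3), and setting the derivative of (4.4) with respect to $b$ to zero then gives $b^2 = SS/n$. The single-covariate setup, the simulated data, the "true" parameter values, and the function name are all assumptions made for this example.

```python
import random
from math import sqrt
from statistics import NormalDist, fmean

N = NormalDist()  # standard normal, used for Phi and its inverse

def fit_closed_form(xs, ys):
    """Maximize (4.4) with v_i = a0 + a1 * x_i (one covariate, for illustration).

    Step 1: least squares of z_i = Phi^{-1}(y_i) on x_i gives (a0, a1).
    Step 2: b = sqrt(SS / n), from d(LL)/db = SS / b^3 - n / b = 0.
    """
    zs = [N.inv_cdf(y) for y in ys]
    x_bar, z_bar = fmean(xs), fmean(zs)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxz = sum((x - x_bar) * (z - z_bar) for x, z in zip(xs, zs))
    a1 = sxz / sxx
    a0 = z_bar - a1 * x_bar
    ss = sum((z - a0 - a1 * x) ** 2 for x, z in zip(xs, zs))
    return a0, a1, sqrt(ss / len(xs))

# Simulated sample from y = Phi(a0 + a1 * x + b * s), s standard normal.
rng = random.Random(0)
true_a0, true_a1, true_b = -2.0, 0.5, 0.3   # hypothetical values
xs = [rng.gauss(0.0, 1.0) for _ in range(5000)]
ys = [N.cdf(true_a0 + true_a1 * x + true_b * rng.gauss(0.0, 1.0)) for x in xs]

a0_hat, a1_hat, b_hat = fit_closed_form(xs, ys)
print(a0_hat, a1_hat, b_hat)
```

With several covariates, the same two steps apply with the least squares step carried out on the design matrix $X$ and vector $Z$ defined above.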
Next, we consider the general case of model (1.1), where the risk outcome
$$y = \Phi[v + w s], \tag{4.5}$$
where parameter
(a)
(b)
Given a sample
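$$LL = \sum_{i=1}^{n} -\frac{1}{2}\left[(z_i - v_i)^2/w_i^2 - u_i\right], \tag{4.6}$$
$$LL = \sum_{i=1}^{n}\left\{-(z_i - v_i)/w_i - 2\log\left[1 + \exp\left(-(z_i - v_i)/w_i\right)\right] - u_i\right\}, \tag{4.7}$$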
Recall that a function is log-concave if its logarithm is concave. If a function is concave, a local maximum is a global maximum, and the function is unimodal. This property is useful when searching for maximum likelihood estimates.
Proposition 4.2. The functions (4.6) and (4.7) are concave as a function of
Proof. It is well-known that, if
For (4.7), the linear part
In general, parameters
Algorithm 4.3. Follow the steps below to estimate parameters of model (4.5):
(a) Given
(b) Given
(c) Iterate (a) and (b) until convergence is reached.
With the interval distributions introduced in this paper, models with a random effect can be fitted for a continuous risk outcome by maximum likelihood approaches assuming an interval distribution. These models provide an alternative regression tool to the Beta regression model and the fractional response model, as well as a tool for tail risk assessment.
The authors are very grateful to the third reviewer for many constructive comments. The first author is grateful to Biao Wu for many valuable conversations. Thanks also go to Clovis Sukam for his critical reading of the manuscript.
The views expressed in this article are not necessarily those of Royal Bank of Canada and Scotiabank or any of their affiliates. Please direct any comments to Bill Huajian Yang at h_y02@yahoo.ca.
[1] | Basel Committee on Banking Supervision, An Explanatory Note on the Basel II IRB Risk Weight Functions, 2005. Available from: https://www.bis.org/bcbs/irbriskweight.htm. |
[2] | Basel Committee on Banking Supervision, Guidance on credit risk and accounting for expected credit loss, 2015. Available from: https://www.bis.org/bcbs/publ/d350.htm. |
[3] | Basel Committee on Banking Supervision, Minimum capital requirements for market risk, 2019. Available from: https://www.bis.org/bcbs/publ/d457.htm. |
[4] | Cribari-Neto F and Zeileis A, (2010) Beta Regression in R. J Stat Software 34: 1-24. |
[5] | Friedman J, Hastie T and Tibshirani R, (2001) The Elements of Statistical Learning, 2 Eds., New York: Springer series in statistics. |
[6] | Gourieroux C, Monfort A and Trognon A, (1984) Pseudo maximum likelihood methods: Theory. Econometrica 52: 681-700. doi: 10.2307/1913471 |
[7] | Huang X, Oosterlee CW and Mesters M, (2007) Computation of VaR and VaR contribution in the Vasicek portfolio credit loss model: a comparative study. J Credit Risk 3: 75-96. |
[8] | Mullahy J, (1990) Regression models and transformations for Beta-distributed outcomes. |
[9] | Murphy K, (2012) Machine learning: a probabilistic perspective, MIT press. |
[10] | Papke LE and Wooldridge JM, (1996) Econometric methods for fractional response variables with an application to 401 (k) plan participation rates. J Appl Econometrics 11: 619-632. doi: 10.1002/(SICI)1099-1255(199611)11:6<619::AID-JAE418>3.0.CO;2-1 |
[11] | Ramponi FA and Campi MC, (2018) Expected shortfall: Heuristics and certificates. Eur J Operational Res 267: 1003-1013. doi: 10.1016/j.ejor.2017.11.022 |
[12] | Rosen D and Saunders D, (2009) Analytical methods for hedging systematic credit risk with linear factor portfolios. J Econ Dyn Control 33: 37-52. doi: 10.1016/j.jedc.2008.03.010 |
[13] | Vasicek O, (1991) Limiting loan loss probability distribution. KMV Corporation. |
[14] | Vasicek O, (1991) The distribution of loan portfolio value. Risk 15: 160-162. |
[15] | Wolfinger RD, Fitting nonlinear mixed models with the new NLMIXED procedure. Proceedings of the 24th Annual SAS Users Group International Conference (SUGI 24), 1999. Available from: https://pdfs.semanticscholar.org/. |
[16] | Wu B, (2019) The Probability of Default Distribution of Heterogeneous Loan Portfolio. Curr Anal Econ Finance 1: 88-95. |
[17] | Yamai Y and Yoshiba T, (2002) Comparative analyses of expected shortfall and value-at-risk: their estimation error, decomposition, and optimization. Monetary Econ Stud 20: 87-121. |
[18] | Yang BH, (2013) Estimating Long-Run PD, Asset Correlation, and Portfolio Level PD by Vasicek Models. J Risk Model Validation 7: 3-19. |
[19] | Yang BH, Wu B, Cui K et al. (2020) IFRS9 Expected Credit Loss Estimation: Advanced Models for Estimating Portfolio Loss and Weighting Scenario Losses. J Risk Model Validation 14: 19-34. |