Research article

Breast cancer diagnosis using feature extraction and boosted C5.0 decision tree algorithm with penalty factor


  • Received: 16 September 2021 Revised: 28 November 2021 Accepted: 14 December 2021 Published: 04 January 2022
  • To address the two-class imbalance problem in breast cancer diagnosis, we propose a hybrid method that combines principal component analysis (PCA) with a boosted C5.0 decision tree algorithm with a penalty factor. PCA is used to reduce the dimension of the feature subset. The boosted C5.0 decision tree algorithm serves as an ensemble classifier. The penalty factor is used to optimize the classification result. To demonstrate the efficiency of the proposed method, it is applied to biased-representative breast cancer datasets from the University of California Irvine (UCI) machine learning repository. The experimental results and further analysis indicate that our proposal is a promising method for breast cancer diagnosis and can be used as an alternative method in class imbalance learning. Indeed, we observe that the feature extraction process helps improve diagnostic accuracy. We also demonstrate that the extracted features considering breast cancer issues are essential to high diagnostic accuracy.

    Citation: Jian-xue Tian, Jue Zhang. Breast cancer diagnosis using feature extraction and boosted C5.0 decision tree algorithm with penalty factor[J]. Mathematical Biosciences and Engineering, 2022, 19(3): 2193-2205. doi: 10.3934/mbe.2022102




    One of the main challenges faced by financial companies is to evaluate market risk under changes in basic variables such as stock prices, interest rates or exchange rates. In this regard, the Value-at-Risk (VaR), introduced by J. P. Morgan in the mid-1990s, has become a standard measure of financial market risk. Despite its extensive use, the VaR is not a coherent risk measure because it fails to satisfy the subadditivity property (see [1]). The VaR cannot determine the expected loss of a portfolio in the q worst cases; it only gives the minimum loss in those cases. Furthermore, the computation of the VaR is often based on the assumption that financial returns follow the normal distribution. However, as shown in the literature, the underlying distributions of many financial data exhibit skewness, asymmetry, heavy tails and excess kurtosis (see [10]). These features imply in particular that large losses occur with much higher probability than the normal distribution would suggest.

    The tail conditional expectation (TCE) is a risk measure with properties that are considered desirable in a variety of situations. For instance, because expectations are additive, the TCE allows risk capital to be decomposed naturally among its various components.

    Let $X$ be a loss random variable with cumulative distribution function (cdf) $F_X(x)$. The TCE is defined as

    $$\mathrm{TCE}_p(X)=E(X\mid X>x_p),\quad p\in(0,1),$$

    where $x_p=\inf\{x\in\mathbb{R}:F_X(x)\geq p\}=\mathrm{VaR}_p(X)$. The TCE has been widely discussed in the literature (see e.g., [11,12,15,17]).

    The capital allocation principle has long been established: the capital allocated to each risk unit can be expressed as its contribution to the tail conditional expectation of the total risk. Risk allocation not only helps to evaluate and compare the performance of individual risk units, but also helps to understand the contribution of each unit to the total risk of the portfolio. Landsman and Valdez [17] derived the portfolio risk decomposition with TCE for the multivariate elliptical distribution. The corresponding decomposition was derived in [18] for the exponential dispersion model and by Kim [14] for the exponential family class. The allocation for the class of exponential marginals was developed in []. The portfolio risk decomposition with TCE was further considered in [19] for the skew-normal distribution, by Furman and Landsman [16] for the multivariate Gamma distribution, and by Cai and Li [2] for the phase-type distribution. Goovaerts et al. [9] and Chiragiev and Landsman [4] provided the TCE-based capital allocation for the multivariate Pareto distribution, while Cossette et al. [5] considered multivariate compound distributions and Ignatieva and Landsman [12] the generalized hyperbolic distribution. Recently, Kim and Kim [15] and Ignatieva and Landsman [12] investigated the TCE allocation for the family of multivariate normal mean-variance mixture distributions and for the skewed generalized hyperbolic distribution, respectively. The univariate TCE and the risk allocation formula for the generalized hyper-elliptical class are available in [13].

    Furman and Landsman [8] observed that in many cases the TCE does not provide adequate information about the risks on the right tail: it carries no information about how far the risk deviates from its upper-tail expectation. To address this, they introduced the tail variance (TV) measure, defined as

    $$\mathrm{TV}_p(X)=\mathrm{Var}(X\mid X>x_p)=E\big((X-\mathrm{TCE}_p(X))^2\mid X>x_p\big),$$

    which has also been widely discussed in the literature (see e.g., [8,15]).
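
    The three tail measures defined so far (VaR, TCE and TV) can be estimated from a sample of losses. The following is a minimal sketch, not part of the paper's method: the function name `tail_measures` is hypothetical, and the quantile is taken as the smallest order statistic whose empirical cdf reaches $p$, mirroring the infimum in the definition of $x_p$.

```python
import math


def tail_measures(losses, p):
    """Return the empirical (VaR_p, TCE_p, TV_p) of a list of losses."""
    xs = sorted(losses)
    n = len(xs)
    # Empirical p-quantile: smallest order statistic x with F_n(x) >= p.
    k = max(0, math.ceil(p * n) - 1)
    x_p = xs[k]
    # Observations strictly above the quantile form the tail {X > x_p}.
    tail = [x for x in xs if x > x_p]
    if not tail:
        raise ValueError("no observations above the empirical quantile")
    tce = sum(tail) / len(tail)                         # E(X | X > x_p)
    tv = sum((x - tce) ** 2 for x in tail) / len(tail)  # Var(X | X > x_p)
    return x_p, tce, tv


if __name__ == "__main__":
    sample = [float(v) for v in range(1, 11)]
    print(tail_measures(sample, 0.5))
```

    The empirical version is only a sanity check; the closed-form expressions derived later avoid the sampling noise that affects tail estimates at high quantile levels.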

    In this paper we consider a class of multivariate location-scale mixtures of elliptical (LSME) distributions which is known to be extremely flexible and contains many special cases as its members. Examples include the generalized hyper-elliptical distribution and generalized hyperbolic distribution.

    The rest of the paper is organized as follows. Section 2 reviews the definition and properties of the multivariate LSME class and introduces the generalized hyper-elliptical distribution as a representative subclass. Section 3 states and proves the TCE formula for the univariate LSME, and Section 4 extends the development to the portfolio risk decomposition with TCE for the multivariate LSME. In Section 5, we develop the TV formula for the univariate LSME. Section 6 deals with the special case of the generalized hyperbolic distribution. A numerical illustration is presented in Section 7. Finally, concluding remarks are given in Section 8.

    In this section, we introduce the class of location-scale mixtures of elliptical distributions and some of its properties.

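    Let $\Psi_n$ be the class of functions $\psi(t):[0,\infty)\to\mathbb{R}$ such that $\psi\left(\sum_{i=1}^{n}t_i^2\right)$ is an $n$-dimensional characteristic function.

    A random vector $\boldsymbol{Y}$ is said to have a multivariate elliptical distribution, denoted by $\boldsymbol{Y}\sim E_n(\boldsymbol{\mu},\boldsymbol{\Sigma},\psi)$, if its characteristic function can be expressed as

    $$\varphi_{\boldsymbol{Y}}(\boldsymbol{t})=\exp(i\boldsymbol{t}^T\boldsymbol{\mu})\,\psi\left(\tfrac{1}{2}\boldsymbol{t}^T\boldsymbol{\Sigma}\boldsymbol{t}\right), \tag{2.1}$$

    for an $n$-dimensional column vector $\boldsymbol{\mu}$, an $n\times n$ positive definite scale matrix $\boldsymbol{\Sigma}$, and a function $\psi(t)\in\Psi_n$, which is called the characteristic generator.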

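    In general, a multivariate elliptical distribution may not have a probability density function (pdf), but if its pdf exists it has the form

    $$f_{\boldsymbol{Y}}(\boldsymbol{y})=\frac{c_n}{\sqrt{|\boldsymbol{\Sigma}|}}\,g_n\left[\tfrac{1}{2}(\boldsymbol{y}-\boldsymbol{\mu})^T\boldsymbol{\Sigma}^{-1}(\boldsymbol{y}-\boldsymbol{\mu})\right], \tag{2.2}$$

    for a function $g_n(\cdot)$, which is called the density generator. The condition

    $$\int_0^\infty u^{\frac{n}{2}-1}g_n(u)\,\mathrm{d}u<\infty \tag{2.3}$$

    guarantees that $g_n(u)$ is a density generator ([7]). In addition, the normalizing constant $c_n$ is

    $$c_n=\frac{\Gamma\left(\frac{n}{2}\right)}{(2\pi)^{\frac{n}{2}}}\left(\int_0^\infty u^{\frac{n}{2}-1}g_n(u)\,\mathrm{d}u\right)^{-1}.$$

    Similarly, the elliptical distribution can also be introduced through the density generator, in which case we write $\boldsymbol{Y}\sim E_n(\boldsymbol{\mu},\boldsymbol{\Sigma},g_n)$.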

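    From (2.1), it follows that if $\boldsymbol{Y}\sim E_n(\boldsymbol{\mu},\boldsymbol{\Sigma},g_n)$, $\boldsymbol{A}$ is an $m\times n$ matrix of rank $m\leq n$ and $\boldsymbol{b}$ is an $m$-dimensional column vector, then

    $$\boldsymbol{A}\boldsymbol{Y}+\boldsymbol{b}\sim E_m(\boldsymbol{A}\boldsymbol{\mu}+\boldsymbol{b},\boldsymbol{A}\boldsymbol{\Sigma}\boldsymbol{A}^T,g_m).$$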

    The condition

    $$\int_0^\infty g_1(u)\,\mathrm{d}u<\infty$$

    guarantees the existence of the mean. If, in addition, $|\psi'(0)|<\infty$, the covariance matrix exists and is equal to

    $$\mathrm{Cov}(\boldsymbol{Y})=-\psi'(0)\boldsymbol{\Sigma}$$

    (see [3]).

    From (2.2) and (2.3), $g_1(x)$ can be the density generator of a univariate elliptical random variable $Y\sim E_1(\mu,\sigma^2,g_1)$, whose pdf can be expressed as

    $$f_Y(y)=\frac{c}{\sigma}\,g_1\left(\frac{1}{2}\left(\frac{y-\mu}{\sigma}\right)^2\right),$$

    where $c$ is the normalizing constant. In this paper, we assume

    $$\mathrm{Var}(Z)=\sigma_Z^2<\infty, \tag{2.4}$$

    where $Z=\frac{Y-\mu}{\sigma}$ is the spherical random variable. The cdf of $Z$ can be written in the integral form

    $$F_Z(z)=c\int_{-\infty}^{z}g_1\left(\tfrac{1}{2}u^2\right)\mathrm{d}u.$$

    The mean and variance of $Z$ are

    $$\mu_Z=0$$

    and

    $$\sigma_Z^2=2c\int_0^\infty u^2g_1\left(\tfrac{1}{2}u^2\right)\mathrm{d}u=-\psi'(0).$$

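    Landsman and Valdez [17] showed that

    $$f_{Z^*}(z)=\frac{1}{\sigma_Z^2}\,\overline{G}\left(\tfrac{1}{2}z^2\right) \tag{2.5}$$

    is the density of another spherical random variable $Z^*$ associated with $Z$, where

    $$\overline{G}(z)=\int_z^\infty g_1(u)\,\mathrm{d}u.$$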

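    The random vector $\boldsymbol{X}\sim\mathrm{LSME}_n(\boldsymbol{\mu},\boldsymbol{\Sigma},\boldsymbol{\gamma},g_n;\Pi)$ has an $n$-dimensional LSME distribution with location parameter $\boldsymbol{\mu}$ and positive definite scale matrix $\boldsymbol{\Sigma}$ if

    $$\boldsymbol{X}=\boldsymbol{m}(\Theta)+\Theta^{\frac{1}{2}}\boldsymbol{\Sigma}^{\frac{1}{2}}\boldsymbol{Y} \tag{2.6}$$

    in distribution, where

    (1) $\boldsymbol{Y}\sim E_n(\boldsymbol{0},\boldsymbol{I_n},g_n)$ is an $n$-dimensional elliptical random vector;

    (2) the non-negative scalar random variable $\Theta$ is independent of $\boldsymbol{Y}$, with pdf $\pi(\theta)$ and cdf $\Pi(\theta)$;

    (3) $\boldsymbol{m}(\Theta)=\boldsymbol{\mu}+\Theta\boldsymbol{\gamma}$, where $\boldsymbol{\mu}=(\mu_1,\ldots,\mu_n)^T$ and $\boldsymbol{\gamma}=(\gamma_1,\ldots,\gamma_n)^T$ are constant vectors in $\mathbb{R}^n$.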

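    The pdf of the LSME can be written in the integral form

    $$f_{\boldsymbol{X}}(\boldsymbol{x})=\frac{c_n}{\sqrt{|\boldsymbol{\Sigma}|}}\int_0^\infty\frac{1}{\sqrt{\theta^{n}}}\,g_n\left(\frac{1}{2\theta}(\boldsymbol{x}-\boldsymbol{\mu}-\theta\boldsymbol{\gamma})^T\boldsymbol{\Sigma}^{-1}(\boldsymbol{x}-\boldsymbol{\mu}-\theta\boldsymbol{\gamma})\right)\pi(\theta)\,\mathrm{d}\theta,\quad\boldsymbol{x}\in\mathbb{R}^n.$$

    The conditional distribution of $\boldsymbol{X}$ given $\Theta=\theta$ is elliptical, that is,

    $$\boldsymbol{X}\mid\Theta=\theta\sim E_n(\boldsymbol{m}(\theta),\theta\boldsymbol{\Sigma},g_n). \tag{2.7}$$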

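    The mean and covariance of $\boldsymbol{X}$ are

    $$E(\boldsymbol{X})=E[E(\boldsymbol{X}\mid\Theta)]=E(\boldsymbol{m}(\Theta))=\boldsymbol{\mu}+E(\Theta)\boldsymbol{\gamma}$$

    and

    $$\mathrm{Cov}(\boldsymbol{X})=E[\mathrm{Var}(\boldsymbol{X}\mid\Theta)]+\mathrm{Var}[E(\boldsymbol{X}\mid\Theta)]=E(\Theta\boldsymbol{\Sigma})+\mathrm{Var}(\boldsymbol{m}(\Theta))=E(\Theta)\boldsymbol{\Sigma}+\mathrm{Var}(\Theta)\boldsymbol{\gamma}\boldsymbol{\gamma}^T.$$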
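
    The moment formulas above can be checked by simulation in the univariate case. The sketch below is illustrative only and rests on two assumptions that are not in the paper: $Y$ is standard normal (so that $\mathrm{Var}(Y)=1$), and $\Theta$ follows a Gamma law, chosen solely because the Python standard library can sample it. All function names and parameter values are hypothetical.

```python
import random


def simulate_lsme(n_draws, mu, gamma, sigma, shape, scale, seed=0):
    """Draw X = mu + Theta*gamma + sqrt(Theta)*sigma*Z with Z ~ N(0,1).

    Theta ~ Gamma(shape, scale) is an assumption made for illustration.
    """
    rng = random.Random(seed)
    draws = []
    for _ in range(n_draws):
        theta = rng.gammavariate(shape, scale)
        z = rng.gauss(0.0, 1.0)
        draws.append(mu + theta * gamma + (theta ** 0.5) * sigma * z)
    return draws


def theoretical_moments(mu, gamma, sigma, shape, scale):
    """E(X) = mu + E(Theta)*gamma, Var(X) = E(Theta)*sigma^2 + Var(Theta)*gamma^2."""
    e_theta = shape * scale          # mean of the Gamma mixing law
    var_theta = shape * scale ** 2   # variance of the Gamma mixing law
    return mu + e_theta * gamma, e_theta * sigma ** 2 + var_theta * gamma ** 2


def mean_and_variance(xs):
    """Sample mean and unbiased sample variance."""
    m = sum(xs) / len(xs)
    v = sum((x - m) ** 2 for x in xs) / (len(xs) - 1)
    return m, v


if __name__ == "__main__":
    params = dict(mu=0.1, gamma=-0.3, sigma=1.2, shape=2.0, scale=0.5)
    sample = simulate_lsme(200_000, **params)
    print("simulated :", mean_and_variance(sample))
    print("theoretical:", theoretical_moments(**params))
```

    The second term of the variance, $\mathrm{Var}(\Theta)\gamma^2$, is the contribution of the random location shift; it vanishes when $\gamma=0$, in which case the model reduces to a pure scale mixture.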

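    The characteristic function of $\boldsymbol{X}\mid\Theta=\theta$ exists and is equal to

    $$\varphi_{\boldsymbol{X}}(\boldsymbol{t}\mid\Theta=\theta)=\exp(i\boldsymbol{t}^T\boldsymbol{\mu})\exp(i\theta\boldsymbol{t}^T\boldsymbol{\gamma})\,\psi\left(\tfrac{1}{2}\theta\boldsymbol{t}^T\boldsymbol{\Sigma}\boldsymbol{t}\right).$$

    The characteristic function of the LSME-distributed random vector $\boldsymbol{X}$ can then be written as

    $$\varphi_{\boldsymbol{X}}(\boldsymbol{t})=\exp(i\boldsymbol{t}^T\boldsymbol{\mu})\,E\left[\exp(i\Theta\boldsymbol{t}^T\boldsymbol{\gamma})\,\psi\left(\tfrac{1}{2}\Theta\boldsymbol{t}^T\boldsymbol{\Sigma}\boldsymbol{t}\right)\right]=\exp(i\boldsymbol{t}^T\boldsymbol{\mu})\int_0^\infty\exp(i\theta\boldsymbol{t}^T\boldsymbol{\gamma})\,\psi\left(\tfrac{1}{2}\theta\boldsymbol{t}^T\boldsymbol{\Sigma}\boldsymbol{t}\right)\pi(\theta)\,\mathrm{d}\theta. \tag{2.8}$$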

    Under the condition (2.2) and from (2.5), we conclude that $\overline{G}(z)$ is the density generator of the associated elliptical variable $Z^*$. Then

    $$X^*=m(\Theta)+\sqrt{\Theta}\,\sigma Z^* \tag{2.9}$$

    is said to have a univariate LSME distribution, denoted by $X^*\sim\mathrm{LSME}_1(\mu,\sigma^2,\gamma,\Theta,\overline{G};\Pi)$.

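    Proposition 1. If $\boldsymbol{X}\sim\mathrm{LSME}_n(\boldsymbol{\mu},\boldsymbol{\Sigma},\boldsymbol{\gamma},g_n;\Pi)$ and $\boldsymbol{Y}=\boldsymbol{B}\boldsymbol{X}+\boldsymbol{b}$, where $\boldsymbol{B}$ is an $m\times n$ $(m\leq n)$ matrix and $\boldsymbol{b}$ is an $m$-dimensional column vector, then $\boldsymbol{Y}\sim\mathrm{LSME}_m(\boldsymbol{B}\boldsymbol{\mu}+\boldsymbol{b},\boldsymbol{B}\boldsymbol{\Sigma}\boldsymbol{B}^T,\boldsymbol{B}\boldsymbol{\gamma},g_m;\Pi)$.

    Proof. Using the characteristic function (2.8), we write

    $$\begin{aligned}\varphi_{\boldsymbol{Y}}(\boldsymbol{t})&=E\left(e^{i\boldsymbol{t}^T(\boldsymbol{B}\boldsymbol{X}+\boldsymbol{b})}\right)=\exp(i\boldsymbol{t}^T\boldsymbol{b})\,\varphi_{\boldsymbol{X}}(\boldsymbol{B}^T\boldsymbol{t})\\&=\exp(i\boldsymbol{t}^T\boldsymbol{b})\exp\left(i(\boldsymbol{B}^T\boldsymbol{t})^T\boldsymbol{\mu}\right)\int_0^\infty\exp\left(i\theta(\boldsymbol{B}^T\boldsymbol{t})^T\boldsymbol{\gamma}\right)\psi\left(\tfrac{1}{2}\theta(\boldsymbol{B}^T\boldsymbol{t})^T\boldsymbol{\Sigma}(\boldsymbol{B}^T\boldsymbol{t})\right)\pi(\theta)\,\mathrm{d}\theta\\&=\exp\left(i\boldsymbol{t}^T(\boldsymbol{B}\boldsymbol{\mu}+\boldsymbol{b})\right)\int_0^\infty\exp\left(i\theta\boldsymbol{t}^T\boldsymbol{B}\boldsymbol{\gamma}\right)\psi\left(\tfrac{1}{2}\theta\boldsymbol{t}^T\boldsymbol{B}\boldsymbol{\Sigma}\boldsymbol{B}^T\boldsymbol{t}\right)\pi(\theta)\,\mathrm{d}\theta,\end{aligned} \tag{2.10}$$

    i.e., $\boldsymbol{Y}\sim\mathrm{LSME}_m(\boldsymbol{B}\boldsymbol{\mu}+\boldsymbol{b},\boldsymbol{B}\boldsymbol{\Sigma}\boldsymbol{B}^T,\boldsymbol{B}\boldsymbol{\gamma},g_m;\Pi)$.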

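    Example 2.1 (Generalized hyper-elliptical distribution). The GHE distribution is constructed by mixing a generalized inverse Gaussian distribution with an elliptical distribution. A positive random variable $\Theta$ is said to have a generalized inverse Gaussian distribution, denoted by $\Theta\sim\mathrm{GIG}(\lambda,a,b)$, if its pdf is given by

    $$\pi(\theta;\lambda,a,b)=\frac{a^{-\lambda}(\sqrt{ab})^{\lambda}}{2K_\lambda(\sqrt{ab})}\,\theta^{\lambda-1}\exp\left(-\frac{1}{2}\left(a\theta^{-1}+b\theta\right)\right),\quad\theta>0, \tag{2.11}$$

    where the parameters satisfy

    $$\begin{cases}a\geq0\ \text{and}\ b>0,&\text{if}\ \lambda>0,\\a>0\ \text{and}\ b\geq0,&\text{if}\ \lambda<0,\\a>0\ \text{and}\ b>0,&\text{if}\ \lambda=0,\end{cases}$$

    and $K_\lambda(\cdot)$ denotes the modified Bessel function of the second kind with index $\lambda\in\mathbb{R}$. A random vector $\boldsymbol{X}\sim\mathrm{GHE}_n(\boldsymbol{\mu},\boldsymbol{\Sigma},\boldsymbol{\gamma},g_n,\lambda,a,b)$ has an $n$-dimensional GHE distribution if there exists a random vector $\boldsymbol{Y}$ as in (2.6) such that

    $$\boldsymbol{X}=\boldsymbol{m}(\Theta)+\Theta^{\frac{1}{2}}\boldsymbol{\Sigma}^{\frac{1}{2}}\boldsymbol{Y}, \tag{2.12}$$

    where $\Theta\sim\mathrm{GIG}(\lambda,a,b)$.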

    The univariate LSME variable is obtained by setting $n=1$ in the multivariate definition. That is, the univariate LSME variable $X\sim\mathrm{LSME}_1(\mu,\sigma^2,\gamma,g_1;\Pi)$ satisfies

    $$X=m(\Theta)+\sqrt{\Theta}\,\sigma Z,$$

    where $Z\sim E_1(0,1,g_1)$ is the standard elliptical variable and the non-negative scalar random variable $\Theta$ with pdf $\pi(\theta)$ is independent of $Z$. From (2.7), we have

    $$X\mid\theta\sim E_1(m(\theta),\theta\sigma^2,g_1). \tag{3.1}$$

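    Assuming that both the conditional distribution and the mixing distribution are continuous, the pdf of $X$ can be written as

    $$f_X(x)=\int_{\Omega_\theta}f(x\mid\theta)\pi(\theta)\,\mathrm{d}\theta, \tag{3.2}$$

    where $f(x\mid\theta)$ is the pdf of $X\mid\theta$ and $\Omega_\theta$ is the support of $\pi(\theta)$. Now let $x_p$ be the $p$-quantile of the LSME variable $X$. Then the TCE of $X$ can be expressed as

    $$\begin{aligned}E(X\mid X>x_p)&=\frac{1}{1-p}\int_{x_p}^{\infty}xf_X(x)\,\mathrm{d}x=\frac{1}{1-p}\int_{\Omega_\theta}\int_{x_p}^{\infty}xf(x\mid\theta)\,\mathrm{d}x\,\pi(\theta)\,\mathrm{d}\theta\\&=\frac{1}{1-p}\int_{\Omega_\theta}E_{X\mid\theta}(X\mid X>x_p)\,\overline{F}_{X\mid\theta}(x_p)\,\pi(\theta)\,\mathrm{d}\theta.\end{aligned} \tag{3.3}$$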

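    The TCE formula for a univariate elliptical distribution was introduced in [17] and reads

    $$\begin{aligned}\mathrm{TCE}_p(X\mid\theta)=E_{X\mid\theta}(X\mid X>x_p)&=m(\theta)+\frac{\frac{1}{\sqrt{\theta}\sigma}f_{Z^*}(\kappa(x_p;\theta))}{\overline{F}_Z(\kappa(x_p;\theta))}\,\sigma_Z^2\,\theta\sigma^2\\&=m(\theta)+\frac{\frac{1}{\sqrt{\theta}\sigma}f_{Z^*}(\kappa(x_p;\theta))}{\overline{F}_Z(\kappa(x_p;\theta))}\,\mathrm{Var}(X\mid\theta),\end{aligned} \tag{3.4}$$

    where

    $$\kappa(x;\theta)=\frac{x-m(\theta)}{\sqrt{\theta}\sigma}.$$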

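    We now give a general TCE formula for the univariate LSME distributions.

    Theorem 1. Let $X\sim\mathrm{LSME}_1(\mu,\sigma^2,\gamma,g_1;\Pi)$ and let $\pi^*(\theta)=(c^*)^{-1}\theta\pi(\theta)$ be a mixing pdf with $c^*=E(\Theta)<\infty$. Then the TCE of $X$ can be computed by

    $$E(X\mid X>x_p)=\mu+\gamma\frac{c^*}{1-p}\overline{F}_{\mathrm{LSME},1}(x_p;\mu,\sigma^2,\gamma,g_1;\Pi^*)+\frac{c^*\sigma^2\sigma_Z^2}{1-p}f_{\mathrm{LSME},1}(x_p;\mu,\sigma^2,\gamma,\overline{G};\Pi^*), \tag{3.5}$$

    where $\Pi^*$ is the cdf corresponding to the pdf $\pi^*$.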

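    Proof. From (3.3) and (3.4), we have

    $$\begin{aligned}E(X\mid X>x_p)&=\frac{1}{1-p}\int_0^\infty\left[m(\theta)+\frac{\frac{1}{\sqrt{\theta}\sigma}f_{Z^*}(\kappa(x_p;\theta))}{\overline{F}_Z(\kappa(x_p;\theta))}\mathrm{Var}(X\mid\theta)\right]\overline{F}_{X\mid\theta}(x_p)\pi(\theta)\,\mathrm{d}\theta\\&=\frac{1}{1-p}\int_{x_p}^\infty\int_0^\infty\frac{m(\theta)}{\sqrt{\theta}\sigma}f_Z(\kappa(x;\theta))\pi(\theta)\,\mathrm{d}\theta\,\mathrm{d}x+\frac{\sigma^2\sigma_Z^2}{1-p}\int_0^\infty\frac{f_{Z^*}(\kappa(x_p;\theta))}{\sqrt{\theta}\sigma}\theta\pi(\theta)\,\mathrm{d}\theta\\&=\frac{1}{1-p}\int_{x_p}^\infty\mu f(x)\,\mathrm{d}x+\frac{\gamma}{1-p}\int_{x_p}^\infty\int_0^\infty f(x\mid\theta)\theta\pi(\theta)\,\mathrm{d}\theta\,\mathrm{d}x+\frac{\sigma^2\sigma_Z^2}{1-p}\int_0^\infty\frac{f_{Z^*}(\kappa(x_p;\theta))}{\sqrt{\theta}\sigma}\theta\pi(\theta)\,\mathrm{d}\theta.\end{aligned} \tag{3.6}$$

    From the definition of LSME distributions and (3.2), we have

    $$\int_0^\infty f(x\mid\theta)\theta\pi(\theta)\,\mathrm{d}\theta=c^*\int_0^\infty f(x\mid\theta)\pi^*(\theta)\,\mathrm{d}\theta=c^*f_{\mathrm{LSME},1}(x;\mu,\sigma^2,\gamma,g_1;\Pi^*).$$

    As a result, (3.6) can be further simplified to

    $$\begin{aligned}E(X\mid X>x_p)&=\mu+\frac{\gamma c^*}{1-p}\int_{x_p}^\infty\int_0^\infty f(x\mid\theta)\pi^*(\theta)\,\mathrm{d}\theta\,\mathrm{d}x+\frac{c^*\sigma^2\sigma_Z^2}{1-p}\int_0^\infty\frac{f_{Z^*}(\kappa(x_p;\theta))}{\sqrt{\theta}\sigma}\pi^*(\theta)\,\mathrm{d}\theta\\&=\mu+\frac{\gamma c^*}{1-p}\int_{x_p}^\infty f_{\mathrm{LSME},1}(x;\mu,\sigma^2,\gamma,g_1;\Pi^*)\,\mathrm{d}x+\frac{c^*\sigma^2\sigma_Z^2}{1-p}\int_0^\infty\frac{f_{Z^*}(\kappa(x_p;\theta))}{\sqrt{\theta}\sigma}\pi^*(\theta)\,\mathrm{d}\theta\\&=\mu+\gamma\frac{c^*}{1-p}\overline{F}_{\mathrm{LSME},1}(x_p;\mu,\sigma^2,\gamma,g_1;\Pi^*)+\frac{c^*\sigma^2\sigma_Z^2}{1-p}f_{\mathrm{LSME},1}(x_p;\mu,\sigma^2,\gamma,\overline{G};\Pi^*).\end{aligned}$$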
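
    To see formula (3.5) at work, the sketch below evaluates it numerically for one concrete case and compares it with the definition (3.3). It is illustrative only and relies on assumptions that are not in the paper: $Z$ is standard normal (so $\sigma_Z^2=1$ and $f_{Z^*}=f_Z$), $\Theta\sim\mathrm{Gamma}(k,s)$ (so that $\pi^*$ is the $\mathrm{Gamma}(k+1,s)$ density and $c^*=ks$), the $\theta$-integrals are truncated to a finite range, and all parameter values and function names are hypothetical. The comparison value uses the closed-form normal partial expectation $E[X\,1\{X>x_p\}\mid\theta]=m(\theta)\overline{\Phi}(\kappa)+\sqrt{\theta}\sigma\phi(\kappa)$.

```python
import math

# Hypothetical parameters for illustration only.
MU, SIGMA, GAMMA = 0.1, 1.2, 0.4
SHAPE, SCALE = 2.0, 0.5          # Theta ~ Gamma(SHAPE, SCALE) (assumption)
LO, HI = 1e-9, 40.0              # truncated integration range for theta


def norm_pdf(z):
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def norm_sf(z):
    """Survival function of the standard normal."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))


def gamma_pdf(theta, shape):
    return theta ** (shape - 1) * math.exp(-theta / SCALE) / (
        math.gamma(shape) * SCALE ** shape)


def simpson(f, lo, hi, n=2000):
    """Composite Simpson rule with an even number n of sub-intervals."""
    h = (hi - lo) / n
    total = f(lo) + f(hi)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(lo + i * h)
    return total * h / 3.0


def kappa(x, theta):
    return (x - MU - theta * GAMMA) / (math.sqrt(theta) * SIGMA)


def mix_sf(x, shape):
    """Survival function of the mixture when Theta ~ Gamma(shape, SCALE)."""
    return simpson(lambda t: norm_sf(kappa(x, t)) * gamma_pdf(t, shape), LO, HI)


def mix_pdf(x, shape):
    """Density of the mixture when Theta ~ Gamma(shape, SCALE)."""
    return simpson(
        lambda t: norm_pdf(kappa(x, t)) / (math.sqrt(t) * SIGMA) * gamma_pdf(t, shape),
        LO, HI)


def quantile(p):
    """p-quantile of X by bisection on the survival function."""
    lo, hi = -50.0, 50.0
    for _ in range(50):
        mid = 0.5 * (lo + hi)
        if mix_sf(mid, SHAPE) > 1.0 - p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def tce_theorem1(p):
    """Formula (3.5) with sigma_Z^2 = 1 and pi* = Gamma(SHAPE + 1, SCALE)."""
    x_p = quantile(p)
    c_star = SHAPE * SCALE
    return (MU + GAMMA * c_star / (1.0 - p) * mix_sf(x_p, SHAPE + 1.0)
            + c_star * SIGMA ** 2 / (1.0 - p) * mix_pdf(x_p, SHAPE + 1.0))


def tce_direct(p):
    """Definition (3.3), using the closed-form normal partial expectation."""
    x_p = quantile(p)

    def integrand(t):
        k = kappa(x_p, t)
        partial = (MU + t * GAMMA) * norm_sf(k) + math.sqrt(t) * SIGMA * norm_pdf(k)
        return partial * gamma_pdf(t, SHAPE)

    return simpson(integrand, LO, HI) / (1.0 - p)


if __name__ == "__main__":
    print(tce_theorem1(0.95), tce_direct(0.95))
```

    Both routes share the same quadrature grid, so their agreement mainly confirms the algebra behind the change of mixing density from $\pi$ to $\pi^*$; an independent Monte Carlo estimate would be a stronger check.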

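    Corollary 1. Let $X\sim\mathrm{GHE}_1(\mu,\sigma^2,\gamma,g_1,\lambda,a,b)$ and assume the conditions in Theorem 1 are satisfied. Then the TCE of the GHE distribution can be computed by

    $$\begin{aligned}\mathrm{TCE}_p(X)=\mu&+\frac{\gamma}{1-p}\sqrt{\frac{a}{b}}\frac{K_{\lambda+1}(\sqrt{ab})}{K_\lambda(\sqrt{ab})}\overline{F}_{\mathrm{GHE},1}(x_p;\mu,\sigma^2,\gamma,g_1,\lambda+1,a,b)\\&+\frac{\sigma^2\sigma_Z^2}{1-p}\sqrt{\frac{a}{b}}\frac{K_{\lambda+1}(\sqrt{ab})}{K_\lambda(\sqrt{ab})}f_{\mathrm{GHE},1}(x_p;\mu,\sigma^2,\gamma,\overline{G},\lambda+1,a,b).\end{aligned} \tag{3.7}$$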

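    Proof. From the GIG density in (2.11), we conclude

    $$\theta\pi(\theta;\lambda,a,b)=\sqrt{\frac{a}{b}}\frac{K_{\lambda+1}(\sqrt{ab})}{K_\lambda(\sqrt{ab})}\,\pi(\theta;\lambda+1,a,b).$$

    By setting

    $$c^*=\sqrt{\frac{a}{b}}\frac{K_{\lambda+1}(\sqrt{ab})}{K_\lambda(\sqrt{ab})},$$

    we obtain

    $$\pi^*(\theta)=(c^*)^{-1}\theta\pi(\theta)=\pi(\theta;\lambda+1,a,b),$$

    which is again a GIG density. Using (3.5), we directly obtain (3.7).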
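
    The constant $c^*=E(\Theta)$ used in this proof can be checked numerically. The sketch below is illustrative only: it evaluates $K_\nu(x)$ from the integral representation $K_\nu(x)=\int_0^\infty e^{-x\cosh t}\cosh(\nu t)\,\mathrm{d}t$ (valid for $x>0$) with a plain trapezoid rule, and compares the Bessel-ratio expressions for $E(\Theta)$ and $E(\Theta^2)$ with moments obtained by integrating the kernel of (2.11) directly. The function names and parameter values are hypothetical, and the case $ab=0$ is not handled.

```python
import math


def bessel_k(nu, x, upper=20.0, steps=20000):
    """K_nu(x) for x > 0 via int_0^inf exp(-x cosh t) cosh(nu t) dt (trapezoid)."""
    h = upper / steps
    total = 0.5 * (math.exp(-x)
                   + math.exp(-x * math.cosh(upper)) * math.cosh(nu * upper))
    for i in range(1, steps):
        t = i * h
        total += math.exp(-x * math.cosh(t)) * math.cosh(nu * t)
    return total * h


def gig_moment_bessel(k, lam, a, b):
    """E(Theta^k) = (a/b)^(k/2) * K_{lam+k}(sqrt(ab)) / K_lam(sqrt(ab))."""
    root = math.sqrt(a * b)
    return (a / b) ** (k / 2.0) * bessel_k(lam + k, root) / bessel_k(lam, root)


def gig_moment_quadrature(k, lam, a, b, half_width=15.0, steps=30000):
    """E(Theta^k) by quadrature in u = log(theta); the normalising constant cancels."""
    h = 2.0 * half_width / steps
    num = den = 0.0
    for i in range(steps + 1):
        u = -half_width + i * h
        # theta^(lam-1) * exp(-(a/theta + b*theta)/2) d(theta), with theta = e^u
        w = math.exp(lam * u - 0.5 * (a * math.exp(-u) + b * math.exp(u)))
        den += w
        num += w * math.exp(k * u)
    return num / den


if __name__ == "__main__":
    lam, a, b = 1.2, 1.3, 0.35   # hypothetical GIG parameters
    print("c*  :", gig_moment_bessel(1, lam, a, b), gig_moment_quadrature(1, lam, a, b))
    print("c** :", gig_moment_bessel(2, lam, a, b), gig_moment_quadrature(2, lam, a, b))
```

    In practice a library routine for $K_\nu$ would replace `bessel_k`; the hand-rolled quadrature is used here only to keep the sketch free of external dependencies.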

    Consider a risk vector $\boldsymbol{Y}=(Y_1,\ldots,Y_n)^T$ and $S=Y_1+\cdots+Y_n$. Denoting by $s_p$ the $p$-quantile of $S$, we have

    $$E(S\mid S>s_p)=\sum_{i=1}^{n}E(Y_i\mid S>s_p),$$

    where $E(Y_i\mid S>s_p)$ is the contribution of the $i$-th risk to the aggregated risk.

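    Let $\boldsymbol{Y}=(Y_1,\ldots,Y_n)^T\sim E_n(\boldsymbol{\mu},\boldsymbol{\Sigma},g_n)$ and $S=Y_1+\cdots+Y_n$. Then ([6])

    $$E(Y_i\mid S=s)=\int_{-\infty}^{\infty}y_if(y_i\mid s)\,\mathrm{d}y_i=E(Y_i)+\frac{\mathrm{Cov}(Y_i,S)}{\mathrm{Var}(S)}(s-E(S)).$$

    The contribution of risk $Y_i$, $1\leq i\leq n$, to the total TCE can be expressed as

    $$E(Y_i\mid S>s_p)=\int_{s_p}^{\infty}E(Y_i\mid S=s)\,\mathrm{d}F_S(s\mid S>s_p)=\int_{s_p}^{\infty}E(Y_i\mid S=s)\frac{f_S(s)}{1-F_S(s_p)}\,\mathrm{d}s=\frac{1}{1-p}\int_{s_p}^{\infty}E(Y_i\mid S=s)f_S(s)\,\mathrm{d}s.$$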

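    We now apply this formulation to the multivariate LSME to obtain its portfolio risk decomposition with TCE.

    Let $\boldsymbol{X}=(X_1,\ldots,X_n)^T\sim\mathrm{LSME}_n(\boldsymbol{\mu},\boldsymbol{\Sigma},\boldsymbol{\gamma},g_n;\Pi)$, denote the $(i,j)$ element of $\boldsymbol{\Sigma}$ by $\sigma_{ij}$, and define

    $$S=X_1+\cdots+X_n.$$

    Then $E(X_i\mid S=s)$ can be expanded by conditioning on $\theta$ as follows:

    $$\begin{aligned}E(X_i\mid S=s)&=\int_{-\infty}^{\infty}x_if(x_i\mid s)\,\mathrm{d}x_i=\frac{\int_{-\infty}^{\infty}x_if(x_i,s)\,\mathrm{d}x_i}{f_S(s)}\\&=\frac{1}{f_S(s)}\int_{-\infty}^{\infty}x_i\int_0^\infty f(x_i,s\mid\theta)\,\mathrm{d}\Pi(\theta)\,\mathrm{d}x_i\\&=\frac{1}{f_S(s)}\int_{-\infty}^{\infty}x_i\int_0^\infty f(x_i\mid s,\theta)f(s\mid\theta)\pi(\theta)\,\mathrm{d}\theta\,\mathrm{d}x_i\\&=\frac{1}{f_S(s)}\int_0^\infty\left[\int_{-\infty}^{\infty}x_if(x_i\mid s,\theta)\,\mathrm{d}x_i\right]f(s\mid\theta)\pi(\theta)\,\mathrm{d}\theta.\end{aligned} \tag{4.1}$$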

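    To deal with the inner integral, we define a matrix $\boldsymbol{B}_i$ of size $2\times n$:

    $$\boldsymbol{B}_i=\begin{bmatrix}0&0&\cdots&0&1&0&\cdots&0\\1&1&\cdots&1&1&1&\cdots&1\end{bmatrix}. \tag{4.2}$$

    The first row has a 1 in the $i$-th position and zeros elsewhere. If we keep the general form

    $$\boldsymbol{m}(\theta)=(m_1(\theta),\ldots,m_n(\theta))^T,$$

    we have

    $$\boldsymbol{B}_i\boldsymbol{X}\mid\theta=(X_i,S\mid\theta)^T=\boldsymbol{B}_i\boldsymbol{m}(\theta)+\theta^{\frac{1}{2}}\boldsymbol{B}_i\boldsymbol{\Sigma}^{\frac{1}{2}}\boldsymbol{Y},$$

    where $(X_i,S\mid\theta)^T$ stands for a random column vector of size $2\times1$ whose elements are $X_i\mid\theta$ and $S\mid\theta$, respectively. Thus, the joint distribution of $(X_i,S)$ conditional on $\Theta=\theta$ is the bivariate elliptical distribution

    $$(X_i,S\mid\theta)^T\sim E_2(\boldsymbol{B}_i\boldsymbol{m}(\theta),\theta\boldsymbol{B}_i\boldsymbol{\Sigma}\boldsymbol{B}_i^T,g_2),$$

    where the mean vector and covariance matrix of $(X_i,S\mid\theta)$ are given by

    $$E(\boldsymbol{B}_i\boldsymbol{X}\mid\theta)=\boldsymbol{B}_i\boldsymbol{m}(\theta)=(E(X_i\mid\theta),E(S\mid\theta))^T=\begin{bmatrix}m_i(\theta)\\\sum_{j=1}^{n}m_j(\theta)\end{bmatrix},$$
    $$\mathrm{Cov}(\boldsymbol{B}_i\boldsymbol{X}\mid\theta)=-\psi'(0)\,\theta\boldsymbol{B}_i\boldsymbol{\Sigma}\boldsymbol{B}_i^T=-\psi'(0)\begin{bmatrix}\theta\sigma_{ii}&\theta\sum_{j=1}^{n}\sigma_{ij}\\\theta\sum_{j=1}^{n}\sigma_{ij}&\theta\sigma_S^2\end{bmatrix},$$

    where

    $$\sigma_S^2=\boldsymbol{1}^T\boldsymbol{\Sigma}\boldsymbol{1}=\sum_{i=1}^{n}\sum_{j=1}^{n}\sigma_{ij}.$$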
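
    The block structure of $\boldsymbol{B}_i\boldsymbol{\Sigma}\boldsymbol{B}_i^T$ can be confirmed with a few lines of code. The sketch below is illustrative only: the helper names are hypothetical, the index `i` is zero-based (unlike the one-based index in the text), and the $3\times3$ scale matrix in the usage example is made up.

```python
def selection_matrix(i, n):
    """B_i: first row picks coordinate i (zero-based), second row sums all coordinates."""
    first = [0.0] * n
    first[i] = 1.0
    return [first, [1.0] * n]


def matmul(a, b):
    """Plain-Python product of two matrices stored as lists of rows."""
    return [[sum(a[r][k] * b[k][c] for k in range(len(b)))
             for c in range(len(b[0]))] for r in range(len(a))]


def transpose(a):
    return [list(col) for col in zip(*a)]


def scale_block(i, sigma):
    """Return the 2x2 matrix B_i * Sigma * B_i^T."""
    b = selection_matrix(i, len(sigma))
    return matmul(matmul(b, sigma), transpose(b))


if __name__ == "__main__":
    sigma = [[2.0, 0.5, 0.3],
             [0.5, 1.5, 0.4],
             [0.3, 0.4, 1.0]]   # hypothetical symmetric scale matrix
    # Expected layout: [[sigma_ii, row sum i], [row sum i, sigma_S^2]]
    print(scale_block(0, sigma))
```

    The off-diagonal entry is the $i$-th row sum of $\boldsymbol{\Sigma}$ and the lower-right entry is $\sigma_S^2$, which is exactly what enters the regression coefficient of $X_i$ on $S$.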

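    Therefore, if we additionally condition on $S$, we see that $f(x_i\mid s,\theta)$ is an elliptical density. In particular,

    $$\begin{aligned}\int_{-\infty}^{\infty}x_if(x_i\mid s,\theta)\,\mathrm{d}x_i&=E(X_i\mid S=s,\Theta=\theta)=E(X_i\mid\theta)+\frac{\mathrm{Cov}(X_i,S\mid\theta)}{\mathrm{Var}(S\mid\theta)}(s-E(S\mid\theta))\\&=m_i(\theta)+\frac{-\psi'(0)\,\theta\sum_{j=1}^{n}\sigma_{ij}}{-\psi'(0)\,\theta\sigma_S^2}\left(s-\sum_{j=1}^{n}m_j(\theta)\right)\\&=m_i(\theta)+\frac{\sum_{j=1}^{n}\sigma_{ij}}{\sigma_S^2}\left(s-\sum_{j=1}^{n}m_j(\theta)\right).\end{aligned}$$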

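    Consequently,

    $$\begin{aligned}E(X_i\mid S=s)&=\frac{1}{f_S(s)}\int_0^\infty\left[\int_{-\infty}^{\infty}x_if(x_i\mid s,\theta)\,\mathrm{d}x_i\right]f(s\mid\theta)\pi(\theta)\,\mathrm{d}\theta\\&=\frac{1}{f_S(s)}\int_0^\infty\left[m_i(\theta)+\frac{\sum_{j=1}^{n}\sigma_{ij}}{\sigma_S^2}\left(s-\sum_{j=1}^{n}m_j(\theta)\right)\right]\times\frac{1}{\sqrt{\theta}\sigma_S}f_Z\left(\frac{s-\sum_{j=1}^{n}m_j(\theta)}{\sqrt{\theta}\sigma_S}\right)\pi(\theta)\,\mathrm{d}\theta.\end{aligned} \tag{4.3}$$

    Eventually,

    $$\begin{aligned}E(X_i\mid S>s_p)&=\frac{1}{1-p}\int_{s_p}^{\infty}E(X_i\mid S=s)f_S(s)\,\mathrm{d}s\\&=\frac{1}{1-p}\int_{s_p}^{\infty}\int_0^\infty\left[m_i(\theta)+\frac{\sum_{j=1}^{n}\sigma_{ij}}{\sigma_S^2}\left(s-\sum_{j=1}^{n}m_j(\theta)\right)\right]\times\frac{1}{\sqrt{\theta}\sigma_S}f_Z\left(\frac{s-\sum_{j=1}^{n}m_j(\theta)}{\sqrt{\theta}\sigma_S}\right)\pi(\theta)\,\mathrm{d}\theta\,\mathrm{d}s.\end{aligned} \tag{4.4}$$

    This expression, though complex, can yield a closed-form quantity when $\pi(\theta)$ and $m_j(\theta)$ are suitably chosen.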

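    The portfolio risk decomposition with TCE is additive, that is, the contributions of all components must sum to the TCE of $S$. We can verify this as follows:

    $$\begin{aligned}\sum_{i=1}^{n}E(X_i\mid S>s_p)&=\frac{1}{1-p}\sum_{i=1}^{n}\int_{s_p}^{\infty}\int_0^\infty\left[m_i(\theta)+\frac{\sum_{j=1}^{n}\sigma_{ij}}{\sigma_S^2}\left(s-\sum_{j=1}^{n}m_j(\theta)\right)\right]\frac{1}{\sqrt{\theta}\sigma_S}f_Z\left(\frac{s-\sum_{j=1}^{n}m_j(\theta)}{\sqrt{\theta}\sigma_S}\right)\pi(\theta)\,\mathrm{d}\theta\,\mathrm{d}s\\&=\frac{1}{1-p}\int_{s_p}^{\infty}\int_0^\infty\left[\sum_{i=1}^{n}m_i(\theta)+\frac{\sum_{i=1}^{n}\sum_{j=1}^{n}\sigma_{ij}}{\sigma_S^2}\left(s-\sum_{j=1}^{n}m_j(\theta)\right)\right]\frac{1}{\sqrt{\theta}\sigma_S}f_Z\left(\frac{s-\sum_{j=1}^{n}m_j(\theta)}{\sqrt{\theta}\sigma_S}\right)\pi(\theta)\,\mathrm{d}\theta\,\mathrm{d}s\\&=\frac{1}{1-p}\int_{s_p}^{\infty}\int_0^\infty\left[\sum_{i=1}^{n}m_i(\theta)+\left(s-\sum_{j=1}^{n}m_j(\theta)\right)\right]\frac{1}{\sqrt{\theta}\sigma_S}f_Z\left(\frac{s-\sum_{j=1}^{n}m_j(\theta)}{\sqrt{\theta}\sigma_S}\right)\pi(\theta)\,\mathrm{d}\theta\,\mathrm{d}s\\&=\frac{1}{1-p}\int_{s_p}^{\infty}\int_0^\infty s\,\frac{1}{\sqrt{\theta}\sigma_S}f_Z\left(\frac{s-\sum_{j=1}^{n}m_j(\theta)}{\sqrt{\theta}\sigma_S}\right)\pi(\theta)\,\mathrm{d}\theta\,\mathrm{d}s\\&=E(S\mid S>s_p),\end{aligned}$$

    as required. We now present the general portfolio risk decomposition with TCE for the multivariate LSME class in a more concrete and compact form, which is available when $\boldsymbol{m}(\theta)$ is linear in $\theta$.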

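    Theorem 2. Let $\boldsymbol{X}=(X_1,X_2,\ldots,X_n)^T\sim\mathrm{LSME}_n(\boldsymbol{\mu},\boldsymbol{\Sigma},\boldsymbol{\gamma},g_n;\Pi)$ and denote the pdf of $S=\boldsymbol{1}^T\boldsymbol{X}$ by $f_S(s)$. Let $\pi^*(\theta)=(c^*)^{-1}\theta\pi(\theta)$ be a mixing pdf with $c^*=E(\Theta)<\infty$.

    Then the portfolio risk decomposition with TCE for the $i$-th marginal variable is given by

    $$E(X_i\mid S>s_p)=b_{0,i}+b_{1,i}E(S\mid S>s_p)+\frac{b_{2,i}}{1-p}c^*\overline{F}_{\mathrm{LSME},1}(s_p;\boldsymbol{1}^T\boldsymbol{\mu},\boldsymbol{1}^T\boldsymbol{\Sigma}\boldsymbol{1},\boldsymbol{1}^T\boldsymbol{\gamma},g_1;\Pi^*), \tag{4.5}$$

    where $\Pi^*$ is the cdf corresponding to the pdf $\pi^*$, the coefficients $b_{0,i}$, $b_{1,i}$ and $b_{2,i}$ are defined as

    $$b_{0,i}=\mu_i-b_{1,i}\sum_{j=1}^{n}\mu_j;\quad b_{1,i}=\frac{\sum_{j=1}^{n}\sigma_{ij}}{\sigma_S^2};\quad b_{2,i}=\gamma_i-b_{1,i}\sum_{j=1}^{n}\gamma_j,$$

    and $s_p$ is the $p$-quantile of $S$.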

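    Proof. Let $m_i(\theta)=\mu_i+\theta\gamma_i$. From (4.3), we have

    $$\begin{aligned}E(X_i\mid S=s)&=\frac{1}{f_S(s)}\int_0^\infty\left[\mu_i+\frac{\sum_{j=1}^{n}\sigma_{ij}}{\sigma_S^2}\left(s-\sum_{j=1}^{n}\mu_j\right)+\left(\gamma_i-\sum_{j=1}^{n}\gamma_j\frac{\sum_{j=1}^{n}\sigma_{ij}}{\sigma_S^2}\right)\theta\right]\times\frac{1}{\sqrt{\theta}\sigma_S}f_Z\left(\frac{s-\sum_{j=1}^{n}\mu_j-\theta\sum_{j=1}^{n}\gamma_j}{\sqrt{\theta}\sigma_S}\right)\pi(\theta)\,\mathrm{d}\theta\\&=\frac{1}{f_S(s)}\int_0^\infty\left[b_{0,i}+b_{1,i}s+b_{2,i}\theta\right]\frac{1}{\sqrt{\theta}\sigma_S}f_Z\left(\frac{s-\sum_{j=1}^{n}\mu_j-\theta\sum_{j=1}^{n}\gamma_j}{\sqrt{\theta}\sigma_S}\right)\pi(\theta)\,\mathrm{d}\theta\\&=\frac{1}{f_S(s)}\left[b_{0,i}f_S(s)+b_{1,i}sf_S(s)+b_{2,i}c^*f_{\mathrm{LSME},1}(s;\boldsymbol{1}^T\boldsymbol{\mu},\boldsymbol{1}^T\boldsymbol{\Sigma}\boldsymbol{1},\boldsymbol{1}^T\boldsymbol{\gamma},g_1;\Pi^*)\right]\\&=b_{0,i}+b_{1,i}s+b_{2,i}c^*\frac{f_{\mathrm{LSME},1}(s;\boldsymbol{1}^T\boldsymbol{\mu},\boldsymbol{1}^T\boldsymbol{\Sigma}\boldsymbol{1},\boldsymbol{1}^T\boldsymbol{\gamma},g_1;\Pi^*)}{f_S(s)}.\end{aligned}$$

    Inserting this into the portfolio risk decomposition (4.4) completes the proof:

    $$\begin{aligned}E(X_i\mid S>s_p)&=\int_{s_p}^{\infty}E(X_i\mid S=s)f(s\mid S>s_p)\,\mathrm{d}s=\int_{s_p}^{\infty}E(X_i\mid S=s)\frac{f_S(s)}{1-p}\,\mathrm{d}s\\&=\frac{1}{1-p}\int_{s_p}^{\infty}(b_{0,i}+b_{1,i}s)f_S(s)\,\mathrm{d}s+\frac{1}{1-p}b_{2,i}c^*\int_{s_p}^{\infty}f_{\mathrm{LSME},1}(s;\boldsymbol{1}^T\boldsymbol{\mu},\boldsymbol{1}^T\boldsymbol{\Sigma}\boldsymbol{1},\boldsymbol{1}^T\boldsymbol{\gamma},g_1;\Pi^*)\,\mathrm{d}s\\&=b_{0,i}+b_{1,i}E(S\mid S>s_p)+\frac{b_{2,i}}{1-p}c^*\overline{F}_{\mathrm{LSME},1}(s_p;\boldsymbol{1}^T\boldsymbol{\mu},\boldsymbol{1}^T\boldsymbol{\Sigma}\boldsymbol{1},\boldsymbol{1}^T\boldsymbol{\gamma},g_1;\Pi^*).\end{aligned} \tag{4.6}$$

    Notice that $\sum_{i=1}^{n}b_{0,i}=0$, $\sum_{i=1}^{n}b_{1,i}=1$ and $\sum_{i=1}^{n}b_{2,i}=0$, which can be used to verify that these contributions sum to $E(S\mid S>s_p)$.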
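
    The coefficients of Theorem 2 and their summation properties are easy to compute. The following is a minimal sketch with made-up inputs: the function names are hypothetical, and the TCE of $S$ and the tail term $c^*\overline{F}_{\mathrm{LSME},1}(s_p;\ldots;\Pi^*)/(1-p)$ are passed in as plain numbers rather than evaluated.

```python
def allocation_coefficients(mu, gamma, sigma):
    """Return the lists (b0, b1, b2) of Theorem 2 for each component i."""
    n = len(mu)
    row_sums = [sum(sigma[i][j] for j in range(n)) for i in range(n)]
    sigma_s2 = sum(row_sums)                      # 1^T Sigma 1
    sum_mu, sum_gamma = sum(mu), sum(gamma)
    b1 = [r / sigma_s2 for r in row_sums]
    b0 = [mu[i] - b1[i] * sum_mu for i in range(n)]
    b2 = [gamma[i] - b1[i] * sum_gamma for i in range(n)]
    return b0, b1, b2


def allocate(mu, gamma, sigma, tce_s, tail_term):
    """Contributions E(X_i | S > s_p) = b0 + b1 * TCE(S) + b2 * tail_term."""
    b0, b1, b2 = allocation_coefficients(mu, gamma, sigma)
    return [b0[i] + b1[i] * tce_s + b2[i] * tail_term for i in range(len(mu))]


if __name__ == "__main__":
    mu = [0.10, 0.05, 0.09]                       # hypothetical locations
    gamma = [-0.08, 0.01, -0.05]                  # hypothetical skewness vector
    sigma = [[2.0, 0.5, 0.3],
             [0.5, 1.5, 0.4],
             [0.3, 0.4, 1.0]]                     # hypothetical scale matrix
    parts = allocate(mu, gamma, sigma, tce_s=6.0, tail_term=1.7)
    print(parts, sum(parts))
```

    Because the $b_{0,i}$ and $b_{2,i}$ sum to zero and the $b_{1,i}$ sum to one, the contributions add up to the supplied TCE of $S$ whatever value is given for the tail term.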

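    Corollary 2. Let $\boldsymbol{X}=(X_1,X_2,\ldots,X_n)^T\sim\mathrm{GHE}_n(\boldsymbol{\mu},\boldsymbol{\Sigma},\boldsymbol{\gamma},g_n,\lambda,a,b)$. The portfolio risk decomposition with TCE for the $i$-th marginal variable is given by

    $$\begin{aligned}E(X_i\mid S>s_p)=\mu_i&+\frac{\sigma_Z^2\sum_{j=1}^{n}\sigma_{ij}}{1-p}\sqrt{\frac{a}{b}}\frac{K_{\lambda+1}(\sqrt{ab})}{K_\lambda(\sqrt{ab})}f_{\mathrm{GHE},1}(s_p;\boldsymbol{1}^T\boldsymbol{\mu},\boldsymbol{1}^T\boldsymbol{\Sigma}\boldsymbol{1},\boldsymbol{1}^T\boldsymbol{\gamma},\overline{G},\lambda+1,a,b)\\&+\frac{\gamma_i}{1-p}\sqrt{\frac{a}{b}}\frac{K_{\lambda+1}(\sqrt{ab})}{K_\lambda(\sqrt{ab})}\overline{F}_{\mathrm{GHE},1}(s_p;\boldsymbol{1}^T\boldsymbol{\mu},\boldsymbol{1}^T\boldsymbol{\Sigma}\boldsymbol{1},\boldsymbol{1}^T\boldsymbol{\gamma},g_1,\lambda+1,a,b).\end{aligned} \tag{4.7}$$

    Proof. By Proposition 1, $S\sim\mathrm{GHE}_1(\boldsymbol{1}^T\boldsymbol{\mu},\boldsymbol{1}^T\boldsymbol{\Sigma}\boldsymbol{1},\boldsymbol{1}^T\boldsymbol{\gamma},g_1,\lambda,a,b)$. Applying Corollary 1 to $S$, with $\sigma_S^2=\boldsymbol{1}^T\boldsymbol{\Sigma}\boldsymbol{1}$, the TCE of $S$ is given by

    $$\begin{aligned}E(S\mid S>s_p)=\boldsymbol{1}^T\boldsymbol{\mu}&+\frac{\sigma_S^2\sigma_Z^2}{1-p}\sqrt{\frac{a}{b}}\frac{K_{\lambda+1}(\sqrt{ab})}{K_\lambda(\sqrt{ab})}f_{\mathrm{GHE},1}(s_p;\boldsymbol{1}^T\boldsymbol{\mu},\boldsymbol{1}^T\boldsymbol{\Sigma}\boldsymbol{1},\boldsymbol{1}^T\boldsymbol{\gamma},\overline{G},\lambda+1,a,b)\\&+\frac{\boldsymbol{1}^T\boldsymbol{\gamma}}{1-p}\sqrt{\frac{a}{b}}\frac{K_{\lambda+1}(\sqrt{ab})}{K_\lambda(\sqrt{ab})}\overline{F}_{\mathrm{GHE},1}(s_p;\boldsymbol{1}^T\boldsymbol{\mu},\boldsymbol{1}^T\boldsymbol{\Sigma}\boldsymbol{1},\boldsymbol{1}^T\boldsymbol{\gamma},g_1,\lambda+1,a,b).\end{aligned}$$

    Therefore, by (4.5),

    $$\begin{aligned}E(X_i\mid S>s_p)&=\mu_i+\frac{b_{1,i}\sigma_S^2\sigma_Z^2}{1-p}\sqrt{\frac{a}{b}}\frac{K_{\lambda+1}(\sqrt{ab})}{K_\lambda(\sqrt{ab})}f_{\mathrm{GHE},1}(s_p;\boldsymbol{1}^T\boldsymbol{\mu},\boldsymbol{1}^T\boldsymbol{\Sigma}\boldsymbol{1},\boldsymbol{1}^T\boldsymbol{\gamma},\overline{G},\lambda+1,a,b)\\&\quad+\frac{\gamma_i}{1-p}\sqrt{\frac{a}{b}}\frac{K_{\lambda+1}(\sqrt{ab})}{K_\lambda(\sqrt{ab})}\overline{F}_{\mathrm{GHE},1}(s_p;\boldsymbol{1}^T\boldsymbol{\mu},\boldsymbol{1}^T\boldsymbol{\Sigma}\boldsymbol{1},\boldsymbol{1}^T\boldsymbol{\gamma},g_1,\lambda+1,a,b)\\&=\mu_i+\frac{\sigma_Z^2\sum_{j=1}^{n}\sigma_{ij}}{1-p}\sqrt{\frac{a}{b}}\frac{K_{\lambda+1}(\sqrt{ab})}{K_\lambda(\sqrt{ab})}f_{\mathrm{GHE},1}(s_p;\boldsymbol{1}^T\boldsymbol{\mu},\boldsymbol{1}^T\boldsymbol{\Sigma}\boldsymbol{1},\boldsymbol{1}^T\boldsymbol{\gamma},\overline{G},\lambda+1,a,b)\\&\quad+\frac{\gamma_i}{1-p}\sqrt{\frac{a}{b}}\frac{K_{\lambda+1}(\sqrt{ab})}{K_\lambda(\sqrt{ab})}\overline{F}_{\mathrm{GHE},1}(s_p;\boldsymbol{1}^T\boldsymbol{\mu},\boldsymbol{1}^T\boldsymbol{\Sigma}\boldsymbol{1},\boldsymbol{1}^T\boldsymbol{\gamma},g_1,\lambda+1,a,b).\end{aligned}$$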

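    The TV of the univariate elliptical distribution was introduced in [8]. From (3.1), we can write the TV for $X\mid\theta$ as

    $$\mathrm{TV}_p(X\mid\theta)=\mathrm{Var}(X\mid\theta)\left[r(\kappa(x_p;\theta))+h_{Z,Z^*}(\kappa(x_p;\theta))\left(\kappa(x_p;\theta)-h_{Z,Z^*}(\kappa(x_p;\theta))\sigma_Z^2\right)\right],$$

    where $\kappa(x;\theta)$ is the same as in (3.4),

    $$r(z)=\frac{\overline{F}_{Z^*}(z)}{\overline{F}_Z(z)}$$

    is the distorted ratio function, and

    $$h_{Z,Z^*}(z)=\frac{f_{Z^*}(z)}{\overline{F}_Z(z)}$$

    is the distorted hazard function.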

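    The TV can be rewritten as

    $$\mathrm{TV}_p(X)=\mathrm{Var}(X\mid X>x_p)=E\left[(X-\mathrm{TCE}_p(X))^2\mid X>x_p\right]=E(X^2\mid X>x_p)-[\mathrm{TCE}_p(X)]^2. \tag{5.1}$$

    Consequently, we need to derive the second-order conditional tail moment $E(X^2\mid X>x_p)$. Its analytic expression is given in the following result.

    Proposition 2. Assume a random variable $X\sim\mathrm{LSME}_1(\mu,\sigma^2,\gamma,g_1;\Pi)$. Let $\pi^*(\theta)=(c^*)^{-1}\theta\pi(\theta)$ and $\pi^{**}(\theta)=(c^{**})^{-1}\theta^2\pi(\theta)$ be two different mixing pdfs with $c^*=E(\Theta)<\infty$ and $c^{**}=E(\Theta^2)<\infty$, respectively. Then

    $$\begin{aligned}E(X^2\mid X>x_p)&=\frac{\sigma^2\sigma_Z^2}{1-p}\Big[(x_p+\mu)c^*f_{\mathrm{LSME},1}(x_p;\mu,\sigma^2,\gamma,\overline{G};\Pi^*)+\gamma c^{**}f_{\mathrm{LSME},1}(x_p;\mu,\sigma^2,\gamma,\overline{G};\Pi^{**})+c^*\overline{F}_{\mathrm{LSME},1}(x_p;\mu,\sigma^2,\gamma,\overline{G};\Pi^*)\Big]\\&\quad+\frac{\gamma}{1-p}\Big[2\mu c^*\overline{F}_{\mathrm{LSME},1}(x_p;\mu,\sigma^2,\gamma,g_1;\Pi^*)+\gamma c^{**}\overline{F}_{\mathrm{LSME},1}(x_p;\mu,\sigma^2,\gamma,g_1;\Pi^{**})\Big]+\mu^2,\end{aligned} \tag{5.2}$$

    where $\Pi^*$ and $\Pi^{**}$ are the cdfs corresponding to the pdfs $\pi^*$ and $\pi^{**}$, respectively.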

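    Proof. Observe that

    $$E(X^2\mid X>x_p)=\frac{1}{1-p}\int_{x_p}^{\infty}x^2f_X(x)\,\mathrm{d}x=\frac{1}{1-p}\int_0^\infty\int_{x_p}^{\infty}x^2f_{X\mid\theta}(x)\pi(\theta)\,\mathrm{d}x\,\mathrm{d}\theta=\frac{1}{1-p}\int_0^\infty E_{X\mid\theta}(X^2\mid X>x_p)\overline{F}_{X\mid\theta}(x_p)\pi(\theta)\,\mathrm{d}\theta.$$

    To deal with the second-order conditional tail moment inside the integral, we write it as

    $$E_{X\mid\theta}(X^2\mid X>x_p)=\mathrm{TV}_p(X\mid\theta)+[\mathrm{TCE}_p(X\mid\theta)]^2. \tag{5.3}$$

    From [17], we know that

    $$\mathrm{TCE}_p(X\mid\theta)=m(\theta)+h_{Z,Z^*}(\kappa(x_p;\theta))\frac{\mathrm{Var}(X\mid\theta)}{\sqrt{\theta}\sigma}.$$

    Taking $\mathrm{Var}(X\mid\theta)=\theta\sigma^2\sigma_Z^2$ into account and writing $h=h_{Z,Z^*}(\kappa(x_p;\theta))$ and $r=r(\kappa(x_p;\theta))$ for brevity, (5.3) becomes

    $$\begin{aligned}E_{X\mid\theta}(X^2\mid X>x_p)&=\mathrm{Var}(X\mid\theta)\left[r+h\left(\kappa(x_p;\theta)-h\sigma_Z^2\right)\right]+\left(m(\theta)+h\frac{\mathrm{Var}(X\mid\theta)}{\sqrt{\theta}\sigma}\right)^2\\&=\mathrm{Var}(X\mid\theta)\,r+\mathrm{Var}(X\mid\theta)\,h\frac{x_p-m(\theta)}{\sqrt{\theta}\sigma}-\mathrm{Var}(X\mid\theta)\,h^2\sigma_Z^2+m^2(\theta)+2m(\theta)h\frac{\mathrm{Var}(X\mid\theta)}{\sqrt{\theta}\sigma}+h^2\frac{\theta\sigma^2\sigma_Z^2\,\mathrm{Var}(X\mid\theta)}{\theta\sigma^2}\\&=m^2(\theta)+\mathrm{Var}(X\mid\theta)\left(r+h\frac{x_p+m(\theta)}{\sqrt{\theta}\sigma}\right).\end{aligned}$$

    As a result,

    $$\begin{aligned}E(X^2\mid X>x_p)&=\frac{1}{1-p}\int_0^\infty E_{X\mid\theta}(X^2\mid X>x_p)\overline{F}_{X\mid\theta}(x_p)\pi(\theta)\,\mathrm{d}\theta\\&=\frac{1}{1-p}\int_0^\infty\mathrm{Var}(X\mid\theta)\left[\frac{x_p+m(\theta)}{\sqrt{\theta}\sigma}h_{Z,Z^*}(\kappa(x_p;\theta))+r(\kappa(x_p;\theta))\right]\overline{F}_{X\mid\theta}(x_p)\pi(\theta)\,\mathrm{d}\theta+\frac{1}{1-p}\int_0^\infty m^2(\theta)\overline{F}_{X\mid\theta}(x_p)\pi(\theta)\,\mathrm{d}\theta\\&=\frac{1}{1-p}\int_0^\infty\mathrm{Var}(X\mid\theta)\left[\frac{x_p+m(\theta)}{\sqrt{\theta}\sigma}f_{Z^*}(\kappa(x_p;\theta))+\overline{F}_{Z^*}(\kappa(x_p;\theta))\right]\pi(\theta)\,\mathrm{d}\theta+\frac{1}{1-p}\int_0^\infty\overline{F}_{X\mid\theta}(x_p)\pi(\theta)\left(\mu^2+2\mu\theta\gamma+\theta^2\gamma^2\right)\mathrm{d}\theta\\&=\frac{\sigma^2\sigma_Z^2}{1-p}\Big[(x_p+\mu)c^*f_{\mathrm{LSME},1}(x_p;\mu,\sigma^2,\gamma,\overline{G};\Pi^*)+\gamma c^{**}f_{\mathrm{LSME},1}(x_p;\mu,\sigma^2,\gamma,\overline{G};\Pi^{**})+c^*\overline{F}_{\mathrm{LSME},1}(x_p;\mu,\sigma^2,\gamma,\overline{G};\Pi^*)\Big]\\&\quad+\frac{\gamma}{1-p}\Big[2\mu c^*\overline{F}_{\mathrm{LSME},1}(x_p;\mu,\sigma^2,\gamma,g_1;\Pi^*)+\gamma c^{**}\overline{F}_{\mathrm{LSME},1}(x_p;\mu,\sigma^2,\gamma,g_1;\Pi^{**})\Big]+\mu^2.\end{aligned}$$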

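    Theorem 3. Under the assumptions of Proposition 2, the TV of $X$ is given by

    $$\begin{aligned}\mathrm{TV}_p(X)&=\frac{\sigma^2\sigma_Z^2}{1-p}\Big[(x_p-\mu)c^*f_{\mathrm{LSME},1}(x_p;\mu,\sigma^2,\gamma,\overline{G};\Pi^*)+\gamma c^{**}f_{\mathrm{LSME},1}(x_p;\mu,\sigma^2,\gamma,\overline{G};\Pi^{**})+c^*\overline{F}_{\mathrm{LSME},1}(x_p;\mu,\sigma^2,\gamma,\overline{G};\Pi^*)\Big]\\&\quad+\frac{\gamma^2}{1-p}c^{**}\overline{F}_{\mathrm{LSME},1}(x_p;\mu,\sigma^2,\gamma,g_1;\Pi^{**})-\left(\frac{c^*}{1-p}\right)^2\Big[\gamma\overline{F}_{\mathrm{LSME},1}(x_p;\mu,\sigma^2,\gamma,g_1;\Pi^*)+\sigma^2\sigma_Z^2f_{\mathrm{LSME},1}(x_p;\mu,\sigma^2,\gamma,\overline{G};\Pi^*)\Big]^2,\end{aligned} \tag{5.4}$$

    where $\Pi^*$ and $\Pi^{**}$ are the cdfs corresponding to the pdfs $\pi^*$ and $\pi^{**}$, respectively.

    Proof. From Theorem 1, the TCE formula is

    $$E(X\mid X>x_p)=\mu+\gamma\frac{c^*}{1-p}\overline{F}_{\mathrm{LSME},1}(x_p;\mu,\sigma^2,\gamma,g_1;\Pi^*)+\frac{c^*\sigma^2\sigma_Z^2}{1-p}f_{\mathrm{LSME},1}(x_p;\mu,\sigma^2,\gamma,\overline{G};\Pi^*).$$

    Hence, the result follows from Proposition 2 and (5.1).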

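    Corollary 3. Let $X\sim\mathrm{GHE}_1(\mu,\sigma^2,\gamma,g_1,\lambda,a,b)$ and assume the conditions in Theorem 3 are satisfied. Then the TV of the GHE distribution can be computed by

    $$\begin{aligned}\mathrm{TV}_p(X)&=\frac{\sigma^2\sigma_Z^2}{1-p}\Bigg[(x_p-\mu)\sqrt{\frac{a}{b}}\frac{K_{\lambda+1}(\sqrt{ab})}{K_\lambda(\sqrt{ab})}f_{\mathrm{GHE},1}(x_p;\mu,\sigma^2,\gamma,\overline{G},\lambda+1,a,b)+\gamma\frac{a}{b}\frac{K_{\lambda+2}(\sqrt{ab})}{K_\lambda(\sqrt{ab})}f_{\mathrm{GHE},1}(x_p;\mu,\sigma^2,\gamma,\overline{G},\lambda+2,a,b)\\&\qquad+\sqrt{\frac{a}{b}}\frac{K_{\lambda+1}(\sqrt{ab})}{K_\lambda(\sqrt{ab})}\overline{F}_{\mathrm{GHE},1}(x_p;\mu,\sigma^2,\gamma,\overline{G},\lambda+1,a,b)\Bigg]+\frac{\gamma^2}{1-p}\frac{a}{b}\frac{K_{\lambda+2}(\sqrt{ab})}{K_\lambda(\sqrt{ab})}\overline{F}_{\mathrm{GHE},1}(x_p;\mu,\sigma^2,\gamma,g_1,\lambda+2,a,b)\\&\quad-\left(\frac{1}{1-p}\sqrt{\frac{a}{b}}\frac{K_{\lambda+1}(\sqrt{ab})}{K_\lambda(\sqrt{ab})}\right)^2\Big[\gamma\overline{F}_{\mathrm{GHE},1}(x_p;\mu,\sigma^2,\gamma,g_1,\lambda+1,a,b)+\sigma^2\sigma_Z^2f_{\mathrm{GHE},1}(x_p;\mu,\sigma^2,\gamma,\overline{G},\lambda+1,a,b)\Big]^2.\end{aligned} \tag{5.5}$$

    Proof. From the GIG density in (2.11), we conclude

    $$\theta\pi(\theta;\lambda,a,b)=\sqrt{\frac{a}{b}}\frac{K_{\lambda+1}(\sqrt{ab})}{K_\lambda(\sqrt{ab})}\,\pi(\theta;\lambda+1,a,b)$$

    and

    $$\theta^2\pi(\theta;\lambda,a,b)=\frac{a}{b}\frac{K_{\lambda+2}(\sqrt{ab})}{K_\lambda(\sqrt{ab})}\,\pi(\theta;\lambda+2,a,b).$$

    By setting

    $$c^*=\sqrt{\frac{a}{b}}\frac{K_{\lambda+1}(\sqrt{ab})}{K_\lambda(\sqrt{ab})},\quad c^{**}=\frac{a}{b}\frac{K_{\lambda+2}(\sqrt{ab})}{K_\lambda(\sqrt{ab})},$$

    the two pdfs can be written as

    $$\pi^*(\theta)=(c^*)^{-1}\theta\pi(\theta)=\pi(\theta;\lambda+1,a,b)$$

    and

    $$\pi^{**}(\theta)=(c^{**})^{-1}\theta^2\pi(\theta)=\pi(\theta;\lambda+2,a,b),$$

    which are again GIG pdfs. Using (5.4), we directly obtain (5.5).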

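    Example 6.1 (Generalized hyperbolic distribution). If $\boldsymbol{\mu}=\boldsymbol{0}$, $\boldsymbol{\Sigma}=\boldsymbol{I_n}$ and the density generator is $g(u)=e^{-u}$ in (2.2), then the elliptical vector $\boldsymbol{Y}$ has a multivariate normal distribution, denoted by $\boldsymbol{Y}\sim N_n(\boldsymbol{0},\boldsymbol{I_n})$. Letting $\boldsymbol{Y}\sim N_n(\boldsymbol{0},\boldsymbol{I_n})$ in (2.6), the random vector $\boldsymbol{X}\sim\mathrm{GH}_n(\boldsymbol{\mu},\boldsymbol{\Sigma},\boldsymbol{\gamma},\lambda,a,b)$ has an $n$-dimensional generalized hyperbolic (GH) distribution. The pdf of the GH distribution is (see [15])

    $$f_{\mathrm{GH}_n}(\boldsymbol{x};\boldsymbol{\mu},\boldsymbol{\Sigma},\boldsymbol{\gamma},\lambda,a,b)=c\,\frac{K_{\lambda-(n/2)}\left(\sqrt{\left(a+(\boldsymbol{x}-\boldsymbol{\mu})^T\boldsymbol{\Sigma}^{-1}(\boldsymbol{x}-\boldsymbol{\mu})\right)\left(b+\boldsymbol{\gamma}^T\boldsymbol{\Sigma}^{-1}\boldsymbol{\gamma}\right)}\right)e^{(\boldsymbol{x}-\boldsymbol{\mu})^T\boldsymbol{\Sigma}^{-1}\boldsymbol{\gamma}}}{\left(\left(a+(\boldsymbol{x}-\boldsymbol{\mu})^T\boldsymbol{\Sigma}^{-1}(\boldsymbol{x}-\boldsymbol{\mu})\right)\left(b+\boldsymbol{\gamma}^T\boldsymbol{\Sigma}^{-1}\boldsymbol{\gamma}\right)\right)^{\frac{n}{4}-\frac{\lambda}{2}}},$$

    where the normalizing constant is

    $$c=\frac{(ab)^{-\frac{\lambda}{2}}\,b^{\lambda}\left(b+\boldsymbol{\gamma}^T\boldsymbol{\Sigma}^{-1}\boldsymbol{\gamma}\right)^{(n/2)-\lambda}}{(2\pi)^{n/2}|\boldsymbol{\Sigma}|^{1/2}K_\lambda(\sqrt{ab})}.$$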

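    From Corollary 1, the TCE of the GH distribution is given by

    $$\mathrm{TCE}_p(X)=\mu+\frac{\gamma}{1-p}\sqrt{\frac{a}{b}}\frac{K_{\lambda+1}(\sqrt{ab})}{K_\lambda(\sqrt{ab})}\overline{F}_{\mathrm{GH},1}(x_p;\mu,\sigma^2,\gamma,\lambda+1,a,b)+\frac{\sigma^2}{1-p}\sqrt{\frac{a}{b}}\frac{K_{\lambda+1}(\sqrt{ab})}{K_\lambda(\sqrt{ab})}f_{\mathrm{GH},1}(x_p;\mu,\sigma^2,\gamma,\lambda+1,a,b).$$

    From Corollary 2, the portfolio risk decomposition with TCE for the $i$-th marginal of the GH distribution is given by

    $$E(X_i\mid S>s_p)=\mu_i+\frac{\sum_{j=1}^{n}\sigma_{ij}}{1-p}\sqrt{\frac{a}{b}}\frac{K_{\lambda+1}(\sqrt{ab})}{K_\lambda(\sqrt{ab})}f_{\mathrm{GH},1}(s_p;\boldsymbol{1}^T\boldsymbol{\mu},\boldsymbol{1}^T\boldsymbol{\Sigma}\boldsymbol{1},\boldsymbol{1}^T\boldsymbol{\gamma},\lambda+1,a,b)+\frac{\gamma_i}{1-p}\sqrt{\frac{a}{b}}\frac{K_{\lambda+1}(\sqrt{ab})}{K_\lambda(\sqrt{ab})}\overline{F}_{\mathrm{GH},1}(s_p;\boldsymbol{1}^T\boldsymbol{\mu},\boldsymbol{1}^T\boldsymbol{\Sigma}\boldsymbol{1},\boldsymbol{1}^T\boldsymbol{\gamma},\lambda+1,a,b).$$

    From Corollary 3, the TV of the GH distribution is given by

    $$\begin{aligned}\mathrm{TV}_p(X)&=\frac{\sigma^2}{1-p}\Bigg[(x_p-\mu)\sqrt{\frac{a}{b}}\frac{K_{\lambda+1}(\sqrt{ab})}{K_\lambda(\sqrt{ab})}f_{\mathrm{GH},1}(x_p;\mu,\sigma^2,\gamma,\lambda+1,a,b)+\gamma\frac{a}{b}\frac{K_{\lambda+2}(\sqrt{ab})}{K_\lambda(\sqrt{ab})}f_{\mathrm{GH},1}(x_p;\mu,\sigma^2,\gamma,\lambda+2,a,b)\\&\qquad+\sqrt{\frac{a}{b}}\frac{K_{\lambda+1}(\sqrt{ab})}{K_\lambda(\sqrt{ab})}\overline{F}_{\mathrm{GH},1}(x_p;\mu,\sigma^2,\gamma,\lambda+1,a,b)\Bigg]+\frac{\gamma^2}{1-p}\frac{a}{b}\frac{K_{\lambda+2}(\sqrt{ab})}{K_\lambda(\sqrt{ab})}\overline{F}_{\mathrm{GH},1}(x_p;\mu,\sigma^2,\gamma,\lambda+2,a,b)\\&\quad-\left(\frac{1}{1-p}\sqrt{\frac{a}{b}}\frac{K_{\lambda+1}(\sqrt{ab})}{K_\lambda(\sqrt{ab})}\right)^2\Big[\gamma\overline{F}_{\mathrm{GH},1}(x_p;\mu,\sigma^2,\gamma,\lambda+1,a,b)+\sigma^2f_{\mathrm{GH},1}(x_p;\mu,\sigma^2,\gamma,\lambda+1,a,b)\Big]^2.\end{aligned}$$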

    In this section we discuss the TV of five stocks (Amazon, Goldman Sachs, IBM, Google, and Apple) and of their aggregate portfolio over the period from 1 January 2015 to 1 January 2017. Ignatieva and Landsman [12] fitted a GH model to the five stocks and the aggregate portfolio and obtained the following parameter set by maximum likelihood.

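    $$\lambda=1.18336,\quad a=1.272016,\quad\psi=0.348483,\quad\boldsymbol{\mu}=\begin{pmatrix}0.09977\\0.04555\\0.09355\\0.03669\\0.10367\end{pmatrix},\quad\boldsymbol{\gamma}=\begin{pmatrix}0.08626\\0.00803\\0.07928\\0.05230\\0.08534\end{pmatrix},$$
    $$\boldsymbol{\Sigma}=\begin{pmatrix}3.387&1.407&1.103&1.828&1.354\\1.407&3.014&1.288&1.209&1.434\\1.103&1.288&1.870&1.061&1.155\\1.828&1.209&1.061&2.171&1.220\\1.354&1.434&1.155&1.220&2.891\end{pmatrix}.$$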

    For the risk analysis, we denote the five stocks by $X_1,\ldots,X_5$. For simplicity, we also consider the aggregate portfolio $S$ in which each stock has equal weight, so that $S=X_1+\cdots+X_5$. Figure 1 shows the densities of the five stocks $X_i$, $i=1,\ldots,5$, and of the aggregate portfolio $S$. The pdf of $S$ has the largest variance; among the five stocks, Amazon has the largest dispersion and IBM the smallest. Figure 2 presents the TV of the five stocks $X_i$, $i=1,\ldots,5$, and of the aggregate portfolio $S$. For every stock and for the portfolio, the TV increases with the quantile level. Figure 2 also shows how the TV differs across the five stocks and the aggregate portfolio: at the same quantile, Apple has the largest TV and IBM the smallest among the five stocks.

    Figure 1.  Densities for $X_i$, $i=1,\ldots,5$, and $S$ for GH.
    Figure 2.  TV for $X_i$, $i=1,\ldots,5$, and $S$ for GH.

    In this paper we generalize the tail risk measures and the portfolio risk decomposition with TCE derived in [15] for the class of multivariate normal mean-variance mixture distributions to the larger class of multivariate location-scale mixtures of elliptical distributions. A prominent member of the normal mean-variance mixture class is the generalized hyperbolic (GH) distribution, which can itself be used to construct a Lévy process. The GH distribution is a special case of the normal mean-variance mixture with $\boldsymbol{Y}\sim N_n(\boldsymbol{0},\boldsymbol{I_n})$ and $\Theta$ following a three-parameter generalized inverse Gaussian (GIG) distribution (see [12,15] for details). A prominent member of the class of location-scale mixtures of elliptical distributions is the generalized hyper-elliptical (GHE) distribution. The GHE distribution provides an excellent fit to univariate and multivariate data and can capture a long right tail in the distribution of losses even more effectively than the GH distribution considered in [12]. The GHE distribution is the special case of the location-scale mixture of elliptical distributions with $\boldsymbol{Y}\sim E_n(\boldsymbol{0},\boldsymbol{I_n},g_n)$ and $\Theta$ following a three-parameter GIG distribution. Although the univariate TCE and the portfolio risk decomposition with TCE for the GHE class were available in [13], they can be derived more efficiently as special cases of Theorems 1 and 2, respectively. Similarly, the univariate TV formula for the GHE class follows as a special case of Theorem 3.

    The research was supported by the National Natural Science Foundation of China (No. 12071251).

    The authors declare no conflict of interest.



    [1] L. A. Torre, F. Bray, R. L. Siegel, J. Ferlay, J. Lortet-Tieulent, A. Jemal, Global cancer statistics, 2012, CA Cancer J. Clin., 65 (2015), 87–108. https://doi.org/10.3322/caac.21262
    [2] M. F. Akay, Support vector machines combined with feature selection for breast cancer diagnosis, Expert Syst. Appl., 36 (2009), 3240–3247. https://doi.org/10.1016/j.eswa.2008.01.009
    [3] R. L. Siegel, K. D. Miller, A. Jemal, Cancer statistics, 2018, CA Cancer J. Clin., 68 (2018), 7–30. https://doi.org/10.3322/caac.21442
    [4] L. Peng, W. Chen, W. Zhou, F. Li, J. Yang, J. Zhang, An immune-inspired semi-supervised algorithm for breast cancer diagnosis, Comput. Methods Programs Biomed., 134 (2016), 259–265. https://doi.org/10.1016/j.cmpb.2016.07.020
    [5] H. L. Chen, B. Yang, J. Liu, D. Y. Liu, A support vector machine classifier with rough set-based feature selection for breast cancer diagnosis, Expert Syst. Appl., 38 (2011), 9014–9022. https://doi.org/10.1016/j.eswa.2011.01.120
    [6] J. B. Li, Y. Peng, D. Liu, Quasiconformal kernel common locality discriminant analysis with application to breast cancer diagnosis, Inf. Sci., 223 (2013), 256–269. https://doi.org/10.1016/j.ins.2012.10.016
    [7] B. Zheng, S. W. Yoon, S. S. Lam, Breast cancer diagnosis based on feature extraction using a hybrid of K-means and support vector machine algorithms, Expert Syst. Appl., 4 (2014), 1476–1482. https://doi.org/10.1016/j.eswa.2013.08.044
    [8] F. Gorunescu, S. Belciug, Evolutionary strategy to develop learning-based decision systems. Application to breast cancer and liver fibrosis stadialization, J. Biomed. Inform., 49 (2014), 112–118. https://doi.org/10.1016/j.jbi.2014.02.001
    [9] M. Karabatak, A new classifier for breast cancer detection based on Naive Bayesian, Meas., 72 (2015), 32–36. https://doi.org/10.1016/j.measurement.2015.04.028
    [10] R. Sheikhpour, M. A. Sarram, R. Sheikhpour, Particle swarm optimization for bandwidth determination and feature selection of kernel density estimation based classifiers in diagnosis of breast cancer, Appl. Soft Comput., 40 (2016), 113–131. https://doi.org/10.1016/j.asoc.2015.10.005
    [11] M. F. Ijaz, M. Attique, Y. Son, Data-driven cervical cancer prediction model with outlier detection and over-sampling methods, Sensors, 20 (2020), 2809. https://doi.org/10.3390/s20102809
    [12] M. Mandal, P. K. Singh, M. F. Ijaz, J. Shafi, R. Sarkar, A Tri-Stage Wrapper-Filter Feature Selection Framework for Disease Classification, Sensors, 21 (2021), 5571. https://doi.org/10.3390/s21165571
    [13] H. Patel, G. S. Thakur, Classification of imbalanced data using a modified fuzzy-neighbor weighted approach, Int. J. Intell. Eng. Syst., 10 (2017), 56–64. https://doi.org/10.22266/ijies2017.0228.07
    [14] W. C. Lin, C. F. Tsai, Y. H. Hu, J. S. Jhang, Clustering-based undersampling in class-imbalanced data, Inf. Sci., 409 (2017), 17–26. https://doi.org/10.1016/j.ins.2017.05.008
    [15] P. D. Turney, Cost-sensitive classification: Empirical evaluation of a hybrid genetic decision tree induction algorithm, J. Artif. Intell. Res., 2 (1994), 369–409. https://doi.org/10.1613/jair.120
    [16] H. E. Kiziloz, Classifier ensemble methods in feature selection, Neurocomputing, 419 (2021), 97–107. https://doi.org/10.1016/j.neucom.2020.07.113
    [17] M. Galar, A. Fernández, E. Barrenechea, H. Bustince, F. Herrera, Ordering-based pruning for improving the performance of ensembles of classifiers in the framework of imbalanced datasets, Inf. Sci., 354 (2016), 178–196. https://doi.org/10.1016/j.ins.2016.02.056
    [18] J. Zhang, L. Chen, J. Tian, F. Abid, W. Yang, X. Tang, Breast cancer diagnosis using cluster-based undersampling and boosted C5.0 algorithm, Int. J. Control Autom. Syst., 19 (2021), 1998–2008. https://doi.org/10.1007/s12555-019-1061-x
    [19] Z. Zheng, X. Wu, R. Srihari, Feature selection for text categorization on imbalanced data, ACM Sigkdd Explor. Newsl., 6 (2004), 80–89. https://doi.org/10.1145/1007730.1007741
    [20] S. Punitha, F. Al-Turjman, T. Stephan, An automated breast cancer diagnosis using feature selection and parameter optimization in ANN, Comput. Electr. Eng., 90 (2021), 106958. https://doi.org/10.1016/j.compeleceng.2020.106958
    [21] P. N. Srinivasu, J. G. SivaSai, M. F. Ijaz, A. K. Bhoi, W. Kim, J. J. Kang, Classification of skin disease using deep learning neural networks with MobileNet V2 and LSTM, Sensors, 21 (2021), 2852. https://doi.org/10.3390/s21082852
    [22] H. Naeem, A. A. Bin-Salem, A CNN-LSTM network with multi-level feature extraction-based approach for automated detection of coronavirus from CT scan and X-ray images, Appl. Soft Comput., 113 (2021), 107918. https://doi.org/10.1016/j.asoc.2021.107918
    [23] P. Huang, Q. Ye, F. Zhang, G. Yang, W. Zhu, Z. Yang, Double L2, p-norm based PCA for feature extraction, Inf. Sci., 573 (2021), 345–359. https://doi.org/10.1016/j.ins.2021.05.079
    [24] H. D. Cheng, X. J. Shi, R. Min, L. M. Hu, X. P. Cai, H. N. Du, Approaches for automated detection and classification of masses in mammograms, Pattern Recognit., 4 (2006), 646–668. https://doi.org/10.1016/j.patcog.2005.07.006
    [25] T. Raeder, G. Forman, N. V. Chawla, Learning from imbalanced data: Evaluation matters, in Data mining: Foundations and intelligent paradigms, Springer, (2012), 315–331. https://doi.org/10.1007/978-3-641-23166-7_12
    [26] S. Piri, D. Delen, T. Liu, A synthetic informative minority over-sampling (SIMO) algorithm leveraging support vector machine to enhance learning from imbalanced datasets, Decis. Support Syst., 106 (2018), 15–29. https://doi.org/10.1016/j.dss.2017.11.006
    [27] C. Seiffert, T. M. Khoshgoftaar, J. Van. Hulse, A. Napolitano, RUSBoost: A hybrid approach to alleviating class imbalance, IEEE Trans. Syst. Man Cybern. Part A: Syst. Hum., 40 (2009), 185–197. https://doi.org/10.1109/tsmca.2009.2029559
    [28] N. Liu, E. S. Qi, M. Xu, B. Gao, G. Q. Liu, A novel intelligent classification model for breast cancer diagnosis, Inf. Process. Manage., 56 (2019), 609–623. https://doi.org/10.1016/j.ipm.2018.10.014
    [29] S. Wang, Y. Wang, D. Wang, Y. Yin, Y. Wang, Y. Jin, An improved random forest-based rule extraction method for breast cancer diagnosis, Appl. Soft Comput., 86 (2020), 105941. https://doi.org/10.1016/j.asoc.2019.105941
    [30] H. Wang, B. Zheng, S. W. Yoon, H. S. Ko, A support vector machine-based ensemble algorithm for breast cancer diagnosis, Eur. J. Oper. Res., 267 (Year), 687–699. https://doi.org/10.1016/j.ejor.2017.12.001
    [31] L. Breiman, Bagging predictors, Mach. Learn., 24 (1996), 123–140. https://doi.org/10.1007/BF00058655
    [32] A. Taherkhani, G. Cosma, T. M. McGinnity, AdaBoost-CNN: An adaptive boosting algorithm for convolutional neural networks to classify multi-class imbalanced datasets using transfer learning, Neurocomputing, 404 (2020), 351–366. https://doi.org/10.1016/j.neucom.2020.03.064
  • © 2022 the Author(s), licensee AIMS Press. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0)