Research article

A discrete extension of the Burr-Hatke distribution: Generalized hypergeometric functions, different inference techniques, simulation ranking with modeling and analysis of sustainable count data

  • The intertwining relationship between sustainability and discrete probability distributions found its significance in decision-making processes and risk assessment frameworks. Count data modeling and its practical applications have gained attention in numerous research studies. This investigation focused on a particular discrete distribution characterized by a single parameter obtained through the survival discretization method. Statistical attributes of this distribution were accurately explicated using generalized hypergeometric functions. The unveiled characteristics highlighted its suitability for analyzing data displaying "right-skewed" asymmetry and possessing extended "heavy" tails. Its failure rate function effectively addressed scenarios marked by a consistent decrease in rates. Furthermore, it proved to be a valuable tool for probabilistic modeling of over-dispersed data. The study introduced various estimation methods such as maximum product of spacings, Anderson-Darling, right-tail Anderson-Darling, maximum likelihood, least-squares, weighted least-squares, percentile, and Cramer-Von-Mises, offering comprehensive explanations. A ranking simulation study was conducted to evaluate the performance of these estimators, employing ranking techniques to identify the most effective estimator across different sample sizes. Finally, real-world sustainability engineering and medical datasets were analyzed to demonstrate the significance and application of the newly introduced model.

    Citation: Khaled M. Alqahtani, Mahmoud El-Morshedy, Hend S. Shahen, Mohamed S. Eliwa. A discrete extension of the Burr-Hatke distribution: Generalized hypergeometric functions, different inference techniques, simulation ranking with modeling and analysis of sustainable count data[J]. AIMS Mathematics, 2024, 9(4): 9394-9418. doi: 10.3934/math.2024458




    In the analysis of real-world sustainability data, it is common to utilize continuous random distributions like the Burr-Hatke exponential (BHE) distribution. However, there are instances where the measurement of lifetimes is discrete, such as recording survival time in months or weeks. In such cases, employing a discrete random variable is more suitable. Additionally, practical problems in engineering and applied sciences often involve count phenomena, like the number of earthquakes in a year, accidents at a location, doctor visits, or insurance claims. Despite the availability of various established discrete models, there is a continued need for more flexible distributions that can effectively capture the diverse characteristics of sustainability datasets. This includes factors like asymmetry, under or over-dispersion, and variations in the failure rate function. Recognizing the significance of discrete probability models in our previous survey, we have developed and extensively explored a discrete probability distribution. This new model serves as the discrete counterpart to the BHE distribution. The BHE distribution has gained widespread utility in reliability analysis, survival modeling, and risk assessment due to its versatility in capturing diverse data patterns. Known for its flexibility in modeling right-skewed and heavy-tailed data, the BHE model is well-suited for characterizing a broad spectrum of real-world phenomena. Its adaptability extends to applications in survival analysis, providing a valuable tool for researchers to effectively model complex datasets and gain a deeper understanding of the underlying mechanisms governing observed events. For additional information and in-depth details about the BHE distribution, please refer to the citation labeled as [1]. 
    A random variable X follows the BHE distribution if its survival function (SF) and probability density function (PDF) are

    $$S(x;\lambda)=\frac{e^{-\lambda x}}{1+\lambda x};\quad \lambda>0,\ x>0, \tag{1.1}$$

    and

    $$g(x;\lambda)=\frac{\lambda e^{-\lambda x}(2+\lambda x)}{(1+\lambda x)^{2}};\quad \lambda>0,\ x>0, \tag{1.2}$$

    respectively, where λ>0 is a scale parameter. In accordance with survival discretization techniques, one can derive a discrete BHE (DBHE) distribution. Survival discretization techniques are a set of statistical methods used to transform continuous probability distributions, such as the BHE distribution, into discrete versions suitable for practical applications. These techniques are particularly valuable when dealing with real-world data, which is often recorded in discrete units or intervals. By means of this process, the probability mass function can be obtained as

    $$\Pr(X=x;\cdot)=S(x;\cdot)-S(x+1;\cdot);\quad x=0,1,2,3,\ldots \tag{1.3}$$

    Several discrete distributions have been suggested and examined, utilizing the discrete survival function and other techniques as a foundation, including: Discrete Burr-Hatke [2], discrete linear exponential [3], discrete Pareto [4], discrete inverse Rayleigh [5], discrete inverse Weibull [6], discrete Lindley [7], new discrete extended Weibull [8], discrete generalized geometric [9], discrete Gompertz [10], discrete generalized exponential type II [11], an overview of discrete models for fitting COVID-19 datasets [12], discrete Ramos-Louzada [13], discrete generalized Rayleigh [14], and discrete Marshall-Olkinin [15], as well as the references cited within.

    The structure of this article is as follows: In Section 2, we introduce the DBHE distribution, developed through the survival discretization approach. Section 3 explores a range of statistical properties. Section 4 delves into the estimation of distribution parameters using various methods. In Section 5, we present a comprehensive simulation study based on ranking techniques. Section 6 demonstrates the versatility of the DBHE distribution by analyzing different datasets. Finally, Section 7 offers concluding remarks summarizing the findings presented in this paper.

    Using Eqs (1.1) and (1.3), the SF for the DBHE distribution is expressed as

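    $$S(x;\beta)=\frac{\beta^{x+1}}{1-(x+1)\ln\beta};\quad x\in\mathbb{N}_0, \tag{2.1}$$

    where $0<\beta=e^{-\lambda}<1$ and $\mathbb{N}_0=\{0,1,2,3,\ldots\}$. The behavior of the SF is described by

    $$S(x;\beta)=\begin{cases}\dfrac{\beta}{1-\ln\beta}; & x=0,\\[4pt] 1; & \beta\to 1.\end{cases} \tag{2.2}$$

    The associated cumulative distribution function (CDF) and probability mass function (PMF) for (2.1) can be formulated as follows:

    $$F(x;\beta)=1-\frac{\beta^{x+1}}{1-(x+1)\ln\beta};\quad x\in\mathbb{N}_0, \tag{2.3}$$

    and

    $$\Pr(X=x;\beta)=\beta^{x}\left[\frac{1}{1-x\ln\beta}-\frac{\beta}{1-(x+1)\ln\beta}\right];\quad x\in\mathbb{N}_0, \tag{2.4}$$

    respectively, where β controls the shape of the distribution. The behavior of the PMF is given by

    $$\Pr(X=x;\beta)=\begin{cases}1-\dfrac{\beta}{1-\ln\beta}; & x=0,\\[4pt] 0; & \beta\to 1.\end{cases} \tag{2.5}$$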
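    As an illustration, Eqs (2.1), (2.3) and (2.4) can be coded directly. The Python sketch below is not part of the original study (which relied on R and Maple); the function names `dbhe_sf`, `dbhe_cdf` and `dbhe_pmf` are hypothetical and chosen only for this example.

```python
import math


def dbhe_sf(x: int, beta: float) -> float:
    """Survival function of Eq (2.1): beta^(x+1) / (1 - (x+1) ln beta)."""
    return beta ** (x + 1) / (1.0 - (x + 1) * math.log(beta))


def dbhe_cdf(x: int, beta: float) -> float:
    """CDF of Eq (2.3), i.e. one minus the survival function."""
    return 1.0 - dbhe_sf(x, beta)


def dbhe_pmf(x: int, beta: float) -> float:
    """PMF of Eq (2.4)."""
    log_b = math.log(beta)
    return beta ** x * (1.0 / (1.0 - x * log_b) - beta / (1.0 - (x + 1) * log_b))


# Sanity check: the probabilities should add up to (numerically) one.
total_mass = sum(dbhe_pmf(x, 0.5) for x in range(200))
print(total_mass)
```

    Because the PMF is a difference of consecutive survival values, partial sums telescope, which gives a cheap consistency check between `dbhe_pmf` and `dbhe_cdf`.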

    Figure 1 displays the PMF plots for different values of the parameter β.

    Figure 1.  The PMF plots of the DBHE model.

    It is worth noting that the PMF is highly effective for modeling unimodal-shaped data. Furthermore, it can also be applied to analyze asymmetric "positively-skewed" data, showcasing its versatility in capturing various data patterns. The hazard rate function (HRF) can be formulated as

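    $$h(x;\beta)=1-\frac{\beta(1-x\ln\beta)}{1-(x+1)\ln\beta};\quad x\in\mathbb{N}_0. \tag{2.6}$$

    The reversed hazard rate function (RHRF) is expressed as follows:

    $$r(x;\beta)=\frac{\beta^{x}\left[\dfrac{1}{1-x\ln\beta}-\dfrac{\beta}{1-(x+1)\ln\beta}\right]}{1-\dfrac{\beta^{x+1}}{1-(x+1)\ln\beta}};\quad x\in\mathbb{N}_0. \tag{2.7}$$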

    The hazard rate is a measure of an item's death rate at a specific age x and is a component of the broader hazard function equation. This equation evaluates the probability that an item, having survived up to a certain time t, will continue to endure beyond that point. In essence, it quantifies the likelihood that an item surviving one moment will persist to the next one. The hazard rate is particularly relevant to non-repairable items and is sometimes referred to as the failure rate. Its significance extends to the design of secure systems in various domains such as commerce, engineering, finance, insurance, and regulatory industries. It can be expressed as a ratio of probability density to its corresponding survival function. Conversely, the reversed hazard rate of a random life is defined as the ratio between the life probability density and its distribution function. This concept holds significance in the analysis of censored data and finds applications in fields such as forensic sciences. Figure 2 illustrates the HRF and RHRF plots for varying values of the parameter β.

    Figure 2.  The HRF and RHRF of the DBHE distribution.

    A decreasing HRF and RHRF has implications across several disciplines. In reliability engineering, it corresponds to failure rates that fall over time; in healthcare, to improving survival probabilities; in finance, to decreasing default probabilities; in environmental sciences, to slowing environmental degradation; in manufacturing, to improving product quality; and in public policy, it informs safety measures and disaster preparedness. These examples highlight the role of hazard rate modeling in risk assessment and reliability analysis.
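    The decreasing shape of both functions can be checked numerically from Eqs (2.6) and (2.7). The following Python sketch is only illustrative; the helper names `dbhe_hrf` and `dbhe_rhrf` and the choice β = 0.5 are assumptions made for this example.

```python
import math


def dbhe_hrf(x: int, beta: float) -> float:
    """Hazard rate of Eq (2.6)."""
    log_b = math.log(beta)
    return 1.0 - beta * (1.0 - x * log_b) / (1.0 - (x + 1) * log_b)


def dbhe_rhrf(x: int, beta: float) -> float:
    """Reversed hazard rate of Eq (2.7): PMF divided by CDF."""
    log_b = math.log(beta)
    pmf = beta ** x * (1.0 / (1.0 - x * log_b) - beta / (1.0 - (x + 1) * log_b))
    cdf = 1.0 - beta ** (x + 1) / (1.0 - (x + 1) * log_b)
    return pmf / cdf


hrf_values = [dbhe_hrf(x, 0.5) for x in range(10)]
rhrf_values = [dbhe_rhrf(x, 0.5) for x in range(10)]
print(hrf_values)
print(rhrf_values)
```

    From Eq (2.6), the hazard rate decreases towards 1 − β as x grows, so larger β gives a lower long-run failure rate.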

    The moment generating function (MGF) and cumulant generating function (CGF) are essential tools in probability theory and statistics, offering valuable insights and advantages in various aspects of statistical analysis and probability modeling. Consider X as a random variable conforming to the DBHE distribution. The MGF, denoted as $\Pi_X(t)$, and the CGF, denoted as $K_X(t)$, can be represented in terms of generalized hypergeometric functions as follows:

    $$\Pi_X(t;\beta)=\sum_{x=0}^{\infty}e^{tx}\Pr(X=x;\beta)=\left(1-\frac{\beta}{1-\ln\beta}\right)\operatorname{hypergeom}\left([1,\lambda_1,\lambda_2],[\lambda_3,\lambda_4],e^{t}\beta\right), \tag{3.1}$$

    and

    $$K_X(t;\beta)=\ln\left(\Pi_X(t;\beta)\right)=\ln\left(1-\frac{\beta}{1-\ln\beta}\right)+\ln\left(\operatorname{hypergeom}\left([1,\lambda_1,\lambda_2],[\lambda_3,\lambda_4],e^{t}\beta\right)\right), \tag{3.2}$$

    where $\lambda_1=-\frac{1}{\ln\beta}$, $\lambda_2=\frac{(\beta-2)\ln\beta+1-\beta}{(-1+\beta)\ln\beta}$, $\lambda_3=\frac{-1+2\ln\beta}{\ln\beta}$, and $\lambda_4=\frac{1-\ln\beta-\beta}{(-1+\beta)\ln\beta}$. The equation represented by (3.1) can be derived using the Maple software, utilizing the hypergeom(.) function, which is a generalized hypergeometric function. This mathematical function finds applications across diverse fields such as complex analysis, differential equations, and statistical mechanics. Renowned for its role as a solution to the hypergeometric differential equation, it is extensively employed in expressing solutions to problems characterized by symmetry, particularly those featuring spherical or cylindrical symmetry. The first four moments of the DBHE distribution can be formulated as follows:

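    $$E(X)=A\,\operatorname{hypergeom}([2,B,C],[D,E],\beta), \tag{3.3}$$
    $$E(X^{2})=A\,\operatorname{hypergeom}([2,2,B,C],[1,D,E],\beta), \tag{3.4}$$
    $$E(X^{3})=A\,\operatorname{hypergeom}([2,2,2,B,C],[1,1,D,E],\beta), \tag{3.5}$$

    and

    $$E(X^{4})=A\,\operatorname{hypergeom}([2,2,2,2,B,C],[1,1,1,D,E],\beta), \tag{3.6}$$

    where $A=\frac{\beta[(-2+\beta)\ln\beta+1-\beta]}{1+2(\ln\beta)^{2}-3\ln\beta}$, $B=\frac{-1+\ln\beta}{\ln\beta}$, $C=\frac{(-3+2\beta)\ln\beta+1-\beta}{(-1+\beta)\ln\beta}$, $D=\frac{-1+3\ln\beta}{\ln\beta}$, $E=\frac{(-2+\beta)\ln\beta+1-\beta}{(-1+\beta)\ln\beta}$. Let $n=[n_1,n_2,\ldots]$, $p=\operatorname{nops}(n)$, $d=[d_1,d_2,\ldots]$, and $q=\operatorname{nops}(d)$. The hypergeom(n,d,z) calling sequence is the generalized hypergeometric function F(n,d,z). This function is frequently denoted by ${}_pF_q(n,d,z)$. For the variable z, the ${}_pF_q(n,d,z)$ can be formulated as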

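    $${}_pF_q(n,d,z)=\sum_{k=0}^{\infty}\frac{z^{k}\,a(n_i,k)}{k!\,b(d_j,k)},$$

    where

    $$a(n_i,k)=\prod_{i=1}^{p}\operatorname{pochhammer}(n_i,k)\quad\text{and}\quad b(d_j,k)=\prod_{j=1}^{q}\operatorname{pochhammer}(d_j,k).$$

    The Pochhammer symbol can be listed as

    $$\operatorname{pochhammer}(z,n)=z(z+1)\cdots(z+n-1).$$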
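    To make the series definition concrete, the sketch below sums the hypergeometric series term by term and uses it to evaluate Eq (3.3), comparing the result with a direct summation of x·Pr(X=x). It is a minimal pure-Python illustration, not the Maple routine used in the study; the function names `pfq`, `dbhe_mean_hypergeom` and `dbhe_mean_direct`, the truncation limits and the tolerance are all assumptions of this example.

```python
import math


def pfq(n, d, z, tol=1e-15, max_terms=10_000):
    """Truncated series for pFq(n; d; z), built from the Pochhammer term ratio."""
    term, total = 1.0, 1.0
    for k in range(max_terms):
        ratio = z / (k + 1)          # z^(k+1)/(k+1)! relative to z^k/k!
        for n_i in n:
            ratio *= n_i + k         # pochhammer(n_i, k+1) / pochhammer(n_i, k)
        for d_j in d:
            ratio /= d_j + k         # pochhammer(d_j, k+1) / pochhammer(d_j, k)
        term *= ratio
        total += term
        if abs(term) < tol * abs(total):
            break
    return total


def dbhe_mean_hypergeom(beta):
    """E(X) from Eq (3.3) with the constants A, B, C, D, E defined after Eq (3.6)."""
    L = math.log(beta)
    A = beta * ((-2 + beta) * L + 1 - beta) / (1 + 2 * L ** 2 - 3 * L)
    B = (-1 + L) / L
    C = ((-3 + 2 * beta) * L + 1 - beta) / ((-1 + beta) * L)
    D = (-1 + 3 * L) / L
    E = ((-2 + beta) * L + 1 - beta) / ((-1 + beta) * L)
    return A * pfq([2, B, C], [D, E], beta)


def dbhe_mean_direct(beta, x_max=2000):
    """E(X) by summing x * Pr(X = x) from Eq (2.4) up to a large cut-off."""
    L = math.log(beta)
    return sum(x * beta ** x * (1 / (1 - x * L) - beta / (1 - (x + 1) * L))
               for x in range(1, x_max))


print(dbhe_mean_hypergeom(0.5), dbhe_mean_direct(0.5))
```

    The series converges for |z| < 1, which covers z = β in Eqs (3.3)–(3.6); convergence slows as β approaches 1, so a production implementation would normally rely on a dedicated special-function library.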

    For additional information, please refer to the Maple software's library. Using Eqs 3.3–3.6, the variance, skewness and kurtosis can be derived as

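    $$\operatorname{Var}(X)=E(X^{2})-[E(X)]^{2}, \tag{3.7}$$
    $$\operatorname{skewness}(X)=\frac{E(X^{3})-3E(X^{2})E(X)+2[E(X)]^{3}}{[\operatorname{Var}(X)]^{3/2}}, \tag{3.8}$$

    and

    $$\operatorname{kurtosis}(X)=\frac{E(X^{4})-4E(X)E(X^{3})+6E(X^{2})[E(X)]^{2}-3[E(X)]^{4}}{[\operatorname{Var}(X)]^{2}}. \tag{3.9}$$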

    Table 1 provides a compilation of numerical descriptive measures that serve as valuable tools for gaining insights into the attributes of the DBHE distribution. These measures aid researchers and analysts in comprehending aspects like central tendency, variability, shape, and other critical properties. The choice of which measures to emphasize may vary depending on the specific analysis and application.

    Table 1.  Numerical descriptors for characterizing the DBHE distribution.
    Measure β 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9
    Mean 0.0322 0.0876 0.1702 0.2906 0.4697 0.7499 1.2305 2.2094 5.1776
    Var 0.0350 0.1062 0.2299 0.4458 0.8399 1.6249 3.4397 8.9575 40.5734
    Skewness 6.3536 4.3745 3.5841 3.1582 2.9003 2.7370 2.6342 2.5731 2.5420
    Kurtosis 49.4831 27.1804 20.3660 17.1898 15.4389 14.4014 13.7775 13.4179 13.2383


    Table 1 shows that, as β approaches 1, the mean and variance of the DBHE distribution increase, whereas the skewness and kurtosis decrease. Moreover, the model can represent data that are positively skewed and leptokurtic. A leptokurtic distribution has a kurtosis larger than that of the normal distribution (which equals 3), meaning heavier tails and a sharper peak, so extreme values occur more often than under a normal distribution.
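    The descriptive measures in Eqs (3.7)–(3.9) can also be approximated without hypergeometric functions by summing the raw moments directly. The Python sketch below takes that alternative route; the names `raw_moment` and `descriptors` and the cut-off `x_max` are assumptions of this example, and the truncation is only adequate because the PMF decays geometrically.

```python
import math


def dbhe_pmf(x: int, beta: float) -> float:
    """PMF of Eq (2.4)."""
    log_b = math.log(beta)
    return beta ** x * (1.0 / (1.0 - x * log_b) - beta / (1.0 - (x + 1) * log_b))


def raw_moment(r: int, beta: float, x_max: int = 5000) -> float:
    """Truncated sum approximating E(X^r)."""
    return sum(x ** r * dbhe_pmf(x, beta) for x in range(1, x_max))


def descriptors(beta: float):
    """Mean, variance, skewness and kurtosis following Eqs (3.7)-(3.9)."""
    m1, m2, m3, m4 = (raw_moment(r, beta) for r in (1, 2, 3, 4))
    var = m2 - m1 ** 2
    skew = (m3 - 3 * m2 * m1 + 2 * m1 ** 3) / var ** 1.5
    kurt = (m4 - 4 * m1 * m3 + 6 * m2 * m1 ** 2 - 3 * m1 ** 4) / var ** 2
    return m1, var, skew, kurt


mean, var, skew, kurt = descriptors(0.5)
print(mean, var, skew, kurt)
```

    For β = 0.5 the mean and variance obtained this way can be compared with the corresponding column of Table 1.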

    The index of dispersion (IOD), defined as the ratio of the variance to the mean, measures dispersion relative to the mean, while the coefficient of variation (COV), the ratio of the standard deviation to the mean, gauges the relative spread. Both metrics are valuable across fields such as epidemiology, finance, and quality control, where understanding data variability is crucial for decision-making. An IOD below 1 indicates under-dispersion (variance smaller than the mean), an IOD above 1 indicates over-dispersion (variance larger than the mean), and an IOD equal to 1 indicates equi-dispersion, as in the Poisson model. A low COV indicates small variability relative to the mean, while a high COV indicates large relative variability. Consider X as a random variable conforming to the DBHE distribution, then the IOD and the COV can be formulated as

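    $$\operatorname{IOD}(X;\beta)=\frac{\operatorname{hypergeom}([2,2,B,C],[1,D,E],\beta)}{\operatorname{hypergeom}([2,B,C],[D,E],\beta)}-A\,\operatorname{hypergeom}([2,B,C],[D,E],\beta), \tag{3.10}$$

    and

    $$\operatorname{COV}(X;\beta)=\sqrt{\frac{\operatorname{hypergeom}([2,2,B,C],[1,D,E],\beta)}{A\left(\operatorname{hypergeom}([2,B,C],[D,E],\beta)\right)^{2}}-1}. \tag{3.11}$$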

    The IOD and COV of the DBHE distribution are reported in Table 2.

    Table 2.  The IOD and COV of the DBHE distribution.
    Measure β 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9
    IOD 1.0964 1.2097 1.3505 1.5342 1.7881 2.1664 2.7955 4.0542 7.8363
    COV 5.8351 3.7128 2.8165 2.2978 1.9512 1.6997 1.5073 1.3546 1.2302


    Based on the information in Table 2, it's evident that as β approaches 1, the IOD increases while the COV decreases. Additionally, the proposed model is best suited for modeling data with overdispersion characteristics.
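    The trend described above can be reproduced with a few lines of code. The sketch below computes the IOD as variance over mean and the COV as standard deviation over mean from truncated moment sums, rather than from Eqs (3.10) and (3.11) directly; the function name `dispersion`, the grid of β values and the cut-off `x_max` are assumptions of this example.

```python
import math


def dispersion(beta: float, x_max: int = 5000):
    """Return (IOD, COV) = (Var/mean, sd/mean) from truncated moment sums."""
    log_b = math.log(beta)
    m1 = m2 = 0.0
    for x in range(1, x_max):
        p = beta ** x * (1.0 / (1.0 - x * log_b) - beta / (1.0 - (x + 1) * log_b))
        m1 += x * p
        m2 += x * x * p
    var = m2 - m1 ** 2
    return var / m1, math.sqrt(var) / m1


betas = [k / 10 for k in range(1, 10)]
iod_values = [dispersion(b)[0] for b in betas]
cov_values = [dispersion(b)[1] for b in betas]
for b, iod, cov in zip(betas, iod_values, cov_values):
    print(f"beta={b:.1f}  IOD={iod:.4f}  COV={cov:.4f}")
```

    An IOD above 1 for every β on the grid is what marks the model as suited to over-dispersed counts.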

    Consider a set of n random variables, denoted as $X_1,X_2,\ldots,X_n$, which are arranged in nondecreasing order and expressed as $X_{1:n}\le X_{2:n}\le\ldots\le X_{n:n}$. In the context of order statistics, there are no constraints on whether these $X_i$'s are independent or identically distributed. However, many well-established results pertaining to order statistics are derived under the classical assumption that the $X_i$'s are independent and identically distributed (iid). The CDF of the ith order statistic is expressed as follows:

    $$F_{i:n}(x;\beta)=\sum_{k=i}^{n}\binom{n}{k}[F_i(x;\beta)]^{k}[1-F_i(x;\beta)]^{n-k}=\sum_{k=i}^{n}\sum_{j=0}^{n-k}\Phi^{(n,k)}_{m}[F_i(x;\beta)]^{k+j}, \tag{3.12}$$

    where $\Phi^{(n,k)}_{m}=(-1)^{j}\binom{n}{k}\binom{n-k}{j}$. Moreover, the associated PMF of the ith order statistic is given by

    $$f_{i:n}(x;\beta)=F_{i:n}(x;\beta)-F_{i:n}(x-1;\beta)=\sum_{k=i}^{n}\sum_{j=0}^{n-k}\Phi^{(n,k)}_{m}[f_i(x;\beta)]^{k+j}. \tag{3.13}$$

    Thus, the rth moments of $X_{i:n}$ can be expressed as

    $$E(X^{r}_{i:n})=\sum_{x=0}^{\infty}\sum_{k=i}^{n}\sum_{j=0}^{n-k}\Phi^{(n,k)}_{m}x^{r}[f_i(x;\beta)]^{k+j}. \tag{3.14}$$

    L-moments are statistical summary measures for probability distributions, introduced by [11]. They share similarities with ordinary moments but are calculated using linear functions applied to the ordered data values. The L-moment of a random variable X is expressed as follows:

    $$\lambda_{\delta}=\frac{1}{\delta}\sum_{i=0}^{\delta-1}(-1)^{i}\binom{\delta-1}{i}E(X_{\delta-i:\delta}). \tag{3.15}$$

    Using (3.15), several statistical measures based on L-moments can be computed, including: mean = $\lambda_1$, coefficient of skewness = $\lambda_3/\lambda_2$, and coefficient of kurtosis = $\lambda_4/\lambda_2$. In summary, order statistics help organize and analyze data by arranging it in a specific order, while L-moment statistics provide robust and efficient tools for estimating distribution parameters and understanding the shape and characteristics of a distribution. Higher-order L-moments provide information about the shape and tail characteristics of the distribution. Both concepts play important roles in various statistical applications, particularly when dealing with nonparametric or nonstandard distributions.
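    As a small numerical illustration of the order-statistic CDF, the sketch below evaluates the binomial-sum form of Eq (3.12) for an iid DBHE sample and obtains the PMF as the difference of consecutive CDF values. The function names `order_stat_cdf` and `order_stat_pmf` are hypothetical, and the iid assumption is the classical one mentioned in the text.

```python
import math


def dbhe_cdf(x: int, beta: float) -> float:
    """CDF of Eq (2.3)."""
    return 1.0 - beta ** (x + 1) / (1.0 - (x + 1) * math.log(beta))


def order_stat_cdf(x: int, i: int, n: int, beta: float) -> float:
    """P(X_{i:n} <= x): at least i of the n iid observations are <= x."""
    F = dbhe_cdf(x, beta)
    return sum(math.comb(n, k) * F ** k * (1.0 - F) ** (n - k)
               for k in range(i, n + 1))


def order_stat_pmf(x: int, i: int, n: int, beta: float) -> float:
    """P(X_{i:n} = x) as the difference of consecutive CDF values."""
    previous = order_stat_cdf(x - 1, i, n, beta) if x > 0 else 0.0
    return order_stat_cdf(x, i, n, beta) - previous


# Distribution of the sample maximum for n = 5 and beta = 0.5.
max_pmf = [order_stat_pmf(x, 5, 5, 0.5) for x in range(100)]
print(max_pmf[:5])
```

    For i = 1 and i = n the sum collapses to the familiar closed forms for the minimum and maximum, which is a convenient check of the implementation.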

    In this section, we consider the estimation of the DBHE parameter through the maximum product of spacings estimator (MPSE), using a complete sample. Consider a random sample $X_1, X_2,\ldots,X_n$ drawn from the DBHE distribution. For $j=1,2,\ldots,m+1$, let

    $$W_j(\beta)=F(x_{(j)}|\beta)-F(x_{(j-1)}|\beta),$$

    be the uniform spacings of a random sample from the DBHE model, where $F(x_{(0)}|\beta)=0$, $F(x_{(m+1)}|\beta)=1$ and $\sum_{j=1}^{m+1}W_j(\beta)=1$. The MPSE of β, say $\hat{\beta}_{MPS}$, can be derived by maximizing the geometric mean of the spacings

    $$V(\beta)=\left[\prod_{j=1}^{m+1}W_j(\beta)\right]^{\frac{1}{m+1}}, \tag{4.1}$$

    with respect to the parameter β.

    Assume a random sample $X_1, X_2,\ldots,X_n$ drawn from the DBHE model. The Anderson-Darling estimator (ADE) is another type of minimum distance estimator. The ADE of the DBHE parameter, say $\hat{\beta}_{AD}$, is derived by minimizing

    $$AD(\beta)=-m-\frac{1}{m}\sum_{j=1}^{m}(2j-1)\left[\log F(x_{(j)}|\beta)+\log\left(1-F(x_{(j)}|\beta)\right)\right], \tag{4.2}$$

    with respect to the parameter β, while the right-tail Anderson-Darling estimator (RADE) of the model parameter is obtained by minimizing

    $$RAD(\beta)=\frac{m}{2}-2\sum_{j=1}^{m}F(x_{(j:m)}|\beta)-\frac{1}{m}\sum_{j=1}^{m}(2j-1)\left[\log\left(1-F(x_{(m+1-j:m)}|\beta)\right)\right], \tag{4.3}$$

    with respect to the parameter β.

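    Consider a random sample $X_1, X_2,\ldots,X_n$ drawn from the DBHE model. The log-likelihood function (L) for the DBHE distribution can be represented as follows:

    $$L(\underline{x}|\beta)=\ln\beta\sum_{i=1}^{n}x_i+\sum_{i=1}^{n}\ln\left[\frac{1}{1-x_i\ln\beta}-\frac{\beta}{1-(x_i+1)\ln\beta}\right]. \tag{4.4}$$

    Taking the derivative of the log-likelihood with respect to β and equating it to zero, we obtain

    $$\frac{\partial L(\underline{x}|\beta)}{\partial\beta}=\frac{1}{\beta}\sum_{i=1}^{n}x_i+\sum_{i=1}^{n}\frac{\dfrac{x_i}{\beta(1-x_i\ln\beta)^{2}}-\dfrac{x_i+1}{(1-(x_i+1)\ln\beta)^{2}}-\dfrac{1}{1-(x_i+1)\ln\beta}}{\dfrac{1}{1-x_i\ln\beta}-\dfrac{\beta}{1-(x_i+1)\ln\beta}}. \tag{4.5}$$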

    Finding an analytical solution for this equation is not possible. Therefore, it requires the application of a numerical iterative method, like the Newton-Raphson method, within the R software, or other optimization techniques.
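    One simple numerical route is to maximize the log-likelihood (4.4) directly over the interval (0, 1) instead of solving the score equation. The Python sketch below does this with a golden-section search, which is a substitute for the Newton-Raphson iteration mentioned above and assumes the log-likelihood has a single interior maximum; the function names, the small data set and the search bounds are hypothetical choices made only for this example.

```python
import math
from collections import Counter


def dbhe_loglik(beta: float, counts: Counter) -> float:
    """Log-likelihood of Eq (4.4), evaluated over the distinct observed values."""
    log_b = math.log(beta)
    return sum(c * (x * log_b
                    + math.log(1.0 / (1.0 - x * log_b) - beta / (1.0 - (x + 1) * log_b)))
               for x, c in counts.items())


def golden_section_max(f, lo: float, hi: float, tol: float = 1e-10) -> float:
    """Maximize f on [lo, hi] by golden-section search (assumes unimodality)."""
    g = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    c, d = b - g * (b - a), a + g * (b - a)
    fc, fd = f(c), f(d)
    while b - a > tol:
        if fc < fd:                      # maximum lies in [c, b]
            a, c, fc = c, d, fd
            d = a + g * (b - a)
            fd = f(d)
        else:                            # maximum lies in [a, d]
            b, d, fd = d, c, fc
            c = b - g * (b - a)
            fc = f(c)
    return 0.5 * (a + b)


def dbhe_mle(data) -> float:
    counts = Counter(data)
    return golden_section_max(lambda beta: dbhe_loglik(beta, counts), 1e-6, 1.0 - 1e-6)


sample = [0, 0, 1, 0, 2, 0, 1, 3, 0, 0, 5, 1]   # illustrative counts, not real data
beta_hat = dbhe_mle(sample)
print(beta_hat)
```

    A derivative-free search avoids coding Eq (4.5) and keeps the iterate inside (0, 1) by construction; Newton-type methods converge faster but need a reasonable starting value.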

    Consider a random sample from the DBHE model, with order statistics $X_{(1)},X_{(2)},\ldots,X_{(m)}$. The least-squares estimator (LSE) of the DBHE parameter, denoted as $\hat{\beta}_{LS}$, can be obtained by solving the nonlinear equation

    $$\sum_{j=1}^{m}\left[F(x_{(j)}|\beta)-\frac{j}{m+1}\right]\Delta_{\beta}(x_{(j)}|\beta)=0, \tag{4.6}$$

    with respect to the parameter β, where

    $$\Delta_{\beta}(x_{(j)}|\beta)=\frac{\partial}{\partial\beta}F(x_{(j)}|\beta). \tag{4.7}$$

    Note that the solution of this equation can be obtained numerically. The weighted LSE (WLSE), say $\hat{\beta}_{WLS}$, can be derived by solving the nonlinear equation

    $$\sum_{j=1}^{m}\frac{(m+1)^{2}(m+2)}{j(m-j+1)}\left[F(x_{(j)}|\beta)-\frac{j}{m+1}\right]\Delta_{\beta}(x_{(j)}|\beta)=0, \tag{4.8}$$

    with respect to the parameter β.

    The Cramér-von Mises estimator (CVME) is based on the discrepancy between the fitted CDF and the empirical CDF. The CVME of the DBHE parameter is obtained by solving the nonlinear equation

    $$\sum_{j=1}^{m}\left[F(x_{(j)}|\beta)-\frac{2j-1}{2m}\right]\Delta_{\beta}(x_{(j)}|\beta)=0, \tag{4.9}$$

    with respect to the parameter β, where $\Delta_{\beta}(x_{(j)}|\beta)$ is defined in Eq (4.7).
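    The least-squares-type estimators can be sketched by minimizing the distance criteria whose first-order conditions are the estimating equations (4.6), (4.8) and (4.9). The Python code below assumes the usual sum-of-squares objectives behind those equations and uses a coarse grid search over β, chosen for robustness with tied discrete observations rather than for speed; the function names, the grid and the data are hypothetical.

```python
import math


def dbhe_cdf(x: int, beta: float) -> float:
    """CDF of Eq (2.3)."""
    return 1.0 - beta ** (x + 1) / (1.0 - (x + 1) * math.log(beta))


def ls_objective(beta, xs):
    """Sum of squares whose derivative in beta gives Eq (4.6)."""
    m = len(xs)
    return sum((dbhe_cdf(x, beta) - j / (m + 1)) ** 2
               for j, x in enumerate(xs, start=1))


def wls_objective(beta, xs):
    """Weighted sum of squares whose derivative gives Eq (4.8)."""
    m = len(xs)
    return sum((m + 1) ** 2 * (m + 2) / (j * (m - j + 1))
               * (dbhe_cdf(x, beta) - j / (m + 1)) ** 2
               for j, x in enumerate(xs, start=1))


def cvm_objective(beta, xs):
    """Cramer-von Mises criterion whose derivative gives Eq (4.9)."""
    m = len(xs)
    return 1.0 / (12 * m) + sum((dbhe_cdf(x, beta) - (2 * j - 1) / (2 * m)) ** 2
                                for j, x in enumerate(xs, start=1))


def grid_argmin(objective, xs, steps=999):
    """Return the grid point in (0, 1) with the smallest objective value."""
    grid = [k / (steps + 1) for k in range(1, steps + 1)]
    return min(grid, key=lambda beta: objective(beta, xs))


data = sorted([0, 0, 1, 0, 2, 0, 1, 3, 0, 0, 5, 1])   # ordered illustrative sample
beta_ls = grid_argmin(ls_objective, data)
beta_wls = grid_argmin(wls_objective, data)
beta_cvm = grid_argmin(cvm_objective, data)
print(beta_ls, beta_wls, beta_cvm)
```

    The grid resolution (0.001 here) bounds the precision of the estimates; a local optimizer started from the best grid point would refine them.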

    Consider $z_j=j/(m+1)$ to be an unbiased estimator of $F(x_{(j)}|\beta)$. Hence, the percentile estimator (PCE) of the parameter β, denoted by $\hat{\beta}_{PC}$, can be obtained by minimizing

    $$P(\beta)=\sum_{j=1}^{m}\left(x_{(j)}-D(z_j)\right)^{2},$$

    with respect to the parameter β, where $D(z_j)=F^{-1}(z_j|\beta)$ is the quantile function of the DBHE model.

    In this section, we assess the performance of the MPSE, ADE, MLE, LSE, RADE, PCE, CVME, and WLSE with respect to the sample size n, using the R software. A random variable X from the DBHE distribution is generated by first drawing a value Y from the continuous BHE distribution and then discretizing it, so that X is the greatest integer less than or equal to Y. To replicate this, we perform Markov Chain Monte Carlo (MCMC) simulations using various schemes. The assessment is carried out through the following simulation study:

    (1) Generate N=10000 samples of various sizes "ni;i=1,2,3,4,5" from the DBHE model as follows:

    ● Scheme I: β=0.2|n1=50, n2=150, n3=300, n4=700, n5=1000.

    ● Scheme II: β=0.4|n1=50, n2=150, n3=300, n4=700, n5=1000.

    ● Scheme III: β=0.7|n1=50, n2=150, n3=300, n4=700, n5=1000.

    ● Scheme IV: β=0.9|n1=50, n2=150, n3=300, n4=700, n5=1000.

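    (2) Compute the MPSE, ADE, MLE, LSE, RADE, PCE, CVME, and WLSE for the 10000 samples, say $\hat{\beta}_k$ for k=1,2,...,10000.

    (3) Calculate the bias, mean squared errors (MSE), and mean relative errors (MRE) for N=10000 samples as

    $$|\operatorname{Bias}(\beta)|=\frac{1}{N}\sum_{k=1}^{N}\left|\hat{\beta}_k-\beta_k\right|,\quad \operatorname{MSE}(\beta)=\frac{1}{N}\sum_{k=1}^{N}\left(\hat{\beta}_k-\beta_k\right)^{2},\quad \operatorname{MRE}(\beta)=\frac{1}{N}\sum_{k=1}^{N}\frac{\left|\hat{\beta}_k-\beta_k\right|}{\beta_k}.$$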

    The MSE measures the average squared difference between predicted and actual values, with a lower MSE indicating closer predictions to actual values. On the other hand, MRE expresses the average relative difference as a percentage, offering insights into accuracy and normalization across varying data magnitudes. MSE emphasizes precision by squaring errors, while MRE considers the relative magnitude of errors. MSE can be sensitive to outliers, while MRE, in percentage terms, may be less influenced. Despite MSE being less interpretable due to squared units, MRE, as a percentage, provides a standardized measure of error. The choice between MSE and MRE depends on data characteristics and the desired focus on precision or accuracy in predictions.

    (4) The empirical results of the simulation are reported in Tables 3–7.
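    The steps above can be sketched as follows for a single estimator. The Python code draws Y from the continuous BHE distribution by solving S(y) = u with bisection, sets X to the integer part of Y, estimates β by maximizing the log-likelihood (4.4), and then computes the three error measures. It is a scaled-down illustration, not the R code of the study: the function names, the seed, the use of 100 replications instead of 10000, the golden-section maximizer and the reading of $\beta_k$ as the fixed true value β of the scheme are all assumptions of this example.

```python
import math
import random
from collections import Counter


def sample_dbhe(beta: float, rng: random.Random) -> int:
    """Draw Y from the continuous BHE law (Eq (1.1)) by bisection, then floor it."""
    lam = -math.log(beta)
    u = 1.0 - rng.random()                 # uniform on (0, 1]
    lo, hi = 0.0, -math.log(u) / lam       # S(y) <= exp(-lam * y) brackets the root
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if math.exp(-lam * mid) / (1.0 + lam * mid) > u:
            lo = mid
        else:
            hi = mid
    return math.floor(0.5 * (lo + hi))


def dbhe_mle(data) -> float:
    """Maximize Eq (4.4) on (0, 1) by golden-section search (assumes unimodality)."""
    counts = Counter(data)

    def loglik(beta):
        lb = math.log(beta)
        return sum(c * (x * lb + math.log(1 / (1 - x * lb) - beta / (1 - (x + 1) * lb)))
                   for x, c in counts.items())

    g = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = 1e-6, 1.0 - 1e-6
    while b - a > 1e-9:
        c, d = b - g * (b - a), a + g * (b - a)
        if loglik(c) < loglik(d):
            a = c
        else:
            b = d
    return 0.5 * (a + b)


def simulate(beta: float, n: int, reps: int, seed: int = 1):
    """Return (|Bias|, MSE, MRE) of the MLE over `reps` simulated samples of size n."""
    rng = random.Random(seed)
    estimates = [dbhe_mle([sample_dbhe(beta, rng) for _ in range(n)])
                 for _ in range(reps)]
    bias = sum(abs(e - beta) for e in estimates) / reps
    mse = sum((e - beta) ** 2 for e in estimates) / reps
    mre = sum(abs(e - beta) / beta for e in estimates) / reps
    return bias, mse, mre


bias, mse, mre = simulate(beta=0.4, n=200, reps=100)
print(bias, mse, mre)
```

    Repeating the call for each estimator, scheme and sample size, and ranking the resulting error measures within each row, gives the kind of comparison summarized in the tables that follow.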

    Table 3.  Simulation outcomes for Scheme I (rank of each estimator within a row in braces).
    n Est. MPSE ADE MLE LSE RADE PCE CVME WLSE
    50 |Bias| 0.34973{3} 0.29618{1} 0.46524{6} 0.50175{7} 0.37808{4} 0.52773{8} 0.34260{2} 0.42618{5}
    50 MSE 0.45538{2} 0.44273{1} 0.52777{6} 0.54179{7} 0.47531{4} 0.58460{8} 0.46015{3} 0.50378{5}
    50 MRE 0.15179{2} 0.14756{1} 0.17592{6} 0.18060{7} 0.15844{4} 0.19487{8} 0.15338{3} 0.16793{5}
    50 Sum of Ranks 7{2} 3{1} 18{6} 21{7} 12{4} 24{8} 8{3} 15{5}
    150 |Bias| 0.10324{2} 0.10021{1} 0.14097{7} 0.14016{6} 0.11351{3} 0.19009{8} 0.11474{4} 0.13246{5}
    150 MSE 0.25401{1} 0.25596{2} 0.29967{7} 0.29734{6} 0.26894{4} 0.35137{8} 0.26808{3} 0.28586{5}
    150 MRE 0.08467{1} 0.08532{2} 0.09989{7} 0.09911{6} 0.08965{4} 0.11712{8} 0.08936{3} 0.09529{5}
    150 Sum of Ranks 4{1} 5{2} 21{7} 18{6} 11{4} 24{8} 10{3} 15{5}
    300 |Bias| 0.04940{1} 0.05065{2} 0.07209{7} 0.06848{6} 0.05531{4} 0.09601{8} 0.05449{3} 0.06583{5}
    300 MSE 0.17905{1} 0.18159{2} 0.21475{7} 0.20673{6} 0.18875{4} 0.24535{8} 0.18464{3} 0.20397{5}
    300 MRE 0.05968{1} 0.06053{2} 0.07158{7} 0.06891{6} 0.06292{4} 0.08178{8} 0.06155{3} 0.06799{5}
    300 Sum of Ranks 3{1} 6{2} 21{7} 18{6} 12{4} 24{8} 9{3} 15{5}
    500 |Bias| 0.02782{1} 0.02783{2} 0.04208{6} 0.04210{7} 0.03083{3} 0.05671{8} 0.03135{4} 0.03926{5}
    500 MSE 0.13190{1} 0.13342{2} 0.16247{6} 0.16354{7} 0.14135{4} 0.19151{8} 0.13993{3} 0.15853{5}
    500 MRE 0.04397{1} 0.04447{2} 0.05416{6} 0.05451{7} 0.04712{4} 0.06384{8} 0.04664{3} 0.05284{5}
    500 Sum of Ranks 3{1} 6{2} 18{6} 21{7} 11{4} 24{8} 10{3} 15{5}
    700 |Bias| 0.02318{2} 0.02030{1} 0.02991{7} 0.02949{6} 0.02425{4} 0.04188{8} 0.02400{3} 0.02938{5}
    700 MSE 0.12310{3} 0.11298{1} 0.13755{7} 0.13668{5} 0.12323{4} 0.16443{8} 0.12302{2} 0.13679{6}
    700 MRE 0.04103{3} 0.03766{1} 0.04585{7} 0.04556{5} 0.04108{4} 0.05481{8} 0.04101{2} 0.04560{6}
    700 Sum of Ranks 8{3} 3{1} 21{7} 16{5} 12{4} 24{8} 7{2} 17{6}
    1000 |Bias| 0.01456{2} 0.01404{1} 0.01980{7} 0.01926{5} 0.01649{3} 0.02909{8} 0.01655{4} 0.01951{6}
    1000 MSE 0.09578{2} 0.09016{1} 0.11259{7} 0.11084{5} 0.10259{3} 0.13486{8} 0.10420{4} 0.11105{6}
    1000 MRE 0.03193{2} 0.03005{1} 0.03753{7} 0.03695{5} 0.03420{3} 0.04495{8} 0.03473{4} 0.03702{6}
    1000 Sum of Ranks 6{2} 3{1} 21{7} 15{5} 9{3} 24{8} 12{4} 18{6}

    Table 4.  Simulation outcomes for Scheme II (rank of each estimator within a row in braces).
    n Est. MPSE ADE MLE LSE RADE PCE CVME WLSE
    50 |Bias| 0.80628{4} 0.47446{1} 1.11787{6} 1.25880{8} 0.76985{3} 1.01214{5} 0.74891{2} 1.20883{7}
    50 MSE 0.61326{2} 0.55957{1} 0.71997{5} 0.77203{7} 0.64010{4} 0.78679{8} 0.63223{3} 0.76958{6}
    50 MRE 0.20442{2} 0.18652{1} 0.23999{5} 0.25734{7} 0.21337{4} 0.26226{8} 0.21074{3} 0.25653{6}
    50 Sum of Ranks 8{2.5} 3{1} 16{5} 22{8} 11{4} 21{7} 8{2.5} 19{6}
    150 |Bias| 0.17246{2} 0.15364{1} 0.25285{5} 0.26010{6} 0.19976{4} 0.38913{8} 0.18798{3} 0.29936{7}
    150 MSE 0.32265{2} 0.31331{1} 0.39058{5} 0.39261{6} 0.35063{4} 0.50374{8} 0.33573{3} 0.41663{7}
    150 MRE 0.10755{2} 0.10444{1} 0.13019{5} 0.13087{6} 0.11688{4} 0.16791{8} 0.11191{3} 0.13888{7}
    150 Sum of Ranks 6{2} 3{1} 15{5} 18{6} 12{4} 24{8} 9{3} 21{7}
    300 |Bias| 0.07844{2} 0.07387{1} 0.12350{6} 0.12035{5} 0.09051{4} 0.20140{8} 0.08908{3} 0.13782{7}
    300 MSE 0.22091{2} 0.21636{1} 0.27815{6} 0.26876{5} 0.23424{3} 0.36016{8} 0.23593{4} 0.29216{7}
    300 MRE 0.07364{2} 0.07212{1} 0.09272{6} 0.08959{5} 0.07808{3} 0.12005{8} 0.07864{4} 0.09739{7}
    300 Sum of Ranks 6{2} 3{1} 18{6} 15{5} 10{3} 24{8} 11{4} 21{7}
    500 |Bias| 0.04588{2} 0.04408{1} 0.07387{6} 0.07206{5} 0.05217{4} 0.12520{8} 0.05047{3} 0.08070{7}
    500 MSE 0.16877{2} 0.16455{1} 0.21315{6} 0.21211{5} 0.18004{4} 0.28765{8} 0.17703{3} 0.22255{7}
    500 MRE 0.05626{2} 0.05485{1} 0.07105{6} 0.07070{5} 0.06001{4} 0.09588{8} 0.05901{3} 0.07418{7}
    500 Sum of Ranks 6{2} 3{1} 18{6} 15{5} 12{4} 24{8} 9{3} 21{7}
    700 |Bias| 0.03622{2} 0.03134{1} 0.05102{6} 0.05053{5} 0.03916{4} 0.09451{8} 0.03833{3} 0.05715{7}
    700 MSE 0.15228{2} 0.13628{1} 0.17847{5} 0.17955{6} 0.15678{4} 0.24824{8} 0.15529{3} 0.18863{7}
    700 MRE 0.05076{2} 0.04543{1} 0.05949{5} 0.05985{6} 0.05226{4} 0.08275{8} 0.05176{3} 0.06288{7}
    700 Sum of Ranks 6{2} 3{1} 16{3} 17{4} 12{4} 24{8} 9{3} 21{7}
    1000 |Bias| 0.02397{2} 0.02164{1} 0.03412{6} 0.03304{5} 0.02594{3} 0.06449{8} 0.02656{4} 0.03988{7}
    1000 MSE 0.12222{2} 0.10724{1} 0.14774{6} 0.14451{5} 0.12861{3} 0.20268{8} 0.13117{4} 0.15738{7}
    1000 MRE 0.04074{2} 0.03575{1} 0.04925{6} 0.04817{5} 0.04287{3} 0.06756{8} 0.04372{4} 0.05246{7}
    1000 Sum of Ranks 6{2} 3{1} 18{6} 15{5} 9{3} 24{8} 12{4} 21{7}

    Table 5.  Simulation outcomes for Scheme III (rank of each estimator within a row in braces).
    n Est. MPSE ADE MLE LSE RADE PCE CVME WLSE
    50 |Bias| 0.27631{3} 0.23145{1} 0.36734{6} 0.39714{7} 0.29833{4} 0.41941{8} 0.27006{2} 0.33011{5}
    50 MSE 0.40371{2} 0.39195{1} 0.46807{6} 0.48101{7} 0.42139{4} 0.52144{8} 0.40801{3} 0.44260{5}
    50 MRE 0.16148{2} 0.15678{1} 0.18723{6} 0.19241{7} 0.16855{4} 0.20858{8} 0.16320{3} 0.17704{5}
    50 Sum of Ranks 7{2} 3{1} 18{6} 21{7} 12{4} 24{8} 8{3} 15{5}
    150 |Bias| 0.08084{2} 0.07844{1} 0.11095{7} 0.11035{6} 0.08925{3} 0.15236{8} 0.09025{4} 0.10242{5}
    150 MSE 0.22456{1} 0.22672{2} 0.26574{7} 0.26364{6} 0.23837{4} 0.31444{8} 0.23757{3} 0.25125{5}
    150 MRE 0.08983{1} 0.09069{2} 0.10629{7} 0.10546{6} 0.09535{4} 0.12578{8} 0.09503{3} 0.10050{5}
    150 Sum of Ranks 4{1} 5{2} 21{7} 18{6} 11{4} 24{8} 10{3} 15{5}
    300 |Bias| 0.03881{1} 0.03974{2} 0.05669{7} 0.05389{6} 0.04344{4} 0.07691{8} 0.04280{3} 0.05081{5}
    300 MSE 0.15865{1} 0.16115{2} 0.19036{7} 0.18333{6} 0.16723{4} 0.21977{8} 0.16358{3} 0.17917{5}
    300 MRE 0.06346{1} 0.06446{2} 0.07614{7} 0.07333{6} 0.06689{4} 0.08791{8} 0.06543{3} 0.07167{5}
    300 Sum of Ranks 3{1} 6{2} 21{7} 18{6} 12{4} 24{8} 9{3} 15{5}
    500 |Bias| 0.02183{1} 0.02203{2} 0.03309{6} 0.03312{7} 0.02419{3} 0.04548{8} 0.02461{4} 0.03029{5}
    500 MSE 0.11687{1} 0.11937{2} 0.14403{6} 0.14503{7} 0.12517{4} 0.17171{8} 0.12395{3} 0.13921{5}
    500 MRE 0.04675{1} 0.04775{2} 0.05761{6} 0.05801{7} 0.05007{4} 0.06869{8} 0.04958{3} 0.05568{5}
    500 Sum of Ranks 3{1} 6{2} 18{6} 21{7} 11{4} 24{8} 10{3} 15{5}
    700 |Bias| 0.01817{2} 0.01604{1} 0.02350{7} 0.02320{6} 0.01906{4} 0.03369{8} 0.01884{3} 0.02268{5}
    700 MSE 0.10892{2} 0.10094{1} 0.12192{7} 0.12121{6} 0.10921{4} 0.14760{8} 0.10900{3} 0.12018{5}
    700 MRE 0.04357{2} 0.04038{1} 0.04877{7} 0.04849{6} 0.04369{4} 0.05904{8} 0.04360{3} 0.04807{5}
    700 Sum of Ranks 6{2} 3{1} 21{7} 18{6} 12{4} 24{8} 9{3} 15{5}
    1000 |Bias| 0.01140{2} 0.01125{1} 0.01557{7} 0.01513{6} 0.01296{3} 0.02338{8} 0.01300{4} 0.01506{5}
    1000 MSE 0.08473{2} 0.08189{1} 0.09979{7} 0.09823{6} 0.09094{3} 0.12102{8} 0.09233{4} 0.09754{5}
    1000 MRE 0.03389{2} 0.03276{1} 0.03992{7} 0.03929{6} 0.03638{3} 0.04841{8} 0.03693{4} 0.03902{5}
    1000 Sum of Ranks 6{2} 3{1} 21{7} 18{6} 9{3} 24{8} 12{4} 15{5}

    Table 6.  Simulation outcomes for Scheme IV (the rank of each estimator within a row is given in braces).
    n Est. MPSE ADE MLE LSE RADE PCE CVME WLSE
    50 |Bias| 0.48453{4} 0.30092{1} 0.67474{5} 0.75898{8} 0.47083{3} 0.67704{6} 0.46383{2} 0.72447{7}
    MSE 0.48386{2} 0.44610{1} 0.56657{5} 0.60566{7} 0.50502{4} 0.64649{8} 0.50089{3} 0.60264{6}
    MRE 0.19354{2} 0.17844{1} 0.22663{5} 0.24227{7} 0.20201{4} 0.25860{8} 0.20035{3} 0.24105{6}
    Sum of Ranks 8{2.5} 3{1} 15{5} 22{7.5} 11{4} 22{7.5} 8{2.5} 19{6}
    150 |Bias| 0.10762{2} 0.09761{1} 0.15730{5} 0.16135{6} 0.12518{4} 0.26601{8} 0.11732{3} 0.18488{7}
    MSE 0.25540{2} 0.25070{1} 0.30892{5} 0.30980{6} 0.27802{4} 0.41701{8} 0.26621{3} 0.32859{7}
    MRE 0.10216{2} 0.10028{1} 0.12357{5} 0.12392{6} 0.11121{4} 0.16680{8} 0.10648{3} 0.13144{7}
    Sum of Ranks 6{2} 3{1} 15{5} 18{6} 12{4} 24{8} 9{3} 21{7}
    300 |Bias| 0.04951{2} 0.04680{1} 0.07715{6} 0.07498{5} 0.05662{4} 0.13726{8} 0.05582{3} 0.08559{7}
    MSE 0.17549{2} 0.17350{1} 0.22003{6} 0.21250{5} 0.18559{3} 0.29748{8} 0.18721{4} 0.23061{7}
    MRE 0.07019{2} 0.06940{1} 0.08801{6} 0.08500{5} 0.07424{3} 0.11899{8} 0.07488{4} 0.09224{7}
    Sum of Ranks 6{2} 3{1} 18{6} 15{5} 10{3} 24{8} 11{4} 21{7}
    500 |Bias| 0.02901{2} 0.02830{1} 0.04622{6} 0.04503{5} 0.03289{4} 0.08573{8} 0.03173{3} 0.05024{7}
    MSE 0.13427{2} 0.13411{1} 0.16877{6} 0.16768{5} 0.14296{4} 0.23787{8} 0.14040{3} 0.17576{7}
    MRE 0.05371{2} 0.05364{1} 0.06751{6} 0.06707{5} 0.05718{4} 0.09515{8} 0.05616{3} 0.07030{7}
    Sum of Ranks 6{2} 3{1} 18{6} 15{5} 12{4} 24{8} 9{3} 21{7}
    700 |Bias| 0.02282{2} 0.02017{1} 0.03196{6} 0.03163{5} 0.02461{4} 0.06517{8} 0.02409{3} 0.03541{7}
    MSE 0.12079{2} 0.11176{1} 0.14134{5} 0.14209{6} 0.12438{4} 0.20631{8} 0.12322{3} 0.14866{7}
    MRE 0.04832{2} 0.04471{1} 0.05654{5} 0.05684{6} 0.04975{4} 0.08252{8} 0.04929{3} 0.05946{7}
    Sum of Ranks 6{2} 3{1} 16{5} 17{6} 12{4} 24{8} 9{3} 21{7}
    1000 |Bias| 0.01517{2} 0.01416{1} 0.02140{6} 0.02073{5} 0.01632{3} 0.04440{8} 0.01674{4} 0.02484{7}
    MSE 0.09729{2} 0.09046{1} 0.11701{6} 0.11438{5} 0.10205{3} 0.16827{8} 0.10413{4} 0.12430{7}
    MRE 0.03892{2} 0.03618{1} 0.04680{6} 0.04575{5} 0.04082{3} 0.06731{8} 0.04165{4} 0.04972{7}
    Sum of Ranks 6{2} 3{1} 18{6} 15{5} 9{3} 24{8} 12{4} 21{7}

    Table 7.  Ranking of estimation methods based on simulation results.
    Scheme n MPSE ADE MLE LSE RADE PCE CVME WLSE
    Scheme I 50 2 1 6 7 4 8 3 5
    150 1 2 7 6 4 8 3 5
    300 1 2 7 6 4 8 3 5
    500 1 2 6 7 4 8 3 5
    700 3 1 7 5 4 8 2 5
    1000 2 1 7 5 3 8 4 5
    Scheme II 50 2.5 1 5 8 4 7 2.5 6
    150 2 1 5 6 4 8 3 7
    300 2 1 6 5 3 8 4 7
    500 2 1 6 5 4 8 3 7
    700 2 1 3 4 4 8 3 7
    1000 2 1 6 5 3 8 4 7
    Scheme III 50 2 1 6 7 4 8 3 5
    150 1 2 7 6 4 8 3 5
    300 1 2 7 6 4 8 3 5
    500 1 2 6 7 4 8 3 5
    700 2 1 7 6 4 8 3 5
    1000 2 1 7 6 3 8 4 5
    Scheme IV 50 2.5 1 5 7.5 4 7.5 2.5 6
    150 2 1 5 6 4 8 3 7
    300 2 1 6 5 3 8 4 7
    500 2 1 6 5 4 8 3 7
    700 2 1 5 6 4 8 3 7
    1000 2 1 6 5 3 8 4 7
    Sum of Ranks 44 30 144 141.5 90 190.5 76 142
    Overall Rank 2 1 7 5 4 8 3 6


    Tables 3 to 7 show that as the sample size n increases, the absolute bias of the estimators of β decreases toward zero, as do the MSE and MRE. These trends indicate that the derived estimators are consistent. All estimation methods perform well across the sample sizes considered, and Table 7 shows that the ADE method attains the best overall rank (sum of ranks 30), followed by MPSE (sum of ranks 44).
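
    The ranking behind the "Sum of Ranks" rows can be sketched as follows. This is a minimal illustration, assuming that each criterion is ranked in ascending order, that tied totals share their mean rank, and that the three ranks are summed per estimator; the function names `average_ranks` and `rank_summary` are hypothetical. The inputs are the n = 50 values of Table 6 without their rank annotations.

```python
def average_ranks(values):
    """Rank values in ascending order (1 = smallest); ties share the mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the block while the next sorted value equals the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        mean_rank = (pos + end) / 2 + 1
        for j in range(pos, end + 1):
            ranks[order[j]] = mean_rank
        pos = end + 1
    return ranks


def rank_summary(metrics):
    """Return (sum of per-criterion ranks, overall rank) for each estimator."""
    per_metric = [average_ranks(row) for row in metrics.values()]
    sums = [sum(column) for column in zip(*per_metric)]
    return sums, average_ranks(sums)


METHODS = ["MPSE", "ADE", "MLE", "LSE", "RADE", "PCE", "CVME", "WLSE"]

# Scheme IV, n = 50 (Table 6), one list per criterion, ordered as METHODS.
TABLE6_N50 = {
    "|Bias|": [0.48453, 0.30092, 0.67474, 0.75898, 0.47083, 0.67704, 0.46383, 0.72447],
    "MSE": [0.48386, 0.44610, 0.56657, 0.60566, 0.50502, 0.64649, 0.50089, 0.60264],
    "MRE": [0.19354, 0.17844, 0.22663, 0.24227, 0.20201, 0.25860, 0.20035, 0.24105],
}

rank_sums, overall = rank_summary(TABLE6_N50)
for name, total, rank in zip(METHODS, rank_sums, overall):
    print(f"{name:5s} sum of ranks = {total:4.0f}  overall rank = {rank}")
```

    Under these assumptions the sketch gives MPSE and CVME a shared overall rank of 2.5 and LSE and PCE a shared rank of 7.5, which matches the Scheme IV, n = 50 row of Table 7.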

    In this section, we examine the practical value of the proposed distribution by analyzing datasets from different domains. We compare the fit of the DBHE distribution with that of several competing distributions, including the discrete Pareto (DPa), discrete Rayleigh (DR), discrete inverse Rayleigh (DIR), discrete Burr-Hatke (DBH), Poisson (Poi), discrete inverse Weibull (DIW), and discrete Burr-XII (DB-XII) distributions. Goodness-of-fit (GOF) is assessed using the negative log-likelihood (-L), Akaike information criterion (AIC), Bayesian information criterion (BIC), corrected Akaike information criterion (CAIC), Hannan-Quinn information criterion (HQIC), and the Kolmogorov-Smirnov (KS) statistic with its associated P-value. For AIC, CAIC, BIC, and HQIC, lower values indicate a better balance between model fit and simplicity, so the model with the lowest values is considered the most suitable among the candidates. BIC penalizes each parameter by ln n rather than 2, so it is stricter than AIC whenever n ≥ 8 and shows a more conservative preference for simpler models. Because each distinct value occurs only a few times in datasets I, II, and IV, Pearson's chi-square statistic cannot be used as an inference test for them; the KS measure is used instead.
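
    As a worked illustration of these criteria, the sketch below computes them from a negative log-likelihood. The formulas are the standard textbook definitions and are an assumption here, since this section does not state them explicitly; in particular, CAIC is taken to be the small-sample corrected AIC. The function name `information_criteria` is hypothetical.

```python
import math


def information_criteria(neg_loglik, k, n):
    """Compute AIC, CAIC, BIC and HQIC.

    neg_loglik: negative log-likelihood (-L) at the fitted parameters
    k: number of estimated parameters
    n: sample size
    """
    aic = 2 * neg_loglik + 2 * k
    # Assumed definition: AIC plus the small-sample correction term.
    caic = aic + 2 * k * (k + 1) / (n - k - 1)
    bic = 2 * neg_loglik + k * math.log(n)
    hqic = 2 * neg_loglik + 2 * k * math.log(math.log(n))
    return {"AIC": aic, "CAIC": caic, "BIC": bic, "HQIC": hqic}


# DBHE fit to dataset I: -L = 65.5581, one parameter, 15 observations.
dbhe = information_criteria(65.5581, k=1, n=15)
for name, value in dbhe.items():
    print(f"{name}: {value:.4f}")
```

    With these assumed definitions the output agrees with the DBHE column of Table 9 to within about 0.002, which is consistent with rounding of the reported -L.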

    The first dataset consists of the times to failure of 15 electronic components in an accelerated life test (see [16]). To explore the characteristics of dataset I, we created nonparametric plots, which include box plots, normal quantile-quantile (Q-Q) plots, violin plots, and strip plots; see Figure 3.

    Figure 3.  Nonparametric plots for dataset I.

    The MLEs with their standard errors (SE) and confidence intervals (C.I), together with the GOF results for this dataset, are given in Tables 8 and 9. The DBHE distribution has the lowest -L, AIC, BIC, CAIC, HQIC, and KS values and the highest P-value among the fitted models. Based on this analysis of the real dataset, the proposed distribution is a highly competitive model.
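
    The reported intervals appear consistent with a 95% Wald interval, MLE ± z × SE. This section does not state how the intervals were obtained, so the sketch below is an assumption rather than the authors' procedure; the function name `wald_interval` and its optional truncation bounds are hypothetical.

```python
from statistics import NormalDist


def wald_interval(estimate, se, level=0.95, lower=None, upper=None):
    """Normal-approximation interval, optionally truncated to the parameter space."""
    z = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 for level = 0.95
    lo, hi = estimate - z * se, estimate + z * se
    if lower is not None:
        lo = max(lo, lower)
    if upper is not None:
        hi = min(hi, upper)
    return lo, hi


# Dataset I: DBHE (beta = 0.9801, SE = 0.0057) and DPa (beta = 0.7201, SE = 0.0611).
dbhe_ci = wald_interval(0.9801, 0.0057, lower=0.0, upper=1.0)
dpa_ci = wald_interval(0.7201, 0.0611, lower=0.0, upper=1.0)
print("DBHE: [%.4f, %.4f]" % dbhe_ci)
print("DPa:  [%.4f, %.4f]" % dpa_ci)
```

    Both intervals agree with the corresponding rows of Table 8 to about three decimal places; small differences are expected because the tabulated MLE and SE are rounded.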

    Table 8.  The MLEs, standard error (SE), and confidence interval (C.I) for dataset I.
    β α
    Model MLE SE C.I MLE SE C.I
    DBHE 0.9801 0.0057 [0.9692,0.9915]
    DR 0.9991 2.581×10^{-4} [0.9980,0.9993]
    DIR 1.801×10^{-7} 0.0552 [0,0.1075]
    DBH 0.9992 0.0076 [0.9843,1.0142]
    DPa 0.7201 0.0611 [0.6004,0.8398]
    Poi 27.5332 1.3553 [24.8781,30.1892]
    DIW 2.212×10^{-4} 7.751×10^{-4} [0,0.0013] 0.8752 0.1642 [0.5542,1.1964]
    DB-XII 0.9756 0.0512 [0.8743,1] 13.3676 27.7857 [0,67.8244]

    Table 9.  The GOF test for dataset I.
    Statistic DBHE DR DIR DBH DPa Poi DIW DB-XII
    -L 65.5581 66.3943 89.0961 91.3684 77.4023 151.2064 68.7037 75.7245
    AIC 133.1174 134.7880 180.192 184.7368 156.8047 304.4129 141.4063 155.4483
    CAIC 133.4247 135.0961 180.4994 185.0445 157.1124 304.7206 142.4068 156.4480
    BIC 133.8256 135.4967 180.8990 185.4448 157.5127 305.1209 142.8223 156.8645
    HQIC 133.1094 134.7814 180.1841 184.7292 156.7971 304.4053 141.3919 155.4334
    KS 0.1896 0.2161 0.6984 0.7917 0.4051 0.3812 0.2092 0.3887
    P-value 0.5886 0.4330 <0.0001 <0.0001 0.0094 0.0258 0.4827 0.0152


    Figure 4 depicts the probability-probability (P-P) plot for dataset I, while Figure 5 showcases the estimated CDFs and the profile of the L for the parameter β in dataset I. Figure 5 reinforces our empirical findings, supporting the conclusion that the DBHE distribution is a more suitable fit for analyzing this data. Additionally, it highlights that the estimator for β is indeed unique.

    Figure 4.  The P-P plot for dataset I.
    Figure 5.  The estimated CDFs (left panel) and L profile of β̂ (right panel) for dataset I.

    Table 10 provides a compilation of various estimation methods applied to dataset I within the framework of the proposed model.

    Table 10.  Various estimators for dataset I.
    Method MLE MPSE LSE CVME WLSE PCE ADE RADE
    β 0.9801 0.9818 0.9836 0.9834 0.9828  0.9772 0.9831 0.9818
    KS 0.1896 0.1609 0.1569 0.1542 0.1452 0.1932 0.1487 0.1597
    P-Value 0.5886 0.7756 0.8004 0.8166 0.8664 0.5144 0.8479 0.7833


    The analysis shows that all estimation methods fit the data satisfactorily, with the WLSE approach giving the smallest KS statistic (0.1452) and the largest P-value (0.8664).

    The second dataset consists of leukemia remission times, measured in weeks and discretized, for 20 patients, as described in [17]. To examine the characteristics of dataset II, we generated nonparametric plots, including box plots, normal Q-Q plots, violin plots, and strip plots; see Figure 6.

    Figure 6.  Nonparametric plots for dataset II.

    The MLEs with their SE and C.I, together with the GOF results for this dataset, are provided in Tables 11 and 12. The DBHE distribution has the lowest -L, AIC, BIC, CAIC, HQIC, and KS values and the highest P-value among the fitted models. Based on this analysis of the real dataset, the proposed distribution is a highly competitive model.

    Table 11.  The MLEs, SE, and C.I for dataset II.
    β α
    Model MLE SE C.I MLE SE C.I
    DBHE 0.9603 0.0097 [0.9412,0.9794]
    DR 0.9971 0.0007 [0.9961,0.9982]
    DIR 3.374×10^{-7}
    DBH 0.9972 0.0124 [0.9734,1.0213]
    DPa 0.6552 0.0619 [0.5342,0.7770]
    Poi 13.7545 0.8292 [12.1267,15.3887]
    DIW 0.0039 0.0072 [0,0.0184] 1.0073 0.1751 [0.6640,1.3501]
    DB-XII 0.9943 0.0113 [0.9765,1.0132] 158.3545 35.4094 [0,3395.9312]

    Table 12.  The GOF test for dataset II.
    Statistic DBHE DR DIR DBH DPa Poi DIW DB-XII
    -L 73.5159 79.3092 85.0865 94.6355 84.5822 145.4324 74.7965 79.9804
    AIC 149.0318 160.6175 172.1711 191.2695 171.1659 292.8652 153.5932 163.9614
    CAIC 149.2541 160.8401 172.3944 191.4917 171.3876 293.0870 154.2997 164.6671
    BIC 150.0275 161.6136 173.1672 192.2652 172.1613 293.862 155.5851 165.9527
    HQIC 149.2262 160.8124 172.3665 191.4639 171.3596 293.0598 153.9824 164.3511
    KS 0.1471 0.2541 0.4822 0.6691 0.3721 0.3799 0.1966 0.2913
    P-value 0.7800 0.1323 <0.0001 <0.0001 0.008 0.006 0.4221 0.0671


    Figure 7 illustrates the P-P plot for dataset II, while Figure 8 presents the estimated CDFs and the profile of the L for the parameter β in dataset II. Figure 8 reaffirms our empirical observations, providing further support for the suitability of the DBHE distribution in analyzing this dataset. Furthermore, it underscores the uniqueness of the estimator for β.

    Figure 7.  The P-P plot for dataset II.
    Figure 8.  The estimated CDFs (left panel) and L profile of β̂ (right panel) for dataset II.

    Table 13 presents an overview of diverse estimation techniques applied to dataset II under the proposed model framework.

    Table 13.  Various estimators for dataset II.
    MPSE ADE MLE LSE RADE PCE CVME WLSE
    β 0.9686 0.9703 0.9652 0.9710 0.9681 0.9641 0.9708 0.9661
    KS 0.1233 0.1119 0.1550 0.1146 0.1280 0.1642 0.1128 0.1462
    P-Value 0.9307 0.9670 0.7531 0.9600 0.9106 0.6839 0.9649 0.8106


    The analysis shows that all estimation methods fit the data satisfactorily, with the ADE approach giving the smallest KS statistic (0.1119) and the largest P-value (0.9670).

    The third dataset is the count of carious teeth among the four deciduous molars, as reported by Krishna and Pundir [4]. To investigate the attributes of dataset III, we generated nonparametric plots, which include box plots, normal Q-Q plots, violin plots, and strip plots; see Figure 9.

    Figure 9.  Nonparametric plots for dataset III.

    The MLEs with their SE and C.I, together with the GOF results for this dataset, are given in Tables 14 to 16. The DBHE distribution yields the highest P-value (0.4550) among all fitted models and a lower chi-squared (χ^2) statistic than every competitor in Table 15. Three of the two-parameter models in Table 16 (DW, EDLi, and DGE-II) have smaller χ^2 values, but on one degree of freedom and with smaller P-values. This analysis of the real dataset suggests that the proposed distribution is a highly competitive model.
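
    The χ^2 statistic, degrees of freedom, and P-value reported for the DBHE model in Table 15 can be reproduced with the sketch below. It assumes that trailing cells are pooled until the expected frequency is at least 5 and that df = (number of cells) - 1 - (number of estimated parameters); this section does not state the pooling rule, so it is an assumption that happens to match the DBHE column. The function names are hypothetical, and the upper-tail probability uses the standard recurrence for integer degrees of freedom.

```python
import math


def pool_sparse_tail(observed, expected, min_expected=5.0):
    """Merge trailing cells until the last expected frequency reaches min_expected."""
    obs, exp = list(observed), list(expected)
    while len(exp) > 1 and exp[-1] < min_expected:
        last_obs, last_exp = obs.pop(), exp.pop()
        obs[-1] += last_obs
        exp[-1] += last_exp
    return obs, exp


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable with integer df >= 1."""
    half = x / 2.0
    if df % 2:  # odd df: start from df = 1
        q, nu = math.erfc(math.sqrt(half)), 1
    else:       # even df: start from df = 2
        q, nu = math.exp(-half), 2
    while nu < df:
        # Recurrence Q(x | nu + 2) = Q(x | nu) + (x/2)^(nu/2) e^(-x/2) / Gamma(nu/2 + 1)
        q += half ** (nu / 2.0) * math.exp(-half) / math.gamma(nu / 2.0 + 1.0)
        nu += 2
    return q


def chi_square_gof(observed, expected, n_params, min_expected=5.0):
    """Pearson chi-square statistic, degrees of freedom and P-value."""
    obs, exp = pool_sparse_tail(observed, expected, min_expected)
    stat = sum((o - e) ** 2 / e for o, e in zip(obs, exp))
    df = len(obs) - 1 - n_params
    return stat, df, chi2_sf(stat, df)


# Dataset III: observed counts and DBHE expected frequencies from Table 15.
observed = [64, 17, 10, 6, 3]
dbhe_expected = [62.8037, 21.3654, 8.5966, 3.7795, 3.4548]
stat, df, p_value = chi_square_gof(observed, dbhe_expected, n_params=1)
print(f"chi2 = {stat:.4f}, df = {df}, P-value = {p_value:.4f}")
```

    Pooling the last two cells (X = 3 and X = 4) leaves four cells and hence two degrees of freedom for the one-parameter DBHE model, in agreement with the df row of Table 15.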

    Table 14.  The MLEs, SE and C.I for dataset III.
    β α
    Model MLE SE C.I MLE SE C.I
    DBHE 0.5767 0.0372 [0.5042,0.6495]
    DR 0.6651 0.0290 [0.6081,0.7225]
    DIR 0.6259 0.0491 [0.5292,0.7214]
    Geo 0.5988 0.0379 [0.5242,0.6738]
    DPa 0.1842 0.0325 [0.1207,0.2479]
    Poi 0.6700 0.0819 [0.5096,0.8304]
    PoiLi 1.9982 0.2636 [1.4812,2.5146]
    DLi 1.2942 0.1042 [1.0901,1.4987]
    DLogL 0.7455 0.1016 [0.5462,0.9449] 1.7682 0.2671 [1.2440,2.2921]
    DIW 0.6338 0.0492 [0.5375,0.7293] 1.5764 0.2515 [1.0843,2.0676]
    DW 0.3745 0.0496 [0.2782,0.4706] 0.8951 0.1192 [0.6627,1.1282]
    EDLi 0.3791 0.0651 [0.2527,0.5063] 0.5437 0.1587 [0.2343,0.8529]
    DLi-II 0.4012 0.2695 [0,0.9281] 0.4782 0.5293 [0,1.5147]
    GGeo 0.4676 0.0892 [0.2932,0.6414] 0.6784 0.3027 [0.0863,1.2705]
    DGE-II 0.4681 0.0728 [0.3270,0.6092] 0.7181 0.2062 [0.3146,1.1222]
    DLFR 0.4013 0.0560 [0.2912,0.5115] 1.0000 0.0449 [0.9132,1]

    Table 15.  The GOF test for dataset III.
    X Ob. Fr. DBHE DR DIR Geo DPa DLi PoiLi Poi
    0 64 62.8037 33.5000 62.5034 59.8802 69.0678 57.1253 37.5183 51.1709
    1 17 21.3654 46.9437 26.4176 24.0238 15.3611 26.8834 25.0582 34.2845
    2 10 8.5966 17.0130 5.9918 9.6383 6.0031 10.4459 15.6336 11.4853
    3 6 3.7795 2.3970 2.1903 3.8669 3.0100 3.7068 9.3877 2.5650
     4 3 3.4548 0.6463 2.9126 2.5908 6.5579 1.8385 12.4902 0.4943
    Total 100 100 100 100 100 100 100 100 100
    χ^2 1.5748 48.2769 9.0561 3.3515 3.2416 6.6322 30.8894 13.2954
    df 2 1 2 2 2 2 2 1
    P-value 0.4550 <0.001 0.0113 0.188 0.199 0.0362 <0.001 <0.001

    Table 16.  The GOF test for dataset III part II.
    Expected Frequencies (Ex. Fr.)
    X Ob. Fr. DLogL DW DIW GGeo EDLi DLi-II DGE-II DLFR
    0 64 62.7253 62.6000 63.3000 62.7335 63.5850 59.8817 63.5630 59.9011
    1 17 22.4187 21.3414 22.4805 21.3633 19.7546 24.0262 20.1733 24.0136
    2 10 7.0053 8.8439 6.4429 8.7638 9.0954 9.6448 8.7926 9.6362
    3 6 2.9774 3.8811 2.7621 3.8645 4.1898 3.8710 4.0029 3.8667
    4 3 4.8734 3.3337 5.0143 3.2749 3.3752 2.5928 3.4682 2.6084
    Total 100 100 100 100 100 100 100 100 100
    χ^2 2.78403 1.50736 3.5001 1.5760 0.7490 3.3470 0.9809 3.3401
    df 1 1 1 1 1 1 1 1
    P-value 0.0952 0.2195 0.06137 0.2094 0.3868 0.0672 0.3219 0.0685


    Figure 10 shows the observed and expected PMFs for dataset III. Figure 11 displays the L profile of the DBHE parameter β for dataset III, which indicates that the estimator is unique.

    Figure 10.  The observed and expected PMFs for dataset III.
    Figure 11.  The L profile of β̂ for dataset III.

    Table 17 offers a consolidated overview of diverse estimation techniques employed for dataset III within the context of the proposed model.

    Table 17.  Various estimators for dataset III.
    MPSE ADE MLE LSE RADE PCE CVME WLSE
    β 0.8543 0.8654 0.7990 0.8687 0.8512 0.7656 0.8663 0.8086
    KS 0.2620 0.2817 0.3593 0.2902 0.2669 0.4173 0.2842 0.3411
    P-Value 0.8056 0.7347 0.4361 0.7021 0.7885 0.3501 0.7255 0.5030


    The analysis shows that all the estimation methods fit the data well, with the MPSE approach giving the smallest KS statistic (0.2620) and the largest P-value (0.8056).

    The fourth dataset comprises the number of deaths attributed to coronavirus in the Punjab region during the period from March 24, 2020, to April 30, 2020. The dataset is as follows: 1, 2, 3, 5, 5, 6, 9, 9, 11, 11, 11, 12, 15, 15, 16, 17, 18, 19, 21, 23, 24, 28, 34, 36, 37, 41, 42, 45, 51, 58, 65, 73, 81, 83, 91, 100, 103, 106. To examine the attributes of dataset IV, we generated nonparametric plots, including box plots, normal Q-Q plots, violin plots, and strip plots; see Figure 12.

    Figure 12.  Nonparametric plots for dataset IV.

    The MLEs with their SE and C.I, together with the GOF results for this dataset, are given in Tables 18 and 19. The DBHE distribution has the lowest -L, AIC, BIC, CAIC, HQIC, and KS values and the highest P-value among the fitted models. Based on this analysis of the real dataset, the proposed distribution is a highly competitive model.

    Table 18.  The MLEs, SE, and C.I for dataset IV.
    β α
    Model MLE SE C.I MLE SE C.I
    DBHE 0.9838 0.0029 [0.9781,0.9891]
    DR 0.9996 0.00007 [0.9994,0.9997]
    DIR 1.634×10^{-10}
    DBH 0.9996 0.0035 [0.9927,1.0064]
    DPa 0.7298 0.0373 [0.6567,0.8031]
    Poi 34.9211 0.9586 [33.0423,36.7999]
    DIW 0.00005 0.0001 [0,0.0003] 0.8969 0.1070 [0.6874,1.1067]
    DB-XII 0.9960 0.0041 [0.9892,1.0028] 79.5877 82.3391 [0,2153.0236]

    Table 19.  The GOF test for dataset IV.
    Statistic DBHE DR DIR DBH DPa Poi DIW DB-XII
    -L 174.1947 186.7001 226.3555 241.3062 202.5788 594.7516 179.1153 198.7273
    AIC 350.3893 375.4005 454.7092 484.6124 407.1552 1191.5021 362.2356 401.4544
    CAIC 350.5005 375.5113 454.8201 484.7235 407.2676 1191.6130 362.5713 401.7976
    BIC 352.0269 377.0386 456.3476 486.2534 408.7931 1193.1432 365.5042 404.7292
    HQIC 350.9723 375.9832 455.2923 485.1951 407.7384 1192.0851 363.3955 402.6197
    KS 0.1124 0.3089 0.6442 0.7786 0.3793 0.5193 0.1388 0.3667
    P-value 0.7227 0.00142 <0.0001 <0.0001 <0.0001 <0.0001 0.4564 <0.0001


    Figure 13 presents the P-P plot for dataset IV, whereas Figure 14 exhibits the estimated CDFs and the profile of the L for the parameter β in dataset IV. Figure 14 further reinforces our empirical observations, providing additional support for the appropriateness of the DBHE distribution in analyzing this dataset. Additionally, it emphasizes the uniqueness of the estimator for β.

    Figure 13.  The P-P plot for dataset IV.
    Figure 14.  The estimated CDFs (left panel) and L profile of β̂ (right panel) for dataset IV.

    Table 20 offers a comprehensive compilation of various estimation techniques applied to dataset IV within the context of the proposed model framework.

    Table 20.  Various estimators for dataset IV.
    MPSE ADE MLE LSE RADE PCE CVME WLSE
    β 0.9862 0.9871 0.9855 0.9874 0.9865 0.9848 0.9873 0.9852
    KS 0.1027 0.1046 0.1165 0.1091 0.0974 0.1328 0.1079 0.1208
    P-Value 0.8426 0.8267 0.7180 0.7875 0.8829 0.6224 0.7972 0.6771


    The analysis shows that all estimation techniques fit the data adequately, with the RADE method giving the smallest KS statistic (0.0974) and the largest P-value (0.8829) in Table 20.

    This article centers on a one-parameter discrete distribution, developed using the survival discretization approach and referred to as the DBHE distribution. Its statistical properties have been derived and expressed in terms of generalized hypergeometric functions. The DBHE model is particularly suitable for right-skewed, leptokurtic datasets, and it can serve as a statistical tool for modeling a decreasing HRF in the presence of outlier observations. The DBHE parameter has been estimated using MPSE, ADE, MLE, LSE, RADE, PCE, CVME, and WLSE. Simulation studies across different sample sizes showed that all these techniques are effective in estimating the DBHE parameter, with the ADE approach performing best. Four real datasets were analyzed to demonstrate the effectiveness of the DBHE distribution, which attained the highest GOF P-value among the competing distributions on every dataset. Future work includes bivariate extensions of the DBHE model, regression models, and the first-order integer-valued autoregressive process, along with their applications.

    The authors declare they have not used Artificial Intelligence (AI) tools in the creation of this article.

    The authors extend their appreciation to Prince Sattam bin Abdulaziz University for funding this research work through the project number (PSAU/2023/01/27231).

    The authors declare no conflicts of interest.



    [1] A. S. Yadav, E. Altun, H. M. Yousof, Burr-Hatke exponential distribution: A decreasing failure rate model, statistical inference and applications, Ann. Data Sci., 8 (2021), 241–260. https://doi.org/10.1007/s40745-019-00213-8
    [2] M. El-Morshedy, M. S. Eliwa, E. Altun, Discrete Burr-Hatke distribution with properties, estimation methods and regression model, IEEE Access, 8 (2020), 74359–74370. https://doi.org/10.1109/ACCESS.2020.2988431
    [3] M. El-Morshedy, A discrete linear-exponential model: Synthesis and analysis with inference to model extreme count data, Axioms, 11 (2022), 531. https://doi.org/10.3390/axioms11100531
    [4] H. Krishna, P. S. Pundir, Discrete Burr and discrete Pareto distributions, Statist. Methodol., 6 (2009), 177–188. https://doi.org/10.1016/j.stamet.2008.07.001
    [5] T. Hussain, M. Ahmad, Discrete inverse Rayleigh distribution, Pakistan J. Statist., 30 (2014), 203.
    [6] M. A. Jazi, C. D. Lai, M. H. Alamatsaz, A discrete inverse Weibull distribution and estimation of its parameters, Statist. Methodol., 7 (2010), 121–132. https://doi.org/10.1016/j.stamet.2009.11.001
    [7] E. Gómez-Déniz, E. Calderín-Ojeda, The discrete Lindley distribution: properties and applications, J. Statist. Comput. Simul., 81 (2011), 1405–1416. https://doi.org/10.1080/00949655.2010.487825
    [8] J. M. Jia, Z. Z. Yan, X. Y. Peng, A new discrete extended Weibull distribution, IEEE Access, 7 (2019), 175474–175486. https://doi.org/10.1109/ACCESS.2019.2957788
    [9] E. Gómez-Déniz, Another generalization of the geometric distribution, Test, 19 (2010), 399–415. https://doi.org/10.1007/s11749-009-0169-3
    [10] M. A. Hegazy, R. E. Abd El-Kader, A. A. El-Helbawy, G. R. Al-Dayian, Bayesian estimation and prediction of discrete Gompertz distribution, J. Adv. Math. Comput. Sci., 36 (2021), 1–21.
    [11] V. Nekoukhou, M. H. Alamatsaz, H. Bidram, Discrete generalized exponential distribution of a second type, Statistics, 47 (2013), 876–887. https://doi.org/10.1080/02331888.2011.633707
    [12] E. M. Almetwally, S. Dey, S. Nadarajah, An overview of discrete distributions in modelling COVID-19 data sets, Sankhya A, 85 (2023), 1403–1430. https://doi.org/10.1007/s13171-022-00291-6
    [13] A. S. Eldeeb, M. Ahsan-ul-Haq, M. S. Eliwa, A discrete Ramos-Louzada distribution for asymmetric and over-dispersed data with leptokurtic-shaped: Properties and various estimation techniques with inference, AIMS Math., 7 (2022), 1726–1741. https://doi.org/10.3934/math.2022099
    [14] H. Haj Ahmad, D. A. Ramadan, E. M. Almetwally, Evaluating the discrete generalized Rayleigh distribution: Statistical inferences and applications to real data analysis, Mathematics, 12 (2024), 183. https://doi.org/10.3390/math12020183
    [15] H. M. Aljohani, M. Ahsan-ul-Haq, J. Zafar, E. M. Almetwally, A. S. Alghamdi, E. Hussam, et al., Analysis of COVID-19 data using discrete Marshall-Olkinin length biased exponential: Bayesian and frequentist approach, Sci. Rep., 13 (2023), 12243. https://doi.org/10.1038/s41598-023-39183-6
    [16] J. F. Lawless, Statistical Models and Methods for Lifetime Data, Hoboken: John Wiley & Sons, 2011.
    [17] P. Damien, S. Walker, A Bayesian non-parametric comparison of two treatments, Scand. J. Statist., 29 (2002), 51–56. https://doi.org/10.1111/1467-9469.00891
  • This article has been cited by:

    1. Kizito E. Anyiam, Fatimah M. Alghamdi, Chrysogonus C. Nwaigwe, Hassan M. Aljohani, Okechukwu J. Obulezi, A new extension of Burr-Hatke exponential distribution with engineering and biomedical applications, 2024, 10, 24058440, e38293, 10.1016/j.heliyon.2024.e38293
  • © 2024 the Author(s), licensee AIMS Press. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0)