1. Introduction
Type-Ⅰ and Type-Ⅱ censoring schemes are the two most popular censoring schemes used in practice. A mixture of the two, known as the hybrid censoring scheme, has been discussed in the literature and was first introduced by Epstein [3]. The hybrid censoring scheme has become quite popular in reliability and life-testing experiments; see, for example, Fairbanks et al. [4], Draper and Guttman [5], Chen and Bhattacharya [6], Jeong et al. [7], Childs et al. [8] and Gupta and Kundu [9]. Balakrishnan and Kundu [10] have extensively reviewed Type-Ⅰ and Type-Ⅱ hybrid censoring schemes and the associated inferential issues. They presented details on the development of the generalized hybrid and unified hybrid censoring schemes introduced in the literature, together with several examples illustrating the described results. From now on, we refer to this hybrid censoring scheme as the Type-Ⅰ hybrid censoring scheme (Type-Ⅰ HCS). It is evident that the complete sample situation as well as the Type-Ⅰ and Type-Ⅱ right censoring schemes are all special cases of this Type-Ⅰ HCS.
Recently, Tripathi and Lodhi [11] have discussed inferential procedures for the Weibull competing risks model with partially observed failure causes under generalized progressive hybrid censoring. Jeon and Kang [12] have estimated the half-logistic distribution based on multiply Type-Ⅱ hybrid censoring. Nassar and Dobbah [13] have analyzed the reliability characteristics of a bathtub-shaped distribution under adaptive Type-Ⅰ progressive hybrid censoring. Algarni, Almarashi and Abd-Elmougoud [14] have considered joint Type-Ⅰ generalized hybrid censoring for the estimation of two Weibull distributions.
A three-parameter Dagum distribution was proposed by Dagum [15,16]; it plays an important role in modeling the size distribution of personal income. This distribution also offers considerable flexibility for modeling lifetime data, such as in reliability. The Dagum distribution is not very popular, perhaps because of its involved mathematical form. In the 1970s, Camilo Dagum embarked on a quest for a statistical distribution closely fitting empirical income and wealth distributions. Not satisfied with the classical distributions, he looked for a model accommodating the heavy tails present in empirical income and wealth distributions as well as permitting an interior mode. He ended up with the Dagum Type Ⅰ distribution, a three-parameter distribution, and two four-parameter generalizations; see Dagum [16,17,18]. The Dagum distribution is also called the inverse Burr, especially in the actuarial literature, as it is the reciprocal transformation of the Burr XII distribution, although it never attained the popularity that the Burr XII enjoys in various fields of science. Since Dagum proposed his model as an income distribution, its properties have been appreciated in economics and finance, and its features have been extensively discussed in studies of income and wealth. Kleiber and Kotz [19] and Kleiber [20] provided an exhaustive review of the origin of the Dagum model and its applications. Quintano and D'Agostino [21] adjusted the Dagum model for income distribution to account for individual characteristics, while Domma et al. [22,23] studied the Fisher information matrix for doubly censored data from the Dagum distribution as well as its reliability properties. An important characteristic of the Dagum distribution is that its hazard function can be monotonically decreasing, upside-down bathtub shaped, or bathtub and then upside-down bathtub shaped; see Domma [24]. This behavior has led several authors to study the model in different fields. In fact, the Dagum distribution has been studied from a reliability point of view and used to analyze survival data; see Domma et al. [23].
The Dagum distribution is specified by the probability density function (pdf)

f(x; λ, β, θ) = βθλ x^{−(β+1)} (1 + λx^{−β})^{−(θ+1)}, x > 0, (1.1)

and cumulative distribution function (cdf)

F(x; λ, β, θ) = (1 + λx^{−β})^{−θ}, x > 0, (1.2)

where λ is the scale parameter and β, θ are the shape parameters.
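Since the cdf (1.2) inverts in closed form, samples from the Dagum distribution can be drawn by inverse-transform sampling. The following Python sketch (function names are ours, not from the paper) illustrates this; it is reused by the later sketches to generate samples.

```python
import numpy as np

def dagum_pdf(x, lam, beta, theta):
    # pdf (1.1): f(x) = beta*theta*lam * x**(-(beta+1)) * (1 + lam*x**(-beta))**(-(theta+1))
    return beta * theta * lam * x**(-(beta + 1)) * (1 + lam * x**(-beta))**(-(theta + 1))

def dagum_cdf(x, lam, beta, theta):
    # cdf (1.2): F(x) = (1 + lam*x**(-beta))**(-theta)
    return (1 + lam * x**(-beta))**(-theta)

def dagum_ppf(u, lam, beta, theta):
    # Solving F(x) = u for x gives x = (lam / (u**(-1/theta) - 1))**(1/beta)
    return (lam / (u**(-1.0 / theta) - 1.0))**(1.0 / beta)

rng = np.random.default_rng(1)
n = 30
x_full = np.sort(dagum_ppf(rng.uniform(size=n), lam=5.0, beta=2.0, theta=2.0))
```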
Huang and Yang [1] have considered a combined hybrid censoring sampling (CHCS) scheme, which is defined as follows. For fixed m, r ∈ {1,2,…,n} and (T1, T2) ∈ (0,∞) such that m < r and T1 < T2, let T∗ denote the terminating time of the experiment. If the mth failure occurs before time T1, the experiment terminates at min{X_{r:n}, T1}; if the mth failure occurs between T1 and T2, the experiment terminates at X_{m:n}; and finally, if the mth failure occurs after time T2, the experiment terminates at T2. For later convenience, we abbreviate this scheme as CHCS(m,r;T1,T2). This scheme contains the following six cases, and obviously, in each case some part of the data is unobservable:
where the data in parentheses are unobservable.
Balakrishnan et al. [2] have proposed a unified hybrid censoring sampling (UHCS) scheme as follows. For fixed m, r ∈ {1,2,…,n} and (T1, T2) ∈ (0,∞) with m < r and T1 < T2, let T∗ denote the terminating time of the experiment. If the mth failure occurs before time T1, the experiment terminates at min{max{X_{r:n}, T1}, T2}; if the mth failure occurs between T1 and T2, the experiment terminates at min{X_{r:n}, T2}; and finally, if the mth failure occurs after time T2, the experiment terminates at X_{m:n}. Again, for later convenience, we abbreviate this scheme as UHCS(m,r;T1,T2). Similarly, this hybrid censoring scheme contains six cases, and in each case some part of the data is unobservable:
where the data in parentheses are unobservable.
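To make the two stopping rules concrete, the following sketch (our own illustration of the definitions above, building on the previous one) returns the stopping time T and the number k of failures observed up to T for an ordered sample; these are exactly the quantities that enter the likelihood of Section 2.

```python
import numpy as np

def chcs_stop(x, m, r, T1, T2):
    """Stopping time and failure count for CHCS(m, r; T1, T2)."""
    if x[m - 1] < T1:          # m-th failure before T1
        T = min(x[r - 1], T1)
    elif x[m - 1] <= T2:       # m-th failure between T1 and T2
        T = x[m - 1]
    else:                      # m-th failure after T2
        T = T2
    k = int(np.searchsorted(x, T, side="right"))  # failures observed up to T
    return k, T

def uhcs_stop(x, m, r, T1, T2):
    """Stopping time and failure count for UHCS(m, r; T1, T2)."""
    if x[m - 1] < T1:
        T = min(max(x[r - 1], T1), T2)
    elif x[m - 1] <= T2:
        T = min(x[r - 1], T2)
    else:
        T = x[m - 1]
    k = int(np.searchsorted(x, T, side="right"))
    return k, T

k, T = chcs_stop(x_full, m=10, r=15, T1=18.0, T2=30.0)
x_obs = x_full[:k]             # the observed part of the sample
```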
In this paper, we merge CHCS(m,r;T1,T2) and UHCS(m,r;T1,T2) in a unified approach called the combined-unified hybrid censoring scheme (C-UHCS(m,r;T1,T2)). To the best of our knowledge, no attempt has been made to estimate the parameters of the Dagum distribution under CHCS(m,r;T1,T2) or UHCS(m,r;T1,T2), so we apply C-UHCS(m,r;T1,T2) to the Dagum distribution. We first obtain the maximum likelihood estimates of the parameters and use them to construct asymptotic and bootstrap confidence intervals (CIs). Next, we obtain the Bayes estimates of λ, β and θ. The layout of this paper is as follows. In Section 2, we describe the construction of the likelihood function based on C-UHCS(m,r;T1,T2). In Section 3, we obtain the MLEs of λ, β and θ, and discuss asymptotic and bootstrap confidence intervals based on the observed Fisher information matrix. In Section 4, we consider Bayesian estimation of the unknown parameters under the squared error and LINEX loss functions. Simulation studies are carried out in Section 5 to assess the performance of the proposed methods, and Section 6 illustrates the methods on a real data set. Section 7 contains a brief conclusion.
2. Likelihood function under C-UHCS
Let X1, X2, ..., Xn denote the lifetimes of n experimental units placed on a life-test. We assume that these variables are iid from an absolutely continuous population with cumulative distribution function (cdf) F(x) and probability density function (pdf) f(x). In this section, we construct the likelihood function under the censoring scheme C-UHCS(m,r;T1,T2). Let Dj denote the number of failures observed up to time Tj, j = 1, 2; obviously, D1 ≤ D2. Then, the likelihood function of CHCS(m,r;T1,T2), for a parameter space Ω, is given as
Similarly, the observed likelihood function based on UHCS(m,r;T1,T2) is given as
Assume that, in any case, we terminate the experiment at T, which may refer to time T1, time T2, observation x_m or observation x_r, and let k denote the number of failures observed up to T, equal to D1, D2, m or r, respectively. The likelihood function of C-UHCS(k,r;T1,T2), which represents all the previous likelihood functions L(C)(Ω|x) and L(U)(Ω|x) under the different values of k and T, with x_k = (x1, x2, ..., x_k), can be written as
where k and T can be chosen as:
3. Maximum likelihood estimation
Let {x1, x2, ..., xk} be a sequence of observed data from the Dagum distribution. Substituting (1.1) and (1.2) in (2.3), the observed likelihood function of λ, β and θ based on C-UHCS(k,r;T1,T2) becomes
and the corresponding log-likelihood function (Ł) is
Taking the first partial derivatives of the log-likelihood (3.2) with respect to λ, β and θ and equating each to zero, and letting M_a = (1 + λT^{−β})^{−(θ+a)}, we obtain
The solutions of the above nonlinear equations are the maximum likelihood estimators of the Dagum distribution parameters λ, β and θ. As the equations (3.3), (3.4) and (3.5) cannot be solved analytically, a numerical procedure must be used.
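As an illustration of this numerical step, the sketch below maximizes the log-likelihood directly rather than solving (3.3)-(3.5). It assumes the standard censored-data form of the likelihood: k observed failure times not exceeding the stopping time T, and n − k units right-censored at T (constants not involving the parameters are dropped, as they do not affect the maximization); the helper names are ours.

```python
import numpy as np
from scipy.optimize import minimize

def negloglik(params, x, n, T):
    """Negative log-likelihood: k observed failures x, plus n - len(x)
    units censored at T, under the Dagum model (1.1)-(1.2)."""
    lam, beta, theta = params
    if min(lam, beta, theta) <= 0:
        return np.inf
    logf = (np.log(beta * theta * lam) - (beta + 1) * np.log(x)
            - (theta + 1) * np.log1p(lam * x**(-beta)))
    logS = np.log1p(-(1 + lam * T**(-beta))**(-theta))  # log P(X > T)
    return -(logf.sum() + (n - len(x)) * logS)

def fit_mle(x, n, T, start=(1.0, 1.0, 1.0)):
    # Nelder-Mead avoids coding the derivatives in (3.3)-(3.5)
    return minimize(negloglik, start, args=(x, n, T), method="Nelder-Mead").x

lam_hat, beta_hat, theta_hat = fit_mle(x_obs, n, T)
```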
We can then use the asymptotic normality of the MLEs to compute asymptotic confidence intervals for the parameters λ, β and θ. The observed variance-covariance matrix for the MLEs of the parameters, V̂ = [σ_{i,j}], i, j = 1, 2, 3, is taken as
where
The 100(1−α)% two-sided approximate confidence intervals for the parameters λ, β and θ are then given by

λ̂_{ML} ± z_{α/2} √V(λ̂), β̂_{ML} ± z_{α/2} √V(β̂) and θ̂_{ML} ± z_{α/2} √V(θ̂),

respectively, where V(λ̂), V(β̂) and V(θ̂) are the estimated variances of λ̂_{ML}, β̂_{ML} and θ̂_{ML}, given by the diagonal elements of V̂, and z_{α/2} is the upper (α/2)th percentile of the standard normal distribution.
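A minimal sketch of these intervals, assuming the negloglik and fit_mle helpers from the previous sketch: the observed information is approximated by a finite-difference Hessian of the negative log-likelihood, and V̂ is its inverse.

```python
import numpy as np
from scipy.stats import norm

def num_hessian(f, p, eps=1e-4):
    """Central finite-difference Hessian of a scalar function f at p."""
    p = np.asarray(p, float)
    d = len(p)
    H = np.zeros((d, d))
    for i in range(d):
        for j in range(d):
            pp = p.copy(); pp[i] += eps; pp[j] += eps
            pm = p.copy(); pm[i] += eps; pm[j] -= eps
            mp = p.copy(); mp[i] -= eps; mp[j] += eps
            mm = p.copy(); mm[i] -= eps; mm[j] -= eps
            H[i, j] = (f(pp) - f(pm) - f(mp) + f(mm)) / (4 * eps**2)
    return H

mle = fit_mle(x_obs, n, T)
V = np.linalg.inv(num_hessian(lambda p: negloglik(p, x_obs, n, T), mle))
z = norm.ppf(0.975)                         # 95% two-sided interval
wald_ci = [(m - z * np.sqrt(v), m + z * np.sqrt(v))
           for m, v in zip(mle, np.diag(V))]
```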
In order to construct bootstrap (Boot-p) confidence intervals for the unknown parameters ϕ = (λ, β, θ) based on the C-UHCS scheme, we apply the following algorithm (for more details, one may refer to Kundu and Joarder [25] and Dube, Garg and Krishna [26]).
Boot-p interval algorithm:
Step 1: Simulate x_{1:n}, x_{2:n}, ..., x_{k:n} from the Dagum distribution given in (1.1) and derive an estimate ϕ̂ of ϕ.
Step 2: Simulate another sample x∗_{1:n}, x∗_{2:n}, ..., x∗_{k:n} using ϕ̂, k and T. Then derive the updated bootstrap estimate ϕ̂∗ of ϕ.
Step 3: Repeat the previous step a prescribed number B of times.
Step 4: With F̂(x) = P(ϕ̂∗ ≤ x) denoting the distribution function of ϕ̂∗, the 100(1−α)% confidence interval of ϕ is given by

(ϕ̂_{Boot-p}(α/2), ϕ̂_{Boot-p}(1 − α/2)),

where ϕ̂_{Boot-p}(x) = F̂^{−1}(x) for prefixed x.
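A sketch of the Boot-p procedure under the same assumptions as the earlier sketches (dagum_ppf and fit_mle are our helpers); each bootstrap sample is censored at the same stopping time T before re-estimation.

```python
import numpy as np

def boot_p_ci(x, n, T, alpha=0.05, B=1000, seed=2):
    """Percentile bootstrap intervals for (lam, beta, theta)."""
    rng = np.random.default_rng(seed)
    phi_hat = fit_mle(x, n, T)                  # Step 1
    boot = np.empty((B, 3))
    for b in range(B):                          # Steps 2-3
        xs = np.sort(dagum_ppf(rng.uniform(size=n), *phi_hat))
        boot[b] = fit_mle(xs[xs <= T], n, T)    # censor at T, re-estimate
    lo = np.quantile(boot, alpha / 2, axis=0)   # Step 4: F_hat^{-1}(alpha/2)
    hi = np.quantile(boot, 1 - alpha / 2, axis=0)
    return np.column_stack([lo, hi])
```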
4. Bayesian estimation
Bayesian inference is a convenient method to use with C-UHCS(k,r;T1,T2). Indeed, given that data under C-UHCS(k,r;T1,T2) are so scarce, prior information is welcome. Loss functions are chosen depending on how one measures the distance between the estimate and the unknown parameter. To conduct the Bayesian analysis, a quadratic loss function is usually considered. A very popular quadratic loss is the squared error (SE) loss function given by

L_{SE}(g(φ), ĝ(φ)) = (ĝ(φ) − g(φ))², (4.1)

where ĝ(φ) is an estimate of the parametric function g(φ). The Bayes estimate of g(φ), say ĝ_{ST}(φ), under the SE loss function is the posterior mean given by

ĝ_{ST}(φ) = E[g(φ) | x]. (4.2)
Using the SE loss function in the Bayesian approach penalizes underestimation and overestimation equally, which can be inappropriate in practice. For instance, in estimating reliability characteristics, overestimation is more serious than underestimation. Therefore, asymmetric loss functions, such as the LINEX loss function, have been considered by researchers. The LINEX loss function, given by

L_{LE}(g(φ), ĝ(φ)) = e^{ρ(ĝ(φ)−g(φ))} − ρ(ĝ(φ) − g(φ)) − 1, ρ ≠ 0, (4.3)

is a popular asymmetric loss function that penalizes underestimation and overestimation for negative and positive values of ρ, respectively. For ρ close to zero, the LINEX loss is approximately equal to the SE loss and therefore almost symmetric. The Bayes estimate of g(φ) under the LINEX (LE) loss function becomes

ĝ_{LT}(φ) = −(1/ρ) ln E[e^{−ρ g(φ)} | x]. (4.4)
Here, we derive the Bayes estimates under the above loss functions. Under the assumption that the parameters λ, β and θ are unknown and independent, we adopt the joint prior density function suggested by Al-Hussaini et al. [27], which gave good results, namely

π(λ, β, θ) ∝ e^{−(ν1 λ + ν2 β + ν3 θ)}, λ, β, θ > 0, (4.5)

where ν1, ν2 and ν3 are positive constants.
4.1. Tierney-Kadane's approximation
In order to use Tierney-Kadane's approximation technique, we set

ϕ(λ, β, θ) = (1/n)[Ł(λ, β, θ) + ln π(λ, β, θ)] and ϕ^{(g)}(λ, β, θ) = ϕ(λ, β, θ) + (1/n) ln g(λ, β, θ). (4.6)
Now, under the squared error loss function, the Bayes estimate of a function of the parameters g(λ, β, θ) can be written in terms of (4.6) as

ĝ_{ST}(λ, β, θ) = E[g(λ, β, θ) | x] = ∫∫∫ e^{n ϕ^{(g)}(λ,β,θ)} dλ dβ dθ / ∫∫∫ e^{n ϕ(λ,β,θ)} dλ dβ dθ. (4.7)
Following Tierney and Kadane [28], the approximate form of (4.7) becomes

ĝ_{ST}(λ, β, θ) ≈ √(det H^{(g)} / det H) exp{ n [ϕ^{(g)}(λ̄^{(g)}, β̄^{(g)}, θ̄^{(g)}) − ϕ(λ̄, β̄, θ̄)] }, (4.8)
where (λ̄^{(g)}, β̄^{(g)}, θ̄^{(g)}) and (λ̄, β̄, θ̄) maximize ϕ^{(g)}(λ, β, θ) and ϕ(λ, β, θ), respectively, and H^{(g)} and H are the inverses of the negative Hessian matrices of ϕ^{(g)}(λ, β, θ) and ϕ(λ, β, θ) at (λ̄^{(g)}, β̄^{(g)}, θ̄^{(g)}) and (λ̄, β̄, θ̄), respectively. Here, from (3.2), (4.5) and (4.6), we have

ϕ(λ, β, θ) = (1/n)[Ł(λ, β, θ) − ν1 λ − ν2 β − ν3 θ].
Now, (λ̄, β̄, θ̄) can be calculated from the simultaneous solution of the nonlinear equations

∂Ł/∂λ − ν1 = 0, ∂Ł/∂β − ν2 = 0 and ∂Ł/∂θ − ν3 = 0.
The second-order derivatives of Ł, given in (3.7)-(3.12), can be used to determine the determinant of the inverse of the negative Hessian matrix of ϕ(λ, β, θ) at (λ̄, β̄, θ̄) as
Then, the Bayesian estimates of λ, β and θ based on the squared error loss function can be obtained by replacing g(λ, β, θ) by λ, β and θ, respectively; the corresponding ϕ^{(g)}_{ST}(λ, β, θ) take the forms

ϕ^{(λ)}_{ST} = ϕ + (1/n) ln λ, ϕ^{(β)}_{ST} = ϕ + (1/n) ln β and ϕ^{(θ)}_{ST} = ϕ + (1/n) ln θ.
Hence, (λ̄^{(λ)}_{ST}, β̄^{(λ)}_{ST}, θ̄^{(λ)}_{ST}), (λ̄^{(β)}_{ST}, β̄^{(β)}_{ST}, θ̄^{(β)}_{ST}) and (λ̄^{(θ)}_{ST}, β̄^{(θ)}_{ST}, θ̄^{(θ)}_{ST}) can be computed by maximizing ϕ^{(λ)}_{ST}(λ, β, θ), ϕ^{(β)}_{ST}(λ, β, θ) and ϕ^{(θ)}_{ST}(λ, β, θ), respectively, through the simultaneous solution of each of the following systems:
System 1: ∂Ł/∂λ − ν1 + 1/(nλ) = 0, ∂Ł/∂β − ν2 = 0 and ∂Ł/∂θ − ν3 = 0,
System 2: ∂Ł/∂λ − ν1 = 0, ∂Ł/∂β − ν2 + 1/(nβ) = 0 and ∂Ł/∂θ − ν3 = 0,
System 3: ∂Ł/∂λ − ν1 = 0, ∂Ł/∂β − ν2 = 0 and ∂Ł/∂θ − ν3 + 1/(nθ) = 0.
Again, the second-order derivatives of ϕ^{(λ)}_{ST}(λ, β, θ), ϕ^{(β)}_{ST}(λ, β, θ) and ϕ^{(θ)}_{ST}(λ, β, θ) at (λ̄^{(λ)}_{ST}, β̄^{(λ)}_{ST}, θ̄^{(λ)}_{ST}), (λ̄^{(β)}_{ST}, β̄^{(β)}_{ST}, θ̄^{(β)}_{ST}) and (λ̄^{(θ)}_{ST}, β̄^{(θ)}_{ST}, θ̄^{(θ)}_{ST}) can be used to calculate the elements of H^{(λ)}_{ST}, H^{(β)}_{ST} and H^{(θ)}_{ST}, respectively, as:
and
Therefore, the approximate Bayes estimates of λ, β and θ based on the squared error loss function are:

λ̂_{ST} = √(det H^{(λ)}_{ST} / det H) exp{ n [ϕ^{(λ)}_{ST}(λ̄^{(λ)}_{ST}, β̄^{(λ)}_{ST}, θ̄^{(λ)}_{ST}) − ϕ(λ̄, β̄, θ̄)] },

and similarly for β̂_{ST} and θ̂_{ST}.
Next, in order to obtain the Bayesian estimates of λ, β and θ based on the LINEX loss function, we replace g(λ, β, θ) by e^{−ρλ}, e^{−ρβ} and e^{−ρθ}, respectively; the corresponding ϕ^{(g)}_{LT}(λ, β, θ) take the forms

ϕ^{(λ)}_{LT} = ϕ − (ρ/n) λ, ϕ^{(β)}_{LT} = ϕ − (ρ/n) β and ϕ^{(θ)}_{LT} = ϕ − (ρ/n) θ.
Hence, (λ̄^{(λ)}_{LT}, β̄^{(λ)}_{LT}, θ̄^{(λ)}_{LT}), (λ̄^{(β)}_{LT}, β̄^{(β)}_{LT}, θ̄^{(β)}_{LT}) and (λ̄^{(θ)}_{LT}, β̄^{(θ)}_{LT}, θ̄^{(θ)}_{LT}) can be computed by maximizing ϕ^{(λ)}_{LT}(λ, β, θ), ϕ^{(β)}_{LT}(λ, β, θ) and ϕ^{(θ)}_{LT}(λ, β, θ), respectively, by solving simultaneously each of the following systems:
System 4: ∂Ł/∂λ − ν1 − ρ/n = 0, ∂Ł/∂β − ν2 = 0 and ∂Ł/∂θ − ν3 = 0,
System 5: ∂Ł/∂λ − ν1 = 0, ∂Ł/∂β − ν2 − ρ/n = 0 and ∂Ł/∂θ − ν3 = 0,
System 6: ∂Ł/∂λ − ν1 = 0, ∂Ł/∂β − ν2 = 0 and ∂Ł/∂θ − ν3 − ρ/n = 0.
Once again, we can derive H^{(λ)}_{LT} = H^{(β)}_{LT} = H^{(θ)}_{LT} = H_{LT} by calculating the second-order derivatives of ϕ^{(λ)}_{LT}(λ, β, θ), ϕ^{(β)}_{LT}(λ, β, θ) and ϕ^{(θ)}_{LT}(λ, β, θ) at (λ̄^{(λ)}_{LT}, β̄^{(λ)}_{LT}, θ̄^{(λ)}_{LT}), (λ̄^{(β)}_{LT}, β̄^{(β)}_{LT}, θ̄^{(β)}_{LT}) and (λ̄^{(θ)}_{LT}, β̄^{(θ)}_{LT}, θ̄^{(θ)}_{LT}), in the same manner as in (4.12)-(4.15). Therefore, the approximate Bayes estimates of λ, β and θ based on the LINEX loss function are:

λ̂_{LT} = −(1/ρ) ln[ √(det H_{LT} / det H) exp{ n [ϕ^{(λ)}_{LT}(λ̄^{(λ)}_{LT}, β̄^{(λ)}_{LT}, θ̄^{(λ)}_{LT}) − ϕ(λ̄, β̄, θ̄)] } ],

and similarly for β̂_{LT} and θ̂_{LT}.
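The whole Tierney-Kadane machinery above can be condensed into a short numerical routine. The sketch below (our own illustration, reusing negloglik and num_hessian from Section 3 and the scaling ϕ = (1/n)[Ł + ln π] of (4.6)) maximizes ϕ and ϕ^{(g)} numerically instead of solving Systems 1-6 explicitly.

```python
import numpy as np
from scipy.optimize import minimize

nu = np.array([0.2, 0.5, 0.5])        # hyper-parameters (nu1, nu2, nu3)

def phi(p):
    # phi = (1/n) * [log-likelihood + log prior], prior as in (4.5)
    return (-negloglik(p, x_obs, n, T) - nu @ p) / n

def tk_ratio(log_g, start=(1.0, 1.0, 1.0)):
    """Tierney-Kadane approximation (4.8) of E[g | data] for positive g."""
    def neg_phi(p):
        return -phi(p)
    def neg_phig(p):
        v = phi(p)
        return np.inf if not np.isfinite(v) else -(v + log_g(p) / n)
    r0 = minimize(neg_phi, start, method="Nelder-Mead")
    r1 = minimize(neg_phig, start, method="Nelder-Mead")
    H = np.linalg.inv(num_hessian(neg_phi, r0.x))    # H
    Hg = np.linalg.inv(num_hessian(neg_phig, r1.x))  # H^(g)
    return np.sqrt(np.linalg.det(Hg) / np.linalg.det(H)) * \
           np.exp(n * (r0.fun - r1.fun))             # n*[phi_g(max) - phi(max)]

lam_ST = tk_ratio(lambda p: np.log(p[0]))            # SE loss: g = lambda
rho = 1.0                                            # LINEX: g = exp(-rho*lambda)
lam_LT = -np.log(tk_ratio(lambda p: -rho * p[0])) / rho
```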
In order to calculate a 100(1−α)% HPD credible interval for the Bayes estimate, under either the SE or the LINEX loss function, of any parameter, say δ, we follow the steps below:
HPD credible interval:
1. Simulate a censored sample of size n from the Dagum distribution given in (1.1) and calculate the estimate of δ under a certain choice of k, r, T1 and T2, say δ∗.
2. Repeat the previous step M times to get δ∗_1, δ∗_2, …, δ∗_M, with ordered values δ∗_{1:M} ≤ δ∗_{2:M} ≤ … ≤ δ∗_{M:M}.
3. The 100(1−α)% HPD credible interval for δ is the shortest of the intervals (δ∗_{j:M}, δ∗_{j+[(1−α)M]:M}), j = 1, 2, …, [αM].
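A direct implementation of this shortest-interval search (a sketch; the draws can come from any of the estimators above):

```python
import numpy as np

def hpd_interval(draws, alpha=0.05):
    """Shortest interval among (d[j], d[j + floor((1-alpha)*M)]), step 3 above."""
    d = np.sort(np.asarray(draws))
    M = len(d)
    L = int(np.floor((1 - alpha) * M))   # rank distance between the two bounds
    j = int(np.argmin(d[L:] - d[:M - L]))
    return d[j], d[j + L]
```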
4.2. MCMC method
In the previous subsection, we used Tierney-Kadane's approximation to derive the Bayes estimates of the parameters. However, it is not possible to obtain HPD credible intervals using this method. In this subsection, we adopt a Metropolis-Hastings within Gibbs sampling approach to generate random samples from the conditional densities of the parameters and use them to obtain both point Bayes estimates and HPD credible intervals. From (3.1) and (4.5), the posterior density of λ, β and θ can be written as

π∗(λ, β, θ | x) ∝ L(λ, β, θ | x) e^{−(ν1 λ + ν2 β + ν3 θ)}.
In the following algorithm, we employ the Metropolis-Hastings (M-H) technique with normal proposal distributions to generate samples from this posterior distribution.
1. Start with initial values of the parameters (λ^{(0)}, β^{(0)}, θ^{(0)}). Then, simulate a censored sample of size k under a certain choice of m, r, T1 and T2 from the Dagum distribution given in (1.1), and set l = 1.
2. Generate λ^{(∗)}, β^{(∗)}, θ^{(∗)} from the proposal distributions N(λ^{(l−1)}, 1), N(β^{(l−1)}, 1) and N(θ^{(l−1)}, 1), respectively.
3. Calculate the acceptance probability r = min{1, π∗(λ^{(∗)}, β^{(∗)}, θ^{(∗)}) / π∗(λ^{(l−1)}, β^{(l−1)}, θ^{(l−1)})}.
4. Generate U from uniform(0, 1).
5. If U < r, accept the proposal and set (λ^{(l)}, β^{(l)}, θ^{(l)}) = (λ^{(∗)}, β^{(∗)}, θ^{(∗)}); otherwise, reject it and set (λ^{(l)}, β^{(l)}, θ^{(l)}) = (λ^{(l−1)}, β^{(l−1)}, θ^{(l−1)}).
6. Set l = l + 1.
7. Repeat Steps 2-6 M times to obtain (λ^{(l)}, β^{(l)}, θ^{(l)}) for l = 1, ..., M.
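A compact sketch of this sampler, reusing negloglik and the prior constants nu from the earlier sketches; the acceptance test in steps 3-5 is done on the log scale, which is equivalent and numerically safer.

```python
import numpy as np

def log_post(p):
    # log pi*(lam, beta, theta | x) up to an additive constant
    return -np.inf if np.min(p) <= 0 else -negloglik(p, x_obs, n, T) - nu @ p

def mh_sampler(p0, M=15000, seed=3):
    rng = np.random.default_rng(seed)
    chain = np.empty((M, 3))
    p = np.asarray(p0, float)
    lp = log_post(p)
    for l in range(M):
        prop = p + rng.normal(size=3)              # N(., 1) proposals (step 2)
        lp_prop = log_post(prop)
        if np.log(rng.uniform()) < lp_prop - lp:   # accept with probability r
            p, lp = prop, lp_prop
        chain[l] = p
    return chain

chain = mh_sampler([1.0, 1.0, 1.0])
```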
Using the random samples generated by the above Gibbs sampling technique, with N denoting the number of burn-in samples, the approximate Bayes estimates of a parameter, say δ, under the squared error and LINEX loss functions can be obtained as

δ̂_{SE} = (1/(M−N)) Σ_{l=N+1}^{M} δ^{(l)}

and

δ̂_{LE} = −(1/ρ) ln[ (1/(M−N)) Σ_{l=N+1}^{M} e^{−ρ δ^{(l)}} ].
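In code, with the chain from the sketch above and N burn-in draws discarded:

```python
N = 5000                     # burn-in
post = chain[N:]
est_SE = post.mean(axis=0)                                  # squared error
rho = 1.0
est_LE = -np.log(np.exp(-rho * post).mean(axis=0)) / rho    # LINEX
```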
MCMC HPD credible interval Algorithm:
1. Arrange the generated values of λ^{(l)}, β^{(l)} and θ^{(l)} in increasing order.
2. Find the positions of the lower bounds, (M−N)α/2, and determine the lower bounds of λ, β and θ.
3. Find the positions of the upper bounds, (M−N)(1−α/2), and determine the upper bounds of λ, β and θ.
4. Repeat the above steps M times to obtain a set of MCMC HPD credible intervals.
5. Report the average values of the lower and upper bounds as the MCMC HPD credible intervals of λ, β and θ.
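Steps 2-3 amount to reading off the empirical quantiles of the retained draws, e.g.:

```python
lower = np.quantile(post, 0.025, axis=0)  # position (M - N) * alpha/2
upper = np.quantile(post, 0.975, axis=0)  # position (M - N) * (1 - alpha/2)
```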
5. Simulation experiments
In this section, we demonstrate the usefulness of the theoretical findings of this paper by conducting a series of simulation experiments. The simulations report the bias and estimated risk of the maximum likelihood and Bayesian estimates. The Bayesian estimates are calculated under the squared error and LINEX loss functions. In addition, the 95% and 90% confidence, bootstrap and HPD credible intervals are calculated along with the corresponding widths. The simulation experiments can be explained through the following steps:
We evaluate the performance of the Bayes estimates obtained under the LINEX and squared error loss functions. To investigate the sensitivity of the estimates with respect to the choices of the hyper-parameters, the above-mentioned priors are considered. We perform simulations to investigate the behavior of the different methods for n = 30 and various r, k, T1 and T2.
1. Fix different censoring cases as given in (7): X_{r=10:n}, X_{r=15:n}, X_{k=20:n}, X_{k=25:n}, T1 = 18, T1 = 24, T2 = 30 and T2 = 36; next, generate the censored samples from the Dagum distribution using λ = 5, β = 2 and θ = 2.
2. Use each of the censoring cases in step (1) for calculating the MLEs by solving the system of nonlinear equations (3.3), (3.4) and (3.5).
3. Again, use each of the censoring cases in step (1) to calculate the Bayesian estimates by applying Tierney-Kadane's approximation of Section 4 under both the squared error and LINEX loss functions. The hyper-parameters are taken as the inverses of the initial values, namely ν1 = 0.2 and ν2 = ν3 = 0.5. The parameter ρ in the LINEX loss is chosen as -0.5, 1.0 and 1.5.
4. Steps (1)-(3) are repeated 1000 times, and the bias and estimated risk (ER) in each case are reported in Table 1. The ER of a parameter φ under the squared error and LINEX loss functions is computed by

ER_{SE} = (1/R) Σ_{i=1}^{R} (φ̂_i − φ)² and ER_{LE} = (1/R) Σ_{i=1}^{R} [e^{ρ(φ̂_i − φ)} − ρ(φ̂_i − φ) − 1],

where φ̂_i is the estimate of φ in the ith replication and R is the number of replications (a code sketch for these quantities is given after these steps).
5. The 90% and 95% approximate confidence, bootstrap and HPD credible intervals, with their widths, for the parameters λ, β and θ are reported in Tables 2, 3 and 4, respectively.
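The bias and ER of step 4 can be computed from the R replications as in the following sketch (reusing the notation above):

```python
import numpy as np

def bias_and_er(estimates, true_value, rho=1.0):
    """Bias, squared-error risk, and LINEX risk over R replications."""
    d = np.asarray(estimates) - true_value
    bias = d.mean()
    er_se = (d**2).mean()                           # (1/R) * sum (phi_hat - phi)^2
    er_le = (np.exp(rho * d) - rho * d - 1).mean()  # LINEX risk
    return bias, er_se, er_le
```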
From Tables 1, 2, 3 and 4, we see that:
1. The estimate of λ is overestimated except in a few cases. The Bayesian estimate of λ performs best in terms of bias under the LINEX loss function with ρ = 1.0, and the ER also supports the Bayesian estimate under the LINEX loss function.
2. Again, the estimate of β is overestimated except in a few cases. The Bayesian estimate of β behaves better in terms of bias under the LINEX loss function, and a similar statement holds for the ER.
3. Once again, the estimate of θ is underestimated in most cases. Also, the ER shows that the Bayesian estimate of θ under the LINEX loss function is the best.
4. The HPD credible interval for λ behaves better in terms of interval width under the LINEX loss function with ρ = −0.5.
5. The HPD credible interval for β behaves better in terms of interval width under the SE loss function.
6. The HPD credible interval for θ behaves better in terms of interval width under the LINEX loss function with ρ = 1.0.
6. Data analysis
Here we use a data set to compare the estimators presented in this paper. The data set, taken from Nichols and Padgett [29], consists of 100 observations on the breaking stress of carbon fibers (in GPa). Dey et al. [30] have fitted the Dagum distribution to this data set. The data are: 3.7, 2.74, 2.73, 3.11, 3.27, 2.87, 4.42, 2.41, 3.19, 3.28, 3.09, 1.87, 3.75, 2.43, 2.95, 2.96, 2.3, 2.67, 3.39, 2.81, 4.2, 3.31, 3.31, 2.85, 3.15, 2.35, 2.55, 2.81, 2.77, 2.17, 1.41, 3.68, 2.97, 2.76, 4.91, 3.68, 3.19, 1.57, 0.81, 1.59, 2, 1.22, 2.17, 1.17, 5.08, 3.51, 2.17, 1.69, 1.84, 0.39, 3.68, 1.61, 2.79, 4.7, 1.57, 1.08, 2.03, 1.89, 2.88, 2.82, 2.5, 3.6, 1.47, 3.11, 3.22, 1.69, 3.15, 4.9, 2.97, 3.39, 2.93, 3.22, 3.33, 2.55, 2.56, 3.56, 2.59, 2.38, 2.83, 1.92, 1.36, 0.98, 1.84, 1.59, 5.56, 1.73, 1.12, 1.71, 2.48, 1.18, 1.25, 4.38, 2.48, 0.85, 2.03, 1.8, 1.61, 2.12, 2.05, 3.65.
The point and interval estimation techniques of Sections 3, 4 and 5 can be applied to this data set through the steps below:
1. Sorting the data set in ascending order.
2. Applying the censoring scheme C-UHCS(m,r;T1,T2) using an arbitrary case of Type-Ⅱ censoring at X_{25:100} = 1.74 and an arbitrary case of Type-Ⅰ censoring at T = 2.4.
3. Applying the point estimation of the parameters λ, β and θ using the MLE, Tierney-Kadane and MCMC methods (MCMC is based on 15,000 repetitions with 5000 burn-in samples).
4. Calculating the 95% and 90% HPD credible intervals using MCMC based on squared error loss function.
5. The results of the point and interval estimation of the unknown parameters are displayed in Tables 7 and 8.
The underlined entries in Table 7 represent the best point estimates, with minimum variances. Also, the underlined entries in Table 8 represent the best interval estimates, with minimum interval widths.
7. Conclusion
In this paper, point and interval estimation of the parameters of the Dagum model under the combined-unified (C-U) hybrid censoring scheme is discussed from both classical and Bayesian perspectives. The MLEs and asymptotic CIs for the parameters of interest are computed. Since the Bayesian estimates of the involved parameters cannot be obtained analytically, Tierney and Kadane's approach has been employed to obtain approximate Bayes estimates. It is found that the performance of the Bayesian estimates based on the LINEX loss function is superior to that of the corresponding ML estimators, and similar improvements are observed for the Bayesian estimates evaluated under the other loss functions. Moreover, depending on the value of the asymmetry parameter ρ, the ER under the LINEX loss function may be smaller than that of the MLEs. The point estimates for the real data set show that the Tierney-Kadane approximation and MCMC are comparable in terms of estimated variances, as well as in interval estimation in terms of interval width.
Acknowledgments
The authors would like to thank the editor and referees for their helpful comments, which improved the presentation of the paper. The authors would also like to extend their sincere appreciation to the Deanship of Scientific Research, King Saud University for funding the Research Group (RG-1435-056).
Conflict of interest
The authors have no conflict of interest.