Research article

Estimation of the general population parameter in single- and two-phase sampling

  • Received: 06 February 2023 Revised: 01 April 2023 Accepted: 10 April 2023 Published: 21 April 2023
  • MSC : 62D05, 62F10, 62J05

  • Estimation of population characteristics has been an area of interest for many years. Various estimators of the population mean and the population variance have been proposed from time to time with a view to improving the efficiency of the estimates. In this paper, we propose some estimators of the general population parameter. The estimators are proposed for single-phase and two-phase sampling using information on single and multiple auxiliary variables. The bias and mean square errors of the proposed estimators are obtained. The proposed estimators are compared with some existing estimators of the mean and the variance, and some specific cases of the proposed estimators are discussed. Simulation and numerical studies have also been conducted to assess the performance of the proposed estimators.

    Citation: Saman Hanif Shahbaz, Aisha Fayomi, Muhammad Qaiser Shahbaz. Estimation of the general population parameter in single- and two-phase sampling[J]. AIMS Mathematics, 2023, 8(7): 14951-14977. doi: 10.3934/math.2023763




    Estimation of some population parameters, using a specific sampling design, has been an interesting area of research. The popular parameters which have been an area of interest, in simple random sampling, are the population mean and variance. The basic estimators of the population mean and variance in simple random sampling are the sample mean, $\bar{y}$, and the sample variance, $s^2$. In certain situations, the information of some auxiliary variables is also available and can be used to obtain more efficient estimators for some population parameters. Several authors have proposed some improved estimators of the population mean and the population variance by using the information of the auxiliary variables. The popular estimators of population mean, using information of auxiliary variables, are the ratio and regression estimators given by [1]. The ratio and regression estimators have attracted several authors, and different modifications have been proposed from time to time. A class of estimators of population mean by using information of some auxiliary variables has been proposed by [2]. Another class of regression and ratio-product type estimators has been proposed by [3], and it performs better than the classical ratio estimator. Several estimators of population mean in cases of single- and two-phase sampling have been proposed by [4]. A general class of estimators of the population mean in single- and two-phase sampling has been proposed by [5]. More details on estimators of population mean can be found in [6,7], among others.

    In recent years, the estimation of population variance has also attracted a lot of authors. Classical ratio and regression estimators of the population variance in single-phase sampling have been proposed by [8,9]. An improved ratio type estimator of the population variance has been proposed by [10]. Some ratio and regression type estimators of population variance in two-phase sampling have been proposed by [11]. The exponential type estimators have also attracted some authors in recent times, and [12] have proposed an exponential estimator of population variance. Some general classes of exponential estimators have been proposed by [13,14]. An estimator of coefficient of variation in single-phase sampling has been proposed by [15]. Some other notable works on variance estimation are [16-21], among others.

    Recently, [22] proposed an estimator of general population parameters in single-phase sampling. The estimator has been proposed by using information of a single auxiliary variable. The estimator provides a unified way to estimate the population mean, variance and coefficient of variation for specific values of the constants involved. In this paper, we have proposed some estimators of general population parameters in single- and two-phase sampling. The estimators have been proposed using information of a single and multiple auxiliary variables. The plan of the paper is as follows.

    Some methodology and notations are given in Section 2. The new estimators of general population parameters for single phase sampling are proposed in Section 3. The estimators have been proposed by using information of single and multiple auxiliary variables. The expressions for the bias and the mean square error (MSE) of the proposed estimators are obtained. In Section 4, estimators for the general population parameters are proposed for two-phase sampling alongside the expressions for the bias and MSE of the proposed two-phase sampling estimators. In Section 5, the comparison of the estimators of specific parameters is given. Some numerical study of the proposed estimator is given in Section 6. The numerical study comprises simulation study and applications using some real populations, and the conclusions and recommendations are given in Section 7.

    In this section, we have given some methodology and notations that will be used in this paper. Suppose that the units of a population are labeled as $U_1, U_2, \ldots, U_N$ while the values of some variable of interest are $Y_1, Y_2, \ldots, Y_N$. Suppose, further, that the estimation of some general population parameter

    $$t_{(a,b)} = \bar{Y}^{a}\left(S_y^{2}\right)^{b/2}$$

    is required, where

    $$\bar{Y} = N^{-1}\sum_{i=1}^{N} Y_i$$

    and

    $$S_y^{2} = (N-1)^{-1}\sum_{i=1}^{N}\left(Y_i - \bar{Y}\right)^{2}$$

    are, respectively, the population mean and variance of $Y$. It is to be noted that the general parameter $t_{(a,b)}$ reduces to the population mean for $a = 1$ and $b = 0$, to the population variance for $a = 0$ and $b = 2$, and to the coefficient of variation for $a = -1$ and $b = 1$. When information of some auxiliary variable is known, then the conventional regression estimator, using a sample of size $n$, is

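    $$\bar{y}_{lr} = \bar{y} + \beta\left(\bar{X} - \bar{x}\right), \tag{1}$$

    where $\beta = S_{xy}/S_x^{2}$ is the population regression coefficient between $X$ and $Y$, and

    $$\bar{X} = N^{-1}\sum_{i=1}^{N} X_i$$

    and

    $$\bar{x} = n^{-1}\sum_{i=1}^{n} x_i$$

    are the population and the sample mean of the auxiliary variable $X$. The mean square error of (1) is

    $$\mathrm{MSE}\left(\bar{y}_{lr}\right) = \theta S_y^{2}\left(1 - \rho_{yx}^{2}\right), \tag{2}$$

    where

    $$\theta = n^{-1} - N^{-1}$$

    and

    $$\rho_{yx} = S_{yx}\Big/\sqrt{S_x^{2} S_y^{2}}$$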

    is the population correlation coefficient between X and Y.
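    The definitions above can be illustrated with a minimal sketch. The function names are hypothetical and chosen only for this illustration; the coefficient of variation is taken as $t_{(-1,1)} = S_y/\bar{Y}$, and the regression coefficient $\beta$ is assumed known, as in (1).

```python
import statistics


def general_parameter(values, a, b):
    """t(a,b) = mean**a * (variance)**(b/2), using the (N - 1)-divisor variance."""
    mean = statistics.fmean(values)
    var = statistics.variance(values)  # divisor N - 1, matching S_y^2
    return mean ** a * var ** (b / 2)


def regression_estimator(y_sample, x_sample, x_pop_mean, beta):
    """Equation (1): sample mean of y adjusted by beta * (X-bar - x-bar)."""
    return statistics.fmean(y_sample) + beta * (x_pop_mean - statistics.fmean(x_sample))


def regression_mse(n, N, var_y, rho):
    """Equation (2): theta * S_y^2 * (1 - rho^2), with theta = 1/n - 1/N."""
    theta = 1 / n - 1 / N
    return theta * var_y * (1 - rho ** 2)
```

    For example, `general_parameter(ys, 1, 0)` returns the mean of `ys`, `general_parameter(ys, 0, 2)` its variance, and `general_parameter(ys, -1, 1)` its coefficient of variation.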

    In some situations, the population information of the auxiliary variable is not available, and in such situations the regression estimator (1) cannot be used. The problem can be solved by using a two-phase sampling technique. In two-phase sampling, a first-phase sample of size $n_1$ is drawn from a population of size $N$, and information of an auxiliary variable is recorded. A sub-sample of size $n_2 < n_1$ is drawn from the first-phase sample, and information of the auxiliary variable and the study variable is recorded. The conventional regression estimator, in two-phase sampling, is given as

    $$\bar{y}_{lr(2)} = \bar{y}_{(2)} + \beta\left[\bar{x}_{(1)} - \bar{x}_{(2)}\right], \tag{3}$$

    where

    $$\bar{y}_{(2)} = n_2^{-1}\sum_{i=1}^{n_2} y_i$$

    is the second-phase sample mean of study variable $Y$,

    $$\bar{x}_{(2)} = n_2^{-1}\sum_{i=1}^{n_2} x_i$$

    is the second-phase sample mean of auxiliary variable $X$, and

    $$\bar{x}_{(1)} = n_1^{-1}\sum_{i=1}^{n_1} x_i$$

    is the first-phase sample mean of auxiliary variable $X$. The MSE of the two-phase sampling regression estimator is

    $$\mathrm{MSE}\left(\bar{y}_{lr(2)}\right) = \bar{Y}^{2} C_y^{2}\left[\theta_2\left(1 - \rho_{yx}^{2}\right) + \theta_1 \rho_{yx}^{2}\right], \tag{4}$$

    where

    $$\theta_2 = n_2^{-1} - N^{-1},$$

    and

    $$\theta_1 = n_1^{-1} - N^{-1}.$$

    The regression estimator of population variance is given by [9] as

    $$s_{y(lr)}^{2} = s_y^{2} + \gamma\left(S_x^{2} - s_x^{2}\right), \tag{5}$$

    where $\gamma$ is a constant, $S_x^{2}$ and $s_x^{2}$ are, respectively, the population and the sample variances of the auxiliary variable, and $s_y^{2}$ is the sample variance of $Y$. The estimator for two-phase sampling can be easily written. Several modifications of the two-phase sampling regression estimator of mean are given in [6].
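    A minimal sketch of the two-phase regression estimator (3), its MSE (4), and the variance regression estimator (5) is given below. The function names are hypothetical, and $\beta$ and $\gamma$ are assumed to be supplied by the user.

```python
import statistics


def two_phase_regression(y2, x2, x1, beta):
    """Equation (3): second-phase mean of y adjusted by the first-phase
    minus the second-phase mean of x."""
    return statistics.fmean(y2) + beta * (statistics.fmean(x1) - statistics.fmean(x2))


def two_phase_regression_mse(n1, n2, N, mean_y, cv_y, rho):
    """Equation (4) with theta_k = 1/n_k - 1/N."""
    theta1 = 1 / n1 - 1 / N
    theta2 = 1 / n2 - 1 / N
    return mean_y ** 2 * cv_y ** 2 * (theta2 * (1 - rho ** 2) + theta1 * rho ** 2)


def variance_regression(y_sample, x_sample, var_x_pop, gamma):
    """Equation (5): sample variance of y adjusted by gamma * (S_x^2 - s_x^2)."""
    return statistics.variance(y_sample) + gamma * (var_x_pop - statistics.variance(x_sample))
```

    When the first phase covers the whole population ($n_1 = N$, so $\theta_1 = 0$), expression (4) coincides with the single-phase MSE (2) evaluated at $n = n_2$.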

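    The derivation of the bias and MSE of the estimators of the mean and the variance requires certain notations. In this paper, we will assume that the sample mean and the sample variance of the study and auxiliary variables are connected with the population mean and the population variance as

    $$\bar{y} = \bar{Y}\left(1 + \varepsilon_y\right),\quad \bar{x}_j = \bar{X}_j\left(1 + \varepsilon_{x_j}\right),\quad s_y^{2} = S_y^{2}\left(1 + e_y\right),$$

    and

    $$s_{x_j}^{2} = S_{x_j}^{2}\left(1 + e_{x_j}\right).$$

    The relation between sample estimates and the population parameters in case of two-phase sampling is

    $$\bar{y}_{(2)} = \bar{Y}\left(1 + \varepsilon_{y(2)}\right),\quad \bar{x}_{j(2)} = \bar{X}_j\left(1 + \varepsilon_{x_j(2)}\right),\quad s_{y(2)}^{2} = S_y^{2}\left(1 + e_{y(2)}\right),$$

    and

    $$s_{x_j(2)}^{2} = S_{x_j}^{2}\left(1 + e_{x_j(2)}\right).$$

    The expected values of the error terms $\varepsilon$ and $e$ are all zero. Some additional expectations for single- and two-phase sampling, and for a single auxiliary variable, are

    $$\left.\begin{aligned}
    &E(\varepsilon_y^{2}) = \theta C_y^{2},\quad E(\varepsilon_x^{2}) = \theta C_x^{2},\quad E(e_y^{2}) = \theta\varphi_{40},\quad E(e_x^{2}) = \theta\varphi_{04},\\
    &E(\varepsilon_y\varepsilon_x) = \theta\rho_{yx}C_xC_y,\quad E(\varepsilon_x e_x) = \theta\varphi_{03}C_x,\quad E(\varepsilon_y e_y) = \theta\varphi_{30}C_y,\\
    &E(e_y\varepsilon_x) = \theta\varphi_{21}C_x,\quad E(\varepsilon_y e_x) = \theta\varphi_{12}C_y,\quad E(e_y e_x) = \theta\varphi_{22}.
    \end{aligned}\right\} \tag{6}$$

    $$\left.\begin{aligned}
    &E(\varepsilon_{y(2)}^{2}) = \theta_2 C_y^{2},\quad E(e_{y(2)}^{2}) = \theta_2\varphi_{40},\quad E(\varepsilon_{y(2)}e_{y(2)}) = \theta_2\varphi_{30}C_y,\\
    &E\left[(\varepsilon_{x(2)} - \varepsilon_{x(1)})^{2}\right] = (\theta_2 - \theta_1)C_x^{2},\quad E\left[(e_{x(2)} - e_{x(1)})^{2}\right] = (\theta_2 - \theta_1)\varphi_{04},\\
    &E\left[\varepsilon_{y(2)}(\varepsilon_{x(2)} - \varepsilon_{x(1)})\right] = (\theta_2 - \theta_1)\rho_{yx}C_xC_y,\quad E\left[e_{y(2)}(e_{x(2)} - e_{x(1)})\right] = (\theta_2 - \theta_1)\varphi_{22},\\
    &E\left[\varepsilon_{y(2)}(e_{x(2)} - e_{x(1)})\right] = (\theta_2 - \theta_1)\varphi_{12}C_y,\quad E\left[e_{y(2)}(\varepsilon_{x(2)} - \varepsilon_{x(1)})\right] = (\theta_2 - \theta_1)\varphi_{21}C_x,\\
    &E\left[\varepsilon_{x(2)}(e_{x(2)} - e_{x(1)})\right] = (\theta_2 - \theta_1)\varphi_{03}C_x.
    \end{aligned}\right\} \tag{7}$$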

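    In case of multiple auxiliary variables, we will use the following results for single- and two-phase sampling:

    $$\left.\begin{aligned}
    &E(\varepsilon_y\boldsymbol{\varepsilon}_x) = \theta C_y\mathbf{R}\mathbf{c}_x,\quad E(\boldsymbol{\varepsilon}_x\boldsymbol{\varepsilon}_x') = \theta\mathbf{C}_x,\quad E(e_y\boldsymbol{\varepsilon}_x) = \theta\boldsymbol{\Phi}_{21}\mathbf{c}_x,\quad E(\boldsymbol{\varepsilon}_x\mathbf{e}_x') = \theta\boldsymbol{\Phi}_{012},\\
    &E(\varepsilon_y\mathbf{e}_x) = \theta C_y\boldsymbol{\varphi}_{12},\quad E(e_y\mathbf{e}_x) = \theta\boldsymbol{\varphi}_{22},\quad E(\mathbf{e}_x\mathbf{e}_x') = \theta\boldsymbol{\Phi}_x,\quad E(\mathbf{e}_{x(2)}\mathbf{e}_{x(2)}') = \theta_2\boldsymbol{\Phi}_x,\\
    &E\left[\boldsymbol{\varepsilon}_{x(2)}(\boldsymbol{\varepsilon}_{x(2)} - \boldsymbol{\varepsilon}_{x(1)})'\right] = (\theta_2 - \theta_1)\mathbf{C}_x,\quad E\left[\boldsymbol{\varepsilon}_{x(2)}(\mathbf{e}_{x(2)} - \mathbf{e}_{x(1)})'\right] = (\theta_2 - \theta_1)\boldsymbol{\Phi}_{012},\\
    &E\left[\varepsilon_{y(2)}(\boldsymbol{\varepsilon}_{x(2)} - \boldsymbol{\varepsilon}_{x(1)})\right] = (\theta_2 - \theta_1)C_y\mathbf{R}\mathbf{c}_x,\quad E\left[\varepsilon_{y(2)}(\mathbf{e}_{x(2)} - \mathbf{e}_{x(1)})\right] = (\theta_2 - \theta_1)C_y\boldsymbol{\varphi}_{12},\\
    &E\left[e_{y(2)}(\boldsymbol{\varepsilon}_{x(2)} - \boldsymbol{\varepsilon}_{x(1)})\right] = (\theta_2 - \theta_1)\boldsymbol{\Phi}_{21}\mathbf{c}_x,\quad E\left[e_{y(2)}(\mathbf{e}_{x(2)} - \mathbf{e}_{x(1)})\right] = (\theta_2 - \theta_1)\boldsymbol{\varphi}_{22},
    \end{aligned}\right\} \tag{8}$$

    where $\boldsymbol{\varepsilon}_x = [\varepsilon_{x_1} \cdots \varepsilon_{x_q}]'$, $\mathbf{e}_x = [e_{x_1} \cdots e_{x_q}]'$, $\mathbf{R} = \mathrm{diag}(\rho_{yx_j})$, $\boldsymbol{\Phi}_{21} = \mathrm{diag}(\varphi_{21_j})$, $\mathbf{C}_x = \mathrm{diag}(C_{x_j})$ and

    $$\mathbf{c}_x = \begin{bmatrix} C_{x_1}\\ C_{x_2}\\ \vdots\\ C_{x_q}\end{bmatrix},\quad
    \boldsymbol{\varphi}_{12} = \begin{bmatrix} \varphi_{12_1}\\ \varphi_{12_2}\\ \vdots\\ \varphi_{12_q}\end{bmatrix},\quad
    \boldsymbol{\varphi}_{22} = \begin{bmatrix} \varphi_{22_1}\\ \varphi_{22_2}\\ \vdots\\ \varphi_{22_q}\end{bmatrix},\quad
    \mathbf{C}_x = \begin{bmatrix}
    C_{x_1}^{2} & \rho_{12}C_{x_1}C_{x_2} & \cdots & \rho_{1q}C_{x_1}C_{x_q}\\
    \rho_{21}C_{x_2}C_{x_1} & C_{x_2}^{2} & \cdots & \rho_{2q}C_{x_2}C_{x_q}\\
    \vdots & \vdots & \ddots & \vdots\\
    \rho_{q1}C_{x_q}C_{x_1} & \rho_{q2}C_{x_q}C_{x_2} & \cdots & C_{x_q}^{2}
    \end{bmatrix},$$

    $$\boldsymbol{\Phi}_x = \begin{bmatrix}
    \varphi_{04_1} & \varphi_{02_12_2} & \cdots & \varphi_{02_12_q}\\
    \varphi_{02_22_1} & \varphi_{04_2} & \cdots & \varphi_{02_22_q}\\
    \vdots & \vdots & \ddots & \vdots\\
    \varphi_{02_q2_1} & \varphi_{02_q2_2} & \cdots & \varphi_{04_q}
    \end{bmatrix},$$

    and

    $$\boldsymbol{\Phi}_{012} = \begin{bmatrix}
    \varphi_{03_1}C_{x_1} & \varphi_{01_12_2}C_{x_1} & \cdots & \varphi_{01_12_q}C_{x_1}\\
    \varphi_{01_22_1}C_{x_2} & \varphi_{03_2}C_{x_2} & \cdots & \varphi_{01_22_q}C_{x_2}\\
    \vdots & \vdots & \ddots & \vdots\\
    \varphi_{01_q2_1}C_{x_q} & \varphi_{01_q2_2}C_{x_q} & \cdots & \varphi_{03_q}C_{x_q}
    \end{bmatrix}.$$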

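    Also,

    $$\varphi^{*}_{rs_j} = \left(\varphi_{rs_j} - 1\right),\quad \varphi_{rs_j} = \mu_{rs_j}\Big/\left(\mu_{20}^{r/2}\,\mu_{02_j}^{s/2}\right),$$

    $$\varphi^{*}_{0s_jt_h} = \left(\varphi_{0s_jt_h} - 1\right),\quad \varphi_{0s_jt_h} = \mu_{0s_jt_h}\Big/\left(\mu_{02_j0_h}^{s/2}\,\mu_{00_j2_h}^{t/2}\right),$$

    $$\mu_{rs_j} = (N-1)^{-1}\sum_{i=1}^{N}\left(y_i - \bar{Y}\right)^{r}\left(x_{ij} - \bar{X}_j\right)^{s},$$

    and

    $$\mu_{0s_jt_h} = (N-1)^{-1}\sum_{i=1}^{N}\left(x_{ij} - \bar{X}_j\right)^{s}\left(x_{ih} - \bar{X}_h\right)^{t}.$$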
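    The standardized moment ratios for a single auxiliary variable can be computed directly from a finite population. The sketch below is an illustration only: the function names are hypothetical, and it returns the plain ratio $\mu_{rs}/(\mu_{20}^{r/2}\mu_{02}^{s/2})$, so the shifted version defined above (the ratio minus one) has to be formed by the caller where it is needed.

```python
def central_moment(y, x, r, s):
    """mu_rs = (N - 1)^(-1) * sum (y_i - Y-bar)^r * (x_i - X-bar)^s."""
    N = len(y)
    y_bar = sum(y) / N
    x_bar = sum(x) / N
    total = sum((yi - y_bar) ** r * (xi - x_bar) ** s for yi, xi in zip(y, x))
    return total / (N - 1)


def moment_ratio(y, x, r, s):
    """phi_rs = mu_rs / (mu_20^(r/2) * mu_02^(s/2))."""
    mu20 = central_moment(y, x, 2, 0)
    mu02 = central_moment(y, x, 0, 2)
    return central_moment(y, x, r, s) / (mu20 ** (r / 2) * mu02 ** (s / 2))
```

    With this definition `moment_ratio(y, x, 1, 1)` is the correlation coefficient between the two variables, and `moment_ratio(y, x, 0, 4)` is the kurtosis-type ratio of the auxiliary variable.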

    We will, now, propose some new estimators for single-phase sampling.

    In this section, we have proposed some new estimators of the general population parameter for single-phase sampling. These estimators have been proposed using information of a single and several auxiliary variables. These estimators are proposed in the following.

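    In the following, we have proposed a new estimator of the general population parameter using information of a single auxiliary variable. The proposed estimator is

    $$t_1 = \bar{y}^{a}\left(s_y^{2}\right)^{b/2}\left[1 + \alpha\left(\bar{X} - \bar{x}\right) + \beta\left(S_x^{2} - s_x^{2}\right)\right]. \tag{9}$$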

    It is easy to see that the proposed estimator reduces to the classical estimator of mean for

    (a,b,α,β)=(1,0,0,0).

    Also, for

    (a,b,α,β)=(0,2,0,0),

    the estimator (9) reduces to the classical estimator of variance. For

    (a,b,α,β)=(1,0,α,β),

    the estimator (9) becomes a regression type estimator of the population mean, and for

    (a,b,α,β)=(0,2,α,β),

    we have a regression type estimator of the population variance. Further, for

    (a,b,α,β)=(−1,1,α,β),

    the estimator (9) becomes a regression type estimator of coefficient of variation. Now, to obtain the bias and MSE of (9), we write the estimator using the error notations as

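    $$t_1 = \left[\bar{Y}^{a}(1 + \varepsilon_y)^{a} S_y^{b}(1 + e_y)^{b/2}\right]\left(1 - \alpha\bar{X}\varepsilon_x - \beta S_x^{2}e_x\right) = t_{(a,b)}(1 + \varepsilon_y)^{a}(1 + e_y)^{b/2}\left(1 - \alpha\bar{X}\varepsilon_x - \beta S_x^{2}e_x\right).$$

    Expanding, and retaining only the linear terms of each power series, we have

    $$\begin{aligned}
    t_1 &= t_{(a,b)}(1 + a\varepsilon_y)\left(1 + \tfrac{b}{2}e_y\right)\left(1 - \alpha\bar{X}\varepsilon_x - \beta S_x^{2}e_x\right)\\
    &= t_{(a,b)}\Big[1 + a\varepsilon_y + \tfrac{b}{2}e_y + \tfrac{ab}{2}\varepsilon_y e_y - \alpha\bar{X}\varepsilon_x - a\alpha\bar{X}\varepsilon_y\varepsilon_x - \tfrac{\alpha b}{2}\bar{X}e_y\varepsilon_x - \beta S_x^{2}e_x - a\beta S_x^{2}\varepsilon_y e_x - \tfrac{b\beta}{2}S_x^{2}e_y e_x\Big],
    \end{aligned}$$

    or

    $$t_1 - t_{(a,b)} = t_{(a,b)}\Big[a\varepsilon_y + \tfrac{b}{2}e_y + \tfrac{ab}{2}\varepsilon_y e_y - \alpha\bar{X}\varepsilon_x - a\alpha\bar{X}\varepsilon_y\varepsilon_x - \tfrac{\alpha b}{2}\bar{X}e_y\varepsilon_x - \beta S_x^{2}e_x - a\beta S_x^{2}\varepsilon_y e_x - \tfrac{b\beta}{2}S_x^{2}e_y e_x\Big]. \tag{10}$$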

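    Applying expectation to (10), using (6) and simplifying, the bias of the proposed estimator (9) is

    $$\mathrm{Bias}(t_1) = \theta\, t_{(a,b)}\Big[aC_y\left(\tfrac{b}{2}\varphi_{30} - \alpha\bar{X}\rho_{yx}C_x - \beta S_x^{2}\varphi_{12}\right) - \tfrac{b}{2}\left(\alpha\bar{X}\varphi_{21}C_x + \beta S_x^{2}\varphi_{22}\right)\Big]. \tag{11}$$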

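    Again, squaring (10) and retaining only the quadratic terms, we have

    $$\left(t_1 - t_{(a,b)}\right)^{2} = t_{(a,b)}^{2}\Big[a^{2}\varepsilon_y^{2} + \tfrac{b^{2}}{4}e_y^{2} + \alpha^{2}\bar{X}^{2}\varepsilon_x^{2} + \beta^{2}S_x^{4}e_x^{2} + ab\,\varepsilon_y e_y - 2a\alpha\bar{X}\varepsilon_y\varepsilon_x - 2a\beta S_x^{2}\varepsilon_y e_x - \alpha b\bar{X}e_y\varepsilon_x - b\beta S_x^{2}e_y e_x + 2\alpha\beta\bar{X}S_x^{2}\varepsilon_x e_x\Big].$$

    Applying expectation and using (6), the MSE of (9) is

    $$\mathrm{MSE}(t_1) = \theta\, t_{(a,b)}^{2}\Big[a^{2}C_y^{2} + \alpha^{2}\bar{X}^{2}C_x^{2} + \tfrac{b^{2}}{4}\varphi_{40} + \beta^{2}S_x^{4}\varphi_{04} - \left(2a\beta S_x^{2}\varphi_{12} - ab\varphi_{30}\right)C_y - 2a\alpha\bar{X}\rho_{yx}C_xC_y - \left(\alpha b\bar{X}\varphi_{21} - 2\alpha\beta\bar{X}S_x^{2}\varphi_{03}\right)C_x - b\beta S_x^{2}\varphi_{22}\Big]. \tag{12}$$

    We will, now, obtain the optimum values of $\alpha$ and $\beta$ which minimize (12). For this, we differentiate (12) with respect to $\alpha$ and $\beta$, equate the derivatives to zero and solve the resulting equations simultaneously. Now, the derivatives of (12) with respect to $\alpha$ and $\beta$ are

    $$\frac{\partial}{\partial\alpha}\mathrm{MSE}(t_1) = \theta\, t_{(a,b)}^{2}\bar{X}C_x\left(2\bar{X}C_x\alpha + 2S_x^{2}\varphi_{03}\beta - 2a\rho_{yx}C_y - b\varphi_{21}\right),$$

    and

    $$\frac{\partial}{\partial\beta}\mathrm{MSE}(t_1) = \theta\, t_{(a,b)}^{2}S_x^{2}\left(2\bar{X}C_x\varphi_{03}\alpha + 2S_x^{2}\varphi_{04}\beta - 2aC_y\varphi_{12} - b\varphi_{22}\right).$$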

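    Equating the above derivatives to zero and simultaneously solving the resulting equations, the optimum values of $\alpha$ and $\beta$ which minimize (12) are

    $$\alpha = \frac{2aC_y\left(\rho_{yx}\varphi_{04} - \varphi_{03}\varphi_{12}\right) + b\left(\varphi_{04}\varphi_{21} - \varphi_{03}\varphi_{22}\right)}{2\bar{X}C_x\left(\varphi_{04} - \varphi_{03}^{2}\right)}, \tag{13}$$

    and

    $$\beta = \frac{2aC_y\left(\varphi_{12} - \rho_{yx}\varphi_{03}\right) + b\left(\varphi_{22} - \varphi_{03}\varphi_{21}\right)}{2S_x^{2}\left(\varphi_{04} - \varphi_{03}^{2}\right)}. \tag{14}$$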
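    The optimum values can be checked numerically. The following sketch, with hypothetical function names and a dictionary `p` of moment ratios introduced only for this illustration, evaluates (13) and (14) and the two derivatives of (12); the common positive factor $\theta t_{(a,b)}^{2}$ is omitted from the derivatives because it does not affect where they vanish.

```python
def optimum_alpha_beta(a, b, x_mean, var_x, cy, cx, rho, p):
    """Equations (13) and (14); p maps '03', '04', '12', '21', '22' to moment ratios."""
    det = p["04"] - p["03"] ** 2
    alpha = (2 * a * cy * (rho * p["04"] - p["03"] * p["12"])
             + b * (p["04"] * p["21"] - p["03"] * p["22"])) / (2 * x_mean * cx * det)
    beta = (2 * a * cy * (p["12"] - rho * p["03"])
            + b * (p["22"] - p["03"] * p["21"])) / (2 * var_x * det)
    return alpha, beta


def mse_gradient(alpha, beta, a, b, x_mean, var_x, cy, cx, rho, p):
    """Derivatives of (12) with respect to alpha and beta, without theta * t^2."""
    d_alpha = x_mean * cx * (2 * x_mean * cx * alpha + 2 * var_x * p["03"] * beta
                             - 2 * a * rho * cy - b * p["21"])
    d_beta = var_x * (2 * x_mean * cx * p["03"] * alpha + 2 * var_x * p["04"] * beta
                      - 2 * a * cy * p["12"] - b * p["22"])
    return d_alpha, d_beta
```

    Both derivatives evaluate to zero (up to rounding) at the values returned by `optimum_alpha_beta`, for any admissible inputs with $\varphi_{04} \neq \varphi_{03}^{2}$.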

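    Using these values in (12), the minimum MSE of the estimator given in (9) is

    $$\mathrm{MSE}_{\min}(t_1) = \frac{\theta\, t_{(a,b)}^{2}}{\left(\varphi_{04} - \varphi_{03}^{2}\right)}\left(a^{2}C_y^{2}f_1 + abC_yf_2 + \tfrac{b^{2}}{4}f_3\right), \tag{15}$$

    where

    $$f_1 = \varphi_{04}\left(1 - \rho_{yx}^{2}\right) - \varphi_{03}^{2} + 2\rho_{yx}\varphi_{03}\varphi_{12} - \varphi_{12}^{2},\quad
    f_2 = \varphi_{30}\left(\varphi_{04} - \varphi_{03}^{2}\right) - \rho_{yx}\left(\varphi_{04}\varphi_{21} - \varphi_{03}\varphi_{22}\right) - \varphi_{12}\left(\varphi_{22} - \varphi_{03}\varphi_{21}\right),$$

    and

    $$f_3 = \varphi_{40}\left(\varphi_{04} - \varphi_{03}^{2}\right) + \varphi_{21}\left(2\varphi_{03}\varphi_{22} - \varphi_{04}\varphi_{21}\right) - \varphi_{22}^{2}.$$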

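    The MSEs for specific cases of (9) are readily obtained. For example, if

    $$(a,b,\alpha,\beta) = (1,0,\alpha_{opt},\beta_{opt}),$$

    then the MSE of a regression type estimator of the population mean is obtained as

    $$\mathrm{MSE}_{\min}(t_1) = \theta\bar{Y}^{2}C_y^{2}f_1\left(\varphi_{04} - \varphi_{03}^{2}\right)^{-1}. \tag{16}$$

    Further, if

    $$(a,b,\alpha,\beta) = (0,2,\alpha_{opt},\beta_{opt}),$$

    the expression for the MSE of a regression type estimator of the variance is obtained as

    $$\mathrm{MSE}_{\min}(t_1) = \theta S_y^{4}f_3\left(\varphi_{04} - \varphi_{03}^{2}\right)^{-1}. \tag{17}$$

    Again, if

    $$(a,b,\alpha,\beta) = (-1,1,\alpha_{opt},\beta_{opt}),$$

    the MSE of a regression type estimator of the coefficient of variation is obtained as

    $$\mathrm{MSE}_{\min}(t_1) = \frac{\theta\, t_{(a,b)}^{2}}{\left(\varphi_{04} - \varphi_{03}^{2}\right)}\left(C_y^{2}f_1 - C_yf_2 + \tfrac{1}{4}f_3\right). \tag{18}$$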

    It is interesting to note that for

    (a,b,α,β)=(1,0,α,0),

    the optimum MSE of (9) reduces to the MSE of classical regression estimator given in (2). Also, for

    (a,b,α,β)=(0,2,0,β),

    the optimum MSE of (9) reduces to the MSE of the classical regression type estimator of variance as given by [9].

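    In this section, we will give an estimator of the general population parameter in single-phase sampling using the information of several auxiliary variables. The proposed estimator is

    $$t_2 = \bar{y}^{a}\left(s_y^{2}\right)^{b/2}\left[1 + \sum_{j=1}^{q}\alpha_j\left(\bar{X}_j - \bar{x}_j\right) + \sum_{j=1}^{q}\beta_j\left(S_{x_j}^{2} - s_{x_j}^{2}\right)\right]. \tag{19}$$

    Again, it is easy to see that the proposed estimator (19) provides certain estimators as special cases for different values of $(a,b,\alpha_j,\beta_j)$. Using error notations, the estimator (19) can be written as

    $$\begin{aligned}
    t_2 &= \left[\bar{Y}^{a}(1 + \varepsilon_y)^{a}\right]\left[S_y^{2}(1 + e_y)\right]^{b/2}\left[1 - \sum_{j=1}^{q}\alpha_j\bar{X}_j\varepsilon_{x_j} - \sum_{j=1}^{q}\beta_jS_{x_j}^{2}e_{x_j}\right]\\
    &= \bar{Y}^{a}\left(S_y^{2}\right)^{b/2}(1 + \varepsilon_y)^{a}(1 + e_y)^{b/2}\left(1 - \boldsymbol{\alpha}'\bar{\mathbf{X}}\boldsymbol{\varepsilon}_x - \boldsymbol{\beta}'\mathbf{S}_x\mathbf{e}_x\right),
    \end{aligned}$$

    where

    $$\boldsymbol{\alpha}' = [\alpha_1\ \alpha_2\ \cdots\ \alpha_q],\quad \boldsymbol{\beta}' = [\beta_1\ \beta_2\ \cdots\ \beta_q],\quad \bar{\mathbf{X}} = \mathrm{diag}(\bar{X}_j),\quad \mathbf{S}_x = \mathrm{diag}(S_{x_j}^{2}).$$

    Expanding, and retaining only the linear terms of each power series, we have

    $$t_2 = t_{(a,b)}\left(1 + a\varepsilon_y + \tfrac{b}{2}e_y + \tfrac{ab}{2}\varepsilon_y e_y\right)\left(1 - \boldsymbol{\alpha}'\bar{\mathbf{X}}\boldsymbol{\varepsilon}_x - \boldsymbol{\beta}'\mathbf{S}_x\mathbf{e}_x\right),$$

    or

    $$t_2 - t_{(a,b)} = t_{(a,b)}\Big[a\varepsilon_y + \tfrac{b}{2}e_y + \tfrac{ab}{2}\varepsilon_y e_y - \boldsymbol{\alpha}'\bar{\mathbf{X}}\boldsymbol{\varepsilon}_x - a\boldsymbol{\alpha}'\bar{\mathbf{X}}\varepsilon_y\boldsymbol{\varepsilon}_x - \tfrac{b}{2}\boldsymbol{\alpha}'\bar{\mathbf{X}}e_y\boldsymbol{\varepsilon}_x - \boldsymbol{\beta}'\mathbf{S}_x\mathbf{e}_x - a\boldsymbol{\beta}'\mathbf{S}_x\varepsilon_y\mathbf{e}_x - \tfrac{b}{2}\boldsymbol{\beta}'\mathbf{S}_xe_y\mathbf{e}_x\Big]. \tag{20}$$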

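    Taking expectation on both sides and using (8), the bias of the proposed estimator (19) is

    $$\mathrm{Bias}(t_2) = \theta\, t_{(a,b)}\Big[aC_y\left(\tfrac{b}{2}\varphi_{30} - \boldsymbol{\alpha}'\bar{\mathbf{X}}\mathbf{R}\mathbf{c}_x - \boldsymbol{\beta}'\mathbf{S}_x\boldsymbol{\varphi}_{12}\right) - \tfrac{b}{2}\boldsymbol{\alpha}'\bar{\mathbf{X}}\boldsymbol{\Phi}_{21}\mathbf{c}_x - \tfrac{b}{2}\boldsymbol{\beta}'\mathbf{S}_x\boldsymbol{\varphi}_{22}\Big]. \tag{21}$$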

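    Again, squaring (20) and retaining only the quadratic terms, we have

    $$\left(t_2 - t_{(a,b)}\right)^{2} = t_{(a,b)}^{2}\Big[a^{2}\varepsilon_y^{2} + \tfrac{b^{2}}{4}e_y^{2} + \boldsymbol{\alpha}'\bar{\mathbf{X}}\boldsymbol{\varepsilon}_x\boldsymbol{\varepsilon}_x'\bar{\mathbf{X}}\boldsymbol{\alpha} + \boldsymbol{\beta}'\mathbf{S}_x\mathbf{e}_x\mathbf{e}_x'\mathbf{S}_x\boldsymbol{\beta} + ab\,\varepsilon_y e_y - 2a\boldsymbol{\alpha}'\bar{\mathbf{X}}\varepsilon_y\boldsymbol{\varepsilon}_x - 2a\boldsymbol{\beta}'\mathbf{S}_x\varepsilon_y\mathbf{e}_x - b\boldsymbol{\alpha}'\bar{\mathbf{X}}e_y\boldsymbol{\varepsilon}_x - b\boldsymbol{\beta}'\mathbf{S}_xe_y\mathbf{e}_x + 2\boldsymbol{\alpha}'\bar{\mathbf{X}}\boldsymbol{\varepsilon}_x\mathbf{e}_x'\mathbf{S}_x\boldsymbol{\beta}\Big].$$

    Taking expectation of the above equation and using (8), the MSE of (19) is

    $$\mathrm{MSE}(t_2) = \theta\, t_{(a,b)}^{2}\Big[a^{2}C_y^{2} + \tfrac{b^{2}}{4}\varphi_{40} + \boldsymbol{\alpha}'\bar{\mathbf{X}}\mathbf{C}_x\bar{\mathbf{X}}\boldsymbol{\alpha} + \boldsymbol{\beta}'\mathbf{S}_x\boldsymbol{\Phi}_x\mathbf{S}_x\boldsymbol{\beta} + ab\varphi_{30}C_y - 2aC_y\boldsymbol{\alpha}'\bar{\mathbf{X}}\mathbf{R}\mathbf{c}_x - 2aC_y\boldsymbol{\beta}'\mathbf{S}_x\boldsymbol{\varphi}_{12} - b\boldsymbol{\alpha}'\bar{\mathbf{X}}\boldsymbol{\Phi}_{21}\mathbf{c}_x - b\boldsymbol{\beta}'\mathbf{S}_x\boldsymbol{\varphi}_{22} + 2\boldsymbol{\alpha}'\bar{\mathbf{X}}\boldsymbol{\Phi}_{012}\mathbf{S}_x\boldsymbol{\beta}\Big]. \tag{22}$$

    We will, now, obtain the optimum values of $\boldsymbol{\alpha}$ and $\boldsymbol{\beta}$ which minimize (22). For this, we will first differentiate (22) with respect to $\boldsymbol{\alpha}$ and $\boldsymbol{\beta}$. The derivatives are

    $$\frac{\partial}{\partial\boldsymbol{\alpha}}\mathrm{MSE}(t_2) = \theta\, t_{(a,b)}^{2}\left(2\bar{\mathbf{X}}\mathbf{C}_x\bar{\mathbf{X}}\boldsymbol{\alpha} - 2aC_y\bar{\mathbf{X}}\mathbf{R}\mathbf{c}_x - b\bar{\mathbf{X}}\boldsymbol{\Phi}_{21}\mathbf{c}_x + 2\bar{\mathbf{X}}\boldsymbol{\Phi}_{012}\mathbf{S}_x\boldsymbol{\beta}\right),$$

    and

    $$\frac{\partial}{\partial\boldsymbol{\beta}}\mathrm{MSE}(t_2) = \theta\, t_{(a,b)}^{2}\left(2\mathbf{S}_x\boldsymbol{\Phi}_x\mathbf{S}_x\boldsymbol{\beta} - 2aC_y\mathbf{S}_x\boldsymbol{\varphi}_{12} - b\mathbf{S}_x\boldsymbol{\varphi}_{22} + 2\mathbf{S}_x\boldsymbol{\Phi}_{012}'\bar{\mathbf{X}}\boldsymbol{\alpha}\right).$$

    Equating the derivatives to zero, the normal equations are

    $$\bar{\mathbf{X}}\mathbf{C}_x\bar{\mathbf{X}}\boldsymbol{\alpha} + \bar{\mathbf{X}}\boldsymbol{\Phi}_{012}\mathbf{S}_x\boldsymbol{\beta} = aC_y\bar{\mathbf{X}}\mathbf{R}\mathbf{c}_x + \tfrac{b}{2}\bar{\mathbf{X}}\boldsymbol{\Phi}_{21}\mathbf{c}_x,$$

    and

    $$\mathbf{S}_x\boldsymbol{\Phi}_{012}'\bar{\mathbf{X}}\boldsymbol{\alpha} + \mathbf{S}_x\boldsymbol{\Phi}_x\mathbf{S}_x\boldsymbol{\beta} = aC_y\mathbf{S}_x\boldsymbol{\varphi}_{12} + \tfrac{b}{2}\mathbf{S}_x\boldsymbol{\varphi}_{22}.$$

    Writing the above equations in matrix form, we have

    $$\begin{bmatrix}\bar{\mathbf{X}}\mathbf{C}_x\bar{\mathbf{X}} & \bar{\mathbf{X}}\boldsymbol{\Phi}_{012}\mathbf{S}_x\\ \mathbf{S}_x\boldsymbol{\Phi}_{012}'\bar{\mathbf{X}} & \mathbf{S}_x\boldsymbol{\Phi}_x\mathbf{S}_x\end{bmatrix}\begin{bmatrix}\boldsymbol{\alpha}\\ \boldsymbol{\beta}\end{bmatrix} = \begin{bmatrix}aC_y\bar{\mathbf{X}}\mathbf{R}\mathbf{c}_x + (b/2)\bar{\mathbf{X}}\boldsymbol{\Phi}_{21}\mathbf{c}_x\\ aC_y\mathbf{S}_x\boldsymbol{\varphi}_{12} + (b/2)\mathbf{S}_x\boldsymbol{\varphi}_{22}\end{bmatrix}.$$

    Solving the above matrix equations, the optimum values of $\boldsymbol{\alpha}$ and $\boldsymbol{\beta}$ are given as

    $$\begin{bmatrix}\boldsymbol{\alpha}_{opt}\\ \boldsymbol{\beta}_{opt}\end{bmatrix} = \begin{bmatrix}\bar{\mathbf{X}}\mathbf{C}_x\bar{\mathbf{X}} & \bar{\mathbf{X}}\boldsymbol{\Phi}_{012}\mathbf{S}_x\\ \mathbf{S}_x\boldsymbol{\Phi}_{012}'\bar{\mathbf{X}} & \mathbf{S}_x\boldsymbol{\Phi}_x\mathbf{S}_x\end{bmatrix}^{-1}\begin{bmatrix}\bar{\mathbf{X}}\{aC_y\mathbf{R} + (b/2)\boldsymbol{\Phi}_{21}\}\mathbf{c}_x\\ \mathbf{S}_x\{aC_y\boldsymbol{\varphi}_{12} + (b/2)\boldsymbol{\varphi}_{22}\}\end{bmatrix}. \tag{23}$$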

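    Now, we invert the above partitioned matrix as below. Let

    $$\boldsymbol{\Sigma} = \begin{bmatrix}\bar{\mathbf{X}}\mathbf{C}_x\bar{\mathbf{X}} & \bar{\mathbf{X}}\boldsymbol{\Phi}_{012}\mathbf{S}_x\\ \mathbf{S}_x\boldsymbol{\Phi}_{012}'\bar{\mathbf{X}} & \mathbf{S}_x\boldsymbol{\Phi}_x\mathbf{S}_x\end{bmatrix} = \begin{bmatrix}\mathbf{A}_{11} & \mathbf{A}_{12}\\ \mathbf{A}_{21} & \mathbf{A}_{22}\end{bmatrix},$$

    and then

    $$\boldsymbol{\Sigma}^{-1} = \begin{bmatrix}\mathbf{B}_{11} & \mathbf{B}_{12}\\ \mathbf{B}_{21} & \mathbf{B}_{22}\end{bmatrix},$$

    where

    $$\begin{aligned}
    \mathbf{B}_{11} &= \left(\mathbf{A}_{11} - \mathbf{A}_{12}\mathbf{A}_{22}^{-1}\mathbf{A}_{21}\right)^{-1} = \left[\bar{\mathbf{X}}\left\{\mathbf{C}_x - \boldsymbol{\Phi}_{012}\mathbf{S}_x\left(\mathbf{S}_x\boldsymbol{\Phi}_x\mathbf{S}_x\right)^{-1}\mathbf{S}_x\boldsymbol{\Phi}_{012}'\right\}\bar{\mathbf{X}}\right]^{-1} = \left[\bar{\mathbf{X}}\left(\mathbf{C}_x - \boldsymbol{\Phi}_{012}\boldsymbol{\Phi}_x^{-1}\boldsymbol{\Phi}_{012}'\right)\bar{\mathbf{X}}\right]^{-1},\\
    \mathbf{B}_{12} &= -\mathbf{B}_{11}\mathbf{A}_{12}\mathbf{A}_{22}^{-1} = -\mathbf{B}_{11}\left(\bar{\mathbf{X}}\boldsymbol{\Phi}_{012}\mathbf{S}_x\right)\left(\mathbf{S}_x\boldsymbol{\Phi}_x\mathbf{S}_x\right)^{-1} = -\bar{\mathbf{X}}^{-1}\left(\mathbf{C}_x - \boldsymbol{\Phi}_{012}\boldsymbol{\Phi}_x^{-1}\boldsymbol{\Phi}_{012}'\right)^{-1}\boldsymbol{\Phi}_{012}\boldsymbol{\Phi}_x^{-1}\mathbf{S}_x^{-1},\\
    \mathbf{B}_{21} &= -\mathbf{A}_{22}^{-1}\mathbf{A}_{21}\mathbf{B}_{11} = -\left(\mathbf{S}_x\boldsymbol{\Phi}_x\mathbf{S}_x\right)^{-1}\left(\mathbf{S}_x\boldsymbol{\Phi}_{012}'\bar{\mathbf{X}}\right)\mathbf{B}_{11} = -\mathbf{S}_x^{-1}\boldsymbol{\Phi}_x^{-1}\boldsymbol{\Phi}_{012}'\left(\mathbf{C}_x - \boldsymbol{\Phi}_{012}\boldsymbol{\Phi}_x^{-1}\boldsymbol{\Phi}_{012}'\right)^{-1}\bar{\mathbf{X}}^{-1},
    \end{aligned}$$

    and

    $$\begin{aligned}
    \mathbf{B}_{22} &= \mathbf{A}_{22}^{-1}\left(\mathbf{I} + \mathbf{A}_{21}\mathbf{B}_{11}\mathbf{A}_{12}\mathbf{A}_{22}^{-1}\right) = \left(\mathbf{S}_x\boldsymbol{\Phi}_x\mathbf{S}_x\right)^{-1}\left[\mathbf{I} + \left(\mathbf{S}_x\boldsymbol{\Phi}_{012}'\bar{\mathbf{X}}\right)\mathbf{B}_{11}\left(\bar{\mathbf{X}}\boldsymbol{\Phi}_{012}\mathbf{S}_x\right)\left(\mathbf{S}_x\boldsymbol{\Phi}_x\mathbf{S}_x\right)^{-1}\right]\\
    &= \left(\mathbf{S}_x\boldsymbol{\Phi}_x\mathbf{S}_x\right)^{-1} + \mathbf{S}_x^{-1}\boldsymbol{\Phi}_x^{-1}\boldsymbol{\Phi}_{012}'\left(\mathbf{C}_x - \boldsymbol{\Phi}_{012}\boldsymbol{\Phi}_x^{-1}\boldsymbol{\Phi}_{012}'\right)^{-1}\boldsymbol{\Phi}_{012}\boldsymbol{\Phi}_x^{-1}\mathbf{S}_x^{-1}.
    \end{aligned}$$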

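    Using the blocks of the inverted matrix in (23), the optimum values of $\boldsymbol{\alpha}$ and $\boldsymbol{\beta}$ are

    $$\boldsymbol{\alpha}_{opt} = \mathbf{B}_{11}\left[\bar{\mathbf{X}}\{aC_y\mathbf{R} + (b/2)\boldsymbol{\Phi}_{21}\}\mathbf{c}_x\right] + \mathbf{B}_{12}\left[\mathbf{S}_x\{aC_y\boldsymbol{\varphi}_{12} + (b/2)\boldsymbol{\varphi}_{22}\}\right] \tag{24}$$

    and

    $$\boldsymbol{\beta}_{opt} = \mathbf{B}_{21}\left[\bar{\mathbf{X}}\{aC_y\mathbf{R} + (b/2)\boldsymbol{\Phi}_{21}\}\mathbf{c}_x\right] + \mathbf{B}_{22}\left[\mathbf{S}_x\{aC_y\boldsymbol{\varphi}_{12} + (b/2)\boldsymbol{\varphi}_{22}\}\right]. \tag{25}$$

    Using these optimum values of $\boldsymbol{\alpha}$ and $\boldsymbol{\beta}$ in (22), the minimum MSE of (19) is

    $$\mathrm{MSE}_{\min}(t_2) = \theta\, t_{(a,b)}^{2}\Big[a^{2}C_y^{2} + ab\varphi_{30}C_y + \tfrac{b^{2}}{4}\varphi_{40} + \boldsymbol{\alpha}_{opt}'\bar{\mathbf{X}}\mathbf{C}_x\bar{\mathbf{X}}\boldsymbol{\alpha}_{opt} + \boldsymbol{\beta}_{opt}'\mathbf{S}_x\boldsymbol{\Phi}_x\mathbf{S}_x\boldsymbol{\beta}_{opt} - \boldsymbol{\alpha}_{opt}'\bar{\mathbf{X}}\left(2aC_y\mathbf{R} + b\boldsymbol{\Phi}_{21}\right)\mathbf{c}_x - \boldsymbol{\beta}_{opt}'\mathbf{S}_x\left(2aC_y\boldsymbol{\varphi}_{12} + b\boldsymbol{\varphi}_{22}\right) + 2\boldsymbol{\alpha}_{opt}'\bar{\mathbf{X}}\boldsymbol{\Phi}_{012}\mathbf{S}_x\boldsymbol{\beta}_{opt}\Big]. \tag{26}$$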
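    As a consistency check, which is a worked special case added here for illustration and not part of the original derivation, take $q = 1$. The matrices then collapse to scalars, with the matrix $\mathbf{C}_x$ of the quadratic form becoming $C_x^{2}$, $\boldsymbol{\Phi}_{012} = \varphi_{03}C_x$, $\boldsymbol{\Phi}_x = \varphi_{04}$, $\mathbf{R} = \rho_{yx}$ and $\mathbf{c}_x = C_x$, and the normal equations of the multiple auxiliary variable case reduce to those of the single auxiliary variable case:

```latex
\begin{aligned}
\bar{X}^{2}C_x^{2}\,\alpha + \bar{X}\varphi_{03}C_xS_x^{2}\,\beta
  &= \bar{X}C_x\left\{a\rho_{yx}C_y + \tfrac{b}{2}\varphi_{21}\right\}, \\
S_x^{2}\varphi_{03}C_x\bar{X}\,\alpha + S_x^{4}\varphi_{04}\,\beta
  &= S_x^{2}\left\{aC_y\varphi_{12} + \tfrac{b}{2}\varphi_{22}\right\}.
\end{aligned}
% Dividing the first equation by \bar{X} C_x and the second by S_x^2:
\begin{aligned}
\bar{X}C_x\,\alpha + \varphi_{03}S_x^{2}\,\beta &= a\rho_{yx}C_y + \tfrac{b}{2}\varphi_{21}, \\
\varphi_{03}\bar{X}C_x\,\alpha + \varphi_{04}S_x^{2}\,\beta &= aC_y\varphi_{12} + \tfrac{b}{2}\varphi_{22}.
\end{aligned}
```

    These are the two derivatives of the single auxiliary variable MSE set to zero, so for $q = 1$ the solution coincides with the optimum values (13) and (14).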

    It is interesting to note that, for

    $$(a,b,\boldsymbol{\alpha},\boldsymbol{\beta}) = (1,0,\boldsymbol{\alpha}_{opt},\mathbf{0}),$$

    the minimum MSE, given in (26), reduces to the minimum mean square error of the classical regression estimator of mean with several auxiliary variables; see [6]. Also, for

    $$(a,b,\boldsymbol{\alpha},\boldsymbol{\beta}) = (0,2,\mathbf{0},\boldsymbol{\beta}_{opt}),$$

    the minimum MSE, given in (26), reduces to the minimum MSE of a general estimator of variance given by [19].

    In this section, we have proposed some new estimators of the general population parameter for two-phase sampling. The estimators have been proposed using information of a single and several auxiliary variables.

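    In the following, we have proposed a new estimator of the general population parameter for two-phase sampling using information of a single auxiliary variable. The proposed estimator is

    $$t_{1(2)} = \bar{y}_{(2)}^{a}\left[s_{y(2)}^{2}\right]^{b/2}\left[1 + \alpha_{(2)}\left(\bar{x}_{(1)} - \bar{x}_{(2)}\right) + \beta_{(2)}\left(s_{x(1)}^{2} - s_{x(2)}^{2}\right)\right]. \tag{27}$$

    It is easy to see that the proposed estimator (27) reduces to the regression type estimator of mean in two-phase sampling for

    $$(a,b,\alpha_{(2)},\beta_{(2)}) = (1,0,\alpha_{(2)},0).$$

    The estimator (27) reduces to the regression type estimator of variance in two-phase sampling for

    $$(a,b,\alpha_{(2)},\beta_{(2)}) = (0,2,0,\beta_{(2)}).$$

    Now, to derive the bias and MSE of (27), we write the estimator (27), using error notations, as

    $$t_{1(2)} = \bar{Y}^{a}\left(1 + \varepsilon_{y(2)}\right)^{a}\left(S_y^{2}\right)^{b/2}\left(1 + e_{y(2)}\right)^{b/2}\left[1 + \alpha_{(2)}\bar{X}\left(\varepsilon_{x(1)} - \varepsilon_{x(2)}\right) + \beta_{(2)}S_x^{2}\left(e_{x(1)} - e_{x(2)}\right)\right].$$

    Now, expanding the power series and retaining only the linear terms, we have

    $$t_{1(2)} = t_{(a,b)}\left(1 + a\varepsilon_{y(2)} + \tfrac{b}{2}e_{y(2)} + \tfrac{ab}{2}\varepsilon_{y(2)}e_{y(2)}\right)\left[1 - \alpha_{(2)}\bar{X}\left(\varepsilon_{x(2)} - \varepsilon_{x(1)}\right) - \beta_{(2)}S_x^{2}\left(e_{x(2)} - e_{x(1)}\right)\right],$$

    or

    $$\begin{aligned}
    t_{1(2)} - t_{(a,b)} = t_{(a,b)}\Big[&a\varepsilon_{y(2)} + \tfrac{b}{2}e_{y(2)} + \tfrac{ab}{2}\varepsilon_{y(2)}e_{y(2)} - \alpha_{(2)}\bar{X}\left(\varepsilon_{x(2)} - \varepsilon_{x(1)}\right) - a\alpha_{(2)}\bar{X}\varepsilon_{y(2)}\left(\varepsilon_{x(2)} - \varepsilon_{x(1)}\right)\\
    &- \tfrac{b}{2}\alpha_{(2)}\bar{X}e_{y(2)}\left(\varepsilon_{x(2)} - \varepsilon_{x(1)}\right) - \beta_{(2)}S_x^{2}\left(e_{x(2)} - e_{x(1)}\right) - a\beta_{(2)}S_x^{2}\varepsilon_{y(2)}\left(e_{x(2)} - e_{x(1)}\right) - \tfrac{b}{2}\beta_{(2)}S_x^{2}e_{y(2)}\left(e_{x(2)} - e_{x(1)}\right)\Big].
    \end{aligned} \tag{28}$$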

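    Applying expectation on (28) and using (7), the bias of (27) is

    $$\mathrm{Bias}\left(t_{1(2)}\right) = t_{(a,b)}\Big[\theta_2\tfrac{ab}{2}\varphi_{30}C_y - (\theta_2 - \theta_1)aC_y\left\{\alpha_{(2)}\bar{X}\rho_{yx}C_x + \beta_{(2)}S_x^{2}\varphi_{12}\right\} - (\theta_2 - \theta_1)\tfrac{b}{2}\left\{\alpha_{(2)}\bar{X}\varphi_{21}C_x + \beta_{(2)}S_x^{2}\varphi_{22}\right\}\Big]. \tag{29}$$

    Again, squaring (28) and retaining only the terms whose powers add up to 2, we have

    $$\begin{aligned}
    \left(t_{1(2)} - t_{(a,b)}\right)^{2} = t_{(a,b)}^{2}\Big[&a^{2}\varepsilon_{y(2)}^{2} + \tfrac{b^{2}}{4}e_{y(2)}^{2} + \alpha_{(2)}^{2}\bar{X}^{2}\left(\varepsilon_{x(2)} - \varepsilon_{x(1)}\right)^{2} + \beta_{(2)}^{2}S_x^{4}\left(e_{x(2)} - e_{x(1)}\right)^{2} + ab\,\varepsilon_{y(2)}e_{y(2)}\\
    &- 2a\alpha_{(2)}\bar{X}\varepsilon_{y(2)}\left(\varepsilon_{x(2)} - \varepsilon_{x(1)}\right) - 2a\beta_{(2)}S_x^{2}\varepsilon_{y(2)}\left(e_{x(2)} - e_{x(1)}\right) - \alpha_{(2)}b\bar{X}e_{y(2)}\left(\varepsilon_{x(2)} - \varepsilon_{x(1)}\right)\\
    &- b\beta_{(2)}S_x^{2}e_{y(2)}\left(e_{x(2)} - e_{x(1)}\right) + 2\alpha_{(2)}\beta_{(2)}\bar{X}S_x^{2}\left(\varepsilon_{x(2)} - \varepsilon_{x(1)}\right)\left(e_{x(2)} - e_{x(1)}\right)\Big].
    \end{aligned}$$

    Applying expectation, and using (7), the mean square error of (27) is

    $$\begin{aligned}
    \mathrm{MSE}\left(t_{1(2)}\right) = t_{(a,b)}^{2}\Big[&\theta_2a^{2}C_y^{2} + \theta_2\tfrac{b^{2}}{4}\varphi_{40} + (\theta_2 - \theta_1)\alpha_{(2)}^{2}\bar{X}^{2}C_x^{2} + (\theta_2 - \theta_1)\beta_{(2)}^{2}S_x^{4}\varphi_{04} + \theta_2ab\varphi_{30}C_y\\
    &- 2(\theta_2 - \theta_1)a\alpha_{(2)}\bar{X}\rho_{yx}C_yC_x - 2(\theta_2 - \theta_1)a\beta_{(2)}S_x^{2}\varphi_{12}C_y - (\theta_2 - \theta_1)\alpha_{(2)}b\bar{X}\varphi_{21}C_x\\
    &- (\theta_2 - \theta_1)b\beta_{(2)}S_x^{2}\varphi_{22} + 2(\theta_2 - \theta_1)\alpha_{(2)}\beta_{(2)}\bar{X}S_x^{2}\varphi_{03}C_x\Big].
    \end{aligned} \tag{30}$$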

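    The optimum values of $\alpha_{(2)}$ and $\beta_{(2)}$ which minimize (30) are the same as given in (13) and (14). The minimum mean square error is obtained by using these optimum values in (30) and is

    $$\mathrm{MSE}_{\min}\left(t_{1(2)}\right) = \frac{t_{(a,b)}^{2}}{\left(\varphi_{04} - \varphi_{03}^{2}\right)}\left(a^{2}C_y^{2}f_{1(2)} + abC_yf_{2(2)} + \tfrac{b^{2}}{4}f_{3(2)}\right), \tag{31}$$

    where

    $$\begin{aligned}
    f_{1(2)} &= \theta_2\varphi_{04}\left(1 - \rho_{yx}^{2}\right) + \theta_1\rho_{yx}^{2}\varphi_{04} - \theta_2\varphi_{03}^{2} - (\theta_2 - \theta_1)\left(\varphi_{12}^{2} - 2\rho_{yx}\varphi_{03}\varphi_{12}\right),\\
    f_{2(2)} &= \theta_2\varphi_{30}\left(\varphi_{04} - \varphi_{03}^{2}\right) - (\theta_2 - \theta_1)\left[\rho_{yx}\left(\varphi_{04}\varphi_{21} - \varphi_{03}\varphi_{22}\right) + \varphi_{12}\left(\varphi_{22} - \varphi_{03}\varphi_{21}\right)\right],
    \end{aligned}$$

    and

    $$f_{3(2)} = \theta_2\varphi_{40}\left(\varphi_{04} - \varphi_{03}^{2}\right) - (\theta_2 - \theta_1)\left(\varphi_{04}\varphi_{21}^{2} - 2\varphi_{03}\varphi_{21}\varphi_{22} + \varphi_{22}^{2}\right).$$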

    It is to be noted that the minimum MSE, given in (31), reduces to (15) for $\theta_1 = 0$. Further, for

    $$(a,b,\alpha_{(2)},\beta_{(2)}) = (1,0,\alpha_{opt(2)},0),$$

    the minimum MSE, given in (31), reduces to the MSE of the two-phase sampling regression estimator of the population mean. Also, for

    $$(a,b,\alpha_{(2)},\beta_{(2)}) = (0,2,0,\beta_{opt(2)}),$$

    the minimum MSE, given in (31), reduces to the MSE of the two-phase sampling regression estimator of the population variance; see, for example, [19]. Further, for

    $$(a,b,\alpha_{(2)},\beta_{(2)}) = (-1,1,\alpha_{opt(2)},\beta_{opt(2)}),$$

    the minimum MSE, given in (31), reduces to the MSE of the two-phase sampling estimator of the coefficient of variation and is given as

    $$\mathrm{MSE}_{\min}\left(t_{1(2)}\right) = \frac{S_y^{2}\bar{Y}^{-2}}{\left(\varphi_{04} - \varphi_{03}^{2}\right)}\left(C_y^{2}f_{1(2)} - C_yf_{2(2)} + \tfrac{1}{4}f_{3(2)}\right). \tag{32}$$
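    The two-phase minimum MSE can be cross-checked numerically. The sketch below, with hypothetical function names and a dictionary `p` of moment ratios, evaluates the expression in terms of $f_{1(2)}$, $f_{2(2)}$ and $f_{3(2)}$ and compares it with the form obtained by substituting the optimum values directly, namely $t_{(a,b)}^{2}[\theta_2F - (\theta_2 - \theta_1)(\varphi_{04}P^{2} - 2\varphi_{03}PQ + Q^{2})/(\varphi_{04} - \varphi_{03}^{2})]$ with $F = a^{2}C_y^{2} + abC_y\varphi_{30} + b^{2}\varphi_{40}/4$, $P = a\rho_{yx}C_y + b\varphi_{21}/2$ and $Q = aC_y\varphi_{12} + b\varphi_{22}/2$. The symbols $F$, $P$ and $Q$ are shorthand introduced only for this check.

```python
def two_phase_min_mse(a, b, n1, n2, N, t, cy, rho, p):
    """Minimum MSE of the two-phase estimator written with f_1(2), f_2(2), f_3(2)."""
    th1, th2 = 1 / n1 - 1 / N, 1 / n2 - 1 / N
    d = th2 - th1
    det = p["04"] - p["03"] ** 2
    f1 = (th2 * p["04"] * (1 - rho ** 2) + th1 * rho ** 2 * p["04"] - th2 * p["03"] ** 2
          - d * (p["12"] ** 2 - 2 * rho * p["03"] * p["12"]))
    f2 = th2 * p["30"] * det - d * (rho * (p["04"] * p["21"] - p["03"] * p["22"])
                                    + p["12"] * (p["22"] - p["03"] * p["21"]))
    f3 = th2 * p["40"] * det - d * (p["04"] * p["21"] ** 2
                                    - 2 * p["03"] * p["21"] * p["22"] + p["22"] ** 2)
    return t ** 2 / det * (a ** 2 * cy ** 2 * f1 + a * b * cy * f2 + b ** 2 / 4 * f3)


def two_phase_min_mse_direct(a, b, n1, n2, N, t, cy, rho, p):
    """Same quantity via the quadratic form in P and Q (shorthand for this check)."""
    th1, th2 = 1 / n1 - 1 / N, 1 / n2 - 1 / N
    det = p["04"] - p["03"] ** 2
    F = a ** 2 * cy ** 2 + a * b * cy * p["30"] + b ** 2 / 4 * p["40"]
    P = a * rho * cy + b / 2 * p["21"]
    Q = a * cy * p["12"] + b / 2 * p["22"]
    quad = (p["04"] * P ** 2 - 2 * p["03"] * P * Q + Q ** 2) / det
    return t ** 2 * (th2 * F - (th2 - th1) * quad)
```

    Setting `n1 = N` gives $\theta_1 = 0$, in which case both functions return the single-phase minimum MSE with $\theta = \theta_2$; a larger first-phase sample never increases the value.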

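    We will, now, propose a new estimator of the general population parameter in two-phase sampling using information of several auxiliary variables.

    The proposed estimator of the general population parameter in two-phase sampling with multiple auxiliary variables is

    $$t_{2(2)} = \bar{y}_{(2)}^{a}\left(s_{y(2)}^{2}\right)^{b/2}\left[1 + \sum_{j=1}^{q}\alpha_{j(2)}\left(\bar{x}_{j(1)} - \bar{x}_{j(2)}\right) + \sum_{j=1}^{q}\beta_{j(2)}\left(s_{x_j(1)}^{2} - s_{x_j(2)}^{2}\right)\right]. \tag{33}$$

    The estimator (33) provides various estimators as special cases for specific choices of the parameters involved. For example, if

    $$(a,b,\alpha_{j(2)},\beta_{j(2)}) = (1,0,\alpha_{j(2)},0),$$

    then we have a regression type estimator of the population mean for two-phase sampling with multiple auxiliary variables. Again, if

    $$(a,b,\alpha_{j(2)},\beta_{j(2)}) = (0,2,0,\beta_{j(2)}),$$

    then we have a regression type estimator of the population variance in two-phase sampling with multiple auxiliary variables. Further, if

    $$(a,b,\alpha_{j(2)},\beta_{j(2)}) = (-1,1,\alpha_{j(2)},\beta_{j(2)}),$$

    then we have a two-phase sampling estimator of the coefficient of variation with multiple auxiliary variables. Now, to derive the bias and MSE of the proposed two-phase sampling estimator, we write it as

    $$t_{2(2)} = \bar{Y}^{a}\left(1 + \varepsilon_{y(2)}\right)^{a}S_y^{b}\left(1 + e_{y(2)}\right)^{b/2}\left[1 + \sum_{j=1}^{q}\alpha_{j(2)}\bar{X}_j\left(\varepsilon_{x_j(1)} - \varepsilon_{x_j(2)}\right) + \sum_{j=1}^{q}\beta_{j(2)}S_{x_j}^{2}\left(e_{x_j(1)} - e_{x_j(2)}\right)\right].$$

    Expanding the powers and retaining only the linear terms, we have

    $$\begin{aligned}
    t_{2(2)} &= \bar{Y}^{a}S_y^{b}\left(1 + a\varepsilon_{y(2)}\right)\left(1 + \tfrac{b}{2}e_{y(2)}\right)\left[1 + \boldsymbol{\alpha}_{(2)}'\bar{\mathbf{X}}\left(\boldsymbol{\varepsilon}_{x(1)} - \boldsymbol{\varepsilon}_{x(2)}\right) + \boldsymbol{\beta}_{(2)}'\mathbf{S}_x\left(\mathbf{e}_{x(1)} - \mathbf{e}_{x(2)}\right)\right]\\
    &= t_{(a,b)}\Big[1 + a\varepsilon_{y(2)} + \tfrac{b}{2}e_{y(2)} + \tfrac{ab}{2}\varepsilon_{y(2)}e_{y(2)} - \boldsymbol{\alpha}_{(2)}'\bar{\mathbf{X}}\left(\boldsymbol{\varepsilon}_{x(2)} - \boldsymbol{\varepsilon}_{x(1)}\right) - a\boldsymbol{\alpha}_{(2)}'\bar{\mathbf{X}}\varepsilon_{y(2)}\left(\boldsymbol{\varepsilon}_{x(2)} - \boldsymbol{\varepsilon}_{x(1)}\right)\\
    &\qquad - \tfrac{b}{2}\boldsymbol{\alpha}_{(2)}'\bar{\mathbf{X}}e_{y(2)}\left(\boldsymbol{\varepsilon}_{x(2)} - \boldsymbol{\varepsilon}_{x(1)}\right) - \boldsymbol{\beta}_{(2)}'\mathbf{S}_x\left(\mathbf{e}_{x(2)} - \mathbf{e}_{x(1)}\right) - a\boldsymbol{\beta}_{(2)}'\mathbf{S}_x\varepsilon_{y(2)}\left(\mathbf{e}_{x(2)} - \mathbf{e}_{x(1)}\right) - \tfrac{b}{2}\boldsymbol{\beta}_{(2)}'\mathbf{S}_xe_{y(2)}\left(\mathbf{e}_{x(2)} - \mathbf{e}_{x(1)}\right)\Big],
    \end{aligned}$$

    or

    $$\begin{aligned}
    t_{2(2)} - t_{(a,b)} = t_{(a,b)}\Big[&a\varepsilon_{y(2)} + \tfrac{b}{2}e_{y(2)} + \tfrac{ab}{2}\varepsilon_{y(2)}e_{y(2)} - \boldsymbol{\alpha}_{(2)}'\bar{\mathbf{X}}\left(\boldsymbol{\varepsilon}_{x(2)} - \boldsymbol{\varepsilon}_{x(1)}\right) - a\boldsymbol{\alpha}_{(2)}'\bar{\mathbf{X}}\varepsilon_{y(2)}\left(\boldsymbol{\varepsilon}_{x(2)} - \boldsymbol{\varepsilon}_{x(1)}\right)\\
    &- \tfrac{b}{2}\boldsymbol{\alpha}_{(2)}'\bar{\mathbf{X}}e_{y(2)}\left(\boldsymbol{\varepsilon}_{x(2)} - \boldsymbol{\varepsilon}_{x(1)}\right) - \boldsymbol{\beta}_{(2)}'\mathbf{S}_x\left(\mathbf{e}_{x(2)} - \mathbf{e}_{x(1)}\right) - a\boldsymbol{\beta}_{(2)}'\mathbf{S}_x\varepsilon_{y(2)}\left(\mathbf{e}_{x(2)} - \mathbf{e}_{x(1)}\right) - \tfrac{b}{2}\boldsymbol{\beta}_{(2)}'\mathbf{S}_xe_{y(2)}\left(\mathbf{e}_{x(2)} - \mathbf{e}_{x(1)}\right)\Big].
    \end{aligned} \tag{34}$$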

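    Applying expectations, and using (8), the bias of the proposed estimator is

    $$\mathrm{Bias}\left(t_{2(2)}\right) = t_{(a,b)}\Big[\theta_2\tfrac{ab}{2}C_y\varphi_{30} - (\theta_2 - \theta_1)\left\{aC_y\boldsymbol{\alpha}_{(2)}'\bar{\mathbf{X}}\mathbf{R}\mathbf{c}_x + \tfrac{b}{2}\boldsymbol{\alpha}_{(2)}'\bar{\mathbf{X}}\boldsymbol{\Phi}_{21}\mathbf{c}_x + aC_y\boldsymbol{\beta}_{(2)}'\mathbf{S}_x\boldsymbol{\varphi}_{12} + \tfrac{b}{2}\boldsymbol{\beta}_{(2)}'\mathbf{S}_x\boldsymbol{\varphi}_{22}\right\}\Big]. \tag{35}$$

    Again, squaring (34), applying expectation and using (8), the MSE of (33) is

    $$\begin{aligned}
    \mathrm{MSE}\left(t_{2(2)}\right) = t_{(a,b)}^{2}\Big[&a^{2}\theta_2C_y^{2} + \tfrac{b^{2}}{4}\theta_2\varphi_{40} + abC_y\theta_2\varphi_{30} + (\theta_2 - \theta_1)\Big\{\boldsymbol{\alpha}_{(2)}'\bar{\mathbf{X}}\mathbf{C}_x\bar{\mathbf{X}}\boldsymbol{\alpha}_{(2)} + \boldsymbol{\beta}_{(2)}'\mathbf{S}_x\boldsymbol{\Phi}_x\mathbf{S}_x\boldsymbol{\beta}_{(2)} - 2aC_y\boldsymbol{\alpha}_{(2)}'\bar{\mathbf{X}}\mathbf{R}\mathbf{c}_x\\
    &- 2aC_y\boldsymbol{\beta}_{(2)}'\mathbf{S}_x\boldsymbol{\varphi}_{12} - b\boldsymbol{\alpha}_{(2)}'\bar{\mathbf{X}}\boldsymbol{\Phi}_{21}\mathbf{c}_x - b\boldsymbol{\beta}_{(2)}'\mathbf{S}_x\boldsymbol{\varphi}_{22} + 2\boldsymbol{\alpha}_{(2)}'\bar{\mathbf{X}}\boldsymbol{\Phi}_{012}\mathbf{S}_x\boldsymbol{\beta}_{(2)}\Big\}\Big].
    \end{aligned} \tag{36}$$

    The optimum values of $\boldsymbol{\alpha}_{(2)}$ and $\boldsymbol{\beta}_{(2)}$ which minimize (36) are the same as given in (24) and (25). Using the optimum values of $\boldsymbol{\alpha}_{(2)}$ and $\boldsymbol{\beta}_{(2)}$ in (36), the minimum MSE is

    $$\begin{aligned}
    \mathrm{MSE}_{\min}\left(t_{2(2)}\right) = t_{(a,b)}^{2}\Big[&\theta_2a^{2}C_y^{2} + \theta_2\tfrac{b^{2}}{4}\varphi_{40} + \theta_2abC_y\varphi_{30} + (\theta_2 - \theta_1)\Big\{\boldsymbol{\alpha}_{(2)}'\bar{\mathbf{X}}\mathbf{C}_x\bar{\mathbf{X}}\boldsymbol{\alpha}_{(2)} + \boldsymbol{\beta}_{(2)}'\mathbf{S}_x\boldsymbol{\Phi}_x\mathbf{S}_x\boldsymbol{\beta}_{(2)}\\
    &- \boldsymbol{\alpha}_{(2)}'\bar{\mathbf{X}}\left(2aC_y\mathbf{R} + b\boldsymbol{\Phi}_{21}\right)\mathbf{c}_x - \boldsymbol{\beta}_{(2)}'\mathbf{S}_x\left(2aC_y\boldsymbol{\varphi}_{12} + b\boldsymbol{\varphi}_{22}\right) + 2\boldsymbol{\alpha}_{(2)}'\bar{\mathbf{X}}\boldsymbol{\Phi}_{012}\mathbf{S}_x\boldsymbol{\beta}_{(2)}\Big\}\Big].
    \end{aligned} \tag{37}$$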

    The mean square error of specific cases of (33) can be easily obtained from (37) by using the specific values of the parameters.

    In this section we have given some comparison of the proposed estimators with some existing estimators. The comparison will be given in case of a single auxiliary variable. The comparison for the multiple auxiliary variables case is analogous.

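    We will first give a comparison of the proposed estimators with the general estimator of population parameter suggested by [22]. The estimator is

    $$\hat{t}_{(a,b)} = \left[\bar{y}^{a}s_y^{b} + k\left(\bar{X} - \bar{x}\right)\right]\exp\left[\frac{w_1\left(\bar{X} - \bar{x}\right)}{\bar{X} + (\alpha - 1)\bar{x}}\right]\exp\left[\frac{w_2\left(S_x^{2} - s_x^{2}\right)}{S_x^{2} + (\beta - 1)s_x^{2}}\right], \tag{38}$$

    with mean square error

    $$\mathrm{MSE}\left(\hat{t}_{(a,b)}\right) = \theta\, t_{(a,b)}^{2}\Big[f_{1(a,b)} - \left(\varphi_{04} - \varphi_{03}^{2}\right)^{-1}\left\{f_{3(a,b)}^{2} - 2f_{2(a,b)}f_{3(a,b)}\varphi_{03} + \varphi_{04}f_{2(a,b)}^{2}\right\}\Big], \tag{39}$$

    where

    $$f_{1(a,b)} = a^{2}C_y^{2} + abC_y\varphi_{30} + \tfrac{b^{2}}{4}\varphi_{40},\quad f_{2(a,b)} = a\rho_{yx}C_y + \tfrac{b}{2}\varphi_{21},$$

    and

    $$f_{3(a,b)} = aC_y\varphi_{12} + \tfrac{b}{2}\varphi_{22}.$$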

    A close comparison of (39) with (15) indicates that the MSEs of the two estimators are equal. It is interesting to note that our proposed estimator (9) is much simpler in application than (38). We will, now, give a comparison of the estimators of specific population parameters.
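    The equality of the two MSE expressions can also be confirmed numerically. The sketch below uses hypothetical function names and a dictionary `p` of moment ratios; it takes the mixed term of $f_{1(a,b)}$ as $abC_y\varphi_{30}$, which is the reading under which the two expressions agree.

```python
def mse_min_proposed(a, b, theta, t, cy, rho, p):
    """Minimum MSE of the proposed estimator written with f1, f2 and f3."""
    det = p["04"] - p["03"] ** 2
    f1 = p["04"] * (1 - rho ** 2) - p["03"] ** 2 + 2 * rho * p["03"] * p["12"] - p["12"] ** 2
    f2 = (p["30"] * det - rho * (p["04"] * p["21"] - p["03"] * p["22"])
          - p["12"] * (p["22"] - p["03"] * p["21"]))
    f3 = p["40"] * det + p["21"] * (2 * p["03"] * p["22"] - p["04"] * p["21"]) - p["22"] ** 2
    return theta * t ** 2 / det * (a ** 2 * cy ** 2 * f1 + a * b * cy * f2 + b ** 2 / 4 * f3)


def mse_existing(a, b, theta, t, cy, rho, p):
    """MSE of the existing general estimator written with f1(a,b), f2(a,b), f3(a,b)."""
    det = p["04"] - p["03"] ** 2
    g1 = a ** 2 * cy ** 2 + a * b * cy * p["30"] + b ** 2 / 4 * p["40"]
    g2 = a * rho * cy + b / 2 * p["21"]
    g3 = a * cy * p["12"] + b / 2 * p["22"]
    return theta * t ** 2 * (g1 - (g3 ** 2 - 2 * g2 * g3 * p["03"] + p["04"] * g2 ** 2) / det)
```

    For any admissible inputs the two functions return the same value up to rounding, whichever of the mean, variance or coefficient of variation is selected through $(a,b)$.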

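    In the following, we will give a comparison of estimators for estimation of the mean. We know that the proposed estimator reduces to the estimator of mean for $(a,b,\alpha,\beta) = (1,0,\alpha_{opt},\beta_{opt})$ and is given as

    $$t_{1(M)} = \bar{y}\left[1 + \alpha\left(\bar{X} - \bar{x}\right) + \beta\left(S_x^{2} - s_x^{2}\right)\right]. \tag{40}$$

    The MSE of the above estimator is given in (16) and can also be written as

    $$\mathrm{MSE}_{\min}\left(t_{1(M)}\right) = \theta\bar{Y}^{2}C_y^{2}\left(\varphi_{04} - \varphi_{03}^{2}\right)^{-1}\left[\varphi_{04}\left(1 - \rho_{yx}^{2}\right) - \varphi_{03}^{2} + 2\rho_{yx}\varphi_{03}\varphi_{12} - \varphi_{12}^{2}\right],$$

    or

    $$\mathrm{MSE}_{\min}\left(t_{1(M)}\right) = \mathrm{Var}(\bar{y})\left[1 - \frac{\left(\varphi_{04}\rho_{yx}^{2} - 2\rho_{yx}\varphi_{03}\varphi_{12} + \varphi_{12}^{2}\right)}{\left(\varphi_{04} - \varphi_{03}^{2}\right)}\right], \tag{41}$$

    where $\mathrm{Var}(\bar{y}) = \theta\bar{Y}^{2}C_y^{2}$ is the variance of the mean per unit estimator. From above, we can see that the proposed estimator of the mean is always more efficient than the mean per unit estimator. Again, the MSE of the proposed estimator of the mean can be written as

    $$\mathrm{MSE}_{\min}\left(t_{1(M)}\right) = \mathrm{Var}\left(\bar{y}_{lr}\right)\left[1 - \frac{\left(\rho_{yx}\varphi_{03} - \varphi_{12}\right)^{2}}{\left(\varphi_{04} - \varphi_{03}^{2}\right)\left(1 - \rho_{yx}^{2}\right)}\right], \tag{42}$$

    where

    $$\mathrm{Var}\left(\bar{y}_{lr}\right) = \theta\bar{Y}^{2}C_y^{2}\left(1 - \rho_{yx}^{2}\right)$$

    is the variance of the classical regression estimator of the mean. It is clear that the proposed estimator will be more efficient than the classical regression estimator of the mean if $\rho_{yx} \neq \varphi_{03}^{-1}\varphi_{12}$. Since the MSE of the estimators of the mean proposed by [23] is the same as the MSE of the classical regression estimator, the proposed estimator of the mean, (40), is more efficient than the estimator proposed by [23] if $\rho_{yx} \neq \varphi_{03}^{-1}\varphi_{12}$.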

    Further, the estimators proposed by [24,25] are less efficient than the classical regression estimator; therefore, they are less efficient than the proposed estimator of the mean, given in (40).
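    The size of the gain over the classical regression estimator can be made concrete. The sketch below, with hypothetical function names, computes the fractional reduction $(\rho_{yx}\varphi_{03} - \varphi_{12})^{2}/\{(\varphi_{04} - \varphi_{03}^{2})(1 - \rho_{yx}^{2})\}$ appearing in (42) and the ratio $f_1/(\varphi_{04} - \varphi_{03}^{2})$ from (16), so that the two ways of writing the minimum MSE can be compared.

```python
def gain_over_regression(rho, phi03, phi04, phi12):
    """Fractional MSE reduction relative to the classical regression estimator, as in (42)."""
    return (rho * phi03 - phi12) ** 2 / ((phi04 - phi03 ** 2) * (1 - rho ** 2))


def f1_ratio(rho, phi03, phi04, phi12):
    """f1 / (phi04 - phi03^2): the MSE in (16) relative to Var(y-bar)."""
    det = phi04 - phi03 ** 2
    f1 = phi04 * (1 - rho ** 2) - phi03 ** 2 + 2 * rho * phi03 * phi12 - phi12 ** 2
    return f1 / det
```

    The gain is zero exactly when $\varphi_{12} = \rho_{yx}\varphi_{03}$ and positive otherwise whenever $\varphi_{04} > \varphi_{03}^{2}$, and `f1_ratio` equals $(1 - \rho_{yx}^{2})$ times one minus the gain.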

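    It is easy to see that the proposed estimator reduces to the regression type estimator of variance for

    $$(a,b,\alpha,\beta) = (0,2,\alpha_{opt},\beta_{opt})$$

    and is given as

    $$t_{1(V)} = s_y^{2}\left[1 + \alpha\left(\bar{X} - \bar{x}\right) + \beta\left(S_x^{2} - s_x^{2}\right)\right]. \tag{43}$$

    The MSE of the above estimator is given in (17) and can also be written as

    $$\mathrm{MSE}\left(t_{1(V)}\right) = \theta S_y^{4}\left[\varphi_{40} - \varphi_{21}^{2} - \left(\varphi_{22} - \varphi_{03}\varphi_{21}\right)^{2}\left(\varphi_{04} - \varphi_{03}^{2}\right)^{-1}\right], \tag{44}$$

    or

    $$\mathrm{MSE}\left(t_{1(V)}\right) = \mathrm{MSE}\left(s_y^{2}\right) - \theta S_y^{4}\left[\varphi_{21}^{2} + \left(\varphi_{22} - \varphi_{03}\varphi_{21}\right)^{2}\left(\varphi_{04} - \varphi_{03}^{2}\right)^{-1}\right],$$

    where

    $$\mathrm{MSE}\left(s_y^{2}\right) = \theta S_y^{4}\varphi_{40}$$

    is the MSE of the classical estimator of the variance. The expression of the MSE, (44), is the same as the expression of the MSE of the variance estimator proposed by [14], but the construction of our proposed estimator of the variance, (43), is much simpler as compared with the variance estimator given by [14]. Further, it is easy to show that our proposed estimator, (43), is more efficient than the classical estimator of variance, $s_y^{2}$, and the estimator proposed by [13].

    We will, now, compare our proposed estimator of variance with the estimators proposed by [18,19]. For this, we first see that the MSE of the estimators proposed by [18,19] is the same and is given as

    $$\mathrm{MSE}_{\min}\left(\hat{S}^{2}_{MHS1}\right) = \theta S_Y^{4}\left(\varphi_{40} - \varphi_{22}^{2}\varphi_{04}^{-1}\right).$$

    Now, our proposed estimator of variance will be more efficient than the estimators proposed by [18,19], if

    $$\left(\varphi_{22} - \varphi_{03}\varphi_{21}\right)^{2} > \left(\varphi_{04} - \varphi_{03}^{2}\right)\left(\varphi_{22}^{2}\varphi_{04}^{-1} - \varphi_{21}^{2}\right).$$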

    In this section, we have conducted numerical study of the specific cases of the proposed estimator of general population parameter. The numerical study has been conducted in two ways: simulation and study using real population. These numerical studies are given in the following sub-sections.

    In this section, the comparison of the proposed estimator is done with some existing estimators through simulation. The simulation has been done using some popular single- and two-phase sampling estimators of the mean and the variance. We have used two-phase versions of some of the estimators of mean and variance which are not available in the literature. The estimators used in the simulation, in addition to classical ratio and regression estimators of the mean, are given in Tables 1 and 2 below. The simulation algorithm for single-phase sampling is as below:

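    Table 1.  Estimators of the mean.
    Estimator  Single-Phase  Two-Phase
    Bahl and Tuteja [24]  $t_{BT}=\bar{y}\exp\left(\frac{\bar{X}-\bar{x}}{\bar{X}+\bar{x}}\right)$  $t_{BT(2)}=\bar{y}_{(2)}\exp\left[\frac{\bar{x}_{(1)}-\bar{x}_{(2)}}{\bar{x}_{(1)}+\bar{x}_{(2)}}\right]$
    Singh [26]  $t_{S}=\bar{y}\left(\frac{\bar{X}+S_x}{\bar{x}+S_x}\right)$  $t_{S(2)}=\bar{y}_{(2)}\left(\frac{\bar{x}_{(1)}+s_{x(1)}}{\bar{x}_{(2)}+s_{x(1)}}\right)$
    Kadilar and Cingi [25]  $t_{KC}=\left[\bar{y}+b(\bar{X}-\bar{x})\right]\frac{\bar{X}}{\bar{x}}$  $t_{KC(2)}=\left[\bar{y}_{(2)}+b(\bar{x}_{(1)}-\bar{x}_{(2)})\right]\frac{\bar{x}_{(1)}}{\bar{x}_{(2)}}$
    Adichwal et al. [22]  $t_{AR}=t_{(R)}\exp\left[\frac{\bar{X}-\bar{x}}{\bar{X}+(\alpha-1)\bar{x}}\right]\times\exp\left[\frac{S_x^2-s_x^2}{S_x^2+(\beta-1)s_x^2}\right]$  $t_{AR(2)}=t_{R(2)}\exp\left[\frac{\bar{x}_{(1)}-\bar{x}_{(2)}}{\bar{x}_{(1)}+(\alpha-1)\bar{x}_{(2)}}\right]\times\exp\left[\frac{s_{x(1)}^2-s_{x(2)}^2}{s_{x(1)}^2+(\beta-1)s_{x(2)}^2}\right]$
    Proposed  $t_{1(M)}=\bar{y}\left[1+\alpha(\bar{X}-\bar{x})+\beta(S_x^2-s_x^2)\right]$  $t_{1(M)(2)}=\bar{y}_{(2)}\left[1+\alpha(\bar{x}_{(1)}-\bar{x}_{(2)})+\beta(s_{x(1)}^2-s_{x(2)}^2)\right]$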

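    Table 2.  Estimators of the variance.
    Estimator  Single-Phase  Two-Phase
    Isaki [9]  $t_{C}=s_y^2\,\frac{S_x^2}{s_x^2}$  $t_{C(2)}=s_{y(2)}^2\,\frac{s_{x(1)}^2}{s_{x(2)}^2}$
    Isaki [9]  $t_{R}=s_y^2+\gamma(S_x^2-s_x^2)$  $t_{R(2)}=s_{y(2)}^2+\gamma(s_{x(1)}^2-s_{x(2)}^2)$
    Yadav and Kadilar [18]  $t_{YK}=s_y^2\exp\left[\frac{S_x^2-s_x^2}{S_x^2+(a-1)s_x^2}\right]$  $t_{YK(2)}=s_{y(2)}^2\exp\left[\frac{s_{x(1)}^2-s_{x(2)}^2}{s_{x(1)}^2+(a-1)s_{x(2)}^2}\right]$
    Al-Marshadi [19]  $t_{MHS}=s_y^2+\ln\left(\frac{s_x^2}{S_x^2}\right)^{\alpha}$  $t_{MHS(2)}=s_{y(2)}^2+\ln\left(\frac{s_{x(2)}^2}{s_{x(1)}^2}\right)^{\alpha}$
    Adichwal et al. [22]  $t_{AR}=\left[s_y^2+k(\bar{X}-\bar{x})\right]\exp\left[\frac{\bar{X}-\bar{x}}{\bar{X}+(\alpha-1)\bar{x}}\right]\times\exp\left[\frac{S_x^2-s_x^2}{S_x^2+(\beta-1)s_x^2}\right]$  $t_{AR(2)}=\left[s_{y(2)}^2+k(\bar{x}_{(1)}-\bar{x}_{(2)})\right]\exp\left[\frac{\bar{x}_{(1)}-\bar{x}_{(2)}}{\bar{x}_{(1)}+(\alpha-1)\bar{x}_{(2)}}\right]\times\exp\left[\frac{s_{x(1)}^2-s_{x(2)}^2}{s_{x(1)}^2+(\beta-1)s_{x(2)}^2}\right]$
    Proposed  $t_{1(V)}=s_y^2\left[1+\alpha(\bar{X}-\bar{x})+\beta(S_x^2-s_x^2)\right]$  $t_{1(V)(2)}=s_{y(2)}^2\left[1+\alpha(\bar{x}_{(1)}-\bar{x}_{(2)})+\beta(s_{x(1)}^2-s_{x(2)}^2)\right]$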


    1) Generate an artificial population of size 5000 from a bivariate normal distribution $N_2(60, 45, 5^2, 4^2, \rho)$, using different values of the correlation coefficient $\rho$.

    2) Generate random samples of sizes 50, 100, 200 and 500 from the generated population.

    3) Compute the different estimators by using the generated samples.

    4) Repeat steps 2 and 3 a total of 20000 times for each sample size.

    5) Compute the mean square error of each estimator of the mean and the variance at the different sample sizes by using

    $\mathrm{MSE}(t_i)=\frac{1}{20000}\sum_{j=1}^{20000}(t_{ij}-\bar{t}_i)^2;\quad \bar{t}_i=\frac{1}{20000}\sum_{j=1}^{20000}t_{ij};\quad i=C,R,BT,S,KC,AR,1(M);$
    $\mathrm{MSE}(t_k)=\frac{1}{20000}\sum_{j=1}^{20000}(t_{kj}-\bar{t}_k)^2;\quad \bar{t}_k=\frac{1}{20000}\sum_{j=1}^{20000}t_{kj};\quad k=C,R,YK,MHS,AR,1(V).$
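    The single-phase steps above can be sketched in code. The following is a minimal illustration, not the authors' program: it assumes NumPy, simple random sampling without replacement, and a covariance of $\rho\cdot 5\cdot 4$ between the two variables. Only the sample mean, the classical ratio estimator and the Bahl and Tuteja estimator are computed, because the constants of the other estimators are not stated in this section. The function name `simulate_single_phase` and its defaults are hypothetical, and the number of replications is reduced to keep the run short.

```python
import numpy as np


def simulate_single_phase(rho, n, reps=2000, pop_size=5000, seed=0):
    """Simulated MSE (as defined in step 5) of three estimators of the mean."""
    rng = np.random.default_rng(seed)
    # Step 1: bivariate normal population, means (60, 45), sds (5, 4).
    cov = [[25.0, rho * 20.0], [rho * 20.0, 16.0]]
    pop = rng.multivariate_normal([60.0, 45.0], cov, size=pop_size)
    y_pop, x_pop = pop[:, 0], pop[:, 1]
    x_bar_pop = x_pop.mean()  # population mean of X, treated as known

    est = {"mean": [], "ratio": [], "BT": []}
    for _ in range(reps):  # Step 4: repeat steps 2 and 3
        # Step 2: draw a sample of size n without replacement.
        idx = rng.choice(pop_size, size=n, replace=False)
        y_bar, x_bar = y_pop[idx].mean(), x_pop[idx].mean()
        # Step 3: compute the estimators from the sample.
        est["mean"].append(y_bar)
        est["ratio"].append(y_bar * x_bar_pop / x_bar)
        est["BT"].append(y_bar * np.exp((x_bar_pop - x_bar) / (x_bar_pop + x_bar)))

    # Step 5: average squared deviation from each estimator's simulated mean.
    return {k: float(np.mean((np.array(v) - np.mean(v)) ** 2)) for k, v in est.items()}


result = simulate_single_phase(rho=0.9, n=50, reps=500)
print(result)
```

    Note that the formula in step 5 measures spread around the simulated mean of each estimator, so the sketch follows that definition and does not use the true population mean.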

    In Tables 1 and 2 above, $\bar{y}_{(2)}$ and $s_{y(2)}^2$ are the second-phase mean and variance of the study variable. Similar notation holds for the auxiliary variable.

    The simulation algorithm for two-phase sampling is as below:

    1) Generate an artificial population of size 5000 from a bivariate normal distribution $N_2(60, 45, 5^2, 4^2, \rho)$, using different values of the correlation coefficient $\rho$.

    2) Generate first-phase random samples of sizes 500 and 1000 from the generated population.

    3) Generate second-phase random samples of sizes 5%, 10% and 20% of the first-phase sample.

    4) Compute the different estimators by using the second-phase sample mean of Y, the first- and second-phase sample means of the auxiliary variable X and some population measures of the auxiliary variable X.

    5) Repeat steps 2–4 a total of 20000 times for each combination of first- and second-phase sample sizes.

    6) Compute the bias and mean square error of each estimator at the different sample sizes, using the formulas given in step 5 of the single-phase algorithm above.
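    The two-phase steps differ from the single-phase case mainly in the nested sampling: the second-phase sample is drawn from the first-phase sample, and the first-phase mean of X replaces the population mean. A minimal sketch under the same assumptions as before (NumPy, sampling without replacement at both phases, covariance $\rho\cdot 5\cdot 4$) is given below for the two-phase ratio and Bahl and Tuteja estimators only. The name `simulate_two_phase` and the reduced number of replications are hypothetical choices.

```python
import numpy as np


def simulate_two_phase(rho, n1, frac, reps=2000, pop_size=5000, seed=0):
    """Simulated MSE of two two-phase estimators of the mean."""
    rng = np.random.default_rng(seed)
    # Step 1: same artificial population as in the single-phase case.
    cov = [[25.0, rho * 20.0], [rho * 20.0, 16.0]]
    pop = rng.multivariate_normal([60.0, 45.0], cov, size=pop_size)
    y_pop, x_pop = pop[:, 0], pop[:, 1]
    n2 = int(round(frac * n1))  # second-phase size as a fraction of n1

    est = {"ratio": [], "BT": []}
    for _ in range(reps):  # Step 5: repeat steps 2-4
        # Step 2: first-phase sample from the population.
        first = rng.choice(pop_size, size=n1, replace=False)
        # Step 3: second-phase sample drawn from the first-phase units.
        second = rng.choice(first, size=n2, replace=False)
        # Step 4: estimators use x-bar(1), x-bar(2) and y-bar(2).
        x1, x2 = x_pop[first].mean(), x_pop[second].mean()
        y2 = y_pop[second].mean()
        est["ratio"].append(y2 * x1 / x2)
        est["BT"].append(y2 * np.exp((x1 - x2) / (x1 + x2)))

    # Step 6: same spread measure as step 5 of the single-phase algorithm.
    mse = {k: float(np.mean((np.array(v) - np.mean(v)) ** 2)) for k, v in est.items()}
    return n2, mse


n2, two_phase_result = simulate_two_phase(rho=0.9, n1=500, frac=0.05, reps=300)
print(n2, two_phase_result)
```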

    The results of the simulation study are given in Tables 3–6 below.

    Table 3.  Mean square error of estimators of mean in single-phase sampling.
    $\rho_{xy}$ $n$ $t_{(C)}$ $t_{(R)}$ $t_{(BT)}$ $t_{(S)}$ $t_{(KC)}$ $t_{(AAR)}$ $t_{1(M)}$
    –0.9 50 1.0427 0.6061 0.6287 0.9567 1.0519 0.7035 0.4151
    100 0.5201 0.2969 0.3133 0.4771 0.5227 0.3304 0.2006
    200 0.2571 0.1479 0.1566 0.2363 0.2575 0.1605 0.0991
    500 0.0959 0.0535 0.0575 0.0879 0.0960 0.0592 0.0357
    –0.5 50 1.0844 0.6233 0.6561 0.9966 1.0931 0.7246 0.4283
    100 0.5172 0.3000 0.3141 0.4753 0.5198 0.3395 0.2030
    200 0.2526 0.1442 0.1526 0.2319 0.2533 0.1587 0.0966
    500 0.0945 0.0541 0.0572 0.0868 0.0945 0.0607 0.0361
    0.5 50 1.0713 0.6151 0.6479 0.9843 1.0826 0.7256 0.4232
    100 0.5140 0.2954 0.3116 0.4724 0.5157 0.3342 0.1997
    200 0.2577 0.1475 0.1566 0.2369 0.2581 0.1619 0.0989
    500 0.0950 0.0541 0.0578 0.0873 0.0949 0.0598 0.0361
    0.9 50 1.0681 0.6224 0.6477 0.9812 1.0797 0.7154 0.4279
    100 0.5247 0.2952 0.3150 0.4814 0.5272 0.3317 0.1990
    200 0.2560 0.1435 0.1534 0.2347 0.2560 0.1600 0.0962
    500 0.0972 0.0540 0.0583 0.0892 0.0973 0.0599 0.0360

    Table 4.  Mean square error of estimators of variance in single-phase sampling.
    $\rho_{xy}$ $n$ $t_{C}$ $t_{R}$ $t_{YK}$ $t_{MHS}$ $t_{AAR}$ $t_{1(V)}$
    –0.9 50 61.6715 26.9648 26.9308 26.8381 27.7020 22.0301
    100 27.7252 13.1662 13.1631 13.1524 13.3192 10.6404
    200 12.8528 6.3378 6.3375 6.3360 6.3789 5.1016
    500 4.6520 2.3183 2.3183 2.3183 2.3230 1.8582
    –0.5 50 59.5725 27.2676 27.2198 27.1438 27.9374 22.2485
    100 27.4137 13.0218 13.0167 13.0075 13.1495 10.5085
    200 12.8765 6.2683 6.2680 6.2685 6.2991 5.0372
    500 4.6790 2.3569 2.3569 2.3568 2.3614 1.8891
    0.5 50 58.6802 26.2101 26.1634 26.0787 26.9010 21.4129
    100 26.9036 12.7161 12.7129 12.7077 12.8867 10.2948
    200 12.6693 6.0603 6.0600 6.0589 6.0901 4.8706
    500 4.5555 2.2384 2.2384 2.2383 2.2453 1.7961
    0.9 50 56.5997 24.9762 24.9441 24.8933 25.6947 20.4355
    100 26.0339 12.1086 12.1053 12.1030 12.2614 9.7939
    200 12.4012 5.8972 5.8967 5.8956 5.9264 4.7395
    500 4.5886 2.2476 2.2476 2.2476 2.2505 1.8003

    Table 5.  Mean square error of estimators of mean in two-phase sampling.
    $\rho_{yx}$ $n_1$ $n_2$ $t_{(C)}$ $t_{(R)}$ $t_{(BT)}$ $t_{(S)}$ $t_{(KC)}$ $t_{(AAR)}$ $t_{1(M)}$
    –0.9 500 25 2.0890 1.0373 1.9112 1.9194 2.1234 1.1317 0.8782
    50 0.9999 0.5051 0.9329 0.9213 1.0102 0.5195 0.4134
    100 0.4722 0.2473 0.4534 0.4369 0.4746 0.2507 0.2003
    1000 50 1.0267 0.5036 0.9417 0.9438 1.0365 0.5205 0.4134
    100 0.5100 0.2496 0.4710 0.4694 0.5129 0.2521 0.2015
    200 0.2321 0.1216 0.2230 0.2148 0.2328 0.1221 0.0977
    –0.5 500 25 2.0671 1.0531 1.9128 1.9030 2.1063 1.1463 0.8937
    50 1.0167 0.5058 0.9433 0.9365 1.0255 0.5211 0.4152
    100 0.4712 0.2495 0.4559 0.4366 0.4739 0.2525 0.2018
    1000 50 1.0273 0.4998 0.9395 0.9445 1.0371 0.5154 0.4099
    100 0.4932 0.2450 0.4587 0.4545 0.4954 0.2479 0.1982
    200 0.2315 0.1211 0.2222 0.2142 0.2317 0.1217 0.0973
    0.5 500 25 2.0570 1.0334 1.8796 1.8898 2.0893 1.1305 0.8792
    50 1.0152 0.5163 0.9491 0.9359 1.0239 0.5310 0.4228
    100 0.4744 0.2517 0.4595 0.4397 0.4760 0.2542 0.2031
    1000 50 1.0414 0.5136 0.9596 0.9584 1.0559 0.5271 0.4196
    100 0.4927 0.2492 0.4619 0.4544 0.4948 0.2527 0.2017
    200 0.2330 0.1212 0.2237 0.2157 0.2334 0.1217 0.0973
    0.9 500 25 2.0973 1.0670 1.9361 1.9292 2.1516 1.1602 0.9049
    50 1.0017 0.5139 0.9407 0.9236 1.0123 0.5300 0.4219
    100 0.4711 0.2470 0.4530 0.4360 0.4733 0.2488 0.1989
    1000 50 1.0307 0.5008 0.9377 0.9464 1.0417 0.5150 0.4101
    100 0.4979 0.2491 0.4637 0.4586 0.5004 0.2520 0.2014
    200 0.2337 0.1219 0.2236 0.2160 0.2346 0.1224 0.0979

    Table 6.  Mean square error of estimators of variance in two-phase sampling.
    $\rho_{yx}$ $n_1$ $n_2$ $t_{C}$ $t_{R}$ $t_{YK}$ $t_{MHS}$ $t_{AAR}$ $t_{1(V)}$
    –0.9 500 25 140.5358 57.3500 56.9743 56.3199 61.4633 48.0799
    50 57.6091 26.9982 26.9350 26.8751 27.7169 22.0829
    100 24.3699 13.1517 13.1486 13.1413 13.2844 10.6166
    1000 50 58.5709 26.8444 26.7882 26.6943 27.5534 21.9302
    100 26.1174 12.8386 12.8346 12.8230 12.9687 10.3608
    200 11.7293 6.3042 6.3039 6.3032 6.3292 5.0615
    –0.5 500 25 138.3087 57.6001 57.1780 56.5402 61.8000 48.1299
    50 57.1008 26.5565 26.4872 26.4285 27.1874 21.6863
    100 24.3847 12.9400 12.9366 12.9321 13.0727 10.4453
    1000 50 59.4134 27.1282 27.0971 27.0369 27.9262 22.2336
    100 26.4397 13.0561 13.0523 13.0450 13.2347 10.5728
    200 11.5907 6.2638 6.2633 6.2621 6.2878 5.0286
    0.5 500 25 136.2103 54.8615 54.5167 53.8812 58.8619 45.7701
    50 56.2874 26.2092 26.1650 26.1164 26.8153 21.3371
    100 23.7153 12.4007 12.3988 12.3934 12.4807 9.9721
    1000 50 56.8057 25.9255 25.8763 25.7851 26.5693 21.1616
    100 25.5146 12.5213 12.5186 12.5121 12.6728 10.1253
    200 11.3721 6.0205 6.0202 6.0187 6.0462 4.8353
    0.9 500 25 129.8467 54.2527 53.9338 53.3785 57.7821 45.2401
    50 54.9371 25.4004 25.3675 25.3054 26.0071 20.6980
    100 23.6268 12.3209 12.3179 12.3106 12.4336 9.9337
    1000 50 57.6789 25.1016 25.0690 25.0215 25.8700 20.5610
    100 24.9836 12.3588 12.3558 12.3502 12.4728 9.9643
    200 11.3057 5.8657 5.8654 5.8639 5.8861 4.7076


    The above tables show that our proposed estimators of the mean and the variance have a smaller mean square error than the competing estimators in every row of Tables 3–6. The tables also show that the mean square error of every estimator decreases as the sample size increases.

    The relative efficiencies of the various estimators of the mean and the variance, relative to the ratio estimators of the mean and the variance, are plotted in Figures 1 and 2 below. The graphs also show that our proposed estimators of the mean and the variance are the most efficient among the competing estimators. The figures further show that the estimator proposed by [25] is the worst estimator of the population mean; it is even worse than the ratio estimator. The estimator of the mean derived by [22] is better than some of the estimators used in the study, but it still performs worse than the classical regression estimator of the mean and the estimator proposed by [24]. Similar conclusions can be drawn for the estimators of the variance, where our derived estimator outperforms all other estimators used in the study. The relative efficiencies of the estimators of the variance also show that all of the estimators used in the study perform better than the classical ratio estimator of variance proposed by [9].
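    The relative efficiencies plotted in the figures can be computed directly from the simulated mean square errors. The sketch below assumes that relative efficiency is defined as the MSE of the reference (ratio) estimator divided by the MSE of the estimator in question, so that values above 1 indicate an improvement; this definition and the function name `relative_efficiency` are assumptions. The input values are the row $\rho_{xy}=-0.9$, $n=50$ of Table 3, with $t_{(C)}$ taken as the ratio estimator.

```python
def relative_efficiency(mse_by_estimator, reference):
    """Return MSE(reference) / MSE(estimator) for every estimator."""
    ref = mse_by_estimator[reference]
    return {name: ref / mse for name, mse in mse_by_estimator.items()}


# Row rho_xy = -0.9, n = 50 of Table 3 (single-phase estimators of the mean).
row = {
    "t_C": 1.0427,
    "t_R": 0.6061,
    "t_BT": 0.6287,
    "t_S": 0.9567,
    "t_KC": 1.0519,
    "t_AAR": 0.7035,
    "t_1(M)": 0.4151,
}

re = relative_efficiency(row, reference="t_C")
for name, value in sorted(re.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {value:.3f}")
```

    For this row the proposed estimator has the largest ratio, and the estimator of [25] has a ratio below 1, which matches the description of the figures above.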

    Figure 1.  Relative efficiency of various estimators of mean.
    Figure 2.  Relative efficiency of various estimators of variance.

    In this sub-section, we present an empirical study of some popular estimators of the mean and the variance using real populations. Five populations are used: the first three are taken from [27], and the last two are taken from [28]. Summary measures of the populations are given in Table 7 below.

    Table 7.  Summary measures of the populations.
    Measures Pop-Ⅰ Pop-Ⅱ Pop-Ⅲ Pop-Ⅳ Pop-Ⅴ
    $N$ 17 58 32 23 110
    $\bar{Y}$ 202.9529 13.1879 55.9062 61.3478 6.8317
    $\bar{X}$ 25.0588 31.8207 4.4222 39.6087 27.4273
    $S_y^2$ 33.1739 2.4702 247.5071 279.3281 5.2488
    $S_x^2$ 9.1211 24.4701 2.1090 71.7036 278.3754
    $\rho_{yx}$ 0.9972 0.5557 0.7815 -0.7737 -0.0645
    $\varphi_{40}$ 0.9469 2.0227 2.2414 1.4656 1.6078
    $\varphi_{04}$ 0.9062 1.7776 1.8657 0.9113 0.7985
    $\varphi_{22}$ 0.9199 0.2282 1.2776 0.5534 -0.0469
    $\varphi_{03}$ 0.3713 0.4208 0.9532 0.2330 0.0280
    $\varphi_{21}$ 0.3175 0.0146 0.5743 -0.1069 0.1094
    $\varphi_{12}$ 0.3438 -0.0931 0.6016 0.0190 0.0139


    The empirical study has been conducted by using a 25% sample from each of the populations. We have used six estimators of the mean and five estimators of the variance. The estimators of the mean are those given in Table 1 above, excluding the estimator by [22], because its mean square error is the same as that of our proposed estimator. The estimators of the variance are the classical ratio and regression estimators by [9], the estimator by [12], the estimator by [13] and our derived estimator of variance, given in Table 2 above. The mean square error of each estimator is computed for each population. The results are given in Tables 8 and 9 below.

    Table 8.  Mean square error of selected estimators of mean.
    Estimator Population-1 Population-2 Population-3 Population-4 Population-5
    $t_{C}$ 67.0059 0.1675 12.4789 123.7334 0.6637
    $t_{R}$ 0.0353 0.0925 9.0312 17.5502 0.1461
    $t_{BT}$ 8.0789 0.0938 9.9412 76.9963 0.2845
    $t_{S}$ 49.5017 0.1365 9.2494 105.7338 0.3546
    $t_{KC}$ 114.4157 0.3203 40.6317 44.4738 0.6287
    $t_{1(M)}$ 0.0296 0.0836 8.5327 15.5233 0.1460

    Table 9.  Mean square error of selected estimators of variance.
    Estimator Population-1 Population-2 Population-3 Population-4 Population-5
    $t_{C}$ 2.8368 1.1056 8912.8527 15510.8332 1.9248
    $t_{R}$ 2.7938 0.6591 7848.1442 13794.3352 1.2358
    $t_{SC}$ 53.3671 0.7403 8213.9491 13922.3775 1.4277
    $t_{AA}$ 191.9420 0.6700 11944.0214 18316.7933 1.2579
    $t_{1(V)}$ 1.9203 0.6585 7779.2856 12993.2609 1.2263


    The above tables show that our proposed estimators of the mean and the variance perform better than all other competing estimators. They also show that the estimator of the mean proposed by [25] and the estimator of the variance proposed by [13] perform worst in most of the populations. The performance of these two estimators improves when the population variance of the study variable is much smaller than the population variance of the auxiliary variable.

    In this paper, we have proposed some estimators of the general population parameter for single- and two-phase sampling. The estimators use information on a single auxiliary variable or on several auxiliary variables, and they can be used to obtain estimators of the population mean, the population variance and the population coefficient of variation. Expressions for the mean square error of the proposed estimators have been obtained for single- and two-phase sampling, and these show that the proposed estimators have a smaller mean square error than some of the existing estimators. We have conducted an extensive simulation study of the proposed estimators for single- and two-phase sampling, in which several available estimators are compared. In this study, the proposed estimators of the mean and the variance perform better than the competing estimators, and the simulated mean square errors of all estimators decrease as the sample size increases. We have also conducted an empirical study using some real populations, by computing the analytical mean square error of the various estimators. The empirical study shows that our proposed estimators of the mean and the variance are better than the other estimators used in the study. We therefore recommend the proposed estimators for estimation of the population mean and the population variance in preference to the existing estimators considered here.

    The authors declare no conflicts of interest.



    [1] W. G. Cochran, The estimation of the yields of cereal experiments by sampling for the ratio of grain to total produce, J. Agric. Sci., 30 (1940), 262–275. https://doi.org/10.1017/S0021859600048012
    [2] S. K. Srivastava, H. S. Jhajj, A class of estimators of the population mean in survey sampling using auxiliary information, Biometrika, 68 (1981), 341–343. https://doi.org/10.1093/biomet/68.1.341
    [3] H. P. Singh, M. R. Espejo, On linear regression and ratio-product estimation of a finite population mean, J. R. Stat. Soc., 52 (2003), 59–67. https://doi.org/10.1111/1467-9884.00341
    [4] M. Samiuddin, M. Hanif, Estimation of population mean in single- and two-phase sampling with or without additional information, Pak. J. Stat., 23 (2007), 1–9.
    [5] A. Y. Dar, N. Saeed, M. O. A. Abu-Shawiesh, S. H. Shahbaz, M. Q. Shahbaz, A new class of ratio estimator in single- and two-phase sampling, AIMS Math., 7 (2022), 14208–14226. https://doi.org/10.3934/math.2022783
    [6] Z. Ahmad, M. Q. Shahbaz, M. Hanif, Two phase sampling, Cambridge Scholars Publishing, 2013. https://doi.org/10.13140/2.1.4488.7042
    [7] M. Hanif, M. Q. Shahbaz, M. Ahmed, Sampling techniques: methods and applications, Nova Science Publishers, 2018.
    [8] S. K. Srivastava, H. S. Jhajj, A class of estimators using auxiliary information for estimating finite population variance, Sankhya, 42 (1980), 87–96.
    [9] C. Isaki, Variance estimation using auxiliary information, J. Am. Stat. Assoc., 78 (1983), 117–123. https://doi.org/10.1080/01621459.1983.10477939
    [10] C. Kadilar, H. Cingi, Ratio estimators for the population variance in simple and stratified random sampling, Appl. Math. Comput., 173 (2006), 1047–1059. https://doi.org/10.1016/j.amc.2005.04.032
    [11] W. Abu-Dayyeh, M. S. Ahmed, Ratio and regression estimator for the variance under two-phase sampling, Int. J. Stat. Sci., 4 (2005), 49–56.
    [12] R. Singh, P. Chauhan, N. Sawan, F. Smarandache, Improved exponential estimator for population variance using two auxiliary variables, arXiv, 2009. https://doi.org/10.48550/arXiv.0902.0126
    [13] A. Asghar, A. Sanaullah, M. Hanif, Generalized exponential type estimator for population variance in survey sampling, Rev. Colomb. Estad., 37 (2014), 211–222. https://doi.org/10.15446/rce.v37n1.44368
    [14] J. Shabbir, S. Gupta, A note on generalized exponential type estimator of population variance in survey sampling, Rev. Colomb. Estad., 38 (2015), 385–397. https://doi.org/10.15446/rce.v38n2.51667
    [15] R. Singh, M. Mishra, B. P. Singh, P. Singh, N. K. Adichwal, Improved estimators for population coefficient of variation using auxiliary variable, J. Stat. Manage. Syst., 21 (2018), 1335–1355. https://doi.org/10.1080/09720510.2018.1503405
    [16] S. Gupta, J. Shabbir, Variance estimation in simple random sampling using auxiliary information, Hacettepe J. Math. Stat., 37 (2008), 57–67.
    [17] J. Subramani, G. Kumarapandiyan, Variance estimation using median of the auxiliary variable, Int. J. Prob. Stat., 1 (2012), 62–66. https://doi.org/10.5923/j.ijps.20120103.02
    [18] S. K. Yadav, C. Kadilar, Improved exponential type ratio estimator of population variance, Rev. Colomb. Estad., 36 (2013), 145–152.
    [19] A. H. Al-Marshadi, A. H. Alharby, M. Q. Shahbaz, On some new estimators of population variance in single and two-phase sampling, Maejo Int. J. Sci. Technol., 12 (2018), 272–281.
    [20] T. Akhlaq, M. Ismail, M. Q. Shahbaz, On efficient estimation of process variability, Symmetry, 11 (2019), 554. https://doi.org/10.3390/sym11040554
    [21] C. Long, W. Chen, R. Yang, D. Yao, Ratio estimation of the population mean using auxiliary information under the optimal sampling design, Probab. Eng. Inf. Sci., 36 (2022), 449–460. https://doi.org/10.1017/S0269964820000625
    [22] N. K. Adichwal, A. A. H. Ahmadini, Y. S. Raghav, R. Singh, I. Ali, Estimation of general parameters using auxiliary information in simple random sampling without replacement, J. King Saud Univ. Sci., 34 (2022), 101754. https://doi.org/10.1016/j.jksus.2021.101754
    [23] L. N. Upadhyaya, H. P. Singh, S. Chatterjee, R. Yadav, Improved ratio and product exponential type estimators, J. Stat. Theory Pract., 5 (2011), 285–302. https://doi.org/10.1080/15598608.2011.10412029
    [24] S. Bahl, R. K. Tuteja, Ratio and product type exponential estimators, J. Inf. Optim. Sci., 12 (1991), 159–164. https://doi.org/10.1080/02522667.1991.10699058
    [25] C. Kadilar, H. Cingi, Ratio estimators in simple random sampling, Appl. Math. Comput., 151 (2004), 893–902. https://doi.org/10.1016/S0096-3003(03)00803-8
    [26] G. N. Singh, On the improvement of product method of estimation in sample surveys, J. Indian Soc. Agric. Stat., 56 (2003), 267–275.
    [27] S. Weisberg, Applied linear regression, 2 Eds., John Wiley, 1987. https://doi.org/10.2307/2531984
    [28] M. H. Kutner, Applied linear statistical models, 5 Eds., McGraw Hill Irwin, 2005.
  • © 2023 the Author(s), licensee AIMS Press. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0)