Research article

A general class of estimators on estimating population mean using the auxiliary proportions under simple and two phase sampling

  • Received: 27 May 2021 Accepted: 13 September 2021 Published: 24 September 2021
  • MSC : 62D99, 62F10

  • This article deals with estimation of the finite population mean using auxiliary proportions under simple and two phase sampling schemes with two auxiliary variables. Mathematical expressions for the mean squared errors of the proposed estimators are derived to the first order of approximation. We compare the proposed class of estimators theoretically and numerically with the usual mean estimator and the estimators of Naik and Gupta [1]. Both the theoretical and the numerical findings support the superiority of the proposed class of estimators over the estimators available in the literature.

    Citation: Xuechen Liu, Muhammad Arslan. A general class of estimators on estimating population mean using the auxiliary proportions under simple and two phase sampling[J]. AIMS Mathematics, 2021, 6(12): 13592-13607. doi: 10.3934/math.2021790




    It is well known that in large-scale survey sampling the use of several auxiliary variables improves the precision of estimators. In survey sampling, researchers have long sought estimators of population parameters, such as the mean and the median, that possess desirable statistical properties. For that purpose a representative part of the population is needed; when the population of interest is homogeneous, simple random sampling (SRS) can be used to select the units. In some situations the auxiliary information is available in the form of attributes that are positively correlated with the study variable. Several authors, including Naik and Gupta [1], Jhajj [2], Abd-Elfattah [3], Koyuncu [4], Solanki [5], Sharma [6] and Malik [7], proposed estimators that take advantage of the point bi-serial correlation between the auxiliary attribute and the study variable, using information on a single auxiliary attribute. Verma [8], Malik [7], Solanki et al. [9] and Sharma [10] suggested estimators that use information on two auxiliary attributes in SRS. Mahdizadeh and Zamanzade [11] developed a kernel-based estimator of P(X>Y) in ranked set sampling, Singh, Pal and Solanki [12] developed a new class of estimators of the finite population mean in survey sampling, and Mahdizadeh and Zamanzade [13] suggested a smooth estimator of a reliability function in ranked set sampling. Hussain et al. [14] and Al-Marzouki et al. [15] have also contributed in this direction.

    In this article, we consider the problem of estimating the finite population mean using auxiliary proportions under simple and two phase sampling schemes. Expressions for the bias and mean squared error (MSE) of the proposed estimators are derived to the first order of approximation. The performance of the proposed class of estimators is compared with that of the existing estimators both theoretically and numerically. In terms of percentage relative efficiency (PRE), the proposed class of estimators is found to outperform the existing ones.

    Let $U=\{u_1,u_2,\ldots,u_N\}$ be a finite population of $N$ distinct units, and suppose that a sample of $n$ units is drawn from $U$ by simple random sampling without replacement (SRSWOR). Let $y_i$ and $\phi_{ij}$ $(j=1,2)$ denote the observations on the study variable $y$ and on the auxiliary attribute $\phi_j$ for the $i$th unit $(i=1,2,\ldots,N)$, where

    $\phi_{ij}=1$, if the $i$th unit possesses attribute $\phi_j$,

    $\phi_{ij}=0$, otherwise.

    $P_j=\sum_{i=1}^{N}\phi_{ij}/N=A_j/N$ and $p_j=\sum_{i=1}^{n}\phi_{ij}/n=a_j/n$ $(j=1,2)$ are the population and sample proportions of the auxiliary attributes, respectively. Let $\bar Y=\sum_{i=1}^{N}y_i/N$ and $\bar y=\sum_{i=1}^{n}y_i/n$ be the population and sample means of the study variable $y$. $S_{\phi_j y}=\sum_{i=1}^{N}(\phi_{ij}-P_j)(y_i-\bar Y)/(N-1)$ $(j=1,2)$ is the covariance between the study variable and the $j$th auxiliary attribute, and $S_{\phi_1\phi_2}=\sum_{i=1}^{N}(\phi_{i1}-P_1)(\phi_{i2}-P_2)/(N-1)$ is the covariance between the two auxiliary attributes. $\rho_{y\phi_j}=S_{\phi_j y}/(S_yS_{\phi_j})$ is the point bi-serial correlation between the study variable $y$ and the auxiliary attribute $\phi_j$, and $\rho_{\phi_1\phi_2}=S_{\phi_1\phi_2}/(S_{\phi_1}S_{\phi_2})$ is the correlation between the two auxiliary attributes.

    Let us define $e_0=\dfrac{\bar y-\bar Y}{\bar Y}$, $e_1=\dfrac{p_1-P_1}{P_1}$, $e_2=\dfrac{p_2-P_2}{P_2}$,

    such that $E(e_i)=0$ $(i=0,1,2)$,

    $E(e_0^2)=fC_y^2=V_{200}$, $E(e_1^2)=fC_{\phi_1}^2=V_{020}$, $E(e_2^2)=fC_{\phi_2}^2=V_{002}$,

    $E(e_0e_1)=f\rho_{y\phi_1}C_yC_{\phi_1}=V_{110}$, $E(e_0e_2)=f\rho_{y\phi_2}C_yC_{\phi_2}=V_{101}$,

    $E(e_1e_2)=f\rho_{\phi_1\phi_2}C_{\phi_1}C_{\phi_2}=V_{011}$,

    where $C_y=S_y/\bar Y$ and $C_{\phi_j}=S_{\phi_j}/P_j$ $(j=1,2)$ are the coefficients of variation of the study variable and the auxiliary attributes, $S_y^2=\sum_{i=1}^{N}(y_i-\bar Y)^2/(N-1)$ and $S_{\phi_j}^2=\sum_{i=1}^{N}(\phi_{ij}-P_j)^2/(N-1)$ $(j=1,2)$ are the corresponding variances, and $f=\left(\dfrac{1}{n}-\dfrac{1}{N}\right)$ is the correction factor.
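    The moment terms defined above are simple functions of the population summary values, so they can be tabulated once and reused in every MSE expression. The following Python sketch computes them for Population 1 of the empirical study; the function name `srs_moments` and its argument names are illustrative choices and are not part of the original paper.

```python
def srs_moments(N, n, Cy, C1, C2, rho_y1, rho_y2, rho_12):
    """Return f and the V_rst terms of the relative errors under SRSWOR."""
    f = 1.0 / n - 1.0 / N  # correction factor f = 1/n - 1/N
    V = {
        "V200": f * Cy ** 2,            # E(e0^2)
        "V020": f * C1 ** 2,            # E(e1^2)
        "V002": f * C2 ** 2,            # E(e2^2)
        "V110": f * rho_y1 * Cy * C1,   # E(e0 e1)
        "V101": f * rho_y2 * Cy * C2,   # E(e0 e2)
        "V011": f * rho_12 * C1 * C2,   # E(e1 e2)
    }
    return f, V


# Summary values of Population 1 as quoted in the empirical study
f, V = srs_moments(N=34, n=15, Cy=0.7531, C1=0.6090231, C2=0.7496556,
                   rho_y1=0.559, rho_y2=0.6281, rho_12=0.6729)
for name, value in V.items():
    print(f"{name} = {value:.6f}")
```

    Multiplying any of these terms by $\bar Y^2$ gives the corresponding contribution to an MSE on the original scale of $y$.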

    The rest of the paper is organized as follows. Sections 1.1 and 1.2 give the introduction and notation for simple random sampling and two phase sampling. Sections 2.1 and 2.3 discuss some existing estimators of the finite population mean under both sampling designs. The proposed estimators are given in Sections 2.2 and 2.4. Theoretical comparisons are conducted in Sections 3.1 and 3.2, and Sections 4.1 and 4.2 focus on empirical studies. Finally, applications and conclusions are given in Sections 5 and 6.

    The precision of an estimate can be increased in two ways. First, precision may be increased by using an adequate sampling design for the variable being estimated. Second, precision may be increased by using an appropriate estimation procedure, i.e., by using auxiliary information that is closely associated with the variable under study. In applications there are situations in which complete information on the auxiliary attribute is not available, or in which obtaining it is expensive. In that case, two phase sampling (double sampling) is used to estimate the unknown population parameters. In two phase sampling, a large preliminary sample of size $n'$ is selected by SRSWOR at the first phase, and information on the auxiliary variable is collected from it to estimate the unknown auxiliary parameter. A subsample of size $n$ $(n<n')$ is then selected at the second phase, and both the study and auxiliary variables are observed. Here we assume that the population proportion $P_1$ is unknown and introduce an improved estimator of the population mean. Kiregyera [16], Mohanty [17], Malik [7] and Haq [18] used two auxiliary variables in two phase sampling for better estimation of the mean.

    As an example, when estimating the yield of a crop, the area under the crop may be unknown while the area of each farm is known. Then $y$, $P_1$ and $P_2$ relate, respectively, to the yield, the area under the crop and the area under cultivation.

    Consider a finite population $U=(u_1,u_2,\ldots,u_N)$ of size $N$, and let $y_i$, $\phi_{i1}$ and $\phi_{i2}$ be the values of the study variable and of the two auxiliary attributes associated with each unit $u_i$ $(i=1,2,\ldots,N)$ of the population, such that:

    $\phi_{ij}=1$ if the $i$th unit in the population possesses auxiliary attribute $\phi_j$, and $\phi_{ij}=0$ otherwise.

    We assume that the population proportion $P_1$ of the first auxiliary attribute is unknown but that $P_2$ is known for the second. Let $p_j'=\sum_{i=1}^{n'}\phi_{ij}/n'=a_j'/n'$ for $j=1,2$ be the estimate of $P_j$ obtained from the first phase sample of size $n'$, drawn by SRSWOR from the population of $N$ units. Let $\bar y=\sum_{i=1}^{n}y_i/n$ and $p_1=\sum_{i=1}^{n}\phi_{i1}/n=a_1/n$ be the estimates of $\bar Y$ and $P_1$, respectively, obtained from a second phase sample of size $n$, drawn from the first phase sample of size $n'$ by SRSWOR.
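    The two phase design described above can be illustrated with a small simulation. The Python sketch below draws a first phase SRSWOR sample, then a second phase subsample from it, and computes the first phase proportions together with the second phase mean and proportion. The helper `two_phase_sample` and the synthetic population are hypothetical and serve only as an illustration; they are not data from the paper.

```python
import random


def two_phase_sample(y, phi1, phi2, n_first, n_second, seed=None):
    """Draw a two phase SRSWOR sample and return the basic statistics."""
    rng = random.Random(seed)
    first = rng.sample(range(len(y)), n_first)   # first phase units (size n')
    second = rng.sample(first, n_second)         # subsample of the first phase (size n)
    return {
        "first": first,
        "second": second,
        "p1_first": sum(phi1[i] for i in first) / n_first,   # p1' from first phase
        "p2_first": sum(phi2[i] for i in first) / n_first,   # p2' from first phase
        "ybar": sum(y[i] for i in second) / n_second,        # second phase mean of y
        "p1": sum(phi1[i] for i in second) / n_second,       # second phase p1
    }


# Synthetic population (made-up values, for illustration only)
pop_rng = random.Random(1)
y = [pop_rng.uniform(50, 400) for _ in range(34)]
phi1 = [1 if v > 100 else 0 for v in y]   # attribute 1: y above 100
phi2 = [1 if v > 200 else 0 for v in y]   # attribute 2: y above 200

draw = two_phase_sample(y, phi1, phi2, n_first=15, n_second=3, seed=42)
print(draw["p1_first"], draw["p2_first"], draw["ybar"], draw["p1"])
```

    Only the first phase needs the auxiliary attributes on all $n'$ units; the study variable is observed on the smaller second phase sample.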

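    To obtain the bias and MSE of the estimators in two phase sampling, we define the error terms as follows:

    $e_0=\dfrac{\bar y-\bar Y}{\bar Y}$, $e_1=\dfrac{p_1-P_1}{P_1}$, $e_2'=\dfrac{p_2'-P_2}{P_2}$, $e_1'=\dfrac{p_1'-P_1}{P_1}$,

    such that:

    $E(e_0)=E(e_1)=E(e_2')=E(e_1')=0$,

    $E(e_0^2)=fC_y^2=V_{200}$, $E(e_1^2)=fC_{\phi_1}^2=V_{020}$, $E(e_2^2)=fC_{\phi_2}^2=V_{002}$,

    $E(e_0e_1)=f\rho_{y\phi_1}C_yC_{\phi_1}=V_{110}$, $E(e_0e_2)=f\rho_{y\phi_2}C_yC_{\phi_2}=V_{101}$,

    $E(e_1e_2)=f\rho_{\phi_1\phi_2}C_{\phi_1}C_{\phi_2}=V_{011}$, $E(e_1'e_0)=f'\rho_{y\phi_1}C_yC_{\phi_1}=V'_{110}$,

    $E(e_0e_2')=f'\rho_{y\phi_2}C_yC_{\phi_2}=V'_{101}$, $E(e_1'^2)=f'C_{\phi_1}^2=V'_{020}$,

    $E(e_2'^2)=f'C_{\phi_2}^2=V'_{002}$, $f=\dfrac{1}{n}-\dfrac{1}{N}$, $f'=\dfrac{1}{n'}-\dfrac{1}{N}$,

    $\bar Y=\dfrac{1}{N}\sum_{i=1}^{N}y_i$, $\bar y=\dfrac{1}{n}\sum_{i=1}^{n}y_i$,

    $S_y^2=\dfrac{\sum_{i=1}^{N}(y_i-\bar Y)^2}{N-1}$, $S_{y\phi_j}=\dfrac{\sum_{i=1}^{N}(\phi_{ij}-P_j)(y_i-\bar Y)}{N-1}$,

    $S_{\phi_1\phi_2}=\dfrac{\sum_{i=1}^{N}(\phi_{i1}-P_1)(\phi_{i2}-P_2)}{N-1}$, $C_y=\dfrac{S_y}{\bar Y}$,

    $C_{\phi_j}=\dfrac{S_{\phi_j}}{P_j}$, $S_{\phi_j}^2=\dfrac{\sum_{i=1}^{N}(\phi_{ij}-P_j)^2}{N-1}$,

    $s_{\phi_j}^2=\dfrac{\sum_{i=1}^{n}(\phi_{ij}-p_j)^2}{n-1}$ is the sample variance from the sample of size $n$,

    $s_{\phi_j}'^2=\dfrac{\sum_{i=1}^{n'}(\phi_{ij}-p_j')^2}{n'-1}$ is the sample variance from the sample of size $n'$,

    $\rho_{y\phi_j}=\dfrac{S_{y\phi_j}}{S_yS_{\phi_j}}$ is the point bi-serial correlation between the study variable $y$ and the auxiliary attribute $\phi_j$ $(j=1,2)$,

    $\rho_{\phi_1\phi_2}=\dfrac{S_{\phi_1\phi_2}}{S_{\phi_1}S_{\phi_2}}$ is the correlation between the two auxiliary attributes $\phi_1$ and $\phi_2$.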

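    The usual mean per unit estimator of the population mean, which uses no auxiliary information, is

    $$t_U=\bar y. \tag{2.1}$$

    The MSE of $t_U$ is given by

    $$\text{MSE}(t_U)=\bar Y^2V_{200}. \tag{2.2}$$

    Using information on the population proportions, Naik and Gupta [1] proposed the following estimators:

    $$t_A=\bar y\left(\frac{P_1}{p_1}\right), \tag{2.3}$$
    $$t_B=\bar y\left(\frac{p_2}{P_2}\right), \tag{2.4}$$
    $$t_C=\bar y\exp\left(\frac{P_1-p_1}{P_1+p_1}\right), \tag{2.5}$$
    $$t_D=\bar y\exp\left(\frac{p_2-P_2}{p_2+P_2}\right). \tag{2.6}$$

    The MSE expressions of the estimators $t_A$, $t_B$, $t_C$ and $t_D$ are respectively given as

    $$\text{MSE}(t_A)\cong\bar Y^2(V_{200}-2V_{110}+V_{020}), \tag{2.7}$$
    $$\text{MSE}(t_B)\cong\bar Y^2(V_{200}+2V_{101}+V_{002}), \tag{2.8}$$
    $$\text{MSE}(t_C)\cong\bar Y^2\left(V_{200}-V_{110}+\tfrac{1}{4}V_{020}\right), \tag{2.9}$$
    $$\text{MSE}(t_D)\cong\bar Y^2\left(V_{200}+V_{101}+\tfrac{1}{4}V_{002}\right). \tag{2.10}$$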
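    As a worked example of Eqs (2.2) and (2.7) to (2.10), the Python sketch below evaluates the first order MSEs relative to $\bar Y^2$ and converts them into percentage relative efficiencies with respect to $t_U$. The summary values are those quoted for Population 1 in the empirical study; the function name `pre_existing_srs` is an illustrative assumption.

```python
def pre_existing_srs(N, n, Cy, C1, C2, rho_y1, rho_y2):
    """PRE (relative to t_U) of t_A, t_B, t_C, t_D under SRS, first order."""
    f = 1.0 / n - 1.0 / N
    V200 = f * Cy ** 2
    V020 = f * C1 ** 2
    V002 = f * C2 ** 2
    V110 = f * rho_y1 * Cy * C1
    V101 = f * rho_y2 * Cy * C2
    # MSE divided by Ybar^2, following Eqs (2.2) and (2.7)-(2.10)
    rel_mse = {
        "tU": V200,
        "tA": V200 - 2 * V110 + V020,
        "tB": V200 + 2 * V101 + V002,
        "tC": V200 - V110 + 0.25 * V020,
        "tD": V200 + V101 + 0.25 * V002,
    }
    return {name: 100.0 * V200 / m for name, m in rel_mse.items()}


# Population 1 summary values from the empirical study
pre = pre_existing_srs(N=34, n=15, Cy=0.7531, C1=0.6090231, C2=0.7496556,
                       rho_y1=0.559, rho_y2=0.6281)
for name, value in pre.items():
    print(f"{name}: {value:.2f}")
```

    The ratio-type estimators $t_A$ and $t_C$ gain efficiency because $\phi_1$ is positively correlated with $y$, whereas the product-type estimators $t_B$ and $t_D$ lose efficiency for the same reason.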

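    Malik [7] proposed an exponential type estimator as

    $$t_{MS}=\bar y\exp\left(\frac{P_1-p_1}{P_1+p_1}\right)^{\gamma_1}\exp\left(\frac{P_2-p_2}{P_2+p_2}\right)^{\gamma_2}+b_1(P_1-p_1)+b_2(P_2-p_2), \tag{2.11}$$

    where $b_1=\dfrac{s_{y\phi_1}}{s_{\phi_1}^2}$ and $b_2=\dfrac{s_{y\phi_2}}{s_{\phi_2}^2}$ are the sample regression coefficients, and $\gamma_1$ and $\gamma_2$ are two unknown constants. The optimum values of these constants are given as:

    $$\gamma_{1(opt)}=\frac{2\{-P_1\beta_1C_{\phi_1}(-1+\rho^2_{\phi_1\phi_2})+\bar YC_y(-\rho_{y\phi_1}+\rho_{\phi_1\phi_2}\rho_{y\phi_2})\}}{\bar YC_{\phi_1}(-1+\rho^2_{\phi_1\phi_2})},$$
    $$\gamma_{2(opt)}=\frac{2\{-P_2\beta_2C_{\phi_2}(-1+\rho^2_{\phi_1\phi_2})+\bar YC_y(-\rho_{y\phi_2}+\rho_{\phi_1\phi_2}\rho_{y\phi_1})\}}{\bar YC_{\phi_2}(-1+\rho^2_{\phi_1\phi_2})},$$

    where $\beta_1=\dfrac{S_{y\phi_1}}{S_{\phi_1}^2}$ and $\beta_2=\dfrac{S_{y\phi_2}}{S_{\phi_2}^2}$ are the regression coefficients. The minimum mean squared error at the optimum values of $\gamma_1$ and $\gamma_2$ is given as:

    $$\text{MSE}(t_{MS})_{\min}\cong f\bar Y^2C_y^2(1-R^2_{y\phi_1\phi_2}), \tag{2.12}$$

    where $R^2_{y\phi_1\phi_2}=\dfrac{\rho^2_{\phi_1y}+\rho^2_{\phi_2y}-2\rho_{\phi_1y}\rho_{\phi_2y}\rho_{\phi_1\phi_2}}{1-\rho^2_{\phi_1\phi_2}}$ is the squared multiple correlation of $y$ on $\phi_1$ and $\phi_2$.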

    Shorthand notation is used below so that the long expressions are easier to follow.

    We propose a generalized class of estimators of the mean in simple random sampling using two auxiliary attributes, as

    $$t_{RPR}=\left[k_1\bar y-k_2(p_1-P_1)\right]\left[\alpha\left\{2-\exp\left(\frac{\eta(p_2-P_2)}{\eta(p_2+P_2)+2\lambda}\right)\right\}+(1-\alpha)\exp\left(\frac{\eta(P_2-p_2)}{\eta(P_2+p_2)+2\lambda}\right)\right], \tag{2.13}$$

    where $k_1$ and $k_2$ are suitable constants whose values are to be determined such that the MSE of $t_{RPR}$ is minimum; $\eta$ and $\lambda$ are either real numbers or functions of known parameters of the auxiliary attribute $\phi_2$, such as the coefficient of variation $(C_{\phi_2})$ and the coefficient of kurtosis $(\beta_{2\phi_2})$; and $\alpha$ is a scalar $(0\le\alpha\le 1)$ used to generate different estimators. Let $\bar Y$ and $(P_1,P_2)$ be the population means of the study variable and the auxiliary proportions, respectively, and let $\bar y$ and $(p_1,p_2)$ be the corresponding sample means.

    Putting $\alpha=1$ and $\alpha=0$ in (2.13), we get the following estimators.

    For $\alpha=1$, the suggested class of estimators reduces to:

    $$t_{RPR(\alpha=1)}=\left[k_1\bar y-k_2(p_1-P_1)\right]\left[2-\exp\left\{\frac{\eta(p_2-P_2)}{\eta(p_2+P_2)+2\lambda}\right\}\right].$$

    For $\alpha=0$, the suggested class of estimators reduces to:

    $$t_{RPR(\alpha=0)}=\left[k_1\bar y-k_2(p_1-P_1)\right]\left[\exp\left\{\frac{\eta(P_2-p_2)}{\eta(P_2+p_2)+2\lambda}\right\}\right].$$

    A set of new estimators generated from Eq (2.13) by suitable choices of $\alpha$, $\eta$ and $\lambda$ is listed in Table 1.

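    Table 1.  Set of estimators generated from estimator $t_{RPR}$.
    | Subset of proposed estimator | $\alpha$ | $\eta$ | $\lambda$ |
    |---|---|---|---|
    | $t_{RPR1}=[k_1\bar y-k_2(p_1-P_1)]\exp\left\{\frac{C_{\phi_2}(P_2-p_2)}{C_{\phi_2}(P_2+p_2)+2\beta_{2\phi_2}}\right\}$ | 0 | $C_{\phi_2}$ | $\beta_{2\phi_2}$ |
    | $t_{RPR2}=[k_1\bar y-k_2(p_1-P_1)]\exp\left\{\frac{P_2(P_2-p_2)}{P_2(P_2+p_2)+2}\right\}$ | 0 | $P_2$ | 1 |
    | $t_{RPR3}=[k_1\bar y-k_2(p_1-P_1)]\exp\left\{\frac{P_2-p_2}{(P_2+p_2)+2C_{\phi_2}}\right\}$ | 0 | 1 | $C_{\phi_2}$ |
    | $t_{RPR4}=[k_1\bar y-k_2(p_1-P_1)]\exp\left\{\frac{P_2-p_2}{(P_2+p_2)+2}\right\}$ | 0 | 1 | 1 |
    | $t_{RPR5}=[k_1\bar y-k_2(p_1-P_1)]\left[2-\exp\left\{\frac{C_{\phi_2}(p_2-P_2)}{C_{\phi_2}(p_2+P_2)+2\beta_{2\phi_2}}\right\}\right]$ | 1 | $C_{\phi_2}$ | $\beta_{2\phi_2}$ |
    | $t_{RPR6}=[k_1\bar y-k_2(p_1-P_1)]\left[2-\exp\left\{\frac{P_2(p_2-P_2)}{P_2(p_2+P_2)+2}\right\}\right]$ | 1 | $P_2$ | 1 |
    | $t_{RPR7}=[k_1\bar y-k_2(p_1-P_1)]\left[2-\exp\left\{\frac{p_2-P_2}{(p_2+P_2)+2C_{\phi_2}}\right\}\right]$ | 1 | 1 | $C_{\phi_2}$ |
    | $t_{RPR8}=[k_1\bar y-k_2(p_1-P_1)]\left[2-\exp\left\{\frac{p_2-P_2}{(p_2+P_2)+2}\right\}\right]$ | 1 | 1 | 1 |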

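    Expressing Eq (2.13) in terms of $e$'s, we have

    $$t_{RPR}=\left[k_1\bar Y(1+e_0)-k_2P_1e_1\right]\left[\alpha\left\{2-\left(1+\gamma e_2-\tfrac{1}{2}\gamma^2e_2^2\right)\right\}+(1-\alpha)\left(1-\gamma e_2+\tfrac{3}{2}\gamma^2e_2^2\right)\right], \tag{2.14}$$

    where $\gamma=\dfrac{\eta P_2}{2(\eta P_2+\lambda)}$.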
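    The second order expansion of the exponential factor can be checked numerically. Collecting terms, the bracket in the expansion equals $1-\gamma e_2+\left(\tfrac{3}{2}-\alpha\right)\gamma^2e_2^2$ up to terms of third order in $e_2$. The Python sketch below compares this with the exact bracket of the estimator for a small relative error; the function names and the chosen values of $P_2$, $\eta$, $\lambda$ and $e_2$ are illustrative assumptions.

```python
import math


def gamma(eta, P2, lam):
    """gamma = eta*P2 / (2*(eta*P2 + lam))."""
    return eta * P2 / (2.0 * (eta * P2 + lam))


def bracket_exact(e2, P2, alpha, eta, lam):
    """Exact value of the exponential bracket for p2 = P2*(1 + e2)."""
    p2 = P2 * (1.0 + e2)
    up = math.exp(eta * (p2 - P2) / (eta * (p2 + P2) + 2.0 * lam))
    down = math.exp(eta * (P2 - p2) / (eta * (P2 + p2) + 2.0 * lam))
    return alpha * (2.0 - up) + (1.0 - alpha) * down


def bracket_second_order(e2, P2, alpha, eta, lam):
    """Second order approximation: 1 - g*e2 + (3/2 - alpha)*g^2*e2^2."""
    g = gamma(eta, P2, lam)
    return 1.0 - g * e2 + (1.5 - alpha) * g ** 2 * e2 ** 2


# Compare the two for a 1% relative error in p2 (illustrative values)
for alpha in (0.0, 0.5, 1.0):
    exact = bracket_exact(0.01, 0.647059, alpha, 1.0, 1.0)
    approx = bracket_second_order(0.01, 0.647059, alpha, 1.0, 1.0)
    print(alpha, exact, approx, abs(exact - approx))
```

    The discrepancy is of third order in $\gamma e_2$, which is why it is ignored in the first order MSE.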

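    To the first degree of approximation, we have:

    $$t_{RPR}-\bar Y\cong k_1\bar Y+k_1\bar Ye_0-k_2P_1e_1-k_1\gamma\bar Ye_2-\gamma\bar Yk_1e_2e_0+\tfrac{3}{2}\bar Yk_1e_2^2\gamma^2-\alpha\bar Yk_1e_2^2\gamma^2+\gamma k_2P_1e_1e_2-\bar Y. \tag{2.15}$$

    Taking expectation of the above equation, we get the bias of $t_{RPR}$, given by:

    $$\text{Bias}(t_{RPR})\cong\bar Y(k_1-1)-\gamma\bar Yk_1V_{101}+\bar Yk_1V_{002}\gamma^2\left(\tfrac{3}{2}-\alpha\right)+\gamma k_2P_1V_{011}. \tag{2.16}$$

    Squaring both sides of Eq (2.15) and taking expectations, we get the MSE of the estimator $t_{RPR}$ to the first order of approximation, as

    $$\begin{aligned}E(t_{RPR}-\bar Y)^2\cong{}&\bar Y^2+\bar Y^2k_1^2(1-4\gamma V_{101}-2\alpha V_{002}\gamma^2+4\gamma^2V_{002}+V_{200})\\&-k_1\bar Y^2(2-2\alpha V_{002}\gamma^2-2\gamma V_{101}+3V_{002}\gamma^2)\\&+2k_1k_2\bar Y(2\gamma P_1V_{011}-P_1V_{110})\\&-2k_2\bar Y(\gamma P_1V_{011})+k_2^2(P_1^2V_{020}),\end{aligned} \tag{2.17}$$
    $$\text{MSE}(t_{RPR})\cong\bar Y^2+\bar Y^2k_1^2A-k_1\bar Y^2B+2k_1k_2\bar YC-2k_2\bar YD+k_2^2E, \tag{2.18}$$

    where

    $A=1-4\gamma V_{101}-2\alpha V_{002}\gamma^2+4\gamma^2V_{002}+V_{200}$,
    $B=2-2\alpha V_{002}\gamma^2-2\gamma V_{101}+3V_{002}\gamma^2$,
    $C=2\gamma P_1V_{011}-P_1V_{110}$, $D=\gamma P_1V_{011}$, $E=P_1^2V_{020}$.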

    The optimum values of $k_1$ and $k_2$, obtained by minimizing Eq (2.18), are given by

    $$k_1=\frac{BE-2CD}{2(AE-C^2)},$$

    and

    $$k_2=\frac{\bar Y(2AD-BC)}{2(AE-C^2)}.$$

    Substituting the optimum values of $k_1$ and $k_2$ in Eq (2.18), we get the minimum MSE of $t_{RPR}$ as:

    $$\text{MSE}(t_{RPR})_{\min}=\bar Y^2\left[1-\frac{4AD^2+B^2E-4BCD}{4(AE-C^2)}\right]. \tag{2.19}$$

    The minimum MSE of the proposed estimator $t_{RPR}$ in Eq (2.19) depends on many parametric constants; the symbols $A$, $B$, $C$, $D$ and $E$ are used for notational convenience and to keep the expression readable.
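    The optimisation above can be checked numerically. The Python sketch below computes the constants $A$ to $E$, the optimum $k_1$ and $k_2$, and the minimum MSE for the member with $\alpha=1$, $\eta=1$, $\lambda=1$ (that is, $t_{RPR8}$), using the Population 1 summary values from the empirical study. It uses $C=2\gamma P_1V_{011}-P_1V_{110}$, the form that follows from squaring the first order expansion. The function name `rpr_srs` is an illustrative assumption, and the computed PRE is only expected to be close to the tabulated value because the published inputs are rounded.

```python
def rpr_srs(Ybar, P1, P2, V, alpha, eta, lam):
    """Optimum k1, k2 and minimum MSE of t_RPR under SRS (first order)."""
    g = eta * P2 / (2.0 * (eta * P2 + lam))  # gamma
    A = 1 - 4*g*V["V101"] - 2*alpha*V["V002"]*g**2 + 4*g**2*V["V002"] + V["V200"]
    B = 2 - 2*alpha*V["V002"]*g**2 - 2*g*V["V101"] + 3*V["V002"]*g**2
    C = 2*g*P1*V["V011"] - P1*V["V110"]
    D = g*P1*V["V011"]
    E = P1**2 * V["V020"]
    den = 4.0 * (A*E - C**2)
    k1 = 2.0 * (B*E - 2*C*D) / den
    k2 = 2.0 * Ybar * (2*A*D - B*C) / den
    mse_min = Ybar**2 * (1 - (4*A*D**2 + B**2*E - 4*B*C*D) / den)

    def mse(a, b):
        # General MSE as a quadratic function of (k1, k2)
        return (Ybar**2 + Ybar**2*a**2*A - a*Ybar**2*B
                + 2*a*b*Ybar*C - 2*b*Ybar*D + b**2*E)

    return k1, k2, mse_min, mse


# Population 1 summary values from the empirical study
N, n, Ybar, P1, P2 = 34, 15, 199.4412, 0.73529, 0.647059
Cy, C1, C2 = 0.7531, 0.6090231, 0.7496556
r_y1, r_y2, r_12 = 0.559, 0.6281, 0.6729
f = 1.0 / n - 1.0 / N
V = {"V200": f*Cy**2, "V020": f*C1**2, "V002": f*C2**2,
     "V110": f*r_y1*Cy*C1, "V101": f*r_y2*Cy*C2, "V011": f*r_12*C1*C2}

k1, k2, mse_min, mse = rpr_srs(Ybar, P1, P2, V, alpha=1, eta=1, lam=1)
pre = 100.0 * Ybar**2 * V["V200"] / mse_min
print(k1, k2, mse_min, pre)
```

    Evaluating the quadratic form at the returned `k1` and `k2` reproduces `mse_min`, and moving either constant away from its optimum increases the MSE, which is a quick sanity check on the closed-form expressions.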

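    The usual mean per unit estimator in two phase sampling is:

    $$t_U=\bar y. \tag{2.20}$$

    The MSE of $t_U$ is given by

    $$\text{MSE}(t_U)=\bar Y^2V_{200}. \tag{2.21}$$

    The Naik and Gupta [1] estimators in two phase sampling are:

    $$t_A=\bar y\left(\frac{p_1'}{p_1}\right), \tag{2.22}$$
    $$t_B=\bar y\left(\frac{p_2'}{P_2}\right), \tag{2.23}$$
    $$t_C=\bar y\exp\left(\frac{p_1'-p_1}{p_1'+p_1}\right), \tag{2.24}$$
    $$t_D=\bar y\exp\left(\frac{p_2'-P_2}{p_2'+P_2}\right). \tag{2.25}$$

    The MSE expressions of the estimators $t_A$, $t_B$, $t_C$ and $t_D$ are respectively given as:

    $$\text{MSE}(t_A)\cong\bar Y^2(V_{200}+V_{020}-V'_{020}+2V'_{110}-2V_{110}), \tag{2.26}$$
    $$\text{MSE}(t_B)\cong\bar Y^2(V_{200}+V'_{002}+2V'_{101}), \tag{2.27}$$
    $$\text{MSE}(t_C)\cong\bar Y^2\left(V_{200}+V'_{110}-V_{110}-\tfrac{1}{4}V'_{020}+\tfrac{1}{4}V_{020}\right), \tag{2.28}$$
    $$\text{MSE}(t_D)\cong\bar Y^2\left(V_{200}+\tfrac{1}{4}V'_{002}+V'_{101}\right). \tag{2.29}$$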
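    The two phase MSE expressions can be evaluated in the same way as in simple random sampling. The Python sketch below assumes that the terms involving the first phase proportions use the factor $f'=1/n'-1/N$ (primed $V$ terms), while the remaining terms use $f=1/n-1/N$; under that assumption it reproduces the Population 1 column of the two phase PRE table to within rounding. The function name `pre_existing_two_phase` is an illustrative assumption.

```python
def pre_existing_two_phase(N, n_first, n, Cy, C1, C2, rho_y1, rho_y2):
    """PRE (relative to t_U) of t_A, t_B, t_C, t_D in two phase sampling."""
    f = 1.0 / n - 1.0 / N           # second phase factor
    f1 = 1.0 / n_first - 1.0 / N    # first phase factor f'
    V200 = f * Cy ** 2
    V020, V020p = f * C1 ** 2, f1 * C1 ** 2
    V002p = f1 * C2 ** 2
    V110, V110p = f * rho_y1 * Cy * C1, f1 * rho_y1 * Cy * C1
    V101p = f1 * rho_y2 * Cy * C2
    # MSE divided by Ybar^2, following Eqs (2.21) and (2.26)-(2.29)
    rel_mse = {
        "tU": V200,
        "tA": V200 + V020 - V020p + 2 * V110p - 2 * V110,
        "tB": V200 + V002p + 2 * V101p,
        "tC": V200 + V110p - V110 - 0.25 * V020p + 0.25 * V020,
        "tD": V200 + 0.25 * V002p + V101p,
    }
    return {name: 100.0 * V200 / m for name, m in rel_mse.items()}


# Population 1: N = 34, first phase size 15, second phase size 3
pre2 = pre_existing_two_phase(N=34, n_first=15, n=3, Cy=0.7531,
                              C1=0.6090231, C2=0.7496556,
                              rho_y1=0.559, rho_y2=0.6281)
for name, value in pre2.items():
    print(f"{name}: {value:.2f}")
```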

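    Malik [7] used an exponential type estimator with regression coefficients in two phase sampling, which is given by:

    $$t_{MS}=\bar y\exp\left(\frac{p_1'-p_1}{p_1'+p_1}\right)^{\delta_1}\exp\left(\frac{P_2-p_2'}{P_2+p_2'}\right)^{\delta_2}+b_1(p_1'-p_1)+b_2(P_2-p_2'), \tag{2.30}$$

    where $b_1=\dfrac{s_{y\phi_1}}{s_{\phi_1}^2}$ and $b_2=\dfrac{s_{y\phi_2}}{s_{\phi_2}^2}$ are the sample regression coefficients, and $\delta_1$ and $\delta_2$ are two unknown constants. The optimum values of these constants are given as:

    $$\delta_{1(opt)}=-\frac{2P_1\beta_1}{\bar Y}+\frac{2C_y\rho_{y\phi_1}}{C_{\phi_1}},$$
    $$\delta_{2(opt)}=-\frac{2P_2\beta_2}{\bar Y}+\frac{2C_y\rho_{y\phi_2}}{C_{\phi_2}},$$

    where $\beta_1=\dfrac{S_{y\phi_1}}{S_{\phi_1}^2}$ and $\beta_2=\dfrac{S_{y\phi_2}}{S_{\phi_2}^2}$ are the regression coefficients.

    The minimum mean squared error at the optimum values of $\delta_1$ and $\delta_2$ is given as: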

    $$\text{MSE}(t_{MS})_{\min}\cong\bar Y^2C_y^2\left\{f(1-\rho^2_{y\phi_1})+f'(\rho^2_{y\phi_1}-\rho^2_{y\phi_2})\right\}. \tag{2.31}$$

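    We suggest a generalized exponential estimator for the case in which $P_1$ is unknown and $P_2$ is known:

    $$t_{RPR}=\left[k_1\bar y-k_2(p_1-p_1')\right]\left[\alpha\left\{2-\exp\left(\frac{\eta(p_2'-P_2)}{\eta(p_2'+P_2)+2\lambda}\right)\right\}+(1-\alpha)\exp\left(\frac{\eta(P_2-p_2')}{\eta(P_2+p_2')+2\lambda}\right)\right], \tag{2.32}$$

    where $k_1$ and $k_2$ are suitable constants whose values are to be determined such that the MSE of $t_{RPR}$ is minimum; $\eta$ and $\lambda$ are either real numbers or functions of known parameters of the auxiliary attribute $\phi_2$, such as the coefficient of variation $(C_{\phi_2})$ and the coefficient of kurtosis $(\beta_{2\phi_2})$; and $\alpha$ is a scalar $(0\le\alpha\le 1)$ used to generate different estimators.

    Putting $\alpha=1$ and $\alpha=0$ in the above class of estimators, we get the following estimators.

    For $\alpha=1$, the suggested class of estimators reduces to:

    $$t_{RPR(\alpha=1)}=\left[k_1\bar y-k_2(p_1-p_1')\right]\left[2-\exp\left\{\frac{\eta(p_2'-P_2)}{\eta(p_2'+P_2)+2\lambda}\right\}\right].$$

    For $\alpha=0$, the suggested class of estimators reduces to:

    $$t_{RPR(\alpha=0)}=\left[k_1\bar y-k_2(p_1-p_1')\right]\left[\exp\left\{\frac{\eta(P_2-p_2')}{\eta(P_2+p_2')+2\lambda}\right\}\right].$$

    Expressing (2.32) in terms of errors, we have

    $$t_{RPR}=\left[k_1\bar Y(1+e_0)-k_2P_1e_1+k_2P_1e_1'\right]\left[\alpha\left\{2-\left(1+\gamma e_2'-\tfrac{1}{2}\gamma^2e_2'^2\right)\right\}+(1-\alpha)\left(1-\gamma e_2'+\tfrac{3}{2}\gamma^2e_2'^2\right)\right], \tag{2.33}$$

    where $\gamma=\dfrac{\eta P_2}{2(\eta P_2+\lambda)}$.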

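    To the first degree of approximation,

    $$\begin{aligned}t_{RPR}-\bar Y\cong{}&k_1\bar Y+k_1\bar Ye_0-k_2P_1e_1+k_2P_1e_1'-k_1\gamma\bar Ye_2'-\gamma\bar Yk_1e_2'e_0\\&+\tfrac{3}{2}\bar Yk_1e_2'^2\gamma^2-\alpha\bar Yk_1e_2'^2\gamma^2+\gamma k_2P_1e_1e_2'-\gamma k_2P_1e_1'e_2'-\bar Y.\end{aligned} \tag{2.34}$$

    Taking expectation on both sides of Eq (2.34), we have:

    $$\text{Bias}(t_{RPR})\cong\bar Y(k_1-1)-\gamma\bar Yk_1V'_{101}+\bar Yk_1V'_{002}\gamma^2\left(\tfrac{3}{2}-\alpha\right). \tag{2.35}$$

    Squaring Eq (2.34) and neglecting higher powers, we get

    $$\begin{aligned}E(t_{RPR}-\bar Y)^2\cong{}&\bar Y^2+k_1^2\bar Y^2(1-4\gamma V'_{101}-2\alpha V'_{002}\gamma^2+4\gamma^2V'_{002}+V_{200})\\&+k_1\bar Y^2(-2+2\alpha V'_{002}\gamma^2+2\gamma V'_{101}-3V'_{002}\gamma^2)\\&+2k_1k_2\bar Y(P_1V'_{110}-P_1V_{110})+k_2^2(P_1^2V_{020}-P_1^2V'_{020}),\end{aligned}$$
    $$\text{MSE}(t_{RPR})\cong\bar Y^2+k_1^2\bar Y^2A+k_1\bar Y^2B+2k_1k_2\bar YC+k_2^2D, \tag{2.36}$$

    where

    $A=1-4\gamma V'_{101}-2\alpha V'_{002}\gamma^2+4\gamma^2V'_{002}+V_{200}$,
    $B=-2+2\alpha V'_{002}\gamma^2+2\gamma V'_{101}-3V'_{002}\gamma^2$,
    $C=P_1V'_{110}-P_1V_{110}$, $D=P_1^2V_{020}-P_1^2V'_{020}$.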

    The optimum values of $k_1$ and $k_2$ are obtained by minimizing Eq (2.36):

    $$k_1=\frac{-BD}{2(AD-C^2)},$$
    $$k_2=\frac{\bar Y(BC)}{2(AD-C^2)}.$$

    Substituting the optimum values of $k_1$ and $k_2$ in Eq (2.36), we get the minimum MSE of $t_{RPR}$ as:

    $$\text{MSE}(t_{RPR})_{\min}\cong\bar Y^2\left[1-\frac{B^2D}{4(AD-C^2)}\right]. \tag{2.37}$$
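    The two phase optimum can be checked numerically in the same spirit. The Python sketch below treats the MSE as the quadratic $\bar Y^2+k_1^2\bar Y^2A+k_1\bar Y^2B+2k_1k_2\bar YC+k_2^2D$ and uses the stationary point $k_1=-BD/\{2(AD-C^2)\}$, $k_2=\bar YBC/\{2(AD-C^2)\}$. It assumes that the moments of the second attribute entering $A$ and $B$ use the first phase factor $f'$, and that $B$ carries the signs implied by the first order expansion ($B=-2+2\alpha\gamma^2V'_{002}+2\gamma V'_{101}-3\gamma^2V'_{002}$). Under these assumptions the result for $\alpha=\eta=\lambda=1$ and Population 1 is close to the tabulated PRE of $t_{RPR8}$. The function name `rpr_two_phase` is illustrative.

```python
def rpr_two_phase(Ybar, P1, P2, f, f1, Cy, C1, C2, r_y1, r_y2, alpha, eta, lam):
    """Optimum k1, k2 and minimum MSE of the two phase t_RPR (first order)."""
    g = eta * P2 / (2.0 * (eta * P2 + lam))   # gamma
    V200 = f * Cy ** 2
    V101p = f1 * r_y2 * Cy * C2               # assumed first phase moment V'_101
    V002p = f1 * C2 ** 2                      # assumed first phase moment V'_002
    A = 1 - 4*g*V101p - 2*alpha*V002p*g**2 + 4*g**2*V002p + V200
    B = -2 + 2*alpha*V002p*g**2 + 2*g*V101p - 3*V002p*g**2
    C = P1 * r_y1 * Cy * C1 * (f1 - f)        # P1*V'_110 - P1*V_110
    D = P1**2 * C1**2 * (f - f1)              # P1^2*V_020 - P1^2*V'_020
    den = 2.0 * (A*D - C**2)
    k1 = -B * D / den
    k2 = Ybar * B * C / den
    mse_min = Ybar**2 * (1 - B**2 * D / (2.0 * den))

    def mse(a, b):
        # General MSE as a quadratic function of (k1, k2)
        return Ybar**2 + a**2*Ybar**2*A + a*Ybar**2*B + 2*a*b*Ybar*C + b**2*D

    return k1, k2, mse_min, mse


# Population 1: N = 34, first phase size 15, second phase size 3
N, n_first, n, Ybar = 34, 15, 3, 199.4412
f, f1 = 1.0/n - 1.0/N, 1.0/n_first - 1.0/N
k1, k2, mse_min, mse = rpr_two_phase(Ybar, 0.73529, 0.647059, f, f1,
                                     0.7531, 0.6090231, 0.7496556,
                                     0.559, 0.6281, alpha=1, eta=1, lam=1)
pre = 100.0 * Ybar**2 * f * 0.7531**2 / mse_min
print(k1, k2, mse_min, pre)
```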

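    In this section we compare theoretically the minimum MSE of the proposed parent family of estimators $t_{RPR}$ with the MSEs of the existing estimators.

    Comparison with the usual mean per unit estimator:

    (i) $\text{MSE}(t_U)-\text{MSE}(t_{RPRi})_{\min}>0$ $(i=1,2,\ldots,8)$, if $\bar Y^2V_{200}-\bar Y^2\left[1-\frac{4AD^2+B^2E-4BCD}{4(AE-C^2)}\right]>0$.

    Comparison with the Naik and Gupta [1] estimators:

    (ii) $\text{MSE}(t_A)-\text{MSE}(t_{RPRi})_{\min}>0$ $(i=1,2,\ldots,8)$, if $\bar Y^2(V_{200}-2V_{110}+V_{020})-\bar Y^2\left[1-\frac{4AD^2+B^2E-4BCD}{4(AE-C^2)}\right]>0$.

    (iii) $\text{MSE}(t_B)-\text{MSE}(t_{RPRi})_{\min}>0$ $(i=1,2,\ldots,8)$, if $\bar Y^2(V_{200}+2V_{101}+V_{002})-\bar Y^2\left[1-\frac{4AD^2+B^2E-4BCD}{4(AE-C^2)}\right]>0$.

    (iv) $\text{MSE}(t_C)-\text{MSE}(t_{RPRi})_{\min}>0$ $(i=1,2,\ldots,8)$, if $\bar Y^2\left(V_{200}-V_{110}+\tfrac{1}{4}V_{020}\right)-\bar Y^2\left[1-\frac{4AD^2+B^2E-4BCD}{4(AE-C^2)}\right]>0$.

    (v) $\text{MSE}(t_D)-\text{MSE}(t_{RPRi})_{\min}>0$ $(i=1,2,\ldots,8)$, if $\bar Y^2\left(V_{200}+V_{101}+\tfrac{1}{4}V_{002}\right)-\bar Y^2\left[1-\frac{4AD^2+B^2E-4BCD}{4(AE-C^2)}\right]>0$.

    (vi) $\text{MSE}(t_{MS})-\text{MSE}(t_{RPRi})_{\min}>0$ $(i=1,2,\ldots,8)$, if $f\bar Y^2C_y^2(1-R^2_{y\phi_1\phi_2})-\bar Y^2\left[1-\frac{4AD^2+B^2E-4BCD}{4(AE-C^2)}\right]>0$.

    The proposed estimators perform better than the existing estimators whenever conditions (i) to (vi) are satisfied.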

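    In this section we compare theoretically the minimum MSE of the proposed parent family of estimators $t_{RPR}$ in two phase sampling with the MSEs of the existing estimators.

    Comparison with the usual mean per unit estimator:

    (i) $\text{MSE}(t_U)-\text{MSE}(t_{RPRi})_{\min}>0$ $(i=1,2,\ldots,8)$, if $\bar Y^2V_{200}-\bar Y^2\left[1-\frac{B^2D}{4(AD-C^2)}\right]>0$.

    Comparison with the Naik and Gupta [1] estimators:

    (ii) $\text{MSE}(t_A)-\text{MSE}(t_{RPRi})_{\min}>0$ $(i=1,2,\ldots,8)$, if $\bar Y^2(V_{200}+V_{020}-V'_{020}+2V'_{110}-2V_{110})-\bar Y^2\left[1-\frac{B^2D}{4(AD-C^2)}\right]>0$.

    (iii) $\text{MSE}(t_B)-\text{MSE}(t_{RPRi})_{\min}>0$ $(i=1,2,\ldots,8)$, if $\bar Y^2(V_{200}+V'_{002}+2V'_{101})-\bar Y^2\left[1-\frac{B^2D}{4(AD-C^2)}\right]>0$.

    (iv) $\text{MSE}(t_C)-\text{MSE}(t_{RPRi})_{\min}>0$ $(i=1,2,\ldots,8)$, if $\bar Y^2\left(V_{200}+V'_{110}-V_{110}-\tfrac{1}{4}V'_{020}+\tfrac{1}{4}V_{020}\right)-\bar Y^2\left[1-\frac{B^2D}{4(AD-C^2)}\right]>0$.

    (v) $\text{MSE}(t_D)-\text{MSE}(t_{RPRi})_{\min}>0$ $(i=1,2,\ldots,8)$, if $\bar Y^2\left(V_{200}+\tfrac{1}{4}V'_{002}+V'_{101}\right)-\bar Y^2\left[1-\frac{B^2D}{4(AD-C^2)}\right]>0$.

    (vi) $\text{MSE}(t_{MS})-\text{MSE}(t_{RPRi})_{\min}>0$ $(i=1,2,\ldots,8)$, if $\text{MSE}(t_{MS})_{\min}-\bar Y^2\left[1-\frac{B^2D}{4(AD-C^2)}\right]>0$, with $\text{MSE}(t_{MS})_{\min}$ as given in Eq (2.31).

    The proposed estimators perform better than the existing estimators whenever conditions (i) to (vi) are satisfied.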

    Population 1. [19]

    Let $Y$, the study variable, be the cultivated area of wheat in 1964.

    $P_1$ be the proportion of units with cultivated area of wheat greater than 100 acres in 1963.

    $P_2$ be the proportion of units with cultivated area of wheat greater than 500 in 1961.

    $N=34$, $n=15$, $\bar Y=199.4412$, $P_1=0.73529$, $P_2=0.647059$, $S_y=150.215$, $S_{\phi_1}=0.4478111$, $S_{\phi_2}=0.4850713$, $\beta_{2\phi_2}=1.688$, $C_{\phi_1}=0.6090231$, $\rho_{\phi_1\phi_2}=0.6729$, $C_{\phi_2}=0.7496556$, $C_y=0.7531$, $\rho_{y\phi_2}=0.6281$, $\rho_{y\phi_1}=0.559$.

    Population 2. [20]

    Let $Y$, the study variable, be the number of fish caught in 1995.

    $P_1$ be the proportion of units with more than 1000 fish caught in 1993.

    $P_2$ be the proportion of units with more than 2000 fish caught in 1994.

    $N=69$, $n=14$, $\bar Y=4514.89$, $P_1=0.7391304$, $P_2=0.5507246$, $S_y=6099.14$, $S_{\phi_1}=0.4423259$, $S_{\phi_2}=0.5010645$, $\beta_{2\phi_2}=2.015$, $C_{\phi_1}=0.5984409$, $\rho_{\phi_1\phi_2}=0.6577519$, $C_{\phi_2}=0.9098277$, $C_y=1.350$, $\rho_{y\phi_2}=0.538047$, $\rho_{y\phi_1}=0.3966081$.

    Population 3. [21]

    Let $Y$, the study variable, be the tobacco area production in hectares during the year 2009.

    $P_1$ be the proportion of farms with tobacco cultivation area greater than 500 hectares during the year 2007.

    $P_2$ be the proportion of farms with tobacco cultivation area greater than 800 hectares during the year 2008 for 47 districts of Pakistan.

    $N=47$, $n=10$, $\bar Y=1004.447$, $P_1=0.4255319$, $P_2=0.3829787$, $S_y=2351.656$, $S_{\phi_1}=0.499$, $S_{\phi_2}=0.4850713$, $\beta_{2\phi_2}=1.8324$, $C_{\phi_1}=1.174456$, $\rho_{\phi_1\phi_2}=0.9153857$, $C_{\phi_2}=1.283018$, $C_y=2.341245$, $\rho_{y\phi_2}=0.4661508$, $\rho_{y\phi_1}=0.4395989$.

    Population 4. [21]

    Let $Y$, the study variable, be the cotton production in hectares during the year 2009.

    $P_1$ be the proportion of farms with cotton cultivation area greater than 37 hectares during the year 2007.

    $P_2$ be the proportion of farms with cotton cultivation area greater than 35 hectares during the year 2008 for 52 districts of Pakistan.

    $N=52$, $n=11$, $\bar Y=50.03846$, $P_1=0.3846154$, $P_2=0.4423077$, $S_y=71.13086$, $S_{\phi_1}=0.4912508$, $S_{\phi_2}=0.501506$, $\beta_{2\phi_2}=1.62014$, $C_{\phi_1}=1.277252$, $\rho_{\phi_1\phi_2}=0.8877181$, $C_{\phi_2}=1.13384$, $C_y=1.421524$, $\rho_{y\phi_2}=0.6935718$, $\rho_{y\phi_1}=0.7369579$.

    We use the following expression to obtain the percentage relative efficiency (PRE):

    $$\text{PRE}=\frac{\text{MSE}(t_0)}{\text{MSE}(t_i)_{\min}}\times 100, \tag{4.1}$$

    where $t_0=t_U$ is the usual mean estimator and $i$ = U, A, B, C, D, MS, RPR1, RPR2, RPR3, RPR4, RPR5, RPR6, RPR7 and RPR8.

    Table 3 shows that every member of the suggested class $t_{RPRi}$ performs better than the existing estimators $t_A$, $t_B$, $t_C$, $t_D$ and $t_{MS}$ in data sets 2 to 4; in data set 1 this holds for $t_{RPR6}$, $t_{RPR7}$ and $t_{RPR8}$. The largest gains in percentage relative efficiency are observed for $t_{RPR6}$, $t_{RPR7}$ and $t_{RPR8}$.

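    Table 2.  Set of estimators generated from estimator $t_{RPR}$ in two phase sampling.
    | Subset of proposed estimator | $\alpha$ | $\eta$ | $\lambda$ |
    |---|---|---|---|
    | $t_{RPR1}=[k_1\bar y-k_2(p_1-p_1')]\exp\left\{\frac{C_{\phi_2}(P_2-p_2')}{C_{\phi_2}(P_2+p_2')+2\beta_{2\phi_2}}\right\}$ | 0 | $C_{\phi_2}$ | $\beta_{2\phi_2}$ |
    | $t_{RPR2}=[k_1\bar y-k_2(p_1-p_1')]\exp\left\{\frac{P_2(P_2-p_2')}{P_2(P_2+p_2')+2}\right\}$ | 0 | $P_2$ | 1 |
    | $t_{RPR3}=[k_1\bar y-k_2(p_1-p_1')]\exp\left\{\frac{P_2-p_2'}{(P_2+p_2')+2C_{\phi_2}}\right\}$ | 0 | 1 | $C_{\phi_2}$ |
    | $t_{RPR4}=[k_1\bar y-k_2(p_1-p_1')]\exp\left\{\frac{P_2-p_2'}{(P_2+p_2')+2}\right\}$ | 0 | 1 | 1 |
    | $t_{RPR5}=[k_1\bar y-k_2(p_1-p_1')]\left[2-\exp\left\{\frac{C_{\phi_2}(p_2'-P_2)}{C_{\phi_2}(p_2'+P_2)+2\beta_{2\phi_2}}\right\}\right]$ | 1 | $C_{\phi_2}$ | $\beta_{2\phi_2}$ |
    | $t_{RPR6}=[k_1\bar y-k_2(p_1-p_1')]\left[2-\exp\left\{\frac{P_2(p_2'-P_2)}{P_2(p_2'+P_2)+2}\right\}\right]$ | 1 | $P_2$ | 1 |
    | $t_{RPR7}=[k_1\bar y-k_2(p_1-p_1')]\left[2-\exp\left\{\frac{p_2'-P_2}{(p_2'+P_2)+2C_{\phi_2}}\right\}\right]$ | 1 | 1 | $C_{\phi_2}$ |
    | $t_{RPR8}=[k_1\bar y-k_2(p_1-p_1')]\left[2-\exp\left\{\frac{p_2'-P_2}{(p_2'+P_2)+2}\right\}\right]$ | 1 | 1 | 1 |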

    Population 1. [19]

    Let $Y$, the study variable, be the cultivated area of wheat in 1964.

    $P_1$ be the proportion of units with cultivated area of wheat greater than 100 acres in 1963.

    $P_2$ be the proportion of units with cultivated area of wheat greater than 500 in 1961.

    $N=34$, $n'=15$, $n=3$, $\bar Y=199.4412$, $P_1=0.73529$, $P_2=0.647059$, $S_y=150.215$, $S_{\phi_1}=0.4478111$, $S_{\phi_2}=0.4850713$, $\beta_{2\phi_2}=1.688$, $C_{\phi_1}=0.6090231$, $\rho_{\phi_1\phi_2}=0.6729$, $C_{\phi_2}=0.7496556$, $C_y=0.7531$, $\rho_{y\phi_2}=0.6281$, $\rho_{y\phi_1}=0.559$.

    Population 2. [20]

    Let $Y$, the study variable, be the number of fish caught in 1995.

    $P_1$ be the proportion of units with more than 1000 fish caught in 1993.

    $P_2$ be the proportion of units with more than 2000 fish caught in 1994.

    $N=69$, $n'=20$, $n=7$, $\bar Y=4514.89$, $P_1=0.7391304$, $P_2=0.5507246$, $S_y=6099.14$, $S_{\phi_1}=0.4423259$, $S_{\phi_2}=0.5010645$, $\beta_{2\phi_2}=2.015$, $C_{\phi_1}=0.5984409$, $\rho_{\phi_1\phi_2}=0.6577519$, $C_{\phi_2}=0.9098277$, $C_y=1.350$, $\rho_{y\phi_2}=0.538047$, $\rho_{y\phi_1}=0.3966081$.

    Population 3. [21]

    Let $Y$, the study variable, be the tobacco area production in hectares during the year 2009.

    $P_1$ be the proportion of farms with tobacco cultivation area greater than 500 hectares during the year 2007.

    $P_2$ be the proportion of farms with tobacco cultivation area greater than 800 hectares during the year 2008 for 47 districts of Pakistan.

    $N=47$, $n'=15$, $n=7$, $\bar Y=1004.447$, $P_1=0.4255319$, $P_2=0.3829787$, $S_y=2351.656$, $S_{\phi_1}=0.49$, $S_{\phi_2}=0.4850713$, $\beta_{2\phi_2}=1.8324$, $C_{\phi_1}=1.174456$, $\rho_{\phi_1\phi_2}=0.9153857$, $C_{\phi_2}=1.283018$, $C_y=2.341245$, $\rho_{y\phi_2}=0.4661508$, $\rho_{y\phi_1}=0.4395989$.

    Population 4. [21]

    Let $Y$, the study variable, be the cotton production in hectares during the year 2009.

    $P_1$ be the proportion of farms with cotton cultivation area greater than 37 hectares during the year 2007.

    $P_2$ be the proportion of farms with cotton cultivation area greater than 35 hectares during the year 2008 for 52 districts of Pakistan.

    $N=52$, $n'=11$, $n=3$, $\bar Y=50.03846$, $P_1=0.3846154$, $P_2=0.4423077$, $S_y=71.13086$, $S_{\phi_1}=0.4912508$, $S_{\phi_2}=0.501506$, $\beta_{2\phi_2}=1.62014$, $C_{\phi_1}=1.277252$, $\rho_{\phi_1\phi_2}=0.8877181$, $C_{\phi_2}=1.13384$, $C_y=1.421524$, $\rho_{y\phi_2}=0.6935718$, $\rho_{y\phi_1}=0.7369579$.

    We use the following expression to obtain the percentage relative efficiency (PRE):

    $$\text{PRE}=\frac{\text{MSE}(t_0)}{\text{MSE}(t_i)_{\min}}\times 100, \tag{4.2}$$

    where $t_0=t_U$ is the usual mean estimator and $i$ = U, A, B, C, D, MS, RPR1, RPR2, RPR3, RPR4, RPR5, RPR6, RPR7 and RPR8.

    The results for data sets 1 to 4 are given in Table 4.

    Table 4 shows that the suggested class of estimators $t_{RPR}$ performs better than all the existing estimators $t_A$, $t_B$, $t_C$, $t_D$ and $t_{MS}$ in every data set. The largest gains in percentage relative efficiency are observed for $t_{RPR6}$, $t_{RPR7}$ and $t_{RPR8}$.

    In many situations the interest lies solely in the study variable, yet collecting complete information on it is too difficult. In such cases two auxiliary variables in the form of proportions can be used to estimate the study variable. This manuscript provides basic tools for problems related to estimation with auxiliary proportions and two-phase sampling. The abstract refers only to the minimum MSE of the proposed and existing estimators because the minimum MSE can easily be compared with other properties of good estimators, such as those of the MLE; the comparison itself is reported in the form of percentage relative efficiency.

    Statisticians are constantly trying to develop efficient estimators and estimation methodologies, and progress continues for estimators of the population mean. In the present paper our task is to develop a new estimator of the finite population mean under two different sampling schemes, namely simple random sampling and two-phase sampling. The new estimators are proposed under the following situations:

    1). The initial sample is collected through simple random sampling.

    2). The sample is then collected by two-phase sampling using simple random sampling at each phase.

    In this article, we consider the problem of estimating the finite population mean using auxiliary proportions under simple random sampling and two-phase sampling schemes. In surveys it is often observed that the required information is not obtained on the first attempt, even after some call-backs; in such cases we use simple random sampling, and when the required results are still not obtained, we use two-phase sampling. These approaches are used to obtain as much information as possible. In sample surveys, it is well known that when estimating finite population parameters (mean, median, quartiles, coefficient of variation and distribution function), information on the auxiliary variable (proportion) is usually used to improve the efficiency of the estimators. The main aim of this study is to find estimators that are more efficient than the classical and recently proposed estimators, using auxiliary information in the form of proportions, for estimating the finite population mean under simple random sampling and two-phase sampling schemes.

    There are practical situations in which this work is useful:

    1). For a nutritionist, it is interesting to know the proportion of the population that obtains 25% or more of its calorie intake from saturated fat.

    2). Similarly, a soil scientist may be interested in estimating the distribution of clay percent in the soil.

    3). In addition, policy-makers may be interested in knowing the proportion of people living below the poverty line in a developing country.

    In this paper, we have proposed a generalized class of exponential ratio type estimators for estimating the population mean using auxiliary information in the form of proportions under simple and two phase sampling. We used SRS to estimate the population mean when the proportions of the auxiliary attributes are available, and two phase sampling when the auxiliary information is unknown. From the numerical results in Tables 3 and 4 we can see that two phase sampling gave more efficient results than simple random sampling. The use of auxiliary information in the estimation process increases the efficiency of the estimator, which is why we used two auxiliary variables as attributes. In the numerical study we showed that the proposed estimator is more efficient than $t_U$, $t_A$, $t_B$, $t_C$, $t_D$, $t_{MS}$ and the other suggested families of estimators in both simple and two phase sampling schemes.

    Table 3.  Percentage relative efficiency (PRE) with respect to usual mean estimator tU.
    Estimator Data set 1 Data set 2 Data set 3 Data set 4
    tU 100 100 100 100
    tA 133.37 118.36 123.36 207.04
    tB 30.84 45.95 55.04 36.46
    tC 140.5 114.40 118.75 185.29
    tD 55.39 67.77 75.01 58.40
    tMS 139.06 110.94 105.64 146.93
    tRPR1 125.98 134.42 165.4 225.72
    tRPR2 106.66 134.57 167.49 235.43
    tRPR3 111.08 137.09 167.89 235.65
    tRPR4 109.10 137.39 167.82 233.82
    tRPR5 125.75 120.83 166.47 223.72
    tRPR6 161.80 137.02 167.63 235.16
    tRPR7 168.29 137.47 168.16 235.96
    tRPR8 165.09 134.42 168.01 235.93

    Table 4.  Percentage relative efficiency (PRE) with respect to usual mean estimator tU.
    Estimator Data set 1 Data set 2 Data set 3 Data set 4
    tU 100 100 100 100
    tA 128.13 112.64 113.46 166.39
    tB 78.44 75.41 76.75 71.54
    tC 133.90 110.08 110.95 155.10
    tD 90.33 88.37 89.01 86.01
    tMS 134.80 111.05 133.6 209.34
    tRPR1 149.36 131.69 178.24 225.68
    tRPR2 158.43 138.50 181.06 239.47
    tRPR3 160.04 139.77 181.55 242.17
    tRPR4 159.63 139.58 181.78 242.08
    tRPR5 149.59 131.98 178.95 227.06
    tRPR6 158.56 138.59 181.13 239.69
    tRPR7 160.36 139.77 181.13 242.80
    tRPR8 159.40 139.99 182.10 242.70
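
The comparisons drawn from Tables 3 and 4 can be checked mechanically. The Python sketch below copies the PRE values from both tables and asks two questions: whether the best member of the proposed class (tRPR1 to tRPR8) beats the best existing estimator in every data set, and in how many cells a proposed estimator has a higher PRE in Table 4 than in Table 3. The helper names best and proposed_beats_existing are illustrative. Each PRE is relative to tU within its own table, so the cross-table count is only a rough indication.

```python
EXISTING = ["tA", "tB", "tC", "tD", "tMS"]
PROPOSED = [f"tRPR{i}" for i in range(1, 9)]

# PRE values for data sets 1 to 4, copied from Table 3 and Table 4.
TABLE3 = {
    "tA": [133.37, 118.36, 123.36, 207.04], "tB": [30.84, 45.95, 55.04, 36.46],
    "tC": [140.5, 114.40, 118.75, 185.29], "tD": [55.39, 67.77, 75.01, 58.40],
    "tMS": [139.06, 110.94, 105.64, 146.93],
    "tRPR1": [125.98, 134.42, 165.4, 225.72], "tRPR2": [106.66, 134.57, 167.49, 235.43],
    "tRPR3": [111.08, 137.09, 167.89, 235.65], "tRPR4": [109.10, 137.39, 167.82, 233.82],
    "tRPR5": [125.75, 120.83, 166.47, 223.72], "tRPR6": [161.80, 137.02, 167.63, 235.16],
    "tRPR7": [168.29, 137.47, 168.16, 235.96], "tRPR8": [165.09, 134.42, 168.01, 235.93],
}
TABLE4 = {
    "tA": [128.13, 112.64, 113.46, 166.39], "tB": [78.44, 75.41, 76.75, 71.54],
    "tC": [133.90, 110.08, 110.95, 155.10], "tD": [90.33, 88.37, 89.01, 86.01],
    "tMS": [134.80, 111.05, 133.6, 209.34],
    "tRPR1": [149.36, 131.69, 178.24, 225.68], "tRPR2": [158.43, 138.50, 181.06, 239.47],
    "tRPR3": [160.04, 139.77, 181.55, 242.17], "tRPR4": [159.63, 139.58, 181.78, 242.08],
    "tRPR5": [149.59, 131.98, 178.95, 227.06], "tRPR6": [158.56, 138.59, 181.13, 239.69],
    "tRPR7": [160.36, 139.77, 181.13, 242.80], "tRPR8": [159.40, 139.99, 182.10, 242.70],
}


def best(table, names, j):
    """Return (name, PRE) of the highest-PRE estimator among `names` for data set index j."""
    return max(((n, table[n][j]) for n in names), key=lambda pair: pair[1])


def proposed_beats_existing(table):
    """True if, in every data set, the best proposed PRE exceeds the best existing PRE."""
    return all(
        best(table, PROPOSED, j)[1] > best(table, EXISTING, j)[1] for j in range(4)
    )


for label, table in (("Table 3", TABLE3), ("Table 4", TABLE4)):
    winners = [best(table, PROPOSED, j) for j in range(4)]
    print(label, "best proposed per data set:", winners)
    print(label, "best proposed beats best existing everywhere:",
          proposed_beats_existing(table))

# Cells (proposed estimator, data set) where the Table 4 PRE exceeds the Table 3 PRE.
higher_in_table4 = sum(
    TABLE4[n][j] > TABLE3[n][j] for n in PROPOSED for j in range(4)
)
print(f"Higher PRE in Table 4: {higher_in_table4} of {len(PROPOSED) * 4} cells")
```

With these values, the best proposed estimator exceeds the best existing one in every data set of both tables, and the proposed estimators have a higher PRE in Table 4 than in Table 3 in 27 of the 32 cells. Individual members of the class can still fall below an existing estimator, for example tRPR2 against tC for Data set 1 in Table 3.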


    Some possible extensions of the current work are to develop improved finite population mean estimators:

    1) using supplementary information on more than one auxiliary variable;

    2) under stratified two-phase sampling;

    3) in the presence of measurement errors;

    4) under non-response with two-phase sampling.

    The authors are thankful to the learned referee for his useful comments and suggestions.

    The authors declare no conflict of interest.



    [1] V. D. Naik, P. C. Gupta, A note on estimation of mean with known population proportion of an auxiliary character, J. Indian Soc. Agric. Stat., 48 (1996), 151–158.
    [2] H. S. Jhajj, M. K. Sharma, L. K. Grover, A family of estimators of population mean using information on auxiliary attribute, Pak. J. Stat., 48 (2006), 43.
    [3] A. M. Abd-Elfattah, E. A. El-Sherpieny, S. M. Mohamed, O. F. Abdou, Improvement in estimating the population mean in simple random sampling using information on auxiliary attribute, Appl. Math. Comput., 215 (2010), 4198–4202.
    [4] N. Koyuncu, Efficient estimators of population mean using auxiliary attributes, Appl. Math. Comput., 218 (2012), 10900–10905.
    [5] R. S. Solanki, H. P. Singh, Improved estimation of population mean using population proportion of an auxiliary character, Chil. J. Stat., 4 (2013), 3–17.
    [6] P. Sharma, H. K. Verma, A. Sanaullah, R. Singh, Some exponential ratio-product type estimators using information on auxiliary attributes under second order approximation, Int. J. Stat. Econ., 12 (2013), 58–66.
    [7] S. Malik, R. Singh, An improved estimator using two auxiliary attributes, Appl. Math. Comput., 219 (2013), 10983–10986.
    [8] H. Verma, R. Singh, F. Smarandache, Some improved estimators of population mean using information on two auxiliary attributes, In: On improvement in estimating population parameter(s) using auxiliary information, Columbus: Educational Publishing, Beijing: Journal of Matter Regularity, 2013, 17–24.
    [9] R. S. Solanki, H. P. Singh, S. K. Pal, Improved estimation of finite population mean in sample surveys, Columbia Int. Publ. J. Adv. Comput., 1 (2013), 70–78.
    [10] P. Sharma, R. Singh, Improved ratio type estimator using two auxiliary variables under second order approximation, Math. J. Interdiscip. Sci., 2 (2014), 179–190. doi: 10.15415/mjis.2014.22014
    [11] M. Mahdizadeh, E. Zamanzade, Kernel-based estimation of P(X > Y) in ranked set sampling, SORT-Stat. Oper. Res. T., 40 (2016), 243–266.
    [12] H. P. Singh, S. K. Pal, R. S. Solanki, A new class of estimators of finite population mean in sample surveys, Commun. Stat. Theor. Methods, 46 (2017), 2630–2637. doi: 10.1080/03610926.2015.1030429
    [13] M. Mahdizadeh, E. Zamanzade, Smooth estimation of a reliability function in ranked set sampling, Statistics, 52 (2018), 750–768. doi: 10.1080/02331888.2018.1477157
    [14] S. Hussain, S. Ahmad, S. Akhtar, A. Javed, U. Yasmeen, Estimation of finite population distribution function with dual use of auxiliary information under non-response, PLoS One, 15 (2020), e0243584. doi: 10.1371/journal.pone.0243584
    [15] S. Al-Marzouki, C. Chesneau, S. Akhtar, J. A. Nasir, S. Ahmad, S. Hussain, et al., Estimation of finite population mean under pps in presence of maximum and minimum values, AIMS Mathematics, 6 (2021), 5397–5409. doi: 10.3934/math.2021318
    [16] B. Kiregyera, A chain ratio-type estimator in finite population double sampling using two auxiliary variables, Metrika, 27 (1980), 217–223. doi: 10.1007/BF01893599
    [17] S. Mohanty, J. Sahoo, A note on improving the ratio method of estimation through linear transformation using certain known population parameters, Sankhyā: Indian J. Stat. Ser. B, 1995, 93–102.
    [18] A. Haq, J. Shabbir, An improved estimator of finite population mean when using two auxiliary attributes, Appl. Math. Comput., 241 (2014), 14–24.
    [19] M. N. Murthy, Sampling theory and methods, Florida: CRC Press LLC, 1967.
    [20] S. Singh, Advanced sampling theory with applications, Springer Science and Business Media, 2003.
    [21] A. Sharmin, J. R. Sarker, K. R. Das, Growth and trend in area, production and yield of major crops of Bangladesh, Int. J. Econ. Financ. Manage. Sci., 4 (2016), 20–25.
  • © 2021 the Author(s), licensee AIMS Press. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0)