1. Introduction
In survey sampling, the appropriate use of auxiliary information is known to improve the accuracy of an estimator of an unknown population parameter. Auxiliary information can be used when selecting a random sample by simple random sampling with replacement (SRSWR) or without replacement (SRSWOR), and it gives rise to ratio, product, regression, and other estimation methods. The sample must be representative of the population; when the population of interest is homogeneous, simple random sampling can be used to select units. A considerable amount of work has been done on estimating the population mean under simple random sampling. Important references include [2,5,7,11,12,13,16,17,18,19,20,22,23,24,26,27,30,32] and the references cited therein, which suggest different types of estimators for the population mean and the population distribution function in the presence of non-response.
As a practical matter, one of the main problems with surveys is non-response, which can arise in many ways. Examples include language problems, unavailability of the respondent, an incorrect return address, and answers supplied by another person; censoring and clustering are further problems in many data sets. Statisticians have recognized for quite some time that ignoring the stochastic nature of incompleteness or non-response may change the nature of the data. Several factors affect the non-response rate of a survey, including the type of information collected, the official status of the investigating agency, the extent of publicity, the legal obligations of respondents, the duration of the enumerator's visit, and the length of the withdrawal period.
A great deal of work has been done by different authors on estimating the population mean so as to control non-response bias and increase the efficiency of estimators. Non-response is more common in mail surveys than in personal interview surveys. [10] was the first to address the issue of incomplete samples in postal or telephone surveys. For related work, we refer to [1,2,3,12,13,14,15,17,18,20,21,23,24,25,26,28,31,32] and the references cited therein.
Along the lines of [11], a new family of estimators is proposed for estimating the population mean in the presence of non-response. We show theoretically and numerically that the proposed family is more precise than the existing estimators.
The rest of the paper is organized as follows. Section 2 introduces the notation for simple random sampling with non-response. Section 3 reviews the existing estimators under the two non-response situations. A new family of estimators is presented in Section 4 under both non-response situations using simple random sampling. The existing and proposed estimators are compared theoretically in Section 5 and numerically in Section 6. Section 7 summarizes the main findings and concludes the paper.
2. Notations
Suppose $\Omega=\{U_1,U_2,\ldots,U_N\}$ denotes a finite population of $N$ distinct units that is partitioned into two groups, respondents and non-respondents, of sizes $N_1$ and $N_2$, where $N=N_1+N_2$. We write $\Omega_1=\{U_1,U_2,\ldots,U_{N_1}\}$ for the response group and $\Omega_2=\{U_1,U_2,\ldots,U_{N_2}\}$ for the non-response group, with units indexed separately within each group. To estimate the population mean, a sample of size $n$ is drawn from the population by simple random sampling without replacement (SRSWOR); of these, $n_1$ units respond and $n_2=n-n_1$ do not. The $n_1$ responding units are regarded as a sample from $\Omega_1$ and the $n_2$ non-responding units as a sample from $\Omega_2$. A subsample of $r=n_2/k$ units, where $k>1$, is then drawn by SRSWOR from the $n_2$ non-respondents, and responses are obtained from all $r$ units.
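The sampling scheme just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `draw_two_phase_sample` is hypothetical, units are labelled so that the first $N_1$ indices form the response group, and $r=n_2/k$ is rounded to the nearest integer (at least one unit), a detail the text does not specify.

```python
import numpy as np

def draw_two_phase_sample(N1, N2, n, k, rng):
    """Return index arrays (responding, nonresponding, followed_up).

    Assumption: units 0..N1-1 are respondents, N1..N1+N2-1 are non-respondents.
    """
    N = N1 + N2
    # First phase: SRSWOR of n units from the whole population.
    sample = rng.choice(N, size=n, replace=False)
    responding = sample[sample < N1]        # the n1 units that respond
    nonresponding = sample[sample >= N1]    # the n2 units that do not
    n2 = nonresponding.size
    # Second phase: SRSWOR of r = n2 / k non-respondents (rounded; assumption).
    r = max(1, int(round(n2 / k))) if n2 > 0 else 0
    if r > 0:
        followed_up = rng.choice(nonresponding, size=r, replace=False)
    else:
        followed_up = nonresponding[:0]
    return responding, nonresponding, followed_up

rng = np.random.default_rng(0)
resp, nonresp, fu = draw_two_phase_sample(N1=70, N2=30, n=40, k=2, rng=rng)
print(resp.size, nonresp.size, fu.size)
```

The population sizes and the seed in the usage lines are arbitrary and are not taken from the data sets analysed later.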
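Let $Y$, $X$, and $Z$ denote the study variable, the auxiliary variable, and the ranks of the auxiliary variable, respectively.
$\bar{Y}=\sum_{i=1}^{N}Y_i/N$, $\hat{\bar{Y}}=\sum_{i=1}^{n}Y_i/n$: The population and sample means of $Y$.
$\bar{X}=\sum_{i=1}^{N}X_i/N$, $\hat{\bar{X}}=\sum_{i=1}^{n}X_i/n$: The population and sample means of $X$.
$\bar{Z}=\sum_{i=1}^{N}Z_i/N$, $\hat{\bar{Z}}=\sum_{i=1}^{n}Z_i/n$: The population and sample means of $Z$.
$\bar{Y}_{(2)}=\sum_{i=1}^{N_2}Y_i/N_2$: The population mean of $Y$ for the non-response group.
$\bar{X}_{(2)}=\sum_{i=1}^{N_2}X_i/N_2$: The population mean of $X$ for the non-response group.
$\bar{Z}_{(2)}=\sum_{i=1}^{N_2}Z_i/N_2$: The population mean of $Z$ for the non-response group.
$\hat{\bar{Y}}_{(1)}=\sum_{i=1}^{n_1}Y_i/n_1$: The sample mean of $Y$ based on the $n_1$ responding units out of $n$ units.
$\hat{\bar{X}}_{(1)}=\sum_{i=1}^{n_1}X_i/n_1$: The sample mean of $X$ based on the $n_1$ responding units out of $n$ units.
$\hat{\bar{Z}}_{(1)}=\sum_{i=1}^{n_1}Z_i/n_1$: The sample mean of $Z$ based on the $n_1$ responding units out of $n$ units.
$\hat{\bar{Y}}_{(2r)}=\sum_{i=1}^{r}Y_i/r$: The sample mean of $Y$ based on the $r$ subsampled units out of the $n_2$ non-responding units.
$\hat{\bar{X}}_{(2r)}=\sum_{i=1}^{r}X_i/r$: The sample mean of $X$ based on the $r$ subsampled units out of the $n_2$ non-responding units.
$\hat{\bar{Z}}_{(2r)}=\sum_{i=1}^{r}Z_i/r$: The sample mean of $Z$ based on the $r$ subsampled units out of the $n_2$ non-responding units.
$S_Y^2=\sum_{i=1}^{N}(Y_i-\bar{Y})^2/(N-1)$, $S_X^2=\sum_{i=1}^{N}(X_i-\bar{X})^2/(N-1)$, $S_Z^2=\sum_{i=1}^{N}(Z_i-\bar{Z})^2/(N-1)$: The population variances of $Y$, $X$, and $Z$.
$S_{Y(2)}^2=\sum_{i=1}^{N_2}(Y_i-\bar{Y}_{(2)})^2/(N_2-1)$, $S_{X(2)}^2=\sum_{i=1}^{N_2}(X_i-\bar{X}_{(2)})^2/(N_2-1)$, $S_{Z(2)}^2=\sum_{i=1}^{N_2}(Z_i-\bar{Z}_{(2)})^2/(N_2-1)$: The population variances of $Y$, $X$, and $Z$ for the non-response group.
$C_Y=S_Y/\bar{Y}$, $C_X=S_X/\bar{X}$, $C_Z=S_Z/\bar{Z}$: The population coefficients of variation of $Y$, $X$, and $Z$.
$C_{Y(2)}=S_{Y(2)}/\bar{Y}_{(2)}$, $C_{X(2)}=S_{X(2)}/\bar{X}_{(2)}$, $C_{Z(2)}=S_{Z(2)}/\bar{Z}_{(2)}$: The population coefficients of variation of $Y$, $X$, and $Z$ for the non-response group.
$S_{YX}=\sum_{i=1}^{N}(Y_i-\bar{Y})(X_i-\bar{X})/(N-1)$, $S_{YZ}=\sum_{i=1}^{N}(Y_i-\bar{Y})(Z_i-\bar{Z})/(N-1)$, $S_{XZ}=\sum_{i=1}^{N}(X_i-\bar{X})(Z_i-\bar{Z})/(N-1)$: The population covariances between $(Y,X)$, $(Y,Z)$, and $(X,Z)$.
$S_{YX(2)}=\sum_{i=1}^{N_2}(Y_i-\bar{Y}_{(2)})(X_i-\bar{X}_{(2)})/(N_2-1)$, $S_{YZ(2)}=\sum_{i=1}^{N_2}(Y_i-\bar{Y}_{(2)})(Z_i-\bar{Z}_{(2)})/(N_2-1)$, $S_{XZ(2)}=\sum_{i=1}^{N_2}(X_i-\bar{X}_{(2)})(Z_i-\bar{Z}_{(2)})/(N_2-1)$: The population covariances between $(Y,X)$, $(Y,Z)$, and $(X,Z)$ for the non-response group.
$\rho_{YX}=S_{YX}/(S_YS_X)$, $\rho_{YZ}=S_{YZ}/(S_YS_Z)$, $\rho_{XZ}=S_{XZ}/(S_XS_Z)$: The population correlation coefficients between $(Y,X)$, $(Y,Z)$, and $(X,Z)$.
$\rho_{YX(2)}=S_{YX(2)}/(S_{Y(2)}S_{X(2)})$, $\rho_{YZ(2)}=S_{YZ(2)}/(S_{Y(2)}S_{Z(2)})$, $\rho_{XZ(2)}=S_{XZ(2)}/(S_{X(2)}S_{Z(2)})$: The population correlation coefficients between $(Y,X)$, $(Y,Z)$, and $(X,Z)$ for the non-response group.
$R^2_{Y.XZ}=(\rho_{YX}^2+\rho_{YZ}^2-2\rho_{YX}\rho_{YZ}\rho_{XZ})/(1-\rho_{XZ}^2)$: The population coefficient of multiple determination of $Y$ on $X$ and $Z$.
$R^2_{Y.XZ(2)}=(\rho_{YX(2)}^2+\rho_{YZ(2)}^2-2\rho_{YX(2)}\rho_{YZ(2)}\rho_{XZ(2)})/(1-\rho_{XZ(2)}^2)$: The population coefficient of multiple determination of $Y$ on $X$ and $Z$ for the non-response group.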
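To make the notation concrete, the following Python sketch computes the rank variable $Z$, the coefficients of variation, the pairwise correlations, and the coefficient of multiple determination of $Y$ on $X$ and $Z$ for a population. The function name `population_parameters` and the toy data are hypothetical, and ties in $X$ are broken by position, which is an assumption because tie handling is not discussed in the text. Applying the same function to the non-response group alone gives the group-(2) quantities.

```python
import numpy as np

def population_parameters(y, x):
    """Descriptive parameters of Y, X and Z = ranks of X for one population."""
    y = np.asarray(y, dtype=float)
    x = np.asarray(x, dtype=float)
    # Z_i is the rank of X_i (1 = smallest); ties broken by position (assumption).
    z = np.empty_like(x)
    z[np.argsort(x, kind="stable")] = np.arange(1, x.size + 1)

    def cv(v):
        # Coefficient of variation with the (N - 1) divisor used in the text.
        return v.std(ddof=1) / v.mean()

    rho = np.corrcoef(np.vstack([y, x, z]))
    r_yx, r_yz, r_xz = rho[0, 1], rho[0, 2], rho[1, 2]
    # Coefficient of multiple determination of Y on X and Z.
    r2 = (r_yx**2 + r_yz**2 - 2 * r_yx * r_yz * r_xz) / (1 - r_xz**2)
    return {
        "z": z,
        "C_Y": cv(y), "C_X": cv(x), "C_Z": cv(z),
        "rho_YX": r_yx, "rho_YZ": r_yz, "rho_XZ": r_xz,
        "R2_Y.XZ": r2,
    }

# Toy data, for illustration only.
x_demo = [3.0, 1.0, 4.0, 1.5, 9.0, 2.6]
y_demo = [6.2, 2.1, 8.3, 2.9, 17.5, 5.0]
params = population_parameters(y_demo, x_demo)
print(params["z"], params["R2_Y.XZ"])
```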
The population mean $\bar{Y}$ may be written as $\bar{Y}=W_1\bar{Y}_1+W_2\bar{Y}_2$,
where $W_j=N_j/N$, $\bar{Y}_j=\sum_{i=1}^{N_j}Y_i/N_j$, $\bar{X}_j=\sum_{i=1}^{N_j}X_i/N_j$ and $\bar{Z}_j=\sum_{i=1}^{N_j}Z_i/N_j$ for $j=1,2$. Following [10,12], an unbiased estimator of $\bar{Y}$ under non-response is given by
and
where $w_j=n_j/n$ for $j=1,2$, $\lambda=(1/n-1/N)$ and $\lambda_2=W_2(k-1)$.
Similarly
are unbiased estimators of $\bar{X}$ and $\bar{Z}$, respectively, under non-response, with corresponding variances
respectively.
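As an illustration of the subsampling estimator of [10,12], the sketch below uses the standard textbook form commonly attributed to Hansen and Hurwitz: the weighted mean $w_1\hat{\bar{Y}}_{(1)}+w_2\hat{\bar{Y}}_{(2r)}$, with variance $(1/n-1/N)S_Y^2+\{W_2(k-1)/n\}S_{Y(2)}^2$. These expressions are assumed here, not copied from the displayed equations of this section; the function names and numerical inputs are hypothetical.

```python
import numpy as np

def hansen_hurwitz_mean(y_resp, y_followup, n2):
    """Weighted mean of respondents and followed-up non-respondents.

    y_resp     : Y values of the n1 responding units
    y_followup : Y values of the r subsampled non-respondents
    n2         : number of non-respondents in the first-phase sample
    """
    n1 = len(y_resp)
    n = n1 + n2
    w1, w2 = n1 / n, n2 / n
    ybar1 = float(np.mean(y_resp)) if n1 > 0 else 0.0
    ybar2r = float(np.mean(y_followup)) if n2 > 0 else 0.0
    return w1 * ybar1 + w2 * ybar2r

def hansen_hurwitz_variance(S2_y, S2_y2, N, n, W2, k):
    """Standard-form variance: (1/n - 1/N) S_Y^2 + W2 (k - 1) / n * S_Y(2)^2."""
    lam = 1.0 / n - 1.0 / N
    return lam * S2_y + W2 * (k - 1.0) / n * S2_y2

# Hypothetical numbers, for illustration only.
est = hansen_hurwitz_mean([10, 12, 14], [20, 22], n2=4)
var = hansen_hurwitz_variance(S2_y=25.0, S2_y2=16.0, N=100, n=20, W2=0.3, k=2)
print(est, var)
```

The same two functions apply to $X$ and $Z$ by passing their values and variances in place of those of $Y$.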
In order to obtain the properties of the proposed estimator, we consider the following relative error terms.
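Let $\xi_0^{*}=(\hat{\bar{Y}}_H-\bar{Y})/\bar{Y}$, $\xi_1^{*}=(\hat{\bar{X}}_H-\bar{X})/\bar{X}$, $\xi_2^{*}=(\hat{\bar{Z}}_H-\bar{Z})/\bar{Z}$, $\xi_1=(\hat{\bar{X}}-\bar{X})/\bar{X}$, and $\xi_2=(\hat{\bar{Z}}-\bar{Z})/\bar{Z}$, such that $E(\xi_i^{*})=0$ for $i=0,1,2$ and $E(\xi_i)=0$ for $i=1,2$, where $E(\cdot)$ denotes mathematical expectation. Let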
where r,s,t,u=0,1,2. Here,
where $\theta=(1/n-1/N)$ and $\theta_2=W_2(k-1)/n$.
Under non-response, two situations commonly arise: non-response on $Y$ only (Situation-Ⅰ), and non-response on $Y$, $X$, and $Z$ (Situation-Ⅱ).
3. Existing estimators
In this section, some existing estimators of the population mean under non-response are briefly reviewed for both situations.
3.1. Situation-Ⅰ
When non-response occurs only in the study variable $Y$:
(1) The usual ratio estimator of $\bar{Y}$ is given as:
The properties of $\hat{\bar{Y}}^{*}_{R}$ are given by:
respectively.
(2) The usual product estimator of $\bar{Y}$ is given as:
The properties of $\hat{\bar{Y}}^{*}_{P}$ are given as:
(3) The usual difference estimator of $\bar{Y}$ is given as:
The minimum variance of $\hat{\bar{Y}}^{*}_{D}$ at $d_{(opt)}=(\bar{Y}V_{110})/(\bar{X}V_{020})$ is given as:
Equation (3.6) can be written as:
(4) Following [27], a difference-type estimator of $\bar{Y}$ is
The properties of $\hat{\bar{Y}}^{*}_{R,D}$ are given by:
and
Minimizing Eq (3.10), the optimal values of $k_1$ and $k_2$ are given as:
respectively. The minimum MSE of $\hat{\bar{Y}}^{*}_{R,D}$ at the optimal values is given by:
Equation (3.11) may be written as:
(5) Following [4], the ratio- and product-type exponential estimators of $\bar{Y}$ are given as:
The biases and MSEs of $\hat{\bar{Y}}^{*}_{BT,R}$ and $\hat{\bar{Y}}^{*}_{BT,P}$ are given as:
and
(6) Following [29], a generalized ratio-type exponential estimator of $\bar{Y}$ is
The properties of $\hat{\bar{Y}}^{*}_{S}$ are given as:
where $\theta=a\bar{X}/(a\bar{X}+b)$.
(7) Following [8], a generalized class of ratio-type exponential estimators of $\bar{Y}$ is given as:
The properties of $\hat{\bar{Y}}^{*}_{GK}$ are given as:
The optimum values of $k_1$ and $k_2$, determined by minimizing (23), are given as:
The simplified minimum MSE of $\hat{\bar{Y}}^{*}_{GK}$ at the optimum values of $k_1$ and $k_2$ is given by:
Equation (3.20) may be written as:
3.2. Situation-Ⅱ
When non-response occurs in both the study and auxiliary variables, say $Y$ and $X$:
(1) The traditional ratio estimator of $\bar{Y}$ is given as:
The properties of $\hat{\bar{Y}}^{**}_{R}$ are given as:
(2) The traditional product estimator of $\bar{Y}$ is given as:
The properties of $\hat{\bar{Y}}^{**}_{P}$ are given as:
(3) The traditional difference estimator of $\bar{Y}$ is
The minimum variance of $\hat{\bar{Y}}^{**}_{D}$ at the optimal value $d_{(opt)}=(\bar{Y}\Psi_{110})/(\bar{X}\Psi_{020})$ is
Equation (3.27) may be written as:
(4) Following [27], a difference-type estimator of $\bar{Y}$ is
The properties of $\hat{\bar{Y}}^{**}_{R,D}$ are given as:
and
The optimal values of $k_1$ and $k_2$, determined by minimizing (3.30), are given as:
The minimum MSE of $\hat{\bar{Y}}^{**}_{R,D}$ at the optimal values is given by:
Equation (3.31) may be written as:
(5) Following [4], the ratio- and product-type exponential estimators of $\bar{Y}$ are given by:
The biases and MSEs of $\hat{\bar{Y}}^{**}_{BT,R}$ and $\hat{\bar{Y}}^{**}_{BT,P}$ are given by:
and
(6) Following [6], a generalized ratio-type exponential estimator of $\bar{Y}$ is given by:
The properties of $\hat{\bar{Y}}^{**}_{S}$ are given by:
where $\theta=a\bar{X}/(a\bar{X}+b)$.
(7) Following [8], a generalized class of ratio-type exponential estimators of $\bar{Y}$ is given by:
The properties of $\hat{\bar{Y}}^{**}_{GK}$ are given by:
The optimal values of $k_1$ and $k_2$, determined by minimizing (3.40), are given as:
The minimum MSE of $\hat{\bar{Y}}^{**}_{GK}$ at the optimal values of $k_1$ and $k_2$ is given by:
Equation (3.41) may be written as:
4. Proposed estimator under non-response using simple random sampling
The proper use of an auxiliary variable improves the accuracy of an estimator at both the design and estimation stages. Complete auxiliary information is frequently supplied along with the sampling frame in social, economic, and natural surveys. When the study variable and the auxiliary variable are sufficiently correlated, the ranks of the auxiliary variable are also correlated with the values of the study variable. Consequently, the ranked auxiliary variable can be treated as a new auxiliary variable, and this information can help an estimator perform better. Motivated by this, we present an improved family of estimators of the population mean that uses additional information on the sample means of the study and auxiliary variables, as well as the ranks of the auxiliary variable, under non-response using simple random sampling.
4.1. Situation-Ⅰ
When non-response occurs only in the study variable, and along the lines of [11], the proposed improved estimator of $\bar{Y}$ in the presence of non-response using SRS, say $\hat{\bar{Y}}^{*}_{\mathrm{Suggested}}$, is given as:
where $w_1$, $w_2$, and $w_3$ are unknown constants. The proposed estimator $\hat{\bar{Y}}^{*}_{\mathrm{Suggested}}$ can be rewritten as
Simplifying (4.2), we have
The properties of $\hat{\bar{Y}}^{*}_{\mathrm{Suggested}}$ are given as:
The optimal values of $w_1$, $w_2$, and $w_3$, determined by minimizing (4.4), are
The minimum MSE of $\hat{\bar{Y}}^{*}_{\mathrm{Suggested}}$ at the optimal values of $w_1$, $w_2$, and $w_3$ is given by:
Equation (4.5) may be written as
where
4.2. Situation-Ⅱ
When non-response occurs in both the study and auxiliary variables, and along the lines of [11], we propose a family of estimators of $\bar{Y}$ in the presence of non-response, say $\hat{\bar{Y}}^{**}_{\mathrm{Suggested}}$, given by:
where $w_1$, $w_2$, and $w_3$ are unknown constants. The proposed estimator $\hat{\bar{Y}}^{**}_{\mathrm{Suggested}}$ can be rewritten as:
Simplifying (4.9), we can write
The properties of $\hat{\bar{Y}}^{**}_{\mathrm{Suggested}}$ are given by:
The optimal values of $w_1$, $w_2$, and $w_3$, determined by minimizing (4.11), are given as:
The minimum MSE of $\hat{\bar{Y}}^{**}_{\mathrm{Suggested}}$ at the optimal values of $w_1$, $w_2$, and $w_3$ is given by:
Equation (4.12) may be written as:
where
In Table 1, we list some members of the [8,29] and proposed families of estimators for selected choices of $a$ and $b$.
5. Efficiency comparisons for both situations
5.1. Situation-Ⅰ
In this section, we compare the adapted and proposed estimators when non-response is present in the study variable only.
(ⅰ) By taking (2.3) and (3.21),
(ⅱ) By taking (2.4) and (3.21),
(ⅲ) By taking (3.2) and (3.21),
(ⅳ) By taking (3.5) and (3.21),
(ⅴ) By taking (3.10) and (3.21),
(ⅵ) By taking (3.12) and (3.21),
(ⅶ) By taking (3.16) and (3.21),
5.2. Situation-Ⅱ
In this section, we compare the efficiency of all estimators when non-response occurs in both the study and auxiliary variables.
(ⅰ) By taking (2.3) and (3.21),
(ⅱ) By taking (2.4) and (3.21),
(ⅲ) By taking (3.2) and (3.21),
(ⅳ) By taking (3.5) and (3.21),
(ⅴ) By taking (3.10) and (3.21),
(ⅵ) By taking (3.12) and (3.21),
(ⅶ) By taking (3.16) and (3.21),
6. Numerical investigation
In this section, numerical results are presented to assess the performance of the proposed estimators relative to the existing ones. Four data sets are under consideration. The data descriptions and mean squared errors are listed in Tables 2 and 3. The percent relative efficiency (PRE) of an estimator $\hat{\bar{Y}}_i$ with respect to $\hat{\bar{Y}}_{SRS}$ is:
where $i=R,P,\ldots,\mathrm{Suggested}$.
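The efficiency comparison can be sketched as follows, assuming the usual definition of percent relative efficiency, $100\times\mathrm{MSE}(\hat{\bar{Y}}_{SRS})/\mathrm{MSE}(\hat{\bar{Y}}_i)$, so that values above 100 favour estimator $i$. The function name and the MSE values are hypothetical and are not taken from the tables of this paper.

```python
def percent_relative_efficiency(mse_baseline, mse_by_estimator):
    """PRE of each estimator with respect to the baseline estimator.

    Assumed definition: PRE_i = 100 * MSE(baseline) / MSE(i).
    """
    return {
        name: 100.0 * mse_baseline / mse
        for name, mse in mse_by_estimator.items()
    }

# Hypothetical MSE values, for illustration only.
mse_values = {"R": 2.0, "D": 1.6, "Suggested": 1.0}
pre = percent_relative_efficiency(4.0, mse_values)
for name, value in pre.items():
    print(f"{name}: {value:.1f}")
```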
The MSEs and PREs of mean estimators, computed from two populations, are given in Tables 4–11.
Population Ⅰ. (Source: [9]) Y: Number of eggs produced in 1990, X: Price per dozen in 1991.
Population Ⅱ. (Source: [9]) Y: Number of eggs produced in 1990, X: Price per dozen in 1990.
7. Conclusions
In this paper, a new family of estimators of the population mean has been proposed that uses information on the auxiliary variable, in the form of its sample mean and ranks, in the presence of non-response. Mathematical expressions for the biases and minimum MSEs of the suggested family have been derived up to the first order of approximation and compared, both theoretically and numerically, with the [6,10,22], conventional difference, and [8,27,29] estimators under Situation-Ⅰ and Situation-Ⅱ. The proposed family of estimators is found to be more efficient in both non-response situations.
Conflict of interest
The authors declare no conflict of interest.