In this study, we consider a model of T cell homeostasis based on the Smith-Martin model. This nonlinear model is structured by age and CD44 expression. First, we establish the mathematical well-posedness of the model system. Next, we prove the theoretical identifiability regarding the up-regulation of CD44, the proliferation time phase and the rate of entry into division, by using the experimental data. Finally, we compare two versions of the Smith-Martin model and we identify which model fits the experimental data best.
1. Introduction
Survey sampling helps solve real-world problems in environmental science, engineering, management and the biological sciences. For instance, environmental sampling is a key tool for verifying sources of pollution and the adequacy of hygiene processes, refining the frequency and intensity of cleaning and sanitation, identifying problem areas, validating food safety programs, and giving early warning of issues that may require remedial action. Sampling can also be conducted to characterize background radiological levels, determine the concentration of radionuclides and make recommendations on environmental surveillance for agricultural products. Overall, it provides confirmation that a product is being made under clean conditions.
When studying the travel patterns of a city's residents, it is hard to approach every person in the city and ask about their travel pattern. Instead, sample data are gathered, and based on that information the developers try to understand the travel behavior of residents. Likewise, when determining the characteristic quality of cement of a specific blend, data resulting from tests on a sample are used. Similarly, a sample of borehole information is used to assess soil quality. In all these cases the sample is taken to represent the population, so its representativeness must be ensured statistically.
Velasco-Muñoz, et al. [1] reviewed 25 years of research on sustainable water use in farming. A bibliometric analysis was carried out on a sample of 2084 articles published from 1993 to 2017. The results showed that research on sustainable water use for agriculture grew exponentially and that the topic has turned into a worldwide issue. Serbu, et al. [2] collected water samples from four different locations along the Cibin River over a period of one year and applied Multiple-Criteria Decision Analysis strategies to assess the effect of toxins on the environment.
Ziemer, et al. [3] discussed how the data were collected and how the sample of electrical engineering departments used as their database was formed. De Mello, et al. [4] proposed a theoretical framework that helps strengthen the representativeness of survey outcomes, although some key problems concerning survey studies are still open and deserve attention from the Software Engineering community. De Mello, et al. [5] described how heterogeneity and members who contributed repeatedly increased the strength of the survey's results. Consequently, De Mello and Travassos [5] believed that sharing this experience, questionnaire and idea can be helpful for scientists interested in executing surveys on a broader scale in Software Engineering.
Using information from auxiliary variables has produced wide-ranging gains in performance over estimators that do not use such information. When the auxiliary variable X is available in advance or is easily observed, and is highly correlated with the study variable Y, auxiliary information is effective for estimating the population mean. In these situations the ratio, regression and product estimators are natural choices. The regression method of estimation was used by Watson [6] to estimate the mean area of leaves on a plant. Cochran [7] proposed the ratio method for the case of strong positive correlation between Y and X (the study and auxiliary variables). Murthy [8] revisited the idea given by Robson [9] that the product method of estimation is appropriate if a strong negative correlation exists between the auxiliary and study variables. Srivastava [10] proposed a general ratio estimator using a single auxiliary variable when population information on this variable is not available. Exponential smoothing is one of the forecasting methods used to recognize substantial changes in data by incorporating the most recent information. Although there are numerous approaches to forecasting, exponential smoothing is easy to learn and forecasts accurately, and its emphasis on recent observations gives this technique an edge over others.
Bahl, et al. [11] proposed an exponential product estimator for the case where the study variable and the auxiliary variable are negatively correlated, and an exponential ratio estimator for the case where Y and X are positively correlated. Numerous authors, including Srivastava [10], Hidiroglou [12], Samiuddin, et al. [13], Singh, et al. [14], Singh, et al. [15], Hanif, et al. [16], Hanif, et al. [17], Singh, et al. [18], Singh, et al. [19], Noor-ul-Amin, et al. [20], Tailor, et al. [21] and Shabbir, et al. [22], have also proposed improved ratio and product estimators for estimating the population mean of the study variable. Al-Marshadi, et al. [23] suggested an estimator of the population variance using multiple auxiliary variables.
Generally, ratio and product estimators are as efficient as the linear regression estimator only when the regression line passes through the origin; otherwise they are less efficient. In most situations, the regression line does not pass through the origin. Considering this fact, Vishwakarma, et al. [24] proposed a ratio-product type estimator to improve the performance of the Singh and Espejo [14] estimator using two auxiliary variables under two-phase sampling.
Vishwakarma, et al. [25] proposed a generalized class of estimators using the information of multiple auxiliary variables in a two-phase sampling scheme and claimed that their generalized class performed better than the class of estimators proposed by Dash, et al. [26].
Mishra, et al. [27] used the log function and proposed an ln-product type estimator for estimating the mean value of the study variable Y. Akhlaq, et al. [28] proposed an exponential estimator for estimating process variability using auxiliary information, which is more efficient than the previous estimators.
To estimate the population mean, consider a population of N units $ D = \left({{D_1}, {D_2}, {D_3}, \ldots, {D_N}} \right). $ Let $ {y_i} $ and $ \left({{x_i}, {z_i}} \right) $ be the values of the study variable $ \left(Y \right) $ and the auxiliary variables $ \left({X, Z} \right) $, respectively, for the i-th unit. The population means are $ \bar Y = {N^{ - 1}}\sum\limits_{i = 1}^N {{y_i}} $ and $ \bar X = {N^{ - 1}}\sum\limits_{i = 1}^N {{x_i}} $, where $ \bar Y $ is the population mean of the study variable and $ \bar X $ is the population mean of the auxiliary variable.
Auxiliary information in two-phase sampling proves to be effective in estimating the population mean. When population information on the auxiliary variable X is not available, the first phase is used to estimate it, and the study variable is then observed in the second phase. The first-phase sample is known as the primary sample and comprises $ {n_1}\left({{n_1} < N} \right) $ units. In the first phase, we select this sample from the N population units by simple random sampling without replacement (SRSWOR). In some situations, another auxiliary variable $ \left(Z \right) $ is helpful for obtaining information on the first auxiliary variable X, and both variables are observed in the first-phase sample.
The essential information on the auxiliary variable X is estimated in the first phase before moving to the second phase. With the assistance of the auxiliary variable X, the population mean $ \bar Y $ of Y is estimated in the second phase. For this purpose, a second sample of $ {n_2} $ units $ \left({{n_2} < {n_1}} \right) $ is drawn from the primary sample by SRSWOR, Singh, et al. [29]. The sample of size $ {n_1} $ drawn from the population of N units is the primary sample, and the sample of size $ {n_2} $ drawn from these $ {n_1} $ units is the subsample used in our study.
The classical ratio estimator of the population mean $ \bar Y $ under the double sampling scheme is given by Cochran [30] as:
where $ \bar y = {n^{ - 1}}\sum\limits_{i = 1}^n {{y_i}} $ and $ \bar x = {n^{ - 1}}\sum\limits_{i = 1}^n {{x_i}} $.
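The two-phase scheme described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: it assumes the usual textbook form of the double-sampling ratio estimator, $ {\bar y_2}\,{\bar x_1}/{\bar x_2} $, where $ {\bar x_1} $ is the first-phase mean of X and $ {\bar x_2}, {\bar y_2} $ are second-phase means. The function names and the toy population are hypothetical.

```python
import random


def two_phase_sample(N, n1, n2, rng):
    """Draw n1 unit indices from range(N) without replacement (first phase),
    then n2 of those indices without replacement (second phase)."""
    if not (n2 < n1 < N):
        raise ValueError("need n2 < n1 < N")
    first = rng.sample(range(N), n1)
    second = rng.sample(first, n2)
    return first, second


def mean(values):
    return sum(values) / len(values)


def ratio_estimate_double_sampling(y, x, first, second):
    """Assumed textbook form: ybar_2 * xbar_1 / xbar_2."""
    xbar1 = mean([x[i] for i in first])    # X observed on the primary sample
    xbar2 = mean([x[i] for i in second])   # X observed on the subsample
    ybar2 = mean([y[i] for i in second])   # Y observed only on the subsample
    return ybar2 * xbar1 / xbar2


# Hypothetical toy population in which Y is exactly proportional to X.
rng = random.Random(1)
x = [rng.uniform(10, 20) for _ in range(1000)]
y = [3.0 * xi for xi in x]
first, second = two_phase_sample(len(x), 200, 50, rng)
estimate = ratio_estimate_double_sampling(y, x, first, second)
print(estimate, mean(y))
```

Because Y is exactly proportional to X in this toy population, the estimate reduces to three times the first-phase mean of X, which is a convenient sanity check on the implementation.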
Upadhyaya, et al. [31] used available information on the coefficient of variation $ \left({{C_z}} \right) $ and the coefficient of kurtosis $ \left({{\beta _{2(z)}}} \right) $ of the auxiliary variable and proposed an estimator of the population mean $ \bar Y $ as:
The MSE of the estimator of [31], up to order $ O\left({{n^{ - 1}}} \right) $, is:
A modified version of the exponential ratio estimator of Bahl and Tuteja [11] was given by Singh and Vishwakarma [15] as:
where $ {\bar y_2} $ is the sample mean of study variable in second phase.
Mean square error of $ {{\bar Y}_{sv}} $ is given as:
The exponential ratio estimator suggested by Singh, et al. [32] is given as:
where $ {\bar y_2}, {\bar y_{rsd}}~{\rm{ and }}~{\bar y_{rsed}} \in {\rm{ }}{w_r} $ , while $ {w_r} $ represents the set of all possible ratio-type estimators used in $ {\bar Y_s} $ for estimating the population mean of the study variable.
Minimum MSE of $ {\bar Y_S} $ is given as:
Noor-ul-Amin and Hanif [20] suggested an exponential estimator combining the ratio and product techniques in double sampling as:
The minimum MSE of $ {\bar Y_{NH}} $ is:
Sanaullah, et al. [33] suggested a modification of the Noor-ul-Amin and Hanif [20] estimator using two auxiliary variables as:
The minimum mean square error of $ {{\bar Y}_{SA}} $ is:
Kadilar, et al. [34] incorporated two auxiliary variables in classical regression and proposed a new linear regression estimator as:
The mean square error of $ {{\bar Y}_{kc}} $ is given as:
where regression coefficients are $ {b_{xy}} = \frac{{{s_{xy}}}}{{s_x^2}} $ and $ {b_{zy}} = \frac{{{s_{zy}}}}{{s_z^2}} $ respectively.
Where
The coefficients of variation of $ X $, $ Y $ and $ Z $ are $ {C_x} = {s_x}{\left({\bar X} \right)^{ - 1}} $ , $ {C_y} = {s_y}{\left({\bar Y} \right)^{ - 1}} $ and $ {C_z} = {s_z}{\left({\bar Z} \right)^{ - 1}} $ , respectively, while the correlation coefficients of $ \left({Y, X} \right) $, $ \left({Y, Z} \right) $ and $ \left({X, Z} \right) $ are $ {\rho _{yx}} = {s_{xy}}{\left({{s_x}{s_y}} \right)^{ - 1}}, $ $ {\rho _{yz}} = {s_{yz}}{\left({{s_y}{s_z}} \right)^{ - 1}} $ and $ {\rho _{xz}} = {s_{xz}}{\left({{s_x}{s_z}} \right)^{ - 1}}, $ respectively.
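The quantities defined above are straightforward to compute from data. The following sketch uses only the Python standard library; the helper names and the small data vectors are hypothetical, and the variances and covariances use the n - 1 divisor, which is an assumption about how $ s_x $, $ s_y $ and $ s_{xy} $ are defined.

```python
import math


def mean(v):
    return sum(v) / len(v)


def variance(v):
    """Variance with divisor n - 1 (assumed convention)."""
    m = mean(v)
    return sum((a - m) ** 2 for a in v) / (len(v) - 1)


def covariance(a, b):
    """Covariance with divisor n - 1 (assumed convention)."""
    ma, mb = mean(a), mean(b)
    return sum((p - ma) * (q - mb) for p, q in zip(a, b)) / (len(a) - 1)


def coefficient_of_variation(v):
    """C = s / mean, as in C_x = s_x / X-bar."""
    return math.sqrt(variance(v)) / mean(v)


def correlation(a, b):
    """rho = s_ab / (s_a * s_b)."""
    return covariance(a, b) / math.sqrt(variance(a) * variance(b))


# Hypothetical data: y rises with x, z falls with x.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 6.0, 8.0, 10.0]
z = [5.0, 4.0, 3.0, 2.0, 1.0]
c_x = coefficient_of_variation(x)
rho_yx = correlation(y, x)
rho_xz = correlation(x, z)
print(c_x, rho_yx, rho_xz)
```

A perfectly increasing linear relation gives a correlation of 1 and a perfectly decreasing one gives -1, which matches the conditions under which ratio-type and product-type estimators, respectively, are recommended in the text.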
The estimators given above are used for estimating the mean of the study variable under different conditions. The main purpose of this article is to suggest an improved exponential estimator of ratio type and to explore its properties. An ln function and two auxiliary variables are used within the proposed estimator, which is discussed in the following section. That section also gives the mathematical derivation of the suggested estimator and its comparison with some existing estimators $ \bar Y, {\bar Y_S}, {\bar Y_{SV}}, {\bar Y_{NH}}~{\rm{ and }}~{\bar Y_{SA}}. $ An empirical study is carried out in the section after that, while the conclusion is given in the last section.
2. Materials and method
Motivated by Cekim, et al. [35], who proposed a ratio estimator using the ln function for estimating the population variance, we propose a new estimator that combines exponential and ln ratio terms using the information of two auxiliary variables Z and X in two-phase sampling for estimating the population mean $ {\bar Y} $ . Let $ {n_1} $ and $ {n_2} < {n_1} $ be the sample sizes of the first and second phases, respectively, and let an unbiased estimator of $ \bar Z $ be $ \bar z_1^* = \frac{{N\bar Z - {n_1}{{\bar z}_{\left(1 \right)}}}}{{N - {n_1}}} $ , as suggested by Bandyopadhyay [36] and Srivenkataramana [37]. The proposed estimator is given as:
where $ {\theta _1} $ and $ {\theta _2} $ are optimization constants chosen to minimize the mean square error, while $ {\bar z^*} $ is introduced as the transformed variable. The “ln” function has been introduced to control the variability of the ratio of the first- and second-phase variables, while the exponential function of the ratio of transformed variables also helps to reduce the estimator’s mean square error. The properties of the proposed estimator can be studied by considering
Putting $ \bar y = \left({\bar Y + {e_y}} \right), \bar x = \left({\bar X + {e_x}} \right) $ and $ \bar z = \left({\bar Z + {e_z}} \right) $, together with $ \bar z_1^ * $, in Eq (2.1), we get
After simplification, we have
To obtain the MSE of the suggested estimator $ {\bar Y_g}^{{{\left(1 \right)}^{**}}}, $ we write
Expanding the right side of (2.1) in terms of the e’s and simplifying further to the first degree of approximation, we have
Squaring both sides and taking expectations, we obtain
where $ \theta _1^* $ and $ \theta _2^* $ are the optimum values of $ {\theta _1} $ and $ {\theta _2} $, respectively. Determining them by differential calculus, we get
Using the optimum values $ \theta _1^* $ and $ \theta _2^* $ , the minimum MSE of the proposed estimator is obtained as:
2.1. Mathematical comparison of $ {\bar Y_g}^{{{\left(1 \right)}^{**}}} $ with some existing estimators
In this section, we compare our proposed estimator $ {\bar Y_g}^{{{\left(1 \right)}^{**}}} $ with some existing estimators. The comparison is made in terms of mean square errors, and we obtain conditions under which our proposed estimator has a smaller minimum MSE than the existing estimators. These comparisons are:
Case I
Case II
Case III
Case IV
If the above conditions are not satisfied, then the estimators $ \bar Y, {\bar Y_S}, {\bar Y_{SV}}, {\bar Y_{NH}}~{\rm{ and }}~{\bar Y_{SA}} $ are more efficient than the suggested estimator $ {\bar Y_g}^{{{\left(1 \right)}^{**}}}. $
3. Results
In this section, some real populations available in the literature are used in an empirical study to obtain the mean square error and relative efficiency of our proposed estimator $ {{{\hat{\bar{Y}}}}_{g}}^{{{\left(1 \right)}^{**}}}. $ The following real data sets are used to check the performance of the suggested estimator.
Sugar Cane Disease “coal of sugar-cane” (This is a disease that is common in sugar-cane plantations in certain areas of Brazil)
Soil Compositions of Physical and Chemical Characteristics
Appliances energy prediction Data Set
Ozone (The data is monthly ozone averages on a very coarse 24 by 24 grid covering Central America, from Jan 1995 to Dec 2000)
Combined Cycle Power Plant Data Set (The voltage output of engines was measured at various combinations of blade speed and sensor extension)
Our suggested estimator is compared with the conventional estimators $ \bar Y, {\bar Y_S}, {\bar Y_{SV}}, {\bar Y_{NH}}~{\rm{ and }}~{\bar Y_{SA}} $ . The description of the variables for each population is given below:
Table 2 gives the means, coefficients of variation and correlation coefficients, which are needed to compute the mean square errors.
The percent relative efficiency is calculated as $ PRE\left({\cdot, \bar y} \right) = \frac{{MSE\left({\bar y} \right)}}{{MSE\left(\cdot \right)}} \times 100, $ where $ \left(\cdot \right) $ denotes any of $ \bar Y, {\bar Y_S}, {\bar Y_{SV}}, {\bar Y_{NH}}, {\bar Y_{SA}}~{\rm{ and }}~{\bar Y_g}^{{{\left(1 \right)}^{**}}}. $
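As a small worked illustration of the PRE formula, the sketch below computes the percent relative efficiency of an estimator against a reference estimator. The function name and the MSE values are hypothetical and are not taken from Table 3.

```python
def percent_relative_efficiency(mse_reference, mse_estimator):
    """PRE = MSE(reference) / MSE(estimator) * 100.

    Values above 100 mean the estimator has a smaller MSE than the reference.
    """
    if mse_estimator <= 0:
        raise ValueError("MSE of the estimator must be positive")
    return 100.0 * mse_reference / mse_estimator


# Hypothetical MSE values, for illustration only.
pre_equal = percent_relative_efficiency(3.0, 3.0)   # same MSE as reference
pre_half = percent_relative_efficiency(4.0, 2.0)    # half the reference MSE
print(pre_equal, pre_half)
```

An estimator whose MSE is half that of the reference has a PRE of 200, and the reference itself always has a PRE of 100.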
Table 3 shows the relative efficiencies of the existing estimators and of our proposed estimator.
Table 3 shows that most of the estimators considered give more efficient results than the classical ratio estimator. The relative efficiencies clearly indicate that our proposed estimator $ {{{\hat{\bar{Y}}}}_{g}}^{{{\left(1 \right)}^{**}}} $ is more efficient than $ \bar Y, {\bar Y_S}, {\bar Y_{SV}}, {\bar Y_{NH}}~{\rm{ and }}~{\bar Y_{SA}}. $
3.1. Simulation results
The simulation study was carried out by generating random populations from a bivariate normal distribution. A random population of size 50000 was generated for the auxiliary variables X and Z from the standard bivariate normal distribution. Using these auxiliary variables, the study variable Y was generated as $ {Y_i} = {X_i} + {Z_i} + {e_i} $ , where $ {e_i} $ is $ N\left({0, 1} \right) $ . From this population, two-phase samples were drawn using three different combinations of first-phase and second-phase sample sizes: for simulation I, 20% of 50000 and then 50% of 10000 $ \left({{n_1} = 10000, \, {n_2} = 5000} \right) $ ; for simulation II, 10% of 50000 and then 40% of 5000 $ \left({{n_1} = 5000, \, {n_2} = 2000} \right) $ ; and for simulation III, 25% of 50000 and then 30% of 12500 $ \left({{n_1} = 12500, \, {n_2} = 3750} \right) $ . The various estimators were computed for each sample. The procedure was repeated 50000 times and, using the 50000 values of each estimator, the MSE of each estimator was calculated; the results are given in Table 4. From Table 4, we can see that the simulated mean square error of our proposed estimator is approximately half of the mean square error of the other estimators used in the study for $ \left({{n_1} = 5000, \, {n_2} = 2000} \right) $ and $ \left({{n_1} = 12500, \, {n_2} = 3750} \right) $ . The simulated mean square error of our proposed estimator is approximately 40% of the mean square error of the other estimators for $ \left({{n_1} = 10000, \, {n_2} = 5000} \right) $ . This simulation study suggests that our proposed estimator, $ \bar Y_g^{\left(1 \right)} $ , has a smaller mean square error in estimating the population mean than the other available estimators of the population mean.
In Table 4, the mean square errors of our proposed estimator and of the existing estimators are compared. Table 4 shows that the MSE of our proposed estimator is the smallest in all simulated populations, which indicates that it is an efficient estimator, since the estimator with the least mean square error is considered the most effective.
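The Monte Carlo procedure described in this section can be sketched as follows. This is a scaled-down illustration under stated assumptions, not a reproduction of Table 4: X and Z are taken to be independent standard normal variables, the population and replication counts are much smaller than in the study, and the two estimators compared (the second-phase sample mean and the standard two-phase linear regression estimator on X) are stand-ins chosen because their forms are well known, not the estimators compared in the paper. All names are hypothetical.

```python
import random


def mean(values):
    return sum(values) / len(values)


def simulate_mse(N=2000, n1=1000, n2=100, reps=1000, seed=0):
    """Monte Carlo MSE of two estimators of the population mean of Y
    under two-phase simple random sampling without replacement."""
    rng = random.Random(seed)
    # Assumption: X and Z independent N(0, 1); Y = X + Z + e, e ~ N(0, 1).
    x = [rng.gauss(0.0, 1.0) for _ in range(N)]
    z = [rng.gauss(0.0, 1.0) for _ in range(N)]
    y = [xi + zi + rng.gauss(0.0, 1.0) for xi, zi in zip(x, z)]
    y_bar_pop = mean(y)

    se_mean = 0.0   # accumulated squared error of the sample mean
    se_reg = 0.0    # accumulated squared error of the regression estimator
    for _ in range(reps):
        first = rng.sample(range(N), n1)    # primary sample
        second = rng.sample(first, n2)      # subsample of the primary sample
        x_bar1 = mean([x[i] for i in first])
        x2 = [x[i] for i in second]
        y2 = [y[i] for i in second]
        x_bar2, y_bar2 = mean(x2), mean(y2)
        # Least-squares slope of Y on X from the second-phase sample.
        s_xy = sum((a - x_bar2) * (b - y_bar2) for a, b in zip(x2, y2))
        s_xx = sum((a - x_bar2) ** 2 for a in x2)
        reg = y_bar2 + (s_xy / s_xx) * (x_bar1 - x_bar2)
        se_mean += (y_bar2 - y_bar_pop) ** 2
        se_reg += (reg - y_bar_pop) ** 2
    return se_mean / reps, se_reg / reps


mse_mean, mse_reg = simulate_mse()
print(mse_mean, mse_reg)
```

The simulated MSE of each estimator is the average squared deviation from the true population mean over all replications, which is the same quantity reported for each estimator in Table 4. Fixing the seed makes the run reproducible.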
4. Discussion
In this study, an exponential ratio type estimator utilizing two auxiliary variables under two-phase sampling is suggested. A theoretical comparison is carried out by establishing the conditions under which the proposed estimator, $ {{{\hat{\bar{Y}}}}_{g}}^{{{\left(1 \right)}^{**}}} $ , is more efficient than the exponential ratio type estimators of Singh and Vishwakarma [15] $ \left({{{{\hat{\bar{Y}}}}_{SV}}} \right) $ and Singh, et al. [32] $ \left({{{{\hat{\bar{Y}}}}_S}} \right) $ , the exponential ratio-product estimators of Noor-ul-Amin and Hanif [20] $ \left({{{{\hat{\bar{Y}}}}_{NH}}} \right) $ and Sanaullah, et al. [33] $ \left({{{{\hat{\bar{Y}}}}_{SA}}} \right) $ , and the classical regression estimator with two auxiliary variables proposed by Kadilar and Cingi [34] $ \left({{{\hat Y}_{kc}}} \right) $ in two-phase sampling. We considered real population data as well as simulated data to examine the performance of the proposed estimator under the two-phase sampling scheme. In Table 3, the proposed estimator is compared with some of the existing estimators on the basis of their relative efficiencies, while Table 4 indicates that the proposed exponential estimator performs better than those of Singh and Vishwakarma [15], Singh, et al. [32], Noor-ul-Amin and Hanif [20] and Sanaullah, et al. [33] for all the simulated populations on the basis of their mean square errors. The estimator with the higher relative efficiency and the smaller mean square error is considered to be more efficient. Table 3 and Table 4 show that the relative efficiency of our proposed estimator is the highest for all real populations and that its mean square error is the smallest in all simulation studies. These results support the efficiency of our proposed estimator.
5. Conclusion
An exponential type ratio estimator in two-phase sampling has been proposed, which uses two auxiliary variables to reduce the mean square error. The main purpose of this study is to compare the efficiency of our proposed estimator with some of the existing exponential estimators. The MSE of our proposed estimator has been derived theoretically, and its efficiency has been checked through different simulations and through datasets of different populations from different fields. The percent relative efficiency values obtained from Table 3 are 502.5515, 355.1199, 247.3183, 390.9441 and 279.0334, which are the highest for all the populations, indicating that our proposed estimator is the most efficient estimator in this study. The ratio estimator of Sanaullah, et al. [33] appears as the second most efficient estimator in terms of relative efficiency for all populations except population 5. In view of the superior performance of our proposed estimator, we suggest its use in practical applications, especially in the fields of environmental, engineering and biological sciences.
Acknowledgement
The authors would like to thank Dr. Olga Korosteleva of California State University, Long Beach for valuable feedback on this work. We also would like to thank the referees for their productive suggestions on the paper.
Conflicts of interest
The authors have no conflict of interest.