Robust and efficient estimation for nonlinear model based on composite quantile regression with missing covariates

Qiang Zhao; Chao Zhang; Jingjing Wu; Xiuli Wang

doi:10.3934/math.2022452

AIMS Mathematics

2022, Volume 7, Issue 5: 8127-8146. doi: 10.3934/math.2022452


Research article


1. School of Mathematics and Statistics, Shandong Normal University, Jinan 250014, China
2. Department of Mathematics and Statistics, University of Calgary, Calgary, AB, Canada

Received: 16 November 2021 Revised: 22 January 2022 Accepted: 09 February 2022 Published: 24 February 2022
MSC : 62F12, 62G08

In this article, two types of weighted quantile estimators were proposed for nonlinear models with missing covariates. The asymptotic normality of the proposed weighted quantile average estimators was established. We further calculated the optimal weights and derived the asymptotic distributions of the resulting optimal weighted quantile estimators. Numerical simulations and a real data analysis were conducted to examine the finite sample performance of the proposed estimators compared with other competitors.


Citation: Qiang Zhao, Chao Zhang, Jingjing Wu, Xiuli Wang. Robust and efficient estimation for nonlinear model based on composite quantile regression with missing covariates[J]. AIMS Mathematics, 2022, 7(5): 8127-8146. doi: 10.3934/math.2022452


1. Introduction

In recent years, regression analysis has been widely used in various fields; for example, logistic regression was used to implement distributed classification of large data sets in Wang, Xu and Wu ^[1]. Traditional regression analysis is based on the mean, which is easy to calculate and straightforward to interpret. However, mean regression may fail for heavy-tailed error distributions, so a series of new regression methods have been proposed. Rank regression and quantile regression are widely used robust estimation methods. There are many applications in which several response variables are predicted with a common set of predictors. Zhao, Lian and Ma ^[2] took the possible correlations among the responses into account and introduced a robust reduced-rank estimator via rank regression. Zhang et al. ^[3] applied rank regression to the varying-coefficient model and proposed a robust multivariate varying-coefficient model based on rank loss that models the relationships among different responses via reduced-rank regression and penalized variable selection. These two methods are often applied to multivariate regression models. Gong, Xu and Chen ^[4] proposed a penalized modal regression method for additive models in high dimensions. Quantile regression (QR), as introduced by Koenker and Bassett ^[5], is also a robust regression method and can describe the entire conditional distribution of the response variable given the covariates. Because of these significant advantages, QR has become an effective method for statistical research. It is well known that different quantiles may carry different information about the error distribution. Therefore, appropriately combining information from different quantiles could be a feasible way to improve efficiency. With this idea, Zou and Yuan ^[6] defined a new loss function which is simply an average of the loss functions at different quantiles, and named the new method composite quantile regression (CQR). CQR can be considered a useful extension of quantile regression. Zhao and Xiao ^[7] pointed out that a simple average (using equal weights) is not an efficient way of using distributional information from different quantile regressions. Koenker ^[8,9] proposed a more general approach, which assigns different weights to different quantiles. Jiang et al. ^[10] extended the research on robust and efficient estimation and model selection in high dimensions to nonlinear models. Unfortunately, when the number of quantiles is large, the calculation is very demanding. Therefore, Bloznelis et al. ^[11] considered a model-averaged quantile estimator as a computationally cheaper alternative and compared its performance to the composite quantile estimator in both low and high dimensional cases.

Classical regression analysis and related theories are based on completely observed data, while missing data are frequently encountered in almost all research areas, such as psychological sciences and medical studies. In cases of missing data, classical statistical methods such as maximum likelihood estimation (MLE) cannot be applied directly. It is well known that the complete-case (CC) method, which uses only the fully observed data, can lead to seriously biased parameter estimates when the covariate is not missing completely at random. Yates ^[12] introduced an imputation method which is widely used to handle missing responses. This method aims to find an appropriate value to fill in for each missing observation. The data with the filled-in values can then be treated as fully observed data and analyzed by classical methods. Xia ^[13] employed profile nonlinear least squares estimation based on the weighted imputation method to estimate the unknown parameter and nonparametric function, and considered empirical likelihood inference based on the weighted imputation method for the varying coefficient partially nonlinear model with missing responses. The inverse probability weighted (IPW) method is another frequently used method, dating back to Horvitz and Thompson ^[14], that can be applied to the case of missing covariates. In this method, the inverse of the selection probability is chosen as the weight assigned to the fully observed data. The missing at random (MAR) assumption, in the sense of Rubin ^[15], is a common assumption for statistical analysis with missing data. Under the MAR assumption, many approaches for mean regression with missing values were developed to obtain efficient estimators, such as the imputation method proposed by Little and Rubin ^[16], the IPW method introduced by Robins et al. ^[17], and likelihood-based methods given by Ibrahim et al. ^[18]. For a comprehensive review, readers are referred to Qin, Shao and Zhang ^[19]. It is worth mentioning that the IPW method is unbiased under the MAR assumption.

However, most of the above methods are built on the least squares (LS) estimator, which is not robust against outliers. Recently, Sherwood, Wang and Zhou ^[20] considered a linear QR approach based on IPW with a parametric model for the selection probability when covariates are missing at random, and investigated the variable selection problem with the proposed method. Chen, Wan and Zhou ^[21] proposed three estimation methods for linear quantile regression when observations are missing at random, one of which uses nonparametric IPW. The above references focused on a given individual quantile. Owing to the effectiveness and robustness of the CQR method, Yang and Liu ^[22] investigated the CQR estimation of linear models with missing covariates by using the IPW method. It is worth pointing out that they used equal weights at different quantiles to construct their CQR estimator for a linear model. Recently, Wang, Song and Zhang ^[23] proposed an optimal weighted quantile average estimation for parameters in additive partially linear models with missing covariates, and their simulation results verified that the proposed method is an efficient and reliable alternative to both the weighted least squares (WLS) method and the weighted CQR (WCQR) method. Therefore, in this paper, applying the ideas of Jiang et al. ^[24] and Wang, Song and Zhang ^[23], we consider two types of WCQR estimators for nonlinear models with missing covariates, and the proposed methods are shown to be superior via simulation studies and a real data example.

The rest of this paper is organized as follows. The proposed estimation technique and its theoretical properties are presented in Section 2. Numerical simulation studies are conducted in Section 3 to examine the performance of the proposed methods and to support the theoretical results derived in Section 2. A real data analysis is given in Section 4 to illustrate the implementation of the proposed methods. The regularity conditions and the proofs of the theoretical results are given in the Appendix.

2. Methodology

Zhao and Lian ^[25] studied two weighting schemes to further improve the efficiency of CQR for linear models. They showed that the two weighting schemes are asymptotically equivalent to each other and, in theory, always result in more efficient estimators than CQR. Now, in order to obtain a more general approach, we generalize the linear model to the nonlinear model and consider covariates missing at random. Consider the nonlinear model

$\begin{equation} Y_{i} = f(X_{i},\beta)+\varepsilon_{i}, \ \ \ i = 1,\dots,n, \end{equation}$

(2.1)

where $Y_{i}$ is an observable response, $X_{i} = (U_{i}^{T}, V_{i}^{T})^{T} \in R^{q+s}$ is the vector of covariates, $\beta$ is the $p$-dimensional vector of unknown parameters, and $\varepsilon_{i}$ is the random error independent of $X_{i}$. Let $K$ be the number of quantiles and consider the equally spaced quantile levels $\tau_{k} = \frac{k}{K+1}$, $k = 1, 2, \ldots, K$. Jiang et al. ^[24] proposed the weighted composite quantile estimator for $\beta$ by minimizing

$l_{n}(\beta,{\bf{b}}) = \sum\limits_{k = 1}^{K}\omega_{k}\sum\limits_{i = 1}^{n}\rho_{\tau_{k}}(Y_{i}-f(X_{i},\beta)-b_{\tau_{k}})$

over $\beta$ and ${\bf{b}} = (b_{\tau_{1}}, b_{\tau_{2}}, \ldots, b_{\tau_{K}})^{T}$, where $\rho_{\tau}(t) = t(\tau-I(t < 0))$ is the check loss, and $\omega_{k}$ is the weight which controls the contribution of the $\tau_{k}$-th quantile regression. The weights satisfy $\sum_{k = 1}^{K}\omega_{k}g(b_{\tau_{k}}) > 0$, with $g(\cdot)$ being the density of $\varepsilon$.

Here we assume some covariates are missing. More specifically, we assume $U_{i}$ 's are all observed while some $V_{i}$ 's are missing. Let $\delta_{i} = 0$ if $V_{i}$ is missing, and $\delta_{i} = 1$ if $V_{i}$ is observed. Throughout this paper, following Wang, Song and Zhang ^[23], we assume the following missing mechanism

$\begin{equation} P(\delta_{i} = 1|Y_{i},U_{i},V_{i}) = P(\delta_{i} = 1|U_{i})\triangleq \pi(U_{i}), \end{equation}$

(2.2)

where $\pi(\cdot)$ is called the selection probability function or the propensity score.

When the selection probability function $\pi(\cdot)$ is known, the IPW estimator of $\beta$ under missing covariates is defined as

$\begin{equation} (\hat{{\bf{b}}}, \hat{\beta}) = \underset{b,\beta}{\arg\min}\; L_{n}\left (\pi(U),\beta,{\bf{b}}\right ), \end{equation}$

(2.3)

where $L_{n}(\pi(U), \beta, {\bf{b}}) = \sum_{k = 1}^{K}\omega_{k}\sum_{i = 1}^{n}\frac{\delta_{i}}{\pi(U_{i})}\rho_{\tau_{k}}(Y_{i}-f(X_{i}, \beta)-b_{\tau_{k}})$ . However, in reality the selection probability function $\pi(\cdot)$ is usually unknown and needs to be estimated. Next we follow Wang, Song and Zhang ^[23] and consider estimating $\pi(U_i)$ using both parametric and nonparametric models.
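
As an illustrative sketch only (it is not code from the paper), the IPW objective $L_{n}(\pi(U), \beta, {\bf{b}})$ in (2.3) can be evaluated as below. The function names `check_loss` and `ipw_wcqr_loss`, the argument layout, and the convention that rows with a missing $V_i$ carry `NaN` are assumptions made for this example; `f` is any user-supplied regression function $f(X, \beta)$.

```python
import numpy as np

def check_loss(t, tau):
    """Quantile check loss rho_tau(t) = t * (tau - I(t < 0))."""
    t = np.asarray(t, dtype=float)
    return t * np.where(t < 0, tau - 1.0, tau)

def ipw_wcqr_loss(beta, b, y, x, delta, pi, f, taus, omega):
    """IPW weighted composite quantile loss L_n(pi(U), beta, b).

    Only complete cases (delta == 1) contribute, each scaled by 1 / pi(U_i).
    Rows of x with delta == 0 may contain NaN; they are never evaluated.
    """
    y = np.asarray(y, dtype=float)
    x = np.asarray(x, dtype=float)
    delta = np.asarray(delta, dtype=bool)
    ipw = 1.0 / np.asarray(pi, dtype=float)[delta]
    resid = y[delta] - f(x[delta], beta)
    total = 0.0
    for tau_k, w_k, b_k in zip(taus, omega, b):
        total += w_k * np.sum(ipw * check_loss(resid - b_k, tau_k))
    return total
```

In practice this objective would be passed to a numerical optimizer over $(\beta, {\bf{b}})$; because the check loss is not differentiable at zero, a derivative-free or smoothed method is one possible choice.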

2.1. Estimation of propensity scores

To estimate the propensity scores nonparametrically, we apply nonparametric smoothing techniques. Particularly, we use the Nadaraya-Watson estimator of $\pi(U_{i})$ which is defined as

$\begin{equation} \hat{\pi}(U_{i}) = \frac{\sum_{j = 1}^{n}K_{h}(U_{i}-U_{j})\delta_{j}}{\sum_{j = 1}^{n}K_{h}(U_{i}-U_{j})}, \end{equation}$

(2.4)

where $K_{h}(\cdot) = K(\cdot/h)/h^{q}$, $K(\cdot)$ is a $q$-variate kernel function, and $h$ is the bandwidth.
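
A minimal sketch of the Nadaraya-Watson estimator (2.4) is given below for illustration. It assumes a product Gaussian kernel and a single common bandwidth `h` supplied by the user; the function name `nw_propensity` is hypothetical, and bandwidth selection is not addressed here.

```python
import numpy as np

def nw_propensity(U, delta, h):
    """Nadaraya-Watson estimate of pi(U_i) = P(delta_i = 1 | U_i) at every sample point.

    U     : (n,) or (n, q) array of always-observed covariates
    delta : (n,) array of 0/1 observation indicators
    h     : positive scalar bandwidth (assumed common to all coordinates)
    """
    U = np.asarray(U, dtype=float)
    if U.ndim == 1:
        U = U[:, None]
    delta = np.asarray(delta, dtype=float)
    diff = (U[:, None, :] - U[None, :, :]) / h      # pairwise differences, shape (n, n, q)
    # Product Gaussian kernel; the constants and the factor h^q cancel in the ratio.
    K = np.exp(-0.5 * np.sum(diff ** 2, axis=2))
    return K @ delta / K.sum(axis=1)
```

Note that the pairwise array costs memory of order $n^2 q$, which is acceptable for the sample sizes considered later but would need chunking for much larger $n$.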

When the dimension of $U$ is high, fully nonparametric estimation suffers from the curse of dimensionality. In this case, a parametric approach might be more feasible for the estimation of $\pi(U_i)$ given in (2.2). A commonly used model for (2.2) is the logistic regression given by

$\begin{equation} \pi(U_{i},\gamma) = \frac{\exp(\gamma_{0}+U_{i}^{T}\gamma_{1})}{1+\exp(\gamma_{0}+U_{i}^{T}\gamma_{1})} = \frac{\exp(\Gamma_{i}^{T}\gamma)}{1+\exp(\Gamma_{i}^{T}\gamma)}, \end{equation}$

(2.5)

where $\Gamma_{i} = (1, U_{i}^{T})^{T}$ and $\gamma = (\gamma_{0}, \gamma_{1}^{T})^{T}\in\Theta$ is an unknown parameter vector with $\Theta \subset R^{q+1}$ . Here $\gamma$ can be estimated by maximizing the log-likelihood function

$L(\gamma) = \sum\limits_{i = 1}^{n}\left\{\delta_{i}\log\pi(U_{i},\gamma)+(1-\delta_{i})\log(1-\pi(U_{i},\gamma))\right\}.$

Let $\hat\gamma$ be the MLE of $\gamma$; then the parametric estimator of $\pi(U_{i})$ is denoted by $\pi(U_{i}, \hat{\gamma})$. If the specified parametric model (2.5) for the selection probability function $\pi(\cdot)$ is valid, then the IPW method is applicable.
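
The maximization of $L(\gamma)$ can be sketched, for illustration, with a plain Newton-Raphson iteration. This is one standard way to compute the logistic MLE, not necessarily the routine used by the authors; the function name `fit_logistic_propensity` and its defaults are assumptions of this example.

```python
import numpy as np

def fit_logistic_propensity(U, delta, n_iter=50, tol=1e-10):
    """Newton-Raphson MLE of gamma in the logistic propensity model (2.5).

    Returns (gamma_hat, pi_hat), where pi_hat[i] = pi(U_i, gamma_hat).
    """
    U = np.asarray(U, dtype=float)
    if U.ndim == 1:
        U = U[:, None]
    delta = np.asarray(delta, dtype=float)
    G = np.column_stack([np.ones(len(U)), U])       # rows are Gamma_i^T = (1, U_i^T)
    gamma = np.zeros(G.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-G @ gamma))
        score = G.T @ (delta - p)                   # gradient of the log-likelihood
        hess = (G * (p * (1.0 - p))[:, None]).T @ G # negative Hessian (information)
        step = np.linalg.solve(hess, score)
        gamma = gamma + step
        if np.max(np.abs(step)) < tol:
            break
    pi_hat = 1.0 / (1.0 + np.exp(-G @ gamma))
    return gamma, pi_hat
```

The fitted values `pi_hat` then play the role of $\hat{\pi}(U_{i})$ in the weighted objectives of Section 2.2.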

2.2. WCQR estimation of regression parameters

In this subsection, we propose two weighting schemes for the WCQR estimation. The first one is based on weighting the quantile loss and the second one is weighting the quantile regression estimator at different levels with details given below. For convenience, we use $\hat{\pi}(U_{i})$ for the estimator of $\pi(U_{i})$ by either the parametric or nonparametric method.

As in Jiang et al. ^[24], we let $\tau_{k} = \frac{k}{K+1}$ , $k = 1, 2, \ldots, K$ for some $K$ . By weighting the different loss functions in CQR with the IPW method, our first WCQR estimator is defined as

$\begin{equation} (\hat{{\bf{b}}}, \hat{\beta}_{ \rm{WCQR1}}) = {\rm{argmin}}_{b,\beta}L_{n}(\hat{\pi}(U),\beta,{\bf{b}}), \end{equation}$

(2.6)

where $L_{n}(\hat{\pi}(U), \beta, {\bf{b}}) = \sum_{k = 1}^{K}\omega_{k}\sum_{i = 1}^{n}\frac{\delta_{i}}{\hat{\pi}(U_{i})}\rho_{\tau_{k}}(Y_{i}-f(X_{i}, \beta)-b_{\tau_{k}})$. The weights $\omega_{k}$ are allowed to be negative and satisfy $\sum_{k = 1}^{K}\omega_{k}g(b_{\tau_{k}}) > 0$, where $g(\cdot)$ is the density function of the error term $\varepsilon$.

The following theorem presents the asymptotic distribution of $\hat{\beta}_{ \rm{WCQR1}}$. We first introduce some notation. Let $\beta^{*}$ be the true value of $\beta$, $b_{\tau_{k}}^{*}$ be the $\tau_{k}$-th quantile of $\varepsilon$ and ${\bf{b}}^{*} = (b_{\tau_{1}}^{*}, b_{\tau_{2}}^{*}, \ldots, b_{\tau_{K}}^{*})^{T}$. Denote $f_{i}^{*} = f(X_{i}, \beta^{*})$, $\nabla f_{i}^{*} = \frac{\partial f(X_{i}, \beta)}{\partial\beta}|_{\beta = \beta^{*}}$, $\Sigma_{1} = E[\nabla f_{1}^{*} (\nabla f_{1}^{*})^{T}]$, $\Sigma_{2} = E[\frac{\nabla f_{1}^{*} (\nabla f_{1}^{*})^{T}}{\pi(U)}]$, ${\bf{g}} = (g(b_{\tau_{1}}^{*}), g(b_{\tau_{2}}^{*}), \ldots, g(b_{\tau_{K}}^{*}))^{T}$, ${\bf{\Omega}} = \{\min(\tau_{k}, \tau_{k^{\prime}})(1-\max(\tau_{k}, \tau_{k^{\prime}}))\}_{1\leq k, k^{\prime}\leq K}$, and ${\textbf{H}} = (\frac{\min(\tau_{k}, \tau_{k^{\prime}})(1-\max(\tau_{k}, \tau_{k^{\prime}}))} {g(b_{\tau_{k}}^{*})g(b_{\tau_{k^{\prime}}}^{*})})_{1\leq k, k^{\prime}\leq K}$.

Theorem 2.1. Suppose that conditions C1-C6 in the Appendix hold and $\beta^{*}$ is the true parameter value. Then we have

$\sqrt{n}( \hat{\beta}_{ {WCQR1}}-\beta^{*})\stackrel{D}{\longrightarrow}N\left (0,\frac{ \omega ^{T}\Omega \omega }{ \omega ^{T}{\bf{g}}{\bf{g}}^{T} \omega }\Sigma_{1}^{-1}\Sigma_{2}\Sigma_{1}^{-1}\right ).$

Similar to Jiang et al. ^[24] and Zhao et al. ^[26], we can derive the optimal weights by minimizing $\frac{ \omega ^{T}\Omega \omega }{ \omega ^{T}{\bf{g}}{\bf{g}}^{T} \omega }$ in the asymptotic variance given in Theorem 2.1.

Corollary 2.1. The optimal weight vector $\omega ^{*} = (\omega_{1}^{*}, \omega_{2}^{*}, \ldots, \omega_{K}^{*})^{T}$ for $\hat{\beta}_{ \rm{WCQR1}}$ is

$\begin{equation} \omega ^{*} = \underset{\omega}{\arg\min}\frac{ \omega ^{T}\Omega \omega }{ \omega ^{T}{\bf{g}}{\bf{g}}^{T} \omega } = ({\bf{g}}^{T}\Omega^{-2}{\bf{g}})^{-1/2}\Omega^{-1}{\bf{g}}. \end{equation}$

(2.7)

Note that the optimal weight depends on the density function of $\varepsilon$. Based on the estimated residuals $\hat{\varepsilon}_{i}$, the usual nonparametric density estimation methods can provide a consistent estimator $\hat{g}(\cdot)$ of $g(\cdot)$. The estimated optimal weight vector is then $\hat{ \omega }^{*} = (\hat{{\bf{g}}}^{T}\Omega^{-2}\hat{{\bf{g}}})^{-1/2}\Omega^{-1}\hat{{\bf{g}}}$. With the estimated optimal weight vector $\hat{ \omega }^{*}$ in hand, the first optimal WCQR estimator of $\beta$ is defined as

$\begin{equation} \hat{\beta}_{ \rm{OWCQ1}} = \underset{\beta}{\arg\min}\sum\limits_{k = 1}^{K}\hat{\omega}_{k}^{*}\sum\limits_{i = 1}^{n}\frac{\delta_{i}}{\hat{\pi}(U_{i})}\rho_{\tau_{k}}(Y_{i}-f(X_{i},\beta)-\hat{b}_{\tau_{k}}). \end{equation}$

(2.8)

Corollary 2.2. The optimal weighted composite quantile estimator $\hat{\beta}_{ \rm{OWCQ1}}$ of $\beta$ has the optimal asymptotic variance $\frac{1}{n}({\bf{g}}^{T}\Omega^{-1}{\bf{g}})^{-1}\Sigma_{1}^{-1}\Sigma_{2}\Sigma_{1}^{-1}$.

Next, we present the second weighting scheme. Our method is inspired by Wang, Song and Zhang ^[23]. Let

$(\hat{b}_{\tau_{k}},\hat{\beta}_{\tau_{k}}) = \underset {b_{\tau_{k}},\beta} {\arg\min}\; \sum\limits_{i = 1}^{n}\frac{\delta_{i}}{\hat{\pi}(U_{i})}\rho_{\tau_{k}}(Y_{i}-f(X_{i},\beta)-b_{\tau_{k}}),$

then the second WCQR estimator is defined as

$\begin{equation} \hat{\beta}_{ \rm{WCQR2}} = \sum\limits_{k = 1}^{K}\omega_{k}\hat{\beta}_{\tau_{k}}, \end{equation}$

(2.9)

where $\omega_{k}$ 's satisfy $\sum_{k = 1}^{K} \omega_{k} = 1$ . The asymptotic distribution of $\hat{\beta}_{ \rm{WCQR2}}$ is summarized in the following theorem.

Theorem 2.2. Suppose that conditions C1-C6 in the Appendix hold and $\beta^{*}$ is the true parameter value. Then we have

$\sqrt{n}(\hat{\beta}_{ \rm{WCQR2}}-\beta^{*})\stackrel{\cal D}\longrightarrow N\left (0, \omega ^{T}{\boldsymbol{H}} \omega \Sigma_{1}^{-1}\Sigma_{2}\Sigma_{1}^{-1}\right).$

Similarly, we can obtain the optimal weight by minimizing $\omega ^{T}{\textbf{H}} \omega$ in the asymptotic covariance given in Theorem 2.2. As a result, the second optimal WCQR estimator of $\beta$ can be correspondingly defined as $\hat{\beta}_{ \rm{OWCQ2}}$, with the associated optimal asymptotic variance given in the following corollary.

Corollary 2.3. The optimal weight vector $\omega ^{*} = (\omega_{1}^{*}, \omega_{2}^{*}, \ldots, \omega_{K}^{*})^{T}$ of WCQR2 is

$\begin{equation} \omega^{*} = \arg\min\limits_{\omega^{T}\boldsymbol{1} = 1} \omega ^{T}{\boldsymbol{H}} \omega = \frac{{\boldsymbol{H}}^{-1}\boldsymbol{1}}{\boldsymbol{1}^{T}{\boldsymbol{H}}^{-1}\boldsymbol{1}}, \end{equation}$

(2.10)

where $\boldsymbol{1}$ is a $K \times 1$ vector with all elements equal to 1. With this optimal weight vector, the optimal WCQR estimator $\hat{\beta}_{ \rm{OWCQ2}} = \sum_{k = 1}^{K}\omega_{k}^{*}\hat{\beta}_{\tau_{k}}$ has the optimal asymptotic variance

$\frac{1}{n}({\boldsymbol{1}^{T}{\boldsymbol{H}}^{-1}\boldsymbol{1}})^{-1}\Sigma_{1}^{-1}\Sigma_{2}\Sigma_{1}^{-1}\ = \ \frac{1}{n}({\bf{g}}^{T}\Omega^{-1}{\bf{g}})^{-1}\Sigma_{1}^{-1}\Sigma_{2}\Sigma_{1}^{-1}.$
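
The equality of the two expressions above can be checked directly from the definitions of ${\textbf{H}}$, $\Omega$ and ${\bf{g}}$. The following short derivation is added for the reader's convenience; the diagonal matrix $D$ is notation introduced here only for this purpose and does not appear elsewhere in the paper.

```latex
% D is auxiliary notation (an assumption of this note): the diagonal matrix of densities.
\begin{align*}
D &= \mathrm{diag}\bigl(g(b_{\tau_{1}}^{*}), \ldots, g(b_{\tau_{K}}^{*})\bigr),
  \qquad D\boldsymbol{1} = {\bf{g}}, \\
{\boldsymbol{H}} &= D^{-1}\Omega D^{-1}
  \quad\Longrightarrow\quad
  {\boldsymbol{H}}^{-1} = D\,\Omega^{-1}D, \\
\boldsymbol{1}^{T}{\boldsymbol{H}}^{-1}\boldsymbol{1}
  &= (D\boldsymbol{1})^{T}\Omega^{-1}(D\boldsymbol{1})
   = {\bf{g}}^{T}\Omega^{-1}{\bf{g}}.
\end{align*}
```

In the same notation, the optimal weight in (2.10) can be written as $\omega^{*} = D\Omega^{-1}{\bf{g}}/({\bf{g}}^{T}\Omega^{-1}{\bf{g}})$.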

Remark 1. The optimal weight of OWCQ2 is essentially the same as that of Zhao and Lian ^[25], but with a different representation. From the above results for the two weighting methods, we observe that if the optimal weight vectors are used, both optimal WCQR estimators achieve the same optimal asymptotic variance $\frac{1}{n}({\bf{g}}^{T}\Omega^{-1}{\bf{g}})^{-1}\Sigma_{1}^{-1}\Sigma_{2}\Sigma_{1}^{-1}$.
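
As a numerical illustration of (2.7), (2.10) and Remark 1, the sketch below builds $\Omega$ and ${\textbf{H}}$ from given quantile levels and density values and returns both optimal weight vectors. The function name `optimal_weights` is hypothetical, and the standard normal error used in the demonstration is an assumption chosen only because its quantiles and density are available in the Python standard library.

```python
import numpy as np
from statistics import NormalDist

def optimal_weights(taus, g):
    """Return (Omega, H, w1, w2) for quantile levels taus and densities g_k = g(b*_{tau_k}).

    w1 follows (2.7): Omega^{-1} g scaled to unit Euclidean length.
    w2 follows (2.10): H^{-1} 1 scaled so that the weights sum to one.
    """
    taus = np.asarray(taus, dtype=float)
    g = np.asarray(g, dtype=float)
    Omega = np.minimum.outer(taus, taus) * (1.0 - np.maximum.outer(taus, taus))
    H = Omega / np.outer(g, g)
    Oinv_g = np.linalg.solve(Omega, g)
    w1 = Oinv_g / np.sqrt(Oinv_g @ Oinv_g)          # (g^T Omega^{-2} g)^{-1/2} Omega^{-1} g
    Hinv_1 = np.linalg.solve(H, np.ones_like(g))
    w2 = Hinv_1 / Hinv_1.sum()
    return Omega, H, w1, w2

# Demonstration with K = 10 equally spaced levels and N(0, 1) errors (assumed here).
K = 10
taus = np.arange(1, K + 1) / (K + 1)
nd = NormalDist()
g = np.array([nd.pdf(nd.inv_cdf(t)) for t in taus])
Omega, H, w1, w2 = optimal_weights(taus, g)
factor1 = (w1 @ Omega @ w1) / (w1 @ g) ** 2         # scalar factor in Theorem 2.1
factor2 = w2 @ H @ w2                               # scalar factor in Theorem 2.2
```

Both `factor1` and `factor2` should agree with $({\bf{g}}^{T}\Omega^{-1}{\bf{g}})^{-1}$ up to rounding, which is the common scalar factor in the optimal asymptotic variance.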

3. Simulation studies

In this section, we use simulation studies to examine the finite sample performance of our proposed methods and compare them with the inverse probability weighted CQR (IWCQ) method, which uses the same weight for different QR models, and the inverse probability WLS estimator. According to Zou and Yuan ^[6], the estimator of the proposed methodology is nearly as efficient as the oracle maximum likelihood (OML) estimator for $K \geq 9$ under various error distributions. Therefore, we take $K = 10$, $\tau_k = k/11$, $k = 1, 2, \dots, 10$, and consider the exponential regression model

$Y = \exp(\beta_{1}X_{1}+\beta_{2}X_{2}+\beta_{3}X_{3})+\varepsilon,$

where $\beta_{1} = 0.5$, $\beta_{2} = 1$, $\beta_{3} = 1$ and $(X_{1}, X_{2}, X_{3})$ follows a multivariate normal distribution with all pairwise covariances equal to 0.5 and all variances equal to 1. The model error $\varepsilon$ and $X = (X_{1}, X_{2}, X_{3})^{T}$ are independent. Then, using the method described in Section 4 of Wang, Chen and Lin ^[27], we set the data in $X_{3}$ to be missing at random while $X_{1}, X_{2}, Y$ are fully observed. We consider two selection probability functions

$\begin{array}{ll} \pi_{1}(X_{1},X_{2}) = \exp(2+0.5X_{1}+0.5X_{2})/\left[1+\exp(2+0.5X_{1}+0.5X_{2})\right], \\ \pi_{2}(X_{1},X_{2}) = \exp(1+1.25X_{1}+X_{2})/\left[1+\exp(1+1.25X_{1}+X_{2})\right]. \\ \end{array}$

Their corresponding average missing rates are 15% and 35% respectively. In our simulation, four different distributions of model error $\varepsilon$ are considered:

(Case 1) The standard normal distribution $N(0, 1)$ .

(Case 2) The centralized $t$ distribution with four degrees of freedom.

(Case 3) The mixture of normal distribution $0.6N(0, 1)+0.4N(2, 1)$ .

(Case 4) The centralized $\chi^{2}$ distribution with four degrees of freedom.
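
For illustration, one replication of this design could be generated as sketched below. The zero mean of $(X_{1}, X_{2}, X_{3})$ is an assumption (the mean is not stated above), "centralized" is read as shifting the distribution to have mean zero, and the function names `draw_error` and `simulate` are hypothetical.

```python
import numpy as np

def expit(z):
    return 1.0 / (1.0 + np.exp(-z))

def draw_error(case, n, rng):
    """Draw n model errors for Cases 1-4 as listed above."""
    if case == 1:
        return rng.standard_normal(n)
    if case == 2:
        return rng.standard_t(4, size=n)            # t(4) already has mean zero
    if case == 3:
        shift = np.where(rng.random(n) < 0.4, 2.0, 0.0)
        return rng.standard_normal(n) + shift       # 0.6 N(0,1) + 0.4 N(2,1)
    if case == 4:
        return rng.chisquare(4, size=n) - 4.0       # chi-square(4) minus its mean
    raise ValueError("case must be 1, 2, 3 or 4")

def simulate(n, case, mechanism, rng):
    """Generate (Y, X_obs, delta, pi); X_3 is set to NaN wherever delta is False."""
    cov = np.full((3, 3), 0.5) + 0.5 * np.eye(3)    # unit variances, covariances 0.5
    X = rng.multivariate_normal(np.zeros(3), cov, size=n)  # zero mean is an assumption
    beta = np.array([0.5, 1.0, 1.0])
    Y = np.exp(X @ beta) + draw_error(case, n, rng)
    if mechanism == 1:
        pi = expit(2.0 + 0.5 * X[:, 0] + 0.5 * X[:, 1])
    else:
        pi = expit(1.0 + 1.25 * X[:, 0] + X[:, 1])
    delta = rng.random(n) < pi                      # True means X_3 is observed
    X_obs = X.copy()
    X_obs[~delta, 2] = np.nan
    return Y, X_obs, delta, pi
```

With a large `n`, the empirical missing rate `1 - delta.mean()` under this sketch should be close to the 15% and 35% quoted above for the two mechanisms.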

In the simulation, samples of size $n = 200$ and $n = 600$ are generated independently. Four estimation methods, OWCQ1, OWCQ2, WLS and IWCQ, are used to estimate $\beta_1$, $\beta_2$ and $\beta_3$ under the above selection probability functions and error distributions, and the root mean squared errors (RMSEs) are then calculated. To evaluate the different estimators, we repeat the process $1000$ times and calculate the average RMSEs. The simulation results are reported in Tables 1-4 for the cases in which the selection probability function $\pi(\cdot)$ is known (denoted as T), estimated nonparametrically (denoted as N) and estimated parametrically (denoted as P). When the selection probability is estimated nonparametrically, we use the Gaussian kernel $K(x) = \frac{1}{\sqrt{2\pi}}\exp(-\frac{x^{2}}{2})$ to construct the multiplicative kernel $L(x_{1}, x_{2}) = K(x_{1})K(x_{2})$, and use the bandwidth proposed by Ruppert, Sheather and Wand ^[28]. When $\pi(\cdot)$ is estimated parametrically, we apply model (2.5). Meanwhile, similar to Jiang et al. ^[24], our proposed estimators involve a weighting scheme and the error density is known in the simulations, so we use the optimal weight $\omega^*$ (see Section 2.2) for all simulations.

Table 1. The RMSEs (multiplied by $10^4$) for $\beta$ under the selection probability function $\pi_1(X_1, X_2)$ for $n = 200$.

$\varepsilon$	$\beta$	OWCQ1			OWCQ2			WLS			IWCQ
$\varepsilon$	$\beta$	T	N	P	T	N	P	T	N	P	T	N	P
Case1	$\beta_1$	69.815	70.083	70.394	72.635	72.568	72.357	69.481	69.228	69.405	70.208	69.758	70.045
	$\beta_2$	67.619	67.682	67.697	70.266	70.334	70.316	66.842	66.872	66.880	67.598	67.771	67.520
	$\beta_3$	67.451	66.999	67.862	69.436	69.148	69.571	66.548	66.339	66.534	67.456	67.096	67.599
Case2	$\beta_1$	91.124	90.406	91.482	92.599	92.082	92.282	97.444	97.329	97.498	91.918	91.159	91.469
	$\beta_2$	86.790	84.551	86.247	85.497	84.982	85.157	93.340	92.772	93.262	86.649	86.011	86.640
	$\beta_3$	87.330	86.256	86.832	86.259	86.408	86.438	92.537	92.775	92.519	87.302	87.413	86.463
Case3	$\beta_1$	103.49	103.60	104.43	104.01	104.21	104.02	116.86	116.63	116.76	103.24	104.69	104.19
	$\beta_2$	97.220	96.335	97.271	94.204	94.251	94.005	112.09	111.73	112.13	99.724	100.34	99.991
	$\beta_3$	103.11	103.66	104.10	100.51	100.17	100.26	117.34	117.38	117.33	105.15	103.92	105.87
Case4	$\beta_1$	194.52	192.19	195.21	168.33	170.55	171.35	411.30	411.00	409.41	256.99	259.41	259.73
	$\beta_2$	178.00	168.97	174.80	150.73	148.00	149.04	391.41	390.18	391.30	241.40	238.28	241.31
	$\beta_3$	196.63	195.57	196.35	163.73	167.43	169.52	400.91	400.87	399.81	250.51	247.66	247.75


Table 2. The RMSEs (multiplied by $10^4$) for $\beta$ under the selection probability function $\pi_2(X_1, X_2)$ for $n = 200$.

$\varepsilon$	$\beta$	OWCQ1			OWCQ2			WLS			IWCQ
$\varepsilon$	$\beta$	T	N	P	T	N	P	T	N	P	T	N	P
Case1	$\beta_1$	72.059	70.502	72.837	72.631	72.315	72.331	68.665	68.472	68.565	73.059	71.171	72.846
	$\beta_2$	69.884	66.711	69.543	69.375	69.399	69.523	65.568	65.516	65.486	69.057	67.296	68.708
	$\beta_3$	70.649	69.247	71.376	71.898	70.976	71.581	67.101	66.943	67.111	70.199	68.972	70.806
Case2	$\beta_1$	89.885	90.720	91.987	91.404	90.922	89.848	95.801	95.740	95.810	94.513	91.495	94.635
	$\beta_2$	84.532	83.575	85.351	81.752	81.301	82.248	89.878	89.691	89.885	84.997	81.760	84.635
	$\beta_3$	86.637	86.019	88.037	86.442	85.448	85.523	90.878	91.106	90.837	88.500	85.344	88.625
Case3	$\beta_1$	111.25	107.40	110.85	106.60	105.38	106.70	117.16	117.54	117.09	114.25	110.63	110.23
	$\beta_2$	102.81	98.443	103.60	95.112	96.201	95.374	111.66	111.44	111.53	108.05	102.58	107.68
	$\beta_3$	109.44	106.47	105.94	100.80	101.37	100.26	115.30	115.45	115.55	113.12	106.08	108.44
Case4	$\beta_1$	200.03	190.67	196.16	178.68	173.67	185.99	410.08	412.02	410.44	279.83	264.94	285.81
	$\beta_2$	170.64	167.98	174.57	155.36	146.81	153.24	382.39	381.07	382.02	259.52	239.58	250.77
	$\beta_3$	196.65	186.94	195.98	175.10	167.23	175.20	391.13	391.01	391.34	270.68	257.52	270.36


Table 3. The RMSEs (multiplied by $10^4$) for $\beta$ under the selection probability function $\pi_1(X_1, X_2)$ for $n = 600$.

$\varepsilon$	$\beta$	OWCQ1			OWCQ2			WLS			IWCQ
$\varepsilon$	$\beta$	T	N	P	T	N	P	T	N	P	T	N	P
Case1	$\beta_1$	25.572	25.515	24.947	26.134	26.437	26.173	24.447	24.525	24.442	25.424	25.342	25.320
	$\beta_2$	23.663	23.839	23.743	24.582	24.823	24.726	23.140	23.253	23.142	23.938	23.643	23.878
	$\beta_3$	24.046	23.974	23.934	24.758	24.746	24.892	23.426	23.507	23.429	24.246	24.019	24.215
Case2	$\beta_1$	30.219	30.031	30.487	30.349	30.274	30.454	32.960	32.914	32.959	30.739	30.452	30.924
	$\beta_2$	30.093	30.356	30.087	29.949	29.838	29.894	32.766	32.736	32.772	30.636	30.554	30.107
	$\beta_3$	29.388	29.358	29.372	29.042	29.074	29.069	32.962	32.924	32.966	29.949	29.783	29.989
Case3	$\beta_1$	41.449	41.209	41.214	37.899	37.862	37.805	48.247	48.244	48.183	41.469	41.915	42.827
	$\beta_2$	37.420	37.788	37.515	35.155	34.787	34.723	46.331	46.481	46.301	39.647	38.938	40.156
	$\beta_3$	37.194	36.275	36.613	33.889	33.677	33.792	45.890	45.957	45.905	37.930	37.878	38.686
Case4	$\beta_1$	69.441	67.445	68.620	56.757	56.679	56.131	173.27	173.52	173.24	104.62	106.89	104.78
	$\beta_2$	65.303	63.240	62.968	52.234	50.795	51.504	176.16	176.44	176.46	105.95	107.51	103.95
	$\beta_3$	67.368	62.231	62.897	54.403	54.812	54.689	182.13	182.04	182.06	104.29	111.02	103.67


Table 4. The RMSEs (multiplied by $10^4$) for $\beta$ under the selection probability function $\pi_2(X_1, X_2)$ for $n = 600$.

$\varepsilon$	$\beta$	OWCQ1			OWCQ2			WLS			IWCQ
$\varepsilon$	$\beta$	T	N	P	T	N	P	T	N	P	T	N	P
Case1	$\beta_1$	26.185	25.492	26.202	25.716	25.794	25.532	24.259	24.271	24.288	26.034	25.786	25.998
	$\beta_2$	24.210	24.086	24.215	24.426	24.648	24.535	22.966	23.048	22.968	24.498	24.623	24.252
	$\beta_3$	24.971	24.826	25.319	25.077	25.128	25.043	23.276	23.295	23.311	25.454	24.999	25.898
Case2	$\beta_1$	31.548	31.632	31.631	30.495	30.447	30.524	33.423	33.397	33.411	32.692	32.331	32.422
	$\beta_2$	30.301	29.961	30.383	29.742	29.760	29.827	32.471	32.430	32.454	31.721	31.218	31.278
	$\beta_3$	30.325	29.495	29.733	29.124	28.802	29.064	32.906	32.873	32.897	31.516	30.954	30.065
Case3	$\beta_1$	45.130	43.852	45.128	37.870	37.555	38.020	48.043	48.115	48.048	50.746	47.710	52.185
	$\beta_2$	41.167	39.934	41.160	35.190	35.213	35.411	46.267	46.312	46.233	45.117	43.963	47.649
	$\beta_3$	40.974	38.909	40.893	34.035	34.065	33.732	45.492	45.414	45.479	44.275	43.626	45.085
Case4	$\beta_1$	78.214	75.849	77.303	57.448	57.145	56.259	171.73	172.81	171.66	134.52	124.10	136.22
	$\beta_2$	73.627	69.376	70.722	52.432	51.211	51.105	175.48	175.61	175.55	136.08	124.71	136.97
	$\beta_3$	73.030	69.922	69.207	54.729	52.922	54.132	179.53	179.19	179.42	134.62	125.02	135.85


From Tables 1-4 we observe that when the model error $\varepsilon$ follows the standard normal distribution $N(0, 1)$, WLS performs the best among the four estimators considered, while OWCQ1, OWCQ2 and IWCQ behave very similarly. For all the non-normal distributions considered, WLS always performs the worst. The performance of the other three methods is very similar when the model error follows the centralized $t$ distribution with four degrees of freedom. It is further noted that when the missing rate is high or the sample size is large, our proposed methods are superior to IWCQ. In particular, when the model error follows the centralized chi-square distribution with four degrees of freedom, the superiority of both OWCQ1 and OWCQ2 is even more obvious. We also find that for the OWCQ1 and IWCQ methods a better result can be obtained by estimating the selection probability function nonparametrically. At the same time, IWCQ also performs much better than WLS.

When the sample size is large, it can be seen from Tables 3 and 4 that the performance of the four estimators is significantly improved compared with that when the sample size is small, and our proposed estimators have more obvious advantages over WLS and IWCQ. We observe that both OWCQ1 and OWCQ2 always have high accuracy under any of the four error distributions, and OWCQ2 performs slightly better than OWCQ1 except when the model error $\varepsilon$ follows the standard normal distribution. We also find that the RMSEs are not sensitive to the missing rate. In addition, OWCQ1 is computationally faster than OWCQ2 when the optimal weight obtained from the known error distribution is used. For example, when we simulated Case 1 at $n = 200$ and $\pi$ = 0.15, we found that OWCQ1 was about 20% faster than OWCQ2. For the other cases, the difference in computing speed between OWCQ1 and OWCQ2 is similar.

4. A real data example

In this section, we illustrate our proposed methods using a real data set originally presented by Baum ^[29] to investigate how age, marital status, number of children and educational background affect whether a woman works or not. For each woman there are five variables:

● Work ( $y$ ): 1 = Yes, 0 = No;

● Age ( $x_{1}$ ): the age of the woman;

● Children ( $x_{2}$ ): the number of children the woman raises;

● Education ( $x_{3}$ ): the number of years the woman has spent in school;

● Married ( $x_{4}$ ): 1 = Yes, 0 = No.

Note that the response $y$ is the average estimated probability of work. A logistic model with all of the covariates, given by

$y_i = \frac{\exp(\beta_{0}+\beta_1x_{1i}+\beta_2x_{2i}+\beta_3x_{3i}+\beta_4x_{4i})}{1+\exp(\beta_{0}+\beta_1x_{1i}+\beta_2x_{2i}+\beta_3x_{3i}+\beta_4x_{4i})}+\varepsilon_{i}, \quad i = 1,2,\dots,2000$

is suitable for modeling the relationship between the choice of work and all possible factors. In order to use the data set to illustrate our methods, artificial missing data were created by using the selection probability $\pi(X) = \frac{\exp(\gamma_{0}+\gamma_1x_{1}+\gamma_2x_{2})}{1+\exp(\gamma_{0}+\gamma_1x_{1}+\gamma_2x_{2})}$ . The missing proportion is about 18.65% with $\gamma_{0} = 2, \gamma_1 = 0.15, \gamma_2 = 0.25$ , and, following Li and Ding ^[30], the quantile vector is taken as $\tau = (0.2, 0.4, 0.6, 0.8)^T$ with $K = 4$ .
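
The way the artificial missingness is generated can be sketched in a few lines. The following is a minimal illustration, not the authors' code: each observation is kept with probability $\pi(X)$ from the logistic selection model, using the $\gamma$ values quoted above. The function names, the seed and the toy covariate values are assumptions introduced here, and because the scaling of the covariates used in the paper is not stated in this section, the toy data are not expected to reproduce the reported 18.65% missing proportion.

```python
import math
import random


def logistic(z):
    """Numerically stable logistic function exp(z) / (1 + exp(z))."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def selection_probability(x1, x2, gamma=(2.0, 0.15, 0.25)):
    """pi(X) = logistic(gamma0 + gamma1 * x1 + gamma2 * x2)."""
    g0, g1, g2 = gamma
    return logistic(g0 + g1 * x1 + g2 * x2)


def draw_indicators(x1_values, x2_values, gamma=(2.0, 0.15, 0.25), seed=0):
    """Draw delta_i ~ Bernoulli(pi(X_i)); delta_i = 1 marks a complete case.

    The seed is a hypothetical choice made only for reproducibility.
    """
    rng = random.Random(seed)
    deltas = []
    for x1, x2 in zip(x1_values, x2_values):
        p = selection_probability(x1, x2, gamma)
        deltas.append(1 if rng.random() < p else 0)
    return deltas


# Toy covariates (hypothetical values, not the data set of Baum).
toy_x1 = [-3.0, -1.0, 0.0, 1.0, 2.5]
toy_x2 = [0.0, 1.0, 2.0, 0.0, 3.0]
toy_delta = draw_indicators(toy_x1, toy_x2)
observed_rate = sum(toy_delta) / len(toy_delta)
```

The resulting indicators $\delta_i$ and probabilities $\pi(X_i)$ are the ingredients of the inverse probability weights $\delta_i/\pi(U_i)$ used in the estimation procedure.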

From (2.7) and (2.10), we know that the optimal weights depend on $g(b^{*}_{\tau})$ and $b^{*}_{\tau}$ , both of which are unknown here and need to be estimated. Motivated by Sun and Sun ^[31] and Zhao and Xiao ^[7], we propose the following procedure under the case when the selection probability is known.

$(1)$ Use the uniform weight $\omega = (1/K, 1/K, \ldots, 1/K)^{T}$ to obtain the preliminary estimator $\hat{\beta}$ of $\beta$ as follows:

$(\hat b_{\tau_{1}},\hat b_{\tau_{2}},\ldots,\hat b_{\tau_{K}},\hat{\beta}) = \underset{b_{\tau_{k}},\beta} {\arg\min}\; \sum\limits_{k = 1}^{K}\frac{1}{K}\sum\limits_{i = 1}^{n}\frac{\delta_{i}}{\pi(U_{i})}\rho_{\tau_{k}}(Y_{i}-b_{\tau_{k}}-f(X_{i},\beta)).$

$(2)$ Let $m = \sum_{i = 1}^{n}\delta_{i}$ . Without loss of generality, we assume the first $m$ observations are complete. Then, based on the complete data, the pseudo residuals $\hat{\varepsilon}_{i}$ with $\delta_{i} = 1$ are computed as $\hat{\varepsilon}_{i} = \frac{\delta_{i}}{\pi(U_{i})}(Y_{i}-f(X_{i}, \hat{\beta}))$ , $i = 1, 2, \ldots, m$ .

$(3)$ Use the nonparametric kernel density estimator to estimate $g(t)$ :

$\hat{g}(t) = \frac{1}{mb}\sum\limits_{i = 1}^{m}K(\frac{t-\hat{\varepsilon}_{i}}{b}),$

where $K(\cdot)$ is a non-negative kernel function and the bandwidth $b$ is selected by

$b = 0.9\times\min\left\{ \mathrm{SD}(\hat{\varepsilon}_{1},\hat{\varepsilon}_{2},\ldots,\hat{\varepsilon}_{m}), \frac{ \mathrm{IQR}(\hat{\varepsilon}_{1},\hat{\varepsilon}_{2},\ldots,\hat{\varepsilon}_{m})}{1.34}\right\}\times m^{-1/5},$

where SD and IQR stand for the sample standard deviation and sample interquartile range, respectively.

$(4)$ Estimate $g(b_{\tau_{k}}^{*})$ by $\hat{g}(\hat{b}_{\tau_{k}})$ and then substitute it into (2.7) or (2.10), from which the optimal weight vector can be obtained, where $\hat{b}_{\tau_{k}}$ denotes the sample $\tau_{k}$ -quantile of $\hat{\varepsilon}_{1}, \hat{\varepsilon}_{2}, \ldots, \hat{\varepsilon}_{m}$ .
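
The building blocks of steps $(1)$ to $(4)$ can be sketched as follows. This is a rough illustration under stated assumptions rather than the authors' implementation: the Gaussian kernel is an assumed choice (the text only requires a non-negative kernel), the order-statistic quantile and the quartile convention of Python's `statistics.quantiles` are one of several possible conventions, and all function names are hypothetical. The minimization in step $(1)$ and the weight formulas (2.7) and (2.10) are not reproduced here; the sketch only evaluates the objective and the density estimates that feed into them.

```python
import math
import statistics


def check_loss(u, tau):
    """Quantile check loss rho_tau(u) = u * (tau - 1{u < 0})."""
    return u * (tau - (1.0 if u < 0 else 0.0))


def ipw_cqr_objective(b, beta, taus, y, x, delta, pi, f):
    """Uniform-weight, inverse-probability-weighted objective of step (1).

    b[k] is the intercept for quantile level taus[k]; f(x_i, beta) is the
    regression function. Rows with delta_i = 0 contribute nothing.
    """
    total = 0.0
    for b_k, tau in zip(b, taus):
        for y_i, x_i, d_i, p_i in zip(y, x, delta, pi):
            if d_i:
                total += check_loss(y_i - b_k - f(x_i, beta), tau) / p_i
    return total / len(taus)


def rule_of_thumb_bandwidth(residuals):
    """b = 0.9 * min(SD, IQR / 1.34) * m^(-1/5), as in step (3)."""
    m = len(residuals)
    sd = statistics.stdev(residuals)
    q1, _, q3 = statistics.quantiles(residuals, n=4)
    return 0.9 * min(sd, (q3 - q1) / 1.34) * m ** (-0.2)


def kernel_density(t, residuals, bandwidth):
    """Kernel density estimate at t; the Gaussian kernel is an assumption."""
    scale = len(residuals) * bandwidth * math.sqrt(2.0 * math.pi)
    return sum(math.exp(-0.5 * ((t - e) / bandwidth) ** 2) for e in residuals) / scale


def sample_quantile(values, tau):
    """Order-statistic sample quantile (one of several common conventions)."""
    ordered = sorted(values)
    k = max(0, math.ceil(tau * len(ordered)) - 1)
    return ordered[k]


def density_at_quantiles(residuals, taus):
    """Step (4): evaluate g_hat at the sample tau_k-quantiles of the residuals."""
    b = rule_of_thumb_bandwidth(residuals)
    return [kernel_density(sample_quantile(residuals, tau), residuals, b) for tau in taus]
```

Given the pseudo residuals from step $(2)$, a call such as `density_at_quantiles(residuals, (0.2, 0.4, 0.6, 0.8))` would return the four values $\hat{g}(\hat{b}_{\tau_{k}})$ that are then substituted into (2.7) or (2.10).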

When a woman has a job, the response $y_i$ takes a larger value. Because only 32.85% of the women in the data set do not work, we classify a woman as having a job if the corresponding fitted value $\hat{y}_i$ is larger than the 0.3285 quantile of the fitted values $\hat{y}$ . In order to compare the performance of our proposed methods with IWCQ and the composite quantile estimator which only uses the fully observed data (denoted by CQR-CCA), we calculate the fitted values $\hat{y}$ for all $2000$ observations under each of the four methods, and predict whether a woman works or not. The prediction accuracy is reported in Table 5. From Table 5 we observe that the IWCQ method clearly improves on CQR-CCA in the presence of missing data, and that the CQR-CCA estimator has the lowest accuracy. Both of our proposed methods are more accurate than the IWCQ method.

Table 5. Accuracy of prediction.

	OWCQ1	OWCQ2	IWCQ	CQR-CCA
Accuracy	0.708	0.693	0.6725	0.6195
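
The thresholding rule behind the accuracies in Table 5 can be sketched as below. This is a hedged illustration only: the function names are hypothetical, the order-statistic quantile is an assumed convention, and the toy fitted values are made up for demonstration rather than taken from the data set.

```python
import math


def threshold_at_quantile(fitted, q):
    """Return the order-statistic q-quantile of the fitted values."""
    ordered = sorted(fitted)
    k = max(0, math.ceil(q * len(ordered)) - 1)
    return ordered[k]


def predict_work(fitted, q=0.3285):
    """Predict 1 (works) when a fitted value exceeds the q-quantile, else 0."""
    cut = threshold_at_quantile(fitted, q)
    return [1 if value > cut else 0 for value in fitted]


def accuracy(predicted, actual):
    """Share of observations whose predicted label matches the observed one."""
    return sum(p == a for p, a in zip(predicted, actual)) / len(actual)


# Toy example with hypothetical fitted values and labels.
toy_fitted = [0.2, 0.9, 0.4, 0.7]
toy_actual = [0, 1, 0, 1]
toy_pred = predict_work(toy_fitted, q=0.25)
toy_acc = accuracy(toy_pred, toy_actual)
```

With the default `q=0.3285`, roughly the lowest 32.85% of fitted values are labelled as not working, which mirrors the observed share of non-working women.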


5. Discussion

In this article, we have proposed two types of weighted quantile estimators for nonlinear models with missing covariates. The asymptotic properties of our proposals have been obtained under certain conditions. Our simulation studies indicate that the proposed methods compare favorably with the existing methods. Finally, we propose some future directions. First, we only consider the estimation of unknown parameters in this paper, and future studies could address variable selection. Second, a logistic model for the selection probability function is assumed in our article. When the selection probability function is misspecified, how to derive a robust estimator of the selection probability could be a direction for further study. Third, our method could be used in the setting of Altun et al. ^[32] to estimate the unknown model parameters of the new extended gamma distribution. Lastly, how to generalize our method to the optimal reinsurance problems of Fang, Cheng and Qu ^[33] is also an interesting topic.

Acknowledgments

The research is supported by NSF projects (ZR2021MA077, ZR2021MA048 and ZR2019MA016) of Shandong Province of China.

Conflict of interest

All authors declare that there is no conflict of interest.

References

[1]	D. L. Wang, H. L. Xu, Q. Wu, Averaging versus voting: A comparative study of strategies for distributed classification, Math. Found. Comput., 3 (2020), 185–193. http://dx.doi.org/10.3934/mfc.2020017 doi: 10.3934/mfc.2020017
[2]	W. Zhao, H. Lian, S. Ma, Robust reduced-rank modeling via rank regression, J. Stat. Plan. Infer., 180 (2017), 1–12. http://dx.doi.org/10.1016/j.jspi.2016.08.009 doi: 10.1016/j.jspi.2016.08.009
[3]	F. Zhang, R. Li, H. Lian, D. Bandyopadhyay, Sparse reduced-rank regression for multivariate varying-coefficient models, J. Stat. Comput. Simul., 91 (2021), 752–767. http://dx.doi.org/10.1080/00949655.2020.1829622 doi: 10.1080/00949655.2020.1829622
[4]	T. L. Gong, C. Xu, H. Chen, Modal additive models with data-driven structure identification, Math. Found. Comput., 3 (2020), 165–183. http://dx.doi.org/10.3934/mfc.2020016 doi: 10.3934/mfc.2020016
[5]	R. Koenker, G. W. Bassett, Regression quantiles, Econometrica, 46 (1978), 33–50. http://dx.doi.org/10.2307/1913643 doi: 10.2307/1913643
[6]	H. Zou, M. Yuan, Composite quantile regression and the oracle model selection theory, Ann. Stat., 36 (2008), 1108–1126. http://dx.doi.org/10.1214/07-AOS507 doi: 10.1214/07-AOS507
[7]	Z. Zhao, Z. Xiao, Efficient regressions via optimally combining quantile information, Economet. Theory, 30 (2014), 1272–1314. http://dx.doi.org/10.1017/S0266466614000176 doi: 10.1017/S0266466614000176
[8]	R. Koenker, A note on L-estimates for linear models, Statist. Probab. Lett., 2 (1984), 323–325. http://dx.doi.org/10.1016/0167-7152(84)90040-3 doi: 10.1016/0167-7152(84)90040-3
[9]	R. Koenker, Quantile regression, Cambridge: Cambridge University Press, 2005. http://dx.doi.org/10.1017/CBO9780511754098
[10]	X. J. Jiang, J. Jiang, X. Song, Oracle model selection for nonlinear models based on weighted composite quantile regression, Stat. Sin., 22 (2012), 1479–1506. http://dx.doi.org/10.5705/ss.2010.203 doi: 10.5705/ss.2010.203
[11]	D. Bloznelis, G. Claeskens, J. Zhou, Composite versus model-averaged quantile regression, J. Stat. Plan. Infer., 200 (2019), 32–46. http://dx.doi.org/10.1016/j.jspi.2018.09.003 doi: 10.1016/j.jspi.2018.09.003
[12]	F. Yates, The analysis of replicated experiments when the field results are incomplete, Empire J. Exp. Agric., 1 (1933), 129–142.
[13]	L. Q. Xia, X. L. Wang, P. X. Zhao, Y. Q. Song, Empirical likelihood for varying coefficient partially nonlinear model with missing responses, AIMS Mathematics, 6 (2021), 7125–7152. http://dx.doi.org/10.3934/math.2021418 doi: 10.3934/math.2021418
[14]	D. G. Horvitz, D. J. Thompson, A generalization of sampling without replacement from a finite universe, J. Am. Stat. Assoc., 47 (1952), 663–685. http://dx.doi.org/10.1080/01621459.1952.10483446 doi: 10.1080/01621459.1952.10483446
[15]	D. B. Rubin, Inference and missing data, Biometrika, 63 (1976), 581–592. http://dx.doi.org/10.1093/biomet/63.3.581 doi: 10.1093/biomet/63.3.581
[16]	R. J. A. Little, D. B. Rubin, Statistical analysis with missing data, 2 Eds., New York: Wiley, 2002. http://dx.doi.org/10.1002/9781119013563
[17]	J. M. Robins, A. Rotnitzky, L. P. Zhao, Estimation of regression coefficients when some regressors are not always observed, J. Am. Stat. Assoc., 89 (1994), 846–866. http://dx.doi.org/10.2307/2290910 doi: 10.2307/2290910
[18]	J. G. Ibrahim, H. T. Zhu, N. S. Tang, Model selection criteria for missing data problems via the EM algorithm, J. Am. Stat. Assoc., 103 (2008), 1648–1658. http://dx.doi.org/10.1198/016214508000001057 doi: 10.1198/016214508000001057
[19]	J. Qin, J. Shao, B. Zhang, Efficient and doubly robust imputation for covariate-dependent missing responses, J. Am. Stat. Assoc., 103 (2008), 797–810. http://dx.doi.org/10.1198/016214508000000238 doi: 10.1198/016214508000000238
[20]	B. Sherwood, L. Wang, X. H. Zhou, Weighted quantile regression for analyzing health care cost data with missing covariates, Stat. Med., 32 (2013), 4967–4979. http://dx.doi.org/10.1002/sim.5883 doi: 10.1002/sim.5883
[21]	X. R. Chen, A. T. Wan, Y. Zhou, Efficient quantile regression analysis with missing observations, J. Am. Stat. Assoc., 110 (2015), 723–741. http://dx.doi.org/10.1080/01621459.2014.928219 doi: 10.1080/01621459.2014.928219
[22]	H. Yang, H. L. Liu, Penalized weighted composite quantile estimators with missing covariates, Stat. Papers, 57 (2014), 69–88. http://dx.doi.org/10.1007/s00362-014-0642-2 doi: 10.1007/s00362-014-0642-2
[23]	X. L. Wang, Y. Q. Song, S. X. Zhang, An efficient estimation for the parameter in additive partially linear models with missing covariates, J. Korean Stat. Soc., 49 (2020), 779–801. http://dx.doi.org/10.1007/s42952-019-00036-6 doi: 10.1007/s42952-019-00036-6
[24]	X. J. Jiang, J. Z. Li, T. Xia, W. F. Yan, Robust and efficient estimation with weighted composite quantile regression, Physica A, 457 (2016), 413–423. http://dx.doi.org/10.1016/j.physa.2016.03.056 doi: 10.1016/j.physa.2016.03.056
[25]	K. Zhao, H. Lian, A note on the efficiency of composite quantile regression, J. Stat. Comput. Simul., 86 (2016), 1334–1341. http://dx.doi.org/10.1080/00949655.2015.1062096 doi: 10.1080/00949655.2015.1062096
[26]	W. Zhao, H. Lian, M. Chen, X. Song, Composite quantile regression for correlated data, Comput. Stat. Data Anal., 109 (2009), 15–33. http://dx.doi.org/10.1016/j.csda.2016.11.015 doi: 10.1016/j.csda.2016.11.015
[27]	X. L. Wang, F. Chen, L. Lin, Empirical likelihood inference for estimating equation with missing data, Sci. China. Math., 56 (2013), 1233–1245. http://dx.doi.org/10.1007/s11425-012-4504-x doi: 10.1007/s11425-012-4504-x
[28]	D. Ruppert, S. J. Sheather, M. P. Wand, An effective bandwidth selector for local least squares regression, J. Am. Stat. Assoc., 90 (1995), 1257–1270. http://dx.doi.org/10.1080/01621459.1995.10476630 doi: 10.1080/01621459.1995.10476630
[29]	C. F. Baum, An introduction to modern econometrics using Stata, Texas: Stata Press, 2006.
[30]	Y. Li, J. Ding, Weighted composite quantile regression method via empirical likelihood for non linear models, Commun. Stat.-Theor. M., 47 (2018), 4286–4296. http://dx.doi.org/10.1080/03610926.2017.1373816 doi: 10.1080/03610926.2017.1373816
[31]	J. Sun, Q. H. Sun, An improved and efficient estimation method for varying-coefficient model with missing covariates, Statist. Probab. Lett., 105 (2015), 296–303. http://dx.doi.org/10.1016/j.spl.2015.09.009 doi: 10.1016/j.spl.2015.09.009
[32]	E. Altun, M. Korkmaz, M. Elmorshedy, M. S. Eliwa, The extended gamma distribution with regression model and applications, AIMS Mathematics, 6 (2021), 2418–2439. http://dx.doi.org/10.3934/math.2021147 doi: 10.3934/math.2021147
[33]	Y. Fang, G. Cheng, Z. F. Qu, Optimal reinsurance for both an insurer and a reinsurer under general premium principles, AIMS Mathematics, 5 (2020), 3231–3255. http://dx.doi.org/10.3934/math.2020208 doi: 10.3934/math.2020208
[34]	K. Knight, Limiting distributions for L1 regression estimators under general conditions, Ann. Statist., 26 (1998), 755–770. http://dx.doi.org/10.1214/aos/1028144858 doi: 10.1214/aos/1028144858
[35]	H. Wong, S. Guo, M. Chen, W. C. Ip, On locally weighted estimation and hypothesis testing of varying-coefficient models with missing covariates, J. Stat. Plan. Infer., 139 (2009), 2933–2951. http://dx.doi.org/10.1016/j.jspi.2009.01.016 doi: 10.1016/j.jspi.2009.01.016

This article has been cited by:

1.	Tahir Mahmood, Muhammad Riaz, Anam Iqbal, Kabwe Mulenga, An improved statistical approach to compare means, 2023, 8, 2473-6988, 4596, 10.3934/math.2023227
2.	Cecilia Castro, Víctor Leiva, Maria do Carmo Lourenço-Gomes, Ana Paula Amorim, Advanced Mathematical Approaches in Psycholinguistic Data Analysis: A Methodological Insight, 2023, 7, 2504-3110, 670, 10.3390/fractalfract7090670
3.	Qiang Zhao, Zhaodi Wang, Jingjing Wu, Xiuli Wang, Weighted expectile average estimation of linear models with missing covariates, 2024, 0361-0918, 1, 10.1080/03610918.2024.2405566

© 2022 the Author(s), licensee AIMS Press. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0)
