Research article

Weighted expectile average estimation based on CBPS with responses missing at random

  • Received: 22 February 2024 Revised: 08 July 2024 Accepted: 22 July 2024 Published: 29 July 2024
  • MSC : 62F10, 62F12

  • An improved weighted expectile average estimator of the regression coefficients is obtained based on the covariate balancing propensity score (CBPS) when the responses in a linear model are missing at random. The asymptotic normality of the proposed estimator is established, and its performance is further illustrated by numerical simulation.

    Citation: Qiang Zhao, Zhaodi Wang, Jingjing Wu, Xiuli Wang. Weighted expectile average estimation based on CBPS with responses missing at random[J]. AIMS Mathematics, 2024, 9(8): 23088-23099. doi: 10.3934/math.20241122




    In practical applications, due to the interference of various factors, collected data are often incomplete. Missing data are common in public opinion polls, medical research, experimental science, and other applied fields. Missing data not only reduce the amount of effective information and bias the estimation results, but also affect statistical decision-making and, to some extent, distort the analysis. One approach to dealing with missing data is complete-case analysis, which deletes all incomplete records. However, Little and Rubin [1] pointed out that this causes biased estimation when the missingness is not completely at random. Yates [2] introduced an imputation method that is widely used to handle missing responses. The purpose of this method is to find suitable values with which to fill in the missing data; the imputed data are then treated as complete observations and analyzed by classical methods. Inverse probability weighting (IPW), proposed by Horvitz and Thompson [3], is another method for dealing with missing data: the inverse of the selection probability is used as the weight assigned to the fully observed data. The missing at random (MAR) assumption, in the sense of Robins et al. [4], is a common assumption for statistical analysis with missing data.

    When data are missing, the missing-data mechanism is usually unknown, and parametric and nonparametric methods are commonly used to estimate it. Parametric methods, however, may suffer from model misspecification. Imai and Ratkovic [5] proposed the covariate balancing propensity score (CBPS), which improves the parametric approach. Based on the CBPS method, Guo et al. [6] obtained estimators of the regression parameter $\beta$ and the mean $\mu$ in mean regression with missing data.

    Expectile regression, which was proposed by Newey and Powell [7], can be regarded as a generalization of mean regression. Expectile regression uses the sum of asymmetrically weighted squared residuals as the loss function, and since this loss function is convex and differentiable, expectile regression has computational advantages over quantile regression. There has been substantial research on expectile regression in recent years. Sobotka et al. [8] established the asymptotic properties of a semiparametric expectile regression estimator and introduced confidence intervals for expectiles. Waltrup et al. [9] observed that expectile regression tends to exhibit less crossing and more robustness against heavy-tailed distributions than quantile regression. Ziegel [10] showed that expectiles possess both coherence and elicitability. Pan et al. [11] considered fitting a linear expectile regression model for estimating conditional expectiles based on large-scale data with covariates missing at random. Recently, Pan et al. [12] developed a weighted expectile regression approach for estimating the conditional expectile when covariates are missing at random (MAR). They considered only a single expectile, and the missing mechanism was assumed to follow a logistic regression model, which may be misspecified. In addition, it is known that making full use of information at multiple expectile levels can improve the efficiency of parameter estimation. In summary, when the missingness model may be misspecified, we use the idea of covariate balancing to study the weighted expectile average estimation of the unknown parameters based on CBPS, combining information from multiple expectiles. Our estimator improves on the usual weighted expectile average estimator in terms of standard deviation (SD) and mean squared error (MSE).

    The rest of this paper is organized as follows. In Section 2, we propose a CBPS-based estimator of the propensity score. In Section 3, we construct the CBPS-based weighted expectile average estimator of the regression parameters. In Section 4, we establish the asymptotic normality of the weighted estimator. In Section 5, a simulation study is carried out to assess the performance of the proposed method. The proofs of the theoretical results are deferred to the Appendix.

    Consider the following linear regression model:

    $$Y_i=X_i^T\beta+\varepsilon_i,\quad i=1,2,\ldots,n, \tag{2.1}$$

    where $Y_i$ is the response, $X_i$ is the covariate vector, $\beta$ is the $p$-dimensional vector of unknown parameters, and $\varepsilon_i$ is the random error. We assume that the response variable $Y_i$ may be missing at random while the covariate $X_i$ is always fully observed. For the $i$th individual, let $\delta_i$ denote the observation indicator, i.e., $\delta_i=1$ if $Y_i$ is observed and $\delta_i=0$ otherwise. In this paper, we only consider the missing at random (MAR) mechanism, that is,

    $$P(\delta_i=1\mid X_i,Y_i)=\pi(X_i)\equiv\pi_i, \tag{2.2}$$

    where πi is called the selection probability function or the propensity score.

    The most popular choice of π(Xi) is a logistic regression function (Peng et al. [13]). We make the same choice and posit a logistic regression model for π(Xi),

    $$\pi(X_i,\gamma)=\frac{\exp(\gamma_0+X_i^T\gamma_1)}{1+\exp(\gamma_0+X_i^T\gamma_1)}, \tag{2.3}$$

    where $\gamma=(\gamma_0,\gamma_1^T)^T\in\Theta$ is the unknown parameter vector with parameter space $\Theta\subset\mathbb{R}^{q+1}$. Here, $\gamma$ can be estimated by maximizing the log-likelihood function

    $$L(\gamma)=\sum_{i=1}^n\big\{\delta_i\log\pi(X_i,\gamma)+(1-\delta_i)\log(1-\pi(X_i,\gamma))\big\}.$$
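    As a concrete illustration, the following is a minimal Python sketch (not taken from the paper) of this maximum likelihood step. The data layout (an $n\times q$ covariate matrix `X` and a 0/1 indicator vector `delta`), the function names, and the use of `scipy.optimize.minimize` are assumptions made for the example.

```python
import numpy as np
from scipy.optimize import minimize

def pi_logistic(X, gamma):
    """Propensity score pi(X_i, gamma) of model (2.3): expit(gamma_0 + X_i^T gamma_1)."""
    eta = gamma[0] + X @ gamma[1:]
    return 1.0 / (1.0 + np.exp(-eta))

def neg_log_likelihood(gamma, X, delta):
    """Negative of L(gamma) = sum_i {delta_i log pi_i + (1 - delta_i) log(1 - pi_i)}."""
    p = np.clip(pi_logistic(X, gamma), 1e-10, 1 - 1e-10)
    return -np.sum(delta * np.log(p) + (1 - delta) * np.log(1 - p))

def fit_gamma_mle(X, delta):
    """Estimate gamma = (gamma_0, gamma_1^T)^T by numerically maximizing L(gamma)."""
    gamma_init = np.zeros(X.shape[1] + 1)
    res = minimize(neg_log_likelihood, gamma_init, args=(X, delta), method="BFGS")
    return res.x
```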

    Assuming that π(Xi,γ) is twice continuously differentiable with respect to γ, maximizing L(γ) implies the first-order condition

    $$\frac{1}{n}\sum_{i=1}^n s(\delta_i,X_i,\gamma)=0,\qquad s(\delta_i,X_i,\gamma)=\frac{\delta_i\,\pi'(X_i,\gamma)}{\pi(X_i,\gamma)}-\frac{(1-\delta_i)\,\pi'(X_i,\gamma)}{1-\pi(X_i,\gamma)}, \tag{2.4}$$

    where $\pi'(X_i,\gamma)=\partial\pi(X_i,\gamma)/\partial\gamma^T$. The maximum likelihood method is a commonly used and simple parameter estimation method. However, when the selection probability model (2.3) is misspecified, the resulting estimator may be severely biased. To make the parametric approach more robust, we use the covariate balancing propensity score method proposed by Imai and Ratkovic [5] to estimate the unknown parameter $\gamma$, that is,

    $$E\left\{\frac{\delta_i\tilde{X}_i}{\pi(X_i,\gamma)}-\frac{(1-\delta_i)\tilde{X}_i}{1-\pi(X_i,\gamma)}\right\}=0. \tag{2.5}$$

    Here, $\tilde{X}_i=f(X_i)$ is an $M$-dimensional vector-valued measurable function of $X_i$. For any such covariate function, as long as the expectation exists, Eq (2.5) must hold. If the propensity score model is incorrectly specified, maximum likelihood may fail to balance the covariates. Following Imai and Ratkovic [5], we can set $\tilde{X}_i=X_i$ to ensure that the first moment of each covariate is balanced even when the model is misspecified, so that $\pi(X_i,\gamma)$ satisfies the condition

    $$E\left\{\frac{\delta_iX_i}{\pi(X_i,\gamma)}-\frac{(1-\delta_i)X_i}{1-\pi(X_i,\gamma)}\right\}=0. \tag{2.6}$$

    The sample version of the covariate balancing condition obtained from (2.6) is

    $$\frac{1}{n}\sum_{i=1}^n z(\delta_i,X_i,\gamma)X_i=0, \tag{2.7}$$

    where

    $$z(\delta_i,X_i,\gamma)=\frac{\delta_i-\pi(X_i,\gamma)}{\pi(X_i,\gamma)(1-\pi(X_i,\gamma))}.$$

    According to Imai and Ratkovic [5], if we use only the balancing condition implied by the score of $\pi(X_i,\gamma)$, i.e., (2.4), the number of equations equals the number of parameters, and the covariate balancing propensity score is just-identified. If we combine the score condition (2.4) with the covariate balancing condition (2.7) and define

    $$\bar{U}(\gamma)=\frac{1}{n}\sum_{i=1}^n U(\delta_i,X_i,\gamma), \tag{2.8}$$

    where

    $$U(\delta_i,X_i,\gamma)=\begin{pmatrix}s(\delta_i,X_i,\gamma)\\ z(\delta_i,X_i,\gamma)X_i\end{pmatrix},$$

    then the covariate balancing propensity score is over-identified, because the number of moment conditions exceeds the number of parameters. For the over-identified CBPS, the estimator of $\gamma$ can be obtained by the generalized method of moments (GMM) (Hansen [14]). For a positive semidefinite symmetric weight matrix $W$, the GMM estimator $\hat{\gamma}$ is obtained by minimizing the following objective function over $\gamma$:

    $$Q(\gamma)=\bar{U}^T(\gamma)\,W\,\bar{U}(\gamma). \tag{2.9}$$

    The above method is also applicable to the case where the covariate balancing propensity score is just-identified.
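    To make the estimation step concrete, here is a minimal sketch of the over-identified CBPS-GMM estimator of $\gamma$ described above. It reuses `pi_logistic` and `fit_gamma_mle` from the earlier sketch; the identity weight matrix $W$ and the choice of optimizer are simplifying assumptions for illustration, not specifications from the paper (a two-step GMM weight could be used instead).

```python
import numpy as np
from scipy.optimize import minimize

def moment_conditions(gamma, X, delta):
    """Per-observation moments U(delta_i, X_i, gamma): the logistic score s_i
    stacked with the balancing part z_i * X_i, as in Eq. (2.8)."""
    p = np.clip(pi_logistic(X, gamma), 1e-6, 1 - 1e-6)
    resid = delta - p                                  # delta_i - pi_i
    X1 = np.hstack([np.ones((X.shape[0], 1)), X])      # (1, X_i^T)
    score = resid[:, None] * X1                        # for the logistic model (2.3), s_i = (delta_i - pi_i)(1, X_i^T)
    balance = (resid / (p * (1 - p)))[:, None] * X     # z_i * X_i from Eq. (2.7)
    return np.hstack([score, balance])

def cbps_objective(gamma, X, delta, W=None):
    """GMM objective (2.9): Q(gamma) = U_bar^T W U_bar."""
    Ubar = moment_conditions(gamma, X, delta).mean(axis=0)
    W = np.eye(Ubar.size) if W is None else W
    return Ubar @ W @ Ubar

def fit_gamma_cbps(X, delta):
    """Minimize Q(gamma), warm-started at the logistic MLE."""
    start = fit_gamma_mle(X, delta)
    res = minimize(cbps_objective, start, args=(X, delta), method="Nelder-Mead")
    return res.x
```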

    Pan et al. [12] introduced the weighted expectile regression estimation of a linear model in detail. Following the idea of inverse probability weighting, when the selection probability function $(\pi_1,\ldots,\pi_n)^T$ is known, the expectile estimator of $\beta$ under missing responses is defined as

    $$(\hat{\beta}_{\tau_k,T},\hat{b}_{\tau_k})=\arg\min_{\beta,b_{\tau_k}}\sum_{i=1}^n\frac{\delta_i}{\pi(X_i,\gamma)}\Phi_{\tau_k}(Y_i-X_i^T\beta-b_{\tau_k}), \tag{3.1}$$

    where $\tau_k\in(0,1)$ is the expectile level, $\Phi_{\tau_k}(v)=|\tau_k-I(v<0)|v^2$, and $b_{\tau_k}$ is the $\tau_k$-expectile of the error term $\varepsilon_i$. Then, following Zhao [15], let $K$ be the number of expectiles and consider the equally spaced levels $\tau_k=\frac{k}{K+1}$, $k=1,2,\ldots,K$. The weighted expectile average estimator of the linear model parameter $\beta$ when the missing mechanism is known is defined as

    $$\hat{\beta}=\sum_{k=1}^K\omega_k\hat{\beta}_{\tau_k,T},$$

    where the weight vector $(\omega_1,\ldots,\omega_K)^T$ satisfies $\sum_{k=1}^K\omega_k=1$.

    When the selection probability function is unknown, we use the CBPS-based method proposed in Section 2 to estimate the parameter $\gamma$ and thus obtain $\pi(X_i,\hat{\gamma})$. The loss function at the $\tau_k$-expectile can then be defined as

    $$L_n(\beta_{\tau_k},b_{\tau_k})=\sum_{i=1}^n\frac{\delta_i}{\pi(X_i,\hat{\gamma})}\Phi_{\tau_k}(Y_i-X_i^T\beta_{\tau_k}-b_{\tau_k}).$$

    By minimizing this loss function, we obtain the expectile estimator of the unknown parameter $\beta$,

    $$(\hat{\beta}_{\tau_k},\hat{b}_{\tau_k})=\arg\min_{\beta_{\tau_k},b_{\tau_k}}L_n(\beta_{\tau_k},b_{\tau_k}). \tag{3.2}$$

    Therefore, when the missing mechanism is unknown, the weighted expectile average estimator of the linear model parameter $\beta$ under missing responses is defined as

    $$\hat{\beta}_w=\sum_{k=1}^K\omega_k\hat{\beta}_{\tau_k}. \tag{3.3}$$

    Again, the weight vector $(\omega_1,\ldots,\omega_K)^T$ satisfies $\sum_{k=1}^K\omega_k=1$.
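    The following minimal sketch puts (3.2) and (3.3) together, again building on the CBPS sketch above. The equal weights $\omega_k=1/K$, the BFGS optimizer, and the handling of unobserved responses are illustrative assumptions; the paper does not prescribe these choices.

```python
import numpy as np
from scipy.optimize import minimize

def expectile_loss(v, tau):
    """Phi_tau(v) = |tau - I(v < 0)| * v^2, the asymmetric squared loss."""
    return np.abs(tau - (v < 0)) * v ** 2

def ipw_expectile_fit(Y, X, delta, pi_hat, tau):
    """Minimize the IPW expectile loss L_n(beta_tau, b_tau) of (3.2).
    Only observed rows (delta_i = 1) enter the sum, so missing Y_i are never used."""
    obs = delta == 1
    Yo, Xo, wo = Y[obs], X[obs], 1.0 / pi_hat[obs]
    p = X.shape[1]

    def loss(theta):
        beta, b = theta[:p], theta[p]
        return np.sum(wo * expectile_loss(Yo - Xo @ beta - b, tau))

    res = minimize(loss, np.zeros(p + 1), method="BFGS")
    return res.x[:p], res.x[p]                        # (beta_hat_{tau_k}, b_hat_{tau_k})

def weighted_expectile_average(Y, X, delta, pi_hat, K=10, omega=None):
    """beta_hat_w = sum_k omega_k * beta_hat_{tau_k} with tau_k = k/(K+1), Eq. (3.3)."""
    taus = np.arange(1, K + 1) / (K + 1)
    omega = np.full(K, 1.0 / K) if omega is None else np.asarray(omega)
    betas = np.array([ipw_expectile_fit(Y, X, delta, pi_hat, t)[0] for t in taus])
    return omega @ betas
```

    In this sketch, `pi_hat` would be obtained as `pi_logistic(X, fit_gamma_cbps(X, delta))` from the previous sketches.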

    Let $\gamma_0$ and $\beta_0$ denote the true values of $\gamma$ and $\beta$, respectively, and let $U(\gamma)=\begin{pmatrix}s(\delta,X,\gamma)\\ z(\delta,X,\gamma)X\end{pmatrix}$. In addition, following Pan et al. [12] and Guo [16], the following regularity conditions are required.

    C1: $\gamma_0$ is an interior point of $\Theta$.

    C2: $U(\gamma)$ is differentiable in a neighborhood of $\gamma_0$.

    C3: $E[U(\gamma_0)]=0$ and $E[\|U(\gamma_0)\|^2]<\infty$.

    C4: $E[\sup_{\gamma}\|\partial_\gamma U(\gamma)\|]<\infty$, where $\partial_\gamma$ denotes the first-order partial derivative with respect to $\gamma$.

    C5: $\Gamma=E[\partial_\gamma U(\gamma)]$ exists.

    C6: There exists a compact set $\mathcal{X}\subset\mathbb{R}^p$ such that $X_i\in\mathcal{X}$ for all $i$, and $X_i$ and $\varepsilon_i$ are independent.

    C7: The regression errors $\{\varepsilon_i\}_{i=1}^n$ are independent and identically distributed with common cumulative distribution function $F(\cdot)$, satisfying $E[\varepsilon_i^2]<\infty$.

    C8: There exists $a>0$ such that $\pi(X_i,\gamma)\geq a$ for all $i$.

    C9: The symmetric matrix $\Sigma_1$ is positive definite.

    The following theorem presents the asymptotic distribution of ˆβw.

    Theorem 4.1 (Asymptotic normality of $\hat{\beta}_w$). Under conditions C1–C9, we have

    $$\sqrt{n}(\hat{\beta}_w-\beta_0)\xrightarrow{d}N(0,\Sigma_1^{-1}\Lambda\Sigma_1^{-1}),$$

    where $\Sigma_1=E[X_iX_i^T]$, $\Lambda=E[\lambda\lambda^T]$, $\lambda=\mu-E[\partial\mu/\partial\gamma^T]\{E[\partial U(\gamma)/\partial\gamma^T]\}^{-1}U(\gamma)$, and $\mu=\frac{\delta}{\pi(X,\gamma)}X\sum_{k=1}^K\omega_k\frac{\Psi_{\tau_k}(\varepsilon-b_{0k})}{2g(\tau_k)}$, with $\Psi_{\tau_k}$ and $g(\tau_k)$ defined in the Appendix.

    In this section, the CBPS-based weighted expectile average estimator proposed in this paper is evaluated by numerical simulation and compared with the usual estimation methods under both correctly specified and misspecified propensity score models. Consider the following linear model:

    $$Y=\beta_1X_1+\beta_2X_2+\beta_3X_3+\varepsilon, \tag{5.1}$$

    where $\beta_1=0.5$, $\beta_2=1$, $\beta_3=1$, and $(X_1,X_2,X_3)$ follows a joint normal distribution with mean $0$, pairwise covariance $0.5$, and unit variances. The error term $\varepsilon$ follows the standard normal distribution. In our simulation, we take $K=10$ and $\tau_k=k/11$ for $k=1,2,\ldots,10$, and consider the true selection probability model

    $$\pi(X_1,X_2,X_3)=\frac{\exp(0.3X_1+0.25X_2+0.25X_3)}{1+\exp(0.3X_1+0.25X_2+0.25X_3)}. \tag{5.2}$$

    Under the MAR assumption, in order to illustrate the effect of model misspecification, we transform the covariates as

    $$(X_1,X_2,X_3)\;\mapsto\;\big\{\exp(X_1/2),\;X_2/\{1+\exp(X_1)\}+10,\;(X_1X_3/25+0.6)^3\big\}.$$

    If the selection model (5.2) is fitted using these transformed covariates in place of $(X_1,X_2,X_3)$, the propensity score model is misspecified. In the simulation study of the expectile regression of the unknown parameter $\beta$, we consider the following two cases: (1) the propensity score model is correctly specified; (2) the propensity score model is misspecified. Zhao [15] proposed the weighted composite expectile regression method for a varying-coefficient partially linear model. For each scenario, following Zhao [15], we compare the weighted expectile average estimator based on CBPS, denoted CBPS-WEAE, with weighted composite expectile regression, denoted WCER, and weighted composite quantile regression, denoted WCQR, to examine the performance of the estimators, where the weights of WCER and WCQR are estimated by a generalized linear model.

    In the simulation, samples of size $n=500, 800, 1000, 1200$ are generated independently. For each scenario, we conduct 1000 replications and report the average mean squared error (MSE) of the estimator of $\beta$ and the average bias (Bias) and standard deviation (SD) of the estimators of $\beta_1$, $\beta_2$, and $\beta_3$. To examine the influence of the error distribution on the performance of the proposed method, two different distributions of the model error $\varepsilon$ are considered: the standard normal distribution $N(0,1)$ and the centralized $\chi^2$ distribution with 4 degrees of freedom. The results of our simulations are presented in Tables 1 and 2.
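    For reference, the data-generating design described above can be sketched as follows. The function names are illustrative assumptions, and the $\chi^2(4)$ errors are centred at their mean of 4 as stated in the text.

```python
import numpy as np

def generate_data(n, error="normal", rng=None):
    """Generate (X, Y, delta) from model (5.1) with the true selection model (5.2)."""
    rng = np.random.default_rng() if rng is None else rng
    beta = np.array([0.5, 1.0, 1.0])
    Sigma = 0.5 * np.ones((3, 3)) + 0.5 * np.eye(3)    # variances 1, covariances 0.5
    X = rng.multivariate_normal(np.zeros(3), Sigma, size=n)
    eps = rng.standard_normal(n) if error == "normal" else rng.chisquare(4, size=n) - 4.0
    Y = X @ beta + eps
    eta = 0.3 * X[:, 0] + 0.25 * X[:, 1] + 0.25 * X[:, 2]
    pi_true = 1.0 / (1.0 + np.exp(-eta))               # Eq. (5.2)
    delta = rng.binomial(1, pi_true)                    # MAR missingness indicator
    Y = np.where(delta == 1, Y, np.nan)                 # responses missing at random
    return X, Y, delta

def misspecified_covariates(X):
    """Transformed covariates used when the propensity model is misspecified."""
    X1, X2, X3 = X[:, 0], X[:, 1], X[:, 2]
    return np.column_stack([np.exp(X1 / 2.0),
                            X2 / (1.0 + np.exp(X1)) + 10.0,
                            (X1 * X3 / 25.0 + 0.6) ** 3])
```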

    Table 1. Simulation results (×100) under the error $\varepsilon\sim N(0,1)$.

    | n | Model | Method | MSE | β1 Bias | β1 SD | β2 Bias | β2 SD | β3 Bias | β3 SD |
    |---|---|---|---|---|---|---|---|---|---|
    | 500 | correct | WCQR | 2.437 | -0.100 | 9.038 | -0.221 | 9.041 | 2.498 | 8.966 |
    | 500 | correct | WCER | 2.155 | -0.172 | 8.533 | -0.303 | 8.384 | 0.366 | 8.512 |
    | 500 | correct | CBPS-WEAE | 2.139 | 0.023 | 8.490 | -0.433 | 8.355 | -0.218 | 8.488 |
    | 500 | incorrect | WCQR | 2.371 | 0.731 | 8.908 | -0.616 | 8.866 | 2.498 | 8.855 |
    | 500 | incorrect | WCER | 2.256 | 0.518 | 8.642 | -0.471 | 8.698 | 0.366 | 8.658 |
    | 500 | incorrect | CBPS-WEAE | 2.122 | 0.348 | 8.382 | -0.680 | 8.547 | -0.104 | 8.280 |
    | 800 | correct | WCQR | 1.490 | -0.033 | 6.944 | 0.105 | 7.190 | 2.498 | 7.011 |
    | 800 | correct | WCER | 1.380 | -0.012 | 6.616 | -0.036 | 6.886 | 0.366 | 6.844 |
    | 800 | correct | CBPS-WEAE | 1.356 | 0.311 | 6.569 | -0.219 | 6.931 | 0.076 | 6.663 |
    | 800 | incorrect | WCQR | 1.392 | 0.474 | 6.729 | -0.176 | 6.689 | 2.498 | 6.997 |
    | 800 | incorrect | WCER | 1.357 | 0.266 | 6.980 | 0.291 | 6.455 | 0.366 | 6.732 |
    | 800 | incorrect | CBPS-WEAE | 1.310 | -0.262 | 6.676 | 0.098 | 6.536 | -0.285 | 6.609 |
    | 1000 | correct | WCQR | 1.491 | 0.123 | 6.156 | -0.143 | 6.427 | 2.498 | 6.375 |
    | 1000 | correct | WCER | 1.107 | 0.067 | 6.008 | 0.003 | 6.182 | 0.366 | 6.044 |
    | 1000 | correct | CBPS-WEAE | 1.098 | 3.303 | 6.094 | -0.296 | 5.973 | -0.260 | 6.069 |
    | 1000 | incorrect | WCQR | 1.202 | 0.037 | 6.497 | -0.307 | 6.477 | 2.498 | 6.000 |
    | 1000 | incorrect | WCER | 1.172 | -0.155 | 6.252 | 0.213 | 6.452 | 0.366 | 6.042 |
    | 1000 | incorrect | CBPS-WEAE | 1.122 | -0.483 | 6.137 | -0.070 | 6.209 | -0.279 | 5.985 |
    | 1200 | correct | WCQR | 1.033 | 0.021 | 5.819 | 0.136 | 5.967 | 2.498 | 5.824 |
    | 1200 | correct | WCER | 0.902 | 0.132 | 5.513 | 0.027 | 5.470 | 0.366 | 5.472 |
    | 1200 | correct | CBPS-WEAE | 0.898 | 0.403 | 5.486 | -0.115 | 5.487 | -0.267 | 5.421 |
    | 1200 | incorrect | WCQR | 1.005 | 0.401 | 5.968 | -0.347 | 5.687 | 2.498 | 5.681 |
    | 1200 | incorrect | WCER | 0.960 | 0.117 | 5.682 | -0.027 | 5.578 | 0.366 | 5.712 |
    | 1200 | incorrect | CBPS-WEAE | 0.923 | -0.118 | 5.611 | -0.290 | 5.451 | -0.119 | 5.571 |

    Table 2. Simulation results (×100) under the error $\varepsilon\sim\chi^2(4)$ (centralized).

    | n | Model | Method | MSE | β1 Bias | β1 SD | β2 Bias | β2 SD | β3 Bias | β3 SD |
    |---|---|---|---|---|---|---|---|---|---|
    | 500 | correct | WCQR | 42.007 | 1.385 | 36.000 | 1.970 | 37.445 | 2.498 | 38.742 |
    | 500 | correct | WCER | 19.889 | 0.282 | 25.231 | 0.767 | 26.285 | 0.366 | 25.743 |
    | 500 | correct | CBPS-WEAE | 18.424 | -3.970 | 24.657 | 10.138 | 24.950 | -0.236 | 24.431 |
    | 500 | incorrect | WCQR | 37.935 | 4.173 | 36.241 | 0.443 | 35.828 | 2.498 | 34.384 |
    | 500 | incorrect | WCER | 19.472 | 1.481 | 25.942 | -0.235 | 25.159 | 0.366 | 25.316 |
    | 500 | incorrect | CBPS-WEAE | 19.078 | -3.454 | 25.124 | -3.294 | 24.263 | -4.148 | 25.491 |
    | 800 | correct | WCQR | 33.040 | 1.769 | 33.706 | 1.452 | 32.203 | 2.498 | 33.593 |
    | 800 | correct | WCER | 12.696 | 1.142 | 21.067 | 0.887 | 20.695 | 0.366 | 19.901 |
    | 800 | correct | CBPS-WEAE | 12.471 | -4.204 | 21.122 | 1.985 | 19.963 | -0.534 | 19.540 |
    | 800 | incorrect | WCQR | 32.691 | 2.495 | 34.420 | 1.215 | 32.328 | 2.498 | 32.158 |
    | 800 | incorrect | WCER | 13.098 | -0.743 | 21.348 | 0.776 | 20.046 | 0.366 | 21.269 |
    | 800 | incorrect | CBPS-WEAE | 12.594 | -4.931 | 20.047 | -2.309 | 20.593 | -3.187 | 19.872 |
    | 1000 | correct | WCQR | 31.334 | 2.961 | 32.292 | -1.051 | 32.338 | 2.498 | 32.208 |
    | 1000 | correct | WCER | 12.647 | 1.554 | 21.038 | 0.214 | 19.026 | 0.366 | 21.370 |
    | 1000 | correct | CBPS-WEAE | 9.456 | -3.280 | 18.555 | 0.380 | 16.798 | 0.746 | 17.568 |
    | 1000 | incorrect | WCQR | 31.671 | 4.760 | 33.762 | 0.512 | 31.422 | 2.498 | 31.931 |
    | 1000 | incorrect | WCER | 11.049 | 0.102 | 18.822 | -0.729 | 17.908 | 0.366 | 20.694 |
    | 1000 | incorrect | CBPS-WEAE | 9.811 | -2.869 | 18.456 | -3.676 | 17.077 | -2.495 | 17.939 |
    | 1200 | correct | WCQR | 29.885 | -0.103 | 31.443 | 2.665 | 30.750 | 2.498 | 32.389 |
    | 1200 | correct | WCER | 11.241 | -0.516 | 18.673 | 1.694 | 18.976 | 0.366 | 20.328 |
    | 1200 | correct | CBPS-WEAE | 8.751 | -5.023 | 17.754 | 1.652 | 16.251 | -0.198 | 16.391 |
    | 1200 | incorrect | WCQR | 31.091 | 1.617 | 33.886 | 3.476 | 30.736 | 2.498 | 31.688 |
    | 1200 | incorrect | WCER | 10.297 | 0.207 | 18.428 | 0.508 | 18.222 | 0.366 | 18.942 |
    | 1200 | incorrect | CBPS-WEAE | 9.455 | -4.175 | 17.484 | -2.462 | 16.866 | -3.193 | 17.961 |


    From Tables 1 and 2 we observe that, as expected, all three estimators are approximately unbiased. In terms of MSE, as a convenient summary of average error, when the model error $\varepsilon$ follows the standard normal distribution $N(0,1)$, CBPS-WEAE performs best among the three estimators considered, followed by WCER, while WCQR performs worst. When $\varepsilon$ follows the centralized $\chi^2$ distribution with 4 degrees of freedom, CBPS-WEAE is again superior to the other two methods. As the sample size increases, the performance of all three estimators improves markedly compared with the smaller samples. In general, our proposed improved estimator is effective.

    In this paper, in order to improve the estimation efficiency of weighted expectile average estimation, we estimate the selection probability function based on CBPS and propose a weighted expectile average estimator based on CBPS when the response variables are missing at random. The asymptotic normality of the proposed method is proved, and the estimation effect of the method is further illustrated by numerical simulation. The numerical simulation results show that the method is effective.

    Qiang Zhao: Conceptualization, methodology, supervision, writing-review and editing; Zhaodi Wang: Validation, software, writing-original draft; Jingjing Wu: Funding acquisition, formal analysis, writing-original draft; Xiuli Wang: Funding acquisition, investigation, resources, writing-review and editing. All authors have read and approved the final version of the manuscript for publication.

    The authors declare they have not used artificial intelligence (AI) tools in the creation of this article.

    The research is supported by Natural Science Foundation of Shandong Province (Grant Nos. ZR2021MA077 and ZR2021MA048).

    All authors declare that there is no conflict of interest.

    Define the following symbols:

    $$\eta_i=\frac{\delta_i}{\pi(X_i,\gamma)}X_i\Psi_{\tau_k}(\varepsilon_i),$$

    $$\hat{\eta}_i=\frac{\delta_i}{\pi(X_i,\hat{\gamma})}X_i\Psi_{\tau_k}(\varepsilon_i),$$

    $$F_n=\frac{1}{\sqrt{n}}\sum_{i=1}^n\hat{\eta}_i,$$

    $$\varepsilon_i=Y_i-X_i^T\beta_0,$$

    $$\omega=(\omega_1,\omega_2,\ldots,\omega_K)^T,$$

    $$\Sigma_1=E[X_iX_i^T],$$

    $$\Psi_{\tau_k}(v)=2|\tau_k-I(v<0)|v,$$

    $$u=(u_1,u_2,\ldots,u_p)^T,$$

    $$G_n(u)=\sum_{i=1}^n\frac{\delta_i}{\pi(X_i,\hat{\gamma})}\Big[\Psi_{\tau_k}\Big(\varepsilon_i-\frac{X_i^Tu}{\sqrt{n}}\Big)-\Psi_{\tau_k}(\varepsilon_i)\Big].$$

    Lemma 1. Assume that C1–C5 hold. Then, as $n\to\infty$,

    $$\sqrt{n}(\hat{\gamma}-\gamma_0)\xrightarrow{d}N\big(0,(\Gamma^T\Sigma^{-1}\Gamma)^{-1}\big),$$

    where $\Gamma=E[\partial_\gamma U(\gamma)]$ and $\Sigma=E[U(\gamma)U^T(\gamma)]$.

    The proof of Lemma 1 follows from Theorem 2.2.1 in Guo [16].

    Lemma 2. If the conditions C1–C4 are satisfied, then

    $$F_n\xrightarrow{d}N(0,\Omega),$$

    where $\Omega=E[QQ^T]$ and $Q=\eta-E[\partial\eta/\partial\gamma^T]\{E[\partial U(\gamma)/\partial\gamma^T]\}^{-1}U(\gamma)$.

    Proof. By a Taylor expansion of $\frac{1}{\sqrt{n}}\sum_{i=1}^n\hat{\eta}_i$ at $\gamma$ and the argument used in the proof of Lemma 1, we get

    $$\begin{aligned}\frac{1}{\sqrt{n}}\sum_{i=1}^n\hat{\eta}_i&=\frac{1}{\sqrt{n}}\sum_{i=1}^n\eta_i+\Big[\frac{1}{n}\sum_{i=1}^n\frac{\partial\eta_i}{\partial\gamma}\Big]_{\gamma^*}\sqrt{n}(\hat{\gamma}-\gamma)\\&=\frac{1}{\sqrt{n}}\sum_{i=1}^n\eta_i-\Big[\frac{1}{n}\sum_{i=1}^n\frac{\partial\eta_i}{\partial\gamma}\Big]_{\gamma^*}\Big[\frac{1}{n}\sum_{i=1}^n\frac{\partial U_i(\gamma)}{\partial\gamma}\Big]_{\gamma^*}^{-1}\Big[\frac{1}{\sqrt{n}}\sum_{i=1}^n U_i(\gamma)\Big]\\&=\frac{1}{\sqrt{n}}\sum_{i=1}^n\big[\eta_i-D_nB_n^{-1}U_i(\gamma)\big],\end{aligned}\tag{A.1}$$

    where $D_n=\Big[\frac{1}{n}\sum_{i=1}^n\frac{\partial\eta_i}{\partial\gamma}\Big]_{\gamma^*}$, $B_n=\Big[\frac{1}{n}\sum_{i=1}^n\frac{\partial U_i(\gamma)}{\partial\gamma}\Big]_{\gamma^*}$, and $\gamma^*$ lies between $\gamma$ and $\hat{\gamma}$.

    According to the central limit theorem,

    $$\frac{1}{\sqrt{n}}\sum_{i=1}^n\big(\eta_i-D_nB_n^{-1}U_i(\gamma)\big)\xrightarrow{d}N(0,\Omega),$$

    where $\Omega=E[QQ^T]$ and $Q=\eta-E[\partial\eta/\partial\gamma^T]\{E[\partial U(\gamma)/\partial\gamma^T]\}^{-1}U(\gamma)$.

    Therefore, Lemma 2 is proved.

    Lemma 3. If the conditions C1–C4 are satisfied, then

    $$\sqrt{n}(\hat{\beta}_{\tau_k}-\beta_0)\xrightarrow{d}N\Big(0,\frac{1}{4g^2(\tau)}\Sigma_1^{-1}\Omega\Sigma_1^{-1}\Big).$$

    Proof. If conditions C1–C4 are satisfied, then it follows from Pan et al. [12] that

    $$G_n(u)=g(\tau)u^T\Sigma_1u+F_n^Tu+o_p(1), \tag{A.2}$$

    where $g(\tau)=(1-\tau)F(0)+\tau(1-F(0))$.

    By Hjort and Pollard [17], if

    $$D_n(u)=\frac{1}{2}u^TAu+B^Tu+o_p(1),$$

    where $D_n(u)$ is a convex objective function with minimum point $\hat{u}_n$, $A$ is a symmetric and positive definite matrix, and $B$ is a random vector, then

    $$\hat{u}_n\xrightarrow{d}-A^{-1}B.$$

    Therefore, if we define $\hat{u}_n=\sqrt{n}(\hat{\beta}_{\tau_k}-\beta_0)$, then $\hat{\beta}_{\tau_k}=\beta_0+\hat{u}_n/\sqrt{n}$. By some simple calculations and (A.2), we have

    $$\hat{u}_n=\arg\min_u\sum_{i=1}^n\frac{\delta_i}{\pi(X_i,\hat{\gamma})}\Big[\Psi_{\tau_k}\Big(\varepsilon_i-\frac{X_i^Tu}{\sqrt{n}}\Big)-\Psi_{\tau_k}(\varepsilon_i)\Big]=\arg\min_u G_n(u)=\arg\min_u\big[g(\tau)u^T\Sigma_1u+F_n^Tu+o_p(1)\big]. \tag{A.3}$$

    According to condition C9, $\Sigma_1$ is a symmetric positive definite matrix. Lemma 3 then follows from Lemma 2 and Slutsky's theorem.

    Proof of Theorem 4.1. By the proof of Lemma 3, we know that

    $$\sqrt{n}(\hat{\beta}_{\tau_k}-\beta_0)=\Sigma_1^{-1}\frac{1}{\sqrt{n}}\sum_{i=1}^n\frac{\delta_i}{\pi(X_i,\hat{\gamma})}X_i\frac{\Psi_{\tau_k}(\varepsilon_i-b_{0k})}{2g(\tau_k)}+o_p(1).$$

    From $\hat{\beta}_w=\sum_{k=1}^K\omega_k\hat{\beta}_{\tau_k}$ and $\sum_{k=1}^K\omega_k=1$, we get

    $$\begin{aligned}\sqrt{n}(\hat{\beta}_w-\beta_0)&=\sqrt{n}\Big(\sum_{k=1}^K\omega_k\hat{\beta}_{\tau_k}-\beta_0\Big)=\sqrt{n}\sum_{k=1}^K\omega_k(\hat{\beta}_{\tau_k}-\beta_0)\\&=\frac{1}{\sqrt{n}}\Sigma_1^{-1}\sum_{i=1}^n\frac{\delta_i}{\pi(X_i,\hat{\gamma})}X_i\Big\{\sum_{k=1}^K\omega_k\frac{\Psi_{\tau_k}(\varepsilon_i-b_{0k})}{2g(\tau_k)}\Big\}+o_p(1).\end{aligned}\tag{A.4}$$

    According to the proof of Lemma 2, we can obtain that

    $$\sum_{i=1}^n\hat{\eta}_i=\sum_{i=1}^n\frac{\delta_i}{\pi(X_i,\hat{\gamma})}X_i\Psi_{\tau_k}(\varepsilon_i)=\sum_{i=1}^n\big[\eta_i-D_nB_n^{-1}U_i(\gamma)\big].$$

    Let $\mu_i=\frac{\delta_i}{\pi(X_i,\gamma)}X_i\sum_{k=1}^K\omega_k\frac{\Psi_{\tau_k}(\varepsilon_i-b_{0k})}{2g(\tau_k)}$ and $\hat{\mu}_i=\frac{\delta_i}{\pi(X_i,\hat{\gamma})}X_i\sum_{k=1}^K\omega_k\frac{\Psi_{\tau_k}(\varepsilon_i-b_{0k})}{2g(\tau_k)}$; then

    $$\sum_{i=1}^n\hat{\mu}_i=\sum_{i=1}^n\big[\mu_i-H_nB_n^{-1}U_i(\gamma)\big],$$

    where $H_n=\Big[\frac{1}{n}\sum_{i=1}^n\frac{\partial\mu_i}{\partial\gamma}\Big]_{\gamma^*}$. Therefore, Eq (A.4) is equivalent to

    $$\sqrt{n}(\hat{\beta}_w-\beta_0)=\frac{1}{\sqrt{n}}\Sigma_1^{-1}\sum_{i=1}^n\big[\mu_i-H_nB_n^{-1}U_i(\gamma)\big]+o_p(1). \tag{A.5}$$

    Therefore,

    $$\sqrt{n}(\hat{\beta}_w-\beta_0)\xrightarrow{d}N(0,\Sigma_1^{-1}\Lambda\Sigma_1^{-1}),$$

    where $\Lambda=E[\lambda\lambda^T]$, $\lambda=\mu-E[\partial\mu/\partial\gamma^T]\{E[\partial U(\gamma)/\partial\gamma^T]\}^{-1}U(\gamma)$, and $\mu=\frac{\delta}{\pi(X,\gamma)}X\sum_{k=1}^K\omega_k\frac{\Psi_{\tau_k}(\varepsilon-b_{0k})}{2g(\tau_k)}$. This completes the proof.



    [1] R. J. A. Little, D. B. Rubin, Statistical analysis with missing data, 2nd ed., New York: Wiley, 2002. http://dx.doi.org/10.1002/9781119013563
    [2] F. Yates, The analysis of replicated experiments when the field results are incomplete, Empire J. Exp. Agric., 1 (1933), 129–142.
    [3] D. G. Horvitz, D. J. Thompson, A generalization of sampling without replacement from a finite universe, J. Am. Stat. Assoc., 47 (1952), 663–685. http://dx.doi.org/10.1080/01621459.1952.10483446 doi: 10.1080/01621459.1952.10483446
    [4] J. M. Robins, A. Rotnitzky, L. P. Zhao, Estimation of regression coefficients when some regressors are not always observed, J. Am. Stat. Assoc., 89 (1994), 846–866. http://dx.doi.org/10.2307/2290910 doi: 10.2307/2290910
    [5] K. Imai, M. Ratkovic, Covariate balancing propensity score, J. R. Stat. Soc. B., 76 (2014), 243–263. http://dx.doi.org/10.1111/rssb.12027 doi: 10.1111/rssb.12027
    [6] D. Guo, L. Xue, Y. Hu, Covariate-balancing-propensity-score-based inference for linear models with missing responses, Statist. Probab. Lett., 123 (2017), 139–145. http://dx.doi.org/10.1016/j.spl.2016.12.001 doi: 10.1016/j.spl.2016.12.001
    [7] W. K. Newey, J. L. Powell, Asymmetric least squares estimation and testing, Econometrica, 55 (1987), 819–847. http://dx.doi.org/10.2307/1911031 doi: 10.2307/1911031
    [8] F. Sobotka, G. Kauermann, L. S. Waltrup, T. Kneib, On confidence intervals for semiparametric expectile regression, Stat. Comput., 23 (2013), 135–148. http://dx.doi.org/10.1007/s11222-011-9297-1 doi: 10.1007/s11222-011-9297-1
    [9] L. S. Waltrup, F. Sobotka, T. Kneib, Expectile and quantile regression - David and Goliath?, Stat. Model., 15 (2015), 433–456. http://dx.doi.org/10.1177/1471082X14561155 doi: 10.1177/1471082X14561155
    [10] J. F. Ziegel, Coherence and elicitability, Math. Financ., 26 (2016), 901–918. http://dx.doi.org/10.1111/mafi.12080 doi: 10.1111/mafi.12080
    [11] Y. Pan, Z. Liu, W. Cai, Large-scale expectile regression with covariates missing at random, IEEE. Access., 8 (2020), 36502–36513. http://dx.doi.org/10.1109/access.2020.2970741 doi: 10.1109/access.2020.2970741
    [12] Y. Pan, Z. Liu, W. Cai, Weighted expectile regression with covariates missing at random, Commun. Stat.-Simul. C., 52 (2023), 1057–1076. http://dx.doi.org/10.1080/03610918.2021.1873371 doi: 10.1080/03610918.2021.1873371
    [13] J. C. Peng, L. K. Lee, M. G. Ingersoll, An introduction to logistic regression analysis and reporting, J. Educ. Res., 96 (2002), 3–14. http://dx.doi.org/10.1080/00220670209598786 doi: 10.1080/00220670209598786
    [14] L. P. Hansen, Large sample properties of generalized method of moments estimators, Econometrica, 50 (1982), 1029–1054. http://dx.doi.org/10.2307/1912775 doi: 10.2307/1912775
    [15] S. Zhao, Expected regression estimation of semiparametric model, Shanxi Normal Univ., 2021.
    [16] D. Guo, Estimation methods and theories of several types of regression models under missing data, Beijing Univ. Tech., 2017.
    [17] N. L. Hjort, D. Pollard, Asymptotics for minimisers of convex processes, arXiv Preprint, 2011. https://doi.org/10.48550/arXiv.1107.3806
  • © 2024 the Author(s), licensee AIMS Press. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0)
