B-spline estimation in varying coefficient models with correlated errors

Yanping Liu; Juliang Yin; Yanping Liu; Juliang Yin

doi:10.3934/math.2022195

AIMS Mathematics

2022, Volume 7, Issue 3: 3509-3523. doi: 10.3934/math.2022195

Previous Article Next Article

Research article

B-spline estimation in varying coefficient models with correlated errors

Yanping Liu ,
Juliang Yin ^,

School of Economics and Statistics, Guangzhou University, Guangzhou 510006, China

Received: 13 August 2021 Accepted: 23 November 2021 Published: 02 December 2021
MSC : 62G08, 62M10, 65D07

The varying coefficient model assumes that the regression function depends linearly on some regressors, and that the regression coefficients are smooth functions of other predictor variables. It provides an appreciable flexibility in capturing the underlying dynamics in data and avoids the so-called "curse of dimensionality" in analyzing complex and multivariate nonlinear structures. Existing estimation methods usually assume that the errors for the model are independent; however, they may not be satisfied in practice. In this study, we investigated the estimation for the varying coefficient model with correlated errors via B-spline. The B-spline approach, as a global smoothing method, is computationally efficient. Under suitable conditions, the convergence rates of the proposed estimators were obtained. Furthermore, two simulation examples were employed to demonstrate the performance of the proposed approach and the necessity of considering correlated errors.

Keywords:

Citation: Yanping Liu, Juliang Yin. B-spline estimation in varying coefficient models with correlated errors[J]. AIMS Mathematics, 2022, 7(3): 3509-3523. doi: 10.3934/math.2022195

Related Papers:

[1]	Yanting Xiao, Yifan Shi . Robust estimation for varying-coefficient partially nonlinear model with nonignorable missing response. AIMS Mathematics, 2023, 8(12): 29849-29871. doi: 10.3934/math.20231526
[2]	Yan Wu, Chun-Gang Zhu . Generating bicubic B-spline surfaces by a sixth order PDE. AIMS Mathematics, 2021, 6(2): 1677-1694. doi: 10.3934/math.2021099
[3]	Liqi Xia, Xiuli Wang, Peixin Zhao, Yunquan Song . Empirical likelihood for varying coefficient partially nonlinear model with missing responses. AIMS Mathematics, 2021, 6(7): 7125-7152. doi: 10.3934/math.2021418
[4]	Abdulaziz Alsenafi, Mishari Al-Foraih, Khalifa Es-Sebaiy . Least squares estimation for non-ergodic weighted fractional Ornstein-Uhlenbeck process of general parameters. AIMS Mathematics, 2021, 6(11): 12780-12794. doi: 10.3934/math.2021738
[5]	SidAhmed Benchiha, Amer Ibrahim Al-Omari, Naif Alotaibi, Mansour Shrahili . Weighted generalized Quasi Lindley distribution: Different methods of estimation, applications for Covid-19 and engineering data. AIMS Mathematics, 2021, 6(11): 11850-11878. doi: 10.3934/math.2021688
[6]	Heng Liu, Xia Cui . Adaptive estimation for spatially varying coefficient models. AIMS Mathematics, 2023, 8(6): 13923-13942. doi: 10.3934/math.2023713
[7]	Peng Lai, Wenxin Tian, Yanqiu Zhou . Semi-supervised estimation for the varying coefficient regression model. AIMS Mathematics, 2024, 9(1): 55-72. doi: 10.3934/math.2024004
[8]	Bin Yang, Min Chen, Jianjun Zhou . Forecasting the monthly retail sales of electricity based on the semi-functional linear model with autoregressive errors. AIMS Mathematics, 2025, 10(1): 1602-1627. doi: 10.3934/math.2025074
[9]	Xiangbin Qin, Yuanpeng Zhu . Static term structure of interest rate construction with tension interpolation splines. AIMS Mathematics, 2024, 9(1): 240-256. doi: 10.3934/math.2024014
[10]	Gaosheng Liu, Yang Bai . Statistical inference in functional semiparametric spatial autoregressive model. AIMS Mathematics, 2021, 6(10): 10890-10906. doi: 10.3934/math.2021633

Abstract

1. Introduction

Nonlinear phenomena, namely nonnormality, asymmetric cycles, and nonlinear relationships between lagged variables, have been well observed in some classical data sets, such as the sunspot, Canadian lynx, and Australian blowfly data. However, linear ARMA models can not adequately approximate these nonlinear phenomena^[1]. Due to this, nonparametric regression models have found important applications in modeling nonlinear time series^[2,3]. Yet, in the multivariate setting with more than two variables, it is difficult to estimate regression function with reasonable accuracy due to the "curse of dimensionality". To solve this problem, various semiparametric models have been studied. For example, Stone^[4] proposed the additive model, and Gao^[5] investigated a class of partially linear models. In addition, one of the most popular semiparametric models is the varying coefficient model ^[6], whose regression function depends linearly on some regressors and regression coefficients vary with some threshold variables.

In this study, we consider the varying coefficient model of the following form:

$\begin{equation} Y_{t} = \bf{X}_{t}^{ \top}\mathit{\pmb{\beta}}(U_{t})+\epsilon_{t} \end{equation}$

(1.1)

where $\bf{X}_{t} = (X_{t1}, \cdots, X_{tp})^{ \top}$ and $\mathit{\pmb{\beta}}(U_{t}) = (\beta_{1}(U_{t}) \cdots, \beta_{p}(U_{t}))^{ \top}$ . The functions $\beta_{j}(\cdot), j = 1, \cdots, p$ are assumed to be unknown but smooth. ${ \top}$ denotes the transpose of a vector or matrix. $U_t$ is a univariate random variable, which is called the threshold variable. Both $\bf{X}_{t}$ and $U_t$ can consist of either exogenous variables or lagged variables of $Y_{t}$ . $\epsilon_{t}$ is the error term that satisfies $\mathbb{E}(\epsilon_{t}| \bf{X}_{t}, U_{t}) = 0$ , and $\boldsymbol{{\epsilon}} = (\epsilon_{1}, \ldots, \epsilon_{n})^{ \top}$ . As a generalization of the linear model, the Model (1.1) has attracted a great deal of attention over the past two decades. When $\bf{X}_{t}$ and $U_t$ are some lagged variables of $Y_{t}$ , Chen and Tsay ^[7] proposed an arranged local regression procedure for the specification of the Model (1.1), and the consistency result and a recursive algorithm were given. Cai et al.^[1] applied a local linear technique to estimate the time-varying coefficients. The asymptotic properties of the kernel estimators were investigated under the $\alpha-$ mixing condition. Nevertheless, as pointed out in ^[8,9], the local smoothing method is computationally expensive because it requires re-fitting at every point where the fitted function needs to be evaluated. In contrast, the advantage of the B-spline estimation method is its computational efficiency. However, it is difficult to establish the asymptotic normality of the spline estimators ^[10,11]. For varying coefficient models, Lai et al.^[12] considered the B-spline to estimate the time-varying coefficients.

The aforementioned work depends on the assumption of independent errors. However, model misspecification, such as omitting relevant variables and wrong function form, may result in correlated errors, as mentioned in ^[13]. From this perspective, the assumption of independent errors is inappropriate. Many authors have studied the topic of non-/semi-parametric regression with correlated errors. Under the assumption that model errors follow an invertible linear process, Xiao et al. ^[14] proposed a modification of the local polynomial estimation for nonparametric regression. Su and Ullah ^[13] considered nonparametric regression with an error process in a nonparametric autocorrelated form. Lei et al.^[15] investigated a semiparametric autoregressive model with ARMA errors. In this study, we propose a global smoothing method based on B-spline for the estimation of the time-varying coefficients. Similar to ^[16] and ^[17], we have relaxed the assumption of independence to model errors, assuming that they can be correlated. That is, when the model errors are independent, the covariance matrix of the errors is $\mathbb{E}(\boldsymbol{{\epsilon}}\boldsymbol{{\epsilon}}^{ \top}) = \sigma^{2} \boldsymbol{{I}}$ . Now, we assume that $\mathbb{E}(\boldsymbol{{\epsilon}}\boldsymbol{{\epsilon}}^{ \top}| \bf{X}_{t}, U_{t}) = \boldsymbol{{V}}$ . Specially, if model errors follow an AR(1) process, $\epsilon_{t} = \rho \epsilon_{t-1}+e_{t}$ , $| \rho| < 1$ , $e_{t}\stackrel{\mathrm{i.i.d}} {\sim} (0, \sigma^{2})$ ,

$\boldsymbol{{V}} = \frac{\sigma^{2}}{1- \rho^{2}}\left( \begin{array}{ccccc} 1 & \rho &\cdots &\rho^{n-2} &\rho^{n-1}\\ \rho & 1 &\cdots &\rho^{n-3}&\rho^{n-2}\\ \vdots & \vdots & \ddots & \vdots & \vdots \\ \rho^{n-2} &\rho^{n-3} &\cdots &1 &\rho\\ \rho^{n-1} &\rho^{n-2} &\cdots &\rho &1\\ \end{array} \right).$

Our study assumes that the covariance matrix of model errors is positive definite, without requiring a specific form of autoregression, in which case, the scope of application can be further expanded. Certainly, we extend Theorem 1 of ^[12] to the case of correlated errors.

The remainder of this paper has been presented as follows. In Section 2, the estimation procedure for the varying coefficient model with correlated errors has been introduced. Section 3 presents the consistency and convergence rates of the spline estimators and provides an estimation algorithm. In Section 4, we present numerical examples to illustrate the performance of the proposed estimation method. In Section 5, we compare the results of the proposed spline method with the local linear method proposed by Cai et al. ^[1]. The conclusions are presented in Section 6 and the proofs of the main results in the Appendix.

2. Estimation method

Let $a = \xi_{0} < \xi_{1} < \cdots < \xi_{M+1} = b$ partition the interval $[a, b]$ into subintervals $[\xi_{k}, \xi_{k+1}), k = 0, ..., M$ with $M$ internal knots. A polynomial B spline of order $r$ is a function whose restriction to each subinterval is a polynomial of degree $r-1$ and globally $r-2$ times continuously differentiable. A linear space $S_{r, M}$ with a fixed sequence of knots has a normalized B-spline basis $\{B_{1}(u), \cdots, B_{K}(u)\}$ with $K = M+r$ . As in ^[18], the basis satisfies (i) $B_{k}(u)\geq0, k = 1, \cdots, K,$ (ii) $\sum\limits_{s = 1}^K B_{k}(u)\equiv 1$ , and (iii) $B_{j}(u)$ is supported inside an interval of length $r/K$ , and at most $r$ of the basis functions are nonzero at any given $u$ .

Suppose that each coefficient function $\beta_{j}(u)$ in the model (1.1) is smooth; then, it can be well approximated by a B-spline function $\beta^{*}_{j}(u)\in S_{r, M}$ ^[19]. Thus, there is a set of constants $b^{\ast}_{js}, s = 1, \cdots, K$ , such that $\beta_{j}(u)\approx \beta^{\ast}_{j}(u) = \sum\limits_{s = 1}^K b^{\ast}_{js}B_{s}(u)$ . Different coefficients might be approximated by B-spline with a different number of knots in principle, but for simplicity, we have assumed the same basis for the different coefficients. Let $\boldsymbol{{b}} = (\boldsymbol{{b_{1}}}^{ \top}, \cdots, \boldsymbol{{b_{p}}}^{ \top})^{ \top} = (b_{11}, \cdots, b_{1K}, \cdots, b_{p1}, \cdots, b_{pK})^{ \top}$ , $\boldsymbol{{Z}}_{t} = (X_{t1}B_{1}(U_{t}), \cdots, X_{t1}B_{K}(U_{t}), \cdots, X_{tp}B_{K}(U_{t}))^{ \top}$ , $\boldsymbol{{Z}} = (\boldsymbol{{Z}}_{1}, \cdots, \boldsymbol{{Z}}_{n})^{ \top}$ , $\boldsymbol{{Y}} = (Y_{1}, \cdots, Y_{n})^{ \top}$ . Let $\mathbb{E}(\boldsymbol{{\epsilon}}\boldsymbol{{\epsilon}}^{ \top}| \bf{X}_{t}, U_{t}) = \boldsymbol{{V}}$ , and it is initially known. Subsequently, we can write the criterion as

$\begin{equation} Q(\boldsymbol{{b}}) = \frac{1}{n}( \boldsymbol{{Y}}- \boldsymbol{{Z}}\boldsymbol{{b}})^{ \top}\boldsymbol{{V}}^{-1}( \boldsymbol{{Y}}- \boldsymbol{{Z}}\boldsymbol{{b}}). \end{equation}$

(2.1)

Denoting the minimizer by $\hat{\boldsymbol{{b}}}$ , we estimate $\beta_{j}(u)$ by $\hat{\beta}_{j}(u) = \sum\limits_{k = 1}^K \hat{b}_{jk}B_{k}(u)$ .

3. Asymptotic property

We have imposed the following conditions. (C1) and (C2) are mild regularity conditions. (C3) is necessary for the identification of the coefficient functions in varying coefficient models, as mentioned in Huang and Shen ^[20]. (C4), (C6), and (C7) are the same as those used in ^[1] for stationary mixing data. (C5) imposes some smoothness conditions on the coefficient functions.

3.1. Assumptions

(C1) The eigenvalues of $\boldsymbol{{V}}$ are bounded away from zero and infinity.

(C2) The smoothing variable $U_{t}$ has a bounded density supported on $[a, b]$ .

(C3) The eigenvalues of the matrix $\mathbb{E}(\bf{X}_{t} \bf{X}_{t}^{ \top}\mid U_{t} = u)$ are uniformly bounded away from zero and infinity for all $u\in [a, b]$ .

(C4) The conditional density of $(U_{1}, U_{l+1})$ given $(\bf{X}_{1}, \bf{X}_{l+1})$ is uniformly bounded on the support of $(\bf{X}_{1}, \bf{X}_{l+1})$ . The conditional density of $U_{1}$ given $\bf{X}_{1}$ is uniformly bounded on the support of $\bf{X}_{1}$ .

(C5) For $g = \beta_{j}, 1\leq j \leq p,$ $g$ satisfies a Lipschitz condition of order $d > 1/2:|g^{(\lfloor d \rfloor)}(t)-g^{(\lfloor d \rfloor)}(s)|\leq C|s-t|^{d-\lfloor d \rfloor}$ , where $\lfloor d \rfloor$ is the biggest integer strictly smaller than $d$ and $g^{(\lfloor d \rfloor)}$ is the $\lfloor d \rfloor-$ th derivative of $g$ . The order of the B-spline used satisfies $q \geq d + 1/2$ .

(C6) The process $\{Y_{t}, \bf{X}_{t}, U_{t}\}_{t\in\mathbb{Z}}$ is jointly strictly stationary with $U_{t}$ taking values in $\mathbb{R}$ and $\bf{X}_{t}$ taking values in $\mathbb{R}^{p}$ . The $\alpha$ -mixing coefficient $\alpha(l)$ of $\{Y_{t}, \bf{X}_{t}, Z_{t}\}_{t\in\mathbb{Z}}$ satisfies $\sum\limits_{l = 1}^\infty l^{c}\alpha(l)^{1-2/\delta} < \infty$ for some $\delta > 2$ and $c > 1-2/\delta$ .

(C7) $\mathbb{E}(|X_{tj}|^{2\delta}) < \infty, j = 1, \cdots, d$ , where $\delta$ is given in condition (C6).

3.2. Theoretical result

Theorem 3.1: Assume (C1) $-$ (C7) and that $K\rightarrow \infty, K^{3}/n\rightarrow 0$ , then we have

$\|\hat{\beta}_{j}-\beta_{j}\|_{2}^{2} = O_{p}(K/n+1/K^{2d}), \quad 1\leq j \leq p.$

As noted in Lai et al.^[12], we know that by choosing $K\asymp n^{1/(2d+1)}$ , the well-known optimal convergence rate $O_{p}(n^{-2d/(2d+1)})$ is achieved.

If $\boldsymbol{{V}}$ is unknown and $\hat{\boldsymbol{{V}}}$ is an estimator of $\boldsymbol{{V}}$ , then an application of the substitution principle leads to

$\begin{equation} Q(\boldsymbol{{b}}) = \frac{1}{n}( \boldsymbol{{Y}}- \boldsymbol{{Z}}\boldsymbol{{b}})^{ \top}\hat{\boldsymbol{{V}}}^{-1}( \boldsymbol{{Y}}- \boldsymbol{{Z}}\boldsymbol{{b}}). \end{equation}$

(3.1)

A two-stage estimator $\bar{\boldsymbol{{b}}}$ can be obtained by minimizing (3.1); then, $\bar{\beta}_{j}(u) = \sum\limits_{k = 1}^K \bar{b}_{jk}B_{k}(u)$ .

Theorem 3.2: Assume (C1) $-$ (C7), $K\rightarrow \infty, K^{3}/n\rightarrow 0$ and that $\hat{\boldsymbol{{V}}}$ is consistent in probability estimating $\boldsymbol{{V}}$ , then we have

$\|\bar{\beta}_{j}-\beta_{j}\|_{2}^{2} = O_{p}(K/n+1/K^{2d}), \quad 1\leq j \leq p.$

Remark 1. In practice, $\boldsymbol{{V}}$ is usually unknown. It is customary to replace $\boldsymbol{{V}}$ in (2.1) with its consistent estimator $\hat{\boldsymbol{{V}}}$ ^[17,21]. Then, $\bar{\boldsymbol{{b}}}$ can be derived through a two-stage estimation. If the sample size is sufficiently large, we can split the data set into the training and test parts, and provide a consistent estimator of $\boldsymbol{{V}}$ from the residuals of ordinary least squares (OLS) for fitting (1.1) by grouping the training set.

3.3. Computational aspects

However, the method in Remark 1 is not feasible when the sample size is small. It should be noted that Montoril et al.^[17] have presented an iterative procedure for estimating $\boldsymbol{{V}}$ , but the iterative procedure is computationally expensive. Hence, they have provided a more efficient method. Generally, if the errors of the model follow an autoregressive process, we can estimate the $\beta_{j}(\cdot)$ by adopting some ideas of ^[17]. The estimation algorithm is as follows.

Step 1. Estimate the coefficient vector $\boldsymbol{{b}}$ by OLS, and denote it by $\boldsymbol{{b}}^{(0)}$ .

Step 2. Compute the residuals via $\epsilon_{t}^{(0)} = Y_{t}-\boldsymbol{{Z}}_{t}\boldsymbol{{b}}^{(0)}$ , and fit an autoregressive model to the residuals, i.e.,

$\epsilon_{t}^{(0)} = \varphi_{1}^{(0)}\epsilon_{t-1}^{(0)}+\cdots+\varphi_{p}^{(0)}\epsilon_{t-p}^{(0)}+e_{t}.$

Step 3. Letting $\boldsymbol{{b}}^{(0)}$ and $(\varphi_{1}^{(0)}\cdots \varphi_{p}^{(0)})$ in step 1 and step 2 as initial values, estimate $\boldsymbol{{b}}$ by minimizing numerically

$\begin{align*} \ell(\boldsymbol{{b}}) = \sum\limits_{t = 1}^n \{\varphi_{p}(L)(Y_{t}-\boldsymbol{{Z}}_{t}\boldsymbol{{b}})\}^{2}, \end{align*}$

where $\varphi_{p}(L) = 1-\varphi_{1}L-\cdots-\varphi_{p}L^{P}$ , with the backshift operator satisfying $L^{k}V_{t} = V_{t-k}$ , $k > 0$ .

4. Numerical examples

In this section, two simulated examples are considered: the threshold variable $U_{t}$ is an exogenous variable and the lagged variable of $Y_{t}$ . For a given data set, we used equally spaced knots. The values used as candidates for the number of internal knots varied from one to five. The optimal number of internal knots is selected using the Bayesian information criterion (BIC) (Schwarz ^[22]). The BIC criterion function is defined as

${\rm{BIC}} = \log( {\rm{RMS}})+\log(n)\times \frac{p}{n},$

where $n$ denotes the sample size, RMS denotes the residual mean square, $p$ is equal to the sum of number of autoregressive coefficients assumed for the errors and number of B-spline basis. The B-spline basis $\{B_{1}(U_{t}), \cdots, B_{K}(U_{t})\}$ can be obtained from function bs using the package splines in R language ^[3]. We consider time series data with lengths $n = 200,400$ and 600, and replicate the simulation 200 times in each case. For each replication, a total of $1000 + n$ observations were generated, and only the last $n$ observations were used to ensure approximate stationarity. The performance of estimators $\{\hat{\beta_{j}}(\cdot)\}$ can be demonstrated by the square root of average squared errors (RASE):

${\rm{RASE}}^{2} = \sum\limits_{k = 1}^{p} {\rm{RASE}}_{j}^{2},$

with

${\rm{RASE}}_{j} = \{n_{\mathrm{grid}}^{-1}\sum\limits_{k = 1}^{n_{\mathrm{grid}}}{[\hat{\beta}_{j}(u_{k})-\beta_{j}(u_{k})]^{2}}\}^{1/2},$

and $\{u_{k}, k = 1, \cdots, n_{\mathrm{grid}}\}$ are grid points on an interval over which the functions are evaluated. Because the range of the time series data varies from simulation to simulation, we need to select a common interval to compare the RASE values. The intervals selected for Example 4.1 and Example 4.2 are $[-0.45, 0.45]$ and $[-2, 2]$ , respectively.

Example 4.1. Consider the following data generating process

$Y_{t} = \beta_{1}(U_{t})Y_{t-1}+\beta_{2}(U_{t})Y_{t-2}+\epsilon_{t},$

with $\beta_{1}(u) = 0.9\sin(\pi u)$ and $\beta_{2}(u) = 0.85\cos(\pi u)$ . We study two autoregressive errors, i.e. AR(1): $\epsilon_{t} = 0.8 \epsilon_{t-1}+e_{t}$ and AR(2): $\epsilon_{t} = 0.5 \epsilon_{t-1}+ 0.45 \epsilon_{t-2} + e_{t}$ , where $e_{t}\stackrel{\mathrm{i.i.d}}{\sim} N(0, 0.2^{2})$ and $U_{t}\stackrel{\mathrm{i.i.d}}{\sim} U[-0.5, 0.5]$ .

Example 4.2. We now discuss an exponential autoregressive (EXPAR) model

$Y_{t} = \beta_{1}(Y_{t-1})Y_{t-1}+\beta_{2}(Y_{t-1})Y_{t-2}+\epsilon_{t},$

with $\beta_{1}(u) = 0.6+0.9e^{(-u^2)}$ and $\beta_{2}(u) = -0.4-1.2e^{(-u^2)}$ . In this case, the autoregressive errors we study are AR(1): $\epsilon_{t} = 0.9\epsilon_{t-1}+e_{t}$ and AR(2): $\epsilon_{t} = 0.6 \epsilon_{t-1}+ 0.35 \epsilon_{t-2} + e_{t}$ , where $e_{t} \stackrel{\mathrm{i.i.d}}{\sim}N(0, 0.2^{2})$ .

5. Numerical results and discussion

In this section, the numerical results for Examples 4.1 and 4.2 are presented. We compare the performance of local linear (Local) estimators proposed by Cai et al.^[1], the spline estimators under the assumption of independent errors (Spl.ind) and spline estimators (Spl.cor) proposed by us. and show the mean and standard deviation (in parentheses) of the RASEs for $\hat{\beta}_{1}(\cdot)$ and $\hat{\beta}_{2}(\cdot)$ with linear ( $k = 1$ ) and cubic ( $k = 3$ ) splines under different AR( $p$ ) errors. It is apparent that the standard deviation of the RASEs in columns Spl.cor decrease with the increase in the sample size $n$ . In addition, the results of linear splines are similar to those of the cubic splines. Moreover, it can be seen that the proposed approach performs better than the method that ignores the correlated errors. Based on cubic splines, and provide the resulting estimates while considering different AR (1) errors: $\epsilon_{t} = \theta \epsilon_{t-1}+e_{t}$ with $\theta = 0.3, 0.6, 0.9, 0.95$ when $n = 600$ . We can see that with the increase in the correlation levels $\theta$ , the performance of Local and Spl.ind worsens, but Spl.cor remains relatively stable. Furthermore, Spl.cor is always better than Spl.ind and Local, which emphasizes the importance of considering the correlated errors. The computational times of Example 4.2 are reported in Table 5. The local linear method is computationally expensive when the sample size is large. Moreover, when considering the correlated errors, the algorithm needs to search for the optimal solution iteratively. Therefore, the calculation time of Spl.cor is longer than that of Spl.ind.

Table 1. Simulation results of Example 4.1 for difference sample sizes.

degree	AR( $p$ )	sample size	Spl.ind		Spl.cor
degree	AR( $p$ )	sample size	$\hat{\beta}_{1}$	$\hat{\beta}_{2}$	$\hat{\beta}_{1}$	$\hat{\beta}_{2}$
$k=1$	$1$	$n=200$	0.1554(0.0366)	0.1028(0.0297)	0.0694(0.0320)	0.0635(0.0269)
		$n=400$	0.1380(0.0286)	0.0937(0.0239)	0.0667(0.0311)	0.0606(0.0265)
		$n=600$	0.1335(0.0240)	0.0885(0.0193)	0.0606(0.0292)	0.0563(0.0245)
	$2$	$n=200$	0.1484(0.0382)	0.1366(0.0355)	0.0624(0.0315)	0.0633(0.0318)
		$n=400$	0.1340(0.0261)	0.1258(0.0270)	0.0626(0.0313)	0.0630(0.0312)
		$n=600$	0.1289(0.0224)	0.1236(0.0226)	0.0624(0.0310)	0.0629(0.0309)
$k=3$	$1$	$n=200$	0.1513(0.0372)	0.1015(0.0296)	0.0664(0.0330)	0.0595(0.0279)
		$n=400$	0.1340(0.0293)	0.0903(0.0249)	0.0592(0.0277)	0.0516(0.0254)
		$n=600$	0.1301(0.0240)	0.0857(0.0211)	0.0530(0.0270)	0.0479(0.0225)
	$2$	$n=200$	0.1448(0.0385)	0.1366(0.0365)	0.0724(0.0342)	0.0718(0.0331)
		$n=400$	0.1309(0.0264)	0.1245(0.0277)	0.0654(0.0321)	0.0638(0.0325)
		$n=600$	0.1264(0.0229)	0.1217(0.0236)	0.0631(0.0270)	0.0619(0.0282)

| Show Table

DownLoad: CSV

Table 2. Simulation results of Example 4.2 for difference sample sizes.

degree	AR( $p$ )	sample size	Spl.ind		Spl.cor
degree	AR( $p$ )	sample size	$\hat{\beta}_{1}$	$\hat{\beta}_{2}$	$\hat{\beta}_{1}$	$\hat{\beta}_{2}$
$k=1$	$1$	$n=200$	0.1452(0.0683)	0.1095(0.0299)	0.0700(0.0307)	0.0658(0.0199)
		$n=400$	0.1049(0.0422)	0.1022(0.0248)	0.0615(0.0239)	0.0634(0.0192)
		$n=600$	0.0894(0.0337)	0.0972(0.0209)	0.0553(0.0208)	0.0599(0.0183)
	$2$	$n=200$	0.2242(0.1250)	0.1307(0.0413)	0.0743(0.0391)	0.0648(0.0227)
		$n=400$	0.1610(0.0807)	0.1305(0.0375)	0.0618(0.0284)	0.0591(0.0180)
		$n=600$	0.1397(0.0691)	0.1274(0.0330)	0.0608(0.0282)	0.0587(0.0166)
$k=3$	$1$	$n=200$	0.1420(0.0528)	0.1132(0.0290)	0.0647(0.0252)	0.0503(0.0165)
		$n=400$	0.1108(0.0411)	0.1021(0.0264)	0.0560(0.0220)	0.0431(0.0152)
		$n=600$	0.0883(0.0309)	0.0964(0.0196)	0.0501(0.0198)	0.0371(0.0146)
	$2$	$n=200$	0.2020(0.0988)	0.1349(0.0437)	0.0763(0.0282)	0.0531(0.0196)
		$n=400$	0.1584(0.0660)	0.1347(0.0373)	0.0585(0.0239)	0.0430(0.0147)
		$n=600$	0.1402(0.0567)	0.1291(0.0311)	0.0538(0.0213)	0.0389(0.0136)

| Show Table

DownLoad: CSV

Table 3. Simulation results of Example 4.1 for difference AR(1) error structures.

$\theta$	Local		Spl.ind		Spl.cor
$\theta$	$\hat{\beta}_{1}$	$\hat{\beta}_{2}$	$\hat{\beta}_{1}$	$\hat{\beta}_{2}$	$\hat{\beta}_{1}$	$\hat{\beta}_{2}$
$0.3$	0.0799(0.0207)	0.0447(0.0156)	0.0760(0.0206)	0.0400(0.0145)	0.0443(0.0155)	0.0400(0.0147)
$0.6$	0.1221(0.0229)	0.0474(0.0171)	0.1187(0.0226)	0.0566(0.0188)	0.0442(0.0214)	0.0392(0.0178)
$0.9$	0.1353(0.0226)	0.0752(0.0228)	0.1316(0.0238)	0.1064(0.0220)	0.0641(0.0297)	0.0585(0.0267)
$0.95$	0.1387(0.0229)	0.0971(0.0314)	0.1317(0.0234)	0.1181(0.0233)	0.0739(0.0306)	0.0681(0.0282)

| Show Table

DownLoad: CSV

Table 4. Simulation results of Example 4.2 for difference AR(1) error structures.

$\theta$	Local		Spl.ind		Spl.cor
$\theta$	$\hat{\beta}_{1}$	$\hat{\beta}_{2}$	$\hat{\beta}_{1}$	$\hat{\beta}_{2}$	$\hat{\beta}_{1}$	$\hat{\beta}_{2}$
$0.3$	0.0679(0.0266)	0.0373(0.0168)	0.0376(0.0148)	0.0309(0.0106)	0.0359(0.0139)	0.0287(0.0104)
$0.6$	0.0691(0.0281)	0.0469(0.0132)	0.0455(0.0161)	0.0434(0.0129)	0.0393(0.0154)	0.0325(0.0114)
$0.9$	0.1427(0.0678)	0.0888(0.0207)	0.0883(0.0309)	0.0964(0.0196)	0.0501(0.0198)	0.0371(0.0146)
$0.95$	0.2008(0.0977)	0.1504(0.0416)	0.1435(0.0510)	0.1508(0.0318)	0.0563(0.0221)	0.0456(0.0153)

| Show Table

DownLoad: CSV

Table 5. Computation time for the spline estimators and the local linear estimators in seconds based on Example 4.2.

Sample size	Spl.ind	Spl.cor	Local
$n=200$	4.4	85.7	57.3
$n=400$	5.0	94.5	120.1
$n=600$	5.5	102.4	208.6

| Show Table

DownLoad: CSV

The estimated $\beta_{1}(\cdot)$ and $\beta_{2}(\cdot)$ from a typical sample using cubic splines for Examples 4.1 and 4.2 with AR(1) error structures are plotted in Figures 1 to 4. The solid curve is the true curve, and the dotted curve represents the typical estimated curve. The typical sample is chosen in such a way that its RASE value is equal to the median in the 200 simulations ^[23]. Clearly, the proposed estimators capture the main features of the true functions well. Although it is not shown here, the examples with AR(2) error structures show similar results.

Figure 1. Typical estimated curves of

$\beta_{1}$ in Example 4.1.

DownLoad: Full-Size Img PowerPoint

Figure 2. Typical estimated curves of

$\beta_{2}$ in Example 4.1.

DownLoad: Full-Size Img PowerPoint

Figure 3. Typical estimated curves of

$\beta_{1}$ in Example 4.2.

DownLoad: Full-Size Img PowerPoint

Figure 4. Typical estimated curves of

$\beta_{2}$ in Example 4.2.

DownLoad: Full-Size Img PowerPoint

6. Conclusions

This study considered the B-spline estimation for varying coefficient models with correlated errors using a weighted least squares method. In terms of the theoretical results, convergence rates were derived under the $\alpha$ -mixing condition. Simulation studies were conducted to illustrate the performance of the proposed estimation method which showed better performance while considering correlated errors in fitting varying coefficient models.

Acknowledgments

This work was supported in part by the National Natural Science Foundation of China under Grant No. 61973096 and by GDUPS (2019).

Conflict of interest

The authors declare that they have no any competing interests.

Appendix

A. Proof of the main results

Notation: For two positive sequences ${a_{n}}$ and $b_{n}$ , $a_{n}\lesssim b_{n}$ means that $a_{n}/ b_{n}$ is uniformly bounded, $a_{n}\asymp b_{n}$ if $a_{n}\lesssim b_{n}$ and $b_{n}\lesssim a_{n}$ . Let $\|g\|_{2} = \{\int_{[a, b]}g^{2}(x)dx\}^{1/2}$ be the $L_{2}$ -norm of a square integrable function $g(\cdot)$ on $[a, b]$ . $\mathbb{R}$ denotes the set of real numbers, and $\mathbb{Z}$ denotes the set of integers. $|\cdot|$ denotes the Euclidean norm of a vector. $\boldsymbol{{I}}$ is a identity matrix. $\boldsymbol{{e}}_{j}$ denotes a unit vector whose $j$ -th entry is 1 and all other entries of which are $0$ . We denote by $\lambda_{\min}(\cdot)$ $(\lambda_{\max}(\cdot))$ the smallest (the largest) eigenvalue of a matrix. Let $C$ denote a generic constant that might assume different values at different places. Given random variables $V_{n}, n\geq 1$ , let $V_{n} = O_{p}(b_{n})$ means that the random variables $V_{n}/b_{n}, n\geq 1$ are bounded in probability, that is,

$\lim\limits_{M \to \infty}\lim\limits_{n \to \infty} \mathbb{P}(\mid V_{n}\mid > M b_{n} ) = 0.$

And $V_{n} = o_{p}(b_{n})$ means that the random variables $V_{n}/b_{n}, n\geq 1$ converge to zero in probability, namely

$\lim\limits_{n \to \infty} \mathbb{P}(\mid V_{n}\mid > \epsilon b_{n} ) = 0, {\rm{ for}} \ \forall \epsilon > 0.$

In order to proof the theorems, we need the following Lemmas.

Lemma 1. (Lemma 2, Lai et al.^[12]) Under assumptions (C2) and (C3), there are constants $0 < b_{1} < b_{2} < \infty$ such that the eigenvalues of $\mathbb{E} \boldsymbol{{Z}}_{1} \boldsymbol{{Z}}_{1}^{ \top}$ fall in $[\frac{b_{1}}{K}, \frac{b_{2}}{K}]$ , and under the additional conditions (C4) $-$ (C7), there are constants $0 < b_{3} < b_{4} < \infty$ such that the eigenvalues of $\sum\limits_{t = 1}^n \boldsymbol{{Z}}_{t} \boldsymbol{{Z}}_{t}^{ \top}/n$ fall in $[\frac{b_{3}}{K}, \frac{b_{4}}{K}]$ with probability approaching 1 as $n\rightarrow \infty$ .

Lemma 2. Under the conditions (C1) $-$ (C7), we have $\boldsymbol{{\epsilon}}^{ \top}\boldsymbol{{V}}^{-1} \boldsymbol{{Z}}(\boldsymbol{{Z}}^{ \top}\boldsymbol{{V}}^{-1} \boldsymbol{{Z}})^{-1} \boldsymbol{{Z}}^{ \top}\boldsymbol{{V}}^{-1}\boldsymbol{{\epsilon}} = O_{p}(K)$ .

Proof. Note that $\mathbb{E}(\boldsymbol{{Z}}^{ \top} \boldsymbol{{Z}}) = \mathbb{E}(\sum\limits_{t = 1}^n \boldsymbol{{Z}}_{t} \boldsymbol{{Z}}_{t}^{ \top}) = n \mathbb{E}(\boldsymbol{{Z}}_{1} \boldsymbol{{Z}}_{1}^{ \top}).$ Hence, by Lemma 1 we obtain

$\begin{equation} \begin{split} \mathbb{E}(\boldsymbol{{\epsilon}}^{ \top}\boldsymbol{{V}}^{-1} \boldsymbol{{Z}} \boldsymbol{{Z}}^{ \top}\boldsymbol{{V}}^{-1}\boldsymbol{{\epsilon}}) & = \mathbb{E} \mbox{tr}( \boldsymbol{{Z}}^{ \top}\boldsymbol{{V}}^{-1}\boldsymbol{{\epsilon}}\boldsymbol{{\epsilon}}^{ \top}\boldsymbol{{V}}^{-1} \boldsymbol{{Z}})\\ & = \mbox{tr} \mathbb{E}( \boldsymbol{{Z}}^{ \top}\boldsymbol{{V}}^{-1} \mathbb{E}(\boldsymbol{{\epsilon}}\boldsymbol{{\epsilon}}^{ \top}| \boldsymbol{{Z}})\boldsymbol{{V}}^{-1} \boldsymbol{{Z}})\\ & = \mbox{tr} \mathbb{E}( \boldsymbol{{Z}}^{ \top}\boldsymbol{{V}}^{-1} \boldsymbol{{Z}})\\ &\lesssim \mbox{tr} \mathbb{E} ( \boldsymbol{{Z}}^{ \top} \boldsymbol{{Z}})\\ &\lesssim n. \end{split} \end{equation}$

(A.1)

Using the Markov inequality, we can derive

$\begin{equation} \|\boldsymbol{{\epsilon}}^{ \top}\boldsymbol{{V}}^{-1} \boldsymbol{{Z}}\|^{2} = O_{p}(n). \end{equation}$

(A.2)

On the other hand, by Lemma 1 and (C1),

$\begin{align*} &\lim\limits_{M \to \infty}\lim\limits_{n \to \infty} \mathbb{P}(\lambda_{\max}( \boldsymbol{{Z}}^{ \top}\boldsymbol{{V}}^{-1} \boldsymbol{{Z}})^{-1} < M\frac{K}{n})\\ & = \lim\limits_{M \to \infty}\lim\limits_{n \to \infty} \mathbb{P}(\frac{1}{\lambda_{\min}( \boldsymbol{{Z}}^{ \top}\boldsymbol{{V}}^{-1} \boldsymbol{{Z}})} < M\frac{K}{n})\\ &\geq\lim\limits_{M \to \infty}\lim\limits_{n \to \infty} \mathbb{P}(\frac{1}{\lambda_{\min}(\boldsymbol{{V}}^{-1})\lambda_{\min}( \boldsymbol{{Z}}^{ \top} \boldsymbol{{Z}})} < M\frac{K}{n})\\ &\geq\lim\limits_{M \to \infty}\lim\limits_{n \to \infty} \mathbb{P}(\frac{1}{\lambda_{\min}( \boldsymbol{{Z}}^{ \top} \boldsymbol{{Z}})} < CM\frac{K}{n})\\ &\geq\lim\limits_{M \to \infty}\lim\limits_{n \to \infty} \mathbb{P}(\lambda_{\min}(\frac{ \boldsymbol{{Z}}^{ \top} \boldsymbol{{Z}}}{n}) > \frac{C}{MK})\\ &\geq\lim\limits_{n \to \infty} \mathbb{P}(\lambda_{\min}(\frac{ \boldsymbol{{Z}}^{ \top} \boldsymbol{{Z}}}{n}) > \frac{b_{3}}{K})\\ & = 1. \end{align*}$

Consequently,

$\begin{equation} \lambda_{\max}( \boldsymbol{{Z}}^{ \top}\boldsymbol{{V}}^{-1} \boldsymbol{{Z}})^{-1} = O_{p}(\frac{K}{n}). \end{equation}$

(A.3)

Then, it follows from (A.2) and (A.3) that

$\begin{align*} &\lim\limits_{M \to \infty}\lim\limits_{n \to \infty} \mathbb{P}(\boldsymbol{{\epsilon}}^{ \top}\boldsymbol{{V}}^{-1} \boldsymbol{{Z}}( \boldsymbol{{Z}}^{ \top}\boldsymbol{{V}}^{-1} \boldsymbol{{Z}})^{-1} \boldsymbol{{Z}}^{ \top}\boldsymbol{{V}}^{-1}\boldsymbol{{\epsilon}}\leq KM)\\ &\geq \lim\limits_{M \to \infty}\lim\limits_{n \to \infty} \mathbb{P}(\lambda_{\max}( \boldsymbol{{Z}}^{ \top}\boldsymbol{{V}}^{-1} \boldsymbol{{Z}})^{-1}\|\boldsymbol{{\epsilon}}^{ \top}\boldsymbol{{V}}^{-1} \boldsymbol{{Z}}\|^{2}\leq KM)\\ & = \lim\limits_{M \to \infty}\lim\limits_{n \to \infty} \mathbb{P}(\lambda_{\max}( \boldsymbol{{Z}}^{ \top}\boldsymbol{{V}}^{-1} \boldsymbol{{Z}})^{-1}\|\boldsymbol{{\epsilon}}^{ \top}\boldsymbol{{V}}^{-1} \boldsymbol{{Z}}\|^{2}\leq KM, \|\boldsymbol{{\epsilon}}^{ \top}\boldsymbol{{V}}^{-1} \boldsymbol{{Z}}\|^{2}\leq \sqrt{M}n)\\ &\geq\lim\limits_{M \to \infty}\lim\limits_{n \to \infty} \mathbb{P}(\lambda_{\max}( \boldsymbol{{Z}}^{ \top}\boldsymbol{{V}}^{-1} \boldsymbol{{Z}})^{-1}\sqrt{M}n\leq KM, \|\boldsymbol{{\epsilon}}^{ \top}\boldsymbol{{V}}^{-1} \boldsymbol{{Z}}\|^{2}\leq \sqrt{M}n)\\ & = \lim\limits_{M \to \infty}\lim\limits_{n \to \infty} \mathbb{P}(\lambda_{\max}( \boldsymbol{{Z}}^{ \top}\boldsymbol{{V}}^{-1} \boldsymbol{{Z}})^{-1}\leq \frac{K}{n}\sqrt{M}, \|\boldsymbol{{\epsilon}}^{ \top}\boldsymbol{{V}}^{-1} \boldsymbol{{Z}}\|^{2}\leq \sqrt{M}n)\\ & = \lim\limits_{M \to \infty}\lim\limits_{n \to \infty} \mathbb{P}(\lambda_{\max}( \boldsymbol{{Z}}^{ \top}\boldsymbol{{V}}^{-1} \boldsymbol{{Z}})^{-1}\leq \frac{K}{n}\sqrt{M})\\ & = 1. \end{align*}$

This completes the proof of Lemma 2.

Lemma 3. Suppose $\beta_{0j}(u) = \sum\limits_{k = 1}^K b_{0jk}B_{k}(u)$ is the best approximating B-spline for $\beta_{j}(u)$ with $\|\beta_{0j}-\beta_{j}\|_{\infty} = O(K^{-d})$ (such approximation property is well-known for B-spline under smoothness assumptions (C5), see for example ^[19]). Let $\boldsymbol{{b}}_{0} = (b_{011}, \cdots, b_{01K}, \cdots, b_{0p1}, \cdots, b_{0pK})^{ \top}$ , $\boldsymbol{{\eta}} = \boldsymbol{{P}}_{\boldsymbol{{V}}^{-\frac{1}{2}} \boldsymbol{{Z}}}\boldsymbol{{V}}^{-\frac{1}{2}}(\boldsymbol{{Y}}- \boldsymbol{{Z}}\boldsymbol{{b}}_{0})$ , where $\boldsymbol{{P}}_{\boldsymbol{{V}}^{-\frac{1}{2}} \boldsymbol{{Z}}} = \boldsymbol{{V}}^{-\frac{1}{2}} \boldsymbol{{Z}}(\boldsymbol{{Z}}^{ \top}\boldsymbol{{V}}^{-1} \boldsymbol{{Z}})^{-1} \boldsymbol{{Z}}^{ \top}\boldsymbol{{V}}^{-\frac{1}{2}}$ is a projection matrix. Then under (C1) $-$ (C7) we have

$\|\boldsymbol{{\eta}}\|^{2} = O_{p}(K+\frac{n}{K^{2d}}).$

Proof. Denote $r_{t} = \sum\limits_{j = 1}^p X_{tj}\beta_{j}(U_{t})$ and $\boldsymbol{{r}} = (r_{1}, \ldots, r_{n})^{ \top},$ thus $\boldsymbol{{Y}} = \boldsymbol{{r}} + \boldsymbol{{\epsilon}}$ . We have

$\begin{align*} \|\boldsymbol{{\eta}}\|^{2}& = \|\boldsymbol{{P}}_{\boldsymbol{{V}}^{-\frac{1}{2}} \boldsymbol{{Z}}}\boldsymbol{{V}}^{-\frac{1}{2}}( \boldsymbol{{Y}}- \boldsymbol{{Z}}\boldsymbol{{b}}_{0})\|^{2}\\ & = \|\boldsymbol{{P}}_{\boldsymbol{{V}}^{-\frac{1}{2}} \boldsymbol{{Z}}}\boldsymbol{{V}}^{-\frac{1}{2}}\boldsymbol{{\epsilon}}+\boldsymbol{{P}}_{\boldsymbol{{V}}^{-\frac{1}{2}} \boldsymbol{{Z}}}\boldsymbol{{V}}^{-\frac{1}{2}}(\boldsymbol{{r}} - \boldsymbol{{Z}}\boldsymbol{{b}}_{0})\|^{2}\\ &\leq 2\|\boldsymbol{{P}}_{\boldsymbol{{V}}^{-\frac{1}{2}} \boldsymbol{{Z}}}\boldsymbol{{V}}^{-\frac{1}{2}}\boldsymbol{{\epsilon}}\|^{2}+ 2\| \boldsymbol{{P}}_{\boldsymbol{{V}}^{-\frac{1}{2}} \boldsymbol{{Z}}}\boldsymbol{{V}}^{-\frac{1}{2}}(\boldsymbol{{r}} - \boldsymbol{{Z}}\boldsymbol{{b}}_{0})\|^{2}\\ &\triangleq 2S_{1}+2S_{2}. \end{align*}$

Noting that $\boldsymbol{{P}}_{\boldsymbol{{V}}^{-\frac{1}{2}} \boldsymbol{{Z}}}$ is an idempotent matrix, by Lemma 2 we obtain that

$\begin{align*} S_{1}& = \boldsymbol{{\epsilon}}^{ \top}\boldsymbol{{V}}^{-\frac{1}{2}}\boldsymbol{{P}}_{\boldsymbol{{V}}^{-\frac{1}{2}} \boldsymbol{{Z}}}\boldsymbol{{V}}^{-\frac{1}{2}}\boldsymbol{{\epsilon}}\\ & = \boldsymbol{{\epsilon}}^{ \top}\boldsymbol{{V}}^{-1} \boldsymbol{{Z}}( \boldsymbol{{Z}}^{ \top}\boldsymbol{{V}}^{-1} \boldsymbol{{Z}})^{-1} \boldsymbol{{Z}}^{ \top}\boldsymbol{{V}}^{-1}\boldsymbol{{\epsilon}}\\ & = O_{p}(K). \end{align*}$

By the approximation property of B-spline we have $\|\boldsymbol{{r}} - \boldsymbol{{Z}}\boldsymbol{{b}}_{0}\|^{2} = O_{p}(\frac{n}{K^{2d}})$ , hence

$\begin{align*} &\lim\limits_{M \to \infty}\lim\limits_{n \to \infty} \mathbb{P}(\|\boldsymbol{{P}}_{\boldsymbol{{V}}^{-\frac{1}{2}} \boldsymbol{{Z}}}\boldsymbol{{V}}^{-\frac{1}{2}}(\boldsymbol{{r}} - \boldsymbol{{Z}}\boldsymbol{{b}}_{0})\|^{2}\leq M\frac{n}{K^{2d}})\\ & = \lim\limits_{M \to \infty}\lim\limits_{n \to \infty} \mathbb{P}((\boldsymbol{{r}} - \boldsymbol{{Z}}\boldsymbol{{b}}_{0})^{ \top}\boldsymbol{{V}}^{-\frac{1}{2}}\boldsymbol{{P}}_{\boldsymbol{{V}}^{-\frac{1}{2}} \boldsymbol{{Z}}}\boldsymbol{{V}}^{-\frac{1}{2}}(\boldsymbol{{r}} - \boldsymbol{{Z}}\boldsymbol{{b}}_{0})\leq M\frac{n}{K^{2d}})\\ &\geq\lim\limits_{M \to \infty}\lim\limits_{n \to \infty} \mathbb{P}(\lambda_{\max}(\boldsymbol{{V}}^{-\frac{1}{2}}\boldsymbol{{P}}_{\boldsymbol{{V}}^{-\frac{1}{2}} \boldsymbol{{Z}}}\boldsymbol{{V}}^{-\frac{1}{2}})\|\boldsymbol{{r}} - \boldsymbol{{Z}}\boldsymbol{{b}}_{0}\|^{2}\leq M\frac{n}{K^{2d}})\\ &\geq\lim\limits_{M \to \infty}\lim\limits_{n \to \infty} \mathbb{P}(\|\boldsymbol{{r}} - \boldsymbol{{Z}}\boldsymbol{{b}}_{0}\|^{2}\leq CM\frac{n}{K^{2d}})\\ & = 1. \end{align*}$

Thus, $S_{2} = O_{p}(\frac{n}{K^{2d}})$ , the proof is complete.

Proof of Theorem 3.1.

Proof. By the definition of $\hat{\boldsymbol{{b}}}$ , we get

$\begin{equation} \begin{split} 0&\geq nQ(\hat{\boldsymbol{{b}}})-nQ(\boldsymbol{{b_{0}}})\\ & = ( \boldsymbol{{Y}}- \boldsymbol{{Z}}\hat{\boldsymbol{{b}}})^{ \top}\boldsymbol{{V}}^{-1}( \boldsymbol{{Y}}- \boldsymbol{{Z}}\hat{\boldsymbol{{b}}})-( \boldsymbol{{Y}}- \boldsymbol{{Z}}\boldsymbol{{b_{0}}})^{ \top}\boldsymbol{{V}}^{-1}( \boldsymbol{{Y}}- \boldsymbol{{Z}}\boldsymbol{{b_{0}}})\\ & = 2( \boldsymbol{{Y}}- \boldsymbol{{Z}}\boldsymbol{{b_{0}}})^{ \top}\boldsymbol{{V}}^{-1} \boldsymbol{{Z}}(\boldsymbol{{b_{0}}}-\hat{\boldsymbol{{b}}})+[ \boldsymbol{{Z}}(\boldsymbol{{b_{0}}}-\hat{\boldsymbol{{b}}})]^{ \top}\boldsymbol{{V}}^{-1} \boldsymbol{{Z}}(\boldsymbol{{b_{0}}}-\hat{\boldsymbol{{b}}})\\ & = 2[\boldsymbol{{P}}_{\boldsymbol{{V}}^{-\frac{1}{2}} \boldsymbol{{Z}}}\boldsymbol{{V}}^{-\frac{1}{2}}( \boldsymbol{{Y}}- \boldsymbol{{Z}}\boldsymbol{{b_{0}}})]^{ \top}\boldsymbol{{V}}^{-\frac{1}{2}} \boldsymbol{{Z}}(\boldsymbol{{b_{0}}}-\hat{\boldsymbol{{b}}})+\|\boldsymbol{{V}}^{-\frac{1}{2}} \boldsymbol{{Z}}(\boldsymbol{{b_{0}}}-\hat{\boldsymbol{{b}}})\|^{2}. \end{split} \end{equation}$

(A.4)

Applying the Cauchy-Schwartz inequality, the equation (A.4) can be continued as

$0\geq-4\|\boldsymbol{{P}}_{\boldsymbol{{V}}^{-\frac{1}{2}} \boldsymbol{{Z}}}\boldsymbol{{V}}^{-\frac{1}{2}}( \boldsymbol{{Y}}- \boldsymbol{{Z}}\boldsymbol{{b_{0}}})\|^{2}+\|\boldsymbol{{V}}^{-\frac{1}{2}} \boldsymbol{{Z}}(\boldsymbol{{b_{0}}}-\hat{\boldsymbol{{b}}})\|^{2}.$

Hence, by Lemma 3 we obtain

$\|\boldsymbol{{V}}^{-\frac{1}{2}} \boldsymbol{{Z}}(\boldsymbol{{b_{0}}}-\hat{\boldsymbol{{b}}})\|^{2} = O_{p}(K+\frac{n}{K^{2d}}).$

Then, it follows from Lemma 1 and (C1) that

$\begin{align*} 0& = \lim\limits_{M \to \infty}\lim\limits_{n \to \infty} \mathbb{P}(\|\boldsymbol{{V}}^{-\frac{1}{2}} \boldsymbol{{Z}}(\boldsymbol{{b_{0}}}-\hat{\boldsymbol{{b}}})\|^{2} > M(K+\frac{n}{K^{2d}}))\\ &\geq \lim\limits_{M \to \infty}\lim\limits_{n \to \infty} \mathbb{P}(\lambda_{\min}( \boldsymbol{{Z}}^{ \top}\boldsymbol{{V}}^{-1} \boldsymbol{{Z}})\|\boldsymbol{{b_{0}}}-\hat{\boldsymbol{{b}}}\|^{2} > M(K+\frac{n}{K^{2d}}))\\ &\geq \lim\limits_{M \to \infty}\lim\limits_{n \to \infty} \mathbb{P}(\lambda_{\min}(\boldsymbol{{V}}^{-1})\lambda_{\min}( \boldsymbol{{Z}}^{ \top} \boldsymbol{{Z}})\|\boldsymbol{{b_{0}}}-\hat{\boldsymbol{{b}}}\|^{2} > M(K+\frac{n}{K^{2d}}))\\ &\geq \lim\limits_{M \to \infty}\lim\limits_{n \to \infty} \mathbb{P}(\lambda_{\min}(\frac{ \boldsymbol{{Z}}^{ \top} \boldsymbol{{Z}}}{n})\|\boldsymbol{{b_{0}}}-\hat{\boldsymbol{{b}}}\|^{2} > C M(\frac{K}{n}+\frac{1}{K^{2d}}))\\ & = \lim\limits_{M \to \infty}\lim\limits_{n \to \infty} \mathbb{P}(\lambda_{\min}(\frac{ \boldsymbol{{Z}}^{ \top} \boldsymbol{{Z}}}{n})\|\boldsymbol{{b_{0}}}-\hat{\boldsymbol{{b}}}\|^{2} > C M(\frac{K}{n}+\frac{1}{K^{2d}}), \lambda_{\min}(\frac{ \boldsymbol{{Z}}^{ \top} \boldsymbol{{Z}}}{n}) > \frac{b_{3}}{K})\\ &\geq \lim\limits_{M \to \infty}\lim\limits_{n \to \infty} \mathbb{P}(\|\boldsymbol{{b_{0}}}-\hat{\boldsymbol{{b}}}\|^{2} > C M(\frac{K^{2}}{n}+\frac{1}{K^{2d-1}})), \end{align*}$

which means that $\|\boldsymbol{{b_{0}}}-\hat{\boldsymbol{{b}}}\|^{2} = O_{p}(\frac{K^{2}}{n}+\frac{1}{K^{2d-1}})$ . Then, by the property of B-spline (De Boor^[19])

$\begin{equation} \frac{b_{5}}{K}\|\hat{\boldsymbol{{b}}}_{j}-\boldsymbol{{b}}_{0j}\|^{2}\leq\|\hat{\beta}_{j}-\beta_{0j}\|_{2}^{2}\leq\frac{b_{6}}{K}\|\hat{\boldsymbol{{b}}}_{j}-\boldsymbol{{b}}_{0j}\|^{2}, \end{equation}$

(A.5)

for some constants $b_{5}, b_{6} > 0$ , we can derive

$\|\hat{\beta}_{j}-\beta_{j}\|_{2}^{2}\lesssim \frac{1}{K}\|\hat{\boldsymbol{{b}}}_{j}-\boldsymbol{{b}}_{0j}\|^{2}+\|{\beta}_{j}-\beta_{0j}\|_{\infty}^{2} = O_{p}(\frac{K}{n}+\frac{1}{K^{2d}}).$

Proof of Theorem 3.2.

Proof. The proof of this theorem is similar to that of Theorem 2.2 in ^[17].

Let $\bf{R} = \boldsymbol{{Z}}^{ \top}\boldsymbol{{V}}^{-1} \boldsymbol{{Z}}$ and $\hat{ \bf{R}} = \boldsymbol{{Z}}^{ \top}\hat{\boldsymbol{{V}}}^{-1} \boldsymbol{{Z}}$ . Note that $\hat{\boldsymbol{{V}}}$ is consistent, then by the Theorem 1.4.2 in ^[24] we have

$\lambda_{\max}( \bf{R}\hat{ \bf{R}}^{-1})\leq \lambda_{\max}(\boldsymbol{{V}}^{-1}\hat{\boldsymbol{{V}}}) = 1+o_{p}(1).$

This implies that $\hat{ \bf{R}}$ is a consistent estimator of $\bf{R}$ , namely $\hat{ \bf{R}}^{-1} \bf{R}-\boldsymbol{{I}} = o_{p}(1)$ . It follows from (2.1) that

$\hat{\boldsymbol{{b}}} = ( \boldsymbol{{Z}}^{ \top}\boldsymbol{{V}}^{-1} \boldsymbol{{Z}})^{-1} \boldsymbol{{Z}}^{ \top}\boldsymbol{{V}}^{-1} \boldsymbol{{Y}} = \bf{R}^{-1} \boldsymbol{{Z}}^{ \top}\boldsymbol{{V}}^{-1} \boldsymbol{{Y}}.$

Therefore, by (3.1) we can derive that

$\begin{align*} \bar{\boldsymbol{{b}}}& = ( \boldsymbol{{Z}}^{ \top}\hat{\boldsymbol{{V}}}^{-1} \boldsymbol{{Z}})^{-1} \boldsymbol{{Z}}^{ \top}\hat{\boldsymbol{{V}}}^{-1} \boldsymbol{{Y}}\\ & = (\hat{ \bf{R}}^{-1}- \bf{R}^{-1}) \boldsymbol{{Z}}^{ \top}\hat{\boldsymbol{{V}}}^{-1} \boldsymbol{{Y}}+ \bf{R}^{-1} \boldsymbol{{Z}}^{ \top}\hat{\boldsymbol{{V}}}^{-1} \boldsymbol{{Y}}\\ & = (\hat{ \bf{R}}^{-1} \bf{R}-\boldsymbol{{I}}) \bf{R}^{-1} \boldsymbol{{Z}}^{ \top}\hat{\boldsymbol{{V}}}^{-1} \boldsymbol{{Y}}+ \bf{R}^{-1} \boldsymbol{{Z}}^{ \top}(\hat{\boldsymbol{{V}}}^{-1}\boldsymbol{{V}}-\boldsymbol{{I}})\boldsymbol{{V}}^{-1} \boldsymbol{{Y}}+\hat{\boldsymbol{{b}}}\\ & = (\hat{ \bf{R}}^{-1} \bf{R}-\boldsymbol{{I}}) \bf{R}^{-1} \boldsymbol{{Z}}^{ \top}(\hat{\boldsymbol{{V}}}^{-1}\boldsymbol{{V}}-\boldsymbol{{I}})\boldsymbol{{V}}^{-1} \boldsymbol{{Y}}+(\hat{ \bf{R}}^{-1} \bf{R}-\boldsymbol{{I}}) \bf{R}^{-1} \boldsymbol{{Z}}^{ \top}\boldsymbol{{V}}^{-1} \boldsymbol{{Y}}\\ &+ \bf{R}^{-1} \boldsymbol{{Z}}^{ \top}(\hat{\boldsymbol{{V}}}^{-1}\boldsymbol{{V}}-\boldsymbol{{I}})\boldsymbol{{V}}^{-1} \boldsymbol{{Y}}+\hat{\boldsymbol{{b}}}\\ & = o_{p}(1)\hat{\boldsymbol{{b}}}+\hat{\boldsymbol{{b}}}. \end{align*}$

This ensures that

$\begin{equation} \|\bar{\boldsymbol{{b}}}-\hat{\boldsymbol{{b}}}\|^{2} = o_{p}(1). \end{equation}$

(A.6)

By (A.5), (A.6) and Theorem 3.1, we conclude that

$\begin{align*} \|\bar{\beta}_{j}-\beta_{j}\|_{2}^{2}&\leq\|\bar{\beta}_{j}-\hat{\beta_{j}}\|_{2}^{2}+\|\hat{\beta}_{j}-\beta_{j}\|_{2}^{2}\\ &\lesssim\|\bar{\boldsymbol{{b}}}-\hat{\boldsymbol{{b}}}\|^{2}+\|\hat{\beta}_{j}-\beta_{j}\|_{2}^{2}\\ & = O_{p}(\frac{K}{n}+\frac{1}{K^{2d}}). \end{align*}$

This completes the proof of Theorem 3.2.

References

[1]	Z. Cai, J. Fan, Q. Yao, Functional-coefficient regression models for nonlinear time series, J. Am. Stat. Assoc., 95 (2000), 941–956. doi: 10.1080/01621459.2000.10474284. doi: 10.1080/01621459.2000.10474284
[2]	W. Härdle, H. Liang, J. Gao, Partially linear models, Heidelberg: Physica-Verlag, 2000. doi: 10.1007/978-3-642-57700-0.
[3]	R. S. Tsay, R. Chen, Nonlinear time series analysis, Hoboken: John Wiley & Sons, 2018. doi: 10.1002/9781119514312.
[4]	C. J. Stone, Additive regression and other nonparametric models, Ann. Stat., 13 (1985), 689–705. doi: 10.1214/aos/1176349548. doi: 10.1214/aos/1176349548
[5]	J. Gao, Nonlinear time series: Semiparametric and nonparametric methods, London: Chapman & Hall, 2007. doi: 10.1201/9781420011210.
[6]	T. Hastie, R. Tibshirani, Varying-coefficient models, J. R. Stat. Soc. B, 55 (1993), 757–779. doi: 10.1111/j.2517-6161.1993.tb01939.x. doi: 10.1111/j.2517-6161.1993.tb01939.x
[7]	R. Chen, R. S. Tsay, Functional-coefficient autoregressive models, J. Am. Stat. Assoc., 88 (1993), 298–308. doi: 10.1080/01621459.1993.10594322. doi: 10.1080/01621459.1993.10594322
[8]	X. Wu, Z. Tian, H. Wang, Polynomial spline estimation for nonparametric (auto-) regressive models, Stud. Sci. Math. Hung., 46 (2009), 515–538. doi: 10.1556/sscmath.2009.1105. doi: 10.1556/sscmath.2009.1105
[9]	L. Xue, H. Liang, Polynomial spline estimation for a generalized additive coefficient model, Scand. J. Stat., 37 (2010), 26–46. doi: 10.1111/j.1467-9469.2009.00655.x. doi: 10.1111/j.1467-9469.2009.00655.x
[10]	Y. Lu, R. Zhang, L. Zhu, Penalized spline estimation for varying-coefficient models, Commun. Stat. Theor. M., 37 (2008), 2249–2261. doi: 10.1080/03610920801931887. doi: 10.1080/03610920801931887
[11]	S. Ma, L. Yang, Spline-backfitted kernel smoothing of partially linear additive model, J. Stat. Plan. Infer., 141 (2011), 204–219. doi: 10.1016/j.jspi.2010.05.028. doi: 10.1016/j.jspi.2010.05.028
[12]	P. Lai, J. Meng, H. Lian, Polynomial spline approach for variable selection and estimation in varying coefficient models for time series data, Stat. Probabil. Lett., 96 (2015), 21–27. doi: 10.1016/j.spl.2014.09.008. doi: 10.1016/j.spl.2014.09.008
[13]	L. Su, A. Ullah, More efficient estimation in nonparametric regression with nonparametric autocorrelated errors, Economet. Theor., 22 (2006), 98–126. doi: 10.1017/S026646660606004X. doi: 10.1017/S026646660606004X
[14]	Z. Xiao, O. B. Linton, R. J. Carroll, E. Mammen, More efficient local polynomial estimation in nonparametric regression with autocorrelated errors, J. Am. Stat. Assoc., 98 (2003), 980–992. doi: 10.1198/016214503000000936. doi: 10.1198/016214503000000936
[15]	H. Lei, Y. Xia, X. Qin, Estimation of semivarying coefficient time series models with ARMA errors, Ann. Stat., 44 (2016), 1618–1660. doi: 10.1214/15-AOS1430. doi: 10.1214/15-AOS1430
[16]	M. H. Montoril, P. A. Morettin, C. Chiann, Spline estimation of functional coefficient regression models for time series with correlated errors, Stat. Probabil. Lett., 92 (2014), 226–231. doi: 10.1016/j.spl.2014.05.021. doi: 10.1016/j.spl.2014.05.021
[17]	M. H. Montoril, P. A. Morettin, C. Chiann, Wavelet estimation of functional coefficient regression models, Int. J. Wavelets Multi., 16 (2018), 1850004. doi: 10.1142/S0219691318500042. doi: 10.1142/S0219691318500042
[18]	H. Lian, P. Lai, H. Liang, Partially linear structure selection in Cox models with varying coefficients, Biometrics, 69 (2013), 348–357. doi: 10.1111/biom.12024. doi: 10.1111/biom.12024
[19]	C. De Boor, A practical guide to splines, New York: Springer-Verlag, 1978. doi: 10.2307/2006241.
[20]	J. Z. Huang, H. Shen, Functional coefficient regression models for non-linear time series: A polynomial spline approach, Scand. J. Stat., 31 (2004), 515–534. doi: 10.1111/j.1467-9469.2004.00404.x. doi: 10.1111/j.1467-9469.2004.00404.x
[21]	J. Shao, Mathematical statistics, New York: Springer-Verlag, 2003. doi: 10.1007/b97553.
[22]	G. Schwarz, Estimating the dimension of a model, Ann. Stat., 6 (1978), 461–464. doi: 10.1214/aos/1176344136. doi: 10.1214/aos/1176344136
[23]	J. Fan, Q. Yao, Nonlinear time series: Nonparametric and parametric methods, New York: Springer-Verlag, 2003. doi: 10.1007/978-0-387-69395-8.
[24]	P. K. Sen, J. M. Singer, Large sample methods in statistics: An introduction with applications, New York: Chapman & Hall, 1994. doi: 10.1201/9780203711606.

Reader Comments

Your name:*

Email:*
© 2022 the Author(s), licensee AIMS Press. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0)

通讯作者: 陈斌, bchen63@163.com

1.
沈阳化工大学材料科学与工程学院沈阳 110142

AIMS Mathematics

1.8 3.4

Metrics

Article views(2156) PDF downloads(103) Cited by(0)

Preview PDF

Download XML

Export Citation

Article outline

Show full outline

Figures and Tables

Figures(4) / Tables(5)

AIMS Mathematics

B-spline estimation in varying coefficient models with correlated errors

Related Papers:

Abstract

1. Introduction

2. Estimation method

3. Asymptotic property

3.1. Assumptions

3.2. Theoretical result

3.3. Computational aspects

4. Numerical examples

5. Numerical results and discussion

6. Conclusions

Acknowledgments

Conflict of interest

Appendix

A. Proof of the main results

References

Reader Comments

通讯作者: 陈斌, bchen63@163.com

Metrics

Figures and Tables

Other Articles By Authors

Catalog

AIMS Mathematics

B-spline estimation in varying coefficient models with correlated errors

Related Papers:

Abstract

1. Introduction

2. Estimation method

3. Asymptotic property

3.1. Assumptions

3.2. Theoretical result

3.3. Computational aspects

4. Numerical examples

5. Numerical results and discussion

6. Conclusions

Acknowledgments

Conflict of interest

Appendix

A. Proof of the main results

References

Reader Comments

通讯作者: 陈斌, bchen63@163.com

Metrics

Figures and Tables

Other Articles By Authors

Related pages

Tools

Export File

Citation

Format

Content

Catalog