1.
Introduction
Nowadays, network research has become increasingly important with the development of science and technology. A network system is composed of intricate relationships. For example, in a large social network, the subjects may be connected by more than one type of relationship. Instead, they can be complexly connected by multiple types of relationships, named the network autoregressive models studied by some scholars. Zhu et al. [1] established a network autoregressive model to study social relationships, which can fully consider all time-series information. However, subjects within a studied network may not be connected only through one type of association; instead, they could be complexly connected through multiple types of relationships. Therefore, Wei and Tian [2] proposed a network regression model with multiple types of connections and considered the impact of multiple connection nodes on the heterogeneity of continuous response variables. To express individual heterogeneity, Zhu and Pan [3] developed a grouped network vector autoregressive (GNAR) model based on the network autoregressive model. Zhu et al. [1] considered both non-specific variables and various connectivity relationships. Subsequently, Tian et al. [4] proposed a mixed network regression model that considers non-specific variable dependence on time based on their foundation. Huang et al. [5] proposed a network autoregressive model with the GARCH effect to describe the return dynamics of stock market indices. Due to different nodes having different effects on others, Tang et al. [6] proposed a penalty method for estimating the network vector autoregressive (NAR) models with different individual effects. Wang et al. [7] proposed a network binary segmentation method for detecting change points, which relies on a weighted average adjacency matrix. Xiao et al. [8] proposed the Huber estimator for estimating the parameters of network autoregressive models. Zhao and Liu [9] proposed the classifier-regularized approach for homogeneous analysis of network effects under the network autoregressive model.
The change-point problem was first proposed by Page [10] and has been widely applied in many fields. It mainly detects whether a change has occurred and determines its position. There are many methods for detecting change points, such as the cumulative sum (CUSUM), likelihood ratio test (LRT), Schwarz information criterion (SIC), Akaike information criterion (AIC), Bayesian information criterion (BIC), and modified information approach (MIC). Page [11] considered the problem of parameter changes in time series based on the CUSUM test. Kim [12] considered a test for a change point in linear regression by using the likelihood ratio statistic and studied the asymptotic behavior of the LRT statistic. Chen et al. [13] proposed the MIC for detecting change points in linear regression models. Jie [14] proposed the SIC to locate change points in simple linear regression models and multiple linear regression models. Basalamah et al. [15] proposed the MIC method to detect parameter changes in linear regression models with normally distributed error terms. Recently, Horváth et al. [16] proposed a weighted function based on the CUSUM of linear model residuals processed to detect the change points of linear regression models. Lee et al. [17] proposed a detector based on CUSUM of score vectors and residuals to investigate the sequential process of early detection of parameter changes in conditional heteroscedasticity time series models.
Empirical likelihood was proposed by Owen [18,19] and has been widely used due to its robustness to non-parametric properties and the efficiency of likelihood construction. Without knowing the distribution, the empirical likelihood method can be used to solve the problem, but it has some limitations in terms of computation. Specifically, many scholars use empirical likelihood methods to detect change points in regression models. Liu et al. [20] proposed a non-parametric method based on empirical likelihood to detect coefficient changes in linear regression models. Ning [21] considered the linear model of mean and proposed a non-parametric method for empirical likelihood testing to detect and estimate the position of change points. Zhao et al. [22] proposed an improved empirical likelihood ratio statistic to test for the presence of change points in long-term experiments and observations, which constructed empirical likelihood ratio statistics based on fitting residuals. Wu et al. [23] proposed a non-parametric method based on jackknife empirical likelihood (JEL) to detect changes in regression coefficients. Because empirical likelihood was initially proposed for independent data, applying it to related data such as time series data is difficult. Some scholars have conducted research on the transformation of dependent data into independent data. Akashi et al. [24] proposed empirical likelihood ratio statistics to detect change points when the position of the change point is unknown in autoregressive models. Gamage and Ning [25] used a powerful non-parametric method to propose empirical likelihood ratio statistics to detect changes in the parameter structure of autoregressive models. Yu et al. [26] proposed the empirical likelihood ratio test to detect structural changes in integer autoregressive (INAR) processes.
To the best of our knowledge, scholars have mainly focused on parameter estimation of network autoregressive models, and no one has used empirical likelihood methods to study the change-point problem. Detecting change points in network autoregressive models has great practical significance. For example, we can construct a network autoregressive model due to the intricate connections between stocks. Since some events may cause changes at a particular moment, detecting the location of these changes can provide better evidence for experts to study the stock market. According to some scholars, the idea of empirical likelihood detection of change points was adopted for time series data. Therefore, it is of great significance to perform change-point detection on network autoregressive models. The network autoregressive model is a high-dimensional time series model that can explain the natural world well by constructing adjacency matrices to clarify the relationships between subjects. This paper considers a non-parametric method to perform change-point detection on complex autoregressive models without knowing the distribution. In this paper, we propose an empirical likelihood method based on network autoregressive models to detect structural changes in the model. The structure of this article is as follows. Hypothesis testing and parameter estimation methods are proposed in Section 2. The empirical likelihood method is presented in Section 3. The simulation studies are considered in Section 4. Actual data application is given in Section 5. The conclusion of the paper is presented in Section 6.
2.
The change-point problem in the network vector autoregression model
In the following, we introduce the network autoregressive model proposed by Wei and Tian [2],
where i=1,...,N are the network nodes, t=1,...,T represents the observation times, Yi,t represents the response variable of the ith node at time t, β0 is the intercept term, βk characterizes the kth connection effect for k=1,...,p, βp+1 characterizes the momentum effect, and ϵi,t are the independent random variables with mean 0 and variance σ2. To describe the network structure composed of N nodes through the kth relationship, an adjacency matrix is defined: Ak=(aki,j)∈RN×N, k=1,...,p. If node i and node j have the kth relation, then aki,j=1, otherwise aki,j=0. Since the autocorrelation between nodes is not considered, the diagonal of the adjacency matrix Ak is set to 0. nk,i=∑j≠iaki,j is the total number of nodes connected by the ith node through the kth relationship. n−1k,i∑Nj=1aki,jYj,t−1 is the average impact of the kth connection of the ith node's neighbors. At the same time, it can be noted that Yt∈RN with 0≤t≤T is an ultra-high dimensional time series.
For the convenience of estimating unknown parameters, let Yt=(Y1,t,...,YN,t)′∈RN, Wk=diag(n−1k,1,...,n−1k,N)Ak∈RN×N, 1=(1,...,1)′∈RN, and ϵt=(ϵ1,t,...,ϵN,t)′∈RN. The model (2.1) can be rewritten as:
We consider one change point k∗ in the model (2.2), represented as:
where β=(β0,β1,...,βp,βp+1)′ and β∗=(β∗0,β∗1,...,β∗p,β∗p+1)′ are the unknown parameters, and k∗ is the unknown change-point position the needs to be estimated. To estimate the coefficient vectors β and β∗, we rewrite model (2.3) as:
where Xt−1=(X1,t−1,..., XN,t−1)′∈RN×(p+2), Xi,t−1=(1,w1iYi,t−1,...,wpiYi,t−1,Yi,t−1)′∈Rp+2, wki is the ith row of Wk, for 1≤t≤k∗, Zt−1=(Z1,t−1,..., ZN,t−1)′∈RN×(p+2), Zi,t−1=(1,w1iYi,t−1,...,wpiYi,t−1,Yi,t−1)′∈Rp+2, and wki is the ith row of Wk for k∗+1≤t≤T. Thus, the estimated coefficients can be obtained as follows:
Therefore, model (2.3) can be rewritten as:
and the errors are estimated in the following,
According to the switching rule suggested by Liu and Qian [27], the estimated error can be expressed as follows,
Under the null hypothesis H0, if no change occurs, we can obtain β=β∗. Therefore, we rewrite the hypothesis test as
3.
Empirical likelihood
Notice that under H0, for any k∗∈1,...,T, there is E(˜ϵi,t(k∗))=0. For a fixed k∗, let pi,t(k∗) be the mass probability at the value ˜ϵi,t(k∗), and for 1≤i≤N, the constraint is ∑Tt=1pi,t(k∗)=1. When pi,t(k∗)=1T, for any t=1,...T and 1≤i≤N, the empirical likelihood ∏Tt=1pi,t(k∗) reaches its maximum T−T. The empirical likelihood ratio is ∏Tt=1Tpi,t(k∗). Hence, for 1≤i≤N, the empirical log-likelihood ratio (ELR) statistic is defined as:
Owen [19] showed, similar to the log-likelihood ratio test statistic in a parametric model, with mild regular conditions, for 1≤i≤N, −2ℜ(i,k∗)→χ2d in distribution as T→∞, where d is the rank of Var(˜ϵi,t(k∗)).
Therefore, for 1≤i≤N, we propose an ELR test statistic for the change-point detection in the following,
A Lagrangian argument gives
where λi is chosen such that ∑Tt=1pi,t(k∗)˜ϵi,t(k∗)=0 for 1≤i≤N. After plugging back pi,t(k∗) in (3.1), Zi,k∗ can be rewritten as:
Define the score function
Then, ^λi are determined by
Therefore, for 1≤i≤N, (3.2) can be rewritten as:
and we propose the following EL test statistic:
However, when the position of the change point is close to 1 and T, the estimation efficiency will be poor because there is little information available for parameter estimation, in part with fewer samples, which leads to inaccurate estimation. Therefore, we suggest the trimmed likelihood ratio statistic as
To choose suitable values for k0, Csörgő and Horváth [28] gave the conditions of k0 such that the trimmed test statistic follows an extreme distribution asymptotically. In this paper, we choose k0=2[T12] for simplicity and convenience, where [x] means the largest integer not larger than x.
Next, we will outline the main results of the asymptotic distribution under the null hypothesis and the consistency under the alternative hypothesis.
Theorem 1. Assume that H0 holds and C.1–C.3 are satisfied. Then under the null model,
for all x, where A(x)=(2logx)12, Dr(x)=2logx+(r2)loglogx−logΓ(r2), u(T)=T2+(2[T12])2−2T[T12](2[T12])2, and r is the dimension of the parameter space.
Theorem 2. Under the conditions of Theorem 1, for 1≤i≤N, and if there exists a positive constant c0 satisfying 0<c0≤supElog(1+λi˜ϵi,t(k∗))<∞, if change point k∗ satisfies k∗T→c∈(0,1) as min(N,T)→∞, then the ELR test is consistent. That is, under the alternative hypothesis,
in probability.
Proofs of the above two theorems are given in the Supplementary Materials.
Theorem 1 indicates that, under the null hypothesis, the asymptotic distribution of the EL test statistic is the Gumbel extreme value distribution.
For any given r, α, N, and T, if M′N,T<cr,α,T, we fail to reject H0, where cr,α,T are the critical values for r, α, and T. Applying the above Theorem 1, cr,α,T is derived as follows:
Therefore,
Tables 1 and 2 provide the critical values cr,α,T for different r, α, and T.
4.
Simulation study
In this article, we consider p=1 and p=2 in the model (2.1). For the construction of the adjacency matrix A, we follow the same steps described in Zhu et al. [1].
4.1. The network autoregressive model with p=1
The adjacency matrix A is constructed from a power-law distribution model: First, we generate for each node its in-degree di=∑jaji according to the discrete power-law distribution, that is, P(di=l)=ql−b for a normalizing constant q and the exponent parameter b=1.2. A smaller b value implies a heavier distribution tail. Next, for the ith node, we randomly select di nodes to be its followers.
To study the detection performance of our proposed method, we study the values of the power of the empirical likelihood statistic. Consider the following model:
In our simulation, we set parameters β=(β0,β1,β2)′=(0,0.6,0.1)′ and β∗=(β∗0,β∗1,β∗2)′=(0,−0.7,0.8)′. Sample sizes N=25,30 and T=100,200,400 are considered. At the same time, we choose the position of the change point k∗=0.25T,0.5T,0.75T. According to Theorem 1, we can obtain the p-value and reject the H0 when it is below the significance level α=0.01,0.05. At the same time, we computer the value of the power, which is the probability that the H0 was rejected in 1000 simulations. Moreover, the values of the power are obtained by considering four different distributions of error terms that satisfy E(ϵi,t)=0 and Var(ϵi,t)=1. The four distributions are the: (ⅰ) Standard normal distribution N(0,1), (ⅱ) exponential distribution exp(1)−1, (ⅲ) chi-square distribution 12√2(χ2(4)−4), and (ⅳ) student-t distribution 1√2t(4). Next, Tables 3–6 show the values of the power of parameter changes in the network autoregressive model p=1 detected under four different distributions.
It can be seen from Tables 3–6 that the values of the power are approximately similar for four different error distributions under the same conditions. For α=0.05, almost all of the values of the power are greater than 95%. For α=0.01, the values of the power reach over 90%. Specifically, when N=30, the values of the power of the four distributions at the 0.75T position are generally close to 1. At the same time, it can effectively control the error of type Ⅰ when α=0.01. Moreover, the values of the power increase as N increases and decrease as T increases. Perhaps due to the relatively small value of N, there is increasing bias in estimated parameter values, which in turn affects the test statistic constructed based on the error term. This ultimately leads to a decrease in the values of the power as T increases. The values of the power change with the position of the change point for all error distributions and also increase with the increase of the change-point position.
4.2. The network autoregressive model with p=2
The constructions method of adjacency matrices A1 and A2 are as follows. First, the adjacency matrix A1 construction is the same as that of p=1. Next, the adjacency matrix A2 is simulated from the stochastic block model with 5 blocks. We randomly assign a block label to each node with equal probability, set P(ai,j=1)=0.3N−0.3 if nodes i and j belong to the same block, and P(ai,j=1)=0.3N−1 otherwise.
Then, we study the values of the power of the empirical likelihood statistic. Consider the following model:
The parameters are set to β=(β0,β1,β2,β3)′=(0,0.1,0.1,0.1)′ and β∗=(β∗0,β∗1,β∗2,β∗3)′=(0,−1,−0.2,0.6)′. Consider the same sample size and change-point position as p=1. The p-value based on 1000 simulation runs based on α=0.01,0.05 are calculated. Similarly, like the p=1 model, we consider the same four distributions for the error term to obtain the values of the power. Next, Tables 7–10 show the values of the power of parameter changes in the network autoregressive model p=2 detected under four different distributions.
It can be seen from Tables 7–10 that the results for the case of p=2 have the same pattern as the case of p=1. Although the values of the power for all four distributions have all decreased compared to the case of p=1, they are all above 80% for α=0.05 or α=0.01. Moreover, the biggest value of the power appears at the 0.75T position of α=0.05, N=30, and T=100; and the smallest value of the power appears at the 0.25T position of α=0.01, N=25, and T=400. Meanwhile, it can be concluded that the proposed method can effectively control type 1 errors. This indicates that the proposed ELR is sensitive and robust.
5.
Application
We apply our method to the Chinese stock market dataset, which contains the daily closing prices of 30 stocks from August 18, 2022 to November 15, 2023. The detailed information for these 30 stocks is given in the Supplementary Materials. Furthermore, a time series chart of the daily closing prices of 30 stocks with different prices from 0 to 230 (CNY) is displayed in Figure 1. In order to make the time series chart more aesthetically pleasing, we divided them into four charts for display. The stock prices range from 0–15, as shown in Figure 1(a); 10–40, as shown in Figure 1(b); 20–60, as shown in Figure 1(c); and 70–230, as shown in Figure 1(d).
The response variables of this dataset (Yi,t,i=1,...,30,t=1,...,300) take into account the daily closing prices of 30 stocks over 300 days. We consider the following two different connections: A1 is composed of four regional sectors of stocks: Shanghai, Beijing, Shaanxi, and Shenzhen, and A2 is constructed by five industry sectors: real estate, financial, tourism, media, and technology. Table 11 shows the construction of adjacency matrices. From Formula (3.5), k∗=251 with the corresponding pvalue=5.6488×10−7 can be obtained. Therefore, the change point is detected at position 251, corresponding to August 29, 2023. In addition, Figure 2 shows the values of Zi,k∗ for 30 stocks under different k∗ conditions. Figure 2 shows the maximum value of Zi,k∗ at position k∗=251, with M′N,T=112.1537.
The connecting effects before and after August 29, 2023 can be calculated as ˆβ=(5.6314×10−2,−6.9384×10−4,−5.1843×10−5,9.9816×10−1)′ and ^β∗=(9.9786×10−2,−4.022×10−3,−9.6017×10−4,9.9765×10−1)′. In fact, in late August 2023, the Chinese government issued some home purchase policies, such as recognizing a house but not a loan, lowering mortgage interest rates, and driving people's consumption through implementation in different regions, greatly promoting the development of the real estate and financial industries. The change in connectivity effect from ^β1=−6.9384×10−4 to ^β∗1=−4.022×10−3 may be due to the different implementation times of government-issued housing policies in different regions. The change in connectivity effect from ^β2=−5.1843×10−5 to ^β∗2=−9.6017×10−4 may be due to the degree to which the government's purchasing policies affect different industry sectors.
6.
Conclusions
In this paper, we considered the EL method to detect structural changes for the network autoregressive models. The asymptotic null distribution of the test statistic is the Gumbel extreme value distribution, which was also studied. Through the simulation studies, different error distributions were illustrated, and the results proved that the proposed test statistic has good performance in detecting the change points. The simulation experimental results show that our method can effectively identify changes in the given network autoregressive model. The final application of the Chinese stock market further demonstrated the practical significance of the proposed method. In the future, we will extend the network model to the spatiotemporal data and consider the potential change-point problem. On the other hand, due to the EL method's computational limitations, we can adopt other methods, such as JEL and Adjusted Empirical Likelihood (AEL), in the future to improve the computational problem. Finally, we can also apply the method to more updated network regression models to detect change points in the future.
Author contributions
W. T., C. T., and W. N.: Conceptualization, Methodology, Validation, Investigation, Resources, Supervision, Project administration, Visualization, Writing-review and editing; J. Y. and S. L.: Software, Formal analysis, Data curation, Writing-original draft preparation, Visualization. All authors have read and agreed to the published version of the manuscript.
Use of AI tools declaration
The authors declare that they have not used Artificial Intelligence (AI) tools in the creation of this article.
Acknowledgments
The authors thank the editor and three anonymous referees for their valuable comments and suggestions that helped to improve this article significantly.
The research was supported by the Natural Science Foundation of Top Talent of SZTU (GDRC202214).
Conflict of interest
The authors declare no conflicts of interest.
Supplementary Materials
The regular conditions needed are listed as follows. For 1≤i≤N, assume
C.1. rank(Xi,t−1) = rank(Zi,t−1) = d for k0≤k∗≤T−k0.
C.2. There are some δ>0,ν>0,ν>2+27/min(1+δ), σ2i,1>0, and σ2i,2>0, and positive-definite matrices Σi,1, Σi,2 such that as k∗→∞ and T−k∗→∞,
where r(x)=1/(logx)ν, and |⋅| is the ordinary norm: |(aij)|=(∑i∑ja2ij)1/2.
C.3. There is some δ>0 such that max1≤k∗≤T|qi,t−1|=o(T1/(2+δ)), and E|ϵi,t|2+δ<∞.
Assumption C.2 is slightly weaker than C.9 in Csörgő and Horváth (1997, page 204) that assumes Σi,1=Σi,2. (1/k∗)X′i,t−1Xi,t−1 and (1/T−k∗)Z′i,t−1Zi,t−1 may have different limits if existing. In the commonly adapted regression model that (yi,t,xi,t)'s are an independent and identically distributed sample with E|(yi,t,xi,t)|2+δ<∞ for some, it is easily seen that C.2 and C.3 hold in probability one. The first lemma gives an order estimate for max(˜ϵi,t(k∗)). Denote ˉϵi(k∗)=(1/T)∑Tt=1˜ϵi,t(k∗),
and s2i(k∗)=(1/T)∑Tt=1˜ϵ2i,t(k∗).
Lemma 1. Assume that H0 and C.1–C.3 hold. Then
Lemma 2. Assume that H0 and C.1–C.3 hold. Then
(a) maxk0≤k∗≤T−k0|ˉϵi(k∗)|=OP(T−1/2loglog1/2T).
(b) maxk0≤k∗≤T−k0s2i(k∗)=OP(1) and in probability,
Furthermore, if k→T∞ as T→∞, we have maxkT≤k∗≤T−kT|s2i(k∗)−σ2i|=OP(1).
Lemma 3. Assume that H0 and C.1–C.3 hold. Then for some τ>0,
The proof process of Lemmas 1–3 is similar to Zhao et al. (2013).
Proof of Theorem 1. First, we use Lemmas 1–3 to obtain a quadratic approximation to −2ℜ(i,k∗), uniformly in k∗. Following Owen's (2001, page 221) arguments, denote ζi,t=ˆλi˜ϵi,t(k∗). Using Taylor's expansion,
where |ξi,t|≤|ηi,t|=ˆλ˜ϵi,t(k∗)=OP(1), uniformly in k∗. By Lemmas 1 and 3, for some δ>0,
Next, by Lemma 3, for some τ>0,
and
Combining (6.4)–(6.7) yields that for any
Now applying the Taylor expansion
we have for any 0<τ2<τ1,
Using the same arguments as the proof of Theorem 3.1.2 of Csörgő and Horváth (1997), we have
because A(logu(T))OP(T−τ2)=o(1), and it follows from (6.9) that
Proof of Theorem 2. Under H1, we can obtain
where Gi=∑pk=1βkwki+βp+11 and Hi=∑pk=1β∗kwki+β∗p+11.
By Jensen's inequality,
Hence, Zi,k∗→∞. The proof is complete.