In recent decades, there has been a growing interest in the statistical study of functional random variables, which take values in infinite-dimensional spaces. Simply put, functional data analysis (FDA) typically deals with statistical problems where the data consists of a sample of n functions x1=x1(t),…,xn=xn(t), defined on a compact interval of the real line, say [0,1]. FDA focuses on statistical issues, often referred to as "inference in stochastic processes", where sample information is derived from a partial trajectory (x(t),t∈[0,T]) of a stochastic process ({X(t),t≥0}). In this context, the length (T) of the observation interval plays the role of the sample size (n). This increasing interest has brought to light several statistical challenges associated with functional random variables. The drive to explore this field has been fueled by the increasing availability of high-resolution temporal and spatial data, a trend that is particularly notable in fields such as meteorology, medicine, satellite imaging, and various other scientific disciplines. Consequently, the statistical modeling of such data, viewed as stochastic functions, has led to numerous complex theoretical and computational research questions.
To gain a comprehensive understanding of both the theoretical and practical components of functional data analysis, the reader may consult the monographs by [13] for linear models for random variables taking values in a Hilbert space, and [74] for scalar-on-function and function-on-function linear models, as well as functional principal component analysis and parametric discriminant analysis. Ferraty and Vieu [38] primarily concentrated on nonparametric techniques, particularly kernel-based estimation, for scalar-on-function nonlinear regression models; these methods were further expanded to encompass classification and discrimination analysis. In their study, [48] examined the extension of various statistical concepts, including goodness-of-fit tests, portmanteau tests, and change detection, to the functional data framework. Zhang [91] conducted a study on the analysis of variance for functional data, while [71] primarily investigated regression analysis for Gaussian processes. Various semi-parametric models have been explored in the literature, such as functional single index models [44], projection pursuit models [27], partial linear models [5], and functional sliced inverse regression [43]. Additional studies on functional data modeling and analysis have been documented in [45,49,57,67] and, for recent references, [1,2,15,17,18,19,20,21,22,66,79].
One of the most widely used estimators is the Nadaraya-Watson [68,85] estimator. From a function approximation perspective, the Nadaraya-Watson estimator employs local constant approximations. As highlighted by the numerical analyst [80], "Through all of scientific computing runs this common theme: Increase the accuracy at least to second order. What this means is: Get the linear term right". In other words, local constant approximations are often inadequate, and local linear fits are preferable. Local polynomial fitting thus emerges as an attractive method from both theoretical and practical standpoints. It offers several advantages, as noted in [37]: it adapts to various design types, including random and fixed designs, as well as highly clustered and nearly uniform designs.
Moreover, boundary effects are absent: the bias at the boundary remains of the same order as in the interior, without requiring specific boundary kernels. This is a significant departure from other existing methods. With local polynomial fitting, no boundary modifications are necessary, which is particularly beneficial in multidimensional situations where the boundary can involve a substantial number of data points (see [28,78]). Boundary modifications in higher dimensions pose a considerable challenge. The pivotal method proposed by [60] involved utilizing local higher-order polynomial fitting to estimate the function m and its partial derivatives. The author demonstrated mean-square convergence and joint asymptotic normality (see also [61]). This research holds significant implications for estimating widely used ARCH time series. Masry [59] established the strong consistency of the regression estimator and its partial derivatives up to a fixed total order p, obtaining sharp rates of almost sure convergence that are uniform over compact sets. In [63], the local polynomial estimation of regression functions and their derivatives is examined, establishing the joint asymptotic normality of the relevant estimates for both strongly mixing and ρ-mixing processes. Masry and Mielniczuk [64] considered the nonparametric estimation of a multivariate regression function and its derivatives for a regression model with long-range dependent errors. The authors adopted a local linear fitting approach and established the joint asymptotic distributions for the estimators of the regression function and its derivatives.
In the functional data literature, work on local fitting has so far centered almost exclusively on local linear estimation. Rachdi et al. [73] combined the k-nearest neighbors method with the local linear approach to obtain a new estimator of the regression operator, designed for the case where the regressor is functional and the scalar response is subject to random missing observations. In parallel, [7] estimated the conditional density function by the local linear methodology, and Chikr-Elmezouar et al. [29] studied the estimation of the conditional density and mode with functional covariates. For local linear regression, [8] proposed a new estimator and studied its asymptotic behavior in detail. The work of [9] and [11] introduced a general framework describing the local behavior of the regression operator; see [3] for recent references. The foundational work of [92] established mean-squared convergence and asymptotic normality for the local linear estimator. Extending the scope to the case where both the response and the explanatory variable are functional, [32] devised a nonparametric local linear estimator of the regression function. The principal objective of the present investigation is to establish a comprehensive framework for local polynomial fitting in the functional setting and to provide the first complete theoretical validation of the kernel polynomial estimator, namely its uniform consistency rate and its asymptotic distribution. Addressing this problem is not a mere combination of ideas from separate domains: intricate mathematical derivations, resting on large-sample theory, are required to handle the challenges posed by the estimators in our specific context.
The article is structured as follows. Section 2 provides the necessary notation and definitions. The main results are outlined in Section 3. The asymptotic distribution is discussed in Section 3.1, and its application to the confidence regions is presented in Section 3.3. The establishment of uniform almost complete convergence is detailed in Section 3.4. Local linear estimators for the conditional distribution and their asymptotic normality are presented in Section 4. In Section 5, we summarize the findings and highlight remaining open issues. All proofs are deferred to Section 6, with a focus on the most crucial arguments due to the lengthiness of the proofs. Additionally, relevant technical results are provided in the appendix.
We begin by introducing the necessary notation and definitions for the forthcoming sections. Specifically, we recall the definition of the strong mixing property. Let Fki(Z) denote the σ-algebra generated by {Zj:i≤j≤k}. The following definition of the strong mixing coefficient is attributed to [76]. For further details, refer to [24,75,82,83].
Definition 2.1. Let Z={Zi,i=1,2,…} be a strictly stationary sequence of random variables. Given a positive integer n, define
α(n)=sup{|P(A∩B)−P(A)P(B)|:A∈Fk1(Z) and B∈F∞k+n(Z),k∈N∗}. |
The sequence Z is said to be α-mixing (strong mixing) if the mixing coefficient α(n)→0 as n→∞.
The α-mixing condition is the weakest among the commonly used mixing conditions and has numerous practical applications, particularly in economics and finance. For example, several time series models, such as ARCH, ARMA, GARMA, and stochastic volatility models, satisfy the α-mixing condition.
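For illustration only, the following sketch (all names hypothetical, not part of the theory above) simulates a stationary AR(1) path, a simple member of the ARMA family and hence α-mixing, and computes a crude empirical lower bound on α(n) by restricting the supremum in Definition 2.1 to half-line events of the form {X1≤a} and {X1+n≤b}; this bound typically decays as the lag grows.

```python
import numpy as np

# Illustrative sketch (not from the paper): an AR(1) path is ARMA, hence
# alpha-mixing; we compute a crude empirical LOWER bound on alpha(n) by
# restricting the supremum in Definition 2.1 to rectangle events
# A = {X_1 <= a}, B = {X_{1+n} <= b}.
rng = np.random.default_rng(0)

def ar1_path(size, phi=0.5, burn=500):
    """Simulate a stationary AR(1) process X_t = phi * X_{t-1} + eps_t."""
    eps = rng.standard_normal(size + burn)
    x = np.zeros(size + burn)
    for t in range(1, size + burn):
        x[t] = phi * x[t - 1] + eps[t]
    return x[burn:]

def alpha_lower_bound(x, lag, n_thresholds=25):
    """Max over threshold pairs (a, b) of |P(A and B) - P(A)P(B)|."""
    u, v = x[:-lag], x[lag:]
    qs = np.quantile(x, np.linspace(0.05, 0.95, n_thresholds))
    best = 0.0
    for a in qs:
        for b in qs:
            p_ab = np.mean((u <= a) & (v <= b))
            best = max(best, abs(p_ab - np.mean(u <= a) * np.mean(v <= b)))
    return best

x = ar1_path(20_000)
for lag in (1, 2, 5, 10):
    print(lag, round(alpha_lower_bound(x, lag), 4))  # typically decays toward 0
```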
Remark 2.2. Asymptotic independence, or "mixing", of the data-generating process is a common assumption in statistical learning for time series. For many typical processes, the general forms of various mixing rates are often assumed to be known, but specific coefficients are not. These mixing assumptions are rarely tested, and there are no established methods for estimating mixing coefficients from data. In [65], an estimator for beta-mixing coefficients based on a single stationary sample path is introduced. Khaleghi and Lugosi [51] propose strongly consistent estimators for the ℓ1 norm of the sequence of α-mixing and β-mixing coefficients of a stationary ergodic process. These estimators are subsequently used to develop strongly consistent goodness-of-fit hypothesis tests. Specifically, Khaleghi and Lugosi [51] develop hypothesis tests to determine whether, under the same summability assumption, the α-mixing or β-mixing coefficients of a process are upper bounded by a given rate function.
We consider a sequence {(Xi,Yi):i≥1} of stationary* α−mixing random copies of the random vector [rv] (X,Y), where X takes its values in some abstract space F and Y in R. Suppose that F is endowed with a semi-metric d(⋅,⋅)† defining a topology to measure the proximity between two elements of F and which is disconnected from the definition of X to avoid measurability problems. We will consider especially the conditional expectation of ψ(Y) given X=x,
r(x)=rψ(x)=E(ψ(Y)∣X=x), | (2.1) |
*In the case of Hilbert space-valued elements, strict stationarity is not necessarily required; second-order stationarity suffices. A Hilbert space-valued sequence {Xt}t∈Z is second-order (or weakly) stationary if E‖Xt‖2<∞, EXt=μ, and E[(Xs−μ)⊗(Xt−μ)]=E[(Xs−t−μ)⊗(X0−μ)] for all s,t∈Z.
We say that {Xt}t∈Z is strictly stationary if the joint distribution of {Xt1,…,Xtn} is the same as the joint distribution of {Xt1+h,…,Xtn+h} for all t1,…,tn∈Z, n≥1, and h≥1.
†A semi-metric (sometimes called pseudo-metric) d(⋅,⋅) is a metric which allows d(x1,x2)=0 for some x1≠x2.
whenever this regression function is meaningful. Here and elsewhere, ψ(⋅) denotes a specified measurable function, which is assumed to be bounded on each compact subinterval of R. Note that we can write
ψ(Y)=r(X)+ϵ,withE(ϵ∣X)=0,E(ϵ2∣X)=σ2(X). |
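To make the model concrete, the following minimal sketch (all choices and names are hypothetical illustrations, not taken from the paper) simulates a dependent sample of discretized curves Xi(t) together with scalar responses Yi=r(Xi)+ϵi; the names simulate_sample and GRID are reused in later sketches.

```python
import numpy as np

# Illustrative data-generating sketch (not from the paper): dependent functional
# covariates X_i(t) observed on a grid, with scalar responses following
# Y_i = r(X_i) + eps_i and E(eps | X) = 0.
rng = np.random.default_rng(1)
GRID = np.linspace(0.0, 1.0, 101)
DT = GRID[1] - GRID[0]                       # grid step, for rectangle-rule integrals

def simulate_sample(n=300, phi=0.5):
    """Curves X_i(t) = A_i sin(2 pi t) + B_i cos(2 pi t); the coefficient pairs
    (A_i, B_i) follow AR(1) dynamics, so the functional sample is dependent."""
    a = np.zeros(n)
    b = np.zeros(n)
    for i in range(1, n):
        a[i] = phi * a[i - 1] + rng.standard_normal()
        b[i] = phi * b[i - 1] + rng.standard_normal()
    X = a[:, None] * np.sin(2 * np.pi * GRID) + b[:, None] * np.cos(2 * np.pi * GRID)
    r_true = np.sum(X**2, axis=1) * DT       # r(X) = integral of X(t)^2 dt (nonlinear)
    Y = r_true + 0.2 * rng.standard_normal(n)
    return X, Y, r_true

X, Y, r_true = simulate_sample()
print(X.shape, Y.shape)                      # (300, 101) (300,)
```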
In this work, we study the problem of nonparametric estimation of the regressor function r(⋅) using, for the first time in the functional data, the local polynomial fitting such that the regressor belongs to an infinite dimensional set. From now on, for the ease of the notation, we set ψ(y)=y. Recall that in functional statistics, there are several approaches to developing the concept of local linear methods. For example (see [9]) the linear approximation of r(⋅) for any z in the neighborhood of x is given by
r(z)=r(x)+bβ(z,x)+o(β(z,x)), |
where the quantities a=r(x) and b are estimated by minimizing the following criterion
min(a,b)∈R2n∑i=1(Yi−a−bβ(Xi,x))2K(h−1Kδ(x,Xi)), | (2.2) |
where the locating functions β(⋅,⋅) and δ(⋅,⋅) are defined on F×F and map into R, with |δ(⋅,⋅)|=d(⋅,⋅) for some metric (or semi-metric) d(⋅,⋅). The function β captures the local behavior of our model, K(⋅) is a kernel function, and hK=hK,n is the bandwidth or smoothing parameter of the kernel K(⋅), controlling the size of the local neighborhood and the degree of smoothing. The performance of the estimate depends crucially on the two locating functions β(⋅,⋅) and δ(⋅,⋅). To clarify, specific forms for δ(⋅,⋅) and β(⋅,⋅) can be given. For example, if the functional data are 'smooth' curves, one might use the following family of locating functions:
loc(q)a(x1,x2)=∫θ(t)(x(q)1(t)−x(q)2(t))dt=⟨θ,x(q)1−x(q)2⟩X, |
where θ is a function that can be adapted to the data, X is a Hilbert space, and ⟨⋅,⋅⟩ denotes the inner product. Choosing β(⋅,⋅) from such a family is motivated by its connection to the following minimization problem:
ˆc(x)=argminc(x)∈Rp+1n∑i=1(Yi−p∑l=0cl(x)⟨θ,X(q)i−x(q)⟩lX)2K(h−1|δ(Xi,x)|), |
which can be viewed as a type of 'local polynomial' regression approach when considering a functional explanatory variable. Metrics, or more generally semi-metrics, based on derivatives could also be suitable for locating one curve relative to another. For example, one might define:
loc(q)b(x1,x2)=(∫(x(q)1(t)−x(q)2(t))2dt)1/2, |
which is a semi-metric. This second family of locating functions is particularly well-suited for δ(⋅,⋅), as it measures the proximity between two elements of X. Let U be an open subset of a real Banach space F. If f:U→R is differentiable p+1 times on U, it can be expanded using Taylor's formula:
f(x)=f(a)+Df(a)⋅h+12!D2f(a)⋅h2+⋯+1p!Dpf(a)⋅hp+Rp(x),
with the following expressions for the remainder term Rp(x) :
Rp(x)=1p!Dp+1f(η)⋅(x−η)ph (Cauchy form of the remainder),
Rp(x)=1(p+1)!Dp+1f(ξ)⋅hp+1 (Lagrange form of the remainder),
Rp(x)=1p!∫10Dp+1f(a+th)⋅((1−t)h)phdt (integral form of the remainder).
Here a and x must be points of U such that the line segment between a and x lie inside U,h is x−a, and the points ξ and η lie on the same line segment, strictly between a and x. If we collect the equal mixed partials (assuming that they are continuous) then
1k!Dkf(a)⋅hk=∑|J|=k1J!∂Jf∂xJhJ, |
where J is a multi-index of m components, and each component Ji indicates how many times the derivative with respect to the i th coordinate should be taken, and the exponent that the ith coordinate of h should be raised to in the monomial hJ. The multi-index J runs through all combinations such that J1+⋯+Jm=|J|=k in the sum. The notation J! means J1!⋯Jm!. Let us specify our setting as in [11]. Let U be an open subset of F. If f:U→R is p+1 times differentiable on U, the Taylor expansion of f around x∈U is
f(y)=p∑j=01j!Djf(x)dj(x,y)+Rp(y), |
where Djf(x) denote the j th Fréchet derivative of f at x, with x and y being points of U and d(⋅,⋅) the metric of F. The remainder is given by
Rp(y)=1p!∫10Dp+1f{x+td(x,y)}dp+1(x,y)(1−t)pdt=o{dp(x,y)}, |
as d(x,y)→0; see [26] and [90] for more details. Then, in our setting, by following a similar reasoning as in [9] and [11], the local polynomial estimator of order p of the regression function denoted by ˆr(x) of r(x) is defined as the first component of the estimate ˆc(x)=(ˆc0(x),…,ˆcp(x)) which is obtained by the following minimization problem
ˆc(x)=argminc(x)∈Rp+1n∑i=1(Yi−p∑l=0cl(x)βl(Xi,x))2K(h−1Kδ(x,Xi)). | (2.3) |
Furthermore, the estimator ˆrl(x)=l!ˆcl(x), where 0≤l≤p can be expressed similarly as an estimator of the existing lth order derivatives of the regression function r(⋅). Then the estimator ˆrl(x) can be written as
ˆc(x)=(⊤QβWQβ)−1⊤QβWY, | (2.4) |
where ⊤Qβ is the matrix defined by
⊤Qβ=(1⋯1β(X1,x)⋯β(Xn,x)⋮⋯⋮βp(X1,x)⋯βp(Xn,x)),Y=⊤(Y1,…,Yn),andW=diag(K(h−1kδ(Xi,x))), |
where ⊤ is the transpose symbol. Let us introduce
un,j=1nE(K)n∑i=1(β(Xi,x)hK)jK(h−1Kδ(Xi,x)),vn,j=1nE(K)n∑i=1(β(Xi,x)hK)jK(h−1Kδ(Xi,x))Yi,
Un=(un,0⋯un,p⋮⋱⋮un,p⋯un,2p),andVn=⊤(vn,0,…,vn,p). |
Keeping in mind this notation, we have the representation
ˆc(x)=diag(1,h−1K,h−2K,…,h−pK)U−1nVn. |
Interestingly, this class of estimators includes both the classical Nadaraya-Watson estimator, which minimizes (2.3) when p=0, and the local linear kernel estimator, which corresponds to p=1.
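As an illustration, a minimal numerical sketch of the estimator follows, assuming discretized curves on a common grid and taking, purely for convenience, both locating functions equal to the projection β(z,x)=δ(x,z)=⟨θ,z−x⟩ with θ≡1 (a member of the family loc_a above); the weighted least squares problem (2.3) is solved directly, and the first component of the solution estimates r(x). The kernel used is an illustrative choice and is not claimed to satisfy every requirement of (H5).

```python
import numpy as np

# Minimal sketch of the estimator (2.3)-(2.4), under illustrative assumptions:
# curves observed on a common grid, and both locating functions taken equal to
# the projection beta(z, x) = delta(x, z) = <theta, z - x> with theta == 1.
def kernel(u):
    return np.where(np.abs(u) <= 1.0, 1.0 - u**2, 0.0)   # illustrative choice

def local_poly_fit(X, Y, x0, grid, h, p=1):
    """Weighted least squares solution c_hat(x0); c_hat[0] estimates r(x0)."""
    dt = grid[1] - grid[0]
    beta = np.sum(X - x0[None, :], axis=1) * dt     # <theta, X_i - x0> with theta == 1
    w = kernel(beta / h)                            # here delta = beta, so |delta| = |beta|
    Q = np.vander(beta, N=p + 1, increasing=True)   # columns 1, beta, ..., beta^p
    sw = np.sqrt(w)
    c_hat, *_ = np.linalg.lstsq(Q * sw[:, None], Y * sw, rcond=None)
    return c_hat

# Hypothetical usage with the simulated sample above:
# c_hat = local_poly_fit(X, Y, X[0], GRID, h=0.5, p=2); r_hat = c_hat[0]
```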
Let x (resp. y) be a fixed point in F (resp. ∈R), and Nx (resp. Ny) denote a fixed neighborhood of x (resp. y).
In the remainder of the paper, we denote the closed ball in F of center x∈F and radius h by
B(x,h):={x′∈F:|δ(x,x′)|≤h}. |
We define
ϕx(l1,l2)=P(l2≤δ(X,x)≤l1), |
where l1 and l2 are two real numbers. To establish the asymptotic behavior of the local polynomial of our estimator, we need some following assumptions.
(H1) For any u>0, ϕx(u):=ϕx(−u,u)>0 where limu→0ϕx(u)=0.
It is easy to see that
φx(u)=P(X∈B(x,u)). |
When the function |δ(⋅,⋅)| satisfies the conditions of a metric or a semi-metric, φx(u) can be interpreted as the probability of a ball in F centered at x with radius u. When u approaches zero, one speaks of "small ball probabilities", a notion that has been extensively studied in probability theory (refer to [54] for a comprehensive summary of this topic in the context of Gaussian processes). The function φx(⋅), which extends the small ball probability concept, plays in the functional setting a role similar to that of the density function in the finite-dimensional context. In multivariate nonparametric analysis, it is customary to estimate a quantity at a given point using a sufficiently large number of observations available in its vicinity; this is usually guaranteed by assuming that the density evaluated at this point is strictly positive. In infinite dimensions, the absence of a reference measure, such as the Lebesgue measure in finite dimensions, necessitates a similar assumption that does not rely on the concept of density. The objective of Hypothesis (H1) is precisely to incorporate this functional aspect: it guarantees that there are sufficiently many observations surrounding x, which justifies estimating the regression operator at the point x. The following conditions are needed in our analysis.
(H2) (Xi,Yi)i∈N is an α-mixing sequence and
(ⅰ) ∃a>0,∃C>0,∀n∈N,α(n)≤Cn−a,
(ⅱ) supi≠jP((Xi,Xj)∈B(x,hK)×B(x,hK))≤ψx(hK), where ψx(hK) is such that there exists ϵ∈]0,1] for which 0<ψx(hK)=O(ϕ1+ϵx(hK));
(H3) (ⅰ) r(⋅) and σ(⋅) are continuous in the neighbourhood of x, which means that r(⋅) and σ(⋅) are both in the set
{f:F→R,lim|δ(x′,x)|→0f(x′)=f(x)}, |
and there exists C>0, such that, almost surely,
supi≠jE(|YiYj|∣(Xi,Xj))≤C<∞, |
(ⅱ) For any k∈{1,2,…,p,p+1}, the quantities Ψ(k)(0) exist, where Ψ(k)(⋅) denotes the kth derivative of Ψ(⋅) defined by Ψ(s)=E(r(X)−r(x)∣β(X,x)=s);
(H4) The locating operators β(⋅,⋅) and δ(⋅,⋅) satisfy the following conditions:
(ⅰ) ∀z∈ F,C1|δ(x,z)|≤|β(x,z)|≤C2|δ(x,z)|;
(ⅱ) supv∈B(x,r)∣β(v,x)−δ(x,v)∣=o(r);
(H5) The kernel K(⋅) is a bounded and positive function which is supported within [−1,1] and for which the first derivative K′(⋅) satisfies: K(1)>0, K′(t)<0, for t∈[−1,1];
(H6) There exists a positive integer n0 for which
−1ϕx(hK)∫1−1ϕx(zhK,hK)ddz(z2K(z))dz>C3>0 for n>n0, |
and
h2K∫B(x,hK)∫B(x,hK)β(u,x)β(t,x)dP(X1,X2)(u,t)=o(∫B(x,hK)∫B(x,hK) β2(t,x)β2(u,x)dP(X1,X2)), |
where dP(X1,X2) is the cumulative distribution of (X1,X2);
(H7) (ⅰ) For a>1, the function ϕx(⋅) satisfies
∃η>0,C2n11−a≤ϕx(hK)≤C1n13−2a−η; |
(ⅱ) The bandwidths hK satisfy
limn→∞hK=0,limn→∞lognnϕx(hK)=0 and ∃β>0such thatlimn→∞nβhK=0. |
Before presenting our result, we need additional notation. In the sequel, we define certain quantities appearing in the dominant terms of the bias and of the variance in the asymptotic results. For j∈{1,2}, let
Mj=Kj(1)−∫1−1(Kj(u))′χx(u)du, | (3.1) |
where χx(⋅) is a function such that
limhK→0ϕx(−hK,thK)ϕx(hK)=χx(t) for any t∈[−1,1]. |
For all a>0, and b>0, let
N(a,b)=Ka(1)−∫1−1(ubKa(u))′χx(u)du. | (3.2) |
Assumption (H2-ⅰ) requires that the mixing coefficients of the dependent case tend to zero at a suitably mild rate. Assumption (H2-ⅱ) describes the behavior of the joint distribution of the pair (Xi,Xj) in relation to its marginal distributions, allowing us to present an explicit asymptotic variance term. Assumption (H3-ⅰ) imposes the usual moment condition on the responses and the covariance structure of the dependent sample, as detailed in [38]. Assumption (H3-ⅱ) specifies the necessary smoothness condition for the current setting. Assumption (H4) is a standard condition in nonparametric estimation. Condition (H5) is very usual in the nonparametric estimation literature devoted to the functional data context. Notice that a symmetric kernel, as considered in [70], is not adequate in this context since the random process d(x,Xi) is positive; therefore we consider K(⋅) with support [0,1]. This is a natural generalization of the assumption usually made on the kernel in the multivariate case, where K(⋅) is supposed to be a spherically symmetric density function. Assumption (H6) describes the behavior of the bandwidth h in relation to the small ball probabilities and the kernel function K(⋅). Assumption (H7-ⅰ) is a technical condition illustrating the relationship between the small ball probability and an arithmetically α-mixing coefficient, as discussed in [42]. Assumption (H7-ⅱ) is satisfied when hK=n−ϱ and ϕx(hK)=hϑK, provided ϑ>0 and β<ϱ<1/ϑ. In the following, we use the notation ZD=N(μ,σ2) to indicate that the random variable Z follows a normal distribution with mean μ and variance σ2. The symbol D→ denotes convergence in distribution, and P→ denotes convergence in probability.
Remark 3.1. According to [40], our methodology is heavily dependent on the distribution function ϕ(⋅). This dependency is evident in our conditions and the convergence rates of our estimate (via the asymptotic behavior of the quantity nϕ(h)). More precisely, the behavior of ϕ(⋅) around 0 is of paramount importance. Consequently, the small ball probabilities of the underlying functional variable X are crucial. In probability theory, the calculation of P(‖X−x‖<s) for "small" s (i.e., for s tending toward zero) and for a fixed x is known as the "small ball problem". Unfortunately, there are solutions for very few random variables (or processes) X, even when x=0. In several functional spaces, taking x≠0 presents formidable obstacles that may be insurmountable. Typically, authors emphasize Gaussian random variables. We refer you to [54] for a summary of the key findings regarding the probability of small balls. If X is a Gaussian random element on the separable Banach space E and x belongs to the reproducing kernel Hilbert space associated with X, then the following well-known result holds:
P(‖X−x‖≤s)∼CxP(‖X‖≤s),ass→0. |
As far as we know, the results available in the published literature are essentially all of the form
P(‖X−x‖<s)∼cxs−αexp(−C/sβ), |
where α,β,cx, and C are positive constants, and ‖⋅‖ may be a supremum norm, an Lp norm, or a Besov norm. The interested reader can refer to [12,16,20,38,39,40,79] for further discussion. Notably, the pioneering book by [38] extensively comments on the links between nonparametric functional statistics, small-ball probability theory, and the topological structure of the functional space E.
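Although the small ball probability is rarely available in closed form, its empirical counterpart, the proportion of sample curves falling in the ball B(x,h) (the quantity Fx,n,2(h) used below), is straightforward to compute. The following minimal sketch, with hypothetical names, illustrates this for the L2 semi-metric.

```python
import numpy as np

# Illustrative sketch: the empirical small-ball probability
# phi_x_hat(h) = (1/n) * #{ i : d(x, X_i) <= h }, with d taken here as the L2
# semi-metric between discretized curves.  Names refer to the hypothetical
# simulation sketch given earlier.
def small_ball_probability(X, x0, grid, radii):
    dt = grid[1] - grid[0]
    d = np.sqrt(np.sum((X - x0[None, :]) ** 2, axis=1) * dt)   # L2 semi-metric
    return np.array([np.mean(d <= h) for h in radii])

# print(small_ball_probability(X, X[0], GRID, radii=[0.1, 0.25, 0.5, 1.0]))
```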
In the following theorem, we present the limiting law.
Theorem 3.2. Assume that the assumptions (H1)–(H7) are satisfied. We have, as n→∞,
√nϕx(hK){diag(1,hK,…,hpK)(ˆr(x)−r(x)ˆΨ(1)(0)−Ψ(1)(0)⋮ˆΨ(p)(0)−Ψ(p)(0))−hp+1K(p+1)!Ψ(p+1)(0)S−1U}D→N(0,σ2(x)S−1VS−1), |
where
S=(1⋯N(1,p)M1⋮⋱⋮N(1,p)M1⋯N(1,2p)M1),V=diag(M2M21,N(2,2)M21,…,N(2,2p)M21),U=(N(1,p+1)M1⋮N(1,2p+1)M1), |
where ˆcp(x) is a consistent estimator of Ψ(p)(0). Henceforth, we will denote ˆcp(x) as ˆΨ(p)(0), and let ˆr(x)=ˆc0(x).
The proof of Theorem 3.2 is postponed until Section 6. To eliminate the bias term, we need the following additional assumption.
(H8) Assume that limn→∞nh2p+2Kϕx(hK)=0.
Corollary 3.3. Assume that the conditions (H1)–(H8) are satisfied. We have, as n→∞,
√nϕx(hK){diag(1,hK⋯hpK)(ˆr(x)−r(x)ˆΨ(1)(0)−Ψ(1)(0)⋮ˆΨ(p)(0)−Ψ(p)(0))}D→N(0,σ2(x)S−1VS−1). |
Remark 3.4. To cancel the bias term, we need nh2p+2Kϕx(hK)→0, as n→∞. Consequently, this condition and the condition nϕx(hK)→∞ are both satisfied as soon as hK=n−ξ with 0<ξ<1 and ϕx(hK)=hcK with 1ξ−(2p+2)<c<1ξ.
A lower α th quantile of the vector Vn is any quantity τnα∈Rp+1 satisfying τnα=inf{ε:P(Vn≤ε)≥α}, where ε is an infimum over the given set only if there does not exist a ε1<ε in Rp+1 such that P(Vn≤ε1)≥α. We can, without loss of generality, assume P(Vn≤τnα)=α. Since variance matrix Σ(x)=σ2(x)S−1VS−1 is unknown and assumed non-singular, see Corollary 4 of Appendix 1 of [4], there exists a non-singular matrix C(x) such that
C(x)Σ(x)−1C(x)=Ip+1. |
Let us first give a consistent estimate of Σ(x). Making use of the decomposition of χx(t) in (3.1), one may estimate χx(t) by
ˆχx(t)=Fx,n,1(−hK,thK)/Fx,n,2(hK),
where
Fx,n,1(t,u)=1nn∑i=11{t≤δ(x,Xi)≤u} and Fx,n,2(t)=1nn∑i=11{|δ(x,Xi)|≤t}. |
Hence we have the following estimates, for j=1,2,
ˆMj=Kj(1)−∫1−1(Kj(u))′ˆχx(u)du. | (3.3) |
In a similar way, for all a>0, and b>0, we have
ˆN(a,b)=Ka(1)−∫1−1(ubKa(u))′ˆχx(u)du. | (3.4) |
Hence one can consistently estimate Σ(x) by ˆΣ(x)=ˆσ2(x)ˆS−1ˆVˆS−1, where
ˆS=(1⋯ˆN(1,p)ˆM1⋮⋱⋮ˆN(1,p)ˆM1⋯ˆN(1,2p)ˆM1),ˆV=diag(ˆM2ˆM21,ˆN(2,2)ˆM21,…,ˆN(2,2p)ˆM21),ˆU=(ˆN(1,p+1)ˆM1⋮ˆN(1,2p+1)ˆM1).
For n large enough, we have
ˆC(x)ˆΣ(x)−1ˆC(x)=Ip+1. |
An application of Slutsky's theorem together with Corollary 3.3 gives the following result.
Corollary 3.5. Assume that the conditions (H1)–(H8) are satisfied. We have, as n→∞,
√nϕx(hK)ˆC(x){diag(1,hK⋯hpK)(ˆr(x)−r(x)ˆΨ(1)(0)−Ψ(1)(0)⋮ˆΨ(p)(0)−Ψ(p)(0))}D→N(0,Ip+1). |
This corollary can be applied directly to the construction of the confidence regions.
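As an illustration, the following sketch (hypothetical names, restricted to the simplest case p=0 for brevity) builds a pointwise confidence interval by plugging the empirical estimates of ϕx(hK), χx, M1, M2 and σ2(x) into the standardized limit of Corollary 3.5.

```python
import numpy as np

# Illustrative sketch of a pointwise confidence interval, in the simplest case
# p = 0 (Nadaraya-Watson), where the limiting variance reduces to
# sigma^2(x) * M2 / M1^2 after standardization by sqrt(n * phi_x(h)).
# The plug-in quantities mirror the empirical definitions above.
def kernel(u):
    return np.where(np.abs(u) <= 1.0, 1.0 - u**2, 0.0)

def pointwise_ci(X, Y, x0, grid, h, z=1.96):
    dt = grid[1] - grid[0]
    delta = np.sum(X - x0[None, :], axis=1) * dt             # projection locating function
    w = kernel(delta / h)
    phi_hat = np.mean(np.abs(delta) <= h)                    # F_{x,n,2}(h)
    r_hat = np.sum(w * Y) / np.sum(w)                        # p = 0 estimator of r(x0)
    sigma2_hat = np.sum(w * (Y - r_hat) ** 2) / np.sum(w)    # local residual variance

    # chi_x_hat(t) = F_{x,n,1}(-h, t h) / F_{x,n,2}(h), evaluated on a grid of t
    t = np.linspace(-1.0, 1.0, 201)
    chi_hat = np.array([np.mean((delta >= -h) & (delta <= s * h)) for s in t]) / phi_hat
    # M_j_hat = K^j(1) - int_{-1}^{1} (K^j)'(u) chi_hat(u) du  (numerical derivative)
    def M_hat(j):
        kj = kernel(t) ** j
        return kernel(1.0) ** j - np.sum(np.gradient(kj, t) * chi_hat) * (t[1] - t[0])

    se = np.sqrt(sigma2_hat * M_hat(2) / (M_hat(1) ** 2 * len(Y) * phi_hat))
    return r_hat - z * se, r_hat + z * se

# low, high = pointwise_ci(X, Y, X[0], GRID, h=0.5)   # 95% interval with z = 1.96
```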
In this part, we will establish the rate of uniform almost complete convergence over some subset of F. The following definition is needed.
Definition 3.6. Let ϵ>0 be a given number. A finite set of points x1,x2,…,xN in F is called an ϵ−net for S if S⊂∪Nk=1B(xk,ϵ). The quantity ΨS(ϵ)=log(Nϵ(S)), where Nϵ(S) is the minimal number of open balls in F of radius ϵ needed to cover S, is called Kolmogorov's ϵ−entropy of the set S.
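As a simple illustration of this definition, the following sketch (hypothetical names) builds a greedy ϵ-net of a finite set of discretized curves under the L2 semi-metric; the logarithm of its cardinality provides an upper bound on the Kolmogorov ϵ-entropy of that finite set.

```python
import numpy as np

# Illustrative sketch of Definition 3.6: a greedy epsilon-net of a finite set
# of discretized curves under the L2 semi-metric.  The curves and GRID come
# from the hypothetical simulation sketch given earlier.
def greedy_epsilon_net(curves, grid, eps):
    dt = grid[1] - grid[0]
    centers = []
    for x in curves:
        covered = any(np.sqrt(np.sum((x - c) ** 2) * dt) <= eps for c in centers)
        if not covered:
            centers.append(x)              # open a new ball centered at x
    return centers

# net = greedy_epsilon_net(X, GRID, eps=0.5)
# print(len(net), np.log(len(net)))        # covering number and entropy bound
```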
The concept of entropy, introduced by Kolmogorov in the mid-1950s (see [52]), serves as a measure of set complexity, indicating that high entropy implies a significant amount of information is required to accurately describe an element within a given tolerance ε. Consequently, the selection of the topological structure, specifically the choice of the semi-metric, plays a crucial role when examining asymptotic results that are uniform over a subset S of F. In particular, we subsequently observe that a well-chosen semi-metric can enhance the concentration of the probability measure for the functional variable X, while minimizing the ε-entropy of the subset S. Ferraty and Vieu [38] emphasized this phenomenon of concentration of the probability measure for the functional variable by calculating small ball probabilities in different standard scenarios (refer to [41]). For readers interested in these concepts (entropy and small ball probabilities) and/or the utilization of Kolmogorov's ε-entropy in dimensionality reduction problems, we recommend referring to [53] or/and [69], respectively. We establish the uniform almost-complete convergence of our estimator on some subset SF such that:
SF⊂N⋃k=1B(xk,ιn), |
where xk∈F, ιn is a sequence of positive real numbers, and dn denotes the number of balls in the cover. To establish our result, we need the following hypotheses.
(U1) The hypothesis (H1) satisfies the following condition: there exists a differentiable function ϕ(⋅), such that
∀x∈F,0<C1ϕ(hK)≤P(X∈B(x,hK))≤C2ϕ(hK),
and there exists n0 such that for each n<n0
ϕ′(n)<C, |
where ϕ′(⋅) represents the first derivative of ϕ(⋅) and ϕ(0)=0;
(U2) The hypothesis (H2) is satisfied uniformly on x∈SF;
(U3) The function β(⋅,⋅) satisfies (H4) and the following Lipschitz condition, there exists a positive constant C, for all x1,x2 in F, we have
∣β(x,x1)−β(x,x2)∣≤Cd(x1,x2); |
(U4) The regression operator r(⋅) satisfies: there exists a positive constant C and s>0, such for each x∈F and z∈B(x,hK), we have
∣r(x)−r(z)∣≤Cds(x,z);
(U5) The kernel function K(⋅) satisfies (H5) and is Lipschitzian on [−1,1];
(U6) In addition to (H7), the Kolmogorov's ε-entropy of SF satisfies:
(ⅰ) ∃n0,∀n>n0,(logn)2/(nϕ(h))<ψSF(lognn)<nϕ(h)/logn.
(ⅱ) ∃λ>1, such that
∞∑n=1exp{(1−λ)ψSF(lognn)}=∞∑n=1d1−λn<∞. |
Conditions (U1) and (U2) are, respectively, the uniform counterparts of (H1) and (H2). Assumption (U3), initially introduced and discussed by [9], plays a pivotal role in our methodology, particularly when computing the leading dominant terms in our asymptotic results. Moreover, (U4) is essential for evaluating the bias component of the uniform convergence rates, and (U5) is a classical requirement in functional estimation within finite or infinite-dimensional spaces, with examples of kernel functions satisfying this condition available in [38]. The final condition regarding entropy, (U6), implies that ψSF(lognn)=o(nϕ(hK)) as n tends to infinity. In certain "usual" scenarios, such as a Hilbert space with a projection semi-metric, one can take ΨSF(lognn)∼Clogn, and (U6-ⅰ) is then fulfilled whenever (logn)2=O(nϕ(h)). For further insights, one may refer to Example 4 in [41].
We are now equipped to state the main result of this section concerning the uniform almost complete convergence with rate.
Theorem 3.7. Assume that the assumptions (U1)–(U6) are satisfied. We have as n→∞,
supx∈SF‖ˆc(x)−c(x)‖F=O(hp+1K)+O‡a.co.(√logdnnϕ(hK))
‡Let (un) for n∈N be a sequence of real random variables. We say that (un) converges almost-completely (a.co.) toward zero if, and only if, for all ϵ>0, ∑∞n=1P(|un|>ϵ)<∞. Moreover, we say that the rate of almost-complete convergence of (un) toward zero is of order vn (with vn→0), and we write un=Oa.co.(vn) if, and only if, there exists ϵ>0 such that ∑∞n=1P(|un|>ϵvn)<∞. This type of convergence implies both almost sure convergence and convergence in probability.
where
ˆc(x)=(ˆr(x),hKˆΨ(1)(0),…,hpKˆΨ(p)(0)). |
In this section, we establish results analogous to Theorem 3.2 for the local polynomial estimator of order p of the conditional cumulative distribution function, which we denote by ˆFx(⋅). Notice that the conditional distribution function can be expressed as a regression through an appropriate choice of the function ψ(Y) (see Eq (2.1)): if ψy(Y)=1Y≤y, the conditional distribution function is defined by
∀y∈R:Fx(y)=P(Y≤y|X=x). |
The functional local polynomial estimator of Fx(y) is based on the minimization, with respect to ˆa=(ˆa0,…,ˆap), of
ˆa=argmin(a0,…,ap)∈Rp+1n∑i=1(L(h−1L(y−Yi))−p∑l=0al(x)βl(Xi,x))2K(h−1Kδ(x,Xi)), | (4.1) |
where L(⋅) is a cumulative distribution function and (hL=hL,n) is a sequence of positive real numbers. In functional statistics, the estimation of the conditional cumulative distribution function holds significant importance both theoretically and practically, as is evident in applications such as reliability and survival analysis. Additionally, within nonparametric statistics, it underlies numerous prediction tools, including the conditional density, conditional quantiles, and the conditional mode. It is worth noting that previous work applies particularly to the case p=1: the almost-complete convergence of the estimator ˆFx(⋅) for p=1 has been investigated by [31], and [14] has explored the asymptotic normality of the local linear estimator of the conditional distribution function in the case of independent observations. For the development of our estimator, we adopt a similar approach as for the regression operator. Consequently, the local polynomial estimator of Fx(y) can be explicitly expressed as follows:
(ˆFx(y),ˆa1,…ˆap)=diag(1,h−1K,h−2K,…,h−pK)A−1nBn, |
where
An=(Λn,0⋯Λn,p⋮⋱⋮Λn,p⋯Λn,2p),andBn=⊤(Υn,0,…,Υn,p). |
We denote
Λn,j=1nE(K)n∑i=1(β(Xi,x)hK)jK(h−1Kδ(Xi,x)),Υn,j=1nE(K)n∑i=1(β(Xi,x)hK)jK(h−1Kδ(Xi,x))L(h−1L(Yi−y)).
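For illustration, the following sketch (hypothetical names, same simplifying choices as in the earlier regression sketch) computes the local polynomial conditional-CDF estimator by solving the weighted least squares problem (4.1), with L taken as the standard normal distribution function.

```python
import numpy as np
from math import erf

# Minimal sketch of the conditional-CDF estimator defined by (4.1), under the
# same illustrative choices as before (projection locating functions and a
# simple kernel K); L is the standard normal CDF.
def normal_cdf(u):
    return 0.5 * (1.0 + erf(u / np.sqrt(2.0)))

def kernel(u):
    return np.where(np.abs(u) <= 1.0, 1.0 - u**2, 0.0)

def local_poly_cdf(X, Y, x0, y, grid, hK, hL, p=1):
    """Return a_hat; a_hat[0] estimates F^x(y) = P(Y <= y | X = x0)."""
    dt = grid[1] - grid[0]
    beta = np.sum(X - x0[None, :], axis=1) * dt
    w = kernel(beta / hK)
    smoothed = np.array([normal_cdf((y - yi) / hL) for yi in Y])   # L(h_L^{-1}(y - Y_i))
    Q = np.vander(beta, N=p + 1, increasing=True)
    sw = np.sqrt(w)
    a_hat, *_ = np.linalg.lstsq(Q * sw[:, None], smoothed * sw, rcond=None)
    return a_hat

# F_hat_at_1 = local_poly_cdf(X, Y, X[0], 1.0, GRID, hK=0.5, hL=0.2)[0]
```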
To establish the asymptotic convergence of ˆFx(y) we need some notation and assumptions. For any l∈{0,2,…,2p}, we set
φl(⋅,y)=∂lF⋅(y)∂yl and ψl(s)=E[φl(X,y)−φl(x,y)|β(X,x)=s]. |
(M1) For any k∈{1,…,p+1}, the quantities ψ(k)l(0) exist, where ψ(k)l(⋅) denotes the kth order derivative of ψl(⋅).
(M2) The kernel K(⋅) satisfies the assumption (H5) and its first derivative K(1)(⋅) satisfies:
K2(1)−∫1−1(K2(u))′Ψx(u)du>0. |
(M3) The kernel function L(⋅) is a positive, symmetric, bounded, integrable function such that
∫Rz2pL(z)dz<∞. |
(M4) The α-mixing coefficients of the sequence (Xi,Yi)i∈N satisfy (H2) and the following condition:
+∞∑l=1lδ(α(l))1p<∞ for some p>0 and δ>1p. |
(M5) The function ϕx(⋅) satisfies the assumption (H1).
(M6) Let (wn) and (qn) be sequences of positive integers tending to infinity, such that (wn+qn)≤n, and
(ⅰ) qn=o((nϕx(hK))12) and limn→+∞(nϕx(hK))12α(qn)=0,
(ⅱ) rnqn=o((nϕx(hK))12) and limn→+∞rn(nϕx(hK))12α(qn)=0, where rn is the largest integer, such that rn(wn+qn)=O(n).
Theorem 4.1. Under assumptions (M1)–(M6), we have
√nϕx(hK){diag(1,hK,…,hpK)(^Fx(y)−Fx(y)^a1−Ψ(1)0(0)⋮^ap−Ψ(p)0(0))−BpK(x,y)−BpL(x,y)}D→N(0,VLK(x,y)), |
where
BpL(x,y)=p∑j=0[E(βj1K1)hjKE(K1)I+p∑k=1p∑a=1h2kL(2k)!∫t2kL′(t)dtΨ(a)k(0)E(βj+a1K1)hjKE(K1)],I=(h2L2φ2(x,y)∫t2L′(t)dt,…,h2pL(2p)!φ2p(x,y)∫t(2p)L′(t)dt)T,BpK(x,y)=hp+1K(p+1)!Ψ(p+1)0(0)S−1U,VLK(x,y)=Fx(y)(1−Fx(y))S−1VS−1. |
Remark 4.2. Let x∈F be a fixed element, and y∈R. If we define ψy(Y)=1]−∞,y](Y), then the operator rψ(x)=E(ψy(Y)∣X=x) represents the conditional cumulative distribution function (CDF) of Y given X=x, denoted as F(y∣x)=P(Y≤y∣X=x). This can be estimated as ˜F(y∣x):=ˆrψ,n(x). For a given α∈(0,1), the α-th order conditional quantile of the distribution of Y given X=x is defined as qα(x)=inf{y∈R:F(y∣x)≥α}. If F(⋅∣x) is strictly increasing and continuous in a neighborhood of qα(x), then F(⋅∣x) has a unique quantile of order α at qα(x), i.e., F(qα(x)∣x)=α. In such cases:
qα(x)=F−1(α∣x)=inf{y∈R:F(y∣x)≥α}, |
which is uniquely estimated by ˆqn,α(x)=˜F−1(α∣x). Conditional quantiles have been extensively studied when the predictor X is of finite dimension. Since F(qα(x)∣x)=α=˜F(ˆqn,α(x)∣x), and ˜F(⋅∣x) is continuous and strictly increasing, we have, for any ϵ>0, there exists η(ϵ)>0 such that for all y:
|˜F(y∣x)−˜F(qα(x)∣x)|≤η(ϵ)⇒|y−qα(x)|≤ϵ.
This implies that, for any ϵ>0, there exists η(ϵ)>0 such that:
P(|ˆqn,α(x)−qα(x)|≥η(ϵ))≤P(|˜F(ˆqn,α(x)∣x)−˜F(qα(x)∣x)|≥η(ϵ))=P(∣F(qα(x)∣x)−˜F(qα(x)∣x)≥η(ϵ)). |
This result may be immediately used to establish the consistency of the quantile estimator. This topic will be investigated in future research.
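For illustration, the following sketch (hypothetical names, building on the conditional-CDF sketch above) estimates the conditional quantile qα(x) by numerically inverting y↦˜F(y∣x) over a grid of candidate values.

```python
import numpy as np

# Illustrative sketch of Remark 4.2: invert the estimated conditional CDF over
# a grid of candidate y values; local_poly_cdf is the hypothetical sketch above.
def conditional_quantile(X, Y, x0, grid, hK, hL, alpha=0.5, p=1):
    y_grid = np.linspace(Y.min(), Y.max(), 200)
    F_vals = np.array([local_poly_cdf(X, Y, x0, y, grid, hK, hL, p)[0] for y in y_grid])
    F_vals = np.maximum.accumulate(np.clip(F_vals, 0.0, 1.0))   # enforce monotonicity
    idx = int(np.searchsorted(F_vals, alpha))                   # first y with F >= alpha
    return y_grid[min(idx, len(y_grid) - 1)]

# q_median = conditional_quantile(X, Y, X[0], GRID, hK=0.5, hL=0.2, alpha=0.5)
```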
Remark 4.3. Over time, it has become increasingly clear that the performance of kernel estimates deteriorates as the dimensionality of the problem increases. This decline is primarily due to the fact that, in high-dimensional spaces, local neighborhoods often lack sufficient sample observations unless the sample size is extraordinarily large. Consequently, in kernel estimation, computing local averages is not feasible unless the bandwidth is significantly wide. This overarching issue is known as the curse of dimensionality [10]. The paper by [77] provides a comprehensive analysis of the feasibility and challenges of high-dimensional estimation, complete with examples and computations. For recent references, see [6,15,16,35]. To address the difficulties associated with high-dimensional kernel estimation and simplify the process, a wide range of dimension reduction techniques have been developed. These techniques aim to explain the main features of a set of sample curves using a small set of uncorrelated variables. One effective solution is to generalize principal component analysis (PCA) to the context of a continuous-time stochastic process [33]. The asymptotic properties of the estimators of Functional Principal Component Analysis (FPCA) have been extensively studied in the general context of functional variables [30]. Nonparametric methods have been developed to perform FPCA for cases involving a small number of irregularly spaced observations of each sample curve [50,88]. As in the multivariate case, interpreting principal component scores and loadings is a valuable tool for uncovering relationships among the variables in a functional data set. To avoid misinterpretation of PCA results, a new type of plot, named Structural and Variance Information plots, was recently introduced by [25]. Numerical results suggest that regularization using a smoothness measure often yields more satisfactory outcomes than the FPCA approach. The FPCA method assumes that the top eigenfunctions of the predictors, which are unrelated to the coefficient function or the response, provide a good representation of the coefficient function. While it is generally believed that neither approach will consistently outperform the other, the Reproducing Kernel Hilbert Space (RKHS) method is an interesting alternative that merits attention [58,89]. Spline smoothing is one of the most popular and powerful techniques in nonparametric regression [36,46,47,84]. Penalized splines have gained widespread use in recent years due to their computational simplicity and their connection to mixed effects models [86]. These methods have also found significant success in Functional Data Analysis (FDA), as evidenced by tools like the R package refund. However, there is a consensus that the asymptotic theory of penalized splines is difficult to derive, and many theoretical gaps remain, even in the standard nonparametric regression setting. To our knowledge, there are very few theoretical studies on penalized splines for FDA [87]. Most approaches to functional regression are based on minimizing some L2-norm and are therefore sensitive to outliers. Finally, we can mention methods based on the delta sequences [23] and the wavelet methods [34].
Local polynomial fitting exhibits various statistically significant characteristics, particularly in the intricate domain of multivariate analysis. As functional data analysis gains traction in the field of data science, there arises a need for a specialized theory focused on local polynomial fitting. This study employs local higher-order polynomial fitting to address the complex challenge of estimating the regression function operator and its partial derivatives for stationary mixing random processes, represented as (Yi,Xi). For the first time, we have achieved significant progress by demonstrating the joint asymptotic normality of the estimates for the regression function and its partial derivatives, particularly applicable to strongly mixed processes. Additionally, we derive precise formulas for the bias and the variance-covariance matrix of the asymptotic distribution. We establish the uniform strong consistency of the regression function and its partial derivatives across compact subsets, providing a detailed analysis of their convergence rates. Importantly, these conclusions are grounded in general conditions that form the foundation of the underlying models. To illustrate practical utility, we utilize our findings to compute confidence regions for individual points. Furthermore, we extend our concepts to encompass the nonparametric conditional distribution and acquire its limiting distribution. There are multiple avenues for further methodological development. As a prospective direction, relaxing the stationarity assumption and exploring similar uniform limit theorems for local stationary functional ergodic processes would be fruitful. Additionally, considering the functional kNN local polynomial approach and expectile regression estimators in future investigations could yield alternative estimators benefiting from the advantages of both methods, as discussed in [20] and referenced in [19]. Investigating the extension of our framework to the censored data setting would also be of interest. In an upcoming investigation, obtaining a precise uniform-in-bandwidth limit law for the proposed estimators will be essential. This outcome would allow the bandwidth to vary across a comprehensive range, ensuring the estimator's consistency and serving as a valuable practical guideline for bandwidth selection in nonparametric functional data analysis. It is crucial to acknowledge that advancing research in this direction poses a more challenging task compared to previous efforts. The primary challenge lies in the need to develop new probabilistic results, such as inequalities and maximal moment inequalities specifically tailored for independent and identically distributed (i.i.d.) samples, as discussed in [81].
This section is devoted to the proof of our main result. The previously presented notation continues to be used in the following. Before studying the limit law of ˆc(x), we need to center the vector Vn as follows. Let
u∗n,j=1nE(K)n∑i=1(βihK)jKi(Yi−r(x)), | (6.1) |
where V∗n=⊤(u∗n,0,…,u∗n,p). Making use of assumption (H4) and by the Taylor expansion, we obtain
E(r(Xi)∣βi)=r(x)+E(r(Xi)−r(x)∣βi)=r(x)+Ψ(βi)=r(x)+p+1∑k=11k!Ψ(k)(0)βki+o(βp+1i).
Therefore, we have
Γ:=⊤((r(X1),…,r(Xn))=Qβ(r(x)Ψ(1)(0)⋮Ψ(p)(0))+1(p+1)!Ψ(p+1)(0)⊤(βp+11,…,βp+1n)+⊤(o(βp+11),…,o(βp+1n)). |
Making used of Eq (6.1), we have:
Un−1V∗n=diag(1,hK,h2K,…,hpK)(ˆc(x)−(⊤QβWQβ)−1QβWΓ)=diag(1,hK,h2K,…,hpK)(ˆr(x)−r(x)ˆΨ(1)(0)−Ψ(1)(0)⋮ˆΨ(p)(0)−Ψ(p)(0))−hp+1K(p+1)!Ψ(p+1)(0)U−1n(un,p+1⋮un,2p+1)−o(hp+1K)U−1n(un,p+1⋮un,2p+1). | (6.2) |
The rest of the proof for this theorem relies on the following lemmas, the proofs of which are provided in the appendix.
Lemma 6.1. Under the assumptions (H1)–(H7), we have, as n→∞,
UnP→SandU−1n(un,p+1⋮un,2p+1)P→S−1U. |
Hence, in view of Lemma 6.1, Theorem 3.2 will follow once we establish the following lemma.
Lemma 6.2. Under the assumptions of Theorem 3.2, we have, as n→∞,
√nϕx(hK)V∗nD→N(0,σ2(x)V). |
Oussama Bouannani: Conceptualization, formal analysis, investigation, methodology, validation, writing – original draft, review, edition; Salim Bouzebda: Conceptualization, formal analysis, methodology, validation, writing – original draft, review, edition. All authors have read and approved the final version of the manuscript for publication.
The authors declare they have not used Artificial Intelligence (AI) tools in the creation of this article.
The authors extend their sincere gratitude to the Editor-in-Chief, the Associate Editor, and the three referees for their invaluable feedback. Their insightful comments have greatly refined and focused the original work, resulting in a markedly improved presentation.
The authors declare no conflict of interest. Salim Bouzebda is the(a) Guest Editor of special issue "Advances in Statistical Inference and Stochastic Processes: Theory and Applications" for AIMS Mathematics. Salim Bouzebda was not involved in the editorial review and the decision to publish this article.
Lemma A.1. (See [92]) Under assumptions (H1), (H2-ii) and (H5)–(H7-i), we have
(1) E(Kj1)=Mjϕx(hK), for j=1,2;
(2) E(Ka1β1)=o(hKϕx(hK)), for all a>0;
(3) E(Ka1βb1)=N(a,b)hbKϕx(hK)+o(hbKϕx(hK)), for all a>0 and b=2,4;
(4) For all (k,l)∈N∗×N,E(Kk1|β1|l)≤ChlKϕx(hK);
(5) For all (a,c,l,s)∈N∗×N∗×N×N,E(Ka1Kc2|β1|l|β2|s)≤Chl+sKψx(hK).
Proof of Lemma 6.1. To prove this lemma, it suffices to show that the means converge and the variances vanish; convergence in probability then follows. Specifically, it suffices to prove
E(un,0)=1,E(un,1)→0,E(un,p)→N(1,p)M1,E(un,2p+1)→N(1,2p+1)M1,
and
Var(un,j)→0,forj∈{0,1,p,2p+1}. | (A.1) |
Using Lemma A.1, we get
E(un,0)=1,E(un,1)=E(hKβ1K1)E(K1)=o(1),E(un,l)=E(h−lKβl1K1)E(K1)⟶N(1,l)M1forl∈{p,2p+1}. |
Concerning Eq (A.1), we can write
Var(un,j)=1n2E(K1)2(n∑i=1Var(βjih−jKKi)+2∑1≤i≠l≤nCov(βjih−jKKi,βjlh−jKKl))=Qn1+Qn2. |
For the term Qn1, we have
Qn1=1n2E(K1)2n∑i=1Var(βjih−jKKi)≤1nE(K1)2E(β2j1h−2jKK21)≤1nϕx(hK). | (A.2) |
Moreover, to handle the term Qn2, we use the same technique as in [62]: we define the sets W1 and W2 as follows
W1={(i,l)∈{1,…,n} such that 1≤∣i−l∣≤un}, |
and
W2={(i,l)∈{1,…,n} such that un+1≤∣i−l∣≤n−1}, |
where the sequence un is selected in such a way that un→+∞ as n→+∞. Based on the aforementioned splitting, we obtain
Q1,n=1n2E2(K1)∑W1Cov(h−jKβjiKi,h−jKβljKl), | (A.3) |
and
Q2,n=1n2E2(K1)∑W2Cov(h−jKβjiKi,h−jKβjlKl). | (A.4)
For the sum Q1,n, by assumptions (H2-ii), (H5) and by Lemma of [92], we have
∣Cov(h−2jKβjiKi,βjlKl)∣≤∣E(h−2jKβjiKiβjlKl)∣+E2(h−jKβj1K1)≤C(E(KiKl)+E2(K1))≤C(ϕ1+ϵx(hK)+ϕ2x(hK))≤C(ϕ1+ϵx(hK)). |
Then, we readily infer
∣Q1,n∣≤1n2E2(K1)∑W1Cϕ1+ϵx(hK)≤unnϕ−1+ϵx(hK). |
On the other hand, to bound the covariance sum Q2,n, we use the covariance inequality for bounded mixing processes; then, for any l≠i, we have
∣Cov(h−jKβjiKi,h−jKβjlKl)∣≤Cα(|i−l|). |
Next, using the inequality
∑j≥x+1j−a≤∫u≥xu−a, |
we derive
n∑i=1∑W2α(|i−l|)≤Cn(un)1−aa−1. | (A.5) |
Therefore, we readily obtain
|Q2,n|≤C(un)1−anϕ2x(hK). |
Subsequently, by choosing un=⌊ϕ1+ϵx(hK))−1/a⌋ and by assumption (H7), we obtain
Q1,n→0 and Q2,n→0 as n→∞. |
Hence the proof is complete.
Proof of Lemma 6.2. We consider a given vector of real numbers, denoted as a=⊤(α0…αp)≠0, then
√nϕx(hK)a⊤V∗n=1√nn∑i=1(Zi−E(Zi))+1√nn∑i=1E(Zi), | (A.6) |
where
Zi=√nϕx(hK)E(K1)(p∑j=0αjh−jKβji)Ki(Yi−r(x)). |
By the Cramér-Wold theorem and Slutsky's theorem, to establish Lemma 6.2 it suffices, in view of (A.6), to prove the following two claims:
1√nn∑i=1E(Zi)=√nϕx(hK)Ψ′(0)M1(α0o(hK)+p∑j=1ajhj+1KN(1,j+1)), | (A.7) |
and
1√nn∑i=1(Zi−E(Zi))D→N(0,σ2(x)a⊤Va). | (A.8) |
First, we proof Claim (A.7). We have
1√nn∑i=1E(Zi)=√nE(Z1)=√nϕx(hK)E(K1)E((α0+p∑j=1αjh−jKβj1)K1(Y1−r(x)))=nϕx(hK)E(K1)a0E(K1(Y1−r(x)))⏟S1+nϕx(hK)E(K1)p∑j=1ajE(h−jkβjK1(Y1−r(x)))⏟S2. |
Next, by the same arguments as those used by [72] and [40], we get
S1=α0Ψ′(0)E(K1Ψ(β1))+o(E(K1β1))=α0Ψ′(0)o(hKϕx(hK), |
and
S2=p∑i=1ajΨ′(0)hj+1KN(1,j+1)ϕx(hK), |
then, using Lemma A.1, we obtain
1√nn∑i=1E(Zi)=√nϕx(hK)(α0Ψ′(0)M1o(hK)+p∑i=1ajΨ′(0)hj+1KN(1,j+1)). | (A.9) |
Now, we prove Claim (A.8) using the CLT of [56] (see Corollary 2.2, page 196), which rests on the asymptotic behavior of the quantity
limn→∞n∑i=1E[Δ2i], | (A.10) |
where Δi=1√n(Zi−E(Zi)) in addition to the assumptions:
Assumption 1. (A1) There exists a sequence τn=o(√n) such that:
(i) τn≤(maxi=1,…,nCi)−1, where Ci=esssupω∈Ω|Δi|,
(ii) nτn α(ϵτn)→0, for all ϵ>0.
Assumption 2. (A2) There exists a sequence (mn) of positive integers tending to ∞, such that
(i) nmnγn=o(1), where γn:=max1≤i≠j≤n(E|ΔiΔj|),
(ii) (∑∞j=mn+1α(j))∑ni=1Ci=o(1).
We start by computing the limit of (A.10). In order to do so, let us observe that
n∑i=1E(Δ2i)=Var(Z1)=ϕx(hK)E2(K1)E((α0+p∑j=1αjh−jKβj1)2K21(Y1−r(x))2)−ϕx(hK)E2(K1)E2((α0+p∑j=1αjh−jKβj1)K1(Y1−r(x)))=A1−A2. |
Using the fact that
A1=ϕx(hK)E2(K1)(α20E(K21(Y1−r(x))2)⏟I1+2α0p∑j=1αjE(h−jKβj1K21(Y1−r(x))2⏟I2)+ϕx(hK)E2(K1)E((p∑j=1αjh−jKβj1)2K21(Y1−r(x))2⏟I3), |
in combination with Lemma A.1, we readily find
I1=a20σ2(x)M1ϕx(hK),I2=a0p∑j=1ajσ2(x)O(ϕx(hK)), |
and
I3=p∑j=1a2jE(h−2jKβ2j1(K21(Y1−r(X1))2))+2p∑1≤i≠j≤najaiE(h−j−iKβj+i1(K21(Y1−r(X1))2))=σ2(x)M1p∑j=1a2jN(2,2j)ϕx(hK)+2σ2(x)M1p∑1≤i≠j≤najaiO(ϕx(hK)). |
Under assumptions (H1), we have
limn→+∞A1=σ2(x)M1M22(a20+p∑j=1ajN(2,2j))=σ2(x)a⊤Va. |
Subsequently, the assertion (A.7) suggests that E(Z1)→0 as n→∞. Consequently, we infer that A2→0 as n→∞. Concerning Assumption (1), the fact that the kernel K(⋅) is bounded, allows us to infer that
Ci=O(1√nϕx(hK)). |
Therefore, an appropriate choice is the following
τn=√nϕx(hK)logn. |
Furthermore, this choice gives, for all ϵ>0
nτn α(ϵτn)≤C(n1−(a+1)/2(ϕx(hK))−(a+1)/2(logn)(a+1)/2)≤Cn1−(a+1)/2+(a+1)/2(a−1)(logn)(a+1)/2≤Cn(3a−a2)/2(a−1)(logn)(a+1)/2→0 since a>3. |
Concerning Assumption (2), using assumptions (H2-ⅱ) and (H3), we have
E|ΔiΔj|≤ϕx(hK)nE(K1)2∑0≤j≠i≤p|αiαj|E(KiKj)≤∑0≤j≠i≤p|αiαj|nϕx(hK)supi≠jP((Xi,Xj)∈B(x,hK)×B(x,hK))≤Cϕϵx(hK)n. |
Elsewhere, using the fact that
∑j≥x+1j−a≤∫u≥xu−a=[(a−1)xa−1]−1, | (A.11) |
we obtain
∞∑j=mn+1α(j)≤∞∑j=mnα(j)≤∫t≥mnt−adt=m1−ana−1, |
thus,
(∞∑j=mn+1α(j))n∑i=1Ci=O(m1−ana−1√nϕx(hK)). |
We select
mn=⌊(ϕx(hK)nlogn)⌋1/(2(1−a)), |
such that ⌊⋅⌋ represents the function of the integer part. It is evident that, under assumption (H7), mn→∞. Furthermore, by substituting the expression for mn, we readily get
∞∑j=mn+1α(j)n∑i=1Ci=o(1). |
Once more, considering assumption (H2-ⅰ) and (H7), we have
mnγn≤Cn−1−1/(2(1−a)(ϕx(hK))1+1/(2(1−a)(logn)−1/(2(1−a))≤n(−3+2a)/(2(1−a)(ϕx(hK))(3−2a)/(2(1−a)(logn)−1/(2(1−a))≤n−1+η(2a−3)/(2(1−a)(logn)−1/(2(1−a))=o(n−1). |
Hence the proof is complete.
Proof of Theorem 3.7. According to the formula of (6.2), we deduce the following error formula
ˆc(x)−c(x)=U−1n(x)V∗n(x)−U−1n(x)en(x)+hp+1K(p+1)!Ψ(p+1)(0)U−1n(x)en(x)+O(hp+1K)U−1n(x)en(x), |
where en(x)=⊤(un,p+1(x),…,un,2p+1(x)). Hence, Theorem 3.7 is a direct result of the following lemmas.
Lemma A.2. Assume that the conditions (U1), (U3), (U5) and (U6) are fulfilled. We have, as n→∞,
supx∈SF|unj(x)−E(unj(x))|=Oa.co.(√logdnnϕx(hK)). |
Lemma A.3. Assume that the conditions (U1)–(U6) are fulfilled. We have, as n→∞,
supx∈SF|v∗nj(x)−E(v∗nj(x))|=Oa.co.(√logdnnϕx(hK)), |
supx∈SF|enj(x)−E(enj(x))|=Oa.co.(√logdnnϕx(hK)). |
Lemma A.4. Assume that the conditions (U1)–(U5) are fulfilled. We have, as n→∞,
supx∈SF|E(v∗nj(x))|=O(hp+1K),supx∈SF|E(enj(x))|=O(1). |
Proof of Lemma A.2. The proof of this lemma relies on Proposition A.11-ⅰ in [38]. In order to apply the latter, we need to compute a certain quantity. In a first attempt, we evaluate the expression E(unj(x)). For this, we use Lemma A.1 and by the fact that observations are identically distributed, we have
E(unj(x))=1hjKE(K1)E(βj1K1) | (A.12) |
=O(1). | (A.13) |
Moreover, for the covariance term, we have :
S2n,α,l(x)=n∑i=1n∑j=1|Cov(Aαi(x),Alj(x))|, | (A.14)
where
Ali(x)=1hlK(βli(x)Ki(x)−E(βli(x)Ki(x))),forl∈{1,2,…,p}. | (A.15)
Under some of the dependence assumptions, we treat the term of S2n,α,l(x). Then we have
S2n,α,l(x)=Rn,1(x)+Rn,2(x)+nVar(Al1(x)), |
with
Rn,1(x)=∑W1Cov(Alj(x),Aαi(x));W1={(i,j)∈{1,2,…,n} such that 1≤∣i−j∣≤un},
and
Rn,2(x)=∑W2Cov(Alj(x),Aαi(x));W2={(i,j)∈{1,2,…,n} such that un+1≤∣i−j∣≤n−1}.
For the variance term, using Lemma A.1, we deduce that
nVar(Al1(x))=O(nϕx(hK)). | (A.16) |
Concerning the covariance term Rn,1(x), we use the same idea as in Eq (A.3); with the choice un=ϕ−ϵx(hK), we get
Rn,1(x)=O(nϕx(hK)), | (A.17) |
and for the term Rn,2(x) we apply the Rio inequality, see Theorem B.3 (ⅱ). For this technique, we need to compute the absolute moments of the random variable Alj(x). Then, using Newton's binomial expansion, we have
E[|Alj(x)|m]=E|h−lmK(Kj(x)βlj(x)−E[Kj(x)βlj(x)])m|=h−lmKE|m∑i=0Cim(Kj(x)βlj)i(x)(E[Kj(x)βlj(x)])m−i(−1)m−i|≤h−lmKm∑i=0CimE|Kj(x)βlj|i(x)|E[Kj(x)βlj(x)]|m−i≤h−lmKm∑i=0CimE|Kij(x)βlij(x)||E[Kj(x)βlj(x)|m−i, |
where Cim=m!i!(m−i)!. Next, by applying Lemma A.1, we get
E[|Alj|m]=O(ϕx(hK)). |
By following the same reasoning used to establish Eq (A.4) with the choice un=ϕ−ϵx(hK) permits to get:
Rn,2(x)=O(nϕx(hK)). | (A.18) |
Finally, by Eqs (A.16), (A.17) and (A.18), we deduce that
S2n,α,l(x)=O(nϕx(hK)). | (A.19) |
On the other hand, we can write
unj(x)−E(unj(x))=1nϕx(hK)n∑i=1Ali(x), | (A.20) |
where Ali(x) is defined in (A.15). Now, we apply the Definition 3.6 to show the uniform convergence of our estimator on a subset S of F. For this aim, we take
z(x)=argminj∈{1,…,dn}|δ(x,xj)|. |
Let us consider the following decomposition:
supx∈SF|unj(x)−E(unj(x))|≤supx∈SF|unj(x)−unj(xz(x)))|⏟T1+supx∈SF|unj(xz(x))−E(unj(xz(x)))|⏟T2+supx∈SF|E(unj(xz(x)))−E(unj(x))|⏟T3. |
Concening the term T2, for all ϵ>0, we have
P(T2>ϵ)=P(maxj∈{1,…,dn}|unj(x)−E(unj(xz(x)))|>ϵ)≤dnmaxj∈{1,…,dn}P(|unj(x)−E(unj(xz(x)))|>ϵ)≤dnmaxj∈{1,…,dn}P(|Anj(xz(x)))|>ϵnϕ(hK)). |
Next, we can apply the Proposition A.11 of [38] for any m>2,τ>0,ϖ≥1 and for certain 0<C<∞,
P(T2>τ)<C(A1+A2), |
where
A1=dn(1+τ2n2ϕ(hK)2ϖS2n,α,l)−ϖ/2,A2=dnnϖ−1(ϖτnϕ(hK))(a+1)m/(a+m). | (A.21) |
By using equation of (A.19) and under assumption (U1), we obtain that
S2n,α,l=supx∈SFS2n,α,l(x)=O(nϕ(hK)). |
Now, for η>0, we choose
τ=η√logdnnϕ(hK)andϖ=(log(dn))2. |
Therefore, by utilizing the fact that ΨSF(ϵ)=log(dn) and by taking C η2=λ, we obtain under the assumption (U6) the following terms
A1=O(d1−λn)andA2=O(n1−λ′),for λ,λ′>0.
Finally, for η large enough, we infer
P(T2>η√logdnnϕ(hK))<C(d1−λn+n1−λ′). |
Using the fact that
∞∑n=1d1−λn<∞and ∞∑n=11n1+λ′<∞, |
we get
T2=Oa.co.(√logdnnϕ(hK)). |
For the term T1 we have
T1≤CnhjKϕ(hK)supx∈SFn∑i=1Ki(x)1B(x,hK)(Xi)∣βji(x)−βji(xzx)1B(xzx,hK)(Xi)∣+CnhjKϕ(hK)supx∈SFn∑i=1βji(xzx)1B(xzx,hK)(Xi)∣Ki(x)1B(x,hK)(Xi)−Ki(xzx)∣:=Lj1+Lj2. |
Evaluation of the term Lj1. Under assumption (U3) we have
1B(x,hK)(Xi)∣βji(x)−βji(xzx)1B(xzx,hK)(Xi)∣≤Cιnhj−1K1B(x,hK)⋂B(xzx,hK)(Xi)+ChjK1B(x,hK)⋂¯B(xzx,hK)(Xi), |
which gives the following expression
Lj1≤CιnnhKϕ(hK)supx∈SFn∑i=1Ki(x)1B(x,hK)⋂B(xzx,hK)(Xi)+Cnϕ(hK)supx∈SFn∑i=1Ki(x)1B(x,hK)⋂¯B(xzx,hK)(Xi). | (A.22) |
Making use of the Lipschitz condition on K(⋅) and by hypotheses (U3) enables us to directly write
∣βji(xz(x))∣1B(xz(x),hK)(Xi)∣Ki(x)1B(x,hK)(Xi)−Ki(xz(x))∣≤ChjKιn1B(x,hK)⋂B(xz(x),hK)(Xi)+ChjKKi(xz(x))1¯B(x,hK)⋂B(xz(x),hK)(Xi). |
Therefore, we readily obtain
Lj2≤Cιnnϕ(hK)supx∈SFn∑i=11B(x,hK)⋂B(xz(x),hK)(Xi)+Cnϕ(hK)supx∈SFn∑i=1Ki(xz(x))1¯B(x,hK)⋂B(xz(x),hK)(Xi). |
Under assumption (U5), combining the last inequality with (A.22) implies that
T1≤1nϕ(hK)supx∈SFn∑i=1Zi,
where
Zi=CιnhK1B(x,hK)⋃B(xz(x),hK)(Xi).
Using a similar approach to the one employed in the proof of (13.14), under the assumptions (U1), (U2) and (U6), we obtain
S2n=n∑i=1n∑j=1∣Cov(Zi,Zj)∣=O(nϕ(hK)).
Next, by employing a similar proof to the one applied in the evaluation of T2, we infer that
T1=Oa.co.(√logdnnϕ(hK)). |
Concerning the term T3, it is obvious that
T3≤E(supx∈SF|unj(x)−unj(xz(x))|),
we infer that
T3=Oa.co.(√logdnnϕ(hK)). |
This completes the proof.
Proof of Lemma A.4. By conditioning on X1, we have
supx∈SF|E(v∗nj(x))|≤1|E(K1)||E((β1h−1K)jK1)|supX1∈B(x,hK)|r(X1)−r(x)|,
so under assumption (U4) and by using the uniform version of Lemma A.1, we obtain
supx∈SF|E(v∗nj(x))|=O(hp+1K). |
The term supx∈SF|E(enj(x))|=O(1) was already treated in Eq (A.12).
Proof of Theorem 4.1. Using reasoning similar to that employed for the regression function, we show that
√nϕx(hK)(B∗n−BpL(x,y))D→N(0,VF(x,y)), | (A.23) |
where
B∗n=⊤(Υ∗n,0,…,Υ∗n,p),VF(x,y)=Fx(y)(1−Fx(y))a⊤Va
and
Υ∗n,j=1nE(K1)n∑i=1(β(Xi,x)hK)jK(h−1Kδ(Xi,x))(L(h−1L(Yi−y))−Fx(y)).
For any given vector of real numbers a=⊤(α0,…,αp)≠0, we have
√nϕx(hK)a⊤B∗n=1√nn∑i=1(Ri−E(Ri))+1√nn∑i=1E(Ri), |
where
Ri=√ϕx(hK)E(K1)(p∑l=0αlh−lKβli)Ki(Li−Fx(y)). | (A.24) |
According to the Cramér-Wold theorem and Eq (A.24), Eq (A.23) will be verified if we prove the following two statements:
1√nn∑i=1(Ri−E(Ri))D→N(0,Fx(y)(1−Fx(y))a⊤Va), | (A.25) |
1√nn∑i=1(E(Ri))=√nϕx(hK)BpL(x,y). | (A.26) |
Proof of statement A.25. We utilize Bernstein's big-block method, as employed in Theorem 3.1 of [55]. The set {1,2,…,n} is partitioned into 2υ+1 subsets consisting of alternating large blocks of size wn and small blocks of size qn, where
υ=⌊n/(wn+qn)⌋.
Assumption (M6) allows us to define the size of the large block as follows
wn=⌊(nϕx(hK))1/2/rn⌋.
Afterward, under the same assumptions and employing straightforward calculations, we obtain:
limn→+∞qn/wn=0, limn→+∞wn/n=0, limn→+∞wn/√nϕx(hK)=0 and limn→+∞(n/wn)α(qn)=0. | (A.27) |
It can be easily inferred that, as n tends to infinity,
υqn/n≃(n/(wn+qn))(qn/n)≃qn/(wn+qn)≃qn/wn⟶0. | (A.28) |
Presently, we partition the sum n∑i=1(Ri−E(Ri)) into distinct large and small blocks as outlined below. Let
Ij=(j−1)(w+q)+1,lj=(j−1)(w+q)+w+1,j=1,2,…,υ. |
One can see that
1√nn∑i=1(Ri−E(Ri))⏟Ei=υ∑j=1Fj/√n+υ∑j=1F′j/√n+Fn/√n=:n−1/2(S1,n+S2,n+S3,n),
where the random variables Fj,F′j and Fn are defined by
Fj=Ij+w−1∑i=IjEi,F′j=lj+q−1∑i=ljEi,Fn=n∑i=υ(w+q)+1Ei. | (A.29) |
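The block construction itself is purely combinatorial; the sketch below partitions {1,…,n} into the index sets underlying Fj, F′j and Fn. The numerical values of n, wn and qn are hypothetical and serve only to illustrate the construction, not the choice dictated by (M6).

```python
# Illustrative sketch of Bernstein's big-block/small-block partition.
import numpy as np

def block_partition(n, w, q):
    """Index sets of the large blocks F_j, the small blocks F'_j and the remainder F_n."""
    upsilon = n // (w + q)
    large, small = [], []
    for j in range(upsilon):
        start = j * (w + q)                                   # I_j - 1 (0-based indexing)
        large.append(np.arange(start, start + w))             # indices summed in F_j
        small.append(np.arange(start + w, start + w + q))     # indices summed in F'_j
    remainder = np.arange(upsilon * (w + q), n)               # indices summed in F_n
    return large, small, remainder

# Hypothetical sizes: n = 1000, w_n = 40, q_n = 8.
E = np.random.default_rng(2).normal(size=1000)    # stand-ins for E_i = R_i - E(R_i)
large, small, rem = block_partition(len(E), 40, 8)
S1 = sum(E[idx].sum() for idx in large)    # S_{1,n}
S2 = sum(E[idx].sum() for idx in small)    # S_{2,n}
S3 = E[rem].sum()                          # S_{3,n}
```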
Subsequently, making use of Slutsky's theorem, we establish that the two terms 1√nS2,n and 1√nS3,n converge to zero in probability. Before applying the Lindeberg-Feller conditions to establish the asymptotic normality of the term 1√nS1,n, it is imperative to demonstrate that the variables Fj are asymptotically independent. We first compute
E(S22,n)n=υ∑j=11nVar(F′j)+υ∑j=1υ∑i=1i≠j2nCov(F′j,F′i)=:1n(A1+A2). |
Using second-order stationarity, we get
Var(F′j)=qVar(E1)+2q∑i≠jCov(Ei,Ej). |
Therefore
A1n≤υqnVar(E1)⏟VF(x,y)+2nn∑i≠jCov(Ei,Ej). |
To handle the covariance term, we adopt the same procedure as in the verification of Eq (A.2). We set
G1={(i,j)∈{1,2,…,n} such that 1≤∣i−j∣≤cn},
and
G2={(i,j)∈{1,2,…,n} such that cn+1≤∣i−j∣≤n−1}.
Based on the aforementioned splitting, we obtain
2nn∑i≠jCov(Ei,Ej)=2n∑G1Cov(Ri,Rj)+2n∑G2Cov(Ri,Rj)=:U1,n+U2,n. |
Note that
Cov(Ri,Rj)=ϕx(hK)E2(K1)p∑l=0p∑m=0αlαmE(h−lKh−mKβliβmjKiKjE((Li−Fx(y))(Lj−Fx(y))∣(Xi,Xj)))−ϕx(hK)E2(K1)(E(p∑l=0αlh−lKβl1K1(L1−Fx(y))))2.
Utilizing the inequality
∣Li(y)−Fx(y)∣≤1, | (A.30) |
together with Lemma A.1 and the conditions (H2-ⅱ) and (H4-ⅱ), we obtain the following
∣Cov(Ri,Rj)∣≤ϕx(hK)E2(K1)(p∑l=0p∑m=0∣αlαm∣E(KiKj)+(p∑l=0∣αl∣)2E2(Ki))≤ϕx(hK)E2(K1)(CP((Xi,Xj)∈B(x,hK)×B(x,hK))+C′M21ϕ2x(hK))≤Cϕx(hK)ψx(hK)ϕ2x(hK)+C′M21nϕx(hK). |
Then, we readily infer
∣U1,n∣≤(2Cn−1ϕx(hK)ψx(hK)ϕ2x(hK)+2C′n−1M21ϕx(hK))∑G11⏟ncn.
In relation to the summation over the set G2, the utilization of Theorem B.3 (ⅱ) gives
∑G2∣Cov(Ri,Rj)∣≤∑G2C(α|j−i|)1p(E|Ri|q)1q(E|Rj|r)1r. |
Conditioning on Xi and using (A.30) along with assumption (M3), we obtain
(E(|Ri|q))1/q≤√ϕx(hK)M1ϕx(hK)(E(|Ki|qE(|Li−Fx(y)|q∣Xi)))1/q≤C(ϕx(hK))−1/2+1/q.
We readily infer that
∣U2,n∣≤2∑G2C(α|j−i|)1p(ϕx(hK))−1+1q+1r | (A.31) |
≤2c−δn(ϕx(hK))−1p∑|l|>cnlδ(α(|l|))1p. | (A.32) |
Using the fact that ψx(hK)/ϕ2x(hK) is bounded, we select cn=⌊ϕx(hK)−1/(pδ)⌋ and, by assumptions (M4) and (M5), we obtain
2nn∑i≠jCov(Ei,Ej)=o(1). |
Furthermore, for the term A2, we have
∣A2∣n≤∑1≤i<j≤n2nCov(F′j,F′i)=∑1≤i<j≤n2n∣Cov(Rj,Ri)∣⏟o(1). |
Consequently, by (A.27), we get
E(S22,n)/n⟶0, as n⟶∞.
Let us now examine the term S23,n:
E(S23,n)n≤1nn∑i=υ(w+q)+1Var(E1)+2n∑1≤i<j≤n∣Cov(Ei,Ej)∣≤n−υ(w+q)nVar(E1)+2n∑1≤i<j≤n∣Cov(Ei,Ej)∣⏟o(1). |
Besides, from (A.27) and (A.28), we get υw/n⟶1. Then, we obtain
E(S23,n)/n⟶0, as n⟶∞.
In order to demonstrate the asymptotic independence of the variables Fj, we employ Lemma B.1. Setting Vj=exp(itFj/√n), we obtain the relationship
|E(exp(itS1,n/√n))−υ∏j=1E(exp(itFj/√n))|≤16υα(q)≅(n/w)α(q)⟶0. | (A.33) |
Therefore, according to formula (A.33), the variance of the term S1,n is computed as follows:
Var(n−1/2S1,n)=(υw/n)Var(R1),
where
Var(R1)=ϕx(hK)M21ϕ2x(hK)p∑l=0α2lh−2lKVar(βl1K1(L1−Fx(y))).
Using Lemma 6.3 of [14], we obtain
Var(βl1K1(L1−Fx(y)))=E(β2l1K21)Fx(y)(1−Fx(y)).
Then, by using the fact that υw/n⟶1 and Lemma A.1, we get
Var(R1)=(α20M2M21+p∑l=1α2lN(2,2l)M21)Fx(y)(1−Fx(y))=Fx(y)(1−Fx(y))a⊤Va.
Finally, our attention turns to Lindeberg's central limit theorem concerning Fj. It is then enough to demonstrate that for any ϵ>0,
1nυ∑j=1E[F2j1|Fj|>ϵ√nVF(x,y)]⟶0 as n⟶+∞. |
Under assumption (M2), we have
∣Fj∣/√n≤(p∑l=0∣αl∣)w/√nϕx(hK).
According to (A.27), we obtain
∣Fj∣/√n⟶0, as n⟶∞.
Hence, for any given ϵ and when n is sufficiently large, the set {|Fj|>ϵ√nVF(x,y)} becomes empty. Consequently, the demonstration of (A.25) is now concluded.
Proof of statement A.26. We have
1√nn∑i=1(E(Ri))=√nE(R1)=√nϕx(hK)(α0E(K1),α1h−1KE(K1),…,αph−pKE(K1)).(E(K1L1)−Fx(y)E(K1)E(K1β1L1)−Fx(y)E(K1β1)⋮E(K1βp1L1)−Fx(y)E(K1βp1)⏟N(x,y)). |
Next, we compute the term N(x,y). Notice that
E(K1βp1L1)−Fx(y)E(K1βp1)=E(K1βp1(E(L1∣X1)−Fx(y))).
Making use of the Taylor expansion under the assumptions (M2) and (M3), we obtain
E(L1∣X1)=p∑k=0φ2k(X1,y)h2kL(2k)!∫Rt2kL′(t)dt+o(h2pL). |
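Before proceeding, the expansion above can be checked numerically in a toy setting. The sketch below is only an illustration under assumed ingredients (a one-dimensional Gaussian conditional law, the integrated Epanechnikov kernel as L, and p=1); it is not part of the proof.

```python
# Illustrative Monte Carlo check of E[L(h_L^{-1}(y - Y)) | X] ≈ F^x(y)
# + (h_L^2 / 2) φ^(2)(x, y) ∫ t^2 L'(t) dt, in a toy setting where Y | X ~ N(0, 1).
import math
import numpy as np

rng = np.random.default_rng(4)
y, h_L = 0.5, 0.2
Y = rng.normal(size=2_000_000)          # draws from the assumed conditional law of Y given X

def L(u):
    """Integrated Epanechnikov kernel: L'(t) = 0.75 (1 - t^2) on [-1, 1]."""
    u = np.clip(u, -1.0, 1.0)
    return 0.75 * (u - u ** 3 / 3) + 0.5

mc = L((y - Y) / h_L).mean()                               # left-hand side, by Monte Carlo
F = 0.5 * (1.0 + math.erf(y / math.sqrt(2.0)))             # φ_0(x, y) = F^x(y)
pdf_y = math.exp(-y * y / 2.0) / math.sqrt(2.0 * math.pi)  # Gaussian density at y
d2F = -y * pdf_y                                           # φ^(2)(x, y), second derivative in y
int_t2 = 0.2                                               # ∫ t^2 L'(t) dt for the Epanechnikov kernel
approx = F + 0.5 * h_L ** 2 * d2F * int_t2
print(mc, approx)    # the two values agree up to Monte Carlo error and O(h_L^4)
```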
Therefore, we readily obtain
E(K1βp1(E(L1∣X1)−Fx(y)))=p∑k=0E(K1βp1φk(X1,y))h2kL(2k)!∫Rt2kL′(t)dt−φ0(x,y)E(K1βp1)+o(h2pLE(K1βp1)). | (A.34) |
Observing that ψk(0)=0, we get
E(K1βp1φk(X1,y))=φk(x,y)E(K1βp1)+p∑a=1ψ(a)k(0)a!E(K1βp+a1). | (A.35) |
By combining Eqs (A.34) and (A.35), we can derive
E(K1βp1(E(L1∣X1)−Fx(y)))=p∑k=0h2kL(2k)!∫Rt2kL′(t)dt[φk(x,y)E(K1βp1)+p∑a=1ψ(a)k(0)a!E(K1βp+a1)]−φ0(x,y)E(K1βp1)+o(h2pLE(K1βp1)).
So the statement (A.26) is proved.
This appendix collects auxiliary results that are used in the proofs above.
Lemma B.1. [82] Let V1,…,VL be random variables measurable with respect to the σ-algebras Fj1i1,…,FjLiL respectively with 1⩽i1<j1<i2<⋯<jL⩽n,il+1−jl⩾w⩾1 and |Vj|⩽1 for j=1,…, L. Then
|E(L∏j=1Vj)−L∏j=1E(Vj)|⩽16(L−1)α(w), |
where α(w) is the strongly mixing coefficient.
Theorem B.2. (Lindeberg central limit theorem). For each n≥1, let {Un1,…,Unrn} be a collection of independent random variables such that E(Unj)=0 and Var(Unj)<∞ for j=1,…,rn. Set
˜Unj=Unj√rn∑k=1VarUnk,j=1,…,rn. |
Then
rn∑j=1˜Unj→N(0,1) in distribution as n→∞, |
if for every ϵ>0
limn→∞rn∑j=1E|˜Unj|21(|˜Unj|>ϵ)=0. |
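As a toy illustration of the standardization in Theorem B.2 (with an assumed triangular array of Uniform(−1,1) variables, for which E(Unj)=0, Var(Unj)=1/3 and the Lindeberg condition holds trivially since the summands are bounded), one may check numerically that the standardized sum is approximately standard normal.

```python
# Illustrative sketch of the standardization in the Lindeberg CLT (toy example).
import numpy as np

rng = np.random.default_rng(5)
r_n, n_rep = 400, 20000
U = rng.uniform(-1.0, 1.0, size=(n_rep, r_n))   # U_{n1}, ..., U_{n r_n}, replicated n_rep times
U_tilde = U / np.sqrt(r_n / 3.0)                # tilde U_{nj} = U_{nj} / sqrt(sum_k Var(U_{nk}))
S = U_tilde.sum(axis=1)                         # sum over j of tilde U_{nj}
print(round(S.mean(), 3), round(S.std(), 3))    # approximately 0 and 1
```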
The following theorem is Proposition A.10 in [38].
Theorem B.3. Assume that (Tn)n∈Z is α-mixing. For some k∈Z, consider a real random variable T (resp. T′) which is Ak−∞-measurable (resp. A+∞n+k-measurable).
(i) If T and T′ are bounded, then:
∃C,0<C<+∞,Cov(T,T′)≤Cα(n). |
(ii) If, for some positive numbers p,q,r such that p−1+q−1+r−1=1, we have ETp<∞ and ET′q<∞, then:
∃C,0<C<+∞,Cov(T,T′)≤C(ETp)1p(ET′q)1qα(n)1r. |