
Random forests are a class of non-parametric statistical machine learning algorithms used for regression and classification tasks. They can handle sparse, high-dimensional tasks with high accuracy while avoiding overfitting, and they are considered to be among the most accurate general-purpose learning algorithm classes. They are routinely used in many fields, including bio-informatics [15], economics [30], biology [10], and linguistics [18].
The most widely used random forest algorithm was introduced by Breiman [13], who was inspired by the work on random subspaces of Ho [19] and on geometrical feature selection of Amit and Geman [2]. In Breiman's random forest, the trees are grown based on the classification and regression trees (CART) procedure, where both the splitting directions and the training sets are randomized. Despite the few parameters that need to be tuned [16,21], their mathematical properties are still areas of active research [6,22]. A significant distinction within the class of random forest algorithms lies in the way each individual tree is constructed and, in particular, in the dependence of each tree on the data set. Some researchers consider random forests designed independently of the data set [7,14,25].
In 2012, Biau [5] studied a random forest model proposed by Breiman in which the construction is independent of the data set, called in the literature the centered random forest. In [5], an upper bound on the rate of consistency of the algorithm and its adaptation to sparsity were proven. More precisely, for a data set of $n$ samples in a space of dimension $d$, the convergence rate was $O\left(n^{-\frac{1}{\frac{4}{3}d\log 2+1}}\right)$. In 2021, Klusowski [20] improved the rate of convergence to $O\left(\left(n\log^{\frac{d-1}{2}}n\right)^{-\frac{1+\delta}{d\log 2+1}}\right)$, where $\delta$ is a positive constant that depends on the dimension $d$ of the feature space and converges to zero as $d$ approaches infinity. In the same paper, Klusowski proved that this rate of convergence is sharp, although it fails to reach the minimax rate of consistency over the class of Lipschitz functions, $O\left(n^{-\frac{2}{d+2}}\right)$ [29]. There is also important work on the consistency of algorithms that depend on the data [23,27,28]. For a comprehensive overview of both theoretical and practical aspects of random forests, see [8], which surveys the subject up to 2016.
An important tool for algorithmically manipulating random forests is kernel methods. Breiman [11] observed the connection between kernel theory and random forests by showing the equivalence between tree construction and kernel action. Later this was formalized by Geurts et al. in [17]. In the same direction, Scornet in [26] defined the Kernel Random Forest (KeRF) by modifying the original algorithm, and provided theoretical and practical results. In particular, Scornet provided explicit kernels for some generalizations of the algorithms, their rates of consistency, and comparisons with the corresponding random forests. Furthermore, Arnould et al. [3] investigated the trade-off between the interpolation of several random forest algorithms and their consistency results.
In the first part of the paper, we provide the notation and the definitions of the centered and uniform random forests and their corresponding kernel-related formulations. In addition, we improve the rate of consistency for the centered KeRF algorithm. Let $k\ge 1$ be the depth of the trees used to estimate the target variable $Y$ (see Section 2 for definitions and notation).
Theorem 1. Suppose that $X=(X_1,\dots,X_d)$ and $Y$ are related by $Y=m(X)+\epsilon$, where $\epsilon$ is a zero-mean Gaussian noise with finite variance, independent of $X$; $X$ is uniformly distributed in $[0,1]^d$; and $m$ is a regression function, which we assume to be Lipschitz. Then, there exists a constant $\tilde{C}$ such that, for every $n>1$ and every $x\in[0,1]^d$,
$$\mathbb{E}\left(\tilde{m}^{Cen}_{\infty,n}(x)-m(x)\right)^2\le \tilde{C}\,n^{-\left(\frac{1}{1+d\log 2}\right)}(\log n).$$
Here, $m(x)=\mathbb{E}[Y|X=x]$ is the predicted value of $Y$ for $X=x\in[0,1]^d$, while $\tilde{m}^{Cen}_{\infty,n}(x)$ is the estimate for $m$ provided by the kernel version of the centered random forest algorithm.
Similarly, with $\tilde{m}^{Un}_{\infty,n}(x)$ playing for the uniform KeRF algorithm the role $\tilde{m}^{Cen}_{\infty,n}(x)$ had above, we have:
Theorem 2. Let $X$, $Y$, $m$, and $\epsilon$ be as in Theorem 1, with $Y=m(X)+\epsilon$. Then there exists a constant $\tilde{C}$ such that, for every $n>1$ and every $x\in[0,1]^d$,
$$\mathbb{E}\left(\tilde{m}^{Un}_{\infty,n}(x)-m(x)\right)^2\le \tilde{C}\,n^{-\left(\frac{1}{1+\frac{3}{2}d\log 2}\right)}(\log n).$$
Moreover, in Section 4, we provide numerical examples and experiments concerning the tuning parameter $k$, which is the tree depth of the two kernel-based random forest algorithms, by comparing the $L^2$-error for different values and under specific assumptions on the data set.
In the final part of the article, we consider the reproducing kernel $K$ used in the centered KeRF algorithm per se. It is rewarding to look at it as defined on the finite Abelian group $\mathbb{Z}_2^{kd}$, where, as above, $d$ is the dimension of the vector $X$ and $k$ is the depth of the tree. By using elementary Fourier analysis on groups, we obtain several equivalent expressions for $K$ and its group transform, we characterize the functions belonging to the corresponding Reproducing Kernel Hilbert Space (RKHS) $H_K$, we derive results on multipliers, and we obtain bounds for the dimension of $H_K$, which is much smaller than what one might expect.
A usual problem in machine learning is, based on $n$ observations of a random vector $(X,Y)\in\mathcal{X}\times\mathbb{R}\subseteq\mathbb{R}^d\times\mathbb{R}$, to estimate the function $m(x)=\mathbb{E}(Y|X=x)$. In classification problems, $Y$ ranges over a finite set. In particular, we assume that we are given a training sample $\mathcal{D}_n=\{(X_1,Y_1),\dots,(X_n,Y_n)\}$ of independent random variables with a shared joint distribution $P_{X,Y}$, where $X_i\in[0,1]^d$ and $Y_i\in\mathbb{R}$ for every $i=1,\dots,n$. The goal is to use the data set to construct an estimate $m_n:\mathcal{X}\subseteq[0,1]^d\to\mathbb{R}$ of the function $m$. Our convergence rate requires an a priori assumption on the regularity of the function $m$. Following [26], we suppose that $m$ belongs to the class of $L$-Lipschitz functions,
$$|m(x)-m(x')|\le L\,\|x-x'\|.$$
Here, as in [26], we consider on $\mathbb{R}^d$ the distance
$$\|x-x'\|=\sum_{j=1}^d|x_j-x'_j|.$$
Next, we provide the general random forest framework by first defining the notion of a random tree. Additionally, we present two specific variations of the original random forest algorithm, namely the centered and uniform random forest algorithms.
Assume that $\Theta_i$, for $i=1,\dots,M$, is a collection of independent random variables, distributed as $\Theta$. The random variables $\Theta_i$ are used to sample the training set or to select the positions for splitting. The detailed construction in the case of the centered random forest is given in the Appendix.
Definition 1. For the $j$-th tree in the forest, the predicted value at $x$ will be denoted by
$$m_{n,\Theta_j,\mathcal{D}_n}(x)=\sum_{i=1}^n\frac{\mathbb{1}_{X_i\in A_{n,\Theta_j,\mathcal{D}_n}(x)}\,Y_i}{N_{n,\Theta_j,\mathcal{D}_n}(x)},$$
where $A_{n,\Theta_j,\mathcal{D}_n}(x)$ is the cell containing $x$ and $N_{n,\Theta_j,\mathcal{D}_n}(x)$ is the number of points that fall into the cell that $x$ belongs to.
For a fixed value of $x\in[0,1]^d$, the value of the tree is the empirical expectation of $Y$ in the unique cell containing $x$, which is, hopefully, a good guess for the target value corresponding to $x$.
A random forest is a finite collection (average) of independent, finite random trees $\{\Theta_1,\dots,\Theta_M\}$.
Definition 2. The finite $M$ forest is
$$m_{M,n}(x)=\frac{1}{M}\sum_{j=1}^M m_{n,\Theta_j,\mathcal{D}_n}(x).$$
From a modeling point of view, we let $M\to\infty$ and consider the infinite forest estimate
$$m_{\infty,n,\mathcal{D}_n}(x)=\mathbb{E}_{\Theta}\left(m_{n,\Theta,\mathcal{D}_n}(x)\right).$$
The convergence holds almost surely, conditionally on $\mathcal{D}_n$, by the law of large numbers [12] and [25, Theorem 3.1].
In 2016, Scornet [26] introduced kernel methods in the random forest world (KeRF), producing a kernel-based algorithm, together with estimates on how this compares with the old one, described above.
To understand the intuition behind the KeRF construction, we reformulate the random forest algorithm.
For all $x\in[0,1]^d$,
$$m_{M,n}(x)=\frac{1}{M}\sum_{j=1}^M\left(\sum_{i=1}^n\frac{\mathbb{1}_{X_i\in A_{n,\Theta_j,\mathcal{D}_n}(x)}\,Y_i}{N_{n,\Theta_j,\mathcal{D}_n}(x)}\right).$$
Therefore we can define the weight of every observation $Y_i$ as
$$W_{i,j,n}(x)=\frac{\mathbb{1}_{X_i\in A_{n,\Theta_j,\mathcal{D}_n}(x)}}{N_{n,\Theta_j,\mathcal{D}_n}(x)}.$$
Hence it is clear that the value of the weights changes significantly with the number of points in each cell. A way to overcome this nuisance is to consider simultaneously all the tree cells containing $x$, as the tree is randomly picked in the forest.
For all $x\in[0,1]^d$,
$$\tilde{m}_{M,n,\Theta_1,\dots,\Theta_M}(x)=\frac{1}{\sum_{j=1}^M N_{n,\Theta_j}(x)}\sum_{j=1}^M\sum_{i=1}^n Y_i\,\mathbb{1}_{X_i\in A_{n,\Theta_j}(x)}.$$
This way, empty cells do not affect the computation of the prediction function of the algorithm.
It is proven in [26] that this estimator indeed has a kernel representation.
Proposition 1. [26, Proposition 1] For all $x\in[0,1]^d$, almost surely it holds that
$$\tilde{m}_{M,n,\Theta_1,\dots,\Theta_M}(x)=\frac{\sum_{i=1}^n K_{M,n}(x,X_i)\,Y_i}{\sum_{i=1}^n K_{M,n}(x,X_i)},$$
where
$$K_{M,n}(x,z)=\frac{1}{M}\sum_{i=1}^M\mathbb{1}_{x\in A_{n,\Theta_i,\mathcal{D}_n}(z)}$$
is the proximity function of the $M$ forest.
The infinite random forest arises as the number of trees tends to infinity.
Definition 3. The infinite KeRF is defined as
$$\tilde{m}_{\infty,n}(x)=\lim_{M\to\infty}\tilde{m}_{M,n}(x,\Theta_1,\dots,\Theta_M).$$
The kernel extends to the infinite random forest as well.
Proposition 2. [26, Proposition 2] Almost surely, for all $x,y\in[0,1]^d$,
$$\lim_{M\to\infty}K_{M,n}(x,y)=K_n(x,y),$$
where
$$K_n(x,y)=P_{\Theta}\left(x\in A_n(y,\Theta)\right)$$
is the probability that $x$ and $y$ belong to the same cell in the infinite forest.
In this paper, we say that an estimator function $m_n$ of $m$ is consistent if the following $L^2$-type convergence holds:
$$\mathbb{E}\left(m_n(x)-m(x)\right)^2\to 0$$
as $n\to\infty$.
In the centered and uniform forest algorithms, the way the data set $\mathcal{D}_n$ is partitioned is independent of the data set itself.
The centered forest is designed as follows:
1) Fix $k\in\mathbb{N}$.
2) At each node of each individual tree, choose a coordinate uniformly from $\{1,2,\dots,d\}$.
3) Split the node at the midpoint of the interval of the selected coordinate.
Repeat steps 2) and 3) $k$ times. At the end, we have $2^k$ leaves, or cells. A toy example of this iterative process for $k=1,2$ is in Figures 1 and 2. Our estimation at a point $x$ is achieved by averaging the $Y_i$ corresponding to the $X_i$ in the cell containing $x$.
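To make the construction concrete, here is a minimal Python sketch of a centered forest prediction (the function names and toy data are ours, for illustration only). For predicting at a point $x$, only the root-to-leaf path of $x$ matters, so the sketch draws the $k$ split coordinates along that path and averages the $Y_i$ in the resulting cell, as in Definitions 1 and 2.

```python
import numpy as np

def cell_mask_centered(x, X, k, rng):
    """Boolean mask of the training points lying in the depth-k centered cell
    of x: follow x's root-to-leaf path, splitting a uniformly chosen
    coordinate at the midpoint of the current cell at every level."""
    d = len(x)
    low, high = np.zeros(d), np.ones(d)
    mask = np.ones(len(X), dtype=bool)
    for _ in range(k):
        j = rng.integers(d)                 # coordinate chosen uniformly (step 2)
        mid = 0.5 * (low[j] + high[j])      # midpoint split (step 3)
        if x[j] < mid:
            high[j] = mid
            mask &= X[:, j] < mid
        else:
            low[j] = mid
            mask &= X[:, j] >= mid
    return mask

def centered_forest_predict(x, X, Y, k, M, rng):
    """Finite M-forest estimate of Definition 2; an empty cell contributes 0."""
    preds = []
    for _ in range(M):
        mask = cell_mask_centered(x, X, k, rng)
        preds.append(Y[mask].mean() if mask.any() else 0.0)
    return float(np.mean(preds))

rng = np.random.default_rng(0)
X = rng.random((500, 2))                             # uniform data in [0, 1]^2
Y = X.sum(axis=1) + 0.1 * rng.standard_normal(500)   # a Lipschitz target plus noise
print(centered_forest_predict(np.array([0.3, 0.7]), X, Y, k=4, M=200, rng=rng))
```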
Scornet in [26] introduced the corresponding kernel-based centered random forest, providing the proximity kernel function explicitly.
Proposition 3. A centered random forest kernel with parameter $k\in\mathbb{N}$ has the following multinomial expression [26, Proposition 5]:
$$K^{Cen}_k(x,z)=\sum_{\sum_{j=1}^d k_j=k}\frac{k!}{k_1!\cdots k_d!}\left(\frac{1}{d}\right)^k\prod_{j=1}^d\mathbb{1}_{\lceil 2^{k_j}x_j\rceil=\lceil 2^{k_j}z_j\rceil},$$
where $K^{Cen}_k$ is the kernel of the corresponding centered random forest.
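The multinomial formula can be evaluated exactly by summing over the compositions of $k$, and checked against a Monte Carlo estimate of the cell-sharing probability of Proposition 2. The sketch below (our code, with hypothetical names) does both; for points with coordinates in $(0,1]$ the two values should agree up to sampling error.

```python
import math
import numpy as np

def compositions(k, d):
    """All d-tuples of non-negative integers summing to k."""
    if d == 1:
        yield (k,)
        return
    for first in range(k + 1):
        for rest in compositions(k - first, d - 1):
            yield (first,) + rest

def centered_kernel(x, z, k):
    """Exact multinomial expression of Proposition 3 for K_k^Cen(x, z)."""
    d = len(x)
    total = 0.0
    for comp in compositions(k, d):
        if all(math.ceil(2**c * xi) == math.ceil(2**c * zi)
               for c, xi, zi in zip(comp, x, z)):
            total += (math.factorial(k)
                      / math.prod(math.factorial(c) for c in comp)) / d**k
    return total

def mc_kernel(x, z, k, trials=100_000, seed=1):
    """Monte Carlo estimate of P_Theta(x and z share a depth-k centered cell)."""
    rng = np.random.default_rng(seed)
    hits = 0
    for _ in range(trials):
        lo, hi, together = np.zeros(len(x)), np.ones(len(x)), True
        for _ in range(k):
            j = rng.integers(len(x))
            mid = 0.5 * (lo[j] + hi[j])
            if (x[j] < mid) != (z[j] < mid):   # this split separates the points
                together = False
                break
            if x[j] < mid:
                hi[j] = mid
            else:
                lo[j] = mid
        hits += together
    return hits / trials

x, z = np.array([0.30, 0.62]), np.array([0.34, 0.57])
print(centered_kernel(x, z, k=3), mc_kernel(x, z, k=3))
```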
The uniform random forest was introduced by Biau et al. [7] and, like the centered random forest, is another toy model of Breiman's random forest. The algorithm forms a partition of $[0,1]^d$ as follows:
1) Fix $k\in\mathbb{N}$.
2) At each node of each individual tree, choose a coordinate uniformly from $\{1,2,\dots,d\}$.
3) Split the cell uniformly at random along the side corresponding to the selected coordinate.
Repeat steps 2) and 3) $k$ times. At the end, we have $2^k$ leaves. Our final estimation at a point $x$ is achieved by averaging the $Y_i$ corresponding to the $X_i$ in the cell containing $x$.
Again, Scornet in [26, Proposition 6] derived the corresponding kernel for the uniform random forest.
Proposition 4. The corresponding proximity kernel for the uniform KeRF with parameter $k\in\mathbb{N}$ and $x\in[0,1]^d$ has the following form:
$$K^{Un}_k(0,x)=\sum_{\sum_{j=1}^d k_j=k}\frac{k!}{k_1!\cdots k_d!}\left(\frac{1}{d}\right)^k\prod_{m=1}^d\left(1-x_m\sum_{j=0}^{k_m-1}\frac{(-\ln x_m)^j}{j!}\right),$$
with the convention that
$$\sum_{j=0}^{-1}\frac{(-\ln x_m)^j}{j!}=0,$$
and by continuity we can extend the kernel also to vectors with zero components.
Unfortunately, it is very hard to obtain a general formula for $K^{Un}_k(x,y)$, so we consider instead a translation-invariant uniform KeRF:
$$\tilde{m}^{Un}_{\infty,n}(x)=\frac{\sum_{i=1}^n Y_i\,K^{Un}_k(0,|X_i-x|)}{\sum_{i=1}^n K^{Un}_k(0,|X_i-x|)}.$$
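A direct Python sketch of this estimator, assuming the display above (names are ours; the tiny clip of zero components stands in for the extension by continuity):

```python
import math
import numpy as np

def compositions(k, d):
    """All d-tuples of non-negative integers summing to k."""
    if d == 1:
        yield (k,)
        return
    for first in range(k + 1):
        for rest in compositions(k - first, d - 1):
            yield (first,) + rest

def uniform_kernel_at_zero(u, k):
    """Proposition 4: K_k^Un(0, u) for u in [0, 1]^d."""
    d = len(u)
    u = np.clip(u, 1e-12, 1.0)    # extension by continuity at zero components
    total = 0.0
    for comp in compositions(k, d):
        w = math.factorial(k) / math.prod(math.factorial(c) for c in comp) / d**k
        factor = 1.0
        for km, um in zip(comp, u):
            s = sum((-math.log(um))**j / math.factorial(j) for j in range(km))
            factor *= 1.0 - um * s            # the empty sum (km = 0) gives 0
        total += w * factor
    return total

def uniform_kerf_predict(x, X, Y, k):
    """Translation-invariant uniform KeRF estimate displayed above."""
    w = np.array([uniform_kernel_at_zero(np.abs(Xi - x), k) for Xi in X])
    return float(w @ Y / w.sum())

rng = np.random.default_rng(0)
X = rng.random((300, 2))
Y = X.sum(axis=1) + 0.1 * rng.standard_normal(300)
print(uniform_kerf_predict(np.array([0.3, 0.7]), X, Y, k=4))
```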
In this section, after providing some measure-concentration-type results [9], we improve the rate of consistency of the centered KeRF algorithm. The following lemmata provide the inequalities needed to derive upper bounds for averages of i.i.d. random variables. Lacking a reference, for completeness we provide detailed proofs of these lemmata. Throughout this article all random variables are real-valued, and we write $\|X\|_{L^p}:=\left(\mathbb{E}|X|^p\right)^{1/p}$ and $\|X\|_\infty:=\inf\{t:\ P(|X|\le t)=1\}$.
Lemma 1. Let $X_1,\dots,X_n$ be a sequence of independent and identically distributed real random variables with $\mathbb{E}(X_i)=0$. Assume also that there is a uniform bound for the $L^1$-norm and the supremum norm, i.e., $\mathbb{E}(|X_i|)\le C$ and $\|X_i\|_\infty\le CM$ for every $i=1,\dots,n$. Then for every $t\in(0,1)$,
$$P\left(\left\{\frac{\left|\sum_{i=1}^n X_i\right|}{n}\ge t\right\}\right)\le 2e^{-\tilde{C}_C\frac{t^2 n}{M}}$$
for some positive constant $\tilde{C}_C$ that depends only on $C$.
Proof. For all $x\in[0,1]$ one has $e^x\le 1+x+x^2$. Using the hypothesis, for every $\lambda\le\frac{1}{CM}$,
$$e^{\lambda X_i}\le 1+\lambda X_i+(\lambda X_i)^2\ \Rightarrow\ \mathbb{E}e^{\lambda X_i}\le 1+\lambda^2\mathbb{E}(X_i)^2\le 1+\lambda^2\|X_i\|_1\|X_i\|_\infty\le 1+\lambda^2C^2M\le e^{\lambda^2C^2M}.$$
By the independence of the random variables $X_i$,
$$\mathbb{E}e^{\sum_{i=1}^n\lambda X_i}=\prod_{i=1}^n\mathbb{E}e^{\lambda X_i}\le\prod_{i=1}^n e^{\lambda^2C^2M}=e^{n\lambda^2C^2M}.$$
Therefore, by Markov's inequality,
$$P\left(\left\{\frac{\sum_{i=1}^n X_i}{n}\ge t\right\}\right)\le e^{-\lambda tn}\,\mathbb{E}e^{\sum_{i=1}^n\lambda X_i}\le e^{-\lambda tn}e^{n\lambda^2C^2M}=e^{n\lambda^2C^2M-\lambda tn}.$$
Finally, if $C\ge\frac14$ we choose $\lambda=\frac{t}{2C^2M}$, and otherwise $\lambda=\frac{t}{16CM}$, obtaining
$$P\left(\left\{\frac{\sum_{i=1}^n X_i}{n}\ge t\right\}\right)\le e^{-\tilde{C}_C\frac{t^2n}{M}}.$$
Replacing $X_i$ with $-X_i$, we conclude the proof.
Lemma 2. Let $X_1,\dots,X_n$ be a sequence of non-negative independent and identically distributed random variables with $\mathbb{E}(X_i)\le 2$ and $\|X_i\|_\infty\le M$ for every $i=1,\dots,n$. Let also $\epsilon_1,\dots,\epsilon_n$ be independent random variables, each normally distributed with zero mean and finite variance $\sigma^2$, and independent of the $X_i$.
Then for every $t\in(0,1)$,
$$P\left(\frac{1}{n}\sum_{i=1}^n|\epsilon_iX_i|\ge t\right)\le 2\exp\left(-\frac{Ct^2n}{M}\right),$$
with the positive constant $C$ depending only on $\sigma$.
Proof.
\begin{align*}
P\left(\frac{1}{n}\sum_{i=1}^n\epsilon_iX_i\ge t\right)&=P\left(\exp\left(\frac{\lambda}{n}\sum_{i=1}^n\epsilon_iX_i\right)\ge\exp(\lambda t)\right)\quad\text{(for a positive $\lambda$)}\\
&\le\exp(-\lambda t)\,\mathbb{E}\exp\left(\frac{\lambda}{n}\sum_{i=1}^n\epsilon_iX_i\right)\quad\text{(by Chebyshev's inequality)}\\
&=\exp(-\lambda t)\prod_{i=1}^n\mathbb{E}\exp\left(\frac{\lambda}{n}\epsilon_iX_i\right)\quad\text{(by independence)}\\
&=\exp(-\lambda t)\prod_{i=1}^n\left(1+\sum_{k=2}^\infty\frac{\lambda^k\,\mathbb{E}X_i^k\,\mathbb{E}\epsilon_i^k}{n^kk!}\right)\\
&\le\exp(-\lambda t)\prod_{i=1}^n\left(1+\frac{2}{M}\sum_{k=2}^\infty\frac{\lambda^kM^k\,\mathbb{E}\epsilon_i^k}{n^kk!}\right)\\
&=\exp(-\lambda t)\prod_{i=1}^n\left(1+\frac{2}{M}\left(\mathbb{E}\exp\left(\frac{\lambda M}{n}\epsilon_i\right)-1\right)\right)\\
&\le\exp(-\lambda t)\prod_{i=1}^n\left(1+\frac{2}{M}\left(\exp\left(\frac{\lambda^2\sigma^2M^2}{n^2}\right)-1\right)\right)\\
&=\exp(-\lambda t)\exp\left(\sum_{i=1}^n\log\left(1+\frac{2}{M}\left(\exp\left(\frac{\lambda^2\sigma^2M^2}{n^2}\right)-1\right)\right)\right)\\
&\le\exp(-\lambda t)\exp\left(\frac{2n}{M}\left(\exp\left(\frac{\lambda^2\sigma^2M^2}{n^2}\right)-1\right)\right)\\
&\le\exp(-\lambda t)\exp\left(\frac{2n}{M}\cdot\frac{2\lambda^2\sigma^2M^2}{n^2}\right)\quad\left(\text{for }\lambda\le\frac{n}{\sigma M}\right)\\
&=\exp\left(-\lambda t+\frac{4M\lambda^2\sigma^2}{n}\right).
\end{align*}
Finally, we select $\lambda=\frac{tn}{8M\sigma^2}$ when $\sigma\ge\frac18$, and $\lambda=\frac{tn}{M\sigma}$ when $\sigma\le\frac18$, so that
$$P\left(\frac{1}{n}\sum_{i=1}^n\epsilon_iX_i\ge t\right)\le\exp\left(-\frac{Ct^2n}{M}\right).$$
Replacing $X_i$ with $-X_i$, we conclude the proof.
Theorem 3. Let $Y=m(X)+\epsilon$, where $\epsilon$ is a zero-mean Gaussian noise with finite variance, independent of $X$. Assume also that $X$ is uniformly distributed in $[0,1]^d$ and that $m$ is a Lipschitz function. Then there exists a constant $\tilde{C}$ such that, for every $n>1$ and every $x\in[0,1]^d$,
$$\mathbb{E}\left(\tilde{m}^{Cen}_{\infty,n}(x)-m(x)\right)^2\le\tilde{C}\,n^{-\left(\frac{1}{1+d\log 2}\right)}(\log n).$$
Proof. Following the notation in [26], let $x\in[0,1]^d$ and $\|m\|_\infty=\sup_{x\in[0,1]^d}|m(x)|$; by the construction of the algorithm,
$$\tilde{m}^{Cen}_{n,\infty}(x)=\frac{\sum_{i=1}^n Y_iK_k(x,X_i)}{\sum_{i=1}^n K_k(x,X_i)}.$$
Let
$$A_n(x)=\frac{1}{n}\sum_{i=1}^n\left(\frac{Y_iK_k(x,X_i)-\mathbb{E}(YK_k(x,X))}{\mathbb{E}(K_k(x,X))}\right),$$
$$B_n(x)=\frac{1}{n}\sum_{i=1}^n\left(\frac{K_k(x,X_i)-\mathbb{E}(K_k(x,X))}{\mathbb{E}(K_k(x,X))}\right),$$
and
$$M_n(x)=\frac{\mathbb{E}(YK_k(x,X))}{\mathbb{E}(K_k(x,X))}.$$
Hence, we can reformulate the estimator as
$$\tilde{m}^{Cen}_{n,\infty}(x)=\frac{M_n(x)+A_n(x)}{B_n(x)+1}.$$
Let $t\in(0,\frac12)$, and let $\mathcal{C}_t(x)$ be the event $\{|A_n(x)|\le t,\ |B_n(x)|\le t\}$.
\begin{align*}
\mathbb{E}\left(\tilde{m}^{Cen}_{n,\infty}(x)-m(x)\right)^2&=\mathbb{E}\left(\tilde{m}^{Cen}_{n,\infty}(x)-m(x)\right)^2\mathbb{1}_{\mathcal{C}_t(x)}+\mathbb{E}\left(\tilde{m}^{Cen}_{n,\infty}(x)-m(x)\right)^2\mathbb{1}_{\mathcal{C}^c_t(x)}\\
&\le\mathbb{E}\left(\tilde{m}^{Cen}_{n,\infty}(x)-m(x)\right)^2\mathbb{1}_{\mathcal{C}^c_t(x)}+c_1\left(1-\frac{1}{2d}\right)^{2k}+c_2t^2,
\end{align*}
where the last inequality was obtained in [26, p.1496]. Moreover, as in [26],
$$\mathbb{E}\left(\tilde{m}^{Cen}_{n,\infty}(x)-m(x)\right)^2\mathbb{1}_{\mathcal{C}^c_t(x)}\le c_3(\log n)\left(P\left(\mathcal{C}^c_t(x)\right)\right)^{1/2}.$$
In order to find the rate of consistency, we need a bound for the probability $P(\mathcal{C}^c_t(x))$. Obviously,
$$P\left(\mathcal{C}^c_t(x)\right)\le P\left(|A_n(x)|>t\right)+P\left(|B_n(x)|>t\right).$$
We will work separately to obtain an upper bound for both probabilities.
Proposition 5. Let
$$\tilde{X}_i=\frac{K_k(x,X_i)}{\mathbb{E}(K_k(x,X))}-1$$
be a sequence of i.i.d. random variables. Then for any $t\in(0,1)$,
$$P\left(\left\{\frac{\left|\sum_{i=1}^n\tilde{X}_i\right|}{n}\ge t\right\}\right)=P\left(|B_n(x)|\ge t\right)\le 2e^{-\tilde{C}_1\frac{t^2n}{2^k}}$$
for some positive constant $\tilde{C}_1$.
Proof. It is easy to verify that $\mathbb{E}\tilde{X}_i=0$ and
$$|\tilde{X}_i|=\left|\frac{K_k(x,X_i)}{\mathbb{E}(K_k(x,X))}-1\right|\le\frac{K_k(x,X_i)}{\mathbb{E}(K_k(x,X))}+1,$$
hence $\mathbb{E}|\tilde{X}_i|\le 2$.
Finally,
$$\|\tilde{X}_i\|_\infty=\sup\left\{\left|\frac{K_k(x,X_i)}{\mathbb{E}(K_k(x,X))}-1\right|\right\}\le\frac{1}{\mathbb{E}(K_k(x,X))}\sup K_k(x,X_i)+1\le 2^k+1\le 2^{k+1}.$$
By Lemma 1,
$$P\left(\left\{\frac{\left|\sum_{i=1}^n\tilde{X}_i\right|}{n}\ge t\right\}\right)=P\left(|B_n(x)|\ge t\right)\le 2e^{-\tilde{C}_1\frac{t^2n}{2^k}}.$$
We also need a bound for $P(|A_n(x)|>t)$, where
$$A_n(x)=\frac{1}{n}\sum_{i=1}^n\left(\frac{Y_iK_k(x,X_i)-\mathbb{E}(YK_k(x,X))}{\mathbb{E}(K_k(x,X))}\right).$$
Proposition 6. Let
$$\tilde{Z}_i=\frac{Y_iK_k(x,X_i)-\mathbb{E}(YK_k(x,X))}{\mathbb{E}(K_k(x,X))}$$
for $i=1,\dots,n$. Then for every $t\in(0,1)$,
$$P\left(\left\{\frac{\left|\sum_{i=1}^n\tilde{Z}_i\right|}{n}\ge t\right\}\right)=P\left(|A_n(x)|\ge t\right)\le 4e^{-C\frac{t^2n}{2^k}}$$
for some constant $C$ depending only on $\sigma$ and $\|m\|_\infty$.
Proof.
\begin{align*}
A_n(x)&=\frac{1}{n}\sum_{i=1}^n\left(\frac{Y_iK_k(x,X_i)-\mathbb{E}(YK_k(x,X))}{\mathbb{E}(K_k(x,X))}\right)\\
&=\frac{1}{n}\sum_{i=1}^n\left(\frac{m(X_i)K_k(x,X_i)-\mathbb{E}(m(X)K_k(x,X))}{\mathbb{E}(K_k(x,X))}\right)+\frac{1}{n}\sum_{i=1}^n\left(\frac{\epsilon_iK_k(x,X_i)-\mathbb{E}(\epsilon K_k(x,X))}{\mathbb{E}(K_k(x,X))}\right)\\
&=\frac{1}{n}\sum_{i=1}^n\left(\frac{m(X_i)K_k(x,X_i)-\mathbb{E}(m(X)K_k(x,X))}{\mathbb{E}(K_k(x,X))}\right)+\frac{1}{n}\sum_{i=1}^n\frac{\epsilon_iK_k(x,X_i)}{\mathbb{E}(K_k(x,X))}.
\end{align*}
Therefore,
$$P\left(|A_n(x)|\ge t\right)\le P\left(\left|\frac{2}{n}\sum_{i=1}^n\frac{m(X_i)K_k(x,X_i)-\mathbb{E}(m(X)K_k(x,X))}{\mathbb{E}(K_k(x,X))}\right|\ge t\right)+P\left(\left|\frac{2}{n}\sum_{i=1}^n\frac{\epsilon_iK_k(x,X_i)}{\mathbb{E}(K_k(x,X))}\right|\ge t\right).$$
Let
$$Z_i=\frac{2\left(m(X_i)K_k(x,X_i)-\mathbb{E}(m(X)K_k(x,X))\right)}{\mathbb{E}(K_k(x,X))},$$
a sequence of i.i.d. random variables. It is easy to verify that the $Z_i$ are centered and that
$$|Z_i|=2\left|\frac{m(X_i)K_k(x,X_i)-\mathbb{E}(m(X)K_k(x,X))}{\mathbb{E}(K_k(x,X))}\right|\le 2\|m\|_\infty\,\frac{K_k(x,X_i)+\mathbb{E}(K_k(x,X))}{\mathbb{E}(K_k(x,X))}.$$
Hence $\mathbb{E}|Z_i|\le 4\|m\|_\infty$. Finally,
$$\|Z_i\|_\infty=\sup\{|Z_i|\}=2\sup\left\{\left|\frac{m(X_i)K_k(x,X_i)-\mathbb{E}(m(X)K_k(x,X))}{\mathbb{E}(K_k(x,X))}\right|\right\}\le 2\|m\|_\infty\left(2^k+1\right)\le 4\|m\|_\infty 2^k.$$
By Lemma 1,
$$P\left(\left|\frac{2}{n}\sum_{i=1}^n\frac{m(X_i)K_k(x,X_i)-\mathbb{E}(m(X)K_k(x,X))}{\mathbb{E}(K_k(x,X))}\right|\ge t\right)\le 2e^{-C\frac{nt^2}{2^k}}.$$
Furthermore, let
$$\tilde{W}_i=\frac{2\epsilon_iK_k(x,X_i)}{\mathbb{E}(K_k(x,X))}$$
for $i=1,\dots,n$, a sequence of independent and identically distributed random variables. We can verify that, for every $i=1,\dots,n$,
$$\mathbb{E}\left(\frac{2K_k(x,X_i)}{\mathbb{E}(K_k(x,X))}\right)\le 2.$$
Finally,
$$\sup\left\{\left|\frac{2K_k(x,X_i)}{\mathbb{E}(K_k(x,X))}\right|\right\}\le\frac{2}{\mathbb{E}(K_k(x,X))}\sup\{K_k(x,X_i)\}\le 2^{k+1}.$$
By Lemma 2 it is clear that
$$P\left(\left|\frac{2}{n}\sum_{i=1}^n\frac{\epsilon_iK_k(x,X_i)}{\mathbb{E}(K_k(x,X))}\right|\ge t\right)\le 2e^{-C_2\frac{nt^2}{2^k}}.$$
We conclude the proposition by observing that
$$P\left(|A_n(x)|\ge t\right)\le 4e^{-\min\{C_2,C\}\frac{nt^2}{2^k}}.$$
Finally, let us compute the rate of consistency of the centered KeRF algorithm. By Propositions 5 and 6, one has that
$$\left(P\left(\mathcal{C}^c_t(x)\right)\right)^{1/2}\le\left(P\left(|A_n(x)|>t\right)+P\left(|B_n(x)|>t\right)\right)^{1/2}\le c_3e^{-c_4\frac{nt^2}{2^k}},$$
for some constants $c_3,c_4$ independent of $k$ and $n$.
Thus,
$$\mathbb{E}\left(\tilde{m}_{\infty,n}(x)-m(x)\right)^2\le c_1\left(1-\frac{1}{2d}\right)^{2k}+c_2t^2+c_3\log n\,e^{-c_4\frac{t^2n}{2^k}}.$$
We compute the minimum of the right-hand side of the inequality for $t\in(0,1)$:
$$2c_2t-2tc_4\frac{n}{2^k}c_3\log n\,e^{-c_4\frac{t^2n}{2^k}}=0\ \Rightarrow\ e^{-c_4\frac{t^2n}{2^k}}=\frac{c_2}{c_3c_4}\frac{2^k}{n\log n}\quad\text{and}\quad t^2=\frac{1}{c_4}\frac{2^k}{n}\log\left(\frac{c_3c_4}{c_2}\frac{n\log n}{2^k}\right).$$
Hence, the inequality becomes
\begin{align*}
\mathbb{E}\left(\tilde{m}_{\infty,n}(x)-m(x)\right)^2&\le c_1\left(1-\frac{1}{2d}\right)^{2k}+c_2\frac{1}{c_4}\frac{2^k}{n}\log\left(\frac{c_3c_4}{c_2}\frac{n\log n}{2^k}\right)+c_3\log n\,\frac{c_2}{c_3c_4}\frac{2^k}{n\log n}\\
&=c_1\left(1-\frac{1}{2d}\right)^{2k}+c_2\frac{1}{c_4}\frac{2^k}{n}\log\left(\frac{c_3c_4}{c_2}\frac{n\log n}{2^k}\,e\right).
\end{align*}
For every $\epsilon_n\in(0,2]$ it holds that $\log x\le\frac{1}{\epsilon_n}x^{\epsilon_n}$ for $x\ge1$. Then one has that
$$\mathbb{E}\left(\tilde{m}_{\infty,n}(x)-m(x)\right)^2\le c_1\left(1-\frac{1}{2d}\right)^{2k}+\frac{c_2}{c_4\epsilon_n}\left(\frac{e\,c_3c_4}{c_2}\right)^{\epsilon_n}\left(\frac{2^k}{n}(\log n)^{\frac{\epsilon_n}{1-\epsilon_n}}\right)^{1-\epsilon_n}.$$
We pick
$$k=c(d)\log_2\left(\frac{n}{(\log n)^{\frac{\epsilon_n}{1-\epsilon_n}}}\right);$$
thus,
$$\frac{c_2}{c_4\epsilon_n}\left(\frac{e\,c_3c_4}{c_2}\right)^{\epsilon_n}\left(\frac{2^k}{n}(\log n)^{\frac{\epsilon_n}{1-\epsilon_n}}\right)^{1-\epsilon_n}\le\frac{c'}{\epsilon_n}\,n^{(c(d)-1)(1-\epsilon_n)}(\log n)^{\epsilon_n(1-c(d))}$$
for a constant $c'$ independent of $n$, and
\begin{align*}
c_1\left(1-\frac{1}{2d}\right)^{2k}&=c_1\left(1-\frac{1}{2d}\right)^{2c(d)\log_2\frac{n}{(\log n)^{\epsilon_n/(1-\epsilon_n)}}}=c_1 2^{2c(d)\log_2\left(1-\frac{1}{2d}\right)\log_2\frac{n}{(\log n)^{\epsilon_n/(1-\epsilon_n)}}}\\
&=c_1 n^{2c(d)\log_2\left(1-\frac{1}{2d}\right)}\frac{1}{(\log n)^{2c(d)\frac{\epsilon_n}{1-\epsilon_n}\log_2\left(1-\frac{1}{2d}\right)}}.
\end{align*}
Therefore, we balance the two terms by choosing
$$c(d)=\frac{\epsilon_n-1}{2\log_2\left(1-\frac{1}{2d}\right)-(1-\epsilon_n)}.$$
Finally, replacing $\log_2\left(1-\frac{1}{2d}\right)$ by $-\frac{1}{2d\log 2}$,
\begin{align*}
c_1 n^{2c(d)\log_2\left(1-\frac{1}{2d}\right)}\frac{1}{(\log n)^{2c(d)\frac{\epsilon_n}{1-\epsilon_n}\log_2\left(1-\frac{1}{2d}\right)}}
&=c_1 n^{\frac{2(\epsilon_n-1)}{2\left(-\frac{1}{2d\log 2}\right)-(1-\epsilon_n)}\left(-\frac{1}{2d\log 2}\right)}\,\frac{1}{(\log n)^{\frac{2(\epsilon_n-1)}{2\left(-\frac{1}{2d\log 2}\right)-(1-\epsilon_n)}\cdot\frac{\epsilon_n}{1-\epsilon_n}\left(-\frac{1}{2d\log 2}\right)}}\\
&=c_1 n^{-\left(\frac{1-\epsilon_n}{1+(1-\epsilon_n)d\log 2}\right)}(\log n)^{\left(\frac{\epsilon_n}{1+d\log 2(1-\epsilon_n)}\right)}
\end{align*}
and, for the second term, with the same arguments,
$$\frac{\tilde{c}}{\epsilon_n}\,n^{(c(d)-1)(1-\epsilon_n)}(\log n)^{\epsilon_n(1-c(d))}=\frac{\tilde{c}}{\epsilon_n}\,n^{-\left(\frac{1-\epsilon_n}{1+(1-\epsilon_n)d\log 2}\right)}(\log n)^{\left(\frac{\epsilon_n}{1+d\log 2(1-\epsilon_n)}\right)}$$
for a constant $\tilde{c}$ independent of $\epsilon_n$; hence,
$$\mathbb{E}\left(\tilde{m}^{Cen}_{\infty,n}(x)-m(x)\right)^2\le\frac{C}{\epsilon_n}\,n^{-\left(\frac{1-\epsilon_n}{1+(1-\epsilon_n)d\log 2}\right)}(\log n)^{\left(\frac{\epsilon_n}{1+d\log 2(1-\epsilon_n)}\right)},$$
and consequently,
\begin{align*}
\frac{C}{\epsilon_n}\,n^{-\left(\frac{1-\epsilon_n}{1+(1-\epsilon_n)d\log 2}\right)}(\log n)^{\left(\frac{\epsilon_n}{1+d\log 2(1-\epsilon_n)}\right)}&=\frac{C}{\epsilon_n}\,n^{-\left(\frac{1}{1+d\log 2}\right)}(\log n)^{\left(\frac{\epsilon_n}{1+d\log 2(1-\epsilon_n)}\right)}\times n^{\frac{\epsilon_n}{(1+d\log 2)(1+(1-\epsilon_n)d\log 2)}}\\
&\le\frac{C}{\epsilon_n}\,n^{-\left(\frac{1}{1+d\log 2}\right)}(\log n)^{\left(\frac{\epsilon_n}{d\log 2(1-\epsilon_n)}\right)}\times(\log n)^{\frac{\log n}{\log\log n}\left(\frac{\epsilon_n}{(d\log 2)^2(1-\epsilon_n)}\right)}.
\end{align*}
Finally, we finish the proof by selecting $\epsilon_n=\frac{1}{\log n}$, which yields
$$\mathbb{E}\left(\tilde{m}^{Cen}_{\infty,n}(x)-m(x)\right)^2\le\tilde{C}\,n^{-\left(\frac{1}{1+d\log 2}\right)}(\log n).$$
Theorem 4. Let $Y=m(X)+\epsilon$, where $\epsilon$ is a zero-mean Gaussian noise with finite variance, independent of $X$. Assume also that $X$ is uniformly distributed in $[0,1]^d$ and that $m$ is a Lipschitz function. Then, letting $k\to\infty$, there exists a constant $\tilde{C}$ such that, for every $n>1$ and every $x\in[0,1]^d$,
$$\mathbb{E}\left(\tilde{m}^{Un}_{\infty,n}(x)-m(x)\right)^2\le\tilde{C}\,n^{-\left(\frac{1}{1+\frac{3}{2}d\log 2}\right)}(\log n).$$
Proof. Arguing as in the proof for the centered random forest, we can verify that
$$\left(P\left(\mathcal{C}^c_t(x)\right)\right)^{1/2}\le\left(P\left(|A_n(x)|>t\right)+P\left(|B_n(x)|>t\right)\right)^{1/2}\le c_3e^{-c_4\frac{nt^2}{2^k}}$$
for some constants $c_3,c_4$ independent of $k$ and $n$. The rate of consistency for the uniform KeRF is obtained by minimizing, in terms of $n$, the right-hand side of the inequality
$$\mathbb{E}\left(\tilde{m}^{Un}_{\infty,n}(x)-m(x)\right)^2\le c_1\left(1-\frac{1}{3d}\right)^{2k}+c_2t^2+c_3\log n\,e^{-c_4\frac{t^2n}{2^k}}.$$
We compute the minimum of the right-hand side of the inequality for $t\in(0,1)$:
$$2c_2t-2tc_4\frac{n}{2^k}c_3\log n\,e^{-c_4\frac{t^2n}{2^k}}=0\ \Rightarrow\ e^{-c_4\frac{t^2n}{2^k}}=\frac{c_2}{c_3c_4}\frac{2^k}{n\log n}\quad\text{and}\quad t^2=\frac{1}{c_4}\frac{2^k}{n}\log\left(\frac{c_3c_4}{c_2}\frac{n\log n}{2^k}\right).$$
Hence, the inequality becomes
\begin{align*}
\mathbb{E}\left(\tilde{m}^{Un}_{\infty,n}(x)-m(x)\right)^2&\le c_1\left(1-\frac{1}{3d}\right)^{2k}+c_2\frac{1}{c_4}\frac{2^k}{n}\log\left(\frac{c_3c_4}{c_2}\frac{n\log n}{2^k}\right)+c_3\log n\,\frac{c_2}{c_3c_4}\frac{2^k}{n\log n}\\
&=c_1\left(1-\frac{1}{3d}\right)^{2k}+c_2\frac{1}{c_4}\frac{2^k}{n}\log\left(\frac{c_3c_4}{c_2}\frac{n\log n}{2^k}\,e\right).
\end{align*}
For every $\epsilon_n\in(0,2]$ it holds that $\log x\le\frac{1}{\epsilon_n}x^{\epsilon_n}$ for $x\ge1$. Then one has that
$$\mathbb{E}\left(\tilde{m}^{Un}_{\infty,n}(x)-m(x)\right)^2\le c_1\left(1-\frac{1}{3d}\right)^{2k}+\frac{c_2}{c_4\epsilon_n}\left(\frac{e\,c_3c_4}{c_2}\right)^{\epsilon_n}\left(\frac{2^k}{n}(\log n)^{\frac{\epsilon_n}{1-\epsilon_n}}\right)^{1-\epsilon_n}.$$
We pick
$$k=c(d)\log_2\left(\frac{n}{(\log n)^{\frac{\epsilon_n}{1-\epsilon_n}}}\right).$$
Therefore,
$$\frac{c_2}{c_4\epsilon_n}\left(\frac{e\,c_3c_4}{c_2}\right)^{\epsilon_n}\left(\frac{2^k}{n}(\log n)^{\frac{\epsilon_n}{1-\epsilon_n}}\right)^{1-\epsilon_n}\le\frac{c'}{\epsilon_n}\,n^{(c(d)-1)(1-\epsilon_n)}(\log n)^{\epsilon_n(1-c(d))}$$
for a constant $c'$ independent of $n$, and
\begin{align*}
c_1\left(1-\frac{1}{3d}\right)^{2k}&=c_1\left(1-\frac{1}{3d}\right)^{2c(d)\log_2\frac{n}{(\log n)^{\epsilon_n/(1-\epsilon_n)}}}=c_1 2^{2c(d)\log_2\left(1-\frac{1}{3d}\right)\log_2\frac{n}{(\log n)^{\epsilon_n/(1-\epsilon_n)}}}\\
&=c_1 n^{2c(d)\log_2\left(1-\frac{1}{3d}\right)}\frac{1}{(\log n)^{2c(d)\frac{\epsilon_n}{1-\epsilon_n}\log_2\left(1-\frac{1}{3d}\right)}}.
\end{align*}
Therefore, we choose
$$c(d)=\frac{\epsilon_n-1}{2\log_2\left(1-\frac{1}{3d}\right)-(1-\epsilon_n)}.$$
Finally, replacing $\log_2\left(1-\frac{1}{3d}\right)$ by $-\frac{1}{3d\log 2}$,
\begin{align*}
c_1 n^{2c(d)\log_2\left(1-\frac{1}{3d}\right)}\frac{1}{(\log n)^{2c(d)\frac{\epsilon_n}{1-\epsilon_n}\log_2\left(1-\frac{1}{3d}\right)}}
&=c_1 n^{\frac{2(\epsilon_n-1)}{2\left(-\frac{1}{3d\log 2}\right)-(1-\epsilon_n)}\left(-\frac{1}{3d\log 2}\right)}\,\frac{1}{(\log n)^{\frac{2(\epsilon_n-1)}{2\left(-\frac{1}{3d\log 2}\right)-(1-\epsilon_n)}\cdot\frac{\epsilon_n}{1-\epsilon_n}\left(-\frac{1}{3d\log 2}\right)}}\\
&=c_1 n^{-\left(\frac{2(1-\epsilon_n)}{2+(1-\epsilon_n)3d\log 2}\right)}(\log n)^{\left(\frac{2\epsilon_n}{2+3d\log 2(1-\epsilon_n)}\right)}
\end{align*}
and, for the second term, with the same arguments,
$$\frac{\tilde{c}}{\epsilon_n}\,n^{(c(d)-1)(1-\epsilon_n)}(\log n)^{\epsilon_n(1-c(d))}=\frac{\tilde{c}}{\epsilon_n}\,n^{-\left(\frac{1-\epsilon_n}{1+(1-\epsilon_n)\frac{3}{2}d\log 2}\right)}(\log n)^{\left(\frac{\epsilon_n}{1+\frac{3}{2}d\log 2(1-\epsilon_n)}\right)},$$
for a constant $\tilde{c}$ independent of $\epsilon_n$; hence,
$$\mathbb{E}\left(\tilde{m}^{Un}_{\infty,n}(x)-m(x)\right)^2\le\frac{C}{\epsilon_n}\,n^{-\left(\frac{1-\epsilon_n}{1+(1-\epsilon_n)\frac{3}{2}d\log 2}\right)}(\log n)^{\left(\frac{\epsilon_n}{1+\frac{3}{2}d\log 2(1-\epsilon_n)}\right)},$$
and consequently,
\begin{align*}
\frac{C}{\epsilon_n}\,n^{-\left(\frac{1-\epsilon_n}{1+(1-\epsilon_n)\frac{3}{2}d\log 2}\right)}(\log n)^{\left(\frac{\epsilon_n}{1+\frac{3}{2}d\log 2(1-\epsilon_n)}\right)}&=\frac{C}{\epsilon_n}\,n^{-\left(\frac{1}{1+\frac{3}{2}d\log 2}\right)}(\log n)^{\left(\frac{\epsilon_n}{1+\frac{3}{2}d\log 2(1-\epsilon_n)}\right)}\times n^{\frac{\epsilon_n}{\left(1+\frac{3}{2}d\log 2\right)\left(1+(1-\epsilon_n)\frac{3}{2}d\log 2\right)}}\\
&\le\frac{C}{\epsilon_n}\,n^{-\left(\frac{1}{1+\frac{3}{2}d\log 2}\right)}(\log n)^{\left(\frac{\epsilon_n}{\frac{3}{2}d\log 2(1-\epsilon_n)}\right)}\times(\log n)^{\frac{\log n}{\log\log n}\left(\frac{\epsilon_n}{\left(\frac{3}{2}d\log 2\right)^2(1-\epsilon_n)}\right)}.
\end{align*}
Finally, we finish the proof by selecting $\epsilon_n=\frac{1}{\log n}$, which yields
$$\mathbb{E}\left(\tilde{m}^{Un}_{\infty,n}(x)-m(x)\right)^2\le\tilde{C}\,n^{-\left(\frac{1}{1+\frac{3}{2}d\log 2}\right)}(\log n).$$
In the following section, we summarize the rates of convergence for the centered KeRF and the uniform KeRF, and we compare them with the minimax rate of convergence over the class of Lipschitz functions [29]. In addition, we provide numerical simulations where we compare the $L^2$-error for different choices of the tree depth. All experiments were performed with the software Python (https://www.python.org/), mainly with the numpy library; random sets uniformly distributed in $[0,1]^d$ were created, for various examples of the dimension $d$ and the function $Y$. For every experiment the set is divided into a training set (80%) and a testing set (20%); then the $L^2$-error $\left(\sum_{X_i\in\text{test set}}\left(\tilde{m}(X_i)-Y_i\right)^2\right)$ and the standard deviation of the error are computed.
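A minimal harness reproducing this protocol might look as follows (the target function and sizes are our own illustrative choices, not necessarily those behind the figures):

```python
import numpy as np

rng = np.random.default_rng(42)

def run_experiment(predict, n=1000, d=2, noise=0.05):
    """Draw uniform data on [0,1]^d, split 80/20, and return the L2-error
    of `predict(x, X_train, Y_train)` summed over the test set."""
    X = rng.random((n, d))
    m = lambda x: np.sin(2 * np.pi * x[..., 0]) + x[..., 1]  # a Lipschitz target (our choice)
    Y = m(X) + noise * rng.standard_normal(n)
    n_tr = int(0.8 * n)
    X_tr, Y_tr = X[:n_tr], Y[:n_tr]
    X_te, Y_te = X[n_tr:], Y[n_tr:]
    preds = np.array([predict(x, X_tr, Y_tr) for x in X_te])
    return float(np.sum((preds - Y_te) ** 2))

# Plug in any estimator with the signature predict(x, X, Y),
# e.g. one of the KeRF sketches given earlier.
nearest = lambda x, X, Y: Y[np.argmin(np.abs(X - x).sum(axis=1))]  # trivial baseline
print(run_experiment(nearest))
```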
For the centered KeRF, we compare three different values of the tree depth which, from the theory, provide different convergence rates: first, the choice of $k$ in [26, Theorem 1] that provides the previous convergence rate; second, the selection of $k$ as derived in Theorem 1; and, third, the case where the estimator, in probability, interpolates the data set, but the known convergence rate is slow [3, Theorem 4.1], $O\left((\log n)^{-\frac{d-5}{6}}\right)$ for the dimension of the feature space $d>5$.
For the uniform KeRF, we compare the two values of the tree depth as derived from [26] and Theorem 2; nevertheless, it is not known whether the uniform KeRF algorithm converges when our estimator function interpolates the data set. Of course, in practice, since real data might violate the assumptions of the theorems, one should use cross-validation for tuning the parameters of the algorithms.
Comparing the rates of consistency for the centered KeRF and the depth of the corresponding trees:
● Scornet's rate of convergence [26, Theorem 1]:
$$n^{-\left(\frac{1}{d\log 2+3}\right)}(\log n)^2,\quad\text{with}\quad k=\left\lceil\frac{1}{\log 2+\frac{3}{d}}\log\left(\frac{n}{\log^2 n}\right)\right\rceil.$$
● New rate of convergence:
$$n^{-\left(\frac{1}{1+d\log 2}\right)}(\log n),\quad\text{with}\quad k=\left\lceil\frac{\frac{1}{\log n}-1}{2\log_2\left(1-\frac{1}{2d}\right)-\left(1-\frac{1}{\log n}\right)}\,\log_2\left(\frac{n}{(\log n)^{\frac{1/\log n}{1-1/\log n}}}\right)\right\rceil.$$
● Minimax rate of consistency over the class of Lipschitz functions [29]: $n^{-\frac{2}{d+2}}$.
Thus, theoretically, the improved rate of consistency is achieved when trees grow at a deeper level compared with the parameter selection in [26, Theorem 1].
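For concreteness, the two depth prescriptions can be evaluated numerically; here is a small sketch under our reading of the formulas above:

```python
import math

def depth_scornet(n, d):
    """Tree depth from [26, Theorem 1], as displayed above."""
    return math.ceil(math.log(n / math.log(n) ** 2) / (math.log(2) + 3 / d))

def depth_improved(n, d):
    """Tree depth from Theorem 1, with eps_n = 1/log(n)."""
    eps = 1 / math.log(n)
    c_d = (eps - 1) / (2 * math.log2(1 - 1 / (2 * d)) - (1 - eps))
    return math.ceil(c_d * math.log2(n / math.log(n) ** (eps / (1 - eps))))

for n in (10**3, 10**4, 10**5):
    for d in (2, 5, 10):
        print(f"n={n:6d} d={d:2d}  k_Scornet={depth_scornet(n, d):3d}"
              f"  k_improved={depth_improved(n, d):3d}")
```

For the sampled values (e.g., $n=10^5$, $d=2$ gives $k=4$ versus $k=9$), the improved rate indeed corresponds to deeper trees.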
As is evident from Figure 3, the improvement in the convergence rate is more significant in low-dimensional feature spaces. The constant $\tilde{C}=\tilde{C}(d)$ of Theorem 1 depends on the dimension $d$ of the space. The convergence rates in the literature do not try to provide a good estimate for $\tilde{C}$, and they are significant for fixed values of $d$ only.
Finally, we note that Klusowski's rate of convergence in [20], $O\left(\left(n\log^{\frac{d-1}{2}}n\right)^{-\frac{1+\delta}{d\log 2+1}}\right)$, where $\delta$ is a positive constant that depends on the dimension $d$ of the feature space and converges to zero as $d$ approaches infinity, is sharp and better than the rate $O\left(n^{-\left(\frac{1}{1+d\log 2}\right)}(\log n)\right)$ of Theorem 1 for small values of $d$. For large values of $n$ and $d$ the two estimates are essentially the same, but, for now, we do not know whether the rate of convergence of the centered KeRF can be improved in general.
Comparing the rates of convergence for the uniform KeRF and the depth of the corresponding trees:
● Scornet's rate of convergence [26, Theorem 2]:
$$n^{-\left(\frac{2}{3d\log 2+6}\right)}(\log n)^2,\quad\text{with}\quad k=\left\lceil\frac{1}{\log 2+\frac{3}{d}}\log\left(\frac{n}{\log^2 n}\right)\right\rceil.$$
● New rate of convergence:
$$n^{-\left(\frac{2}{3d\log 2+2}\right)}(\log n),\quad\text{with}\quad k=\left\lceil\frac{\frac{1}{\log n}-1}{2\log_2\left(1-\frac{1}{3d}\right)-\left(1-\frac{1}{\log n}\right)}\,\log_2\left(\frac{n}{(\log n)^{\frac{1/\log n}{1-1/\log n}}}\right)\right\rceil.$$
● Minimax rate of convergence over the class of Lipschitz functions [29]: $n^{-\frac{2}{d+2}}$.
Thus, theoretically, as in the case of centered random KeRF, the improved rate of consistency is achieved when trees grow at a deeper level compared with the parameter selection in [26, Theorem 2].
The same considerations we made for the centered KeRF on the dependence of the constant $\tilde{C}$ on $d$ hold in the uniform KeRF case, as is evident from Figure 4. Moreover, as of now, it is still unknown to us whether the convergence rate of the uniform KeRF can be improved.
Finally, numerical simulations of the $L^2$-error of the centered KeRF approximation for three different values of $k$ are reported in Figure 5, with the standard deviations of the errors in Figure 6. In the Appendix, more simulations and plots for different target functions and for both algorithms are illustrated.
In this paper, we have obtained improved rates of convergence for two kernel-based random forests, the centered KeRF and the uniform KeRF. In addition, we have provided numerical simulations for both algorithms concerning the parameters of the methods. Finally, we have explored the reproducing kernel Hilbert space related to the centered KeRF, providing bounds for the dimension of the aforementioned space.
The authors declare they have not used Artificial Intelligence (AI) tools in the creation of this article.
The first author was funded by the PhD scholarship PON-DOT1303154-3, the second author was partially funded by project GNAMPA 2022 "HOLOMORPHIC FUNCTIONS IN ONE AND SEVERAL VARIABLES" and by Horizon H2020-MSCA-RISE-2017 project 777822 "GHAIA".
All authors declare no conflicts of interest in this paper.
In this last section, we provide more experiments on low-dimensional regression examples with additive noise for the centered and the uniform KeRF. In particular, we calculate and compare the $L^2$-errors and the standard deviations against different sample sizes, for different values of the parameter $k$ of the estimator. Moreover, in the following subsection, we study the RKHS produced by the kernel
$$K^{Cen}_k(x,z)=\sum_{\sum_{j=1}^d k_j=k}\frac{k!}{k_1!\cdots k_d!}\left(\frac{1}{d}\right)^k\prod_{j=1}^d\mathbb{1}_{\lceil 2^{k_j}x_j\rceil=\lceil 2^{k_j}z_j\rceil},$$
defined on the abelian group $\mathbb{Z}_2^{kd}$. To conclude, we recall some notation for finite abelian groups, necessary to define the aforementioned RKHS and to estimate its dimension.
We provide here an alternative description of the centered random forest algorithm, where the dyadic tiling of the hypercube motivates us to define the kernel on the abelian group $\mathbb{Z}_2^{kd}$. First, we define a random tree $\Theta$. Start with a random variable $\Theta_0$, uniformly distributed in $\{1,\dots,d\}$, and split
$$I:=[0,1]^d=I^{\Theta_0}_0\cup I^{\Theta_0}_1,$$
where
$$I^{\Theta_0}_l=[0,1]\times\dots\times\left[\frac{l}{2},\frac{l+1}{2}\right]\times\dots\times[0,1],$$
and for $l=0,1$ the splitting is performed in the $\Theta_0$-th coordinate. Choose then random variables $\Theta_{1,l}$ ($l=0,1$), distributed as $\Theta_0$, and split each
$$I^{\Theta_0}_l=I^{\Theta_0,\Theta_1}_{l,0}\cup I^{\Theta_0,\Theta_1}_{l,1},$$
where, as before, the splitting is performed at the $\Theta_1$-th coordinate, and $I^{\Theta_0,\Theta_1}_{l,0}$ is the lower half of $I^{\Theta_0}_l$. Iterate the same procedure $k$ times. In order to do that, we need random variables $\Theta_{j;\eta_0,\dots,\eta_j}$, with $\eta_l\in\{1,\dots,d\}$ and $j=1,\dots,k$. We assume that all such random variables are independent. It is useful to think of $\Theta=\{\Theta_{j;\eta_0,\dots,\eta_j}\}$ as indexed by a $d$-adic tree and, in fact, we refer to $\Theta$ as a random tree in $[0,1]^d$. We call cells, or leaves, each of the $2^k$ rectangles into which $[0,1]^d$ is split at the end of the $k$-th subdivision.
Figure A1 shows the $L^2$-error of the centered KeRF approximation for a three-dimensional feature space, and Figure A2 the standard deviation of the errors.
Figures A3 and A4 show the $L^2$-error and the standard deviation for the uniform KeRF in a two-dimensional feature space, and Figures A5 and A6 present a three-dimensional example.
B.1. The Fourier analysis of the kernel
The kernel in the centered KeRF defines an RKHS $H_K$ structure on a set $\Gamma$ having $2^{kd}$ points [4]. In fact, $\Gamma$ has a group structure, and Fourier analysis can be used. Much research has been done in RKHS theory (see [1]), and in this section we study the structure of $H_K$ in itself. A priori, $H_K$ might have any dimension less than or equal to $\sharp\Gamma$. We show in fact that its dimension is much lower than that, a fact which is somehow surprising, and we believe it is interesting in itself. Furthermore, we prove that there are no nonconstant multipliers in the space $H_K$. For completeness, we provide definitions and notation on Fourier analysis on Abelian groups in Appendix C.
We identify every real number $x\in[0,1]$ with its binary expansion $x=0.x_1x_2x_3\dots$, with $x_i\in\{0,1\}$ for $i\in\mathbb{N}$. Here we consider the group
\begin{equation} G=\mathbb{Z}_2^{kd}\ni x=\left(x^j_i\right)^{j=1,\dots,d}_{i=1,\dots,k}=(x^1|x^2|\dots|x^d), \end{equation} | (B.1)
where each column $x^j=(x^j_1,\dots,x^j_k)^T$ collects the first $k$ binary digits of the $j$-th coordinate.
The kernel K:\Gamma\times\Gamma\to\mathbb{C} corresponding to the kernel K^{Cen}_{k} is,
\begin{eqnarray} K(a, b)& = &\sum\limits_{\genfrac{}{}{0pt}{}{{l\in\mathbb{N}^d}}{{|l| = k}}}\frac{1}{d^k}\binom{k}{l}\prod\limits_{j = 1}^d\chi\left(a^j_1 = b^j_1, \dots, a^j_{k_j} = b^j_{k_j}\right)\\ & = &\sum\limits_{\genfrac{}{}{0pt}{}{{l\in\mathbb{N}^d}}{{|l| = k}}}\frac{1}{d^k}\binom{k}{l}\prod\limits_{j = 1}^d\prod\limits_{i = 1}^{k_j}\chi\left(a^j_i = b^j_i\right)\\ & = &\varphi(a-b), \end{eqnarray} | (B.2) |
where \binom{k}{l} is the multidimensional binomial coefficient, \chi_E is the characteristic function of the set E , and a-b is the difference in the group \mathbb{Z}_2^{kd} . Incidentally, (B.2) shows that the kernel K can be viewed as a convolution kernel on the appropriate group structure. For the last equality, we consider the binary representation of a number in (0, 1] whose digits are not definitely vanishing. The fact that 0 does not have such representation is irrelevant since the probability that one of the coordinates of the data vanishes is zero.
We now compute the anti-Fourier transform \mu = \check{\varphi} . We know that \sharp(\Gamma) = 2^{kd} , and that the characters of \mathbb{Z}_2^{kd} have the form
\begin{equation} \gamma_a(x) = (-1)^{a\cdot x}, \ x\in \mathbb{Z}_2^{kd}, \ a\in \widehat{\mathbb{Z}_2^{kd}}, \ a\cdot x = a^1_1 x^1_1+\dots+a^d_k x^d_k. \end{equation} | (B.3)
Hence,
\begin{eqnarray} 2^{kd}d^k\mu(x)& = &d^k\sum\limits_{a\in\Gamma}\varphi(a)\gamma_a(x)\\ & = &d^k\sum\limits_{a\in\Gamma}\varphi(a)(-1)^{a\cdot x}\\ & = &\sum\limits_{a\in\Gamma}\sum\limits_{\genfrac{}{}{0pt}{}{{l\in\mathbb{N}^d}}{{|l| = k}}}\binom{k}{l}\prod\limits_{j = 1}^d\prod\limits_{i = 1}^{k_j}\chi\left(a^j_i = 0\right) (-1)^{a\cdot x}\\ & = &\sum\limits_{a\in\Gamma}\sum\limits_{\genfrac{}{}{0pt}{}{{l\in\mathbb{N}^d}}{{|l| = k}}}\binom{k}{l}\prod\limits_{j = 1}^d(-1)^{\tilde{a}_j^{k_j}\cdot\tilde{x}_j^{k_j}}\prod\limits_{i = 1}^{k_j}\left[\chi\left(a^j_i = 0\right)(-1)^{a^j_i x^j_i}\right]\\ &\ &\text{where }\tilde{a}_j^{k_j} = \begin{pmatrix} a^j_{k_j+1}\\ \dots\\ a^j_k \end{pmatrix}\text{ is the lower,}\ (k-k_j) -\text{dimensional} \\ &\ &\text{part of the column}\ a^j , \\ & = &\sum\limits_{a\in\Gamma}\sum\limits_{\genfrac{}{}{0pt}{}{{l\in\mathbb{N}^d}}{{|l| = k}}}\binom{k}{l}\prod\limits_{j = 1}^d(-1)^{\tilde{a}_j^{k_j}\cdot\tilde{x}_j^{k_j}}\prod\limits_{i = 1}^{k_j}\chi\left(a^j_i = 0\right)\\ & = &\sum\limits_{\genfrac{}{}{0pt}{}{{l\in\mathbb{N}^d}}{{|l| = k}}}\binom{k}{l}\sum\limits_{\genfrac{}{}{0pt}{}{{a\in\Gamma}}{{a^1_1 = \dots a^1_{k_1} = a^2_1 = \dots = a^d_{k_d} = 0}}}\prod\limits_{j = 1}^d(-1)^{\tilde{a}_j^{k_j}\cdot\tilde{x}_j^{k_j}}. \end{eqnarray} | (B.4)
The last expression vanishes exactly when for all l , there are some 1\le j\le d , and some k_j+1\le i\le k such that x^j_i = 1 , due to the presence of the factor (-1)^{a^j_i x^j_i} = (-1)^{a^j_i} which takes values \pm1 on summands having, two by two, the same absolute values.
If, on the contrary, there is l such that for all 1\le j\le d , and k_j+1\le i\le k , we have that x^j_i = 0 , then \mu(x)\ne0 . Since |l| = k and there are kd binary digits involved in the expression of x , the latter occurs exactly when the binary matrix representing x has a large lower region in which all entries are 0 . More precisely, the number of vanishing entries must be at least
\begin{equation} (k-k_1)+\dots+(k-k_d) = (d-1)k. \end{equation} | (B.5)
The number N(d, k) of such matrices is the dimension of H_K , the Hilbert space having K as a reproducing kernel.
Next, we prove some estimates for the dimension of the RKHS.
We summarize the main items in the following statement.
Theorem B.1. Let K:\Gamma\times\Gamma\to\mathbb{C} be the kernel in (B.2), K(a, b) = \varphi(a-b) , and let
\begin{equation} E_K = \mathit{\text{supp}}(\check{\varphi})\subseteq G. \end{equation} | (B.6)
Then,
(i) As a linear space, H_K = L_{E_K} , where
\begin{eqnarray} E_K& = &\{x = (x^1|\dots|x^d):\ x^j_i = 0\ \mathit{\text{for}}\ k_j+1\le i\le k, \ \mathit{\text{for some}}\ l\\ & = &(k_1, \dots, k_d)\in\mathbb{N}^d\ \mathit{\text{with}}\ k_1+\dots+k_d = k\}; \end{eqnarray} | (B.7)
(ii) For x\in E_K ,
\begin{equation} \check{\varphi}(x) = \frac{1}{2^{k}d^k}\sum\limits_{\genfrac{}{}{0pt}{}{{l\in\mathbb{N}^d, \ |l| = k}}{{x^j_i = 0\ \mathit{\text{for}}\ k_j+1\le i\le k}}}\binom{k}{l}. \end{equation} | (B.8)
To obtain the expression in (B.8), we used the fact that
$$\sharp\{a:\ a^1_1 = \dots = a^1_{k_1} = a^2_1 = \dots = a^d_{k_d} = 0\} = 2^{(d-1)k}.$$
B.2. Some properties of H_K
Linear relations. Among all functions \psi:\Gamma\to\mathbb{C} , those belonging to H_K (i.e., those belonging to L_{E_K} ) are characterized by a set of linear equations,
\begin{equation} 0 = 2^{kd}d^k\mu(x) = \sum\limits_{\genfrac{}{}{0pt}{}{{l\in\mathbb{N}^d, \ |l| = k}}{{x^j_i = 0 \text{ for }k_j+1\le i\le k}}}\binom{k}{l}\text{ for }x\notin E_K. \end{equation} | (B.9)
Multipliers. A multiplier of H_K is a function m:\Gamma\to\mathbb{C} such that m\psi\in H_K whenever \psi\in H_K .
Proposition B.1. The space H_K has no nonconstant multiplier.
In particular, it does not enjoy the complete Pick property, which has been subject of intensive research for the past twenty-five years [1].
Proof. The space H_K coincides as a linear space with L_{E_K} . Let \Lambda_E = \check{L_E} , which is spanned by \{\delta_x:\ x\in E\} . Observe that, since 0 = (0|\dots|0)\in E_K , the constant functions belong to H_K , hence, any multiplier m of H_K lies in H_K ; m = m\cdot 1\in H_K .
Suppose m is not constant. Then, \check{m}(a)\ne0 for some a\in E_K , a\ne0 . Let a be an element in E_K such that \check{m}(a)\ne0 . Since H_K\ni m\cdot\widehat{\delta_x} for all x in E_K , and m\cdot\widehat{\delta_x} = \widehat{\check{m}\ast\delta_x} , we have that the support of \check{m}\ast\delta_x lies in E_K . Now,
\check{m}\ast\delta_x(y) = \check{m}(x-y), |
hence, we have that, for any x in E_K , y = x-a lies in E_K as well. This forces a = 0 , hence m to be constant.
B.3. Bounds for dimension and generating functions
Theorem B.2. For fixed d\ge 1 , we have the estimates:
\begin{equation} \mathit{\text{dim}}(H_K)\sim\frac{2^{k-d+1}k^{d-1}}{(d-1)!}, \ \mathit{\text{hence}}\ \frac{\mathit{\text{dim}}(H_K)}{2^{kd}}\sim\frac{k^{d-1}}{2^{d-1}(d-1)!2^{k(d-1)}}. \end{equation} | (B.10) |
Proof. Let $l_1,l_2,\dots,l_d$ be such that
$$0\le l_1+l_2+\dots+l_d=m\le k,$$
where $m$ is a parameter, and let $\lambda=\lvert\{j:\ l_j\ge1\}\rvert$ be the number of nonzero $l_j$'s, where $\lvert\cdot\rvert$ denotes the size of the set; of course, $0\le m\le k$ and $0\le\lambda\le d\wedge m$. The goal is to obtain a bound for
$$N(k,d)=\sum_{m=0}^k\sum_{\lambda=0}^{d\wedge m}2^{m-\lambda}\binom{d}{\lambda}\left|\left\{(l_1,l_2,\dots,l_\lambda):\ l_1+l_2+\dots+l_\lambda=m,\ l_j\ge1\right\}\right|.$$
Let $A(m,\lambda)$ be the coefficient of $x^m$ in the power series
\begin{align*} (x+x^{2}+\dots)^{\lambda} & = \left(x(1+x+x^{2}+\dots)\right)^{\lambda}\\ & = \frac{x^{\lambda}}{(1-x)^{\lambda}}, \end{align*}
so that $2^{m}A(m,\lambda)$ is the coefficient of $x^m$ in $\frac{(2x)^{\lambda}}{(1-2x)^{\lambda}}$, and therefore $2^{m-\lambda}A(m,\lambda)$ is the coefficient of $x^m$ in $\frac{x^{\lambda}}{(1-2x)^{\lambda}}$. Now consider the inner sum: $B(m,d):=\sum_{\lambda=0}^{d\wedge m}\binom{d}{\lambda}2^{m-\lambda}A(m,\lambda)$ is the coefficient of $x^m$ in
\begin{align*} \sum\limits_{\lambda = 0}^{d}{ d \choose \lambda}\left(\frac{x}{1-2x}\right)^{\lambda} & = \left( 1+ \frac{x}{1-2x}\right)^d\\ & = \left(\frac{1-x}{1-2x}\right)^d. \end{align*}
Again, by the same combinatorial argument (summing $B(m,d)$ over $0\le m\le k$), we are looking for the $k$-th coefficient of the function
f(x) = \frac{1}{1-x}(\frac{1-x}{1-2x})^d . |
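This count is easy to check numerically for small $k$ and $d$: the $k$-th Taylor coefficient of $f$ has the closed form $\sum_i(-1)^i\binom{d-1}{i}\binom{k-i+d-1}{d-1}2^{k-i}$, and it can be compared with a brute-force enumeration of the matrices described by (B.7). A sketch of ours, with hypothetical names:

```python
import math
from itertools import product

def dim_HK_bruteforce(k, d):
    """Count binary k x d matrices whose columnwise 'last 1' positions sum
    to at most k -- the support condition (B.7), i.e., dim(H_K) = N(k, d)."""
    count = 0
    for bits in product((0, 1), repeat=k * d):
        total = 0
        for j in range(d):
            col = bits[j * k:(j + 1) * k]
            total += max((i + 1 for i, b in enumerate(col) if b), default=0)
        if total <= k:
            count += 1
    return count

def dim_HK_series(k, d):
    """k-th Taylor coefficient of f(x) = (1-x)^(d-1) / (1-2x)^d at x = 0."""
    return sum((-1) ** i * math.comb(d - 1, i)
               * math.comb(k - i + d - 1, d - 1) * 2 ** (k - i)
               for i in range(min(d - 1, k) + 1))

for k, d in [(1, 2), (3, 2), (4, 3)]:
    print(k, d, dim_HK_bruteforce(k, d), dim_HK_series(k, d))  # values agree
```

For instance, for $k=3$ and $d=2$, both computations give $20$.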
Back to the estimate, let $a_k$ be the $k$-th coefficient of the power series of $f$ centered at $z=0$. For a fixed $0<r<\frac12$,
\max\limits_{ |z| = r } | f(z)| = \max\limits_{ |z| = r } \bigg| \frac{(1-z)^{d-1}}{(1-2z)^{d}} \bigg| = \max\limits_{ |z| = r }\bigg| \frac{1}{1-z} \bigg(\frac{1-z}{1-2z}\bigg)^{d} \bigg| \leq 2 \max\limits_{ \theta \in ( -\pi, \pi) } \bigg| \frac{1-re^{i \theta}}{1-2re^{i \theta}} \bigg| ^{d}. |
After some calculations, since $r$ is fixed, one finds that the maximum is attained at $\theta=0$. So
\max\limits_{ |z| = r } | f(z)| \leq 2 (\frac{1-r}{1-2r})^{d} . |
Our estimate finally becomes:
\begin{align*} |a_{k}| &\leq \frac{2 \big( \frac{1-r}{1-2r} \big)^{d} }{r^{k}}\\ & = \frac{2(1-r)^{d}}{r^{k} (1-2r)^{d} }\\ & \lesssim k^{d} 2^{k}\left(\frac{1}{2} + \frac{1}{2k}\right)^{d} \quad \quad \left( \text{choosing } r = \frac{1}{2}\left( 1-\frac{1}{k}\right) \right)\\ & = k^{d} \left(1+ \frac{1}{k}\right)^{d} 2^{k-d}. \end{align*}
Thus an estimate for the dimension of H_K , relative to \sharp\Gamma = 2^{kd} , is
\frac{|a_k|}{2^{kd}}\lesssim \frac{k^{d} (1+ \frac{1}{k})^{d} 2^{k(1-d)}}{2^d}. |
Here is another estimate for the dimension of H_K . For f(z) = \sum_{n = 0}^\infty a_n z^n , we have
|a_n|\le\frac{\max\{|f\left(r e^{i t}\right)|:\ |t|\le\pi\}}{r^n}. |
Consider the function
f(z) = \frac{(1-z)^{d-1}}{(1-2z)^d} |
in |z| < 1/2 and let r = \frac{1-1/k}{2} . Then,
\begin{eqnarray*} |a_k|&\le&\frac{(3/2)^{d-1}}{(1/k)^d(1-1/k)^k2^{-k}}\\ &\le&(3/2)^{d-1}2^ke k^d. \end{eqnarray*} |
Thus,
\frac{|a_k|}{2^{kd}}\lesssim\frac{k^d (3/2)^d}{2^{k(d-1)}}. |
Recursively working out the generating function one gets the estimates in (B.10).
Following the notation of [24], we recall here the basic notions of Fourier theory for a finite, abelian group G , which we employed above. Here, G is endowed with its counting measure. The dual group \Gamma = \widehat{G} of G is populated by labels a for homomorphisms \gamma_a:G\to\mathbb{T} = \{e^{i t}:\ t\in\mathbb{R}\} . Given a function f:G\to\mathbb{C} , its Fourier transform \widehat{f}:\Gamma\to\mathbb{C} is defined as
\begin{equation} \widehat{f}(a) = \sum\limits_{x\in G}f(x)\overline{\gamma_a(x)}. \end{equation} | (C.1) |
We make \Gamma into a (finite), additive group by setting
\gamma_{a+b} = \gamma_a\cdot\gamma_b, \text{ and }\gamma_x(a): = \gamma_a(x). |
It turns out that G and \Gamma have the same number of elements, \sharp(G) = \sharp(\Gamma) . Some basic properties are:
\begin{eqnarray*} f(x)& = &\frac{1}{\sharp(\Gamma)}\sum\limits_{a\in \Gamma}\widehat{f}(a)\gamma_a(x)\text{ (inverse Fourier transform) }\\ \sum\limits_{x\in G}|f(x)|^2& = &\frac{1}{\sharp(\Gamma)}\sum\limits_{a\in \Gamma}|\widehat{f}(a)|^2\text{ (Plancherel) }\\ \widehat{f\ast g}& = &\widehat{f}\cdot\widehat{g}, \end{eqnarray*} |
where
\begin{equation} (f\ast g)(x) = \sum\limits_{y\in G}f(x-y)g(y). \end{equation} | (C.2) |
We write
\begin{equation} \check{\varphi}(x) = \sharp(\Gamma)^{-1}\sum\limits_{a\in\Gamma}\varphi(a)\gamma_a(x), \text{ so that }\widehat{\check{\varphi}} = \varphi. \end{equation} | (C.3) |
The unit element of convolution in G is \delta_0 .
In the other direction, for \varphi, \psi:\Gamma\to\mathbb{C} we define
\begin{equation} (\varphi\ast\psi)(a) = \frac{1}{\sharp(\Gamma)}\sum\limits_{b\in\Gamma}\varphi(a-b)\psi(b), \end{equation} | (C.4) |
and similarly to above, \widehat{\check{\varphi}\check{\psi}} = \varphi\ast\psi . The unit element on convolution in \Gamma is \sharp(\Gamma)\delta_0 .
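These identities are finite-dimensional and easy to verify numerically; here is a small sketch of ours on $G=\mathbb{Z}_2^m$, encoding group elements as $m$-bit integers (so the group difference is the bitwise XOR, and $\gamma_a(x)=(-1)^{a\cdot x}$ with $a\cdot x$ the parity of $a\,\&\,x$):

```python
import numpy as np

m = 3
N = 2 ** m                      # |G| = |Gamma| = 2^m

def ft(f):
    """Fourier transform (C.1): f_hat(a) = sum_x f(x) * (-1)^{a.x}."""
    return np.array([sum(f[x] * (-1) ** bin(a & x).count("1") for x in range(N))
                     for a in range(N)])

rng = np.random.default_rng(0)
f, g = rng.random(N), rng.random(N)

# Convolution (C.2); in Z_2^m the group difference x - y is the bitwise XOR.
conv = np.array([sum(f[x ^ y] * g[y] for y in range(N)) for x in range(N)])

print(np.allclose(ft(conv), ft(f) * ft(g)))               # hat(f*g) = hat(f).hat(g)
print(np.isclose((f ** 2).sum(), (ft(f) ** 2).sum() / N))  # Plancherel
```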
A function \varphi on \Gamma is positive definite if, for every choice of coefficients c:\Gamma\to\mathbb{C} ,
$$\sum\limits_{a, b\in\Gamma} c(a) \overline{c(b)}\varphi(b-a)\ge0.$$
Theorem C.1. [Bochner's theorem] A function \varphi:\Gamma\to\mathbb{C} is positive definite if and only if there exists \mu:G\to\mathbb{R}_+ such that \varphi = \widehat{\mu} .
The theorem holds in great generality, and its proof in the finite group case is elementary. We include it because it highlights the relationship between the measure \mu on G and the positive definite function (the kernel) \varphi .
Proof. For the "if" direction, suppose \varphi = \widehat{\mu} with \mu:G\to\mathbb{R}_+ . Then
\begin{eqnarray} \sharp(\Gamma)^{-2}\sum\limits_{a, b\in\Gamma}\widehat{\mu}(b-a) c(a)\overline{c(b)}& = &\sum\limits_{x\in G}\sharp(\Gamma)^{-2}\sum\limits_{a, b\in\Gamma}\mu(x)\overline{\gamma_{b-a}(x)}c(a)\overline{c(b)}\\ & = &\sum\limits_{x\in G}\sharp(\Gamma)^{-2}\sum\limits_{a, b\in\Gamma}\mu(x)\overline{\gamma_{b}(x)}\overline{c(b)}\gamma_a(x)c(a)\\ & = &\sum\limits_{x\in G}\mu(x)\left|\sharp(\Gamma)^{-1}\sum\limits_{a\in\Gamma}c(a)\gamma_a(x)\right|^2\\ & = &\sum\limits_{x\in G}\mu(x)\left|\check{c}(x)\right|^2\ge0. \end{eqnarray} | (C.5) |
For the "only if" direction: since for all b in \Gamma ,
\begin{eqnarray} \mu(x)\sharp(\Gamma)& = &\sum\limits_{a\in\Gamma}\varphi(a)\gamma_x(a) \\ & = &\sum\limits_{a\in\Gamma}\varphi(a-b)\gamma_x(a-b)\\ & = &\sum\limits_{a\in\Gamma}\varphi(a-b)\gamma_x(a)\overline{\gamma_x(b)}, \end{eqnarray} | (C.6) |
we have
\begin{equation} \mu(x)\sharp(\Gamma)^2 = \sum\limits_{a, b\in\Gamma}\varphi(a-b)\gamma_x(a)\overline{\gamma_x(b)}\ge0, \end{equation} | (C.7) |
by the assumption.
We now come to reproducing kernels on \Gamma which are based on positive definite functions \varphi:\Gamma\to\mathbb{R}_+ . Set
\begin{equation} K(a, b) = \varphi(a-b) = K_b(a), \ K:\Gamma\times\Gamma\to\mathbb{C}, \end{equation} | (C.8)
and set
\begin{equation} H_K = \text{span}\{K_b:\ b\in\Gamma\}\ni\sum\limits_{b\in\Gamma}c(b)K_b, \end{equation} | (C.9)
where H_K is the Hilbert space having K as reproducing kernel. We wish to have a more precise understanding of it.
We start by expressing the norm of an element of H_K in several equivalent ways,
\begin{eqnarray} \left\|\sum\limits_{b\in\Gamma}c(b)K_b\right\|_{H_K}^2& = &\sum\limits_{a, b\in\Gamma}\overline{c(a)}c(b)\langle K_b, K_a\rangle\\ & = &\sum\limits_{a, b\in\Gamma}\overline{c(a)}c(b)K(a, b) = \sum\limits_{a, b\in\Gamma}\overline{c(a)}c(b)\widehat{\mu}(a-b)\\ & = &\sum\limits_{a, b\in\Gamma}\overline{c(a)}c(b)\sum\limits_{x\in G}\mu(x)\gamma_{b-a}(x)\\ & = &\sum\limits_{x\in G}\mu(x)\sum\limits_{a, b\in\Gamma}\overline{c(a)}c(b)\gamma_{b}(x)\overline{\gamma_a(x)}\\ & = &\sum\limits_{x\in G}\mu(x)\left|\sum\limits_{b\in\Gamma}c(b)\gamma_b(x)\right|^2\\ & = &\sharp(\Gamma)^2\sum\limits_{x\in G}\mu(x)\left|\check{c}(x)\right|^2\\ & = &\sharp(\Gamma)^2\sum\limits_{x\in G}\left|\mu(x)^{1/2}\check{c}(x)\right|^2. \end{eqnarray} | (C.10) |
In other terms,
\begin{equation} \sharp(\Gamma)^{-1}\sum\limits_{b\in\Gamma}c(b)K_b\mapsto \check{c} \end{equation} | (C.11) |
is an isometry of H_K onto L^2(\mu) . This will become important later, when we verify that for our kernels \text{supp}(\mu) is sparse in G . In fact, \text{dim}(H_K) = \sharp(\text{supp}(\mu)) .
Corollary C.1. As a linear space, H_K is determined by \mathit{\text{supp}}(\mu) :
\psi\in H_K\ \mathit{\text{if and only if}}\ \mathit{\text{supp}}\ (\check{\psi})\subseteq\mathit{\text{supp}}(\mu). |
Let E\subseteq G . We denote
\begin{equation} L_E = \{G\xrightarrow{\psi}\mathbb{C}:\ \text{supp}(\check{\psi})\subseteq E\}. \end{equation} | (C.12) |
Next, we look for the natural orthonormal system provided by the Fourier isometry (C.11). For x\in G , let \check{c_x} = \mu(x)^{-1/2}\delta_x : \{\check{c_x}:\ x\in E: = \text{supp}(\mu)\} is an orthonormal system for L^2(\mu) , and so \{e_x:\ x\in E\} is an orthonormal basis for H_K , where
\begin{eqnarray} c_x(b) = \sum\limits_{y\in G}\mu(x)^{-1/2}\delta_x(y)\overline{\gamma_b(y)} = \mu(x)^{-1/2}\overline{\gamma_b(x)}, \end{eqnarray} | (C.13) |
and
\begin{eqnarray} e_x(a)& = &\sharp(\Gamma)^{-1}\sum\limits_{b\in\Gamma}c_x(b)K_b(a)\\ & = &\frac{\mu(x)^{-1/2}}{\sharp(\Gamma)}\sum\limits_{b\in\Gamma}K_b(a)\overline{\gamma_b(x)}\\ & = &\frac{\mu(x)^{-1/2}}{\sharp(\Gamma)}\sum\limits_{b\in\Gamma}\varphi(a-b)\overline{\gamma_b(x)}\\ & = &\frac{\mu(x)^{-1/2}}{\sharp(\Gamma)}\sum\limits_{b\in\Gamma}\varphi(a-b)\gamma_{a-b}(x)\overline{\gamma_a(x)}\\ & = &\mu(x)^{-1/2}\mu(x)\overline{\gamma_a(x)}\\ & = &\mu(x)^{1/2}\overline{\gamma_a(x)}. \end{eqnarray} | (C.14) |
Let's verify that we obtain the reproducing kernel from the o.n.b. as expected,
\begin{eqnarray} \sum\limits_{x\in\Gamma}e_x(a)\overline{e_x(b)}& = &\sum\limits_{x\in\Gamma}\mu(x)\overline{\gamma_x(a)}\gamma_x(b)\\ & = &\sum\limits_{x\in\Gamma}\mu(x)\overline{\gamma_x(a-b)}\\ & = &\widehat{\mu}(a-b)\\ & = &\varphi(a-b). \end{eqnarray} | (C.15) |
Remark C.1. Since any finite, abelian group can be written as the direct product of cyclic groups,
\begin{equation} G = \bigoplus\limits_{l = 1}^L \mathbb{Z}_{m_l}, \end{equation} | (C.16) |
its dual \Gamma can be written in the same way, because \widehat{\mathbb{Z}_m}\equiv\mathbb{Z}_m . From the Fourier point of view, the only difference is that, if on G we consider the counting measure, then on \Gamma we consider normalized counting measure, as we did above.