Article

Inverse Result of Approximation for the Max-Product Neural Network Operators of the Kantorovich Type and Their Saturation Order

1 Department of Industrial Engineering and Mathematical Sciences, Marche Polytechnic University, 60121 Ancona, Italy
2 Department of Mathematics and Computer Science, University of Oradea, 410087 Oradea, Romania
3 Department of Mathematics and Computer Science, University of Perugia, 06123 Perugia, Italy
* Author to whom correspondence should be addressed.
Mathematics 2022, 10(1), 63; https://doi.org/10.3390/math10010063
Submission received: 6 December 2021 / Revised: 20 December 2021 / Accepted: 23 December 2021 / Published: 25 December 2021
(This article belongs to the Special Issue Set-Valued Analysis II)

Abstract

In this paper, we consider the max-product neural network operators of the Kantorovich type based on certain linear combinations of sigmoidal and ReLU activation functions. In general, it is well known that max-product type operators have applications in problems related to probability and fuzzy theory, involving both real and interval/set-valued functions. In particular, here we face inverse approximation problems for the above family of sub-linear operators. We first establish their saturation order for a certain class of functions; i.e., we show that if a continuous and non-decreasing function $f$ can be approximated with a rate of convergence higher than $1/n$, as $n \to +\infty$, then $f$ must be a constant. Furthermore, we prove a local inverse theorem of approximation; i.e., assuming that $f$ can be approximated with a rate of convergence of $1/n$, then $f$ turns out to be a Lipschitz continuous function.

1. Introduction

The introduction of the max-product version of families of linear approximation operators is due to Bede, Coroianu and Gal (see, e.g., [1,2]) and it led to a new branch of approximation theory. The new theory of max-product operators has been deeply studied, and recently the above-mentioned authors summarized their results in a complete monograph [3].
In general, the max-product version of a sequence/net of linear operators is a family of nonlinear (more precisely, sub-linear) operators with better approximation properties than their original versions: in many cases, the order of convergence is faster than that of their linear counterparts [4,5,6]. Further, the above operators can also be useful, for instance, in applications of probability and fuzzy theory involving both real and interval/set-valued functions (see, e.g., [7,8]).
In the present paper, we study the max-product form of the neural network (NN) operators of the Kantorovich type $K_n^M$, first introduced in [9] and recalled here in Definition 3 of Section 2. In general, the NN operators (see [10]) are strictly related to the theory of artificial neural networks, which was introduced in order to provide a very simple model of the human brain, able to reproduce its main abilities [11,12,13,14].
Each basic element composing a neural network is called an artificial neuron; its behavior is regulated by suitable activation functions, which must represent the two possible states of the biological neuron: the activation and the quiet phases [15]. From the mathematical point of view, the functions which best represent the latter behavior are those with sigmoidal shape (see, e.g., [16]).
For the above reasons, in the present paper, we mainly consider the operators $K_n^M$ activated by suitable sigmoidal functions. Very useful examples of sigmoidal functions (in view of their importance in learning algorithms, [17]) are, e.g., the logistic and the hyperbolic tangent functions [18].
However, independently of its biological meaning, a new unbounded activation function has also been introduced and deeply investigated in some recent papers. This is the so-called rectified linear unit (ReLU) function (see, e.g., [19]), simply defined as the positive part of $x$, for every $x \in \mathbb{R}$. The ReLU activation has proved to be very suitable for training deep (i.e., multi-layer) neural networks, in view of the very simple form assumed by its derivative (whenever it exists).
Here, we show that the above operators can also be based on a certain finite linear combination of ReLU activation functions, and in this case, the approximation properties of $K_n^M$ are preserved as well.
Problems of interpolation or, more generally, of approximation are related to the topic of training a neural network by sample values belonging to a certain training set: this explains the interest in studying approximation results by means of NN operators in various contexts [15,20,21,22,23,24].
Indeed, as can be seen from the above references, results in this sense have been deeply studied with respect to various aspects, such as the convergence and the order of approximation.
In this paper, we deal with the problem of the saturation order and of inverse results of approximation.
In general, the problem of establishing the saturation order for a family of operators $L_n$, $n \in \mathbb{N}$ (see [25,26,27,28]), consists in determining a class of functions $\mathcal{D}$, a certain subclass $\mathcal{E}$ of trivial functions of $\mathcal{D}$, and a positive non-increasing function $\varphi(n)$, $n \in \mathbb{N}$, such that there exists $g \in \mathcal{D} \setminus \mathcal{E}$ with $\| L_n g - g \| = \mathcal{O}(\varphi(n))$, as $n \to +\infty$, and with the property that, for any $f \in \mathcal{D}$ with
$$\| L_n f - f \| = o(\varphi(n)), \qquad n \to +\infty,$$
it turns out that $f \in \mathcal{E}$, and vice versa. Here, $\| \cdot \|$ denotes any suitable norm on $\mathcal{D}$. In this case, $\varphi(n)$ is said to be the saturation order of the approximation process $L_n$, and it represents the best possible order of approximation that can be achieved on $\mathcal{D}$ by the above approximation operators.
In the case of the max-product NN operators of the Kantorovich type, according to the studies given in [9,29], we expect that, for a certain subclass $\mathcal{D}$ of $C([0,1])$ (endowed with the usual max-norm), the saturation order is $\varphi(n) = 1/n$, $n \in \mathbb{N}$; i.e., if $f \in \mathcal{D}$ can be approximated by $K_n^M$ at a rate faster than $1/n$, then $f$ is constant over $[0,1]$. Hence, we also have that the trivial class of functions $\mathcal{E}$ is given by the constant functions. Indeed, one of the main results that we establish in the present paper is exactly the proof of the above claim.
Further, since it has been proved in [9] that, in the space $\mathrm{Lip}([0,1])$, the order of approximation is exactly $1/n$, as $n \to +\infty$, it is natural to ask whether the converse implication also holds.
In this paper, we prove exactly a local version of such an inverse approximation theorem; i.e., if the relation
$$\left\| K_n^M(f, \cdot) - f(\cdot) \right\|_{\infty} = \mathcal{O}(1/n), \qquad n \to +\infty,$$
holds, with $f$ continuous and non-decreasing, then $f$ belongs to $\mathrm{Lip}([a,b])$, for every sub-interval $[a,b] \subset (0,1)$.

2. Preliminaries

By $C([0,1])$, we will denote the space of continuous functions $f : [0,1] \to \mathbb{R}$, while by $C_+([0,1])$, we will indicate the subspace of $C([0,1])$ of the non-negative valued functions. Furthermore, we will denote by $\mathrm{Lip}([0,1])$ the subspace of $C([0,1])$ of the Lipschitz continuous functions on $[0,1]$, i.e., the space of functions $f$ for which there exists a positive constant $L$ such that
$$| f(x) - f(y) | \le L\, | x - y |, \qquad x, y \in [0,1].$$
Finally, we also denote by $\| \cdot \|_{\infty}$ the classical max-norm. Obviously, all the above notations can be given by replacing the interval $[0,1]$ with any bounded or unbounded interval $I \subseteq \mathbb{R}$.
We now recall the definition of a sigmoidal function introduced by Cybenko [30].
Definition 1.
Let $\sigma : \mathbb{R} \to \mathbb{R}$ be a measurable function. We call $\sigma$ a sigmoidal function if
$$\lim_{x \to -\infty} \sigma(x) = 0 \quad \text{and} \quad \lim_{x \to +\infty} \sigma(x) = 1.$$
In what follows, we consider non-decreasing sigmoidal functions σ that satisfy the following conditions:
  • $(\Sigma 1)$ $\sigma(x) - 1/2$ is an odd function;
  • $(\Sigma 2)$ $\sigma \in C^2(\mathbb{R})$ is concave for $x \ge 0$;
  • $(\Sigma 3)$ $\sigma(x) = \mathcal{O}(|x|^{-\alpha-1})$ as $x \to -\infty$, for some $\alpha > 0$.
Notice that assumptions $(\Sigma i)$, $i = 1, 2, 3$, are satisfied by the main examples of sigmoidal functions known in the literature.
For instance, examples of sigmoidal activation functions are given by $\sigma_{\ell}(x) := (1 + e^{-x})^{-1}$, $x \in \mathbb{R}$ (i.e., the so-called logistic function), and by $\sigma_h(x) := (\tanh x + 1)/2$, $x \in \mathbb{R}$ (i.e., the so-called hyperbolic tangent activation function). Note that both $\sigma_{\ell}$ and $\sigma_h$ satisfy $(\Sigma 3)$ for all $\alpha > 0$, in view of their exponential decay as $x \to -\infty$.
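As a quick sanity check of condition $(\Sigma 1)$ for the logistic function (a routine computation added here for convenience; it is not taken from the cited sources), one can write
$$\sigma_{\ell}(x) - \frac{1}{2} = \frac{1}{1 + e^{-x}} - \frac{1}{2} = \frac{1 - e^{-x}}{2\,(1 + e^{-x})} = \frac{1}{2} \tanh\!\left( \frac{x}{2} \right),$$
which is an odd function of $x$; the same identity also shows that $\sigma_{\ell}(x) = \sigma_h(x/2)$, so the two examples differ only by a rescaling of the argument.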
Further, we can also recall the definition of the sigmoidal functions that can be generated by the well-known central B-splines of order $n \in \mathbb{N}^+$:
$$M_n(x) := \frac{1}{(n-1)!} \sum_{i=0}^{n} (-1)^i \binom{n}{i} \left( \frac{n}{2} + x - i \right)_+^{n-1}, \qquad x \in \mathbb{R},$$
where $(x)_+ := \max\{x, 0\}$. Hence, we can define the sigmoidal function $\sigma_{M_n}$ generated by $M_n$ as follows:
$$\sigma_{M_n}(x) := \int_{-\infty}^{x} M_n(t)\, dt, \qquad x \in \mathbb{R}.$$
Obviously, $\sigma_{M_n}$ satisfies assumption $(\Sigma 1)$ for every $n \ge 1$. Further, since the $M_n$ have compact support, contained in the intervals $[-n/2, n/2]$, it turns out that $\sigma_{M_n}$ also satisfies $(\Sigma 3)$ for every $\alpha > 0$. Further, assumption $(\Sigma 2)$ is satisfied for $n \ge 1$.
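For readers who wish to experiment, the following Python sketch (our own illustration, not code from the paper; all names are ours) evaluates $M_n$ directly from the truncated-power formula above and recovers $\sigma_{M_n}$ by a simple numerical integration over the compact support $[-n/2, n/2]$.

```python
import math

def central_bspline(n: int, x: float) -> float:
    # M_n(x) = (1/(n-1)!) * sum_{i=0}^{n} (-1)^i C(n, i) (n/2 + x - i)_+^{n-1}
    s = 0.0
    for i in range(n + 1):
        t = n / 2 + x - i
        s += (-1) ** i * math.comb(n, i) * (t ** (n - 1) if t > 0 else 0.0)
    return s / math.factorial(n - 1)

def sigma_Mn(n: int, x: float, steps: int = 2000) -> float:
    # sigma_{M_n}(x) = integral_{-inf}^{x} M_n(t) dt; M_n vanishes outside [-n/2, n/2]
    a, b = -n / 2, min(x, n / 2)
    if b <= a:
        return 0.0
    h = (b - a) / steps
    s = 0.5 * (central_bspline(n, a) + central_bspline(n, b))
    s += sum(central_bspline(n, a + k * h) for k in range(1, steps))
    return s * h  # composite trapezoidal rule

if __name__ == "__main__":
    # a sigmoidal profile: approximately 0 on the far left, 1/2 at the origin, 1 on the far right
    for n in (1, 2, 3):
        print(n, round(sigma_Mn(n, -5.0), 3), round(sigma_Mn(n, 0.0), 3), round(sigma_Mn(n, 5.0), 3))
```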
Now, considering (from now on) a function $\sigma$ that satisfies the above assumptions, we can recall the definition of the density (kernel) function $\phi_{\sigma}$, that is:
$$\phi_{\sigma}(x) := \frac{1}{2} \left[ \sigma(x+1) - \sigma(x-1) \right], \qquad x \in \mathbb{R}.$$
For the function ϕ σ , the following lemma can be proved.
Lemma 1.
(i) $\phi_{\sigma}(x) \ge 0$ for every $x \in \mathbb{R}$, with $\phi_{\sigma}(2) > 0$, and $\lim_{x \to \pm\infty} \phi_{\sigma}(x) = 0$;
(ii) $\phi_{\sigma}(x)$ is an even function;
(iii) $\phi_{\sigma}(x)$ is non-decreasing for $x < 0$ and non-increasing for $x \ge 0$;
(iv) $\phi_{\sigma}(x) = \mathcal{O}(|x|^{-\alpha-1})$ as $x \to \pm\infty$, where $\alpha$ is the positive constant of condition $(\Sigma 3)$.
For a proof of conditions (i)-(iv), see [31].
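As a purely illustrative aid (not part of the original paper), the following minimal Python sketch builds $\phi_{\sigma}$ from the logistic function and spot-checks properties (i)-(iii) of Lemma 1 on a sample grid; the function names are our own, and the rigorous proof is, of course, the one in [31].

```python
import math

def sigma_logistic(x: float) -> float:
    # logistic sigmoidal function sigma_l(x) = (1 + e^(-x))^(-1)
    return 1.0 / (1.0 + math.exp(-x))

def phi(x: float) -> float:
    # density (kernel) function phi_sigma(x) = [sigma(x + 1) - sigma(x - 1)] / 2
    return 0.5 * (sigma_logistic(x + 1) - sigma_logistic(x - 1))

xs = [i / 10 for i in range(-60, 61)]                     # grid on [-6, 6]
assert all(phi(x) >= 0 for x in xs) and phi(2.0) > 0      # (i) non-negativity and phi_sigma(2) > 0
assert all(abs(phi(x) - phi(-x)) < 1e-12 for x in xs)     # (ii) evenness
assert all(phi(xs[i]) >= phi(xs[i + 1])                   # (iii) non-increasing for x >= 0
           for i in range(len(xs) - 1) if xs[i] >= 0)
print("Lemma 1 (i)-(iii) verified on the sample grid")
```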
Now, we introduce the following notation used in the literature (see, e.g., [3]) in order to define the so-called max-product type operators.
Definition 2.
Let $K_1, K_2$ be two integers with $K_1 \le K_2$ and let $A_k \in \mathbb{R}$, $k = K_1, \dots, K_2$. Then we define
$$\bigvee_{k=K_1}^{K_2} A_k := \max\{ A_k : k = K_1, \dots, K_2 \}.$$
Now, we recall the following lemma that will be useful in order to show that the family of operators investigated in this paper are well-defined.
Lemma 2.
([29]) $\displaystyle\bigvee_{k=0}^{n-1} \phi_{\sigma}(n x - k) \ge \phi_{\sigma}(2) > 0$ for every $x \in [0,1]$.
The definition of the max-product NN operators of the Kantorovich type can now be recalled.
Definition 3.
Let $f : [0,1] \to \mathbb{R}$ be a bounded and locally integrable function and let $n \in \mathbb{N}^+$. The max-product NN operators of the Kantorovich type activated by $\sigma$ are defined by:
$$K_n^M(f, x) := \frac{\displaystyle\bigvee_{k=0}^{n-1} \phi_{\sigma}(n x - k)\, n \int_{k/n}^{(k+1)/n} f(u)\, du}{\displaystyle\bigvee_{k=0}^{n-1} \phi_{\sigma}(n x - k)}, \qquad x \in [0,1].$$
Clearly, in view of the properties established in Lemma 2, it turns out that the $K_n^M(f, x)$ are well-defined and, moreover, it is quite simple to observe that $\| K_n^M(f, \cdot) \|_{\infty} \le \| f \|_{\infty} < +\infty$.
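For concreteness, here is a minimal Python sketch of $K_n^M$ following Definition 3 (our own illustration: the logistic activation, the trapezoidal quadrature for the mean values $n\int_{k/n}^{(k+1)/n} f(u)\,du$, the test function, and all names are assumptions made for this example, not prescriptions of the paper). Lemma 2 guarantees that the denominator is bounded from below by $\phi_{\sigma}(2) > 0$.

```python
import math

def sigma(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))            # logistic sigmoidal function

def phi(x: float) -> float:
    return 0.5 * (sigma(x + 1) - sigma(x - 1))   # kernel phi_sigma

def mean_values(f, n: int, steps: int = 50) -> list:
    # m_k = n * integral_{k/n}^{(k+1)/n} f(u) du, k = 0, ..., n-1 (trapezoidal rule)
    out = []
    for k in range(n):
        a, b = k / n, (k + 1) / n
        h = (b - a) / steps
        s = 0.5 * (f(a) + f(b)) + sum(f(a + i * h) for i in range(1, steps))
        out.append(s * h * n)
    return out

def K_M(m: list, n: int, x: float) -> float:
    # max-product Kantorovich NN operator at x, with precomputed mean values m
    num = max(phi(n * x - k) * m[k] for k in range(n))
    den = max(phi(n * x - k) for k in range(n))  # >= phi_sigma(2) > 0 by Lemma 2
    return num / den

if __name__ == "__main__":
    f = lambda u: u / 2 + math.sin(3 * u) / 6    # non-decreasing, Lipschitz, >= 0 on [0, 1]
    for n in (20, 80, 320):
        m = mean_values(f, n)
        err = max(abs(K_M(m, n, j / 400) - f(j / 400)) for j in range(401))
        print(n, round(n * err, 3))  # n * (uniform error): expected to stay of moderate size
```

Precomputing the mean values keeps each evaluation of the operator at cost proportional to $n$; the printed quantity $n \cdot \| K_n^M(f,\cdot) - f \|_{\infty}$ is only meant to illustrate, not prove, the order $1/n$ recalled in Section 3.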
Concerning the assumptions $(\Sigma 1)$, $(\Sigma 2)$, and $(\Sigma 3)$ assumed above, we can observe that condition $(\Sigma 2)$ could be avoided, by requiring instead that the sigmoidal function $\sigma$ is such that $\sigma(3) > \sigma(1)$ and that condition (iii) of Lemma 1 is fulfilled by $\phi_{\sigma}$. The main advantage of the latter fact is that one could apply all the approximation results established below also to discontinuous and non-smooth sigmoidal functions.
An example of a continuous but non-smooth sigmoidal function (given according to the above remark) is the so-called ramp function $\sigma_R(x)$ (see [12]), defined as follows:
$$\sigma_R(x) := \begin{cases} 0, & x < -3/2, \\ x/3 + 1/2, & -3/2 \le x \le 3/2, \\ 1, & x > 3/2. \end{cases}$$
In particular, $\sigma_R$ satisfies condition $(\Sigma 3)$ for all $\alpha > 0$, and $\phi_{\sigma_R}$ turns out to be a function with compact support; moreover, $\sigma_R(3) > \sigma_R(1)$.
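A short numerical check (our own illustration, assuming the ramp function exactly as written above) of the two facts just mentioned: $\sigma_R(3) > \sigma_R(1)$, and $\phi_{\sigma_R}$ has compact support, vanishing for $|x| \ge 5/2$ (by evenness, it is enough to look at $x \ge 0$).

```python
def sigma_R(x: float) -> float:
    # ramp function: 0 for x < -3/2, x/3 + 1/2 on [-3/2, 3/2], 1 for x > 3/2
    if x < -1.5:
        return 0.0
    if x > 1.5:
        return 1.0
    return x / 3 + 0.5

def phi_R(x: float) -> float:
    # density function phi_{sigma_R}(x) = [sigma_R(x + 1) - sigma_R(x - 1)] / 2
    return 0.5 * (sigma_R(x + 1) - sigma_R(x - 1))

assert sigma_R(3) > sigma_R(1)                             # the condition replacing (Sigma 2)
assert all(phi_R(k / 10) > 0.0 for k in range(0, 25))      # strictly positive for 0 <= x < 5/2
assert all(phi_R(k / 10) == 0.0 for k in range(25, 80))    # ... and identically zero for x >= 5/2
print("ramp-function checks passed")
```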
Note that (see [27]) the sigmoidal function $\sigma_{M_1}(3\,\cdot)$ coincides with the ramp function $\sigma_R$; now, recalling the definition of the well-known rectified linear unit (ReLU) activation function (see, e.g., [19,32]):
$$\psi_{ReLU}(x) := (x)_+, \qquad x \in \mathbb{R},$$
it turns out that:
$$\sigma_{M_1}(3x) := \psi_{ReLU}(3x + 1/2) - \psi_{ReLU}(3x - 1/2), \qquad x \in \mathbb{R}.$$
Thus, the density function $\phi_{\sigma_{M_1}(3\,\cdot)}$ can be expressed in terms of ReLU activation functions as follows:
$$\phi_{\sigma_R}(x) = \phi_{\sigma_{M_1}(3\,\cdot)}(x) = \psi_{ReLU}(3x + 1) - 2\,\psi_{ReLU}(3x) + \psi_{ReLU}(3x - 1), \qquad x \in \mathbb{R}.$$
As a consequence of the latter relation, the NN operators $K_n^M$ activated by $\sigma_{M_1}(3\,\cdot)$ can be considered as NN operators activated by the above linear combination of ReLU activation functions. Recently, it has been proved that $\psi_{ReLU}$ is very suitable in order to train deep (i.e., multi-layer) neural networks; see, e.g., [33,34]. For more details concerning $\psi_{ReLU}$, see also [19,35].

3. The Saturation Order

It is well known that, if $f \in C_+([0,1])$, the family $K_n^M(f, \cdot)$ converges uniformly to $f$ (see [9]). Moreover, we also know that the following quantitative estimate:
$$\left\| K_n^M(f, \cdot) - f(\cdot) \right\|_{\infty} \le M\, \omega\!\left( f, \frac{1}{n} \right),$$
as $n \to +\infty$, holds if condition $(\Sigma 3)$ is satisfied with $\alpha \ge 1$, where $M > 0$ is a suitable constant and
$$\omega\!\left( f, \frac{1}{n} \right) := \sup\left\{ | f(x) - f(y) | : x, y \in [0,1], \text{ with } | x - y | \le 1/n \right\}$$
denotes the usual modulus of continuity of the function $f \in C_+([0,1])$ (see, e.g., [36]). From the latter result, it turns out that, if the function $f$ belongs to $\mathrm{Lip}([0,1])$, then the order of uniform approximation is $1/n$, as $n \to +\infty$.
In this section, we study the saturation order for the NN operators of the Kantorovich type activated by sigmoidal functions; i.e., we show that $1/n$ is the best possible order of approximation that can be achieved for non-decreasing functions belonging to $C_+([0,1])$.
In order to reach our main purpose, we need some preliminary lemmas.
Lemma 3.
For any $j \in \{0, \dots, n-1\}$, $n \in \mathbb{N}^+$, we have:
$$\bigvee_{k=0}^{n-1} \phi_{\sigma}(n x - k) = \phi_{\sigma}(n x - j)$$
for every $x \in \left[ \frac{j}{n} - \frac{1}{2n}, \frac{j}{n} + \frac{1}{2n} \right] \cap [0,1]$. Further, if $x \in \left[ 1 - \frac{1}{2n}, 1 \right]$ we have:
$$\bigvee_{k=0}^{n-1} \phi_{\sigma}(n x - k) = \phi_{\sigma}(n x - n + 1).$$
Proof. 
Let $j \in \{0, \dots, n-1\}$ be fixed and $x \in \left[ \frac{j}{n} - \frac{1}{2n}, \frac{j}{n} + \frac{1}{2n} \right] \cap [0,1]$. Observing that
$$| n x - j | = n \left| x - j/n \right| \le 1/2$$
and, since $\phi_{\sigma}(x)$ is even and non-increasing for $x \ge 0$, we get:
$$\phi_{\sigma}(n x - j) = \phi_{\sigma}(| n x - j |) \ge \phi_{\sigma}(1/2) \ge \phi_{\sigma}(2) > 0.$$
Similarly, if $k \in \{0, \dots, n-1\}$ and $k \ne j$, we have
$$| n x - k | = n \left| x - k/n \right| \ge 1/2$$
and so, using the previous properties of $\phi_{\sigma}(x)$, we have:
$$\phi_{\sigma}(n x - k) = \phi_{\sigma}(| n x - k |) \le \phi_{\sigma}(1/2).$$
If $x \in \left[ 1 - \frac{1}{2n}, 1 \right]$, we note that
$$| n x - n + 1 | = n \left| x - (n-1)/n \right| \le 1$$
and, if $k < n - 1$,
$$| n x - k | = n \left| x - k/n \right| > 1,$$
and the claim follows, arguing as in the previous two inequalities, respectively. □
Lemma 4.
Let $I \subseteq \mathbb{R}$ be a bounded or unbounded interval and $f \in C(I)$. Suppose in addition that there exists an absolute positive constant $C$ with the property that for every $\varepsilon > 0$ there exists $n(\varepsilon) \in \mathbb{N}^+$ such that for any $n \in \mathbb{N}$, $n \ge n(\varepsilon)$, and $j \in \mathbb{Z}$ with $\frac{j}{n}, \frac{j+1}{n} \in I$, we have
$$\left| f\!\left( \frac{j+1}{n} \right) - f\!\left( \frac{j}{n} \right) \right| \le \frac{C \varepsilon}{n}.$$
Then f is a constant function.
Proof. 
Let us choose arbitrary $x_0, y_0 \in I$, $x_0 < y_0$, and $\varepsilon > 0$. The continuity of $f$ implies the existence of $n_0(\varepsilon) \in \mathbb{N}^+$, such that for any $x, y \in I$, $| x - x_0 | \le 1/n_0(\varepsilon)$, $| y - y_0 | \le 1/n_0(\varepsilon)$, we have
$$| f(x) - f(x_0) | \le \varepsilon, \qquad | f(y) - f(y_0) | \le \varepsilon.$$
We now fix $n_1 = \max\{ n(\varepsilon), n_0(\varepsilon), 1/(y_0 - x_0) \}$, where $n(\varepsilon)$ is the constant arising from the assumptions, and let us choose an arbitrary $n \in \mathbb{N}$ such that $n \ge n_1$. Since $1/n_1 \le y_0 - x_0$, it follows that there exist $k \in \mathbb{Z}$ and $l \in \mathbb{N}$ such that
$$\frac{k-1}{n} \le x_0 \le \frac{k}{n} \le \frac{k+1}{n} \le \cdots \le \frac{k+l}{n} \le y_0 \le \frac{k+l+1}{n}.$$
Applying successively the triangle inequality, we get
$$| f(x_0) - f(y_0) | \le \left| f(x_0) - f\!\left( \frac{k}{n} \right) \right| + \left| f\!\left( \frac{k}{n} \right) - f\!\left( \frac{k+1}{n} \right) \right| + \cdots + \left| f\!\left( \frac{k+l-1}{n} \right) - f\!\left( \frac{k+l}{n} \right) \right| + \left| f\!\left( \frac{k+l}{n} \right) - f(y_0) \right|.$$
By the assumption of the lemma, we have
$$\left| f\!\left( \frac{k+p}{n} \right) - f\!\left( \frac{k+p+1}{n} \right) \right| \le \frac{C \varepsilon}{n}, \qquad p = 0, 1, \dots, l-1,$$
and since $\max\left\{ \left| x_0 - \frac{k}{n} \right|, \left| y_0 - \frac{k+l}{n} \right| \right\} \le \frac{1}{n} \le \frac{1}{n_0(\varepsilon)}$, we get
$$| f(x_0) - f(y_0) | \le \frac{l C \varepsilon}{n} + 2\varepsilon.$$
On the other hand, we observe that $\frac{k+l}{n} - \frac{k}{n} \le y_0 - x_0$, which implies that $l \le n (y_0 - x_0)$. Thus, we obtain
$$| f(x_0) - f(y_0) | \le C \varepsilon (y_0 - x_0) + 2\varepsilon.$$
Now, since $\varepsilon > 0$ has been chosen arbitrarily, passing to the infimum over $\varepsilon > 0$ in the previous inequality, we deduce that $f(x_0) = f(y_0)$. By the arbitrariness of $x_0$ and $y_0$, it turns out that $f$ is a constant function on the whole of $I$. □
Lemma 5.
Let $I \subseteq \mathbb{R}$ be a bounded or unbounded interval and $f \in C(I)$ be a non-decreasing function with the property that, for any couple $a, b$, $a < b$, of inner points of $I$ and for every $\varepsilon > 0$, there exists $n(a,b,\varepsilon) \in \mathbb{N}^+$ such that for any $n \in \mathbb{N}$, $n \ge n(a,b,\varepsilon)$, and $j \in \mathbb{Z}$ such that $\frac{j+1}{n}, \frac{2j+1}{2n} \in [a,b]$, we have
$$f\!\left( \frac{j+1}{n} \right) - f\!\left( \frac{2j+1}{2n} \right) \le \frac{\varepsilon}{n}.$$
Then f is a constant function over I.
Proof. 
Let $n \in \mathbb{N}$, $n \ge n(a,b,\varepsilon)$, be such that $1/n \le b - a$, and let us choose an arbitrary $j \in \mathbb{Z}$ such that $a \le \frac{j}{n} \le \frac{j+1}{n} \le b$. We observe that for any $k \in \mathbb{N}^+$, we have
$$\frac{j}{n} < \frac{2^k j + 1}{2^k n} < \frac{j+1}{n}.$$
Therefore, applying successively the assumption of the lemma, we obtain
$$\begin{aligned}
f\!\left( \frac{j+1}{n} \right) - f\!\left( \frac{2j+1}{2n} \right) &\le \frac{\varepsilon}{n}, \\
f\!\left( \frac{2j+1}{2n} \right) - f\!\left( \frac{4j+1}{4n} \right) &\le \frac{\varepsilon}{2n}, \qquad (j := 2j,\ n := 2n) \\
f\!\left( \frac{4j+1}{4n} \right) - f\!\left( \frac{8j+1}{8n} \right) &\le \frac{\varepsilon}{4n}, \qquad (j := 4j,\ n := 4n) \\
&\ \ \vdots \\
f\!\left( \frac{2^k j + 1}{2^k n} \right) - f\!\left( \frac{2^{k+1} j + 1}{2^{k+1} n} \right) &\le \frac{\varepsilon}{2^k n}, \qquad (j := 2^k j,\ n := 2^k n).
\end{aligned}$$
Taking the sums of all the terms on the left-hand and on the right-hand sides of the previous inequalities, respectively, we obtain
$$f\!\left( \frac{j+1}{n} \right) - f\!\left( \frac{2^{k+1} j + 1}{2^{k+1} n} \right) \le \frac{\varepsilon}{n} \left( 1 + \frac{1}{2} + \cdots + \frac{1}{2^k} \right) \le \frac{2\varepsilon}{n}.$$
Since
$$\lim_{k \to \infty} f\!\left( \frac{2^{k+1} j + 1}{2^{k+1} n} \right) = f\!\left( \frac{j}{n} \right),$$
it follows that
$$f\!\left( \frac{j+1}{n} \right) - f\!\left( \frac{j}{n} \right) \le \frac{2\varepsilon}{n}$$
and, since f is non-decreasing, we get
$$\left| f\!\left( \frac{j+1}{n} \right) - f\!\left( \frac{j}{n} \right) \right| \le \frac{2\varepsilon}{n}.$$
By Lemma 4, it follows that f is constant in $[a,b]$. Since a and b are two arbitrary inner points of I and f is continuous, it easily follows that f is constant in I. □
Note that Lemma 4 and Lemma 5 can also be extrapolated from the monograph [3].
Now we can prove the main theorem of this section.
Theorem 1.
Let $f \in C_+([0,1])$ be a non-decreasing function such that
$$\left\| K_n^{(M)}(f, \cdot) - f(\cdot) \right\|_{\infty} = o(n^{-1})$$
as $n \to \infty$. Then f is constant over $[0,1]$.
Proof. 
Let us choose arbitrary $a, b \in (0,1)$, $a < b$. Further, let $n \in \mathbb{N}^+$ be sufficiently large such that $1/n < b - a$ and $b < 1 - \frac{1}{2n}$. Now, we fix $j \in \{0, 1, \dots, n-2\}$ such that $\frac{j+1}{n}, \frac{j}{n} + \frac{1}{2n} \in [a,b]$. We have
$$K_n^{(M)}\!\left( f, \frac{j}{n} + \frac{1}{2n} \right) = K_n^{(M)}\!\left( f, \frac{2j+1}{2n} \right) = \frac{\displaystyle\bigvee_{k=0}^{n-1} \phi_{\sigma}\!\left( n \cdot \frac{2j+1}{2n} - k \right) n \int_{k/n}^{(k+1)/n} f(u)\, du}{\displaystyle\bigvee_{k=0}^{n-1} \phi_{\sigma}\!\left( n \cdot \frac{2j+1}{2n} - k \right)}.$$
By Lemma 3, it follows that
$$\bigvee_{k=0}^{n-1} \phi_{\sigma}\!\left( n \cdot \frac{2j+1}{2n} - k \right) = \phi_{\sigma}\!\left( n \cdot \frac{2j+1}{2n} - j \right) = \phi_{\sigma}\!\left( n \cdot \frac{2j+1}{2n} - (j+1) \right),$$
which easily implies that
$$K_n^{(M)}\!\left( f, \frac{2j+1}{2n} \right) \ge n \int_{(j+1)/n}^{(j+2)/n} f(u)\, du.$$
Moreover, recalling that f is non-decreasing, it follows that $K_n^{(M)}\!\left( f, \frac{2j+1}{2n} \right) \ge f\!\left( \frac{j+1}{n} \right)$, and this implies
$$0 \le f\!\left( \frac{j+1}{n} \right) - f\!\left( \frac{2j+1}{2n} \right) \le K_n^{(M)}\!\left( f, \frac{2j+1}{2n} \right) - f\!\left( \frac{2j+1}{2n} \right) \le \left\| K_n^{(M)}(f, \cdot) - f \right\|_{\infty} = o(n^{-1}), \quad \text{as } n \to +\infty.$$
Then, we can prove that for every $\varepsilon > 0$ there exists $n(\varepsilon) \in \mathbb{N}^+$ such that, for any $n \in \mathbb{N}^+$ with $n \ge n(\varepsilon)$ and $j \in \mathbb{Z}$ such that $\frac{j+1}{n}, \frac{2j+1}{2n} \in [a,b]$, we have
$$f\!\left( \frac{j+1}{n} \right) - f\!\left( \frac{2j+1}{2n} \right) \le \frac{\varepsilon}{n},$$
and hence the proof follows by Lemma 5. □
Remark 1.
The previous theorem can be easily generalized to the case of functions defined on arbitrary intervals $[a,b]$, $a, b \in \mathbb{R}$, instead of $[0,1]$. This is possible by defining the operators $K_n^{(M)}$ on generic intervals $[a,b]$ (as done in [9]) and then working with continuous and non-decreasing $f : [a,b] \to \mathbb{R}_+$.

4. Local Inverse Result

The main aim of this section is to prove an inverse theorem of approximation. We will use a strategy similar to that presented in the previous section.
Lemma 6.
Let $I \subseteq \mathbb{R}$ be a bounded or unbounded interval and $f \in C(I)$. Suppose in addition that there exists an absolute positive constant $C$ such that, for any sufficiently large $n \in \mathbb{N}$ and every $j \in \mathbb{Z}$ with $\frac{j}{n}, \frac{j+1}{n} \in I$, we have
$$\left| f\!\left( \frac{j+1}{n} \right) - f\!\left( \frac{j}{n} \right) \right| \le \frac{C}{n}.$$
Then $f \in \mathrm{Lip}(I)$.
Proof. 
Let us choose arbitrary $x_0, y_0 \in I$, $x_0 < y_0$. Moreover, let $\varepsilon > 0$ with $\varepsilon < y_0 - x_0$. The continuity of $f$ implies the existence of $n_0(\varepsilon) \in \mathbb{N}^+$, such that for any $x, y \in I$, $| x - x_0 | \le 1/n_0(\varepsilon)$, $| y - y_0 | \le 1/n_0(\varepsilon)$, we have
$$| f(x) - f(x_0) | \le \varepsilon, \qquad | f(y) - f(y_0) | \le \varepsilon.$$
We now fix $n_1 = \max\{ n_0(\varepsilon), 1/(y_0 - x_0) \}$, and let us choose an arbitrary $n \in \mathbb{N}$ with $n \ge n_1$. Proceeding as in the proof of Lemma 4, we get
$$| f(x_0) - f(y_0) | \le \frac{l C}{n} + 2\varepsilon,$$
and since $l \le n (y_0 - x_0)$ (again as in Lemma 4) and $\varepsilon < y_0 - x_0$, we get
$$| f(x_0) - f(y_0) | \le (C + 2)(y_0 - x_0).$$
Then the thesis follows by the arbitrariness of $x_0$ and $y_0$. □
Lemma 7.
Let $I \subseteq \mathbb{R}$ be a bounded or unbounded interval and $f \in C(I)$ be a non-decreasing function with the property that, for any couple $a, b$, $a < b$, of inner points of $I$, there exists a constant $C > 0$ such that, for every sufficiently large $n \in \mathbb{N}$ and $j \in \mathbb{Z}$ with $\frac{j+1}{n}, \frac{2j+1}{2n} \in [a,b]$, we have
$$f\!\left( \frac{j+1}{n} \right) - f\!\left( \frac{2j+1}{2n} \right) \le \frac{C}{n}.$$
Then $f \in \mathrm{Lip}([a,b])$.
Proof. 
Let $n \in \mathbb{N}$ be sufficiently large such that $1/n \le b - a$, and let us choose an arbitrary $j \in \mathbb{Z}$ such that $a \le \frac{j}{n} \le \frac{j+1}{n} \le b$. Arguing as in Lemma 5, we get
$$f\!\left( \frac{j+1}{n} \right) - f\!\left( \frac{2^{k+1} j + 1}{2^{k+1} n} \right) \le \frac{C}{n} \left( 1 + \frac{1}{2} + \cdots + \frac{1}{2^k} \right) \le \frac{2C}{n}, \qquad k \in \mathbb{N}^+.$$
Taking the limit as $k \to +\infty$, we obtain
$$f\!\left( \frac{j+1}{n} \right) - f\!\left( \frac{j}{n} \right) \le \frac{2C}{n}$$
and, since f is non-decreasing,
$$\left| f\!\left( \frac{j+1}{n} \right) - f\!\left( \frac{j}{n} \right) \right| \le \frac{2C}{n}.$$
Hence, by Lemma 6, it follows that $f \in \mathrm{Lip}([a,b])$. □
For results similar to those of Lemma 6 and Lemma 7 (only in the case of bounded intervals), one can see, e.g., [3].
Now we can finally prove a (local) inverse theorem of approximation.
Theorem 2.
Let $f \in C_+([0,1])$ be a non-decreasing function such that
$$\left\| K_n^{(M)}(f, \cdot) - f(\cdot) \right\|_{\infty} = \mathcal{O}(n^{-1})$$
as $n \to \infty$. Then, for every $[a,b] \subset (0,1)$, it turns out that $f \in \mathrm{Lip}([a,b])$.
Proof. 
As in Theorem 1, we choose arbitrary $a, b \in (0,1)$, $a < b$. Let now $n \in \mathbb{N}^+$ be sufficiently large such that $1/n < b - a$ and $b < 1 - \frac{1}{2n}$. Then, let $j \in \{0, 1, \dots, n-2\}$ be such that $\frac{j+1}{n}, \frac{2j+1}{2n} \in [a,b]$. Proceeding as in the proof of the above-mentioned theorem, we immediately get
$$0 \le f\!\left( \frac{j+1}{n} \right) - f\!\left( \frac{2j+1}{2n} \right) \le \left\| K_n^{(M)}(f, \cdot) - f(\cdot) \right\|_{\infty} = \mathcal{O}(n^{-1}), \qquad \text{as } n \to +\infty,$$
and, by Lemma 7, this implies that $f \in \mathrm{Lip}([a,b])$. □

Author Contributions

Conceptualization, formal analysis, writing—review and editing M.C.; Conceptualization, formal analysis, writing—review and editing L.C.; Conceptualization, formal analysis, writing—review and editing D.C.; Conceptualization, formal analysis, writing—review and editing S.G.G.; Conceptualization, formal analysis, writing—review and editing G.V. All authors have read and agreed to the published version of the manuscript.

Funding

D. Costarelli has been partially supported within the 2020 GNAMPA-INdAM Project “Analisi reale, teoria della misura ed approssimazione per la ricostruzione di immagini”, and G. Vinti has been partially supported by the projects: (1) Ricerca di Base 2017 dell’Università degli Studi di Perugia—“Metodi di teoria degli operatori e di Analisi Reale per problemi di approssimazione ed applicazioni”, (2) Ricerca di Base 2018 dell’Università degli Studi di Perugia—“Metodi di Teoria dell’Approssimazione, Analisi Reale, Analisi Nonlineare e loro Applicazioni”, (3) “Metodi e processi innovativi per lo sviluppo di una banca di immagini mediche per fini diagnostici” funded by the Fondazione Cassa di Risparmio di Perugia (FCRP), 2018, (4) “Metodiche di Imaging non invasivo mediante angiografia OCT sequenziale per lo studio delle Retinopatie degenerative dell’Anziano (M.I.R.A.)”, funded by FCRP, 2019.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The study did not report any data.

Acknowledgments

The authors M. Cantarini, D. Costarelli and G. Vinti are members of the Gruppo Nazionale per l’Analisi Matematica, la Probabilità e le loro Applicazioni (GNAMPA) of the Istituto Nazionale di Alta Matematica (INdAM), of the network RITA (Research ITalian network on Approximation), and of the UMI group “Teoria dell’Approssimazione e Applicazioni (T.A.A.)”.

Conflicts of Interest

The authors declare no conflict of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript; or in the decision to publish the results.

Abbreviations

The following abbreviations are used in this manuscript:
ReLU: Rectified Linear Unit

References

  1. Coroianu, L.; Gal, S.G. Approximation by nonlinear generalized sampling operators of max-product kind. Sampl. Theory Signal Image Process. 2010, 9, 59–75.
  2. Coroianu, L.; Gal, S.G. Approximation by max-product sampling operators based on sinc-type kernels. Sampl. Theory Signal Image Process. 2011, 10, 211–230.
  3. Bede, B.; Coroianu, L.; Gal, S.G. Approximation by Max-Product Type Operators; Springer: Berlin, Germany, 2016.
  4. Güngör, S.Y.; Ispir, N. Approximation by Bernstein-Chlodowsky operators of max-product kind. Math. Commun. 2018, 23, 205–225.
  5. Holhos, A. Weighted approximation of functions by Meyer-König and Zeller operators of max-product type. Numer. Funct. Anal. Optim. 2018, 39, 689–703.
  6. Holhos, A. Weighted approximation of functions by Favard operators of max-product type. Period. Math. Hung. 2018, 77, 340–346.
  7. Gokcer, T.Y.; Duman, O. Approximation by max-min operators: A general theory and its applications. Fuzzy Sets Syst. 2020, 394, 146–161.
  8. Gokcer, T.Y.; Duman, O. Regular summability methods in the approximation by max-min operators. Fuzzy Sets Syst. 2022, 426, 106–120.
  9. Costarelli, D.; Vinti, G. Approximation by max-product neural network operators of Kantorovich type. Results Math. 2016, 69, 505–519.
  10. Cardaliaguet, P.; Euvrard, G. Approximation of a function and its derivative with a neural network. Neural Netw. 1992, 5, 207–220.
  11. Cao, F.; Chen, Z. The approximation operators with sigmoidal functions. Comput. Math. Appl. 2009, 58, 758–765.
  12. Cao, F.; Chen, Z. The construction and approximation of a class of neural networks operators with ramp functions. J. Comput. Anal. Appl. 2012, 14, 101–112.
  13. Cao, F.; Chen, Z. Scattered data approximation by neural networks operators. Neurocomputing 2016, 190, 237–242.
  14. Dai, H.; Xie, J.; Chen, W. Event-Triggered Distributed Cooperative Learning Algorithms over Networks via Wavelet Approximation. Neural Process. Lett. 2019, 50, 669–700.
  15. Ismailov, V.E. On the approximation by neural networks with bounded number of neurons in hidden layers. J. Math. Anal. Appl. 2014, 417, 963–969.
  16. Cao, F.; Liu, B.; Park, D.S. Image classification based on effective extreme learning machine. Neurocomputing 2013, 102, 90–97.
  17. Agostinelli, F.; Hoffman, M.; Sadowski, P.; Baldi, P. Learning Activation Functions to Improve Deep Neural Networks. arXiv 2015, arXiv:1412.6830v3.
  18. Iliev, A.; Kyurkchiev, N.; Markov, S. On the approximation of the cut and step functions by logistic and Gompertz functions. Biomath 2015, 4, 1510101.
  19. Yarotsky, D. Error bounds for approximations with deep ReLU networks. Neural Netw. 2017, 94, 103–114.
  20. Bajpeyi, S.; Sathish Kumar, A. Approximation by exponential sampling type neural network operators. Anal. Math. Phys. 2021, 11, 108.
  21. Cantarini, M.; Costarelli, D.; Vinti, G. Asymptotic expansions for the neural network operators of the Kantorovich type and high order of approximation. Mediterr. J. Math. 2021, 18, 66.
  22. Costarelli, D.; Sambucini, A.R. Approximation results in Orlicz spaces for sequences of Kantorovich max-product neural network operators. Results Math. 2018, 73, 15.
  23. Cucker, F.; Zhou, D.X. Learning Theory: An Approximation Theory Viewpoint; Cambridge University Press: Cambridge, UK, 2007.
  24. Kadak, U. Fractional type multivariate neural network operators. Math. Methods Appl. Sci. 2021.
  25. Coroianu, L.; Gal, S.G. Saturation results for the truncated max-product sampling operators based on sinc and Fejér-type kernels. Sampl. Theory Signal Image Process. 2012, 11, 113–132.
  26. Coroianu, L.; Gal, S.G. Saturation and inverse results for the Bernstein max-product operator. Period. Math. Hung. 2014, 69, 126–133.
  27. Costarelli, D.; Vinti, G. Saturation classes for max-product neural network operators activated by sigmoidal functions. Results Math. 2017, 72, 1555–1569.
  28. Ivanov, K. On a new characteristic of functions. II. Direct and converse theorems for the best algebraic approximation in C[−1,1] and Lp[−1,1]. Pliska 1983, 5, 151–163.
  29. Costarelli, D.; Vinti, G. Convergence results for a family of Kantorovich max-product neural network operators in a multivariate setting. Math. Slovaca 2017, 67, 1469–1480.
  30. Cybenko, G. Approximation by superpositions of a sigmoidal function. Math. Control. Signals Syst. 1989, 2, 303–314.
  31. Costarelli, D.; Vinti, G. Convergence for a family of neural network operators in Orlicz spaces. Math. Nachr. 2017, 290, 226–235.
  32. Goebbels, S. On sharpness of error bounds for univariate single hidden layer feedforward neural networks. Results Math. 2020, 75, 109.
  33. Li, Y.; Yuan, Y. Convergence Analysis of Two-layer Neural Networks with ReLU Activation. arXiv 2017, arXiv:1705.09886. Available online: https://arxiv.org/abs/1705.09886 (accessed on 1 December 2021).
  34. Zhang, C.; Woodland, P.C. DNN speaker adaptation using parameterised sigmoid and ReLU hidden activation functions. In Proceedings of the 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Shanghai, China, 20–25 March 2016.
  35. Agarap, A.F. Deep Learning using Rectified Linear Units (ReLU). arXiv 2018, arXiv:1803.08375.
  36. DeVore, R.A.; Lorentz, G.G. Constructive Approximation; Springer Science & Business Media: Berlin/Heidelberg, Germany, 1992.
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Share and Cite

MDPI and ACS Style

Cantarini, M.; Coroianu, L.; Costarelli, D.; Gal, S.G.; Vinti, G. Inverse Result of Approximation for the Max-Product Neural Network Operators of the Kantorovich Type and Their Saturation Order. Mathematics 2022, 10, 63. https://doi.org/10.3390/math10010063
