On behavior of signs for the heat equation and a diffusion method for data separation

Consider the solution $u(x,t)$ of the heat equation with initial data $u_0$. The diffusive sign $S_D[u_0](x)$ is defined as the limit of the sign of $u(x,t)$ as $t \to 0$. A sufficient condition on $x \in \mathbf{R}^d$ and $u_0$ for $S_D[u_0](x)$ to be well-defined is given. A few examples of $u_0$ violating and fulfilling this condition are given. It turns out that this diffusive sign is also related to a variational problem whose energy is the Dirichlet energy with a fidelity term. If the initial data is a difference of characteristic functions of two disjoint sets $A$ and $B$, it turns out that the boundary of the set $\{S_D[u_0](x) = 1\}$ (or $\{S_D[u_0](x) = -1\}$) is roughly an equi-distance hypersurface from $A$ and $B$, and this gives a separation of two data sets.


Introduction
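We consider a simple Cauchy problem for the heat equation in $\mathbf{R}^d$ ($d \ge 1$) with real-valued bounded (measurable) initial data $u_0$, of the form

$$u_t - \Delta u = 0 \quad \text{in } \mathbf{R}^d \times (0,\infty), \tag{1.1}$$
$$u|_{t=0} = u_0 \quad \text{on } \mathbf{R}^d. \tag{1.2}$$

The unique bounded solution $u$ is known (see e.g. [W]) to be represented by the Gaussian kernel $G_t$ in the form

$$u(x,t) = (G_t * u_0)(x) = \int_{\mathbf{R}^d} G_t(x-y)\,u_0(y)\,dy, \qquad G_t(x) = (4\pi t)^{-d/2} \exp\left(-\frac{|x|^2}{4t}\right).$$

We are interested in the behavior of the sign of $u$ as $t$ tends to zero.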
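As a concrete illustration of the Gaussian representation, the following minimal sketch evaluates $u(x,t)$ in one dimension for initial data that is a signed sum of interval indicators, using the closed form $\int_a^b G_t(x-y)\,dy = \tfrac12[\operatorname{erf}((x-a)/(2\sqrt{t})) - \operatorname{erf}((x-b)/(2\sqrt{t}))]$. The function names and the sample intervals are illustrative assumptions and do not come from the paper.

```python
import math

def heat_indicator(x, t, a, b):
    """Value at (x, t) of the solution of u_t = u_xx with initial data chi_(a,b)."""
    s = 2.0 * math.sqrt(t)
    return 0.5 * (math.erf((x - a) / s) - math.erf((x - b) / s))

def heat_signed_intervals(x, t, pieces):
    """pieces is a list of (coefficient, a, b); returns sum of coefficient * (G_t * chi_(a,b))(x)."""
    return sum(c * heat_indicator(x, t, a, b) for c, a, b in pieces)

def sign(v):
    """Sign of a real number as -1, 0 or 1."""
    return (v > 0) - (v < 0)

# Hypothetical data: u_0 = +1 on (1, 2), -1 on (-2.5, -1.5), 0 elsewhere.
pieces = [(1.0, 1.0, 2.0), (-1.0, -2.5, -1.5)]
for x in (0.0, -1.0):
    print(x, sign(heat_signed_intervals(x, 0.05, pieces)))
```

For very small $t$ the values underflow toward zero in floating point, so a sign read off this way is only meaningful while the computed value is well above rounding error.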
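We define the diffusive sign of $u_0$ at a point $\hat{x} \in \mathbf{R}^d$ by

$$S_D[u_0](\hat{x}) = \lim_{t \downarrow 0} \operatorname{sgn} u(\hat{x}, t)$$

whenever the limit exists. If $u_0$ is continuous at $\hat{x}$ and $\operatorname{sgn} u_0(\hat{x}) \neq 0$, the diffusive sign is well-defined at $\hat{x}$ and agrees with $\operatorname{sgn} u_0(\hat{x})$, since $u(x,t)$ is continuous at $(\hat{x}, 0)$; see e.g. [GGS]. However, if $u_0(\hat{x}) = 0$, the diffusive sign may not be well-defined even if $u_0$ is continuous near $\hat{x}$. We show this phenomenon by giving explicit examples where $u(\hat{x},t)$ changes its sign infinitely many times as $t$ tends to zero (Lemma 2.2 and Theorem 2.3).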
Our main goal in this paper is to give a sufficient condition on $u_0$ so that $S_D[u_0](\hat{x})$ is well-defined for a given point $\hat{x}$. In the one-dimensional problem this is related to the number of changes of sign, which is also called the "number of zeros" in the literature. Let $Z[u_0]$ be the supremum over all $k$ such that there exist $-\infty < x_0 < x_1 < \cdots < x_k < \infty$ with $u_0(x_i)\,u_0(x_{i+1}) < 0$ ($i = 0, 1, \ldots, k-1$). We show that $S_D[u_0](\hat{x})$ is well-defined if $Z[\bar{u}_0]$ is locally finite for the symmetrization

$$\bar{u}_0(x) = \frac{u_0(\hat{x} + x) + u_0(\hat{x} - x)}{2}, \tag{1.5}$$

provided that $u_0$ is continuous at $\hat{x}$ with $u_0(\hat{x}) = 0$ (Theorem 2.1). For the higher-dimensional case one should replace $\bar{u}_0$ by the radial average

$$\bar{u}_0(r) = \frac{1}{|S^{d-1}|} \int_{|\omega| = 1} u_0(\hat{x} + r\omega)\, d\mathcal{H}^{d-1}(\omega), \tag{1.6}$$

where $\mathcal{H}^{d-1}$ denotes the $(d-1)$-dimensional Hausdorff measure, so that $d\mathcal{H}^{d-1}$ is the surface element (Theorem 2.4). These assertions can be proved by a simple application of the strong maximum principle [PW]. Under this setting one is able to prove that the set of $x$ where $S_D[u_0](x) = 0$ is a codimension-one set, so it is negligible in the sense of the Lebesgue measure. This means that the zero set of the diffusive sign is thin even if the original zero set of $u_0$ has an interior.

The diffusive sign is related to the asymptotic sign for a problem of deblurring images. For a given gray-scale image $u_0$, one way to recover the image is to minimize the strictly convex variational problem

$$v \mapsto \frac{1}{2} \int_{\mathbf{R}^d} |\nabla v|^2\,dx + \frac{\lambda}{2} \int_{\mathbf{R}^d} |v - u_0|^2\,dx, \tag{1.7}$$

where $\lambda > 0$ is the fidelity constant. If $v_\lambda$ is the unique ($H^1$) minimizer of (1.7), then $v_\lambda$ solves the Euler-Lagrange equation

$$-\Delta v_\lambda + \lambda v_\lambda = \lambda u_0. \tag{1.8}$$

We define the asymptotic sign of $u_0$ at $x$ by

$$S_a[u_0](x) = \lim_{\lambda \to \infty} \operatorname{sgn} v_\lambda(x). \tag{1.9}$$

Large fidelity formally corresponds to small time in the heat equation. In fact, when one approximates the solution of the heat equation by a fully implicit finite difference approximation in time, each step is interpreted as the Euler-Lagrange equation of the variational problem. So we expect that $S_a[u_0](x) = S_D[u_0](x)$, provided that $S_D$ is well-defined. Indeed, we shall prove this rigorously by writing the Newton potential by means of the heat semigroup (Theorem 3.1).
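The number of sign changes $Z[\cdot]$ can be estimated on sampled data. The sketch below counts sign changes in a finite list of samples, skipping zeros. The function name is an illustrative assumption, and because sampling can miss oscillations between grid points, the count is only a lower bound for $Z$.

```python
import math

def count_sign_changes(values):
    """Number of sign changes in a finite sequence of reals, ignoring zeros.

    For samples v_i = f(x_i) with x_0 < x_1 < ..., this is a lower bound for Z[f].
    """
    signs = [1 if v > 0 else -1 for v in values if v != 0]
    return sum(1 for s, s_next in zip(signs, signs[1:]) if s != s_next)

# Samples of sin on a grid inside (0, 3*pi): the sign pattern is +, -, +.
xs = [0.1 + 0.05 * k for k in range(186)]
print(count_sign_changes([math.sin(x) for x in xs]))
```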
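In [ROF] the total variation is used in (1.7) instead of the Dirichlet energy for the recovery of a blurred image. The idea is to minimize

$$\int_{\mathbf{R}^d} |\nabla v|\,dx + \frac{\lambda}{2} \int_{\mathbf{R}^d} |v - u_0|^2\,dx \tag{1.10}$$

instead of (1.7). One is able to define the corresponding sign $S_t[u_0](x)$ as the limit of $\operatorname{sgn} v_\lambda(x)$ as $\lambda \to \infty$, where $v_\lambda$ is now the minimizer of (1.10). However, as it turns out, this is quite different from $S_a$ or $S_D$ because the speed of diffusion is very slow. The set where the diffusive sign is zero is rather thin even if the zero set of $u_0$ has an interior, while the set of zeros of $S_t[u_0]$ may have an interior.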
We shall see this phenomenon by an example (Theorem 3.3).

We shall apply this diffusive method to separate sets of data. Suppose that each point of $\mathbf{R}^d$ fulfills either property $P$ or property $Q$ (with $P \cap Q = \emptyset$), except for a very thin set. However, we only know that property $P$ is fulfilled in some subset $A$ of $\mathbf{R}^d$ and that property $Q$ is fulfilled in some subset $B$ of $\mathbf{R}^d$ (with $A \cap B = \emptyset$). We would like to classify every other point as fulfilling $P$ or $Q$ in a reasonable way. Usually, one tries to find a straight line (or a simple curve) dividing $\mathbf{R}^2$ into two sets so that $A$ lies on one side of the line and $B$ on the other. The line is taken so that its distance to the closest point of $A$ equals its distance to the closest point of $B$, and so that this common distance is maximized by choosing a suitable normal direction of the line. (In a higher-dimensional space the line should of course be a hyperplane.) This is a simple example of a support vector machine [CST], [Std], and it is widely used for data separation. This separation line is called a maximal margin classifier [Std, 22.3.1]. We propose here to use the heat equation to find a separation curve, which can be interpreted as an example of the geometric diffusion approach explained in [CLLMNWZ].
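We set $u_0 = \chi_A - \chi_B$, where $\chi_A$ denotes the characteristic function of $A$. We implicitly assume that $A$ and $B$ are Lebesgue measurable. We propose to classify a point of $\mathbf{R}^d$ by using the diffusive sign $S_D[u_0]$. We set

$$\mathcal{A} = \{x \in \mathbf{R}^d : S_D[u_0](x) = 1\}, \qquad \mathcal{B} = \{x \in \mathbf{R}^d : S_D[u_0](x) = -1\}.$$

We also give a few numerical tests to draw a separation curve $\partial\mathcal{A}$.

In [BF], instead of using (1.10), a Ginzburg-Landau type energy

$$\int_{\mathbf{R}^d} \left( \frac{\varepsilon}{2} |\nabla v|^2 + \frac{1}{2\varepsilon} (1 - v^2)^2 \right) dx + \frac{\lambda}{2} \int_{\mathbf{R}^d} |v - u_0|^2\,dx \tag{1.12}$$

is proposed. It is essentially known that the Gamma limit of (1.12) as $\varepsilon \to 0$ is (1.10) (if one puts a multiplicative constant $4/3$ in front of $|\nabla v|$ in (1.10)); see e.g. [MM], [S]. Compared with (1.7), this variational problem emphasizes the sign very much. Using (1.12), the authors of [BF] separate several data sets on graphs as well as in $\mathbf{R}^d$. It is not clear whether or not our separation by $\mathcal{A}$ and $\mathcal{B}$ is the same as theirs. We shall give several speculations in this paper.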

Sign of a solution of the heat equation
We give a sufficient condition on $u_0$ so that $S_D[u_0](x)$ is well-defined. We start with the one-dimensional problem.
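Theorem 2.1. Assume that $u_0$ is a (real-valued) bounded measurable function on $\mathbf{R}$ and that $u_0$ is piecewise continuous (with possibly countably many discontinuities having at most finitely many accumulation points) and, at each discontinuity, either left or right continuous. Assume that $u_0$ is continuous at $\hat{x}$ and $u_0(\hat{x}) = 0$. If the number of changes of sign $Z[\bar{u}_0]$ of $\bar{u}_0$ defined by (1.5) is locally finite, then $S_D[u_0](\hat{x})$ is well-defined.

Proof. We may assume that $u_0 \not\equiv 0$. We symmetrize the problem by considering

$$\bar{u}(x,t) = \frac{u(\hat{x} + x, t) + u(\hat{x} - x, t)}{2}.$$

Evidently, $\bar{u}$ solves the heat equation with initial data $\bar{u}_0$. Assume that $\bar{u}_0 \not\equiv 0$. Since $Z[\bar{u}_0]$ is (locally) finite and $\bar{u}_0$ is even in $x$, there is an interval $(-\gamma_0, \gamma_0)$ such that $\bar{u}_0$ is continuous near $\gamma_0 > 0$ and that either (i) $\bar{u}_0 \ge 0$ on $(-\gamma_0, \gamma_0)$ with $\bar{u}_0(\pm\gamma_0) > 0$, or (ii) $\bar{u}_0 \le 0$ on $(-\gamma_0, \gamma_0)$ with $\bar{u}_0(\pm\gamma_0) < 0$. Since both cases can be treated similarly, we consider the first case. Since $\bar{u}(x,t)$ is continuous at $x = \gamma_0$, $t = 0$ (see e.g. [GGS, Chapter 1]), we may assume that there is $\hat{t} > 0$ such that $\bar{u}(x,t) > 0$ on $\{\pm\gamma_0\} \times [0, \hat{t})$. By the strong maximum principle [PW], $\bar{u} > 0$ in $(-\gamma_0, \gamma_0) \times (0, \hat{t})$. In particular $u(\hat{x}, t) = \bar{u}(0,t) > 0$ for $t \in (0, \hat{t})$, so $S_D[u_0](\hat{x})$ is well-defined and equals one. A symmetric argument yields that $S_D[u_0](\hat{x}) = -1$ for case (ii).

If $\bar{u}_0 \equiv 0$, then $\bar{u} \equiv 0$, that is, $u(\hat{x} + x, t)$ is odd with respect to $x$. Since $u_0$ is assumed to be continuous at $\hat{x}$, $u(x,t)$ is continuous at $x = \hat{x}$ and $t = 0$. Since $u(\hat{x} + x, t)$ is odd in $x$, $u(\hat{x} + 0, t) = 0$ for sufficiently small $t$. This means that $S_D[u_0](\hat{x})$ is well-defined and equals zero.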
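The role of the symmetrization can be checked on a simple example. The sketch below assumes the normalization $\bar{u}_0(x) = (u_0(\hat{x}+x) + u_0(\hat{x}-x))/2$ and uses the hypothetical data $u_0(x) = \sin x + \cos x - 1$ with $\hat{x} = 0$, for which $\bar{u}_0(x) = \cos x - 1 \le 0$ and the exact solution gives $u(0,t) = e^{-t} - 1 < 0$, so the diffusive sign at $0$ is $-1$.

```python
import math

def symmetrize(u0, x_hat):
    """Return the even function x -> (u0(x_hat + x) + u0(x_hat - x)) / 2."""
    def u0_bar(x):
        return 0.5 * (u0(x_hat + x) + u0(x_hat - x))
    return u0_bar

def u0(x):
    # Hypothetical bounded data with u0(0) = 0.
    return math.sin(x) + math.cos(x) - 1.0

def u_at_origin(t):
    # Exact solution of u_t = u_xx at x = 0: e^{-t}(sin 0 + cos 0) - 1.
    return math.exp(-t) - 1.0

u0_bar = symmetrize(u0, 0.0)
print(u0_bar(0.3), math.cos(0.3) - 1.0)   # both values agree up to rounding
print(u_at_origin(1e-3) < 0)              # sign of u(0, t) for small t

# Odd data around x_hat give a vanishing symmetrization.
print(symmetrize(math.sin, 0.0)(0.7))
```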
Remark 1. (i) The idea of symmetrization has been used many times to prove qualitative properties of solutions of semilinear heat equations. For example, Chen and Matano [CM] proved that the maximum point of $w(x,t)$ ($x \in \mathbf{R}$) converges to a unique point as $t$ tends to the blow-up time when $w$ solves $w_t = \Delta w + w^p$ ($p > 1$), by considering the symmetrization $\bar{u}$. (ii) There are several studies on the number of zeros or the number of changes of sign of a solution of a general one-dimensional linear parabolic equation of second order. It is known that this number is nonincreasing in time. This type of result goes back to Nickel [N] and was rediscovered by Matano [M] and Henry [H], who proved by the strong maximum principle that the number of changes of sign does not increase. For the nonincrease of the number of zeros, the reader is referred to an article by Angenent [A], which also analyzes how zeros merge when their number actually decreases. That paper appeals to an asymptotic analysis near a point of interest by introducing similarity variables (cf. [GGS]).

If the number of changes of sign $Z[\bar{u}_0]$ is infinite, the limit defining $S_D[u_0](\hat{x})$ may fail to exist. We shall give an explicit example of such $u_0$. In fact, in our example $u(x,t)$ is oscillatory in time, and its sign changes between $+1$ and $-1$ infinitely many times as $t \downarrow 0$.

Lemma 2.2. Let U k be a function of the form
Let $u$ be the solution of (1.1)-(1.2) with initial data $u_0 = U_k$. Then the sign of $u(0,t)$ changes between $1$ and $-1$ infinitely many times as $t \downarrow 0$, provided that $k \ge 8$.
Proof. By a direct calculation we have We set .
Note that sgn a n (t) = (−1) n . It is clear that If a n (t) = a n+1 (t) for n ≥ 1, then we obtain ( 2 n + 1 2 n+1 By a direct calculation we see that It is clear that t n > t m for n < m, and a n (t) < a n+1 (t) for t < t n , For t ∈ (t n+1 , t n ) we see that By the same argument we obtain when a n (t) = 2 a n+1 (t) , when 2 a n (t) = a n+1 (t) .
where [a] is a largest integer less than or equal to a. We thus obtain Then the solution of (1.1) with u 0 = U k for k ≥ 8 has infinitely many zeros of t at x = 0.
We are tempted to say that it is enough to assume that $u_0$ itself has at most finitely many changes of sign to guarantee that $S_D[u_0](x)$ is well-defined. Unfortunately, this is not true in general. In fact, in the following example $u_0$ changes its sign just once, but its symmetrization $\bar{u}_0$ with respect to zero has infinite $Z[\bar{u}_0]$.
Proof. Since, by the symmetry of the Gaussian kernel, $u(0,t)$ is the same as in Lemma 2.2, our assertion is already proved in Lemma 2.2.
We give a sufficient condition on $u_0$ so that $S_D[u_0](\hat{x})$ is well-defined in higher dimensions.

Theorem 2.4. Assume that $u_0$ is continuous at $\hat{x}$ and $u_0(\hat{x}) = 0$. Let $\bar{u}_0$ be the radial average around $\hat{x}$ defined by (1.6). Assume that $\bar{u}_0$ is piecewise continuous (with possibly countably many discontinuities having at most finitely many accumulation points). Moreover, it is left or right continuous at each discontinuity. If the number of changes of sign $Z[\bar{u}_0]$ is locally finite, then $S_D[u_0](\hat{x})$ is well-defined.

Proof. We may assume $u_0 \not\equiv 0$. We study the radial average $\bar{u}$ of the solution $u$ around $\hat{x}$. Evidently, $\bar{u}$ is a radial solution of the heat equation (1.1) with initial data $\bar{u}_0$. Assume that $\bar{u}_0 \not\equiv 0$. Then we proceed exactly as in Theorem 2.1 and observe that $\bar{u}(0,t) > 0$ for $t \in (0, \hat{t})$ in case (i), which implies that $u(\hat{x}, t) > 0$ for $t \in (0, \hat{t})$.

The set $\Sigma$ of points $\hat{x}$ at which $S_D[u_0](\hat{x}) = 0$ is contained in the set of zeros of $u(x,t)$ for all $t$, since $\bar{u}_0$ at $\hat{x}$ equals zero. Since $u(\cdot, t)$ is analytic in space for $t > 0$, $\Sigma$ is included in an analytic variety, so it is of locally finite $(d-1)$-dimensional Hausdorff measure.
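In two dimensions the radial average around $\hat{x}$ can be approximated by a quadrature over the angle. The following sketch assumes the normalized average (mean value over the circle of radius $r$) and uses the midpoint rule; the function names and the test data are illustrative assumptions.

```python
import math

def radial_average(u0, x_hat, r, n=720):
    """Midpoint-rule approximation of the mean of u0 over the circle |x - x_hat| = r in R^2."""
    total = 0.0
    for k in range(n):
        theta = 2.0 * math.pi * (k + 0.5) / n
        total += u0(x_hat[0] + r * math.cos(theta), x_hat[1] + r * math.sin(theta))
    return total / n

def chi_unit_square(x, y):
    """Characteristic function of the square [0, 1] x [0, 1]."""
    return 1.0 if 0.0 <= x <= 1.0 and 0.0 <= y <= 1.0 else 0.0

# Centered at a corner of the square, a quarter of each small circle lies inside.
print(radial_average(chi_unit_square, (0.0, 0.0), 0.5))
# For the smooth function x^2 the mean over the circle of radius r is r^2 / 2.
print(radial_average(lambda x, y: x * x, (0.0, 0.0), 2.0))
```

Evaluating this for a range of radii and counting sign changes of the resulting samples gives a rough numerical check of the hypothesis on $Z[\bar{u}_0]$.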

Remark 2.
Even if the heat equation is replaced by a general second-order parabolic equation with non-analytic coefficients it is known that the set of zeros of a solution at a fixed time has at most locally finite d − 1 Hausdorff dimension [XYChen].
We next study what kind of initial data satisfies the local finiteness of $Z[\bar{u}_0]$. For further reference, we say that $v$ is a (locally) finitely many sign-changing function if $Z[v]$ is (locally) finite. Evidently, if $u_0(x_1)$ is real analytic, like $\sin x_1$, then $\bar{u}_0$ is a locally finitely many sign-changing function. For data separation it is convenient to consider characteristic functions. Assume that $A$ and $B$ are each a disjoint union of possibly countably many open intervals whose lengths are bounded from below by some positive constant, and that $A \cap B = \emptyset$. If one sets $u_0$ to be a linear combination of $\chi_A$ and $\chi_B$ with a coefficient $c \in \mathbf{R}$, it is easy to see that $\bar{u}_0$ is a locally finitely many sign-changing function.
More generally, a function of the form

$$f = \sum_{k} c_k \chi_{A_k} \tag{2.3}$$

with $c_k \in \mathbf{R}$ is a locally finitely many sign-changing function. If one considers $\bar{f}$, this is again of the form (2.3) with a possible modification at locally finitely many points, which has no effect on the definition of $Z[\bar{f}]$. Thus one is able to conclude a general statement for piecewise constant functions.
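Theorem 2.5. Assume that $u_0$ is of the form (2.3), i.e., $u_0 = \sum_k c_k \chi_{A_k}$ with $c_k \in \mathbf{R}$. Then $\bar{u}_0$ defined by (1.5) is a locally finitely many sign-changing function for any $\hat{x} \in \mathbf{R}$.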
In a higher-dimensional setting the situation is more involved. It is expected that if $A_k$ has a piecewise real analytic boundary, then the radial average $\bar{u}_0$ of $u_0$ is a locally finitely many sign-changing function. We shall give a proof when $A_k$ is a square in the plane $\mathbf{R}^d$ with $d = 2$.
Moreover, the function (on the right-hand side of the formula for $v_A$) on $[r_2, r_3)$ can be extended analytically to some neighborhood of $[r_2, r_3]$, while the squares of the functions on $[r_1, r_2)$ and $[r_3, r_4]$ can be extended analytically to some neighborhoods of $[r_1, r_2]$ and $[r_3, r_4]$, respectively.
Proof. The formula for $v_A$ follows from a direct calculation. This is a piecewise analytic function. Since there exists a real analytic function $b_1(x)$ near $1$ such that with any $\delta \in (0,1)$. By a direct manipulation we observe that with $b_i$ which is real analytic in a neighborhood of $[r_i, r_{i+1}]$ ($i = 1, 2, 3$).

Theorem 2.7. Let A k be a square of the form
Then the radial average $v = \bar{u}_0(r)$ defined by (1.6) is a finitely many sign-changing function for any $\hat{x}$. So we may assume that $a_k > b_k > 0$ to calculate $v$ for $u_0 = \sum_{k=1}^m c_k \chi_{A_k}$. By the explicit form in Lemma 2.6, $v$ is at least continuous.
Let $r_i$ ($i = 1, 2, 3, 4$) be the singularities of $\bar{\chi}_A$ with $A = A_k$ defined in Lemma 2.6. We denote them by $r_{k,i}$ to clarify the dependence on $k$. By Lemma 2.6 our $v$ is piecewise real analytic in $\mathbf{R}$ except on the singular set

$$S = \{ r_{k,i} : 1 \le k \le m,\ 1 \le i \le 4 \}.$$

Let $(p, q)$ be a maximal interval on which $v$ is real analytic. By definition $p$ and $q$ are elements of $S$. By the identity theorem the number of zeros of $v$ is finite in any compact subset of $(p, q)$. It remains to exclude the possibility that zeros accumulate at $p$ or $q$. By Lemma 2.6, near $p$, $v$ is of the form

$$v(r) = b(r)\sqrt{r - p} + c(r),$$

where $b$ and $c$ are real analytic functions near $r = p$.
If there is a sequence $\rho_j \downarrow p$ such that $v(\rho_j) = 0$, then by the identity theorem $c(r)^2 = b(r)^2 (r - p)$ near $r = p$, since both sides are real analytic in a neighborhood of $r = p$. However, this is impossible since $c(r)$ is analytic near $r = p$. We thus observe that zeros of $v$ do not accumulate at $r = p$. Similarly, zeros of $v$ do not accumulate at $r = q$. We have thus proved that $v$ has finitely many zeros and is continuous, so it is a finitely many sign-changing function. (Note that we need not assume that the $A_k$ are mutually disjoint.) By Theorem 2.4 we now conclude that $S_D[u_0](x)$ is well-defined for such $u_0$ at all $x \in \mathbf{R}^d$.
Remark 3. In Theorem 2.7 we may replace the finite sum of the $\chi_{A_k}$ by an infinite sum $\sum_{k=1}^\infty c_k \chi_{A_k}$, provided that the $A_k$ are mutually disjoint and $\inf_k a_k,\ \inf_k b_k > 0$. The conclusion should of course be modified by replacing "finitely many" by "locally finitely many". This kind of remark is important if one considers a periodic setting.

Variational approach
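We shall study the asymptotic sign related to the strictly convex variational problem (1.7), i.e.,

$$v \mapsto \frac{1}{2} \int_{\mathbf{R}^d} |\nabla v|^2\,dx + \frac{\lambda}{2} \int_{\mathbf{R}^d} |v - u_0|^2\,dx \tag{3.1}$$

with $\lambda > 0$. For a given $u_0 \in L^2(\mathbf{R}^d)$ there is a unique $H^1$-minimizer $v_\lambda$, which satisfies the elliptic equation (1.8), i.e.,

$$-\Delta v_\lambda + \lambda v_\lambda = \lambda u_0. \tag{3.2}$$

Theorem 3.1. If $S_D[u_0](x)$ is well-defined at $x \in \mathbf{R}^d$, then $S_a[u_0](x)$ is also well-defined and $S_a[u_0](x) = S_D[u_0](x)$.

Proof. The solution of (3.2) is written by use of the heat semigroup $e^{t\Delta} = G_t *$ in the form

$$v_\lambda = \lambda \int_0^\infty e^{(\Delta - \lambda)t} u_0\,dt. \tag{3.3}$$

Here $e^{(\Delta - \lambda)t} = e^{\Delta t} e^{-\lambda t}$ and $e^{\Delta t} f = G_t * f$.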
Assume that $S_D[u_0](x) = 1$, so that there is $\hat{t} > 0$ such that $(e^{t\Delta} u_0)(x) > 0$ for $t \in (0, \hat{t})$. Since $e^{t\Delta} u_0$ is bounded for all $t > 0$ (it actually converges to zero uniformly in $x$ as $t \to \infty$ [GGS]), we now apply the next lemma to conclude that $\operatorname{sgn} v_\lambda(x) = 1$ for sufficiently large $\lambda$. Thus $S_a[u_0](x) = 1 = S_D[u_0](x)$. The case $S_D[u_0](x) = -1$ can be treated in the same way. It remains to prove the case $S_D[u_0](x) = 0$. In this case $(e^{t\Delta} u_0)(x)$ is zero at least for small $t$. However, since $u$ is also analytic in time, this implies that $(e^{t\Delta} u_0)(x) = 0$ for all $t > 0$. By (3.3) we obtain $v_\lambda(x) = 0$ for all $\lambda > 0$, so $S_a[u_0](x) = 0 = S_D[u_0](x)$.
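Lemma 3.2. Let $f$ be a bounded continuous function on $[0, \infty)$ such that $f(t) > 0$ for $t \in (0, \hat{t})$ with some $\hat{t} > 0$. (Note that $f(0)$ may be zero.) Then

$$a(\lambda) = \int_0^\infty e^{-\lambda t} f(t)\,dt$$

is positive for sufficiently large $\lambda$.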
Proof. We divide the integral into two parts (0,t ) and (t, ∞). We estimate The other part is estimated from below as ∫t If λ is taken so that then a(λ) > 0.

Remark 4. The equation (3.2) has a unique bounded solution if
The unique solution is given by (3.3).
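The link between (3.2), (3.3) and the implicit time discretization can be checked in a few lines. The following is a formal sketch under the assumption that (3.2) reads $-\Delta v + \lambda v = \lambda u_0$ and that (3.3) is the Laplace-transform formula in the first line; the interchange of $\Delta$ and the integral is not justified here.

```latex
\begin{align*}
v_\lambda &= \lambda \int_0^\infty e^{-\lambda t}\, e^{t\Delta} u_0 \, dt, \\
(\lambda - \Delta)\, v_\lambda
  &= \lambda \int_0^\infty (\lambda - \Delta)\, e^{-\lambda t} e^{t\Delta} u_0 \, dt
   = -\lambda \int_0^\infty \frac{d}{dt} \Bigl( e^{-\lambda t} e^{t\Delta} u_0 \Bigr)\, dt \\
  &= -\lambda \Bigl[ e^{-\lambda t} e^{t\Delta} u_0 \Bigr]_{t=0}^{t=\infty}
   = \lambda u_0 .
\end{align*}
```

Dividing by $\lambda$ and writing $\tau = 1/\lambda$ gives $(v_\lambda - u_0)/\tau = \Delta v_\lambda$, which is one fully implicit Euler step of size $\tau$ for the heat equation starting from $u_0$. This is the formal reason why large fidelity corresponds to small time.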
We now study the energy (1.10) involving total variation. This does not diffuse the sign so S t [u 0 ] and S a [u 0 ] are quite different.
We consider the one-dimensional problem for (1.10) with $u_0 \in L^2(\mathbf{R}) \cap BV(\mathbf{R})$. Then the unique minimizer $v_\lambda$ of (1.10) fulfills the Euler-Lagrange equation where $\Sigma_\lambda$ is the set on which $\eta$ is not differentiable. See [Ch] and [ACM]. This problem is studied in detail in [BFI].

Proof. As we know, $v_\lambda$ is also Lipschitz continuous since $u_0$ is Lipschitz continuous [Ch]. If $w \in L^2(\mathbf{R})$ and $w$ is Lipschitz on $\mathbf{R}$, we have By the fundamental theorem of calculus we have Sending $j \to \infty$ yields the desired estimate. Since $v_\lambda$ is in $L^2$, by (3.4) we conclude Thus the energy of $w_\lambda$ is smaller than or equal to that of $v_\lambda$. Since the minimizer is unique, this is a contradiction, so we conclude that $v_\lambda = 0$ for $x > M$. A symmetric argument yields that $v_\lambda = 0$ for $x < -M$.

This is quite different from the case of (1.7), where the total variation energy is replaced by the Dirichlet energy. Because of the diffusion effect of (1.7), $v_\lambda$ cannot be zero on a large set for that problem, while the diffusion effect of the total variation is limited since it is not strictly parabolic.

Example. If $u_0(x) = (1 - |x|)_+$, then $v_\lambda = 0$ for $|x| \ge 1$, so $S_t[u_0](x) = 0$ for $|x| \ge 1$ whatever $\lambda$ is. This is strikingly different from the asymptotic sign, since $S_a[u_0](x) = 1$ for every $x$. (Note that $u_0$ is nonnegative.) If we calculate the Euler-Lagrange equation, one can prove that along the line of [BFI]. Note that from the Euler-Lagrange equation one observes that if $v_\lambda$ is continuous near $x_0$ and $v_\lambda(x_0) \neq u_0(x_0)$, then $v_\lambda$ is constant near $x_0$ [BFI, Proposition 2.2]. This observation is the key to obtaining the above solution.

Application to separation of data
We are interested in characterizing the diffusive sign when $u_0$ has a special structure like $\chi_A - \chi_B$, where $A$ and $B$ are two disjoint measurable sets in $\mathbf{R}^d$. Roughly speaking, the interior of $S_+$ (resp. $S_-$) is the set where the distance from $A$ is shorter (resp. longer) than that from $B$. Since our $A$ and $B$ are just measurable, we have to use the essential distance instead of the usual distance. We define the essential distance $d_e(x, A)$ by

$$d_e(x, A) = \inf \{ r > 0 : |B_r(x) \cap A| > 0 \},$$

where $B_r(x)$ denotes the closed ball of radius $r$ centered at $x$ and $|\cdot|$ denotes the Lebesgue measure. By definition it is clear that $d_e(x, A)$ is Lipschitz continuous on $\mathbf{R}^d$. We consider more general data $u_0$ than $\chi_A - \chi_B$.
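Theorem 4.1. Assume that $u_0 \in L^\infty(\mathbf{R}^d)$ satisfies $\operatorname{ess\,inf}_A u_0 > 0$ and $\operatorname{ess\,sup}_B u_0 < 0$ and $u_0 = 0$ outside $A \cup B$, where $A$ and $B$ are two disjoint measurable sets or, more weakly, $|A \cap B| = 0$. Then $S_D[u_0](x) = 1$ if $d_e(x, A) < d_e(x, B)$ and $S_D[u_0](x) = -1$ if $d_e(x, A) > d_e(x, B)$.

Proof. The argument is symmetric, so we just give a proof for the case $d_e(x, A) < d_e(x, B)$. By the assumption there exists $\rho > 0$ such that $d_e(x, A) < \rho < d_e(x, B)$. Since $0 \le \exp\left( (\rho^2 - r^2)/4t \right) r^{d-1} \le \exp(\rho^2 - r^2)\, r^{d-1}$ for $\rho \le r$ and $t \le 1/4$, and the right-hand side is integrable in $(\rho, \infty)$, Lebesgue's dominated convergence theorem implies that

$$\int_\rho^\infty \exp\left( \frac{\rho^2 - r^2}{4t} \right) r^{d-1}\,dr \to 0$$

as $t \downarrow 0$. We thus conclude that $u(x,t) > 0$ for sufficiently small $t$.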
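The classification rule suggested by Theorem 4.1 (label a point by comparing its distance to $A$ with its distance to $B$) can be sketched for finite point clouds. This is an illustration only: finite point sets have Lebesgue measure zero, so the sketch replaces the essential distance by the ordinary Euclidean distance to the nearest data point, and all names are hypothetical.

```python
import math

def distance_to_set(x, points):
    """Euclidean distance from x to the nearest point of a finite set (stand-in for d_e)."""
    return min(math.dist(x, p) for p in points)

def equidistance_label(x, A, B):
    """Return 1 if x is strictly closer to A, -1 if strictly closer to B, 0 on the equi-distance set."""
    d_a = distance_to_set(x, A)
    d_b = distance_to_set(x, B)
    if d_a < d_b:
        return 1
    if d_a > d_b:
        return -1
    return 0

# Hypothetical data: one point of A above the x1-axis, one point of B below it.
A = [(0.0, 0.5)]
B = [(0.0, -0.5)]
for x in [(0.2, 0.1), (0.3, 0.0), (0.0, -1.0)]:
    print(x, equidistance_label(x, A, B))
```

In this toy configuration the equi-distance set is the $x_1$-axis, which is the curve the sign of the heat solution is expected to approach for small $t$.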
It is easy to see that the set where $d_e(x, A) = d_e(x, B)$ has no interior, so the last assertion follows. Note that $c$ does not play any role in determining $\operatorname{int} S_+$ and $\operatorname{int} S_-$.

We consider a data separation problem. Suppose that each point of $\mathbf{R}^d$ fulfills either property $P$ or property $Q$ (with $P \cap Q = \emptyset$), except for a very thin set. We have to classify each point of $\mathbf{R}^d$ according to whether it fulfills $P$ or $Q$. We know that property $P$ is fulfilled in $A \subset \mathbf{R}^d$ while property $Q$ is fulfilled in $B \subset \mathbf{R}^d$. Our $S_+$ and $S_-$ in Corollary 1 give a way to classify $\mathbf{R}^d$ by the properties $P$ and $Q$. Corollary 1 characterizes $\operatorname{int} S_+$ and $\operatorname{int} S_-$ in a completely geometric way. The set

$$C = \{ x \in \mathbf{R}^d : d_e(x, A) = d_e(x, B) \}$$

is a separation curve or hypersurface consisting of points having the same distance from $A$ and $B$. We call this set an equi-distance hypersurface. Note that in general $C$ may not be of finite perimeter.

From Theorem 4.1 and Corollary 1 we have an algorithm for data separation. Set $u_0 = \chi_A - \chi_B$ for given data $A$ and $B$, and obtain the solution $u(x,t)$ of (1.1) with this $u_0$. For a reasonable data separation it is better to choose $t$ very small. We observe that our method provides a hypersurface version of a maximal margin classifier [CST] without any technique of transferring data to a higher-dimensional space.

We give here a few examples of data separation. In the following numerical experiments we consider the heat equation in the square $\Omega = [-1, 1] \times [-1, 1]$ with the periodic boundary condition in $x_1$ and the homogeneous Neumann condition at $x_2 = \pm 1$ for $x = (x_1, x_2)$. Note that for the heat equation the homogeneous Neumann problem in one direction (the $x_2$-direction) is reduced to a periodic boundary value problem by extending the function in $x_2$ evenly (symmetrically) with respect to $x_2 = \pm 1$. Thus, under this interpretation, our solution with the Neumann condition in $x_2$ and the periodic condition in $x_1$ is regarded as the solution of the Cauchy problem (1.1) with initial data whose restriction to $\Omega$ equals our $u_0$.
We calculate the solution by an explicit difference method with space grid size $h = 0.01$ and time grid size $\tau = 0.1 \times h^2$. The difference equation is

$$\frac{u^{k+1}_{i,j} - u^k_{i,j}}{\tau} = \frac{u^k_{i+1,j} + u^k_{i-1,j} + u^k_{i,j+1} + u^k_{i,j-1} - 4 u^k_{i,j}}{h^2},$$

where $u^k_{i,j} = u(hi, hj, \tau k)$ for $-100 \le i, j \le 100$ and $k \ge 0$.

The first example clarifies the difference between our method and a support vector machine. Let the data sets $A$ and $B$ be as in (4.2), where $b > a > 0$. It is easy to find that the separation line given by the maximal margin classifier (a separation line by a support vector machine) is $\mathbf{R} \times \{0\}$. However, our method provides the equi-distance curve as the curve of data separation. If $(\pm a, c)$, $(\pm b, c)$ are on lattice points for the numerics and $h$ is smaller than the span of the lattice points, then the initial data for the numerics should be given as $u_0 = 1$ at the lattice points of $A$, $u_0 = -1$ at those of $B$, and $u_0 = 0$ elsewhere. A calculation over a very short time, as in the center panel of Figure 1, provides a separation curve which is very close to the equi-distance curve. If one calculates longer, the curve is smoothed, as in the right panel of Figure 1. If one calculates for a long time, i.e. for large $t$, the curve may fail to classify the data, as shown in the next example.

The second example is two-moon type data, for which a simple maximal margin classifier cannot draw a separation curve. We give 100 random data points each for $A$ and $B$ around $\{0.5(\cos\theta, \sin\theta) - (0.25, 0.15) \mid \theta \in [0, \pi]\}$ and $\{0.5(\cos\theta, \sin\theta) + (0.25, 0.15) \mid \theta \in [-\pi, 0]\}$, and set the initial data as above. In the right panel of Figure 3 some points fail to be separated by our method. We thus observe that it is necessary to take $t$ sufficiently small for exact data separation.
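The explicit scheme described above can be sketched in a few lines. The following is a minimal pure-Python illustration, not the authors' code: it assumes the standard five-point update with ratio $\tau/h^2 = 0.1$, wrap-around indexing for the periodic $x_1$-direction, and even reflection at the first and last rows for the Neumann condition in $x_2$. The coarse grid and the two data points are hypothetical.

```python
def heat_step(u, ratio):
    """One explicit five-point step; u[j][i] with i periodic (x1) and j reflected (x2, Neumann)."""
    ny, nx = len(u), len(u[0])
    new = [[0.0] * nx for _ in range(ny)]
    for j in range(ny):
        jm = j - 1 if j > 0 else 1                # even reflection at the bottom row
        jp = j + 1 if j < ny - 1 else ny - 2      # even reflection at the top row
        for i in range(nx):
            im = (i - 1) % nx                     # periodic wrap-around in x1
            ip = (i + 1) % nx
            lap = u[j][ip] + u[j][im] + u[jp][i] + u[jm][i] - 4.0 * u[j][i]
            new[j][i] = u[j][i] + ratio * lap     # ratio = tau / h^2
    return new

def run(u, ratio, steps):
    """Apply heat_step repeatedly and return the final grid."""
    for _ in range(steps):
        u = heat_step(u, ratio)
    return u

# Hypothetical coarse example: one data point of A (+1) and one of B (-1).
nx, ny = 20, 21
u0 = [[0.0] * nx for _ in range(ny)]
u0[15][10] = 1.0
u0[5][10] = -1.0
u = run(u0, 0.1, 50)
print(u[13][10] > 0, u[7][10] < 0)
```

The ratio $0.1$ satisfies the stability condition $\tau/h^2 \le 1/4$ of the explicit scheme in two dimensions, so each step is a convex averaging of neighboring values.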
If we calculate by an implicit scheme, the results of separation curves are almost the same; however, evidently it takes more time to calculate.
In [CLLMNWZ] a geometric diffusion approach is given for a data separation procedure. Their approach is very general and includes the data separation by the heat equation given here. Although they discussed the problem on a graph or a manifold, we just explain the idea of their method when $A$ and $B$ are disjoint subsets of $\mathbf{R}^d$. Let $S$ be a compact self-adjoint operator in the periodic $L^2$ space in $\mathbf{R}^d$, i.e. $L^2(\mathbf{T}^d)$. Its limit as $m \to \infty$ is of course $S_\infty = e^{\Delta}$, which is also a typical example. (The kernel of $S$ is regarded as a similarity function which is constructed by using feature vectors derived from a neighborhood of each pixel for practical purposes. In other words, $S$ is chosen depending upon the features of the data sets, as explained in [BF].) Let $A$ be a subset where property $P$ is fulfilled and $B$ a subset where property $Q$ is fulfilled. We set $u_0 = \chi_A - \chi_B$ and introduce a parameter $t > 0$. Then we give a separation

$$\mathcal{A}_t = \{ x : (S^t u_0)(x) > 0 \}, \qquad \mathcal{B}_t = \{ x : (S^t u_0)(x) < 0 \}.$$

(In practice, $t$ is chosen so that $\mathcal{A}_t \supset A$, $\mathcal{B}_t \supset B$.) If $S = S_\infty$, the limit as $t \downarrow 0$ yields our separation.

Figure 1: Example of data separation for (4.2). The left panel shows the profile of the initial data, with $u_0 = 1$ at the black dots, $u_0 = -1$ at the crosses, and $u_0 = 0$ elsewhere. The center and right panels show the profiles of $\{x \mid u(x,t) = 0\}$ at $t = 0.005$ and $t = 0.2$ (500 and 20000 steps, respectively, of the explicit difference scheme). The hatched area is where $u(\cdot, t) > 0$, and the other area is where $u(\cdot, t) < 0$.

Remark on the method based on the Ginzburg-Landau energy
We now compare minimizers of the Ginzburg-Landau energy (1.12) with the minimizer of (1.7). There exists at least one $H^1$ minimizer $v_{\varepsilon,\lambda}$ satisfying the Euler-Lagrange equation

$$-\varepsilon \Delta v - 2(1 - v^2)v/\varepsilon + \lambda v = \lambda u_0. \tag{5.1}$$

For fixed $\varepsilon > 0$, if $1/\lambda$ is regarded as the time grid size $\tau$, then (5.1) gives an implicit time discretization of the evolution equation (called the Allen-Cahn equation)

$$\partial u/\partial t = \varepsilon \Delta u + 2(1 - u^2)u/\varepsilon. \tag{5.2}$$

We often rescale $t$ by the microscopic time $t'$, i.e. $t = t'/\varepsilon$, when we discuss phase separation. The equation (5.2) becomes

$$\partial u/\partial t' = \Delta u + 2(1 - u^2)u/\varepsilon^2. \tag{5.3}$$

Starting from initial data $u_0$, we know that $u$ quickly tends to either $1$ or $-1$ as time develops [XChen] for small $\varepsilon$. As for the heat equation, it is natural to define the diffusive sign by the Allen-Cahn equation by when $u^\varepsilon$ is the solution of (5.3) with initial data $u_0$. We also define the asymptotic sign by the Allen-Cahn equation by

(iii) Is the convergence (5.4) uniform with respect to $\varepsilon$? In other words, does there exist $\hat{\lambda} = \hat{\lambda}(x)$ independent of $\varepsilon \in (0,1)$ so that $\operatorname{sgn} v_{\varepsilon,\lambda\varepsilon}(x)$ is constant for $\lambda > \hat{\lambda}(x)$? Or, more weakly, is $\operatorname{sgn} v_{\varepsilon,\lambda}(x)$ constant for $\lambda > \hat{\lambda}(x)$?
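The fast relaxation toward $\pm 1$ mentioned above comes from the reaction term of (5.3). The sketch below drops the diffusion term altogether (an assumption made only for illustration) and integrates the ordinary differential equation $du/dt' = 2(1 - u^2)u/\varepsilon^2$ by forward Euler; the parameter values are hypothetical.

```python
def relax(u, eps, dt, steps):
    """Forward-Euler integration of du/dt = 2 (1 - u^2) u / eps^2 (reaction part of (5.3) only)."""
    for _ in range(steps):
        u += dt * 2.0 * (1.0 - u * u) * u / (eps * eps)
    return u

eps, dt, steps = 0.1, 1e-4, 2000   # final time 0.2 in the rescaled variable
for start in (0.1, -0.1, 0.0):
    print(start, relax(start, eps, dt, steps))
```

A small positive value is driven to $1$, a small negative value to $-1$, and $0$ stays at the unstable equilibrium, which is why the sign of the data is emphasized so strongly for small $\varepsilon$.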
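We know that the energy $E^\varepsilon_\lambda$ of (1.12) converges to

$$E_\lambda[v] = \begin{cases} \displaystyle\int |\nabla W(v)| + \frac{\lambda}{2} \int |v - u_0|^2\,dx, & v(x) \in \{-1, 1\} \text{ for a.e. } x, \\ \infty, & \text{otherwise,} \end{cases}$$

with $W(u) = \int_{-1}^u (1 - r^2)\,dr$, in the sense of $L^1_{\mathrm{loc}}$ Gamma convergence as $\varepsilon \to 0$; see e.g. [MM], [S] for a given $u_0 \in L^2(\mathbf{R}^d)$. The first term $\int |\nabla W(v)|$ is regarded as $\frac{4}{3} \int |\nabla v|$ for a $\{-1, 1\}$-valued function $v$, since $W(1) = 4/3$. Thus $E_\lambda$ is essentially the same as (1.10) (with the values of $v$ restricted to $\{-1, 1\}$). For a function $v$ whose values are in $[-1, 1]$, if one knows a bound for $\min E^\varepsilon_\lambda[v]$ for a fixed $\lambda$, then it gives a bound for $\int |\nabla W(v)|$ independent of $\varepsilon \in (0,1)$. If $\lambda$ is fixed sufficiently large and $\varepsilon \to 0$, then the limit is a two-valued function whose total variation is finite. This separation seems to give a regularized way of equi-distance separation.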