A QUANTITATIVE METHOD FOR CHOOSING OPTIMAL DAUBECHIES WAVELETS

We develop a new method for calculating specific values of Daubechies wavelets in one dimension. The novelty of this approach is its ability to calculate exact values of the Daubechies scaling functions and, by extension, wavelets, without calculating values of the scaling function at other unnecessary dyadic rationals. We then provide a new quantitative method for choosing the best wavelet for applications by giving a rigorous error analysis of the wavelet transform for functions of various smoothness.


Introduction
Since the introduction of Daubechies wavelets in [4], [5], mathematicians, physicists, and engineers alike have employed them to study various problems in their respective fields.Some applications include numerically solving partial differential equations [3], [17], recognition and visualization of various signals [10], [18], and regularization of inverse problems [6].These are just a handful of results in the vast wavelet literature.
Frequently in these applications one needs to know the exact value of the wavelet at certain points.This tends to be difficult though since wavelets generally don't have a closed form, but rather satisfy some recursion relation.In the first part of this note we provide a new way to calculate the exact value of Daubechies wavelets that does not require calculating values of the scaling function at unnecessary dyadic rationals.The second part of this note is devoted to analyzing the convergence of the wavelet transform for functions of various smoothness.This analysis gives a quantitative method for choosing the best suited wavelet for the application at hand, see Theorem 5.3 which gives error estimates (10) and (12).See also the estimate (14) for C m smooth functions.
The layout of this article is as follows.In Section 2 we provide the necessary background on Daubechies wavelets.Then in Section 3 we review current methods on finding numerical values for Daubechies wavelets, including the Cascade Algorithm.In Section 4 we discuss the new method and provide specific examples.Then in Section 5 we discuss the discrete wavelet transform and compute the error for Lipschitz and Hölder functions.Finally we discuss the convergence of the wavelet transform for smoother functions, and obtain the error bound (14).

Daubechies Wavelets
Wavelets are essentially functions that can generate an orthonormal basis for L 2 (R) by dilating and translating the wavelet function.Thus, wavelets are useful for approximating functions in L 2 (R).In order to dilate and translate a function, we define the subscript notation f k, j (x) := 2 k/2 f (2 k x − j) for an arbitrary f ∈ L 2 (R).As we will see in the forthcoming sections, a firm understanding of Multiresolution Analysis (MRA) is needed for the derivation of wavelets.

Definition 2.1 ([5]
).A multiresolution of L 2 (R) with scaling function ϕ ∈ L 2 (R) is a sequence of subspaces {V k } k∈Z that satisfy the following properties: (1) orthogonality: {ϕ(x − j) : j ∈ Z} is an orthonormal basis for V 0 (2) nesting: A simple example of a scaling function that generates a multiresolution of L 2 (R) is the indicator function of the interval [0, 1): ϕ(x) = χ [0,1) (x).As we will see later, the scaling function ϕ plays a crucial rule in constructing a wavelet.Hence, ϕ is sometimes referred to as a father wavelet.
Ingrid Daubechies invented an entire family of wavelets [5] which are referred to as Daubechies wavelets.Her discovery was a major breakthrough in wavelet analysis because Daubechies wavelets have compact support and some sort of smoothness.This compromise makes them ideal for applications, which usually exploit their ability to efficiently approximate functions.
The importance of certain properties depend on the nature of the application and some of these properties cannot coexist.
The naming scheme we use to describe a specific wavelet in the Daubechies wavelet family is Daub J, where J is the number of constant filter coefficients, denoted as a j , in the wavelet expansion.
where a j = ϕ, ϕ 1, j and its associated scaling function ϕ ∈ L 2 (R) of the form that generates a multiresolution of L 2 (R) which satisfies the following four properties: (1) ψ and ϕ are compactly supported on the interval [0, 2N − 1], (2) ψ has N vanishing moments, (3) there are only 2N non-zero constant filter coefficients, i.e., a j = 0 for j ∈ [0, 2N) and a j = 0 for all other j, and Another useful property of the Daubechies Scaling function is known as a partition of unity.
Proof.We will prove by induction.Using ϕ's scaling relation when k = 1, the chain of equalities hold by (1) below.Now we will prove the equality for k th case assuming that it holds for (k −1) st case: as desired.
We also need the following: Proposition 2.4.If ϕ is a scaling function of a Daub 2N wavelet, then the following equality holds.
Proof.We will show that ϕ(x)dx = ∑ n ϕ(n); the desired equality follows then as a direct result of Property 4 from Definition 2.2.We can expand the integral of ϕ as a Riemann sum.We will partition the support of ϕ with the intervals [2 −k ( j − 1), 2 −k j] for j = 1, 2 . . ., 2 k (2N − 1).We can then express the integral as limit of Riemann sums with increasing partitions: If we choose x k, j = 2 −k ( j − 1) and utilize Lemma 2.3, we can see that because 2 k (2N − 1) tends to infinity as k increases and ϕ(x) = 0 for all x < 0. Therefore the holds, as desired.
By combining various properties of Daubechies wavelets, we find a system of equations that the filter coefficients must satisfy: This system of equations does not necessarily give a unique solution, however, the number solutions is finite.One of the solutions does indeed produce a Daubechies wavelet.There is a method for finding the exact values of the constant filter coefficients that utilizes the Fourier transform, see [5] and [13].The constant filter coefficients of a Daub 2N wavelet for N = 1, 2, 3, 4 are listed in Table 1.
We only get a unique solution when N = 1, forming the Daub 2 (Haar) wavelet.

Numerical Values of Wavelets
Wavelet collocation methods [3] are frequently used to solve various integral and differential equations, but these methods often require knowledge of the values of the wavelet basis at specific points.
In order to find numerical values of Daubechies wavelets, we first need to find numerical values of their scaling functions.This task is not trivial due to the recursive definition of Daubechies scaling functions.This section explores methods that give good approximations to scaling function values and, by extension, their respective wavelets.

Cascade Algorithm
The cascade algorithm (developed by Daubechies [5]) is a fixed point method that generates a sequence of functions that converge to the Daub 2N wavelet scaling function, Φ.
We begin by discussing the scaling relation Thus, ϕ has the scaling relation where a j := ϕ, ϕ 1, j .
Suppose now that F assigns the expression to the function Λ. Existence and uniqueness of integrable solutions to the general fixed point equation ( 6) was shown in [7].In particular, using (5), we can see that Φ is a fixed point of F.
One can use F to generate a sequence of functions {ν i } that converge to Φ.
To begin generating the sequence of functions, we need to choose a starting function ν 0 that is our "initial guess" for the fixed point of F. It is known that Φ is supported on [0, 2N − 1] and obtains a maximum on this interval.So a reasonable first guess is a function that is zero at all integers except ν 0 (0) = 1 (see [1]).If we connect the integer values of ν 0 with lines we get We can generate the first iteration of the cascade algorithm using ν 1 (x) = F(ν 0 )(x).Once we have ν 1 , we can generate the second iteration using ν 2 (x) = F(ν 1 )(x).This process continues until the desired preciseness of the approximation is obtained.Figure 1 demonstrates the fast convergence of ν i → 2 Φ as i → ∞.In particular, the following rate of convergence is obtained

Eigensystem Method
The eigensystem method is an alternative approach to finding numerical values of a Daub 2N scaling function for N > 1 and is inspired by [5] and [19].
To begin, we will find all integers values that fall within the support of the Daub 2N scaling function, i.e., ϕ(0), ϕ(1), . . ., ϕ(2N − 1).Recall that ϕ has the scaling relation in (5), so the integer values of ϕ must satisfy We can express this system of equations in matrix from as where A is 2N × 2N and the i, j th entry is A i, j = a 2i− j for i, j = 0, 1, . . ., 2N − 1 and 0 else, and Once v is found, it must be scaled so that the sum of its entries satisfies ∑ i ϕ(i) = 1, agreeing with Proposition 2.4.
Solving for these eigenvectors can be done by hand or using an algorithm (e.g. the QR algorithm [20]; note also that the matrix A is sparse).However, it is usually more efficient for the integer values to be calculated beforehand and stored.Solving for eigenvectors is very computationally expensive when compared to the recursive style branching and should be avoided when possible.Now that we have the values of ϕ(i) for i = 0, 1, . . ., 2N − 1, we can use them to find the values of ϕ i 2 for i = 0, 1, . . ., 4N − 2 with because (i − j) is clearly an integer.We can extend this to solve for ϕ i 2 k for any i, k ∈ Z using given that the value of Remark 3.1.This method does not work for the Daub 2 (Haar) wavelet because matrix √ 2A becomes the identity matrix so that we cannot solve for the eigenvector v in (7).However, it is not hard to see that ϕ(x) = 1 Φ(x) = χ [0,1) (x) satisfies the scaling relation ϕ(x) = ϕ(2x) + ϕ(2x−1) and that the wavelet formed, , satisfies all the properties of a Daub 2 wavelet.
We end this section by discussing a recent algorithm [12] which has been developed for calculating exact values of Daubechies wavelets, which can be viewed as an improvement of the Cascade algorithm discussed in the previous section.This algorithm is based on calculating exact integer values of the scaling function, as in the previous section.Once these values are found then Ma uses a convolution operation to compute the value of the scaling function at odd-dyadics.The convolution operation is equivalent to (8) and yields the vector when give the value of ϕ i 2 k−1 at i = 0, 1, . . ., 2 k−1 (2N − 1).In the next section we introduce the new method for calculating values of the scaling function ϕ.

The Branch Method
Sometimes the value of the Daub 2N scaling function ϕ is needed only at a single point.Both the cascade algorithm and the eigensystem method require values of ϕ at many points to find the value of ϕ at a specific point.The branch method finds ϕ i 2 k for any i, k ∈ Z using recursion and can be thought of as the eigensystem method in reverse.All integer values of ϕ must be known to use the branch method.The values of ϕ(n) for n = 0, 1, . . ., 2N − 1 can be calculated using the method discussed in Section 3.2.
Recall the modified scaling relation in (8).In order to find the value of ϕ i 2 k , we must find the values of ϕ i− j2 k−1 2 k−1 for j = 0, 1, . . ., 2N − 1.If we apply the scaling relation to , then we would need to find the values of ϕ n u .In these cases, ϕ(x) = 0 due to the compact support of ϕ and that branch need not be considered.

Visualization of the branch method
The modified scaling relation in ( 8) can be applied k times so that all the lowest branches are of the form ϕ(n) for some n ∈ Z. Then we only need to add together the many integer values of ϕ with their respective filter coefficients to find ϕ i 2 k ; see Algorithm 1 below.
The plot in Figure 3   If it is desired to find the value of ϕ(x) for some x = i 2 k for all i, k ∈ Z, one can approximate x with x ≈ i 2 k for some i , k ∈ Z (dyadic rationals are dense in R).Since ϕ is continuous, ϕ(x) ≈ ϕ i 2 k .Note that the Daub 2 scaling function is not continuous but it has a closed form expression. Thus, this method is not necessary to find numerical values of the Daub 2 scaling function.
Example 1.We will find the value of the Daub 4 scaling function ϕ 13   8   using the branch method.Using the filter coefficients from Table 1, if we solve for the eigenvector in and scale the eigenvector such that ∑ n ϕ(n) = 1, we obtain: The modified scaling relation in ( 8) is applied three times to obtain integer arguments, as shown in Figure 4. We get an expression for ϕ 13 8 from the tree: ≈ 0.0167468.
Remark 4.2.When implementing the Branch Method, all calculated values of the scaling function, i.e., ϕ j 2 k , at each branch node should be stored.The stored values can be used when calculating future branches.This will dramatically increase efficiency as scaling fuction values at each dyadic rational will only be calculated once.
We end this section by discussing the computational issues arising in the Branch method.
The most computationally expensive part of the process is the eigenvector calculation for the 2N × 2N matrix A. Typically in direct dense methods [9], O((2N) 3 ) time is required and O((2N) 2 ) space is required to compute the eigenvectors of A. However, in our case we have three advantages: namely, the matrix is sparse, the associated eigenvalue is known, and we do not need to calculate all eigenvectors of A. Thus a more efficient calculation (via e.g.inverse iteration) can be done.

Daubechies Wavelet Transform
Many of the practical applications of wavelets revolve around the wavelet transform.The transform is a method for approximating functions in L 2 (R) with a finite number of wavelets in a smart and efficient manner.We start by recalling the definition of the wavelet transform: Definition 5.1.Suppose ψ = N Ψ is a Daub 2N wavelet and ϕ = N Φ is its associated scaling function.The Daubechies wavelet transform of a function f ∈ L 2 (R) is defined as In practice, the wavelet transform is used to approximate functions over a finite domain.The function to be approximated can be considered to equal zero outside of this finite domain so that only a finite number of the inner products f , ψ k, j , referred to as wavelet coefficients, need to be computed in D n f .We will find that these wavelet coefficients play a large role in the maximum error of D n f and the overall accuracy of the wavelet approximation.
Proof.The function ψ k, j is supported on the set Definition 2.2 that the zeroth moment of ψ vanishes meaning that the equality ψ k, j (x)dx holds.We then obtain: Applying Hölder's inequality yields because ψ k, j is part of an orthonormal basis, hence ψ k, j 2 = 1.This follows from the fact that one can decompose L 2 (R) as , see e.g.[5].
Next, define the coefficient We see that the inequality holds for all x ∈ I k, j .This inequality assures us that the integral in the previous calculations is less than or equal to the area of a rectangle with dimensions of ω 2 k, j and the width of the domain I k, j , so that Since f is globally Lipschitz continuous, we can see that the inequality ω k, j ≤ C(2N − 1)2 −k is satisfied.By using this upper bound of ω k, j , we are left with the inequality as desired.
Wavelets are well suited to approximate piecewise smooth signals [21].In these types of cases where the signal function f is globally Lipschitz continuous, we can establish uniform convergence of the Daubechies wavelet transform.Proof.Since the set {φ 0, j , ψ k, j : j, k ∈ Z, k ≥ 0} is an orthonormal basis of L 2 (R), we have that f may be expressed as and the difference between the original function and the transform is ω k, j has a similar role to the modulus of continuity of f .
To prove uniform convergence, it suffices to show that f − D n f ∞ → 0 as n → ∞.The following calculations use the results of Lemma 5.2 further simplify the difference: Due to the compact support of ψ there are at most 2N − 1 different integer values of j where |ψ(2 k x − j)| = 0 for any given x.Thus, the inequality holds.This inequality allows to continue our calculations in (9) yielding: The result of these calculations allow us to conclude that the inequality Remark 5.4.Uniform convergence of D n f to f is still obtained even if f is only uniformly continuous on R as well as convergence in L 2 for any f ∈ L 2 (R) [8].

Error Analysis
If E n is the error of D n f using wavelets of the class Daub 2N where D n f (x) = f (x) + E n (x), the result of the proof of Theorem 5.3 assure us that the error of the transform is at most We also mention that the L 2 error was first analyzed rigorously in [14], which resulted in a characterization of Sobolev spaces in terms of error decay rates of wavelet expansions.

Hölder Continuous Functions
In the case that f is Hölder continuous with the additional constant 0 < α ≤ 1, by performing the same steps as we did in the proof of Theorem 5.3, we find that Notice that this estimate agrees with [16,Equation 5.4], but we have explicitly written the constant depending on N. In particular, (11) is equivalent to f ∈ C 0,α (R).This stricter restriction on ω k, j leads us to the following estimate: The Daub 2 wavelet has the smallest error bound for all α, making it the ideal choice of wavelet to use to approximate a Hölder continuous function (assuming that the function is not twice differentiable).

Differentiable Functions
How will the transform error be affected if the function f is more smooth and has multiple derivatives?To answer this question we will need to find a more strict condition on f , ψ k, j for cases when f has m derivatives.
Proposition 5.5.If f is m times continuously differentiable where the i th derivative of f is bounded by B i over I k, j and ψ is a Daub 2N wavelet, then the inequality is satisfied where τ = min{m, N}.
Proof.We first expand the inner product f , ψ k, j .Starting with its integral form can be substituted for the function f since it is a C m function: where the error ε b (x) satisfies The following simplifies the integral form of f , ψ k, j by utilizing the Taylor polynomial: i is a polynomial of degree less than N. Since ψ has N vanishing moments, ψ is orthogonal to any polynomial of degree less than N. Therefore the equality holds for all i = 0, 1, . . ., N − 1.We are left with the following expansion: We then obtain Remark 5.6.In Proposition 5.5, we only need to consider the non-zero derivatives of f .This is because if f (i) = 0, then B i = 0 and the i th term of the sum does not affect the value of the upper Using this new inequality from Proposition 5.5 in the calculations in (9) from the proof of Theorem 5.3 yields an error that obeys the inequality where B = max{B i : i = τ, . . ., m}.Remark 5.7.There is also the Zygmund class [5] of "differentiable" functions where In this case a characterization of the space Λ s * for integer s (the analog of C s functions) was given in [15]:  2, it would appear that the Haar wavelet is the optimal choice for approximating a globally Lipschitz function (as well as for a Hölder function as discussed in Section 5.1.1).
Intuitively this makes sense since Haar wavelets are not differentiable.In particular, since the Haar wavelet has the smallest possible support and only one vanishing moment, it is not ideal for approximating smooth functions, as can be seen in our previous analysis.
The error bound calculations performed in this section bring forward an alternative metric for choosing the best suited Daub 2N wavelet.Notice that the maximum value of E n hinges on the values of the wavelet coefficients, f , ψ k, j .Thus, an alternative method for choosing the optimal wavelet is to choose a wavelet that produces the maximum number of wavelet coefficients f , ψ k, j that are close to zero [13].This not only minimizes the error of D n f but it can also drastically speed up the process of calculating the values of D n f when the wavelet coefficients can be approximated by zero, f , ψ k, j ≈ 0.

Conclusion
We have developed a new algorithm for finding exact values of Daubechies wavelets.The only computationally challenging aspect of this method is the calculation of an eigenvector, but as the matrix is sparse, this is not an issue.The new method does not require the calculation of values of the scaling function at dyadic rationals.We have also provided a new quantitative

Lemma 2 . 3 .
If ϕ is a scaling function of a Daub 2N wavelet, then the equality

2 k− 2
for some numeratorsn 0 , n 1 , . . ., n 2N−1 ∈ Z.Applying the scaling relation can be thought of creating at most 2N branches that need to be summed together in order to calculate the parent node as shown in Figure 2 where h j := √ 2a j .Note that exactly 2N branches are not always created because of the cases when the argument of ϕ(x) falls outside the interval [0, 2N − 1] shows calculations made by the Branch Method of exact values of the Daub 4 scaling function at dyadic rationals.The dyadic rationals i • 2 −k are categorized by their k value to make the dyadic rational 'levels' more visible.

Theorem 5 . 3 .
If f ∈ L 2 (R) is globally Lipschitz continuous with Lipschitz constant C, then D n f converges uniformly to f .
since the support of ψ is I 0,0 .The Taylor polynomial of f on the interval η b = I k, j centered at the midpoint b = 2N−1+2 j 2 k+1 for every x and the result of Taylor's Theorem.The index of the sum starts at τ instead of N because of the possibility that N > m.Next, apply Hölder's inequality to the integral.This leaves us with the desired inequality: bound of | f , ψ k, j |.If f has infinitely many non-zero derivatives, then the value m = N should be chosen to minimize the upper bound of | f , ψ k, j |.

Figure 5
displays a log plot of the right side of the above inequality for different values of N using m = 4 and B = 1.We can see that the error bound is smallest when N = m.A similar analysis can be done on the derivatives of f .Namely, if f ∈ C m (R) then we can estimate the difference between the p-th derivative of f , for 0 < p < m, and its approximationD n f in the L ∞ norm to get d p f dx p − d p (D n f ) dx p ∞ 2 −n(m−p)(15)where the implicit constant again depends only on B i , N and || N Ψ|| ∞ ; compare with[11, where a similar result is shown for Coiflets.

FIGURE 5 .
FIGURE 5. Log plot upper of E n ∞ using Daub 2N wavelets to approximate a C 4 function.

0, j ∈ Z 5 . 1 . 3 .
Minimizing the Error BoundThe results of sections 5.1.1 and 5.1.2show a correlation between the error bound of the transform D n f and the support of the wavelet N Ψ.Indeed, the error bound is smallest for larger N when N ≥ 2. The special case N = 1, i.e. the Haar wavelet, deserves special attention.Based on Table

TABLE 2 .
Ψ ∞ 1 1.73205 1.70112 1.35907 1.19308 1.12634 1.12161 1.10919 1.06982 1.02596 Approximate values of N Ψ ∞ for different values of N if f is globally Lipschitz continuous with Lipschitz constant C. N