Krawtchouk Transforms and Convolutions

We put together the ingredients for an efficient operator calculus based on Krawtchouk polynomials, including Krawtchouk transforms and a corresponding convolution structure, which provide an inherently discrete alternative to Fourier analysis. In this paper, we present the theoretical aspects and some basic examples.


Introduction
Krawtchouk polynomials are part of the legacy of Mikhail Kravchuk (Krawtchouk); see [14] for a valuable resource on his life and work, including developments through 2004 based on it. Krawtchouk polynomials appear in diverse areas of mathematics and science. Important applications, such as to image processing [15], are quite recent and indeed ongoing.
We cite [1,11,13], where Krawtchouk polynomials are used as the foundation for discrete models of quantum physics. They also appear naturally in the study of random walks in quantum probability [4,5].
After this Introduction, we continue with the probabilistic construction of Krawtchouk polynomials. They appear as the elementary symmetric functions in the jumps of a random walk, providing a system of martingales based on the random walk. Some fundamental recurrence relations are presented as well. The construction immediately yields their orthogonality relations. Alternative probabilistic approaches to ours of §2 are to be found in [2,8,9,10].
Section 3 provides the linearization and convolution formulas that are the core of the paper. They are related to formulas found in [7,12]. The next section, §4, specializes to the case of a symmetric random walk, where the formulas simplify considerably.
In Section 5 we introduce shift operators and use them to develop a computationally effective approach to finding transforms and convolutions. This differs from our principal work with operator calculus [3] and our recent approach to Krawtchouk transforms [6], and is suitable for numerical as well as symbolic computations.
The article concludes with §6, which presents special bases in which the Krawtchouk matrices are anti-diagonal. These basis functions have limited support and promise to be useful in implementing filtering methods in the Krawtchouk setting.

Combinatorial and probabilistic basis. Main features.
Consider a collection of N bits B = {0, 1} or signs S = {−1, 1}. Correspondingly, we let j denote the number of 0's or −1's, and we denote the sum in either case by x, so x = N − j for bits and x = N − 2j for signs. Order the elements of B or S and denote them by X_i, 1 ≤ i ≤ N. We can encode this information in the generating function ∏_i (1 + vX_i).

Now introduce a binomial probability space with the X_i a sequence of independent, identically distributed Bernoulli variables. With p the probability of "success" and q = 1 − p, the centered random variables are distributed as follows:

Bits: X_i − p = q with probability p, −p with probability q.
Signs: X_i − (p − q) = 2q with probability p, −2p with probability q.

To get a sequence of orthogonal functionals of the process we redefine

G_N(v) = ∏_{i=1}^{N} (1 + v(X_i − µ)),

where µ is the expected value of X_i. We see that the two cases differ effectively by a rescaling of v. To see how this comes about, consider general Bernoulli variables X_i taking values a and b with probabilities p and q respectively. Then the centered variables take values (a − b)q with probability p and −(a − b)p with probability q. We can take as standard model b = 0 and a = λ. Then µ = λp and σ² = λ²pq are the mean and variance of X_i. Thus, G has the form

G_N(v) = (1 + λqv)^{N−j} (1 − λpv)^j = Σ_n v^n k_n(j, N),   (1)

with j counting the number of 0's. The coefficients k_n(j, N) are polynomials in the variable j, Krawtchouk polynomials. We define a corresponding matrix Φ^{(N)} with entries Φ^{(N)}_{ij} = k_i(j, N), which acts as a transformation on R^{N+1}, considered as the space of functions defined on the set {0, 1, . . . , N}. The generic form, equation (1), is convenient for revealing and proving properties of the Krawtchouk polynomials, and of the transform Φ.
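As an illustrative sketch (the helper names are ours, not from the text), the coefficients k_n(j, N) can be generated directly by expanding the generating function (1 + λqv)^{N−j}(1 − λpv)^j; exact rational arithmetic keeps the entries exact:

```python
from fractions import Fraction
from math import comb

def krawtchouk(n, j, N, lam, p):
    """k_n(j, N): coefficient of v^n in (1 + lam*q*v)^(N-j) * (1 - lam*p*v)^j."""
    q = 1 - p
    return sum(comb(N - j, i) * (lam * q) ** i * comb(j, n - i) * (-lam * p) ** (n - i)
               for i in range(n + 1))

def phi(N, lam=2, p=Fraction(1, 2)):
    """Krawtchouk matrix Phi^(N), entry (i, j) = k_i(j, N); defaults give the symmetric case."""
    return [[krawtchouk(i, j, N, lam, p) for j in range(N + 1)] for i in range(N + 1)]

print([[int(x) for x in row] for row in phi(2)])  # [[1, 1, 1], [2, 0, -2], [1, -1, 1]]
```

For the symmetric choice λ = 2, p = 1/2 this reproduces the familiar small Krawtchouk matrices.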
We review here some principal features of this construction [4,5].
Remark. Denote expectation with respect to the underlying binomial distribution by angle brackets, ⟨F⟩ = E[F], with corresponding inner product ⟨f, g⟩ = ⟨f(X)g(X)⟩.

Martingale property.
Since the X_i are independent and X_i − µ has mean zero, we have the martingale property

E[G_{N+1}(v) | F_N] = G_N(v),

where F_N is the σ-field generated by {X_1, . . . , X_N}. Thus each coefficient k_n(j, N) is a martingale, where j denotes the number of 0's in the random sequence of 0's and 1's which is the sample path of the underlying Bernoulli process. This gives immediately

Proposition 2.1 (Martingale recurrence).

k_n(j, N) = p k_n(j, N + 1) + q k_n(j + 1, N + 1).

One can derive this purely algebraically from the Pascal recurrences presented in the next paragraph.

2.2.
Pascal recurrences and square identity. As is evident from the form of the generating function G, we have recurrences analogous to the Pascal triangle for binomial coefficients.
These follow directly, considering first (1 + λqv) G_N(v) = G_{N+1}(v) with j fixed, and second (1 − λpv) G_N(v) = G_{N+1}(v) with j replaced by j + 1:

k_n(j, N + 1) = k_n(j, N) + λq k_{n−1}(j, N),
k_n(j + 1, N + 1) = k_n(j, N) − λp k_{n−1}(j, N).

Note that the martingale property follows by combining p times the first equation with q times the second.
Given four contiguous entries forming a 2 × 2 submatrix of Φ^{(N)}, the square identity produces the lower left corner from the other three values. In terms of the k's:

k_n(j, N) = k_n(j + 1, N) + λp k_{n−1}(j, N) + λq k_{n−1}(j + 1, N).

Proof. Combine p times the first equation above with q times that same equation with j → j + 1. Applying the martingale recurrence on the left-hand side yields

k_n(j, N) = p k_n(j, N) + q k_n(j + 1, N) + λpq k_{n−1}(j, N) + λq² k_{n−1}(j + 1, N).

Subtracting off p k_n(j, N) and dividing out a common factor of q yields the result.
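A small numerical check of the Pascal recurrences, the martingale recurrence, and the square identity (a sketch; the function `k` simply expands the generating function, and the parameter values are arbitrary):

```python
from fractions import Fraction
from math import comb

def k(n, j, N, lam, p):
    # coefficient of v^n in (1 + lam*q*v)^(N-j) * (1 - lam*p*v)^j; zero for n < 0
    q = 1 - p
    return sum(comb(N - j, i) * (lam * q) ** i * comb(j, n - i) * (-lam * p) ** (n - i)
               for i in range(n + 1))

lam, p = 3, Fraction(1, 4)   # arbitrary test parameters
q = 1 - p
for N in range(1, 5):
    for n in range(N + 1):
        for j in range(N):
            # Pascal recurrences
            assert k(n, j, N + 1, lam, p) == k(n, j, N, lam, p) + lam * q * k(n - 1, j, N, lam, p)
            assert k(n, j + 1, N + 1, lam, p) == k(n, j, N, lam, p) - lam * p * k(n - 1, j, N, lam, p)
            # martingale recurrence (Proposition 2.1)
            assert k(n, j, N, lam, p) == p * k(n, j, N + 1, lam, p) + q * k(n, j + 1, N + 1, lam, p)
            # square identity
            assert k(n, j, N, lam, p) == (k(n, j + 1, N, lam, p) + lam * p * k(n - 1, j, N, lam, p)
                                          + lam * q * k(n - 1, j + 1, N, lam, p))
print("all identities check")
```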

2.3.
Orthogonality. For orthogonality, we wish to show that ⟨G(v)G(w)⟩ is a function of the product vw only. We have, using independence and centering,

⟨G(v)G(w)⟩ = (1 + σ²vw)^N,

where the variance σ² = λ²pq in our context. This yields the squared norms

⟨k_n, k_n⟩ = C(N, n) σ^{2n},

with C(N, n) the binomial coefficient. Introducing matrices, we can express the orthogonality relations compactly. Let B, the binomial distribution matrix, be the diagonal matrix with entries

B_jj = C(N, j) p^{N−j} q^j.

Let Γ denote the diagonal matrix of squared norms,

Γ_nn = C(N, n) σ^{2n}.

For fixed N, we write Φ for Φ^{(N)}, which has ij entry equal to k_i(j, N).
In other words, the orthogonality relation takes the form

Φ B Φ^T = Γ.

In the following sections we will detail linearization formulas for the symmetric and non-symmetric cases, derive the corresponding recurrence formulas, and then look at the associated convolution operators on functions.
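The orthogonality relation Φ B Φ^T = Γ can be verified numerically in exact arithmetic; a sketch (helper names ours, parameters arbitrary):

```python
from fractions import Fraction
from math import comb

def k(n, j, N, lam, p):
    # coefficient of v^n in (1 + lam*q*v)^(N-j) * (1 - lam*p*v)^j
    q = 1 - p
    return sum(comb(N - j, i) * (lam * q) ** i * comb(j, n - i) * (-lam * p) ** (n - i)
               for i in range(n + 1))

N, lam, p = 4, 2, Fraction(1, 3)
q = 1 - p
sigma2 = lam * lam * p * q

def inner(n, m):
    # <k_n, k_m> with respect to the binomial weights B_jj = C(N,j) p^(N-j) q^j
    return sum(comb(N, j) * p ** (N - j) * q ** j * k(n, j, N, lam, p) * k(m, j, N, lam, p)
               for j in range(N + 1))

for n in range(N + 1):
    for m in range(N + 1):
        expected = comb(N, n) * sigma2 ** n if n == m else 0
        assert inner(n, m) == expected
print("Phi B Phi^T = Gamma verified for N =", N)
```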

Krawtchouk polynomials: general case
We have the generating function

G_N(v) = (1 + λqv)^{N−j} (1 − λpv)^j = Σ_n v^n k_n(j, N),

with j running from 0 to N. The main feature is the relation

G_N(v) = ∏_{i=1}^{N} (1 + v(X_i − µ)),

where the X_i are independent Bernoulli variables taking values λ and 0 with probabilities p and q respectively.

Linearization coefficients.
We want the expansion of the product k_ℓ k_m in terms of k_n. First, a simple lemma.

Lemma 3.1. Let X take values λ and 0. Then

(X − µ)² = σ² + λ(q − p)(X − µ).

Proof. Since X takes only the values 0 and λ, we have the identity X(X − λ) = 0. Expand the left-hand side in Taylor series about λp and equate the result to zero.
In our context, using the Lemma, we can write this as

(1 + v(X − µ))(1 + w(X − µ)) = (1 + σ²vw) + (v + w + λ(q − p)vw)(X − µ).

Now multiply over the independent factors. Factoring out 1 + σ²vw from each term and re-expanding yields

G(v)G(w) = (1 + σ²vw)^N G(u),   where u = (v + w + λ(q − p)vw)/(1 + σ²vw).

Expanding the coefficient of k_n yields the linearization formula of Theorem 3.2.

3.1.1. Recurrence formula. The three-term recurrence formula characteristic of orthogonal polynomials follows by specializing ℓ = 1 in the linearization formula. First, compute the constant term and coefficient of v from the generating function G:

k_0(j, N) = 1,   k_1(j, N) = λ(qN − j).

From the linearization formula, we pick up three terms, with n = m and n = m ± 1. We get

k_1 k_m = (m + 1) k_{m+1} + λ(q − p) m k_m + σ²(N − m + 1) k_{m−1}.

The terms k_m and k_{m+1} arise with δ = 0, with the term k_{m−1} the only contribution for δ = 1. For our standard transform, we think of row vectors with multiplication by Φ on the right. Thus, the transform F of a function f is given by

F(j) = (f^T Φ)_j = Σ_n f(n) k_n(j, N),

where, e.g., f is the column vector with entries the corresponding values of f.
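A sketch checking the three-term recurrence in the form k_1 k_m = (m+1) k_{m+1} + λ(q−p) m k_m + σ²(N−m+1) k_{m−1}, which follows from the generating function (helper names and parameter values are ours):

```python
from fractions import Fraction
from math import comb

def k(n, j, N, lam, p):
    # coefficient of v^n in (1 + lam*q*v)^(N-j) * (1 - lam*p*v)^j; zero for n < 0
    q = 1 - p
    return sum(comb(N - j, i) * (lam * q) ** i * comb(j, n - i) * (-lam * p) ** (n - i)
               for i in range(n + 1))

lam, p = 2, Fraction(1, 5)
q = 1 - p
sigma2 = lam * lam * p * q
for N in range(1, 6):
    for m in range(N + 1):
        for j in range(N + 1):
            lhs = k(1, j, N, lam, p) * k(m, j, N, lam, p)
            rhs = ((m + 1) * k(m + 1, j, N, lam, p)
                   + lam * (q - p) * m * k(m, j, N, lam, p)
                   + sigma2 * (N - m + 1) * k(m - 1, j, N, lam, p))
            assert lhs == rhs
print("three-term recurrence verified")
```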
The inversion formula is conveniently expressed in terms of matrices.
Proposition 3.4. Let P be the diagonal matrix Let P ′ be the diagonal matrix Then The proof is similar to that for orthogonality.
Proof. The matrix equation is the same as the corresponding identity via generating functions. First, sum over i, using the generating function G(v) with j replaced by n. Then sum over n, again using the generating function. Finally, summing over j using the binomial theorem, via p + q = 1, yields the desired result. The inverse is thus a simple modification of the original transform, which is the basis for an efficient inversion algorithm.
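Whatever normalization one adopts for P and P′, the orthogonality relation of §2.3 gives an explicit inverse, Φ⁻¹ = B Φ^T Γ⁻¹: diagonal scalings before and after a transposed transform. A sketch in exact arithmetic (helper names ours):

```python
from fractions import Fraction
from math import comb

def k(n, j, N, lam, p):
    # coefficient of v^n in (1 + lam*q*v)^(N-j) * (1 - lam*p*v)^j
    q = 1 - p
    return sum(comb(N - j, i) * (lam * q) ** i * comb(j, n - i) * (-lam * p) ** (n - i)
               for i in range(n + 1))

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

N, lam, p = 3, 1, Fraction(2, 7)
q = 1 - p
sigma2 = lam * lam * p * q

Phi = [[k(i, j, N, lam, p) for j in range(N + 1)] for i in range(N + 1)]
# inverse transform: binomial weights, then Phi^T, then division by the squared norms
B_diag = [comb(N, j) * p ** (N - j) * q ** j for j in range(N + 1)]
G_diag = [comb(N, n) * sigma2 ** n for n in range(N + 1)]
PhiInv = [[B_diag[j] * Phi[n][j] / G_diag[n] for n in range(N + 1)] for j in range(N + 1)]

identity = [[1 if a == b else 0 for b in range(N + 1)] for a in range(N + 1)]
assert matmul(Phi, PhiInv) == identity
print("Phi * PhiInv = I")
```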

Convolution.
Corresponding to the product of two transforms F and G is the convolution of the original functions f and g. Following the proof of the linearization formula, eqs. (3), we may define the convolution f ⋆ g of two functions f and g on {0, 1, . . . , N} so that we have the relation

F(j)G(j) = Σ_n (f ⋆ g)(n) k_n(j, N).

Now, using the inversion formula, Corollary 3.5, we recover the convolution of the original functions from the pointwise product F · G.

Krawtchouk polynomials: symmetric case
For the symmetric case, it is convenient to consider the "signs" process where X_i takes values ±1 with equal probability, p = q = 1/2. Thus, λ = 2 and we have the generating function

G_N(v) = (1 + v)^{N−j} (1 − v)^j = Σ_n v^n k_n(j, N).

While j runs from 0 to N, the sum x = N − 2j runs from −N to N in steps of 2. Now, q − p = 0 and σ² = 1.
In terms of x = k_1 = N − 2j, write k_n(j, N) = K_n(x, N)/n!. We have the recurrence

K_{n+1} = x K_n − n(N − n + 1) K_{n−1}.

For example, we can generate the next few polynomials:

K_0 = 1,   K_1 = x,   K_2 = x² − N,   K_3 = x³ − (3N − 2)x.

The special identities and recurrences hold with λ = 2, p = q = 1/2 and simplify accordingly. Of particular interest is the simplification of the convolution structure.
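The recurrence above can be cross-checked against the generating-function definition, since K_n(N − 2j, N) = n! k_n(j, N); a sketch (helper names ours):

```python
from math import comb, factorial

def k(n, j, N):
    # symmetric case: coefficient of v^n in (1+v)^(N-j) * (1-v)^j
    return sum(comb(N - j, i) * comb(j, n - i) * (-1) ** (n - i) for i in range(n + 1))

def K(n, x, N):
    # K_n(x, N) via the recurrence K_{n+1} = x K_n - n(N - n + 1) K_{n-1}
    a, b = 1, x          # K_0, K_1
    if n == 0:
        return a
    for m in range(1, n):
        a, b = b, x * b - m * (N - m + 1) * a
    return b

N = 6
for n in range(N + 1):
    for j in range(N + 1):
        assert K(n, N - 2 * j, N) == factorial(n) * k(n, j, N)
print("K_n(N - 2j, N) = n! * k_n(j, N) verified")
```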

Linearization coefficients.
We want the expansion of the product k_ℓ k_m in terms of k_n. In Theorem 3.2, since q = p, we have the condition ℓ + m − n = 2δ and the sum over δ disappears. This leads to a particular set of conditions, namely, that the numbers ℓ, m, and n should form the sides of a triangle: |ℓ − m| ≤ n ≤ ℓ + m, with ℓ + m + n even. So, define the triangle function

Δ(ℓ, m, n) = C(n, (n + ℓ − m)/2) C(N − n, (ℓ + m − n)/2),

where all terms with a factorial must be nonnegative; then k_ℓ k_m = Σ_n Δ(ℓ, m, n) k_n. Written out in factorials, this is a multinomial-type coefficient. Proof. The "triangular" form follows from the binomial form by rearranging factorials.
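A sketch verifying the symmetric linearization in the form k_ℓ k_m = Σ_n C(n, (n+ℓ−m)/2) C(N−n, (ℓ+m−n)/2) k_n, as follows from the generating function (helper names ours):

```python
from math import comb

def k(n, j, N):
    # symmetric case: coefficient of v^n in (1+v)^(N-j) * (1-v)^j
    return sum(comb(N - j, i) * comb(j, n - i) * (-1) ** (n - i) for i in range(n + 1))

def c(l, m, n, N):
    # linearization coefficient; zero unless l+m+n is even and the triangle conditions hold
    if (l + m + n) % 2 or not (abs(l - m) <= n <= l + m):
        return 0
    return comb(n, (n + l - m) // 2) * comb(N - n, (l + m - n) // 2)

N = 5
for l in range(N + 1):
    for m in range(N + 1):
        for j in range(N + 1):
            assert k(l, j, N) * k(m, j, N) == sum(c(l, m, n, N) * k(n, j, N) for n in range(N + 1))
print("symmetric linearization verified")
```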

Krawtchouk transforms. Inversion.
In the symmetric case, the matrices P and P′ in Proposition 3.4 and Corollary 3.5 become identity matrices. Thus, we have

Φ Φ = 2^N I,   so that   f^T = 2^{−N} F Φ.

So the inversion is essentially an immediate application of the original transform.
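The property Φ² = 2^N I of the symmetric Krawtchouk matrices is easily checked for small N (a sketch; helper names ours):

```python
from math import comb

def phi(N):
    # symmetric Krawtchouk matrix: entry (n, j) = coefficient of v^n in (1+v)^(N-j)(1-v)^j
    return [[sum(comb(N - j, i) * comb(j, n - i) * (-1) ** (n - i) for i in range(n + 1))
             for j in range(N + 1)] for n in range(N + 1)]

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

for N in range(1, 7):
    P = phi(N)
    assert matmul(P, P) == [[2 ** N if a == b else 0 for b in range(N + 1)] for a in range(N + 1)]
print("Phi^2 = 2^N I for N = 1..6")
```

Inversion is thus a second application of the transform followed by division by 2^N.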

Convolution.
Corresponding to the product of two transforms F and G is the convolution of the original functions f and g. In equation (4), the condition q − p = 0 entails n = α + β. We write a for α, replacing β = n − a, and write b for δ. This gives for the convolution

(f ⋆ g)(n) = Σ_{a=0}^{n} Σ_{b=0}^{N−n} C(n, a) C(N − n, b) f(a + b) g(n − a + b).

We have the relation F(j)G(j) = Σ_n (f ⋆ g)(n) k_n(j), and the inversion simplifies to

(f ⋆ g)^T = 2^{−N} (F · G) Φ

for the convolution of the original functions, where F · G denotes the pointwise product.
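A sketch checking the defining relation transform(f ⋆ g) = F · G with the convolution written in the (a, b) form above (helper names and sample data ours):

```python
from math import comb

def k(n, j, N):
    # symmetric case: coefficient of v^n in (1+v)^(N-j) * (1-v)^j
    return sum(comb(N - j, i) * comb(j, n - i) * (-1) ** (n - i) for i in range(n + 1))

def transform(f, N):
    # F(j) = sum_n f(n) k_n(j, N)
    return [sum(f[n] * k(n, j, N) for n in range(N + 1)) for j in range(N + 1)]

def conv(f, g, N):
    # (f * g)(n) = sum_{a,b} C(n,a) C(N-n,b) f(a+b) g(n-a+b)
    return [sum(comb(n, a) * comb(N - n, b) * f[a + b] * g[n - a + b]
                for a in range(n + 1) for b in range(N - n + 1))
            for n in range(N + 1)]

N = 4
f = [1, 2, 0, -1, 3]
g = [0, 1, 1, 2, -2]
F, G = transform(f, N), transform(g, N)
FG = [F[j] * G[j] for j in range(N + 1)]
assert FG == transform(conv(f, g, N), N)
print("transform(f * g) = F . G")
```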

Shift operators and matrix formulation of Krawtchouk transform and convolution
We will show how the transform and convolution can be represented by matrices acting on appropriate spaces.

Transforms. Introduce the shift operator T_x, which acts on a function f(x) by T_x f(x) = f(x + 1).
Similarly, T_y f(y) = f(y + 1) shifts the variable y by 1. For the transform, in the generating function we replace v by T_n, the matrix shifting the argument n of f. Remark. Even though we are using the column vector f, we are taking the transform by multiplying by Φ on the right, that is, computing the entries of f^T Φ.
Considering vectors f with a single nonzero entry equal to one leads to another way to describe the result: namely, the matrix with successive columns equal to the first row of each of the generated matrices. Remark. Note that the matrix U has the expansion

U = I − λ Σ_{m≥1} (−λq)^{m−1} T^m.

This follows from the identity U = (I − λpT)(I + λqT)^{−1}, which may be verified by multiplying both sides by (I + λqT). Expanding in geometric series, noting that T is nilpotent, yields the above formula for U. The coefficients are the entries constant on successive superdiagonals of U.
Example. Let N = 4. Starting with a column vector f, first multiplying by (I + λqT)^4, then successively by U, produces one by one the entries of the transform of f.
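One way to realize the shift-operator scheme (a sketch, with U = (I − λpT)(I + λqT)^{−1} as suggested by the verification step in the Remark; helper names and parameters ours):

```python
from fractions import Fraction
from math import comb

def k(n, j, N, lam, p):
    # coefficient of v^n in (1 + lam*q*v)^(N-j) * (1 - lam*p*v)^j
    q = 1 - p
    return sum(comb(N - j, i) * (lam * q) ** i * comb(j, n - i) * (-lam * p) ** (n - i)
               for i in range(n + 1))

N, lam, p = 4, 2, Fraction(1, 3)
q = 1 - p
dim = N + 1

Id = [[1 if a == b else 0 for b in range(dim)] for a in range(dim)]
T = [[1 if b == a + 1 else 0 for b in range(dim)] for a in range(dim)]   # (T f)(n) = f(n+1)

def matmul(A, B):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*B)] for row in A]

def matvec(A, v):
    return [sum(x * y for x, y in zip(row, v)) for row in A]

def add(A, B, c):
    return [[x + c * y for x, y in zip(r, s)] for r, s in zip(A, B)]

# (I + lam*q*T)^(-1) as a finite geometric series, since T is nilpotent
inv = Id
Tk = Id
for m in range(1, dim):
    Tk = matmul(Tk, T)
    inv = add(inv, Tk, (-lam * q) ** m)
U = matmul(add(Id, T, -lam * p), inv)    # U = (I - lam*p*T)(I + lam*q*T)^(-1)

f = [Fraction(c) for c in (3, -1, 0, 2, 5)]
g = f
for _ in range(N):                        # g = (I + lam*q*T)^N f
    g = matvec(add(Id, T, lam * q), g)

F_shift = []
for j in range(dim):                      # top entry of (I+lam*q*T)^(N-j) (I-lam*p*T)^j f
    F_shift.append(g[0])
    g = matvec(U, g)

F_direct = [sum(f[n] * k(n, j, N, lam, p) for n in range(dim)) for j in range(dim)]
assert F_shift == F_direct
print("shift-operator scheme reproduces the transform")
```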
Similarly, replacing the variables v and w in equation (3) by T_n and T_m, respectively, yields the formula for the convolution. Representing T_m T_n by the Kronecker/tensor product of the corresponding shift matrices provides an explicit matrix that, when applied to the tensor product of the vectors f and g, yields the convolution f ⋆ g. (See the Appendix for an example.) So the convolution can be computed analogously to the transform: start with the tensor product of f and g and compute successively as in equation (5).

Dual Transforms. Binomial bases
Of course, one could define transforms dually, multiplying column vectors on the left by Φ. Let us begin with an example.
Example. For the symmetric case, take N = 4 and apply Φ^(4) to the matrix whose m-th column consists of the binomial coefficients C(m, j), 0 ≤ j ≤ 4. Observe that column m of binomial coefficients is mapped to its partner column indexed by N − m, scaled by 2^m. Note that, as functions, functions with zero tails are mapped to functions with zero tails, analogously to Fourier transforms of compactly supported functions or cutoff functions for filtering.
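The mapping of binomial columns can be checked directly, in the form Σ_j C(m, j) k_n(j, N) = 2^m C(N − m, n) (a sketch; helper names ours):

```python
from math import comb

def k(n, j, N):
    # symmetric Krawtchouk: coefficient of v^n in (1+v)^(N-j) * (1-v)^j
    return sum(comb(N - j, i) * comb(j, n - i) * (-1) ** (n - i) for i in range(n + 1))

N = 4
for m in range(N + 1):
    col = [comb(m, j) for j in range(N + 1)]          # binomial column, zero tail
    image = [sum(k(n, j, N) * col[j] for j in range(N + 1)) for n in range(N + 1)]
    assert image == [2 ** m * comb(N - m, n) for n in range(N + 1)]
print("Phi maps binomial column m to 2^m times binomial column N - m")
```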
In general we have

Σ_j C(m, j) k_n(j, N) = 2^m C(N − m, n).

Proof. We show the generating function version of the relation. Thus,

Σ_j C(m, j) (1 + v)^{N−j} (1 − v)^j = (1 + v)^{N−m} ((1 + v) + (1 − v))^m = 2^m (1 + v)^{N−m}.

Comparing with equation (7) indicates a connection between Φ and Φ^T. At this point it is straightforward to give a direct proof of the properties we want.

So in this basis the transform acts anti-diagonally. Proof. As in the previous proposition, we show the generating function version of the relation.

Concluding Remarks
We have presented Krawtchouk transforms, which have the potential to provide an inherently discrete, efficient alternative to Fourier analysis. By presenting effective algorithms using matrix techniques to compute transforms and convolution products, we have demonstrated tools that are not only of theoretical interest but are ready for practical applications. In addition, the special binomial transforms we have indicated provide a foundation for filtering techniques. Thus, the Krawtchouk analogs of the standard Fourier toolkit are now available. Digital image analysis, for example, will provide an important arena for illustrating and developing Krawtchouk methods as presented in this work.

Appendix
Here we show examples of a transform and of a convolution computation using the matrix techniques discussed in the text.