GLOBAL MINIMIZATION

Abstract. Many problems in image processing can be posed as non-convex minimization problems. For certain classes of non-convex problems involving scalar-valued functions, it is possible to recast the problem in a convex form using a "functional lifting" technique. In this paper, we present a variational functional lifting technique that can be viewed as a generalization of previous works by Pock et al. and Ishikawa. We then generalize this technique to the case of minimization over vector-valued problems, and discuss a condition which allows us to determine when the solution to the convex problem corresponds to a global minimizer. This generalization allows functional lifting to be applied to a wider range of problems than previously considered. Finally, we present a numerical method for solving the convexified problems, and apply the technique to find global minimizers for optical flow image registration.

1. Introduction. Many problems in imaging are solved by minimizing an energy functional using local techniques such as gradient descent. When problems are non-convex, these techniques fail because they get "stuck" at local minima. A classical example of one such non-convex problem is image registration. For this problem, gradient descent techniques can be very ineffective, especially if the images involved suffer from strong noise contamination or repetitive details such as textures.
Recent advances in optimization theory have allowed certain classes of nonconvex problems to be solved in polynomial time. Most of these techniques rely on the concept of "functional lifting," which converts a non-convex problem into a convex problem which can be easily minimized using standard techniques. The functional lifting concept has been used to solve segmentation and stereo problems within both the continuous and MRF framework [17,12,37,51].
Unfortunately, the work that has been done on functional lifting methods pertains only to minimizing scalar-valued functions. Binary segmentation problems involve only one variable which determines which region a pixel is in, and stereo matching problems only involve image displacements in one direction. More sophisticated problems require the optimization of multi-variate functions. One of the most ubiquitous examples of this is image registration, in which the pixels of one image are mapped onto another. Each pixel in a registration problem is assigned both a horizontal and vertical displacement component. Because displacement functions are vector-valued, existing methods for non-convex optimization cannot tackle this problem.
In this work, we generalize the functional lifting framework to minimize multivariate functions over Markov random fields, allowing us to minimize a more general class of non-convex problems than has been previously considered. In doing so, we present a scheme for computing the global minimizer of image registration problems.
The organization of this paper is as follows: We first present a brief review of work that has been done on single variable functional lifting, focusing on the discrete theory of MRF's. We then discuss optical flow models for image registration, and explain how they can be formulated using first order MRF's. We then introduce a novel method for solving multi-variable non-convex problems in imaging via functional lifting, and explain how this method can be applied to image registration. Finally, we describe a numerical scheme to solve the resulting convexified problems, and present results of numerical experiments.
1.1. Optimization on Markov Random Fields. The Markov Random Field (MRF) has a long and important history within the physics and statistics communities, where it was first introduced as a formalization of the Ising-Potts model for particle interactions [4,5,52]. It was Geman and Geman [30] who first showed how MRF's could be used for image processing tasks.
We will now formalize the notion of a Markov random field. An MRF is defined over a discrete lattice of points, Ω. In the context of image processing, Ω is a rectangular array of pixels. Each point, x ∈ Ω, in the lattice takes on some value (e.g. the pixel intensity), denoted by u(x). With each lattice point, we also identify its set of neighbors, N_x ⊂ Ω, with which the point x directly interacts.
Most MRF-based image processing methods are concerned with minimizing energies of the form

(1)  E(u) = Σ_{x∈Ω} Σ_{y∈N_x} R(u(x), u(y)) + Σ_{x∈Ω} f(x, u(x)).

Energy models on MRF's consist of two parts. The regularizer, R, measures differences between neighboring nodes. In imaging applications, this term enforces some sort of "smoothness" of the recovered image. The fidelity term, f, measures the relationship between each individual node and some pre-defined quantity, such as the difference between the clean image, u, and a noisy image. Energies of the form (1) can be minimized by standard techniques, such as gradient descent, simulated annealing, or other heuristic methods [6,40,7,10,9]. However, these techniques can be slow, and yield sub-optimal results.
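As a concrete illustration, an energy of the form (1) can be evaluated directly. The following Python sketch uses an invented quadratic fidelity term and an absolute-difference regularizer over 4-connected neighbors; the function and variable names are our own, not notation from the text:

```python
import numpy as np

def mrf_energy(u, f, w=1.0):
    """Evaluate a simple Markovian energy of the form (1):
    a pairwise regularizer over 4-connected neighbors plus a
    per-pixel fidelity term.  u is an integer label image and
    f[i, j, k] is the (hypothetical) cost of assigning label k
    to pixel (i, j)."""
    # Regularizer: absolute differences between 4-connected neighbors.
    reg = np.abs(np.diff(u, axis=0)).sum() + np.abs(np.diff(u, axis=1)).sum()
    # Fidelity: sum of per-pixel costs at the chosen labels.
    rows, cols = np.indices(u.shape)
    fid = f[rows, cols, u].sum()
    return w * reg + fid

# A tiny denoising-style example: the fidelity penalizes deviation
# from a noisy observation g.
g = np.array([[0, 0, 3], [0, 3, 3]])
labels = np.arange(4)
f = (labels[None, None, :] - g[..., None]) ** 2
print(mrf_energy(g, f))   # the observation itself has zero fidelity cost
```

The observation g pays only for its neighbor differences, which is exactly the trade-off that makes the minimizer of (1) a smoothed version of the data.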
For certain classes of problems with pairwise label interactions, exact minimizers can be computed very efficiently using a "graph cuts" approach. These techniques embed the energy function into a graph consisting of a large number of nodes connected with edges. Each edge is assigned a certain weight. The minimal graph cut problem is to partition the set of nodes into two disjoint sets in such a way that the sum of the weights of the "cut" edges is minimal [10,41,29,5].
The first general class of problems solvable by graph cuts was identified by Picard and Ratliff [48]. This work was followed by Kolmogorov and Zabih [41], who present a regularity condition which is both necessary and sufficient for a problem to be solvable by graph cuts. This basic graph cuts framework was only applicable to binary problems (problems for which the function, u, is restricted to have 2 discrete values). Despite this restriction, Greig et al. [33] realized that this framework could be used to solve TV regularized problems in imaging. This framework was powerful enough to solve problems on binary images, as well as 2-phase segmentation problems [8,11,60].
Later work on MRF optimization focused on generalizing these results to solve problems with more than two labels. For problems with TV regularizers and convex fidelity terms, Darbon and Sigelle show that an n-label problem can be exactly minimized by solving a sequence of log(n) binary minimization problems [23,24,22], allowing for fast minimization of the ROF denoising model [54].
For problems with non-convex regularizers, the problem becomes more difficult. In the next section, we present a modern approach to this problem.

1.2. Functional Lifting for Non-Convex Optimization. One of the most influential frameworks for global minimization of non-convex functions was proposed by Ishikawa [37]. The simplest version of the Ishikawa method occurs when the regularizer is a linear combination of the pairwise differences between labels. In particular, it is assumed that the Markovian energy has the form

(2)  E(u) = Σ_{x∈Ω} Σ_{y∈N_x} w_{x,y} |u(x) − u(y)| + Σ_{x∈Ω} f(x, u(x))

for some set of positive weights {w_{x,y}}. Because the fidelity term f(·, ·) may be non-convex, standard minimization techniques fail to find a global minimum in general.
In [37], a technique is introduced to convert the non-convex, multi-label "primal" problem (2) into a regular, binary "lifted" problem that is solvable by graph cuts. This scheme is illustrated in figure 1. This is accomplished by replacing each node in the primal problem, which can take on Γ possible values, with a column of Γ binary valued nodes. By connecting adjacent nodes with properly weighted edges, we formulate the graph so that the value of a cut corresponds to a configuration of pixels in the primal problem. Suppose, for example, we choose each horizontal edge in figure 1 to have a weight of unity. If two adjacent pixels differ in intensity by d units, then the corresponding cut must sever d horizontal edges. This choice of horizontal edge weights gives us the familiar TV regularization for the simple 1-dimensional problem depicted here. Likewise, a cut in figure 1 must sever one solid vertical edge in each column. The energy associated with this severed edge corresponds to the fidelity energy contributed by that pixel.
In figure 1, there are dotted vertical edges between nodes. These edges have infinite weight, and only contribute to the value of a cut if the tail lies in group s. The purpose of these edges is to prevent a column of nodes from being cut more than once. Although we focus on simple TV-like regularizers in this manuscript, the Ishikawa approach was originally formulated to solve problems with arbitrary convex priors. Since Ishikawa's original publication, more sophisticated priors have been considered [38,55]. A very general approach applicable to arbitrary sub-modular priors is presented by Darbon [21].
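To make the construction concrete, the following sketch builds an Ishikawa-style graph for a small 1-D, TV-regularized problem and solves it with an off-the-shelf max-flow routine. The node layout and cost assignments are our own illustration of the scheme, not the exact construction of [37], and the helper names are invented:

```python
import itertools
import numpy as np
from scipy.sparse import csr_matrix
from scipy.sparse.csgraph import maximum_flow

def ishikawa_min_energy(f, w):
    """Minimum of E(u) = sum_i f[i, u_i] + w * sum_i |u_i - u_{i+1}|
    via an Ishikawa-style graph cut.  f must hold integer costs,
    since scipy's max-flow works on integer capacities."""
    n, G = f.shape           # n pixels, G labels
    INF = int(f.sum()) + 1   # effectively infinite capacity
    # Node (i, k) encodes the binary statement "u_i >= k", k = 1..G-1.
    node = lambda i, k: 2 + i * (G - 1) + (k - 1)
    s, t = 0, 1
    N = 2 + n * (G - 1)
    cap = np.zeros((N, N), dtype=np.int32)
    for i in range(n):
        # Vertical chain s -> (i,1) -> ... -> (i,G-1) -> t.  Cutting the
        # edge leaving level k charges the fidelity cost f[i, k].
        chain = [s] + [node(i, k) for k in range(1, G)] + [t]
        for k in range(G):
            cap[chain[k], chain[k + 1]] += int(f[i, k])
        # Infinite downward edges forbid cutting a column twice.
        for k in range(1, G - 1):
            cap[node(i, k + 1), node(i, k)] = INF
        # Horizontal edges of weight w realize the TV regularizer.
        if i + 1 < n:
            for k in range(1, G):
                cap[node(i, k), node(i + 1, k)] += w
                cap[node(i + 1, k), node(i, k)] += w
    flow = maximum_flow(csr_matrix(cap), s, t)
    return flow.flow_value   # min-cut value = minimal energy

# Sanity check against brute force on a tiny non-convex instance.
rng = np.random.default_rng(0)
f = rng.integers(0, 9, size=(4, 3))
brute = min(sum(f[i, u[i]] for i in range(4))
            + sum(abs(u[i] - u[i + 1]) for i in range(3))
            for u in itertools.product(range(3), repeat=4))
assert ishikawa_min_energy(f, w=1) == brute
```

Here the min-cut value equals the minimal Markovian energy; recovering the labeling itself would additionally require extracting the source side of the cut from the residual graph.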
The functional lifting approach to non-convex optimization has also been studied in the variational/PDE framework by Pock et al. [51]. In this work, the authors reinterpret the Ishikawa method using a continuous formulation, and solve variational problems involving TV regularizers. Consider a continuous energy of the form

(3)  E_c(u) = ∫_Ω f(x, u(x)) dx + ∫_Ω |∇u(x)| dx.

We wish to minimize the above energy over all functions u : Ω → [0, Γ]. The lifted representation of u is the binary function φ : Ω × [0, Γ] → {0, 1} given by φ(x, γ) = 1 if γ < u(x), and φ(x, γ) = 0 otherwise. Note that φ_x(γ) = φ(x, γ) is an indicator function on an interval with a discontinuity at γ = u(x). This relationship is depicted in figure 2.
Using this result, we can write the fidelity portion of the energy (3) as

(4)  ∫_Ω f(x, u(x)) dx = ∫_Ω ∫_0^Γ f(x, γ) |∂_γ φ(x, γ)| dγ dx.

The regularization portion of (3) can be written in terms of φ using the co-area formula as follows:

(5)  ∫_Ω |∇u(x)| dx = ∫_0^Γ ∫_Ω |∇_x φ(x, γ)| dx dγ.

Using (4) and (5), we see that E_c(u) = L_c(φ), where

(6)  L_c(φ) = ∫_Ω ∫_0^Γ ( f(x, γ) |∂_γ φ(x, γ)| + |∇_x φ(x, γ)| ) dγ dx.

Note that, unlike E_c(u), the energy L_c(φ) is convex. The idea presented in [51] is to compute a minimizer of the convex energy (6), and then attempt to reconstruct a function u from the optimal function φ. Because we allow φ to take on any value in the interval [0, 1], the minimizer of L_c may not be binary valued. In this case, we threshold φ at some value α ∈ (0, 1), and use the resulting binary function to recover u. Interestingly, the authors propose to compute the convex minimization using a primal-dual scheme, rather than conventional graph cut minimization. The variational method has several advantages over graph cuts: it uses less memory, produces more accurate results, and can be easily parallelized.
The framework for functional lifting presented in [51] is rather informal. A rigorous definition of continuous functional lifting requires one to consider many technicalities. In particular, one must carefully define the appropriate function spaces for minimization. Also, in order to guarantee that L_c has a non-trivial minimizer, it is necessary to specify boundary conditions on φ involving limits at infinity. The interested reader should see [50], in which these issues are addressed formally. Rather than dwell on the complexities of the continuous problem, this paper shall focus on the case of functional lifting for discrete energies over MRF's.
The purpose of this work is to present a global optimization scheme for Markovian energies involving vector-valued functions. Our approach relies on a multidimensional analog of the functional lifting concept, and can thus be viewed as a generalization of the method of Ishikawa [37].
In particular, we allow the discrete Markovian function, u : Ω → R^m, to take on vector values at each node (i.e. the range of u has m components/channels), and we consider minimization problems of the following form:

(7)  min_u Σ_{x∈Ω} f(x, u(x)) + Σ_{i=1}^m Σ_{x∈Ω} Σ_{y∈N_x} w_{x,y} |u_i(x) − u_i(y)|.

Although we focus on discrete problems defined over MRF's, our approach to functional lifting is largely variational. Rather than forcing the lifted function φ to take on binary values, we parallel the continuous approach described above by allowing φ to range over the interval [0, 1]. Furthermore, the minimization scheme we choose is a generalization of the continuous scheme presented in [51], and does not rely on graph cuts. In this sense, the method presented here is a hybrid method: it relies on both discrete and variational techniques. Later on, we will present several theoretical results pertaining to our functional lifting method. Our scheme converts vector-valued problems to a lifted problem of higher dimension. We show that, in order for the solution of the lifted problem to correspond to a solution of the original problem, the solution of the lifted problem must satisfy a certain "box function condition" after thresholding has been applied. We also present a proof that this condition is always satisfied when the method is applied to scalar-valued problems (i.e. m = 1). In the experimental results presented here, we found that the box function condition was also satisfied when the method was applied to vector-valued problems; however, rigorous results will be a subject of future research.

Optical Flow Methods for Image Registration.
Optical flow is an important concept in both computer vision and medical imaging fields. The optical flow problem can be phrased as follows: Given two images (a "source" and a "target") with corresponding features, compute the displacement field that maps the features of the source image onto the features of the target image. Optical flow models are used in computer vision to track the movement of an object through consecutive images in a video segment. In this application, the displacement field represents the velocity field of the moving objects. In medical imaging, optical flow is used to remove motion artifacts from clinical images, as well as to map brain images onto reference images for quantitative comparison [39,25,34].
Variational models for optical flow were originally proposed by Horn and Schunck [36]. Given two images, I_0 and I_1, defined over some region Ω, these authors proposed to recover the displacement field, u = (u_1, u_2), by minimizing a functional of the form

(8)  E(u) = ∫_Ω |I_1(x + u(x)) − I_0(x)|^q dx + α ∫_Ω ( |∇u_1(x)|^p + |∇u_2(x)|^p ) dx,

where α is a parameter that controls the strength of the regularization term. In the original Horn-Schunck model, the authors choose p = q = 2, and assume the images to be smooth. Under these assumptions, the energy (8) can be simplified by replacing the images with a local linear approximation, and forming explicit optimality conditions. The original Horn-Schunck model has several limitations. First, the quadratic regularizer does not allow for discontinuities. Second, the smoothness assumption does not hold for images with hard edges, and only allows for small displacements. For this reason, a myriad of authors have suggested non-quadratic regularizers [14,35,19,53,58,45]. Of the non-quadratic regularizers that have been proposed for this problem, one of the most popular and successful is total-variation (TV) regularized optical flow [14,35,19,53], which corresponds to the above model (8) with p = 1. The TV regularizer is known to enforce smoothness, while still allowing for jump discontinuities [54,47,16].
When problem (8) is discretized using an anisotropic TV regularizer, the energy can be written

(9)  E(u) = Σ_{x∈Ω} |I_1(x + u(x)) − I_0(x)|^q + α Σ_{x∈Ω} Σ_{i=1}^2 ( |u_i(x + e_1) − u_i(x)| + |u_i(x + e_2) − u_i(x)| ),

where e_1 = (1, 0) and e_2 = (0, 1) denote elementary unit vectors. Note that the non-convex optical flow energy (9) is a Markovian energy of the form (7).
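For intuition, the discretized energy (9) can be evaluated directly. The sketch below (array names, q = 1, and the periodic wrap-around are illustrative assumptions, matching the periodic boundary conditions used in the synthetic experiments) confirms that the true constant shift of a periodically shifted image attains zero energy:

```python
import numpy as np

def flow_energy(I0, I1, u, alpha=1.0):
    """Evaluate a discretized optical flow energy of the form (9) for an
    integer displacement field u of shape (H, W, 2), with periodic
    boundary conditions."""
    H, W = I0.shape
    rows, cols = np.indices((H, W))
    # Fidelity: |I1(x + u(x)) - I0(x)| with periodic wrap-around.
    warped = I1[(rows + u[..., 0]) % H, (cols + u[..., 1]) % W]
    fid = np.abs(warped - I0).sum()
    # Anisotropic TV of each displacement component (periodic differences).
    tv = sum(np.abs(np.roll(u[..., i], -1, axis=a) - u[..., i]).sum()
             for i in range(2) for a in range(2))
    return fid + alpha * tv

# The true constant shift has zero energy: warping I1 back recovers I0
# exactly, and a constant field has zero total variation.
I0 = np.arange(30.0).reshape(5, 6)
I1 = np.roll(I0, shift=(1, 2), axis=(0, 1))   # I1(y) = I0(y - (1, 2))
u = np.tile([1, 2], (5, 6, 1))
print(flow_energy(I0, I1, u))   # → 0.0
```

The non-convexity of (9) lives entirely in the fidelity term: the images enter through the warped samples I_1(x + u(x)), which depend on u in an arbitrary, data-driven way.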

Notation and Preliminary Results. We now introduce some notation and conventions that are handy when working with discrete function spaces.
First, we use the conventional interval notations (a, b) and [a, b] to denote continuous open and closed intervals, respectively. In the discrete setting, we use the notation [a, b] d = {a, a + 1, · · · , b − 1, b} to denote a discrete interval. We also use the notation e i to denote the elementary basis vector, which has its ith component equal to unity and all other components equal to zero.
We will frequently convert between multi- and binary-valued functions using a "threshold" operator. Given some scalar-valued function φ, we define its thresholding at level α (also called the upper level set at level α) as

φ^α(γ) = 1 if φ(γ) ≥ α, and φ^α(γ) = 0 otherwise.

Alternatively, we can represent the thresholding of φ as the indicator function φ^α(γ) = 1I_{φ≥α}(γ), where 1I_A denotes the indicator function of the set A. Many useful regularizers can be represented with level sets/thresholding using a coarea formula. Explanations of these formulas and their uses in imaging can be found in [28,23,24,15]. In the continuous case, the coarea formula allows us to represent the total variation of a function u : Ω ⊂ R^n → R as an integral of the total variations of its level sets as follows:

∫_Ω |∇u| dx = ∫_{−∞}^{∞} ∫_Ω |∇ 1I_{u≥γ}(x)| dx dγ.

The coarea formula also has a discrete analog. Given a discrete function, u : Ω → [0, Γ − 1]_d, and two points x, y ∈ Ω, we have

(10)  |u(x) − u(y)| = Σ_{λ=1}^{Γ−1} |u^λ(x) − u^λ(y)|.

To aid in the presentation of our theoretical results, we introduce the notion of a "box function." In the discrete setting, consider a rectangular domain Γ = [0, Γ_1]_d × [0, Γ_2]_d × · · · × [0, Γ_m]_d. We shall refer to such a domain as a "box" with "principle vertex" at (Γ_1, Γ_2, · · · , Γ_m). We say that a function φ : Γ → R is a box function if it can be written in the form

φ(γ) = Π_{i=1}^m 1I_{[0, γ̂_i]}(γ_i)

for some principle vertex γ̂ = (γ̂_1, · · · , γ̂_m) ∈ Γ. In plain English, the region Γ is shaped like a box with one vertex at the origin and one vertex (the principle vertex) in the first quadrant. The corresponding box function has value zero outside of this box, and unity inside the box.
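The discrete coarea formula (10) is easy to verify numerically. The following sketch checks it exhaustively on a small random label vector; the helper name is our own:

```python
import numpy as np

def threshold(u, lam):
    # Upper level set: u^lam(x) = 1 if u(x) >= lam, else 0.
    return (u >= lam).astype(int)

# Discrete coarea formula (10): the label difference between two nodes
# equals the sum of the differences of their thresholdings.
Gamma = 6
rng = np.random.default_rng(1)
u = rng.integers(0, Gamma, size=10)
for x in range(10):
    for y in range(10):
        lhs = abs(int(u[x]) - int(u[y]))
        rhs = sum(abs(threshold(u, l)[x] - threshold(u, l)[y])
                  for l in range(1, Gamma))
        assert lhs == rhs
print("discrete coarea formula verified")
```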
Finally, we will make use of a discrete difference operator, which we shall denote D^m_γ. This operator is the discrete mixed-partial derivative operator

D^m_γ = (−1)^m D^+_1 D^+_2 · · · D^+_m,

where D^+_i represents the first order forward difference operator in the ith coordinate direction. In 2 dimensions, this operator has the stencil

( 1  −1 ; −1  1 ).

In the discussion below, we will frequently apply this difference operator to box functions. Note that, if φ represents a box function with principle vertex at γ̂, then D^m_γ φ(γ) = 1 if γ = γ̂, and D^m_γ φ(γ) = 0 otherwise. A more general understanding of this operator is gained by considering the following observation: when this operator is applied to a discrete binary/indicator function φ, the result has value 1 at every "corner" of the support of φ, and is zero elsewhere. This is illustrated in figure 3.
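The delta-function behavior of D^m_γ on box functions can be checked directly. The sketch below applies the 2-D mixed forward-difference operator, with zero padding standing in for values outside the domain, to a small box function; the implementation details are our own:

```python
import numpy as np

def D_gamma(phi):
    """Discrete mixed-partial operator D^2_gamma = (-1)^2 D^+_1 D^+_2,
    with phi extended by zero outside its rectangular domain."""
    p = np.pad(phi, ((0, 1), (0, 1)))     # zero-pad so forward differences
    d1 = p[1:, :] - p[:-1, :]             # exist at the far boundary
    d2 = d1[:, 1:] - d1[:, :-1]
    return d2                             # (-1)^2 = +1, so no sign flip

# A discrete box function with principle vertex at (2, 3):
# unity on [0,2] x [0,3], zero elsewhere.
phi = np.zeros((5, 6), dtype=int)
phi[:3, :4] = 1

delta = D_gamma(phi)
# D^2_gamma of a box function is a Kronecker delta at the principle vertex.
assert delta[2, 3] == 1 and delta.sum() == 1
```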

2.2. Reformulation as a Convex Problem. In this section, we describe a technique for reformulating the Markovian energy as a convex problem. We also present theoretical results proving the equivalence between the non-convex minimization problem and its convex analog.

We begin by specifying the function space over which we intend to minimize (7). We let u(x) range over the space of functions whose ith component lies in the set [0, Γ_i − 1]_d = {0, 1, 2, · · · , Γ_i − 1}. For this purpose, we define the space of discrete multi-label functions S_p = {u : Ω → Z^m | u_i(x) ∈ [0, Γ_i − 1]_d}, and consider the energy

(11)  E(u) = Σ_{x∈Ω} f(x, u(x)) + Σ_{i=1}^m Σ_{x∈Ω} Σ_{y∈N_x} w_{x,y} |u_i(x) − u_i(y)|.

The energy (11) can also be written in the condensed form

(12)  E(u) = Σ_{x∈Ω} f(x, u(x)) + Σ_{i=1}^m R_i(u_i),

where

(13)  R_i(u_i) = Σ_{x∈Ω} Σ_{y∈N_x} w_{x,y} |u_i(x) − u_i(y)|

is the regularizer of the ith component of u. Note that (11) is in general non-convex, and so may have a large number of local minima. Rather than attack this non-convex problem directly, we will reformulate this problem as a convex problem over the set of "lifted" functions S_l = {φ : Ω × Γ → {0, 1}}. Consider the lifted function

(14)  φ(x, γ) = Π_{i=1}^m 1I_{[0, u_i(x)]}(γ_i).

Note that, if we fix x, then the function φ_x(γ) = φ(x, γ) is a discrete box function. The support of φ(x, ·) is a "box" with a vertex at u(x). We thus have that D^m_γ φ(x, γ) = 1 if γ = u(x), and D^m_γ φ(x, γ) = 0 otherwise. This fact allows us to write the fidelity term in the energy (11) as follows:

Σ_{x∈Ω} f(x, u(x)) = Σ_{x∈Ω} Σ_{γ∈Γ} f(x, γ) D^m_γ φ(x, γ).

We now consider the regularizing term in (12). Because this regularizer has the form (13), we can write it in terms of the level sets of u using the coarea formula (10) as follows:

R_i(u_i) = Σ_{l=1}^{Γ_i−1} Σ_{x∈Ω} Σ_{y∈N_x} w_{x,y} |u_i^l(x) − u_i^l(y)|.

If we simply note that u_i^l(x) = 1I_{u_i(x) ≥ l} = φ(x, l e_i), we then have

R_i(u_i) = Σ_{l=1}^{Γ_i−1} Σ_{x∈Ω} Σ_{y∈N_x} w_{x,y} |φ(x, l e_i) − φ(y, l e_i)|.

When we put these pieces together, we have that E(u) = L_f(φ), where

(15)  L_f(φ) = Σ_{x∈Ω} Σ_{γ∈Γ} f(x, γ) D^m_γ φ(x, γ) + Σ_{i=1}^m Σ_{l=1}^{Γ_i−1} Σ_{x∈Ω} Σ_{y∈N_x} w_{x,y} |φ(x, l e_i) − φ(y, l e_i)|.

We have just shown that, provided that the relationship (14) holds, the non-convex energy (11) is equivalent to the convex lifted energy (15). The set of lifted functions satisfying the relationship (14) is important enough that we introduce the following definition.

Definition 1. We say that a binary function, φ : Ω × Γ → {0, 1}, satisfies the discrete box function condition (BFC) if φ_x(γ) = φ(x, γ) is a box function for every x ∈ Ω.
Note that a function φ satisfies a relationship of the form (14) if and only if it satisfies the box function condition. If φ does satisfy BFC, then we can recover the corresponding function u using the following multi-dimensional generalization of the layer cake formula:

(16)  u_i(x) = Σ_{l=1}^{Γ_i−1} φ(x, l e_i).

This observation suggests a method for finding the global minimum of (11): we could first find φ̂, the minimizer of the convex energy (15) over some appropriate function space. Provided that φ̂ satisfies the box function condition, this minimizer should correspond to some u ∈ S_p, which is the global minimizer of (11). But how can we ensure that φ̂ satisfies the box function condition? How can we even ensure that φ̂ is binary?
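The interplay between lifting and layer cake recovery can be sketched as follows for m = 2. The code follows relations (14) and (16) as stated above, and the helper names are our own:

```python
import numpy as np

def lift(u, Gammas):
    """Lift an m=2 vector label u = (u1, u2) at a single node to the
    box function phi(gamma) = 1 on [0, u1] x [0, u2]  (relation (14))."""
    g1, g2 = np.indices(Gammas)
    return ((g1 <= u[0]) & (g2 <= u[1])).astype(int)

def layer_cake(phi):
    """Recover the label via the generalized layer cake formula (16):
    u_i = sum over l >= 1 of phi(l * e_i)."""
    u1 = phi[1:, 0].sum()   # samples along the gamma_1 axis
    u2 = phi[0, 1:].sum()   # samples along the gamma_2 axis
    return (u1, u2)

# Round trip: every admissible label survives lift + recovery.
Gammas = (5, 4)
for a in range(Gammas[0]):
    for b in range(Gammas[1]):
        assert layer_cake(lift((a, b), Gammas)) == (a, b)
print("layer cake recovery verified")
```

Note that the recovery only samples φ along the coordinate axes of the label box, which is why a value can still be assigned at nodes where the full box structure fails.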
Before we answer these questions, we must clarify the function space we will use for minimization of the lifted energy (15). Ideally, we would like to minimize over the set of functions satisfying BFC, or even the set of binary functions. However, in order to guarantee that (15) has a unique minimizer, we must choose a convex function space. For this reason, we must relax the binary function condition, and allow our functions to take on values in the unit interval. We define the following space of functions:

Definition 2. We say that φ ∈ S_l^+ if the following two conditions are satisfied:
(1) 0 ≤ φ(x, γ) ≤ 1 for every x ∈ Ω and γ ∈ Γ;
(2) D^m_γ φ(x, γ) ≥ 0 for every x ∈ Ω and γ ∈ Γ.

We now prove two key results that clarify when the minimizers of (11) and (15) are equivalent.
Theorem 1. Let φ̂ minimize the lifted energy (15) over the set S_l^+, and suppose that, for α ∈ (0, 1), the thresholded function φ̂^α satisfies the box function condition. Then φ̂^α minimizes (15) over the set of BFC functions, and the function recovered from φ̂^α via (16) is a global minimizer of (11).

Proof. Suppose that φ̂ satisfies the hypothesis of the theorem. Choose some α ∈ (0, 1). We now wish to show that the function φ̂^α minimizes (15) over the set of all functions satisfying BFC. To this end, we decompose the energy (15) into an integral over the level sets of φ̂ using the discrete co-area formula (10). Applied to the regularization portion of (15), this formula tells us that

(17)  Σ_{i=1}^m Σ_{l=1}^{Γ_i−1} Σ_{x∈Ω} Σ_{y∈N_x} w_{x,y} |φ̂(x, l e_i) − φ̂(y, l e_i)| = ∫_0^1 Σ_{i=1}^m Σ_{l=1}^{Γ_i−1} Σ_{x∈Ω} Σ_{y∈N_x} w_{x,y} |φ̂^α(x, l e_i) − φ̂^α(y, l e_i)| dα.

We now consider the fidelity portion of the energy (15). Invoking the linearity of the operator D^m_γ, together with the layer cake representation φ̂ = ∫_0^1 φ̂^α dα, we get

(18)  Σ_{x∈Ω} Σ_{γ∈Γ} f(x, γ) D^m_γ φ̂(x, γ) = ∫_0^1 Σ_{x∈Ω} Σ_{γ∈Γ} f(x, γ) D^m_γ φ̂^α(x, γ) dα.

Using (17) and (18), we get

L_f(φ̂) = ∫_0^1 L_f(φ̂^α) dα.

Suppose for contradiction that there exists some BFC function, φ*, with L_f(φ*) < L_f(φ̂^α) for every α ∈ (0, 1). Then, since every BFC function lies in S_l^+,

L_f(φ*) = ∫_0^1 L_f(φ*) dα < ∫_0^1 L_f(φ̂^α) dα = L_f(φ̂) ≤ L_f(φ*),

which is impossible because of the optimality of φ̂. It follows that φ̂^α minimizes (15) over the set of BFC functions, and thus corresponds to a global minimizer of the original non-convex energy (11).
The above theorem states that we can obtain a global minimizer of (11) from the minimizer of (15), provided that a BFC condition is satisfied. In the case m = 1 (i.e. the case that u is scalar-valued), the minimizer of (15) must always satisfy BFC. This trivial observation is formalized below. Note that an almost identical version of this lemma is presented in [49]. The primary difference between these two lemmas is that our proof is greatly simplified because we restrict the solution to the space S_l^+.

Lemma 1. Suppose that m = 1, and let φ̂ minimize (15) over the set S_l^+. Then, for any α ∈ (0, 1), the thresholded function φ̂^α satisfies the box function condition.
Proof. Since m = 1, the definition of S_l^+ guarantees that D^1_γ φ̂ = −∂_γ φ̂ ≥ 0. This condition guarantees that φ̂ is monotonically decreasing in the γ direction. It follows that, for any x ∈ Ω, φ̂^α_x is an indicator function of an interval, and thus satisfies the one dimensional box function condition.
For m > 1, we do not at this time have a theoretical guarantee that the minimizer of (15) will satisfy BFC. This will be a subject of further research. However, in all of the numerical experiments presented below for m = 2, it was found that the minimizers did indeed satisfy the necessary box function condition. Furthermore, in the case that the BFC is not satisfied at each node, the node can still be assigned a value using the generalized layer cake formula (16).

Relationship to Continuous Models.
In this section, we explore the relationship between the convexification model presented above and techniques that have been proposed for continuous problems. We first consider the scheme of Pock et al. [51] described in the introduction. The method presented in [51] was originally only applied to scalar-valued minimizations, and so we restrict our discussion to the case m = 1. The Pock model relies on a lifted minimization of the form

(19)  min_{0≤φ≤1} Σ_{x∈Ω} Σ_{γ=0}^{Γ−1} ( f(x, γ) |D^1_γ φ(x, γ)| + Σ_{y∈N_x} w_{x,y} |φ(x, γ) − φ(y, γ)| ).

The model proposed above involves a minimization of the form

(20)  min_{φ∈S_l^+} Σ_{x∈Ω} Σ_{γ=0}^{Γ−1} ( f(x, γ) D^1_γ φ(x, γ) + Σ_{y∈N_x} w_{x,y} |φ(x, γ) − φ(y, γ)| ).

A more theoretically sound formulation of the method in the continuous case has been presented in [49]. This formulation has several notable differences from the continuous model presented above. In particular, the (continuous) set of labels has been expanded to include the whole real number line, rather than just a finite interval. In addition, the authors consider several other types of regularizers which are not considered here, such as Huber and quadratic regularizers.
The new model differs from the Pock model in two significant ways. First, the new model does not require the absolute value of the difference operator in the fidelity term. Second, the new model minimizes over the set S_l^+, which enforces that D^1_γ φ* = −∂_γ φ* ≥ 0. As discussed in lemma (1), this inequality constraint guarantees that the optimal function φ* decays monotonically in the γ direction. This condition is necessary for us to be able to recover the primal function u using the layer cake formula.
We next wish to note the similarity of our method to other methods that have been proposed in the continuous setting. For total-variation regularized problems, theoretical results have been derived using the so-called method of calibrations. Early work in this direction was done in [1], in which the authors generalize the method of calibrations to problems with jump discontinuities. This framework was further generalized by Mora [44], who presented a discontinuous calibration technique for vector valued problems. In this work, the author proposes to lift vector-valued functions defined over a continuous domain Ω onto functions defined over Ω × R m and taking values in the unit interval. The technique presented in this manuscript can be seen as a direct adaptation of this result to the discrete setting. Because we restrict ourselves to the case of discrete functionals, our formulation is considerably simpler than that presented in [44]. In particular, the continuous formulation requires one to decompose the derivative operator into continuous, jump discontinuity, and Cantor parts in order to handle the discontinuous functions on which it must act. The continuous formulation also requires a more sophisticated definition of the thresholding operator as a decomposition over Borel sets. In the discrete setting we have the luxury of working with difference operators rather than generalized derivatives, and so we do not have to handle these technicalities.

Numerical Results
We now present a numerical method for minimizing the convex energy (15), and results when this scheme is applied to the optical flow registration problem. The method we use is a variational-type method drawn from continuous optimization theory, and is a multi-dimensional generalization of the method presented in [51].

Why Not Graph Cuts?
The method presented above allows us to indirectly solve a non-convex problem (2) by solving the corresponding convex problem (15). However, before the minimizer of (15) may be used to recover u, we must apply a thresholding operator to convert it to a binary function. Our reason for choosing a variational approach to this problem (rather than constraining the solution to be binary valued) is that graph cuts cannot be applied to the convex problem (15) for m > 1.

Minimization of the Convex Problem.
In this section, we present a primal-dual technique for minimizing the convex functional (15). Techniques of this type were first used to solve TV regularized problems in [59]. The technique presented here is a generalization of the technique presented in [51] for scalar-valued problems.
Although the primal-dual approach can minimize an arbitrary energy of the form (15), we shall focus on TV regularized problems for clarity of presentation. Such problems have the form

L_f(φ) = Σ_{x∈Ω} Σ_{γ∈Γ} ( f(x, γ) D^m_γ φ(x, γ) + α Σ_{i=1}^2 |φ(x + e_i, γ) − φ(x, γ)| ),

which can be written more compactly as

(24)  L_f(φ) = Σ_{x∈Ω} Σ_{γ∈Γ} ( f(x, γ) D^m_γ φ(x, γ) + α |∇_x φ(x, γ)| ).

We derive the primal-dual minimization scheme by introducing the dual variable p = {p_x, p_γ} = {p_1, p_2, p_γ}. The derivation of this primal dual scheme can be easily understood by noting that the absolute value function can be written as |z| = max_{−1≤p≤1} pz. Using this idea, we can write

α |φ(x + e_i, γ) − φ(x, γ)| = max_{|p_i|≤α} p_i(x, γ) ( φ(x + e_i, γ) − φ(x, γ) ),  i = 1, 2.

Note also that, for any z ≥ 0, we have f z = max_{p_γ≤f} p_γ z, while the maximum is +∞ whenever z < 0. Using these new dual variables, the energy (24) can be written

(25)  L_f(φ) = max_{p∈P} Σ_{x∈Ω} Σ_{γ∈Γ} ( p_1 (φ(x + e_1, γ) − φ(x, γ)) + p_2 (φ(x + e_2, γ) − φ(x, γ)) + p_γ D^m_γ φ(x, γ) ),

where the dual variable may range over the set

P = { p = (p_1, p_2, p_γ) : |p_1(x, γ)| ≤ α, |p_2(x, γ)| ≤ α, p_γ(x, γ) ≤ f(x, γ) }.

Note the absence of a lower bound for p_γ. This is a key feature of our formulation, since we now have L_f(φ) = ∞ if the constraint D^2_γ φ(x, γ) ≥ 0 is violated. Thus, by introducing p_γ, we have formulated the lifted energy so that any feasible function must satisfy this inequality constraint.
Minimizing the energy (24) then corresponds to solving the following min-max problem:

(26)  min_{0≤φ≤1} max_{p∈P} Σ_{x∈Ω} Σ_{γ∈Γ} ( p_1 (φ(x + e_1, γ) − φ(x, γ)) + p_2 (φ(x + e_2, γ) − φ(x, γ)) + p_γ D^m_γ φ(x, γ) ).

Using summation by parts, this problem can be written in the equivalent form

(27)  min_{0≤φ≤1} max_{p∈P} Σ_{x∈Ω} Σ_{γ∈Γ} φ(x, γ) ( (D^m_γ)* p_γ(x, γ) − div_x p(x, γ) ).

In the above formula, the discrete divergence operator is defined using backward differences as

div_x p(x, γ) = p_1(x, γ) − p_1(x − e_1, γ) + p_2(x, γ) − p_2(x − e_2, γ),

and the adjoint of the mixed partial difference operator is

(D^m_γ)* = D^−_1 D^−_2 · · · D^−_m,

where D^−_i denotes the first order backward difference operator in the ith direction. To compute the solution to the saddle-point problem (26), we use the 2-step gradient projection algorithm proposed by Esser et al. [27]. On the first step, we minimize with respect to φ using a gradient descent step with timestep τ_φ. In the second step, we maximize with respect to p using a gradient ascent step with timestep τ_p. After each step we reproject the variables back into their corresponding feasible sets. We formalize these two steps below.
(1) Primal Step (Gradient Descent):

φ^{k+1} = P_{[0,1]} ( φ^k − τ_φ ( (D^m_γ)* p_γ^k − div_x p^k ) ).

(2) Dual Step (Gradient Ascent):

p_i^{k+1} = P_{[−α,α]} ( p_i^k + τ_p D^+_{x,i} (2φ^{k+1} − φ^k) ),  i = 1, 2,

p_γ^{k+1} = min{ p_γ^k + τ_p D^m_γ (2φ^{k+1} − φ^k), f }.

Here, P_S denotes pointwise reprojection onto the set S, and D^+_{x,i} denotes the forward difference in the spatial direction e_i. Note that, during the dual ascent step, we replace φ^{k+1} with (2φ^{k+1} − φ^k) when computing the derivative. The authors of [27] refer to this variant of the primal dual algorithm as PDHGMu, and prove convergence of the method for sufficiently small time steps. As numerics is not the primary focus of this paper, we do not consider here a formal convergence analysis of the above algorithm. Convergence proofs for algorithms of this type have been presented in [27,49,13]. For m = 2, we have found the method to be convergent for τ_φ τ_p < 1/4.
The numerical examples below were computed using τ_φ = τ_p = 1/2. An alternative derivation of this method using a proximal-point technique can be found in [51]. Note that our algorithm differs from that presented in [51] in that the dual variable p_γ^{k+1} is reprojected into the interval (−∞, f(x, γ)] rather than [−1, 1]. We have found the primal-dual approach very efficient and reliable for the convex problem (24).
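To illustrate the scheme, the following sketch applies the primal-dual iteration to a tiny scalar-valued (m = 1) lifted problem on a 1-D domain and compares the thresholded result against exhaustive search. The step sizes, iteration count, initialization, and problem data are illustrative choices of ours, not the settings used in the experiments:

```python
import itertools
import numpy as np

def lifted_pdhg(f, w=1.0, tau_phi=0.25, tau_p=0.25, iters=5000):
    """Primal-dual minimization of the lifted energy for a scalar (m = 1)
    problem on a 1-D domain: E(u) = sum_x f[x, u(x)] + w * sum |u(x+1)-u(x)|.
    f has shape (n, G).  Step sizes and iteration count are illustrative."""
    n, G = f.shape
    phi = np.ones((n, G)); phi[:, 1:] = 0.5    # phi(x, 0) fixed at 1
    px = np.zeros((n - 1, G))                  # spatial dual, |px| <= w
    pg = np.zeros((n, G))                      # fidelity dual, pg <= f
    for _ in range(iters):
        # K^T p: adjoint of (spatial forward difference, D^1_gamma).
        div = np.zeros((n, G))
        div[:-1] -= px; div[1:] += px               # adjoint of fwd diff
        adj = pg.copy(); adj[:, 1:] -= pg[:, :-1]   # (D^1_gamma)^* pg
        phi_old = phi
        phi = np.clip(phi - tau_phi * (div + adj), 0.0, 1.0)
        phi[:, 0] = 1.0                             # boundary condition
        bar = 2 * phi - phi_old                     # over-relaxed primal
        # Dual ascent with reprojection.
        px = np.clip(px + tau_p * (bar[1:] - bar[:-1]), -w, w)
        Dg = bar.copy(); Dg[:, :-1] -= bar[:, 1:]   # D^1_gamma bar
        pg = np.minimum(pg + tau_p * Dg, f)         # project onto (-inf, f]
    return (phi[:, 1:] >= 0.5).sum(axis=1)          # threshold + layer cake

# Compare with exhaustive search on a tiny instance.
g = np.array([0, 0, 0, 3, 3, 3])
labels = np.arange(4)
f = (labels[None, :] - g[:, None]) ** 2.0
energy = lambda u: f[np.arange(6), u].sum() + np.abs(np.diff(u)).sum()
u_star = min(itertools.product(range(4), repeat=6),
             key=lambda u: energy(np.array(u)))
u_pd = lifted_pdhg(f)
assert energy(u_pd) == energy(np.array(u_star))
```

For this scalar problem the thresholded iterate attains the exact global minimum, as lemma (1) and the thresholding theorem predict; the fidelity dual pg is the variable whose one-sided projection enforces the monotonicity constraint.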

3.3. Examples. We now consider several examples which illustrate the effectiveness of our global optimization approach. We will consider synthetic examples, as well as stereo and object tracking examples from the Middlebury optical flow evaluation dataset [3].
The purpose of the synthetic examples is two-fold. First, these synthetic problems are pathologically difficult, and demonstrate the ability of our method to find the global minimum, even in the presence of many local minima. Second, these problems demonstrate the importance of the positivity constraint, D^2_γ φ ≥ 0. The introduction of this constraint is a key difference between the method presented here and previous work, and we wish to demonstrate its importance.
We first consider the synthetic stereo matching problem, depicted in figure (4). Note that this is a scalar-valued minimization problem because only horizontal shifts are considered. The top and bottom images in figure (4) depict the original and shifted image, respectively. For simplicity, periodic boundary conditions have been used for this problem. This is a pathologically difficult stereo matching problem due to the presence of many local minima, which occur whenever any of the rectangular outlines of the two images are aligned. Random initialization is used for the variables in the primal dual algorithm. The method presented in this paper perfectly maps the shifted image onto the original after 140 iterations of the primal-dual algorithm, and thus exactly recovers the global minimum. It has been shown in [49] that the positivity constraint is not strictly necessary in the scalar valued case. However, the unconstrained algorithm converged in 155 iterations, slightly slower than the constrained version that we propose.

Figure 5. Two images taken from the "Tsukuba" stereo matching dataset. These images were taken from slightly different positions, and the deformation field relating them can be interpreted as a depth map.
We demonstrate the performance of this algorithm on stereo matching data using the "Tsukuba" images from the Middlebury optical flow benchmark, depicted in figure 5. Figure 6 shows the resulting depth/displacement field map when the convex algorithm is used. This result was recovered in 454 iterations using our constrained formulation; the unconstrained formulation required 531 iterations to reach the same result.

We next consider the vector-valued analog of the above synthetic example: an optical flow registration problem involving both horizontal and vertical translations. Results are depicted in figure (7). The source image has been obtained from the target image by shifting the image in both the horizontal and vertical directions. For simplicity, periodic boundary conditions have been used for this problem. Like the stereo problem, this registration problem has many local minima, which occur whenever any of the rectangular outlines in the source and target images are aligned. Despite this fact, our algorithm obtains the global minimum after 800 iterations of the primal-dual algorithm.

To demonstrate the necessity of the non-negativity constraint, D²γ ≥ 0, we compare our convex formulation to the 2-dimensional analog of the unconstrained model. In this formulation, the mixed partial term is placed within an absolute value, as is done in (19), and minimization is carried out without an inequality constraint. Without the inequality constraint, the algorithm fails to find a global minimizer; furthermore, its results depend on the initialization. By introducing the positivity constraint, our model finds the exact global minimizer, even when a random initialization is used.
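The jump from the scalar to the vector-valued setting is easy to see in the data term: each pixel now carries a 2-D label (u, v), so the cost becomes a 4-D array. The following sketch builds that cost under the same illustrative assumptions as before (absolute-difference cost, periodic shifts via `np.roll`); the function name is hypothetical.

```python
import numpy as np

def flow_cost_volume(I0, I1, shifts):
    """4-D data cost c(x, y, u, v) = |I0 - I1 shifted by (u, v)|
    over a 2-D label grid of displacements, with periodic boundary
    conditions. Returns an (H, W, |shifts|, |shifts|) array."""
    H, W = I0.shape
    L = len(shifts)
    costs = np.empty((H, W, L, L))
    for i, u in enumerate(shifts):
        for j, v in enumerate(shifts):
            # shift I1 by (u, v) with periodic wrap, compare to I0
            costs[:, :, i, j] = np.abs(
                I0 - np.roll(np.roll(I1, -v, axis=0), -u, axis=1))
    return costs
```

Because the label space is now two-dimensional, a scalar-valued lifting cannot index these costs with a single γ; this is precisely the situation the vector-valued generalization is designed to handle.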
We now apply the global minimization algorithm to two video tracking problems from the Middlebury benchmark dataset. Depicted in figure 8 are two images of a traffic intersection taken from the "dump truck" dataset. Figure 10 depicts the "basketball" dataset. The optical flow model (8) was applied to both datasets using p = q = α = 1. The recovered displacement fields are shown in figures (9) and (11). Note that the vehicles correspond to regions of motion, while the background environment remains fixed. Note also that there are few speckle artifacts present in the displacement field because the global method is robust to noise.

Figure 7. Registration using an analog of the algorithm presented in [51] without positivity constraint, using random initialization, after 10000 iterations. If the proper global minimum were found, this image, which represents I_1(x + u), should be identical to I_0. For this problem, registration using our model with positivity constraint perfectly registers the source image onto the target after 800 iterations, even with random initialization.

Figure 8. Two images of a traffic intersection taken at different times. The cars have moved, while the environment remains fixed. Using the global minimization algorithm, we can recover the displacement field describing the movement of the cars.
Finally, we wish to emphasize that the primary limitation of this method is memory usage. By adding 2 new dimensions to the registration problem, the size of the problem is dramatically increased. For this reason, lifting is not a practical approach for problems involving high-resolution images or large displacements.

Figure 9. Displacement field for the "dump truck" images, recovered using the global minimization scheme.

Figure 10. Two images from the "basketball" dataset. Using the global minimization algorithm, we can track the motion of the players and the ball.
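A back-of-the-envelope calculation makes the memory limitation concrete. The helper below is hypothetical and counts only a single float32 field over the image grid times the 2-D label grid; an actual implementation would store several such fields (primal, dual, extrapolated), so real usage is a small multiple of this figure.

```python
def lifted_memory_gb(height, width, labels_u, labels_v, bytes_per_value=4):
    """Rough size in GB of one lifted variable phi(x, y, u, v):
    a single float32 array over the image grid (height x width)
    and the 2-D displacement label grid (labels_u x labels_v)."""
    return height * width * labels_u * labels_v * bytes_per_value / 1e9
```

For example, a 512 x 512 image with a 64 x 64 grid of displacement labels already requires about 4.3 GB per lifted field, which illustrates why high resolutions and large displacement ranges are out of reach.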

Conclusion
We have presented a technique for minimizing non-convex, vector-valued energy functions defined on Markov random fields using a functional lifting approach. This technique can be viewed as an extension of previous work on functional lifting. However, while previous methods can only handle the case of scalar-valued functions, our technique can be used to minimize over vector-valued functions. This level of generality allows us to solve a broad class of non-convex problems, such as the image registration problem considered in this paper. There are many other areas where optimization of vector-valued functions becomes important; two such examples are multiphase segmentation problems [13,49] and diffusion tensor imaging. The application of our convexification technique to these problems will be a subject of future research.

Figure 11. Displacement field for the "basketball" dataset, recovered using the global minimization scheme.