As-Aligned-As-Possible Image Stitching Based on Deviation-Corrected Warping With Global Similarity Constraints

Handling local misalignment caused by local warping remains a common and challenging task in image stitching. Moreover, the stitched image is prone to ghosting because of viewpoint variations between the input images. To solve the problem of local misalignment, we propose a projection deviation-corrected local warping method with a global similarity constraint for image stitching. Recent work shows that warps guided by a local mesh effectively improve the accuracy of image alignment, and that the geometric projection deviation can be used to accurately correct pixel offsets in image warping. To correct pixel offsets, we first remove outliers from the matching points using a normal distribution model. The retained matches are more precise and improve the accuracy of image alignment. Next, we warp the images with a local warping model that combines local homography and global similarity. To further address the misalignment caused by local warping, we describe the local projection deviation of the local warping model with a three-dimensional mesh interpolation model. Finally, the warped images are blended by a linear smoothing model. Experimental results show that our method outperforms state-of-the-art methods in alignment accuracy and also provides better visual results on challenging images.


I. INTRODUCTION
Image stitching, which combines multiple images into a wide-angle panorama that retains the information of the original images [1], is among the most widely used techniques in computer vision, with applications such as panorama generation and surveillance [2] and virtual reality [3]. The natural stitching of parallax images remains a challenging task [4]. Conventionally, the first stage in image stitching is to determine the warp of each image and transform it into a common coordinate system [5]-[6]. Then, the warped images are blended into a natural panorama by linear weighting.
Early methods focused on global alignment of images, a representative example of which is AutoStitch [7]. A global homography is incorrect except under special conditions such as a roughly coplanar scene, and otherwise produces artifacts such as ghosting or broken image structures. To address the model inadequacy of global warps, several local warp models have been proposed, such as the smoothly varying affine (SVA) warp [8] and the as-projective-as-possible (APAP) warp [9]. These methods calculate multiple local warps for the image to achieve better alignment accuracy, but they only work for images with moderate parallax.
Recently, several approaches have attempted to define spatially varying warping parameters that align the input images and alleviate distortion and the limited field of view in stitched images while maintaining good alignment quality. The shape-preserving half-projective (SPHP) warp [10] and the adaptive as-natural-as-possible (ANAP) warp [11] improve the perspective in the non-overlapping regions using a combination of local homography and global similarity transformations. Moreover, the content-preserving warping (CPW) method is also used to overcome the shortcomings of methods using global homography [12]. A series of direct warping approaches have been proposed to obtain a natural stitched image [13]-[16]. These methods apply geometric constraints directly to local regions of the image to guide mesh deformation, and they can be combined with seam cutting to cope with large-parallax images. However, multiple constraint terms must be optimized simultaneously.
In addition to the above models, non-rigid warping methods that directly compute the deformation function on the image plane, such as thin plate splines (TPS) [4], are also applied to image warping [17]-[20]. These methods use TPS to warp the image along a smooth surface that passes through fixed anchor points, thereby aligning the images. The warp is extremely sensitive to matching errors: outliers in the matching data can cause severe distortion. Moreover, little research has focused on the removal of outliers in matching points after image alignment [4].
To deal with the problems discussed above, this paper proposes an image stitching method that combines several techniques. Our work is inspired by Zaragoza's local projection [11] and Li's elastic warping [4], namely the ideas that local homography transformation and projection correction can effectively improve image alignment. Our method builds on the observation that misalignment caused by projection deviation still occurs after local warping. The 3D smooth surface, a deformation field interpolated from the deviation between each feature point and its projected point, is the key to eliminating the projection deviation and aligning the images. Computing this smooth surface requires reliable matches, so we propose a feature point refinement model based on the normal distribution that adaptively removes local outliers from the matching data. In addition, we propose a warping method based on local homography and global similarity that introduces a 3-dimensional interpolation term to eliminate warping deviation and optimize image alignment. The model reduces visual artifacts caused by misalignment that are difficult for traditional warps such as APAP [9], ANAP [11], GSP [15], and REW [4] to handle.
The remainder of this paper is organized as follows. Section II briefly discusses recent work related to the processing of parallax images. Section III describes in detail our approach of projection deviation correction based on local warping. Experimental results are shown and discussed in Section IV. Finally, Section V concludes our work.

II. RELATED WORK
Image stitching has been well studied, and a comprehensive overview can be found in [1]. The global homography works well [7] under the assumption that the scene in the input images is roughly coplanar. Gao et al. [21] proposed the dual-homography warp (DHW) to address scenes with two dominant planes by a linear combination of two homographies. Since a few homography transformations cannot account for parallax, these methods have difficulty handling more complicated scenes.
Lin et al. [8] proposed the smoothly varying affine (SVA) field, which allows local deformations to handle parallax flexibly. Zaragoza et al. [9] proposed the as-projective-as-possible (APAP) warp, which uses a moving direct linear transformation (MDLT) to assign a local homography to each mesh cell for better local alignment. Both [8] and [9] adopt multiple local warping models for better alignment accuracy.
After better alignment had been obtained, several methods focused on the distortion problem in the stitched image. Chang et al. [10] proposed the shape-preserving half-projective (SPHP) warp, which smoothly turns a homography into a global similarity; it maintains good alignment in overlapping regions and preserves the viewing angle in non-overlapping regions without distortion. The method uses the global homography to derive the global similarity transform, which causes unnatural rotation. Lin et al. [11] proposed the adaptive as-natural-as-possible (ANAP) warp, which linearizes the homography and combines it with a global similarity transformation to solve the unnatural rotation of SPHP by minimizing the rotation angle. However, the problem of image misalignment still exists.
Other methods model the warp as mesh deformation by energy minimization, applying geometric constraints to guide the mesh deformation and improve naturalness. Lin et al. [14] proposed a seam-guided local alignment (SEAGULL) method [18] based on structure-preserving warping, which effectively preserves line structures during warping. Chen [15] proposed the global-similarity-prior (GSP) warp based on APAP pre-matching, which constrains the warp to resemble a similarity as a whole. Zhou et al. [20] proposed a vector-field interpolation method for nonrigid image deformation, which calculates the spatial transformation of each pixel with the TPS function. Li et al. [4] proposed a parallax-tolerant image stitching method based on global homography, which uses the TPS function to calculate a global deformation vector field that alleviates the misalignment of the global homography in the overlapping region.

A. REFINEMENT OF FEATURE MATCHES
Feature-based image stitching methods typically use the scale-invariant feature transform (SIFT) [22] combined with random sample consensus (RANSAC) [23] to filter feature points. This matching strategy performs well. However, some hard-to-detect wrong matches are inevitably still accepted as correct. Reference [4] removes some mismatches by using the coefficient distribution of TPS as a criterion for rejecting outliers. As the red circles in Fig. 1 show, that method removes wrong points but also deletes some correct points that should be retained. These points are particularly precious for low-texture images. Following the assumption of [4], we assume that the projection deviation of the matching points obeys a normal distribution. Rather than relying on the coefficient distribution of TPS, we predict the correctness of each match directly from its projection deviation to avoid erroneous predictions.
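Consider the reference image I1 and the target image I2 with overlapping regions. In the homogeneous coordinate system, $p = [x, y, 1]^T$ in I1 and $q = [u, v, 1]^T$ in I2 are matching points of the overlapping region, the homography H is estimated by DLT from the matching points, and $p' = [x', y', 1]^T$ is the projective point of p in I2 estimated by MDLT [9]. The projection deviation of the matching points, that is, the offset between p' and q, is decomposed into the x and y directions as Dx and Dy, and the mean (μx, μy) and standard deviation (σx, σy) of Dx and Dy are calculated, respectively.
Given a pair of matching points {pi, qi}, the projection deviations Dxi and Dyi are calculated in the same way.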
Let the events Axi = {|Dxi − μx| < nσx} and Ayi = {|Dyi − μy| < nσy} denote that the projection deviations in the x and y directions, respectively, fall within their thresholds. The probabilities of Axi and Ayi satisfy
$$P(A_{xi}) = P(A_{yi}) = \Phi(n) - \Phi(-n) = 2\Phi(n) - 1, \tag{2}$$
where Φ(·) is the standard normal distribution function.
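As a quick numerical check of the probability of the events above, the following minimal sketch evaluates P(|D − μ| < nσ) for a normally distributed deviation. The function names are illustrative and do not come from the paper.

```python
import math


def std_normal_cdf(z: float) -> float:
    """Standard normal distribution function Phi(z), via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def inlier_probability(n: float) -> float:
    """P(|D - mu| < n * sigma) when D is normally distributed."""
    return std_normal_cdf(n) - std_normal_cdf(-n)


for n in (2, 3):
    p_in = inlier_probability(n)
    print(f"n = {n}: inside {p_in:.4%}, outside {1.0 - p_in:.4%}")
```

For n = 3 the probability of falling outside the interval is about 0.27%, which is why a deviation beyond three standard deviations can be treated as a small-probability event.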
According to the three-sigma rule, if the projection deviation Dxi or Dyi of a feature point falls outside the interval (μ − nσ, μ + nσ) with n > 2, this is a small-probability event with a probability of less than 5%. Generally, we set n = 3, for which the probability of this event is only 0.27%. In that case, we regard {pi, qi} as an outlier and remove it from the matching data. The matching result based on the normal distribution is marked by the black triangles in Fig. 1.
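The refinement rule can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the deviation is taken as p' − q (the sign convention is assumed), the statistics are computed in a single pass over all matches, and `refine_matches` is a hypothetical name.

```python
import numpy as np


def refine_matches(projected, matched, n=3.0):
    """Return a boolean inlier mask using the n-sigma rule on each axis.

    projected: (N, 2) projected points p' in the target image.
    matched:   (N, 2) matched points q in the target image.
    """
    # Assumed sign convention: deviation = p' - q.
    deviation = np.asarray(projected, float) - np.asarray(matched, float)
    mu = deviation.mean(axis=0)        # (mu_x, mu_y)
    sigma = deviation.std(axis=0)      # (sigma_x, sigma_y)
    inside = np.abs(deviation - mu) < n * sigma
    # A match is kept only if both its x and y deviations are inside the interval.
    return inside.all(axis=1)


# Synthetic example: 200 matches with unit-variance noise and 5 gross mismatches.
rng = np.random.default_rng(0)
q = rng.uniform(0.0, 500.0, size=(200, 2))
p_proj = q + rng.normal(0.0, 1.0, size=(200, 2))
p_proj[:5] += 40.0
mask = refine_matches(p_proj, q)
print("rejected:", np.flatnonzero(~mask))
```

Because the mean and standard deviation are themselves affected by gross outliers, a practical variant might repeat the test on the retained matches; that choice is left open here.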

B. LOCAL WARPING
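A homography H is estimated by solving the linear equation shown in (3):
$$q \sim Hp, \tag{3}$$
where ~ denotes equality up to a scale factor and $H \in \mathbb{R}^{3 \times 3}$. The homography transformation of (3) can be expressed as
$$0_{3 \times 1} = q \times Hp = \begin{bmatrix} 0_{1 \times 3} & -p^T & v\,p^T \\ p^T & 0_{1 \times 3} & -u\,p^T \\ -v\,p^T & u\,p^T & 0_{1 \times 3} \end{bmatrix} h, \tag{4}$$
where $h \in \mathbb{R}^{9}$ stacks the rows of H. Let $a_i \in \mathbb{R}^{2 \times 9}$ denote the first two rows of (4) for the i-th match, i = 1, 2, …, N. We can formulate (4) as the linear problem called DLT:
$$\hat{h} = \arg\min_{h} \sum_{i=1}^{N} \|a_i h\|^2 = \arg\min_{h} \|Ah\|^2, \quad \text{s.t. } \|h\| = 1. \tag{5}$$
The image is divided into C1×C2 grids, with the grid vertices expressed in the homogeneous coordinate system. The local homography at an arbitrary position $p_*$ in I1 is estimated by the MDLT framework [9] as
$$h_* = \arg\min_{h} \sum_{i=1}^{N} \|w_*^i a_i h\|^2 = \arg\min_{h} \|W_* A h\|^2, \quad \text{s.t. } \|h\| = 1, \tag{6}$$
where $w_*^i = \max(\exp(-\|p_* - p_i\|^2 / \delta^2), \gamma)$, δ is a scale parameter, and γ ∈ [0, 1] is a lower bound on the weights. Problem (6) can be solved by the smallest right singular vector of $W_* A$. After the local homographies are obtained, the image I1 can be projected onto the image I2 to align the images. However, this distorts the image in the non-overlapping region. Following ANAP, a global similarity transformation is introduced to address the distortion and preserve the viewing angle.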
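The weighted DLT step can be illustrated with a short NumPy sketch. It follows the moving DLT idea of [9] as commonly described: each match is weighted by a Gaussian of its distance to the query position, the weights are clamped from below, and the homography is read from the smallest right singular vector. The names `dlt_rows` and `moving_dlt` are hypothetical, and using 8.5 and 0.1 as the default scale and lower bound is an assumption based on the parameter values listed in Section IV.

```python
import numpy as np


def dlt_rows(p, q):
    """Two DLT rows for one match p = (x, y) -> q = (u, v)."""
    x, y = p
    u, v = q
    ph = np.array([x, y, 1.0])
    zero = np.zeros(3)
    return np.vstack([
        np.hstack([zero, -ph, v * ph]),
        np.hstack([ph, zero, -u * ph]),
    ])


def moving_dlt(src, dst, centre, scale=8.5, offset=0.1):
    """Estimate the homography at `centre` by distance-weighted DLT."""
    src = np.asarray(src, float)
    dst = np.asarray(dst, float)
    A = np.vstack([dlt_rows(p, q) for p, q in zip(src, dst)])   # shape (2N, 9)
    d2 = np.sum((src - np.asarray(centre, float)) ** 2, axis=1)
    w = np.maximum(np.exp(-d2 / scale ** 2), offset)            # one weight per match
    W = np.repeat(w, 2)                                         # one weight per row of A
    _, _, vt = np.linalg.svd(W[:, None] * A)
    H = vt[-1].reshape(3, 3)                                    # smallest right singular vector
    return H / H[2, 2]                                          # fix the free scale
```

In a full pipeline this function would be evaluated once per mesh cell, with `centre` set to the cell centre; normalizing the point coordinates beforehand usually improves conditioning but is omitted here for brevity.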
The optimal global similarity Sg is also calculated by DLT. The homography and similarity are linearly combined into a projection matrix as shown in (7).
The position of each pixel of the image I1 is projected onto the warp mesh W1 by the respective conversion matrices $T^r_{ij}$.
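To make the similarity step concrete, the sketch below fits a global similarity (scale, rotation, translation) to point matches and blends it linearly with a homography. Two assumptions are made loudly: the fit uses an ordinary linear least-squares formulation rather than the DLT form mentioned above, and the blending weight `mu_h` in `combine` is a hypothetical parameter, since the exact weights of the combination are not given in this section.

```python
import numpy as np


def fit_similarity(src, dst):
    """Least-squares similarity transform mapping src points to dst points."""
    src = np.asarray(src, float)
    dst = np.asarray(dst, float)
    x, y = src[:, 0], src[:, 1]
    ones, zeros = np.ones_like(x), np.zeros_like(x)
    # Model: u = a*x - b*y + tx,  v = b*x + a*y + ty, unknowns (a, b, tx, ty).
    M = np.vstack([
        np.column_stack([x, -y, ones, zeros]),
        np.column_stack([y, x, zeros, ones]),
    ])
    rhs = np.concatenate([dst[:, 0], dst[:, 1]])
    a, b, tx, ty = np.linalg.lstsq(M, rhs, rcond=None)[0]
    return np.array([[a, -b, tx], [b, a, ty], [0.0, 0.0, 1.0]])


def combine(H, S, mu_h):
    """Hypothetical linear blend mu_h * H + (1 - mu_h) * S of two 3x3 matrices."""
    return mu_h * np.asarray(H, float) + (1.0 - mu_h) * np.asarray(S, float)
```

With `mu_h = 1` the blend reduces to the homography and with `mu_h = 0` to the similarity; both matrices are assumed to be scaled so that their bottom-right entry is 1 before blending.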

C. LOCAL ALIGNMENT
Radial basis functions (RBFs) are powerful tools for image alignment [24], and a variety of RBFs have been designed to better align images. The thin plate spline (TPS) was used early on for image alignment [17]. TPS can be decomposed into a global affine transformation, controlled by a global affine matrix, and a local nonrigid bending function, controlled by its coefficients [20]. TPS warps with minimal bending energy to create a smooth surface containing the given anchor points. Bookstein [17] defined the corresponding energy function Ef. The TPS function has the form shown in (10), and its parameters can be calculated by solving the linear relation (11):
$$v = a_1 + a_2 x + a_3 y + \sum_{i=1}^{N} \omega_i\, \phi\big(\|(x_i, y_i) - (x, y)\|\big), \tag{10}$$
$$\begin{bmatrix} K + \lambda E & P \\ P^T & 0_{3 \times 3} \end{bmatrix} \begin{bmatrix} \omega \\ a \end{bmatrix} = \begin{bmatrix} V \\ 0_{3 \times 1} \end{bmatrix}, \tag{11}$$
where $a = [a_1, a_2, a_3]^T$ and $\omega_i$ are parameters, v is the function value at (x, y), $\phi(r) = r^2 \ln r$ is the radial basis function, $K_{ij} = \phi(\|(x_i, y_i) - (x_j, y_j)\|)$ is the element of K, the i-th row of P is $[1, x_i, y_i]$, $v_i$ is the element of V, E is the N-order identity matrix, and λ is the regularization parameter.
TPS works well for interpolating 3D surfaces: given common anchor points, it can interpolate the pixels of an image to specific locations to align the images. We use TPS to calculate the deviation at an arbitrary position in the overlap region, extended by a portion of the non-overlapping region for linear transitions [4]. We do not use TPS to warp one image directly onto the other.
We substitute the projection deviation PD at the matching points into (11) to calculate the corresponding parameters, and then use (10) to obtain the deviation functions of the image in the x and y directions at (x, y), respectively. The projection deviation of the warp mesh W1 is eliminated by the alignment term as shown in equation (12). Specifically, $[M_x, M_y]$ is the index of the warp mesh in the x and y directions.
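The interpolation of the deviation field can be sketched as follows. This is a minimal illustration under assumptions, not the authors' code: it uses the common TPS kernel φ(r) = r² ln r, solves the usual regularized TPS linear system once per axis, and evaluates the fitted function at arbitrary positions such as mesh vertices. The names `tps_kernel`, `fit_tps`, and `eval_tps` are hypothetical.

```python
import numpy as np


def tps_kernel(r):
    """phi(r) = r^2 ln r, with phi(0) defined as 0."""
    out = np.zeros_like(r)
    nz = r > 0
    out[nz] = r[nz] ** 2 * np.log(r[nz])
    return out


def fit_tps(anchors, values, lam=0.0):
    """Fit TPS parameters (omega, a) so that the surface follows `values` at `anchors`."""
    anchors = np.asarray(anchors, float)
    values = np.asarray(values, float)
    n = len(anchors)
    K = tps_kernel(np.linalg.norm(anchors[:, None] - anchors[None, :], axis=2))
    P = np.hstack([np.ones((n, 1)), anchors])
    system = np.zeros((n + 3, n + 3))
    system[:n, :n] = K + lam * np.eye(n)   # lam > 0 smooths instead of interpolating
    system[:n, n:] = P
    system[n:, :n] = P.T
    rhs = np.concatenate([values, np.zeros(3)])
    sol = np.linalg.solve(system, rhs)
    return sol[:n], sol[n:]


def eval_tps(points, anchors, omega, a):
    """Evaluate the fitted TPS function at `points` of shape (M, 2)."""
    points = np.asarray(points, float)
    anchors = np.asarray(anchors, float)
    U = tps_kernel(np.linalg.norm(points[:, None] - anchors[None, :], axis=2))
    return a[0] + points @ a[1:] + U @ omega


# Deviation in x at 20 synthetic matches, evaluated at the four corners of a mesh.
rng = np.random.default_rng(3)
anchors = rng.uniform(0.0, 100.0, size=(20, 2))
dev_x = rng.normal(0.0, 1.0, size=20)
omega, a = fit_tps(anchors, dev_x, lam=0.0)
corners = np.array([[0.0, 0.0], [100.0, 0.0], [0.0, 100.0], [100.0, 100.0]])
print(eval_tps(corners, anchors, omega, a))
```

The same fit would be repeated for the y deviations; how the two interpolated fields are then applied to the warp mesh is defined by the alignment term of the paper and is not reproduced here.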
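Here, the corrected W1 is the index of the aligned pixel. The index of each pixel in the image I2 is warped to W2 by the warp mesh W1 and the conversion matrix $T^t_{ij}$. The warped images $I^r$ and $I^t$ of the images I1 and I2 are obtained by backward interpolation with the bilinear interpolation function, and are then linearly fused [11].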

IV. EXPERIMENTAL RESULTS AND ANALYSIS
A series of challenging images from various data sets were tested to evaluate the performance of the proposed method, and only a few are listed in this paper. The compared methods include APAP [9], ANAP [11], GSP [15], and REW [4]. Feature points are matched with the VLFeat library [25] and the RANSAC algorithm. The parameters are consistent with the settings of the other methods. We use the source code provided by the authors of each paper to obtain the comparison results. The parameters in our paper are: n = 3, C1 = C2 = 100, δ = 8.5, γ = 0.1, and λ is the mean of the projection deviation. All tested methods are run in the same experimental environment with a 3.8 GHz Intel i5 CPU and 8 GB RAM. The evaluation covers: A, match point alignment performance; B, comparison of stitching quality; and C, comparison of stitching time costs.

A. MATCH POINT ALIGNMENT PERFORMANCE
For the matching alignment accuracy, the accumulated projection deviations in the x and y directions (APDiTxD and APDiTyD), in pixels, are calculated on different data sets. Table 2 compares the projection deviations of the matching points with those of the method in [4]; our method produces a smaller cumulative deviation, which means that our projection relationship is more accurate. Our cumulative projection deviation is smaller than that of [4] in randomly chosen examples (not limited to those in Table 2). For example, for the image castle, our cumulative projection deviation is 35.3% and 46.0% of that of [4] in the x and y directions, respectively. We attribute this to using local homography to calculate the projection of feature points and to adaptively removing outliers with the normal distribution model while retaining more useful inliers.
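A possible way to compute such a metric is sketched below. The exact definition of the accumulated projection deviation is not spelled out in this section, so the sketch assumes it is the sum of absolute per-axis offsets between projected and matched points; the function name is hypothetical.

```python
import numpy as np


def accumulated_projection_deviation(projected, matched):
    """Sum of absolute x and y offsets (in pixels) over all matches.

    Assumed definition: sum_i |p'_i - q_i| taken separately per axis.
    """
    dev = np.abs(np.asarray(projected, float) - np.asarray(matched, float))
    return dev.sum(axis=0)   # array([total_x, total_y])


# Tiny worked example with two matches.
apd = accumulated_projection_deviation([[1.0, 2.0], [3.0, 5.0]],
                                       [[0.0, 0.0], [4.0, 4.0]])
print(apd)
```

Dividing the totals of one method by those of another gives per-axis ratios of the kind quoted above for the castle image.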

B. COMPARISON OF STITCHING QUALITY
Fig. 2 shows three examples of large-parallax image stitching obtained by different methods, including APAP, ANAP, GSP, and REW. The image datasets for the left, middle, and right columns are lake, building, and garden, which are taken from our own images, [14], and [10], respectively. These large-parallax images visually demonstrate the alignment ability of the different methods. We use green boxes to mark unnatural content such as distortion and misalignment, and red and blue boxes to highlight ghosting. The first row (a) shows the results of APAP. Note the significant misalignments and distortions marked by rectangles; obvious ghosting also appears in (a), as marked by the red and blue boxes. Fig. 2(b) to (e) show that for ANAP, GSP, REW, and our method, the application of similarity transformations largely preserves the original perspective and mitigates distortion. Moreover, ANAP improves on APAP alignment by linearizing the homography in non-overlapping regions, while REW uses TPS to constrain the global homography to align images. GSP uses the similarity term to directly constrain the mesh warping, which improves alignment to some extent. However, there are still obvious ghosts in (b) to (d). The lake and building examples of Fig. 2(b) to (d) show that ANAP, GSP, and REW cannot handle the ghosting caused by large parallax, which is highlighted by red and blue boxes. In addition, the lake and garden examples also exhibit unnatural distortions, which are marked by green boxes.
In contrast, our results in Fig. 2(e) look more natural because the local warping is corrected by the projection deviation, and they contain no visible artifacts. Ghosting, misalignment, and distortion are eliminated in the lake, building, and garden examples of Fig. 2(e). Our method works well and preserves the naturalness of the image content. More results for our method are available in the supplementary material.
FIGURE 2. Results of APAP [9], ANAP [11], GSP [15], REW [4], and our approach. From left to right: "lake", "building" [14], and "garden" [10].
The Naturalness Image Quality Evaluator (NIQE), proposed by Mittal et al. [26], calculates no-reference image quality scores for the images to be evaluated. NIQE is close to the human visual system, and a lower score indicates higher image quality, so a lower NIQE score means a better stitched image. On the test images (which include images from [14] and 1 collected by ourselves), our method generates the smaller score in most cases, which means that the stitching quality of our method is higher.

C. COMPARISON OF STITCHING TIME COSTS
To evaluate the computational efficiency of the proposed method, we measured the time cost of different methods on different databases and compared our method with state-of-the-art stitching methods, namely APAP [9], ANAP [11], GSP [15], and REW [4]. The authors of APAP and ANAP provide acceleration components for higher computational efficiency, and we used the MATLAB versions of their programs to measure execution time. The proposed method is implemented in MATLAB, while GSP is implemented in C++, so GSP is expected to run faster.
Table 2 shows the comparison of the time consumption of different methods. Apart from GSP, APAP and REW are the fastest: APAP does not compute a similarity, while REW does not compute local homographies. GSP runs faster because it is implemented in C++, although it has complex constraint terms, including global and local similarity priors and scaling and rotation optimization. Our approach is slower than APAP and REW because our warping not only combines local homography and global similarity but also introduces the normal distribution model, local projection deviation, and linear smoothing pixel calculations. However, the proposed method avoids linearizing the homography in non-overlapping regions, which makes it faster than ANAP. Our warping method avoids the introduction of complex optimization terms, effectively balancing the efficiency and quality of image stitching.

V. CONCLUSION
In this paper, we propose a novel stitching method to solve the problem of local misalignment. We use normal distribution theory to convert the value of the projection deviation into a probability event, which effectively refines the matches and improves image alignment accuracy. We use a 3-dimensional surface interpolation model to describe the local projection deviation of the image, which improves the alignment of the local warp.
The global similarity transform is also introduced to protect the content of non-overlapping regions. Our method is evaluated on a variety of images with large parallax, including comparisons with state-of-the-art stitching methods such as APAP, ANAP, GSP, and REW. The results show that our method aligns the images more accurately, generating a more natural stitched image with no visible artifacts in the overlap region and alleviating perspective distortion in non-overlapping regions. As future work, we would like to explore the application of line preservation and seam-cut methods in image stitching.

FIGURE 1. Comparison of refinement results of feature matching. Feature points are represented as filled points. The red circles mark the 60 error points identified by REW, which include some correct points. Our method marks twenty-three error matches with black triangles and retains more correct matches.

FIGURE 2 panels: (a) Result of APAP. (b) Result of ANAP. (c) Result of GSP. (d) Result of REW. (e) Result of our warp.

FIGURE 3. Comparison of refinement results of feature matching. Feature points are represented as filled points. The red circles mark the 60 error points identified by REW, which include some correct points. The black triangles mark the result of our method, which retains more correct matches.