Image Mosaicing using Cylindrical Mapping

The objective of this paper is to provide an extended field of view for aerial surveillance in MAVs. Aerial surveillance from an unmanned flying vehicle with a single monocular camera involves a trade-off between the level of detail obtained at low altitudes and the degree of coverage obtained at high altitudes. The effective footprint of the camera is smaller when the vehicle flies at lower altitudes. On the other hand, a region can be searched quickly from a higher altitude. The demand for both high detail and wide coverage can be satisfied using image mosaicing, i.e., stitching selected frames of a video by estimating the camera motion between frames and registering successive frames to build the mosaic.

Nomenclature
MAV = Micro Air Vehicle
UAV = Unmanned Air Vehicle
YPR = Yaw, Pitch, Roll


I. Introduction
The coverage area of a camera is usually smaller than the area under surveillance. Currently, a wide area is covered by using a larger number of cameras, which creates a mesh of cameras that has to be controlled. Surveillance systems also use many monitors to view a wide area, and a specific image processing approach is required to overcome this problem. Surveillance from an aerial vehicle is a demanding task: the UAV moves to each set point only once and needs to cover a wide area. If a single camera is used, a wide area cannot be covered, and if two cameras are used, more space and hardware are required for acquisition and transmission. There is therefore a need for a method that overcomes these problems. In this paper, a method for combining images is proposed.

Different video mosaicing techniques have been proposed to achieve wide area coverage with a single camera. One approach involves image collection using a mini-gimbal, an actuated platform that aims a camera, which allows an MAV in the form of a quad-copter carrying the gimbal to return multiple viewpoints without varying its course. The video data from these sensors may be fused together to yield a higher resolution panoramic video. The quality of panoramic images can also be improved through optics and hardware design.

II. Image Acquisition
Image acquisition is the initial step of any image processing pipeline. Here the images are acquired by a single camera mounted on a gimbal. The camera is first set at a desired position and an image is captured. The camera is then rotated by an angle to the next desired view, and an image of that position is captured. Rotating the camera with the gimbal requires care: the rotation must be performed so that image acquisition stays synchronized with the image stitching process. In this paper, the synchronization is achieved by controlling the gimbal servo with an Arduino microcontroller that sends the required PWM signals. The time taken by image stitching is compensated by holding the gimbal in place after each acquisition until the stitching process is complete.
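The acquisition and stitching sequence described above can be sketched as a simple blocking loop. This is only an illustration, not the authors' firmware: the 1000 to 2000 microsecond pulse range over 180 degrees is an assumed value typical of hobby servos, and `rotate`, `capture`, and `stitch` are hypothetical callbacks standing in for the Arduino link, the camera, and the stitcher.

```python
def angle_to_pulse_us(angle_deg, min_us=1000.0, max_us=2000.0, max_angle_deg=180.0):
    """Map a servo angle to a PWM pulse width in microseconds.

    The pulse range is an assumption (typical hobby servo), not a value
    taken from the paper.
    """
    if not 0.0 <= angle_deg <= max_angle_deg:
        raise ValueError("angle outside the servo range")
    return min_us + (max_us - min_us) * angle_deg / max_angle_deg


def acquire_and_stitch(angles_deg, rotate, capture, stitch):
    """Rotate, capture, and stitch one view at a time.

    The gimbal is only commanded to the next angle after stitch() has
    returned, so acquisition never runs ahead of stitching.
    """
    mosaic = None
    for angle in angles_deg:
        rotate(angle_to_pulse_us(angle))   # hypothetical: send PWM command
        frame = capture()                  # hypothetical: grab one frame
        mosaic = frame if mosaic is None else stitch(mosaic, frame)
    return mosaic
```

Because the loop blocks on `stitch`, the hold time of the gimbal adapts to the stitching time instead of relying on a fixed delay.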

III. Cylindrical Image Model
An alternative to using homographies to align images is to first warp the images into cylindrical coordinates and then apply a pure translational model to align them. This only works if the images are all taken with a level camera or with a known tilt angle. Assume the camera is in its canonical position, i.e., its rotation matrix is the identity, the optic axis is aligned with the z axis, and the y axis is vertical. The 3D ray corresponding to an (x, y) pixel is therefore (x - x_c, y - y_c, f). This image is mapped onto a cylindrical surface of unit radius. Points on this surface are parameterized by an angle θ and a height h, given by

\theta = \tan^{-1}\left( \frac{x - x_c}{f} \right)    (1)

h = \frac{y - y_c}{\sqrt{(x - x_c)^2 + f^2}}    (2)

where x and y are the input rectangular image pixel coordinates, x_c and y_c are the coordinates of the center pixel, and f is the focal length of the camera. A 3D point (X, Y, Z) is mapped to the point (\hat{x}, \hat{y}, \hat{z}) on the unit cylinder, given by

(\hat{x}, \hat{y}, \hat{z}) = \frac{1}{\sqrt{X^2 + Z^2}} (X, Y, Z)    (3)

Fig 1 Cylindrical wrapping
From this correspondence, the cylindrical coordinates are given by

(\sin\theta, h, \cos\theta) = (\hat{x}, \hat{y}, \hat{z})    (4)

ICIUS-2013-80
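The image is then converted to cylindrical image coordinates (\tilde{x}, \tilde{y}), given by

(\tilde{x}, \tilde{y}) = (f\theta, f h) + (x_c, y_c)    (5)

The inverse cylindrical projection, which maps a cylindrical image back to a rectangular view, is obtained from

\theta = \frac{\tilde{x} - x_c}{f}    (6)

h = \frac{\tilde{y} - y_c}{f}    (7)

(\hat{x}, \hat{y}, \hat{z}) = (\sin\theta, h, \cos\theta)    (8)

x = f \frac{\hat{x}}{\hat{z}} + x_c    (9)

y = f \frac{\hat{y}}{\hat{z}} + y_c    (10)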
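As an illustration of the cylindrical model, the following NumPy sketch warps an image by inverse mapping: for every pixel of the cylindrical output it looks up the corresponding pixel of the rectangular input. It assumes a standard unit-cylinder formulation with the principal point at the image center, a focal length `f` expressed in pixels, and nearest-neighbour sampling; the function names are illustrative and not taken from the paper.

```python
import numpy as np


def to_cylindrical(x, y, f, xc, yc):
    """Forward map: rectangular pixel (x, y) to cylindrical image pixel."""
    theta = np.arctan2(x - xc, f)            # angle around the cylinder axis
    h = (y - yc) / np.hypot(x - xc, f)       # height on the unit cylinder
    return f * theta + xc, f * h + yc


def from_cylindrical(x_cyl, y_cyl, f, xc, yc):
    """Inverse map: cylindrical image pixel to rectangular pixel."""
    theta = (x_cyl - xc) / f
    h = (y_cyl - yc) / f
    return f * np.tan(theta) + xc, f * h / np.cos(theta) + yc


def cylindrical_warp(image, f):
    """Warp a rectangular image onto a cylinder (nearest-neighbour sampling)."""
    rows, cols = image.shape[:2]
    xc, yc = (cols - 1) / 2.0, (rows - 1) / 2.0   # assumed principal point
    y_cyl, x_cyl = np.indices((rows, cols), dtype=np.float64)
    theta = (x_cyl - xc) / f
    with np.errstate(divide="ignore", invalid="ignore"):
        src_x, src_y = from_cylindrical(x_cyl, y_cyl, f, xc, yc)
        # Keep only rays in front of the camera that land inside the input.
        valid = (
            (np.abs(theta) < np.pi / 2)
            & (src_x > -0.5) & (src_x < cols - 0.5)
            & (src_y > -0.5) & (src_y < rows - 0.5)
        )
    xi = np.rint(src_x[valid]).astype(int)
    yi = np.rint(src_y[valid]).astype(int)
    out = np.zeros_like(image)                    # unmapped pixels stay zero
    out[valid] = image[yi, xi]
    return out
```

For a focal length much larger than the image width the warp is nearly the identity; for a short focal length the left and right borders of the output are left empty, which is the characteristic shape of a cylindrically warped frame.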

IV. Feature Matching
Feature matching of the acquired images is done using the SURF method. SURF is a local feature presented in [1]. It is based on the Hessian matrix, which is used for selecting both the location and the scale of image features. For a point x in the image and a scale σ, the Hessian matrix is given by

H(\mathbf{x}, \sigma) = \begin{bmatrix} L_{xx}(\mathbf{x}, \sigma) & L_{xy}(\mathbf{x}, \sigma) \\ L_{xy}(\mathbf{x}, \sigma) & L_{yy}(\mathbf{x}, \sigma) \end{bmatrix}

To extract the SURF descriptor [1], the first step consists of constructing a square region centered on the interest point and oriented along the selected orientation. The region is split regularly into 4 x 4 smaller square sub-regions. For each sub-region, the features d_x and d_y are computed at 5 x 5 regularly spaced sample points. Here, d_x and d_y denote the Haar wavelet responses in the horizontal and vertical directions, respectively. To increase robustness to geometric deformations and localization errors, the responses d_x and d_y are first weighted with a Gaussian centered at the interest point. The wavelet responses d_x and d_y are then summed over each sub-region and form a first set of entries of the feature vector. To bring in information about the polarity of the intensity changes, the absolute values |d_x| and |d_y| are also summed. Hence, each sub-region has a four-dimensional descriptor vector v for its underlying intensity structure,

v = \left( \sum d_x, \; \sum d_y, \; \sum |d_x|, \; \sum |d_y| \right)

The resulting SURF descriptor vector for all 4 x 4 sub-regions is of length 64.
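The assembly of the 64-element descriptor from the Haar responses can be sketched as follows. This is a simplified illustration, not a full SURF implementation: it assumes the responses have already been sampled on a 20 x 20 grid aligned with the keypoint orientation, the Gaussian width `sigma` (in sample units) is an assumed parameter, and the final normalization used by practical implementations is omitted.

```python
import numpy as np


def surf_like_descriptor(dx, dy, sigma=3.3):
    """Build a 64-element vector from 20 x 20 Haar responses dx and dy."""
    dx = np.asarray(dx, dtype=float)
    dy = np.asarray(dy, dtype=float)
    if dx.shape != (20, 20) or dy.shape != (20, 20):
        raise ValueError("expected 20 x 20 response grids")

    # Gaussian weight centered on the interest point (grid center).
    coords = np.arange(20) - 9.5
    g = np.exp(-(coords ** 2) / (2.0 * sigma ** 2))
    weight = np.outer(g, g)
    dx, dy = dx * weight, dy * weight

    desc = []
    for i in range(4):            # 4 x 4 sub-regions
        for j in range(4):
            bx = dx[5 * i:5 * i + 5, 5 * j:5 * j + 5]   # 5 x 5 samples each
            by = dy[5 * i:5 * i + 5, 5 * j:5 * j + 5]
            # v = (sum dx, sum dy, sum |dx|, sum |dy|)
            desc.extend([bx.sum(), by.sum(), np.abs(bx).sum(), np.abs(by).sum()])
    return np.array(desc)
```

With 16 sub-regions and four sums per sub-region, the returned vector has 16 x 4 = 64 entries, matching the descriptor length stated above.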

V. Image Stitching
To obtain a wide-area view, the acquired images are combined by image stitching. As the images are acquired, they are passed to the SURF feature detection process. From the detected features, the good features are selected using a threshold of three times the minimum match distance, and the images are stitched using these good features. In the stitching process, the first image is retained as it is. The second image is altered according to the features: the portion of the second image whose feature points do not satisfy the threshold is removed, and the remaining part is warped using a perspective transformation. A perspective transformation is similar to an affine transformation, but it is defined by a 3x3 matrix rather than a 2x3 matrix. The simplest way to define an affine transform is to set the source image (second image) points to three corners, for example, the upper and lower left together with the upper right of the source image. The mapping from the source to the destination image is then entirely defined by specifying the destination points, i.e., the locations to which these three points will be mapped in the destination image. Once the mapping of these three independent corners (which, in effect, specify a "representative" parallelogram) is established, all the other points can be warped accordingly.
The perspective projection maps points in the three-dimensional physical world onto points on the two-dimensional image plane along a set of projection lines that all meet at a single point called the center of projection. The transformed image is warped in such a way that the result appears to be a single image.
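A minimal NumPy sketch of two steps from this paragraph follows: selecting good matches with the three-times-minimum-distance rule and solving for the affine transform defined by three corner correspondences. The selection direction (keeping matches at or below the threshold) is an assumption based on the common convention, and both function names are illustrative.

```python
import numpy as np


def select_good_matches(distances, factor=3.0):
    """Return indices of matches with distance <= factor * minimum distance.

    Assumption: smaller descriptor distance means a better match.
    """
    d = np.asarray(distances, dtype=float)
    return np.flatnonzero(d <= factor * d.min())


def affine_from_three_points(src, dst):
    """Solve for the 2x3 affine matrix M with [x', y'] = M @ [x, y, 1].

    src and dst each hold three (x, y) points; the source points must not
    be collinear, otherwise the system is singular.
    """
    src = np.asarray(src, dtype=float)
    dst = np.asarray(dst, dtype=float)
    A = np.hstack([src, np.ones((3, 1))])    # rows are [x, y, 1]
    return np.linalg.solve(A, dst).T          # shape (2, 3)
```

For example, mapping the upper-left, lower-left, and upper-right corners of a 200 x 100 image to the same corners shifted by (10, 5) yields a pure translation matrix.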
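The difference between an affine and a perspective warp is the division by the third homogeneous coordinate. The sketch below applies a 3x3 perspective matrix to a set of 2D points with NumPy; the matrix values in the usage note are arbitrary illustrative numbers, not values estimated in the paper.

```python
import numpy as np


def apply_perspective(H, points):
    """Map (x, y) points through a 3x3 perspective matrix H."""
    H = np.asarray(H, dtype=float)
    pts = np.asarray(points, dtype=float)
    homog = np.hstack([pts, np.ones((len(pts), 1))])   # to homogeneous coords
    mapped = homog @ H.T
    return mapped[:, :2] / mapped[:, 2:3]               # perspective divide
```

With `H = [[1, 0, 0], [0, 1, 0], [0.001, 0, 1]]`, the point (100, 50) has a third coordinate of 1.1 and is therefore pulled toward the origin, whereas an affine matrix (last row `[0, 0, 1]`) would leave the third coordinate at 1.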

VI. Experimental Results
Initially, the method was tested by generating cylindrically projected images from the acquired images in the MATLAB simulation environment. After the simulation, the process was executed in real time. In our system, a gimbal setup is mounted on a UGV for testing purposes, and a camera is mounted on the gimbal. An Arduino UNO microcontroller fixed on the vehicle provides the necessary rotation of the gimbal, 30° for each frame. Two frames, one at a given angle and another offset by 30°, are used to obtain a panoramic view in real time.
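As a rough check on the 30° step, the expected horizontal shift between two cylindrically warped frames and their overlap can be estimated as below. A pure rotation about the vertical axis by an angle Δθ shifts a cylindrical image by f·Δθ pixels. The focal length of 500 pixels and the 60° horizontal field of view used in the usage note are hypothetical values chosen for illustration; the paper does not state the camera parameters.

```python
import math


def cylindrical_shift_px(focal_px, rotation_deg):
    """Horizontal shift (pixels) in a cylindrical image for a pan rotation."""
    return focal_px * math.radians(rotation_deg)


def overlap_fraction(hfov_deg, rotation_deg):
    """Fraction of the horizontal view shared by two frames (0 if none)."""
    return max(0.0, 1.0 - rotation_deg / hfov_deg)
```

For example, `cylindrical_shift_px(500, 30)` is about 261.8 pixels, and `overlap_fraction(60, 30)` is 0.5, i.e., half of each frame would be shared with its neighbour under these assumed parameters.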

VII. Conclusion and Future work
This paper describes the generation of a panoramic view from a system comprising a gimballed camera with overlapping fields of view. Rectangular as well as cylindrical views are generated, for a quad-copter system in a simulation environment and experimentally using a ground robot. The accuracy of the motion estimation process is verified using both synthetic and real data. Experiments reveal that cylindrical mapping is much faster than rectangular mapping, as fewer images are required for cylindrical mapping. This reduces the number of computations and increases the performance of the system in real time. Future work includes increasing the execution speed by reducing the delay and further increasing the field of view. The method has wide application in military and commercial areas.
Here x = (x, y) denotes a point in the image. An advantage of SURF is its fast extraction process and the fast matching speed it permits, the latter achieved mainly by a single additional indexing step based on the sign of the Laplacian of the interest point.
Fig 2 shows the gimbal setup with servo for the rotation of the gimbal.

Fig 2 Experimental Setup
As the images are acquired from the camera, they are processed in OpenCV on the Ubuntu platform. Fig 3 shows the MATLAB simulation of the acquired images. Here three images are captured from a camera and processed offline to generate a panoramic image. Fig 4 shows the images produced in real-time execution. The servo is rotate