SYSTEMATIC CALIBRATION FOR A BACKPACKED SPHERICAL PHOTOGRAMMETRY IMAGING SYSTEM

A spherical camera can observe the environment for almost 720 degrees’ field of view in one shoot, which is useful for augmented reality, environment documentation, or mobile mapping applications. This paper aims to develop a spherical photogrammetry imaging system for the purpose of 3D measurement through a backpacked mobile mapping system (MMS). The used equipment contains a Ladybug-5 spherical camera, a tactical grade positioning and orientation system (POS), i.e. SPAN-CPT, and an odometer, etc. This research aims to directly apply photogrammetric space intersection technique for 3D mapping from a spherical image stereo-pair. For this purpose, several systematic calibration procedures are required, including lens distortion calibration, relative orientation calibration, boresight calibration for direct georeferencing, and spherical image calibration. The lens distortion is serious on the ladybug-5 camera’s original 6 images. Meanwhile, for spherical image mosaicking from these original 6 images, we propose the use of their relative orientation and correct their lens distortion at the same time. However, the constructed spherical image still contains systematic error, which will reduce the 3D measurement accuracy. Later for direct georeferencing purpose, we need to establish a ground control field for boresight/lever-arm calibration. Then, we can apply the calibrated parameters to obtain the exterior orientation parameters (EOPs) of all spherical images. In the end, the 3D positioning accuracy after space intersection will be evaluated, including EOPs obtained by structure from motion method.


INTRODUCTION
In contrast to the traditional stereo pairs of images, the spherical images can provide more comprehensive geospatial information. Examples can be found such as Google Map street view or indoor panoramic image for real estate transaction purposes.
The spherical image registration algorithm is the most important part since it will influence not only the quality of registration but also the systematic error. Traditional spherical image registrations are mainly divided into two categories. The first one utilize image matching in the overlapped area to estimate image transformation coefficients. Many algorithms on feature point matching are investigated. Wang et al. (2013) modified the SIFT (Lowe, 2004) algorithm to extract conjugate feature points for panoramic image stitching. However, if the images contain homogeneous area such as water or sky, this kind of method might not work well due to too few or not well distributed feature points.
The second category define a radius of a virtual sphere and projecting the original frame image to the virtual sphere. Liu et al. (2012) projected the images onto a sphere with a radius equal to the focal length of the camera. The result will be affected by a pre-defined sphere radius and the distance between the observation viewpoint and spherical image. The software provides by Point Grey Researches also use a sphere with a predefined radius, then it projects each image onto the spherical model. However, on account of the differences between the sphere centre and each perspective centre, different sphere radii will lead to different registration results. Furthermore, different distances of objects will also lead to misregistration in the overlapped area of two neighbour images. * Corresponding author For the purpose of photogrammetric measurement, this paper focus on the issues of spherical image registration method to solve the problems mentioned above. At first, the spherical camera is calibrated to obtain the interior/exterior orientation of each camera by using an indoor calibration field. Then, derive the relative orientation parameters (ROPs) between the reference camera and the others. In the end, we project each image on a virtual sphere with any radius by their relative orientation, interior and exterior orientation. Consequently, the spherical image generation doesn't require feature point matching and regardless the distances of objects. In the experiment, this paper will investigate the systematic errors for error correction purpose. The error vectors can be used in the future for correcting spherical image to meet the requirement of high accuracy in photogrammetry such as 3D positioning using space intersection. Figure 1 illustrates the proposed back-pack MMS. This system contains a Ladybug 5 spherical camera, a tactical grade POS system, i.e. SPAN CPT (including an IMU and a GNSS antenna), a miniature PC and a battery set. They are fixed by a steel box with a steel pole. On the top of this pole, a Ladybug5 camera is placed to take images with minimum occlusion. The GNSS antenna is set at the middle of the pole lower the Ladybug5 about 0.4 m. The other equipment such as SPAN CPT IMU, PC, and battery are all covered by the steel box for water-proof purpose.

Back-pack Mobile Mapping System
In the interior of the steel box, the PC is on the top, IMU and controller is in the middle and the battery is at the bottom. There are many function of this controller, one is the switch that can start or shutdown the power for PC, IMU and Ladybug5, respectively. The other function is the trigger control that can control the image acquisition frequency or traveling distance interval between two images. The traveling distance is measured by an odometer, but unfortunately this information does not used in POS solution.

Ladybug-5 Camera
The spherical camera used in this study is the Ladybug5 from Point Grey Researches company. It composed of six Sony ICX655 CCD cameras. The original image size of each camera is 2048×2448 pixels, the pixel size is 3.45μm, and the focal length is 4.4 mm. Hence, the FOV is 78°×90°. Five cameras (denoted as Cam0 to Cam4) are rotated with 90 degrees and aligned in a circle horizontally, whereas the Cam5 takes pictures toward the zenith to produce a spherical image with a total FOV of 360 degrees in horizontal direction and 135 degrees in elevation direction, respectively. Since Ladybug5 utilizes fisheye lens with short focal length, each image contains large range so that the overlapped area between adjacent images can be used for seamless mosaicking. Meanwhile, its lens distortion is very serious and need to be corrected during the spherical image stitching step. Therefore, an accurate camera calibration is critical for this purpose.

POS System
Together with the Ladybug5, we equip a SPAN CPT POS system. With the assist of GNSS, the directional accuracy of this system can reach 0.015°/0.015°/0.03° in Roll/Pitch/Heading directions, respectively, after post-processing with the GPS differential positioning.

Camera calibration
In order to obtain the relative orientation parameters (ROPs) between two cameras, lens distortion and interior orientation parameters of each camera, a camera calibration field is designed. We set up more than two hundred Australis© coded targets (Fraser, 1998) on a sealed room. The coded targets can be detected and recognized automatically for obtaining precise image coordinates. In image acquisition, eight different horizontal positions and three different heights are chosen for taking pictures. In which, some of the them were rotated with 90 degrees to increase the overlapped area between images and tilted by 90 degrees to reduce the correlation between and exterior and interior orientation parameters. Figure 2 depicts the distribution of all coded targets together with image location. The interior orientation and exterior orientation of each camera are solved by self-calibration bundle adjustment with additional parameters. The mathematic model used is the photogrammetric collinearity equations as shown in equation (1). In which, the (∆x, ∆y) are the adopted additional parameters as depicted in equation (2) (Fraser, 1997). (1) (2) where f = focal length x, y = original image coordinates , = coordinates of principle point ̅ , ̅ = image coordinates which have been calibrated for lens distortion 1~5 = radial lens distortion parameter 1 , 2 = decentering lens distortion parameter 0 , 0 , 0 = instantaneous position of image in local coordinate system X, Y, Z = coordinates of image point (x, y) in object space r = distance from principle point 11~33 = elements of rotation matrix In the calculation of interior orientation, due to the high correlation between radial lens distortion parameter and high correlation between decentering lens distortion parameter and principle point, the significant test was performed to ignore the non-significant parameter which don't significantly improve the posteriori standard error of image coordinates measurement.

Relative Orientation
In this study, Cam0 is defined as the master or reference camera. After solving the exterior orientation, the rotation angles and spatial offsets of other cameras relative to Cam0 can be obtained by equation (3).
where R = rotation matrix between two cameras = slave camera = master camera M = local coordinate system r = relative position vector of two cameras

Spherical Image Stitching
Since Cam0 is defined as the mater camera, its image coordinate system is used to define a spherical model with any predefined radius. The perspective centre of Cam0 is located in the centre of sphere. Then, we define the y-z plane as equatorial plane and the minus x-axis is defined as the North Pole. Azimuth angle φ is the angle between the observation direction and an object point A. The elevation angle θ is the angle between the equatorial plane and an object point A. Figure 3 illustrates the diagram of spherical model used in this study. According to this figure, the relationship between spherical coordinate and 3D Cartesian coordinate is shown in equation (4).
where , , = 3D Cartesian coordinates of an object point on the sphere model r = any predefined sphere radius φ = azimuth angle θ = elevation angle During the spherical image stitching, we adopt a range of 72°×90° for direct mosaicking those five horizontal cameras without overlap and the Cam5 will cover 360°×45°. Figure 4 displays the coverage of all six cameras at the 2D spherical image coordinate system. This paper assume that the size of a spherical image is 4000×8000 pixels, consequently each pixel contain an IFOV of 162" ×162". Through the direct transformation method, the light ray is projected from the sphere centre onto the sphere. In the beginning, the image of Cam0 is projected onto the sphere using the EOPs of Cam0 with FOV range from longitude of -36° to +36° and latitude of -45° to 45° with an interval of 162" ×162". Further, we use the EOPs of Cam1 derived by the relative orientation to project the image of Cam1 onto the sphere from longitude of 36° to +108° and latitude of -45° to 45°, and so on. Notice that the derived EOPs for Cam1-Cam5 does not consider their spatial offset with Cam0, because we treat their projection centre same as Cam0.
Once the elevation ( ) and the azimuth (φ) are known, the 3D Cartesian coordinates can be obtained by equation (4) by assuming any radius (r). Further, by means of collinearity equations, i.e. equation (5), the corresponding camera's image coordinates can be obtained. As the derived image coordinates are usually decimal, we can thus interpolate the RGB values from the original image and fill it into the correspondent position on the 2D spherical image coordinate system. In which, the relationship between the 3D and 2D spherical image coordinates system are illustrate in Figure 5. The principle point of Cam0 is located at the centre of 2D spherical image centre with zero azimuth and elevation angles. In other words, the EOPs of Cam0 can be regarded as the EOPs of the generated spherical image.
( 5) where f = focal length x, y = derived image coordinates , , = coordinates of object point , , = coordinates of perspective centre 11~33 = elements of rotation matrix The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences, Volume XLI-B1, 2016 XXIII ISPRS Congress, 12-19 July 2016, Prague, Czech Republic

DG (Direct Georeferencing) calibration
The goal of DG calibration is to solve the lever-arm and boresight angles between IMU and camera. Figure 6 is the schematic diagram about direct georeferencing. In which, ( ) is the phase centre of GPS antenna at time t w.r.t the mapping frame.
( ) is the rotation matrix between body frame and mapping frame at time t.
is the offset between GPS receiver and IMU measured by ground survey. ( ) is the IMU's position vector in mapping frame at time t, obtained by combining ( ) and .
is the camera position vector in mapping frame calculated by photo-triangulation. Therefore, using ( ) and we can figure out the (Lever-arm) and (Boresight angles).

Direct Georeferencing (DG)
For the purpose of emergency response, a fast way to obtain EOPs for each imagery is the direct georeferencing (Ip et al., 2004). Through the DG calibration, we obtain the boresight angles and lever-arm between the IMU body-frame and Cam0. After post-processing of POS data, we estimate the track of IMU body-frame. Thus, we can interpolate each trigger event to obtain the body-frame's position and attitude at the camera exposure time. The EOP of each photo can thus be calculated by applying the boresight angles and lever-arm to the body-frame's position and attitude. Further, we can conduct the space forward intersection to locate all object's 3D coordinates by measuring conjugate points from more than two spherical images.

Structure from Motion and Bundle Adjustment
For the purpose to obtain accurate EOPs of each image, particularly in an indoor environment that GNSS signal was blocked, a structure from motion (SfM) (Häming and Peters, 2010) approach is proposed instead. It utilizes feature point matching technique to find tie points in different spherical images and applying bundle adjust adjustment to build the relationship between each images and object space. It means we can obtain the EOPs and perform space intersection to measure object's 3D coordinates for positioning error analysis.

CASE STUDY
4.1 Study Site 4.1.1 Indoor: An indoor study site located within our department's building is used for spherical image positioning accuracy assessment. We evenly place several targets, i.e. black square with a white circle in the centre, on the wall as the control points. Figure 7 is the distribution of spherical images used in this study, they are acquired by traveling between two floors. In which, the blue balls are spherical images, the blue flags are control points, and the other small dots are tie-points on the object space.

Outdoor:
For positioning accuracy analysis after direct georeferencing, an outdoor study site is created in Guiren campus of NCKU. There are five buildings with 2-6 floors. In order to allow the spherical images to cover as many targets as possible, we utilize total station to measure lots of natural features on these five buildings' wall as control point. The absolute accuracy of control points is 1 cm. Figure 8 demonstrates one building within the outdoor study site, in which the red circles are the location of control points.  Figure 9 displays the spherical image produced by the proposed method. In which, two red rectangles are enlarged and shown in Figure 10 and Figure 11, respectively. From the result, we can find out that Figure 10 has very well alignment between two neighbour cameras, but not for Figure 11. It shows some misalignment still exist. However, image of Cam5 that pointing upward contains mostly sky and only a few feature points can be found within the overlapped area, the proposed method still can work well. This method can not only overcome the problem found in the traditional method using feature points for transformation but also present a good image stitching.

Visual Analysis of Spherical Imagery
In the meantime, we try different radii of the virtual sphere during spherical image stitching, the generated spherical images are all the same. It proves that the proposed algorithm is independent to object distance. Please notice that in session 3, we utilize the ROPs to estimate the other cameras' orientation, but without using the spatial offset. If we apply the spatial offset, the generated spherical images will be different when using different sphere radii.
The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences, Volume XLI-B1, 2016 XXIII ISPRS Congress, 12-19 July 2016, Prague, Czech Republic Figure 9. Spherical image of the outdoor calibration field Figure 10. Good alignment between two neighbour cameras. Figure 11. Inaccurate alignment between two neighbour cameras.

Outdoor Calibration Site:
In view of the differences between the sphere's centre and the perspective centres of all six cameras, there will be some systematic errors in the generated spherical image. We choose a test field surrounded by buildings containing many ground control points with known 3D ground coordinates and acquire several Labybug-5 images. Applying all original images (except Cam5) in a rigorous photo-triangulation process, we obtain accurate EOPs of Cam0-4. In which, Cam0's EOPs are treated as same as the spherical image. Further, all control points can be back-projected onto the spherical image coordinates system (i.e. treated as true position) and compared with their corresponding image coordinates by manual measurement (exist systematic error) for error analysis. This discrepancy is denoted as error vector, but in 2D spherical image coordinate system.

Indoor Calibration Site:
Since the outdoor calibration site does not cover the whole FOV of all six cameras, especially for Cam5 that is pointing to the sky only has a few control points for error analysis. Meanwhile, the distribution of all controls points are not even. Therefore, we utilize the indoor camera calibration site for further systematic error analysis.
Here the coded targets' 3D object coordinates are estimated by bundle adjustment during the camera calibration.
Meanwhile, the error analysis is conducted on the original image coordinates system, not 2D spherical one. Therefore, we can analyse each camera's systematic error individually. Figures  14-19 demonstrate six error vector plots of Cam0-Cam5 acquire at the same exposure time, respectively.
shows the statistics of all cameras' error vector length. It shows that Cam0 has an overall error length with a standard deviation of 0.4 pixel, which is very small. However, visualize Figure 14, we still can find some systematic trend.

Figure 14. Error Vector of Cam0
The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences, Volume XLI-B1, 2016 XXIII ISPRS Congress, 12-19 July 2016, Prague, Czech Republic  Observing Figures 15-19 and Table 1, we can also find out clear systematic errors exist in each camera and their quantities are all large. The systematic trends for all cameras are different, meaning that their correction has to be performed individually. If these trends for all images acquired from the same camera are all the same, we may thus utilize different 2D transformation functions to correct them during the generation of spherical image.

Positioning Error Analysis
Here we compare the positioning accuracy using EOPs derived by SfM and DG methods, individually.

Indoor (SfM Method):
Thirteen targets are paste on the wall as the control points. In Table 2, most control points' errors are less than 0.05 m but three of them, i.e. CP8, CP9 and CP10, have larger error about 0.09 m. It's a two floors indoor study site, so we collect the data in second floor first and then take the stair down to the first floor. Those three control points are located in the middle of first and second floors. The reasons that cause larger error is the weak geometry as well as the stitching procedure to generate the spherical images. In the meantime, Table 2 also illustrates the positioning error in image space with an overall error of 2.79 pixels. Since the distance from wall to the camera varied from 2-10 meters, its spatial resolution is ranging from 1.6 mm -8 mm. Thus, 2.8 pixels is equivalent to a spatial positioning error around 4.5 mm -22.4 mm. To sum up, the overall accuracy is less than 0.056 m which is acceptable in indoor 3D mapping application.
The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences, Volume XLI-B1, 2016 XXIII ISPRS Congress, 12-19 July 2016, Prague, Czech Republic

Outdoor (DG Method):
For the outdoor experiment, we first conduct DG calibration then perform space intersection positioning accuracy assessment. The purpose of DG calibration is to obtain the boresight angles and lever-arm between IMU and camera. Using the image matching and bundle adjustment to obtain the exterior orientation of reference camera, i.e. Cam0 that has viewing direction same as walking. We can compare the exterior orientation of cameras with the IMU body frame and get the lever-arm and boresights angles. Table 3 shows the internal accuracy assessment results after DG calibration. The standard deviation of Phi angles is 1.8°. The reason we speculated is the MMS has larger up and down vibration when people carry it and walking around the study site. After post-processing of GNSS and IMU, the exterior orientation of all photos from Cam0 can be estimated by applying the calibrated boresight angles and lever-arm to each trigger event. After that, using Cam0's image coordinate system we can define a local 3D spherical coordinate system. Then, it can be convert to 2D spherical image coordinates, i.e. horizontal axis is longitude and vertical axis is latitude. It means that when we measure the spherical image coordinates, we obtain longitude and latitude coordinates. Then, it produces a ray similar to frame-based image. Thus, we can perform forward space intersection to get 3D coordinates of check points and compare it with its true 3D coordinates. In Table 4, the analysis result illustrates that the largest mean error is approximately 9 meters in E direction with a high standard deviation of 11.36 meters. There are some reasons that result in this error. The first one is the DG calibration error and human walking motion is not fit with the used Kalman filtering during the POS solution. The second one is the algorithm to generate spherical image that each camera has its own systematic error.

APPLICATION
This system has been applied in many situations. For instance, there is an earthquake occurred in Tainan in February 6 th , 2016. This earthquake caused 117 peoples die and more than 30 buildings damaged. The land vehicle MMS is prohibited to enter the disaster zone. The back-pack MMS carried by people can overcome these drawbacks are taken during the rescue period of a collapsed building, i.e. Weikung Kinglong building. These spherical images can record the realism in 720° for digital documentation purpose or post-disaster investigation.

CONCLUSIONS AND FUTURE WORKS
In this study, a back-pack mobile mapping system is developed and several positioning accuracy assessments are conducted in indoor and outdoor study sites.
Spherical image registration method is important in that it will influence not only the quality of registration but also the errors of the spherical image. This paper suggests a spherical model, a stitching algorithm based on relative orientation of all cameras. It is confirmed that the proposed method can overcome the problems in traditional methods. It can still work well despite the homogeneous area and the error doesn't relate to the distance of object point. The error analysis results show that different systematic errors exits in the original images.
The image matching and structure from motion are applied to obtain the exterior orientation of cameras due to the lack of GNSS and the overall positioning accuracy is approximately 6 cm. The outdoor experiment utilizes direct georeferencing to get the exterior orientation but the mean error is around 10 m which is not feasible in mapping applications.
In the future, a further investigation of spherical image stitching algorithm that can correct each camera's systematic errors are required and a better POS solution suitable to back-pack MMS is critical.