Refractive Geometry for Underwater Domes

Underwater cameras are typically placed behind glass windows to protect them from the water. Spherical glass, a dome port, is well suited for high water pressures at great depth, allows for a large field of view, and avoids refraction if a pinhole camera is positioned exactly at the sphere's center. Adjusting a real lens perfectly to the dome center is challenging: it is unclear how to guide the centering process (e.g. by visual servoing), how to measure the alignment quality, and how to mechanically perform the alignment. Consequently, such systems are prone to being decentered by some offset, leading to challenging refraction patterns at the sphere that invalidate the pinhole camera model. We show that the overall camera system becomes an axial camera, even for thick domes as used for deep sea exploration, and provide a non-iterative way to compute the center of refraction without requiring knowledge of exact air, glass or water properties. We also analyze the refractive geometry at the sphere, looking at effects such as forward vs. backward decentering and iso-refraction curves, and obtain a 6th-degree polynomial equation for forward projection of 3D points in thin domes. We then propose a purely underwater calibration procedure to estimate the decentering from multiple images. This estimate can either be used during adjustment to guide the mechanical positioning of the lens, or can be considered in photogrammetric underwater applications.

Fig. 1: From left to right: Autonomous underwater vehicle with dome port camera, an underwater photographer with dome for DSLR, submersible with huge spherical window, multi-dome-port camera system mounted on a remotely operated deep sea robot. These domes can avoid refraction effects when all principal rays of the lenses pass through the glass spheres in the direction of the local normal. Perfectly centering the lenses in the dome is however a hard task, since it is neither obvious where the center of the dome is, nor where the nodal point of the camera is, nor how good an alignment has been achieved.

Keywords: Refractive geometry, Calibration, Decentering

Introduction
More than two-thirds of Earth's surface is covered by water, or more specifically by the oceans. Underwater imaging, vision, photogrammetry and robotic applications include recovering sunken cultural heritage, inspecting offshore installations, habitat mapping, resource estimation, monitoring of deposited munitions, optical quantification of processes in the ocean and of human impact on ecosystems, and, last but not least, exploration of the last uncharted terrain on Earth: the deep sea.
Underwater cameras are usually placed inside pressure housings and observe the outer world through some kind of window. Incident light rays travel through water, glass, and air before being sensed by the camera and each time they traverse media with different optical densities their direction might be changed.
In particular for flat glass interfaces this requires more complex geometric reasoning [1,2,3,4], such that alternative configurations are being explored. In oceanographic applications, but also among professional photographers, so-called dome port systems, which rely on a spherical window, have become popular to avoid view limitations. They are also mechanically more stable, and relatively thin spherical glass can resist the extremely high pressure at several kilometers of depth. In principle, a well-centered lens behind the dome port avoids refraction, but achieving this centering remains challenging, as both optical centers are physically invisible, especially for large lenses and rugged equipment as shown for example in Fig. 1. If the lens is not well centered, refraction will again occur as with flat ports, but now the decentered dome geometry produces even more complex refraction effects in the 2D image, a threat that might discourage people from using dome systems. Current practical solutions adopt standard pinhole calibration parameters to compensate the remaining refraction for an approximately centered lens, which makes it possible to achieve high accuracy and has been widely applied in shallow water survey tasks [5,6,7,8]. However, such a calibration is usually performed at an ideal working distance with a well designed control network, which is difficult to achieve in less controllable scenarios such as robotic mapping applications in the deep ocean.
In this contribution, we geometrically analyze refraction at the sphere for the common case that the lens is not exactly centered in the dome and show that the system is actually an axial camera. We derive a chessboard-based direct solver for the refraction axis, distinguish positive and negative decentering, and propose a complementary decentering calibration procedure that can support mechanical adjustment of the lens to avoid refraction in the first place. In case entirely avoiding refraction is not possible, the analyses and the methods for estimating the decentering presented in this paper are intended to facilitate further research on whether and when it could be beneficial to explicitly consider such physical parameters in photogrammetric surveys.

Related Work
It is well known in photogrammetry that refraction occurs when photographing through different media, in particular in underwater scenarios (see e.g. [9,10,11,12,13,14,15,16,17]). Early photogrammetric methods have analytically analyzed the refractive scenario [11] and suggested linearized correction terms in multi-media photogrammetry at flat interfaces [18], whereas other approaches suggested absorbing refractive effects into 2D lens distortion. Many practical underwater photogrammetry applications have shown that approximating the dome port camera system by a perspective camera model with standard distortion parameters is often sufficient, and convincing 3D survey results can be obtained.
Gläser and Schröcker [19] found the analytical forward projection for a single refraction. Treibitz et al. [1] have shown that overall systems of cameras behind flat glass interfaces in the water can be considered axial cameras and Chari and Sturm [20] have inspected related multi-view relations. Agrawal et al. [2] have generalized analytic forward projection to multiple layers of flat interfaces.
For spherical glass domes, on the contrary, much less work exists, although they can sustain more pressure (and are thus better suited for deep sea applications) and do not limit the field of view (see Fig. 1 for examples).
Nocerino et al. [21] compare how standard in-air calibration parameters change when submerging dome port cameras underwater, but disregard explicit consideration of refraction. Kunz and Singh [16] discuss that domes avoid refraction when a pinhole camera is exactly centered in the dome and suggest that decentering could be determined using optimization. They analyze the 2D pixel error when approximating a misaligned dome port camera as a perspective camera and state that the error increases as the camera moves closer to or farther from the observed scene, since the refraction distortion is depth-dependent. Therefore, centering the camera in the dome port is critical for underwater vision applications. Menna et al. [22] practically measure a particular dome port geometry and discuss properties of domes as compared to flat port cameras.
They also suggest how to align the nodal point of a lens with the dome's center by individually measuring both points and then mounting without a feedback loop. Recently, in [23], the authors investigate the depth-dependent systematic errors introduced by refraction effects through dome ports in a statistical way and apply iterative look-up table corrections to reduce systematic residual patterns [24].
To support such a theory, and also to motivate this work, we conduct a numerical experiment which compares the physically-based decentered dome port model with underwater calibration using a perspective model. We particularly look at the case of a camera-equipped robot that sometimes comes close to underwater structures and sometimes observes them from a distance, which means that the calibration is required to hold for a range of different distances. As shown in Fig. 2, the underwater pinhole model works well for a mild decentering in optical axis direction (see top row), but the standard perspective camera model has noticeably more difficulty absorbing the refraction effect with larger decentering, or even with only a slight sideward decentering (see bottom row). In these cases, explicitly considering refraction does not suffer from such errors, which motivates the work in this contribution.
Already earlier, She et al. [25] proposed to mechanically adjust the dome and lens using through-the-lens feedback until no refraction effects are observable. They build their feedback loop on a human operator who judges the continuity of lines in a setup where the camera looks parallel to the water surface in a special tank. Later they determine the remaining offset from an image pair of a chessboard in air and in water, both taken at exactly the same pose, which makes the approach delicate and complex. However, there are cases where the camera cannot be centered in the dome port. For instance, Bosch et al. [26] have designed a special underwater camera housing for an omnidirectional camera system, consisting of a cylinder for the lateral cameras and a hemispherical dome for the top-looking camera, and propose a calibration scheme to estimate the extrinsic parameters of each camera and the housing parameters. Refraction effects are considered using a ray-tracing-based camera model. However, their work reports a relatively high residual after the parameter optimization in the real-world evaluation. Iscar and Johnson-Roberson [27] have developed an underwater stereo camera system where two cameras share a single hemispherical glass port, so the offset between each camera's optical center and the dome center is extreme. They propose to recover the cameras' positions inside the dome by measuring the point spread function of the overall camera system, which requires a complex dataset collection procedure. The authors report that their calibration approach is very time-consuming and practically limited. Besides underwater imaging, exact modeling of refraction in the image formation process can also improve in-air applications such as cameras behind the windshield of cars [28] or behind an optical cover for coal mining vehicles [29].
Note however, that when imaging in air through a relatively thin glass pane, rays essentially only undergo a slight lateral shift depending on the thickness of the glass, whereas in our scenario the refraction effects are dominated by the other two media (air and water) with significantly different optical densities that cause large direction changes of rays.
In summary, in practice dome port cameras are often successfully modeled as pinhole camera systems, and very special dome setups (e.g. a stereo camera in a single dome) needed very special treatment that does not directly apply to the case of mildly decentered dome systems. In this contribution we therefore want to characterize the geometrical properties of decentered dome systems, such as important axes, directions and symmetries, what kind of refraction patterns are to be expected, and how refraction effects can be exploited to infer the physical parameters of the camera system. Our contribution in this paper is two-fold: First, in the next section, we derive the geometrical properties of decentered dome port systems and show that even thick domes are actually axial cameras, similar to the findings of [1,2] for flat interfaces. In contrast to flat interfaces, spheres have two intersections with the refraction axis, and we discuss their different properties.
Second, we derive how to estimate the refraction axis from a single picture of a chessboard, drawing on prior insights into radial distortion center estimation [30]. We also show that the apparent chessboard curvature (barrel vs. pincushion) is related to backward and forward decentering, respectively, and propose how to distinguish these cases in a 2D image. Afterwards, we analyze the forward projection of a 3D scene point onto the image plane for a calibrated decentered dome port camera system and show that a 6th-degree polynomial equation for thin domes can be derived by leveraging the axial camera property [31,2,19]. Next, we derive an efficient optimization scheme to find the exact decentering from multiple underwater chessboard pictures, which provides a practical calibration procedure that does not require cumbersome in-air/underwater pairs of the same chessboard. Finally, we discuss the limitations and the calibration accuracy when the decentering is not significant, and propose an image pre-selection scheme that maximizes the observable refraction effect to achieve a high calibration accuracy.
The remainder of this paper is structured as follows: In section 3, we discuss and derive the refractive dome geometry, based upon which we propose the different novel steps of our calibration algorithm in section 4. The individual components and the overall calibration approach are evaluated in section 5 using synthetic and real datasets. We then discuss key findings and limitations in section 6 before concluding in section 7.

Decentered Dome Geometry
The setting is displayed in Fig. 3: We assume that a camera is positioned inside a sphere. The medium inside the sphere (e.g. air) has a different optical density µ_air than the medium outside the sphere (e.g. water, with density µ_water). The separating layer (e.g. glass) can either be considered of almost zero thickness (thin dome model) or, in particular for deep sea housings sustaining several hundred bars of pressure, can consist of several millimeters of glass (thick dome model) with optical density µ_glass. The exact values of the µ parameters depend on the composition of the water and the exact "glass" material, but in the remainder of this paper we will assume µ_air ≤ µ_water ≤ µ_glass to reason about inward or outward bending.

Fig. 3 (caption excerpt): For each viewing ray, a plane of refraction exists, even in case we use a thick dome. In the lower sketch, the dashed light paths form the same angle φ with the refraction axis at the camera center. Consequently, they will be refracted by the same angle θ at the interface. The cone of all those rays intersects the sphere in a 3D circle, and intersects the image plane in an ellipse.

In case a pinhole camera is positioned exactly in the center, all viewing rays will pass all dome layers in the direction of the local normal and no refraction will occur. In practice, aligning the centers is very challenging, and therefore some decentering is likely to remain when assembling without visual feedback. The vector from the dome center to the camera center is the decentering offset vector v_off, or decentering for short. The line through the dome center and the camera center is the refraction axis with direction a. It has two intersections with the (thin) dome surface, which will be called the refraction poles, where we distinguish the pole closer to the camera center (positive refraction pole I_pole+) and the pole further away (negative refraction pole I_pole-). The refraction axis also intersects the image plane at the refraction center, which has the position r in image coordinates.
We set the origin of the world coordinate system to the center of the dome.
Further, we assume that the camera is calibrated, and omit the camera intrinsics for the sake of readability, so it can be described by the projection matrix

P = R [ I | -v_off ] .    (1)

Because of refraction effects according to Snell's law, light rays will change their direction at the interface between different media, unless they hit the interface at an angle coinciding with the normal at the intersection point.
Along the light path from an object in the water, through the glass dome and into the camera, we consider 3 segments here: the water segment with light ray direction l_water, the glass segment with direction l_glass and the air segment with direction l_air (see Fig. 3, Center). Note that incoming rays that travel along the refraction axis will not be refracted, as they hit the outer interface at 90° and then also the inner interface at 90° before they move towards the camera. We will now trace back a different light ray l from the pinhole to its intersection I_inner with the inner interface.
Lemma 1. The surface normal n inner of the inner interface at I inner is a linear combination of v off and l air .
Proof. The intersection point I inner must be somewhere on the ray starting from the camera center v off in direction of l air , consequently I inner = v off + λl air .
plane spanned by v off and l air . According to Snell's law, the glass segment direction l glass can be computed as a linear combination of the incoming direction l air and the local surface normal n inner [33]: where r = µ air /µ glass , and . (3) Consequently, following the same reasoning, also the outer interface point, its normal and the water segment are linear combinations of v off and l air and all lie in the same plane.
In the flat refractive case [2], this plane is called the plane of refraction. We will now investigate by what angle a light ray changes its direction at the inner interface. We conceptually group light rays that start from v off and change their direction at the inner interface by some angle θ into the set R θ (see Fig. 3, Right). We call the corresponding set of image coordinates iso-refraction-angle curves.
Theorem 1. Iso-refraction-angle curves are conic sections in the image.
Proof. For symmetry reasons it can be seen that all rays starting at v off that form some angle φ with the refraction axis (the φ-cone) intersect the inner sphere in a 3D circle (see Fig. 3). They all form the same angle with the local normal at the sphere, and due to Snell's law, they will thus all be refracted by the same amount. The intersection of the φ-cone with the image plane forms a conic section.
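The cone-and-conic argument can be illustrated numerically: rays forming a fixed angle φ with the refraction axis build a cone, and all of their intersections with the image plane satisfy one common conic equation. A small numpy sketch with hypothetical values for the axis direction and φ (camera axes aligned with world axes, image plane z = 1):

```python
import numpy as np

# Hypothetical refraction axis direction and cone half-angle.
a = np.array([0.1, -0.05, 1.0]); a /= np.linalg.norm(a)
e1 = np.cross(a, [0.0, 0.0, 1.0]); e1 /= np.linalg.norm(e1)
e2 = np.cross(a, e1)                   # orthonormal frame around the axis
phi = np.deg2rad(25.0)

# Sample the phi-cone and intersect each ray with the image plane z = 1.
psis = np.linspace(0.0, 2 * np.pi, 40, endpoint=False)
pts = []
for psi in psis:
    d = np.cos(phi) * a + np.sin(phi) * (np.cos(psi) * e1 + np.sin(psi) * e2)
    pts.append(d[:2] / d[2])
pts = np.array(pts)

# Fit a conic c1*x^2 + c2*xy + c3*y^2 + c4*x + c5*y + c6 = 0 to half of the
# points (via the SVD null vector) and evaluate it on the other half.
x, y = pts[:, 0], pts[:, 1]
A = np.stack([x * x, x * y, y * y, x, y, np.ones_like(x)], axis=1)
_, _, Vt = np.linalg.svd(A[::2])       # fit on even-indexed points
c = Vt[-1]
residual = np.abs(A[1::2] @ c).max()   # check on odd-indexed points
print(residual)                        # ~0: all points lie on one conic
```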
Finally, we will inspect the displacement induced by the different optical densities of the media. In case all media (air, glass, water) had the same refractive index, we could compute the image position x_a of an observed 3D point X simply by using the projection matrix: x_a ≃ P X. This is the position where we would expect to see the point without a dome, just "in air". In case the three media have different optical densities, the light will undergo refraction and we will observe the same 3D point at another position x_r. 3D points that lie on the refraction axis will not be refracted and are projected onto the refraction center r in the image.
Theorem 2. The "in-air" observation x a of a point X and the underwater "refracted" position x r of that point X form a line with the refraction center r.
Proof. From lemma 2 it can be seen that all the segments of the light path, including the 3D point X, are in a plane jointly with the refraction axis. The intersection of that plane with the image plane forms a line. Since the camera center is in that plane too, also the direct line from the camera center to X is in that plane, and thus the unrefracted "in-air" observation must be in the same intersection line of that plane with the image plane.
This means that refraction happens actually along a line containing the unrefracted "in-air" observation and the refraction center (cf. also Fig. 5).
Now we consider the overall dome camera system in the water as a special kind of camera. We simply extend the water segments that lead from 3D points to the outer sphere to (infinitely long) lines: Theorem 3. The decentered dome port camera system is an axial camera.
Proof. From lemma 2 it can be seen that each water segment of a light path lies in a plane jointly with the refraction axis. The water segment is either parallel to the axis (intersection at an ideal point at infinity) or its extension has a Euclidean intersection with the axis. Consequently, all water segments of paths reaching the pinhole, i.e. the viewing rays in water, intersect the axis, and the overall system is an axial camera.

This means that in the image the 2D displacement direction with respect to the refraction center (inwards or outwards) depends on the sign of the 3D decentering vector, which explains the barrel and pincushion distortion (for backward and forward decentering, respectively) known to underwater photographers and empirically reported e.g. in [22].
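The axial-camera property can also be verified numerically: tracing random viewing rays through a thin dome and extending the resulting water segments, every one of them should (up to rounding) intersect the line through dome center and camera center. A sketch with a hypothetical decentering inside a unit dome, reusing the vector form of Snell's law:

```python
import numpy as np

def refract(d, n, eta):
    """Snell's law in vector form; n oriented against d, eta = mu1/mu2."""
    ci = -np.dot(n, d)
    return eta * d + (eta * ci - np.sqrt(1.0 - eta**2 * (1.0 - ci**2))) * n

rng = np.random.default_rng(0)
v_off = np.array([0.03, -0.01, 0.02])       # hypothetical decentering (unit dome)
axis = v_off / np.linalg.norm(v_off)        # refraction axis through the origin

dists = []
for _ in range(100):
    d = rng.normal(size=3); d /= np.linalg.norm(d)   # random viewing ray in air
    # Intersect the ray v_off + lam*d with the unit sphere (thin dome).
    b = np.dot(v_off, d)
    lam = -b + np.sqrt(b * b - (np.dot(v_off, v_off) - 1.0))
    I = v_off + lam * d
    w = refract(d, -I, 1.0 / 1.333)                  # water segment direction
    # Distance between the (extended) water line and the refraction axis.
    cr = np.cross(axis, w)
    dists.append(abs(np.dot(I, cr)) / np.linalg.norm(cr))

print(max(dists))   # ~0: every water ray meets the refraction axis
```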
Theorem 5. In the thin dome port model the maximum change of direction (refraction at sphere) happens to the rays that approach the camera center inside the plane perpendicular to the refraction axis.
Proof. Given a unit circle O, a point C inside the circle has the distance k ∈ [0, 1] to the circle center O. A ray from C intersects O at P with the incidence angle ∠α and an outgoing angle ∠β (see Fig. 4, Right). The change of direction can be represented as ∠diff = ∠α - ∠β. According to Snell's law, n_air sin α = n_water sin β, so the change of direction in the range [0, π/2] can be rewritten as

∠diff(α) = α - arcsin( (n_air / n_water) sin α ) .

Its first derivative is

d∠diff/dα = 1 - (n_air / n_water) cos α / sqrt( 1 - (n_air / n_water)² sin² α ) .

Since n_air < n_water, this derivative is strictly positive, so ∠diff is monotonically increasing for α ∈ [0, π/2] and does not have local maxima. The problem of finding a point P on the circle with the largest change of direction ∠diff is therefore equivalent to finding the P with the largest incidence angle ∠α.
Then, according to the Law of Sines in triangle OPC, sin ∠α / OC = sin ∠OCP / OP, i.e. sin ∠α = k sin ∠OCP for the unit circle. Since OC and OP are fixed and sin ∠α is monotonically increasing in the range [0, π/2], ∠α attains its largest value when sin ∠OCP reaches its maximum, i.e. when ∠OCP = π/2. Therefore, the maximum incidence angle ∠α on the circle occurs when PC ⊥ OC.
Since for the sphere all refraction happens in a plane of refraction, which always includes the axis, we can subdivide the sphere surface into great circles through the poles and consider the problem in each of them as a 2D problem inside the specific plane of refraction. As shown above, in each of them the maximum change of direction happens perpendicular to the axis. This means that the ray that reaches the camera center perpendicular to the refraction axis has undergone the largest angular change (see Fig. 4, Left), whereas the ray on the axis is not refracted at all. This is an important finding for setting up experiments to observe or to calibrate the decentering.
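Both parts of the argument are easy to check numerically. The following sketch (with a hypothetical ratio n_air/n_water and camera offset k) verifies that the deviation angle is monotone in the incidence angle and that, on the unit circle, the incidence angle is maximal for the ray leaving C perpendicular to OC:

```python
import numpy as np

eta = 1.0 / 1.333     # hypothetical n_air / n_water
k = 0.3               # hypothetical distance of camera center C from dome center O

# 1) diff(alpha) = alpha - arcsin(eta*sin(alpha)) is monotonically increasing.
alphas = np.linspace(0.0, np.pi / 2 - 1e-6, 10000)
diff = alphas - np.arcsin(eta * np.sin(alphas))
print(np.all(np.diff(diff) > 0))               # True

# 2) Parameterize rays from C = (0, k) by the angle phi they form with the
#    axis OC, and measure the incidence angle at the circle intersection P.
phis = np.linspace(1e-3, np.pi - 1e-3, 20001)
C = np.array([0.0, k])
alphas_at_P = []
for phi in phis:
    d = np.array([np.sin(phi), np.cos(phi)])   # ray direction, phi from axis
    b = np.dot(C, d)
    lam = -b + np.sqrt(b * b - (np.dot(C, C) - 1.0))
    P = C + lam * d                            # intersection with unit circle
    cos_a = np.clip(np.dot(d, P), -1.0, 1.0)   # normal at P is P itself
    alphas_at_P.append(np.arccos(cos_a))
phi_star = phis[int(np.argmax(alphas_at_P))]
print(phi_star)                                # ~pi/2, i.e. PC perpendicular to OC
```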
Note that the largest effect in the image also depends on the orientation of the camera, since the angular resolution of a pinhole camera increases towards the image boundaries: lateral (left-right or up-down in the camera coordinate system) decentering will provide a much clearer signal-to-noise ratio of refraction effects vs. corner detector uncertainty than forward-backward decentering.

Decentered Dome Calibration
In this section, we derive a calibration procedure from the insights of the previous section. First, we present the geometrical considerations that allow directly inferring the refraction axis and distinguishing positive from negative decentering from one underwater photo showing refracted chessboard corners. The result can be used to initialize a v_off optimization that uses multiple images to actually measure the decentering. Throughout the calibration, we assume the camera intrinsics are known.

Direct Estimation of the Refraction Center
We describe a corner's position on the original chessboard by x_c (cf. Fig. 5). When photographing a chessboard without refraction, the "as in air" image coordinates x_a and the original chessboard pattern positions are related by a perspectivity [32], a special kind of homography:

x_a ≃ H x_c .

Keeping the chessboard pose, now consider the camera being behind a dome port, with the entire system of camera, dome and chessboard submerged in water (underwater in Fig. 5). Imagine a line q through the unrefracted point x_a and the refraction center:

q = [r]_× x_a .

By theorem 2, the refracted point x_r must lie somewhere on this line: x_r^T q = 0. If we now substitute q and x_a, we obtain a constraint that must hold between all points in the refracted image and their corresponding chessboard positions:

x_r^T [r]_× H x_c = x_r^T F x_c = 0 , with F = [r]_× H .

This relation is reminiscent of epipolar geometry, where all "displacement" (due to parallax) also happens towards or away from the epipole. This principle has been exploited by [30] for calibration of radial distortion. We use it in a similar way to find the refraction center, but work on 3D rays rather than 2D points, and we do not have radial symmetry of the displacement in the image.
Note that [2] obtain an algebraically similar setting for refractive projection through flat interfaces. Essentially, as in epipolar geometry estimation, one can rearrange this equation using the Kronecker product ⊗ and the vectorization operator vec [34] to obtain constraints on the matrix F:

(x_c^T ⊗ x_r^T) vec(F) = 0 .

Many of these equations can be stacked, as in the eight-point algorithm for fundamental matrix estimation, in order to estimate the vectorized matrix F. After recomposition of F, the refraction center r can then be extracted as the left null vector of F.
Since H is actually a perspectivity, it would even be possible to use a 5-point algorithm as for essential matrix estimation, but as we use the refraction center only as an initial guess for subsequent optimization, and as many reliable correspondences are obtained using a chessboard detector, the normalized 8-point algorithm is a good fit for our purposes (also avoiding the ambiguity of up to ten solutions). Multiple images of chessboards can be combined in the same way as described in [30] for 2D radial distortion center estimation.
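The estimation can be sketched on synthetic data as follows (H, r and the per-point displacement magnitudes below are made up for illustration; real input would be detected chessboard corners): stack one equation per correspondence, solve for vec(F) via SVD, and read off the refraction center as the left null vector of F.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical ground truth: a well-conditioned homography H and a
# refraction center r, in homogeneous (normalized) image coordinates.
H = np.eye(3) + 0.1 * rng.normal(size=(3, 3))
r = np.array([0.2, -0.1, 1.0])

rows = []
for u in np.linspace(-1, 1, 5):
    for v in np.linspace(-1, 1, 5):
        x_c = np.array([u, v, 1.0])
        x_a = H @ x_c; x_a /= x_a[2]            # unrefracted "in-air" point
        t = 0.05 + 0.1 * rng.random()            # varying displacement magnitude
        x_r = x_a + t * (r - x_a)                # refracted point on line (x_a, r)
        # x_r^T F x_c = <outer(x_r, x_c), F>, one linear constraint on F.
        rows.append(np.outer(x_r, x_c).ravel())

# Coordinates are O(1) here, so Hartley normalization is skipped for brevity.
A = np.array(rows)
_, _, Vt = np.linalg.svd(A)
F = Vt[-1].reshape(3, 3)                         # null vector of the stacked system

# The refraction center is the left null vector of F (since F ~ [r]_x H).
U, _, _ = np.linalg.svd(F)
r_est = U[:, -1]; r_est /= r_est[2]
print(r_est)                                     # ~ (0.2, -0.1, 1)
```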
Note however that according to theorem 1 iso-refraction curves (with respect to refraction angle) are conic sections in the image and that the refraction effect in pixels is depth-dependent, so it cannot be described by radial distortion and underwater images cannot simply be "unrefracted" without 3D scene knowledge.
Degenerate Cases. In case the camera is already perfectly centered, the second-smallest singular value of the resulting equation system will also be zero and no unique F can be obtained. This corresponds to the case of no radial distortion in [30] or fundamental matrix estimation in a planar scene, and can easily be detected. Besides that, the same algebraic conditions hold as for fundamental matrix estimation (e.g. number of correspondences, non-collinearity [32]). Now, using the camera orientation R, the corresponding refraction axis direction in world coordinates can be computed as

a ≃ R^T r .

Note that so far we have obtained the refraction axis, but the sign of the decentering along the axis (forward vs. backward) is still missing. When drawing a line through two refracted chessboard corner points x_r,1 and x_r,3 (using the cross product operator ×), we can determine whether a third chessboard point x_r,2 between them is projected onto the same side of their connecting line as the refraction center by the convexity test

convexity(x_r,1, x_r,2, x_r,3, r) = sign( (x_r,1 × x_r,3)^T x_r,2 ) · sign( (x_r,1 × x_r,3)^T r ) .

A positive value means that x_r,2 and r lie on the same side of the line. The sign thus tells whether the refracted pattern bends towards or away from the refraction center and thereby distinguishes barrel (backward decentering) from pincushion (forward decentering) distortion.

Forward Projection for Thin Domes

We now consider the thin dome model, where there is only one spherical layer of refraction. By lemma 2, we know that all segments of a light path, the camera center and the refraction axis lie in a single plane; therefore, the refraction at the spherical layer can be analyzed in the plane of refraction, which is similar to the derivation for multi-layer flat interfaces [2]. The difference is that here the plane of refraction is constructed from the refraction axis and the 3D scene point. Let the 3D vectors z_1 and z_2 be the vertical and horizontal axes of the 2D local coordinate system in this plane, with the sphere center at the origin. Let z_1 align with the refraction axis and point from the sphere center to the camera's optical center, as illustrated in Fig. 6. Then the normal of the plane can be found by taking the cross product of the 3D point X and the vertical axis z_1. Afterwards, we find the horizontal axis z_2 as the cross product of z_1 and the normal of the plane. A point on the plane of refraction therefore has 2D Euclidean coordinates along (z_2, z_1)^T.

Assume the camera is decentered by a distance d, so the camera center has the 2D coordinate (0, d)^T. The 2D coordinate of the 3D scene point X in the plane is given by x = (u_x, u_y)^T, where u_x = X^T z_2 and u_y = X^T z_1. Writing Snell's law for the single refraction at the circle in this 2D coordinate system then leads, analogous to the flat-port derivation of [19], to the 6th-degree polynomial equation in the unknown refraction point mentioned above.

Iterative Forward Projection for Thick Domes
As outlined in the previous section, the thin flat port projection equation of degree 4 by [19] becomes degree 6 for the thin dome port, essentially because of the quadratic nature of the refraction surface. For thick flat ports, [2] derived a 12th-degree polynomial for the analytical forward projection. Both for this case and for the case of imaging through a solid glass sphere (degree 10, also by [31]), only extra constraints of the special setting helped to bring the polynomial degree down to 12 and 10, respectively. For the thick dome we have not found a similar extra constraint, and even if one were found, the quadratic nature of the refraction surface comes into play, so the degree of the polynomial would likely be significantly higher than 12. Solutions to high-degree polynomials can become numerically unstable and require iterative, numerical solvers anyway. Consequently, for forward projection through thick domes we turn to the numerical approach proposed in [16] and find the projection by iteratively solving the inverse problem (back-projection) until the correct 2D point is found. Note that according to theorem 2, the correct 2D point lies on the line joining the refraction center r and the "in-air" observation x_a of the 3D point. Rather than a 2D search over the image, the search can thus be restricted to the 1D line connecting x_a and r, which simplifies iterative forward projection.
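The restriction to a 1D search can be sketched for the thin dome model, where back-projection is available in closed form. In the following numpy example (hypothetical dome radius, decentering and refractive indices; a simple nested grid search stands in for whatever 1D solver one prefers), a 3D point is generated by tracing a known pixel into the water, and its forward projection is then recovered by searching only along the line through x_a and the refraction center:

```python
import numpy as np

ETA = 1.0 / 1.333                      # mu_air / mu_water (thin dome)
R_DOME = 0.05                          # hypothetical dome radius [m]
c = np.array([0.004, 0.002, -0.006])   # hypothetical camera center (decentering)

def refract(d, n, eta):
    ci = -np.dot(n, d)
    return eta * d + (eta * ci - np.sqrt(1.0 - eta**2 * (1.0 - ci**2))) * n

def backproject(p):
    """Water ray (point, direction) for a normalized image point p = (x, y)."""
    d = np.array([p[0], p[1], 1.0]); d /= np.linalg.norm(d)
    b = np.dot(c, d)
    lam = -b + np.sqrt(b * b - (np.dot(c, c) - R_DOME**2))
    I = c + lam * d
    return I, refract(d, -I / R_DOME, ETA)

# Ground truth: pick a pixel, trace it into the water, place a 3D point X on it.
p_true = np.array([0.25, 0.15])
I0, w0 = backproject(p_true)
X = I0 + 1.0 * w0

# By theorem 2, the image of X lies on the line through its "in-air"
# projection x_a and the refraction center r_c: a 1D search suffices.
x_a = (X - c)[:2] / (X - c)[2]
r_c = (-c)[:2] / (-c)[2]

def ray_dist(t):
    I, w = backproject(x_a + t * (r_c - x_a))
    return np.linalg.norm(np.cross(X - I, w))   # distance of X to the water ray

lo, hi = -1.0, 2.0
for _ in range(4):                              # nested grid search on the line
    ts = np.linspace(lo, hi, 2001)
    t0 = ts[int(np.argmin([ray_dist(t) for t in ts]))]
    lo, hi = t0 - (hi - lo) / 2000, t0 + (hi - lo) / 2000
p_found = x_a + t0 * (r_c - x_a)
print(np.linalg.norm(p_found - p_true))         # ~0
```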

Decentering Estimation
Having obtained a good starting value from the direct solver and the convexity test, we now describe a procedure to optimize the decentering vector using several images at the same time.
Different from the approach presented in [25], we estimate the decentering vector v_off ∈ R³ only from underwater imagery. Since there is no in-air photo (as in [25]) to provide accurate pose estimation, the poses P_i of the chessboard images have to be estimated jointly with v_off. This results in 6m + 3 parameters Θ = (v_off, P_1, P_2, ..., P_m), where m is the number of chessboard images, and the estimation relies on 2 · m · #corners measurements. Assume the corners have Euclidean chessboard coordinates X = (X, Y, 0)^T, and let x_i^j denote the i-th measured corner in the j-th image. Then we minimize the energy

E(Θ) = Σ_j Σ_i || X̄(x_i^j, Θ) - X_i ||² ,

where X̄(x_i^j, Θ) is the 3D point on the chessboard plane back-projected from x_i^j. Instead of minimizing the re-projection error, we back-project the corners detected in the images onto the 3D chessboard and minimize the sum of squared differences inside the chessboard plane, since this is computationally much more efficient than performing the iterative forward projection within each optimization step.
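The back-projection energy can be sketched for the thin dome model. The example below is a strong simplification of the paper's procedure: the board pose is assumed known, only v_off is optimized, and all numbers are hypothetical; scipy's least_squares stands in for the optimization scheme:

```python
import numpy as np
from scipy.optimize import least_squares

ETA, R_DOME, Z_BOARD = 1.0 / 1.333, 0.05, 1.0   # thin dome; board plane z = 1

def refract(d, n, eta):
    ci = -np.dot(n, d)
    return eta * d + (eta * ci - np.sqrt(1.0 - eta**2 * (1.0 - ci**2))) * n

def board_point(p, v_off):
    """Back-project normalized image point p to the board plane z = Z_BOARD."""
    d = np.array([p[0], p[1], 1.0]); d /= np.linalg.norm(d)
    b = np.dot(v_off, d)
    lam = -b + np.sqrt(b * b - (np.dot(v_off, v_off) - R_DOME**2))
    I = v_off + lam * d
    w = refract(d, -I / R_DOME, ETA)
    s = (Z_BOARD - I[2]) / w[2]
    return (I + s * w)[:2]

# Synthetic "detected corners": a pixel grid imaged with the true decentering.
v_true = np.array([0.003, 0.002, -0.004])        # hypothetical ground truth [m]
pixels = [np.array([u, v]) for u in np.linspace(-0.4, 0.4, 5)
                           for v in np.linspace(-0.4, 0.4, 5)]
corners = [board_point(p, v_true) for p in pixels]

# Energy: squared differences measured in the board plane (much cheaper than
# running the iterative forward projection inside each optimization step).
def residuals(v):
    return np.concatenate([board_point(p, v) - q for p, q in zip(pixels, corners)])

v_hat = least_squares(residuals, np.zeros(3), method='lm').x
print(v_hat)        # ~ (0.003, 0.002, -0.004)
```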

Evaluation
To validate the geometrical insights into the decentered dome and to evaluate the proposed decentering calibration approach, we have conducted three different types of experiments. The first type of experiment was performed on synthetic data, where we numerically simulated projections of 3D points by a decentered dome port camera system with perfectly known ground truth and noise models. This helps to understand performance and to look at the magnitude of effects. To also take into account effects as they occur in real images (corner detection problems, reflections, field of view issues and other physical limitations), we have conducted real-world experiments with a deep sea dome port in a test tank. Here, however, evaluation becomes very indirect, as it is very hard to obtain ground truth information (in particular the real decentering), and experiments are sensitive to deformation of the tank due to the weight of the water, calibration uncertainties, inaccurate physical measurements of distances and many other effects. While all this will also occur in complex systems and real-world applications, we think it is nevertheless important to isolate and understand the refraction effects. We have therefore put substantial effort into another step of evaluation that employs an open-source ray-tracing toolbox [35] to faithfully render images with ground truth settings. In particular, we modeled a real deep sea dome port, including all radii and materials, in a virtual copy of our real water pool as realistically as possible, and we have verified that we can accurately reproduce images taken by the real system. In this setup, we can control all physical parameters and understand effects in valid experiments.

Synthetic Experiments
First, to validate the proposed decentered dome geometry, we simulated the geometric projection of a dome port camera system with known decenterings.
Here, the refraction displacement fields in the images were generated to visualize the theorems for the decentered dome geometry, as shown in Fig. 7. The dome center in the local camera coordinate system was directly projected onto the image as the refraction center. In order to better visualize the displacement direction (inward vs. outward) with respect to the refraction center, the dome center was placed in front of the camera center (Fig. 7, Top) and behind the camera center (Fig. 7, Bottom), respectively. The original patterns, shown as green grids, were created as if a chessboard was observed in air (not refracted). Then, the pattern was back-projected into 3D space at 1m depth to obtain the pattern coordinates. Afterwards, we projected the pattern into the image considering the refraction effect, as if it was observed underwater; this is shown as red grids. As can be seen, the grids are refracted either towards or away from the refraction center, depending on the direction of the decentering.
In addition, the lines joining the un-refracted points and the refracted points all pass through the refraction center. Results of the decentering calibration on synthetic, noise-perturbed corner measurements can be found in Fig. 9, where the rotation angle error and the translation error show the differences between the estimated camera poses and the ground truth poses. As can be seen, the unknown parameters were computed with high accuracy even though a considerable amount of noise was added to the measurements. Note that the uncertainties of the estimated parameters in the second set are higher than in the first set, since the refraction effect is much weaker in those images, but the results can still be considered accurate.
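The collinearity of un-refracted point, refracted point and refraction center can be checked numerically: the 2D cross product of the two difference vectors with respect to the refraction center must vanish. The sketch below uses a purely radial toy displacement model with an assumed refraction center, not the paper's dome model:

```python
import numpy as np

c = np.array([315.0, 260.0])              # assumed refraction centre (px)

def displace_radially(x_a, scale=1.08):
    """Toy 'outward' refraction: move x_a away from c along c -> x_a."""
    return c + scale * (x_a - c)

rng = np.random.default_rng(0)
pts_air = rng.uniform(0.0, 640.0, size=(50, 2))   # unrefracted points
pts_wtr = displace_radially(pts_air)              # refracted points

# collinearity: cross product of (x_a - c) and (x_r - c) must vanish
a, b = pts_air - c, pts_wtr - c
cross = np.abs(a[:, 0] * b[:, 1] - a[:, 1] * b[:, 0])
print("max |cross|:", cross.max())
```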

Experiments on Rendered Images
To validate the image formation model of our dome port camera system and also to take imaging and corner detection effects into account in the evaluation, we employed the ray-tracing toolbox Blender 2 to render underwater imagery with physically plausible ground truth parameters. Blender is a physics-based ray-tracing software package for creating 3D worlds with realistic optical parameters for lighting and media, such as the index of refraction. It also allows placing virtual cameras to simulate capturing the scene in 2D images. To validate our camera model, we modeled the water pool and the dome port in Blender, mimicking the same setup as it exists in reality in our lab. This allowed us to identify discrepancies between real images and those from the simulation, and strongly supports that our simulated experiments are valid. The setup for rendering is visualized in Fig. 10. The dome port was modeled as concentric hemispheres with a 7mm thick interface of borosilicate glass 3.3 (index of refraction = 1.473). The ray tracer then traces light rays backwards, starting from the camera sensor, and computes the light paths when traveling through the glass dome; thus, the refraction effect is simulated separately for each pixel in the image. This is different from refraction simulation in rasterization-based renderers (e.g. [37]) that apply refraction effects to the image after the actual rendering process. Using ray-tracing, the refraction is instead handled naturally, as can be seen in Fig. 11a, showing a chessboard half underwater and half in air. When the camera is exactly centered in the dome (see Fig. 11a, Center), we can clearly see that there is no refraction in the image. When pulling the camera away from the center, the lower part of the image is refracted inwards, whereas it is refracted outwards when the camera is positioned in front of the dome center. Fig. 11b shows quantitatively the angular change of the viewing rays due to refraction at the dome sphere for each decentering case.
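At each surface of the glass sphere, such a backward ray tracer bends the ray according to Snell's law. A minimal vector-form sketch is given below (a full tracer additionally needs the ray/sphere intersection; the glass index follows the text, the water index of about 1.333 is a standard assumption):

```python
import numpy as np

def refract(d, n, eta):
    """Snell's law in vector form: refract unit direction d at unit
    surface normal n (pointing against d); eta = n1 / n2."""
    cos_i = -np.dot(n, d)
    k = 1.0 - eta**2 * (1.0 - cos_i**2)
    if k < 0.0:
        return None                      # total internal reflection
    return eta * d + (eta * cos_i - np.sqrt(k)) * n

n_air, n_glass, n_water = 1.0, 1.473, 1.333   # water index assumed

# usage: a ray hitting the sphere along its local normal is undeviated,
# which is exactly the refraction-free centred-dome case
n = np.array([0.0, 0.0, -1.0])           # surface normal at hit point
d = np.array([0.0, 0.0, 1.0])            # ray along the local normal
print(refract(d, n, n_air / n_glass))    # [0. 0. 1.]
```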
To validate the image formation model, we projected chessboard corners onto the images and compared their positions against the images rendered by Blender and the real photos. The parameters for synthesizing were exactly the same as those used for rendering; among them, the camera intrinsics were obtained by calibrating the real camera in air, and the poses of the chessboard where the real images were taken were computed via the decentering calibration. The resulting images are shown in Fig. 12; as can be seen, the synthesized chessboard corners, the rendered images and the real images match very well.
Next, we rendered 8 sets of images for different decentering situations and evaluated the refraction center estimation and the decentering vector calibration.
The chessboard corners were automatically detected and the calibration results are shown in Table 1. Fig. 13 provides some of the resulting images, where the estimated refraction centers are marked as green spots, together with the reprojections. The pose accuracy is measured by the translation error, which is given as [38]: e_i = || trans(P_gt,i^-1 P_est,i) ||, where P_est,i and P_gt,i are the estimated camera pose and the ground truth pose for the i-th image, respectively, and trans() extracts the translational component from the transformation matrix. As can be seen, the system can determine the decentering with high accuracy. Note that in set ④, the camera is decentered horizontally, thus the refraction axis is parallel to the image plane and the refraction center is located at infinity.
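Assuming poses are given as 4x4 homogeneous transformation matrices, the translation error metric referenced from [38] reduces to a few lines (a sketch; the matrix layout is an assumption):

```python
import numpy as np

def trans(T):
    """Extract the translational component of a 4x4 transform."""
    return T[:3, 3]

def translation_error(T_est, T_gt):
    """Norm of the translation of the relative transform T_gt^-1 T_est."""
    return np.linalg.norm(trans(np.linalg.inv(T_gt) @ T_est))

# usage: a pose displaced by 1 cm along x gives a 0.01 m error
T_gt = np.eye(4)
T_est = np.eye(4)
T_est[0, 3] = 0.01
print(f"{translation_error(T_est, T_gt):.4f} m")   # prints "0.0100 m"
```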

Real-World Experiments
The proposed geometry insights and the calibration approach have been demonstrated on synthetic data and rendered images, but we finally want to validate them also in a real-world scenario. The experimental setup is shown in Fig. 10, top: the dome port with 50mm radius and 7mm thickness was attached to the sidewall of a test tank, and the tank was filled with water. The first validation method is similar to the one suggested in [39], which faces a similar challenge. We separated our dataset into a calibration set and a validation set. The calibration set was used to perform the calibration as described above, whereas the latter was used for validating the estimated decentering vector. For validation, the 4 outermost 3D-2D pairs of chessboard corners were utilized to compute the relative transformation of the camera with respect to the chessboard. For a second validation, we captured an in-air/underwater image pair of the same chessboard and used the in-air photo for estimating the chessboard pose, which we denote as P_air. Next, we used the underwater photo to estimate the chessboard pose again, which we denote as P_unw. Theoretically, the two estimated poses should be exactly the same, as we assume that the chessboard position and orientation did not change while taking the image pair.
Consequently, we can measure the relative pose error between the two poses to validate the calibration results. The measured pose error in the translation component was 0.0107m. It should be mentioned that when computing the in-air chessboard pose, we accounted for the refraction occurring when light rays travel through the air-glass-air interfaces. Afterwards, we performed the reciprocal reprojection validation, where we projected the 3D chessboard corners onto the underwater image using the in-air estimated pose P_air and onto the in-air image using the underwater estimated pose P_unw.
The average corner displacement induced by the refraction effect between the in-air and underwater image is 25.24 pixels. When considering refraction with the decentering as obtained from the disjoint set of calibration images, the measured distances are reduced to 0.59 and 0.62 pixels, respectively; the small remaining error can be explained by the unavoidable non-rigidity of the setup when filling the tank with water and some other potential sources of uncertainty, as we will outline in the discussion. The results are displayed in Fig. 16. Overall, the reduction of error and the previous validation show that the proposed methods are valid and can be applied in practice.

Practical Calibration of an AUV Camera
Finally, and this was the motivation for this work in the first place, the insights and the developed techniques have been applied to a newly developed AUV camera for ocean research. This machine vision camera has a wide angle lens inside a 50mm dome port and a pressure housing. We first calibrated the camera in air and then applied the mechanical adjustment method proposed in [25] until no refraction effects were observable. Afterwards, the entire system was submerged in a tank and chessboard images were recorded.
The images were undistorted according to the in-air calibration; then we applied the steps reported in the previous sections. We obtained a calibration residual of 0.299 pixels for a decentering estimate of v_off = (−0.36, 0.14, 0.27)mm, i.e. less than half a millimeter of decentering in total. The camera system, sample calibration images with reprojected corners, as well as the AUV are displayed in Fig. 18.

Discussion
As the proposed geometrical insights and the decentering calibration approach rely on refraction effects caused by a decentered camera photographing through a spherical interface, in this section we would like to report some lessons learned.
Homography Mapping Error. In the classical approach of using a chessboard for calibration, which we also follow in this manuscript, we have to estimate a chessboard pose for each calibration image on top of the actually desired refraction parameters. If we can directly compute a homography between an underwater image and the chessboard pattern, and the residual error for the homography is below the corner detector noise, then the center of refraction is effectively not observable in this calibration image, i.e. the refraction effects are drowned in noise (see Fig. 17b). Typically, this scenario occurs when the decentering is very small or the chessboard corners in the image are located close to the refraction center (low refraction effects, see section 3). Also, in case the refracted image point x_r and the un-refracted image point x_a are related by a 2D similarity transformation S, the overall relation between refracted image coordinates and 3D chessboard points becomes x_r = S x_a = S H' X = H X, where H' denotes the homography relating the un-refracted image points to the 3D chessboard points. This means that the overall homography H can absorb all refraction effects and the equation system arising from eq. 8 is underconstrained (ambiguous).
To detect and avoid these situations, we define the Homography Mapping Error as the residual of a homography mapping from the 3D points to the image points. We recommend monitoring the Homography Mapping Error for each calibration image and, if it is low (compared to the noise level), placing the chessboard away from the refraction axis, where stronger effects are visible (larger signal-to-noise ratio). Additionally, the Homography Mapping Error can be used to pre-select or to weight images for decentering calibration.
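A sketch of this check is given below, using a plain (unnormalized) DLT homography fit from the planar chessboard coordinates to the image points; all numbers are illustrative. A similarity-displaced pattern is absorbed by the homography (near-zero error, the ambiguous case), while a nonlinear radial displacement about a refraction center is not:

```python
import numpy as np

def fit_homography(P, Q):
    """Unnormalized DLT: find H with Q ~ H P (both Nx2, inhomogeneous)."""
    A = []
    for (x, y), (u, v) in zip(P, Q):
        A.append([-x, -y, -1, 0, 0, 0, u * x, u * y, u])
        A.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])
    return np.linalg.svd(np.array(A))[2][-1].reshape(3, 3)

def hme(P, Q):
    """Homography Mapping Error: RMS residual of the best-fit H (px)."""
    Ph = np.c_[P, np.ones(len(P))] @ fit_homography(P, Q).T
    return np.sqrt(np.mean(np.sum((Ph[:, :2] / Ph[:, 2:] - Q) ** 2, axis=1)))

# an 8x6 board with 30 mm squares, viewed by an illustrative camera
gx, gy = np.meshgrid(np.arange(8), np.arange(6))
P = np.c_[gx.ravel(), gy.ravel()] * 0.03

s, ang, c = 900.0, 0.1, np.array([320.0, 240.0])
S = s * np.array([[np.cos(ang), -np.sin(ang)], [np.sin(ang), np.cos(ang)]])
Q_sim = P @ S.T + c                       # similarity-displaced points

r = Q_sim - c                             # radial displacement about c
Q_ref = c + r * (1.0 + 2e-6 * np.sum(r**2, axis=1, keepdims=True))

print(f"HME similarity: {hme(P, Q_sim):.2e} px, radial: {hme(P, Q_ref):.2f} px")
```

The near-zero similarity case illustrates why such images carry no information about the decentering and should be down-weighted or excluded.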
Non-Single-View-Point Camera. The geometry in this paper is analyzed under the assumption that the camera has a single center of projection. For non-single-view-point lenses such as fisheye lenses, there is no perfect pinhole but rather a caustic through which all light rays pass. Centering such a lens in the dome port is in principle difficult, as we can only bring one point to the dome center.
Uncertainty of Intrinsic Calibration. It is well known that the principal point of a camera is difficult to observe in air [40,41]. Conversely, this also means that it does not have a big impact on 3D mapping in air. Underwater, however, the situation is different: since the principal point is the intersection point of the optical axis with the image plane, the ray direction and refraction for every pixel are affected once the principal point changes, creating a different refraction pattern. Chessboard-based experiments to calibrate both the camera intrinsics and the decentering jointly failed: although a lower calibration residual can be achieved, the estimated parameters were off from the ground truth. These parameters are correlated, and probably more powerful (non-flat) calibration targets are required to better constrain the principal point. As this contribution discusses refraction effects, calibration of intrinsics is beyond the scope of this paper, but high-precision principal point calibration using underwater refraction could be an interesting option for future research.
Advantage over In-Air/Underwater Pair Calibration. In order to obtain independent measurements of the decentering we have performed the method proposed in [25] that is based on a pair of images, one taken in water, the other one in air, but with the same pose. While this method works in principle, the accuracy is limited by physical constraints.
When trying to bring the air and water images into as good an agreement as possible, we realized that it was extremely difficult to keep the chessboard steady when capturing the in-air/underwater image pair during the experiment. When water is filled into the pool, waves are generated at the water surface, making the chessboard unstable. In addition, we found that the water pool deforms slightly when carrying more water, due to the increasing pressure.
Consequently, we had to spend substantial effort to keep the chessboard steady.
As can be seen in Fig. 15a, we first firmly attached the chessboard to a metal frame and then mounted this assembly on a wooden bar on top of the pool. Next, nylon ropes were used to connect the chessboard with the sidewall of the pool. Nevertheless, we still believe that about half a pixel of error remains due to non-rigidity of the setup. We therefore conclude that the in-air/underwater pair method is limited in its achievable accuracy. The pure underwater calibration suggested in this paper is more practical: it can be cumbersome to bring the chessboard and camera (e.g. attached to a robot) from water into air without changing their relative pose, while in the proposed method an underwater camera just has to be submerged, as shown in Fig. 18 for a camera of an AUV.
Other cameras as shown in Fig. 1 can also be calibrated using this method even in the ocean.

Conclusion
In this paper, we have presented new insights into refraction effects caused by a decentered camera behind a spherical window. Somewhat similar to the flat refractive geometry case, the overall system acts as an axial camera, but here the axis intersects the sphere in two different poles that determine barrel- or pincushion-like refraction effects. Refraction happens along a line that connects the refraction center with the unrefracted projection, reducing the 2D search for a projection to a 1D search.
It was then shown how to directly estimate the refraction center from a single underwater chessboard image, and that it is possible to infer the decentering from underwater chessboard images only. Based on this, we presented a novel, practical approach to underwater calibration of dome port camera systems.
The approach can be used without a special pool with windows at the water level and does not need an in-air/underwater image pair, facilitating calibration also for bulky systems like submersibles. The results obtained by our method are not only relevant for adjustment; the remaining decentering offset can also be considered when using the camera in practice, e.g. similar to refractive structure from motion with flat ports as proposed by Jordt et al. [4]. In future work, we strive to integrate decentered dome port cameras into state-of-the-art structure from motion pipelines to enable 3D reconstruction with dome port cameras, to facilitate a more detailed 3D error analysis, and to compare uncertainties and performance of different camera models.