Shape from Mixed Polarization

Shape from Polarization (SfP) estimates surface normals using photos captured at different polarizer rotations. Fundamentally, the SfP model assumes that light is reflected either diffusely or specularly. However, this model is not valid for many real-world surfaces exhibiting a mixture of diffuse and specular properties. To address this challenge, previous methods have used a sequential solution: first, use an existing algorithm to separate the scene into diffuse and specular components, then apply the appropriate SfP model. In this paper, we propose a new method that jointly uses viewpoint and polarization data to holistically separate diffuse and specular components, recover refractive index, and ultimately recover 3D shape. By involving the physics of polarization in the separation process, we demonstrate competitive results with a benchmark method, while recovering additional information (e.g. refractive index).


Introduction
For centuries, it has been known that the shape of an object influences the polarization state of reflected light (Augustin-Jean Fresnel, 1788-1827). This principle underlies the Shape from Polarization (SfP) technique, which aims to recover the surface normals of an object from three polarized photos.
Classical approaches to SfP rely on specular reflections from an object (hereafter, specular SfP). In an effort to handle purely diffuse surfaces, Atkinson and Hancock introduced a landmark result, modifying the physical model to account for cases where all the light is diffusely reflected (hereafter, diffuse SfP) [1]. However, many surfaces exhibit properties that are neither diffuse nor specular, but somewhere in between. A "mixed reflection" occurs: both diffusely and specularly reflected light return to the camera, causing model mismatch.
Obtaining surface normals through polarization is mostly a laboratory problem, with several practical challenges. For example, one needs to know the refractive index of the material; the material must be either diffuse or specular; and ill-posed ambiguities exist for both zenith and azimuth angles. Recent work has used a coarse depth map to provide what may be a promising step toward "in-the-wild" uses of SfP [2] (hereafter, "Polarized 3D"). While Polarized 3D has demonstrated compelling results, we believe our work offers complementary benefits.
At the heart of our work is an analysis of mixed reflections and their impact on existing techniques that use SfP. We find that, indeed, a mixed reflection perturbs the result to the point where correction is desirable. We therefore propose a physics-based technique to correct for mixed reflections using multiple viewpoints of an object, demonstrating the practical benefits of our approach through comparisons with previous work.
Scope: Our contribution of extending SfP to handle mixed surfaces is a unified approach. Prior art has proposed a sequential approach, where the scene is first split into diffuse and specular components, following which the appropriate SfP algorithm can be used. For example, the work of Miyazaki et al. [3] handles mixed surfaces by first using an algorithm for diffuse-specular separation, proposed by Tan and Ikeuchi [4], following which standard SfP techniques are used. Since the Tan and Ikeuchi technique is very general, i.e., it is not specific to polarization, we believe that the information from the Fresnel equations could be used to improve on previous work.
In this paper, we develop an approach that incorporates the SfP model to aid in separating the image into diffuse and specular components. We also show that our proposed approach allows simultaneous recovery of refractive index, while outperforming sequential approaches.

Related Work
In the context of related work, we believe our proposed technique is the first unified approach toward joint estimation of shape, diffuse-specular separation, and pixel-wise refractive index.
The Fresnel equations describe the behavior of electromagnetic radiation as it interacts with a surface. When light interacts with a surface, the first-order event that occurs is reflection and transmission at the boundary (Figure 1a). The Fresnel equations relate the angles of reflection (e.g., the zenith component of surface normal) with the refractive index of the medium as well as polarimetric properties.
Shape from Polarization (SfP) is the term used in computer vision for a technique that estimates surface normals using the principles of the Fresnel equations. Classical SfP requires measurement of the polarimetric properties (through 3 polarized photos) and estimation of the refractive index to solve for the angle of reflection. We concern ourselves with two primary branches of the SfP technique: first-order and higher-order. The first-order SfP techniques assume a reflection model akin to that in Figure 1a. Following this model has allowed shape estimation of metals [5], transparent objects [6,7], dark objects [8], and even ocean waves [9]. Higher-order SfP techniques rely on multiple interactions of light with a medium, as in the case of Figure 1b. In such a case, the Fresnel equations are applied differently, allowing for shape recovery of diffuse, subscattering surfaces. This is described with compelling experimental support by Atkinson and Hancock [1].
Diffuse-specular separation refers to a broad class of computational and optical techniques to decompose an image into a specular-only image and diffuse-only image. The most general techniques use only image or color information, but these can be susceptible to artifacts. For example, Nishino et al. introduced a technique that uses view-independent effects to identify the diffuse reflection in an image [10]. Other strategies combine image-based measurements with color analysis [10,11,12,13,4]. In this paper we show that it is beneficial to leverage the behavior of polarization to perform this separation. Previous attempts have used polarization to separate diffuse and specular components in the context of active illumination. In particular, spherical gradient illumination has been used by Ma et al. [14] and Ghosh et al. [15] to achieve photorealistic geometric reconstructions. For passive conditions, Nayar et al. introduced a separation technique that uses polarization images and color cues [16]. While successful, the Nayar method, and a related method proposed by Zickler et al. [17], are limited by smoothness assumptions. In short, though it is possible to directly combine existing work (for example, the SfP paper in [3] uses the separation method from [4]), we show that joint incorporation of reflection separation with SfP physics is a beneficial strategy for addressing the mixed reflection problem in SfP.
Extended topics in polarization that are tangentially relevant, but outside the direct scope of this paper, are described in brief. While our paper considers linear polarization effects, work from [18] demonstrates shape reconstructions using circular polarization. Polarization information need not only be used for shape: prior art has considered problems like image dehazing [19], illumination multiplexing [20], panoramas [21], underwater scattering [22], or 3D displays [23]. In comparison to these related works, our paper is specific to the SfP problem. Future work could use, for example, the descattering model of [19] to possibly obtain shape in scattering environments.
Fig. 2. Notation used for intensity components (plot of image intensity versus polarizer angle). As the polarizer angle is rotated, the image intensity varies in accordance with Equation 1. Only the intensity quantities I_max and I_min can be directly measured with a camera. One of the aims of this work is to recover I_d using the physical behavior of polarization, shape, and reflectance.

Image Formation Model
This section describes SfP in condensed form. Conceptually, SfP uses image-based measurements to estimate the surface angles of azimuth, ϕ, and zenith, θ. We will use $\hat{\phi}$ and $\hat{\theta}$ to denote estimates of these ground-truth quantities.
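The measured irradiance at a single scene point is expressed as

$$I(\varphi_{pol}) = \frac{I_{max} + I_{min}}{2} + \frac{I_{max} - I_{min}}{2}\cos\big(2(\varphi_{pol} - \varphi)\big), \tag{1}$$

where φ_pol is the polarizer angle, φ is the phase angle, and I_max, I_min are the quantities shown in Fig. 2.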
Since a sinusoid has three unknowns, by sampling three different values of φ_pol it is possible to estimate φ, I_max, and I_min. From these measurements, as detailed below, separate mechanisms are used to obtain the azimuth and zenith angles.
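As an illustration of the three-sample fit, the following minimal sketch recovers the phase and the two intensity extrema in closed form. It assumes the standard polarizer sinusoid, I(φ_pol) = (I_max + I_min)/2 + (I_max − I_min)/2 · cos(2(φ_pol − φ)), and polarizer angles of 0, 45, and 90 degrees; the function names and the choice of angles are hypothetical and are not taken from the paper.

```python
import math


def fit_polarizer_sinusoid(i0, i45, i90):
    """Recover (phase, i_max, i_min) from intensities at 0, 45 and 90 degrees.

    Assumed model (not taken verbatim from the paper):
        I(a) = offset + amplitude * cos(2 * (a - phase))
    which expands to offset + a_cos * cos(2a) + a_sin * sin(2a).
    """
    offset = 0.5 * (i0 + i90)          # (I_max + I_min) / 2
    a_cos = 0.5 * (i0 - i90)           # amplitude * cos(2 * phase)
    a_sin = i45 - offset               # amplitude * sin(2 * phase)
    amplitude = math.hypot(a_cos, a_sin)
    # The phase is only defined modulo pi because of the factor of 2.
    phase = (0.5 * math.atan2(a_sin, a_cos)) % math.pi
    return phase, offset + amplitude, offset - amplitude


def simulate(polarizer_angle, phase, i_max, i_min):
    """Forward model used to generate synthetic measurements."""
    mean = 0.5 * (i_max + i_min)
    half_range = 0.5 * (i_max - i_min)
    return mean + half_range * math.cos(2.0 * (polarizer_angle - phase))


# Synthetic pixel: phase 0.7 rad, I_max = 1.0, I_min = 0.4.
samples = [simulate(math.radians(a), 0.7, 1.0, 0.4) for a in (0, 45, 90)]
phase, i_max, i_min = fit_polarizer_sinusoid(*samples)
```

With noisy data or more than three polarizer angles, a linear least-squares fit of the same three coefficients would be the natural extension of this sketch.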

Challenges with estimating azimuth angle
To obtain an estimate of the azimuthal angle, ϕ, early work in SfP has used a specular reflection model, as illustrated in Figure 1a [25,26,3]. The maximum value of reflected light will occur when the light that reflects is perpendicularly polarized (since the Fresnel reflection coefficient for perpendicularly polarized light is greater). Then, since the maximum value of the cosine occurs at the origin, the azimuth angle is estimated as ϕ = φ. Atkinson and Hancock later introduced a compelling technique to recover object shape from a diffuse reflection model, as illustrated in Figure 1b. For such a scenario, it was observed that the direction of light propagation is reversed: light is refracted from the surface to air [1]. Since the direction of propagation is flipped, the minimum irradiance is now of interest, resulting in a shift in the estimated azimuth angle of π/2 radians, i.e., ϕ = φ ± π/2.
Two key challenges occur with azimuthal estimation. First, since Equation 1 includes a factor of 2 within the cosine, two azimuth angles, shifted apart by π radians, cannot be distinguished in the polarized images. This first fundamental ambiguity is termed azimuthal ambiguity, and applies to all SfP techniques. Second, for a general surface, not known a priori to be diffuse or specular, it is ambiguous as to whether the estimated angle should be shifted by π/2 radians or not. This second ambiguity, due to the surface reflectance, is termed azimuthal model mismatch, and applies critically to our problem of mixed reflections.
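The two ambiguities can be made concrete by enumerating the azimuth candidates implied by a single measured phase. The sketch below is illustrative only; the function name and the string labels for the reflection model are hypothetical.

```python
import math


def azimuth_candidates(phase, model):
    """Return the two azimuth angles in [0, 2*pi) consistent with a phase.

    model == "specular": azimuth equals the phase.
    model == "diffuse":  azimuth is shifted by pi/2 relative to the phase.
    In both cases a second candidate pi radians away cannot be ruled out
    from the polarized images alone (azimuthal ambiguity).
    """
    if model not in ("specular", "diffuse"):
        raise ValueError("model must be 'specular' or 'diffuse'")
    shift = 0.0 if model == "specular" else math.pi / 2.0
    base = (phase + shift) % math.pi
    return base, base + math.pi


specular = azimuth_candidates(0.3, "specular")
diffuse = azimuth_candidates(0.3, "diffuse")
```

Picking the wrong model moves both candidates by π/2, which is the azimuthal model mismatch described above.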

Challenges with estimating zenith angle
Estimation of zenith angle relies on the degree of polarization, calculated as

$$\rho = \frac{I_{max} - I_{min}}{I_{max} + I_{min}}. \tag{2}$$

As in the case of azimuthal estimation, the type of reflection influences the reflection model. First, consider the specular model in Figure 1a. Substituting the Fresnel equations (see [27]) into Equation 2 allows the degree of polarization to be written as

$$\rho = \frac{2 \sin\theta \tan\theta \sqrt{n^2 - \sin^2\theta}}{n^2 - 2\sin^2\theta + \tan^2\theta}, \tag{3}$$

where n denotes the refractive index and θ is the zenith angle. If some knowledge of ρ and n is obtained, then it is possible to solve Equation 3 for an estimate of the zenith angle, θ. This method is well-suited for highly specular objects and has been successfully used to estimate shape of metallic surfaces [5].

For diffuse reflections, as illustrated in Figure 1b, the Fresnel equations are once again combined with the degree of polarization. However, this is performed for the model where light is transmitted from the surface to air, such that the relation is now expressed as

$$\rho = \frac{(n - 1/n)^2 \sin^2\theta}{2 + 2n^2 - (n + 1/n)^2 \sin^2\theta + 4\cos\theta\sqrt{n^2 - \sin^2\theta}}. \tag{4}$$

This work addresses two problems with zenith estimation. First, it is difficult to obtain an accurate estimate of refractive index at each pixel. If an improper refractive index is used, an error in zenith estimation occurs, termed in previous work as refractive distortion [2]. Second, it can be hard to know whether to use the model for a specular surface (Equation 3) or a diffuse surface (Equation 4). Only in ideal scenarios do surfaces conform to specular and diffuse models; real-world surfaces exhibit mixed reflections. This second source of error is referred to as zenith model mismatch in this paper.
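To illustrate how a zenith estimate follows from the degree of polarization, the sketch below evaluates the specular relation ρ = 2 sin θ tan θ √(n² − sin²θ) / (n² − 2 sin²θ + tan²θ) and the commonly used diffuse relation of Atkinson and Hancock [1], then inverts each by bisection. The function names are hypothetical and bisection is a swapped-in solver, not necessarily what the authors use. The specular relation is not monotonic (it peaks at the Brewster angle, arctan n), so this sketch returns only the solution below that angle.

```python
import math


def dop_specular(theta, n):
    """Specular degree of polarization for zenith theta and index n."""
    s, t = math.sin(theta), math.tan(theta)
    return 2.0 * s * t * math.sqrt(n * n - s * s) / (n * n - 2.0 * s * s + t * t)


def dop_diffuse(theta, n):
    """Diffuse degree of polarization (light transmitted from surface to air)."""
    s, c = math.sin(theta), math.cos(theta)
    numerator = (n - 1.0 / n) ** 2 * s * s
    denominator = (2.0 + 2.0 * n * n - (n + 1.0 / n) ** 2 * s * s
                   + 4.0 * c * math.sqrt(n * n - s * s))
    return numerator / denominator


def invert_increasing(func, target, lo, hi, iterations=60):
    """Bisection for func(x) == target, assuming func increases on [lo, hi]."""
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if func(mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def zenith_from_dop(rho, n, model):
    """Estimate the zenith angle (radians) from a degree of polarization."""
    if model == "specular":
        # Search only below the Brewster angle, where the relation increases.
        return invert_increasing(lambda th: dop_specular(th, n), rho, 0.0, math.atan(n))
    return invert_increasing(lambda th: dop_diffuse(th, n), rho, 0.0, math.pi / 2.0)


true_theta, true_n = 0.6, 1.5
theta_spec = zenith_from_dop(dop_specular(true_theta, true_n), true_n, "specular")
theta_diff = zenith_from_dop(dop_diffuse(true_theta, true_n), true_n, "diffuse")
# Refractive distortion: the same measurement inverted with a wrong index.
theta_wrong_n = zenith_from_dop(dop_diffuse(true_theta, true_n), 1.3, "diffuse")
```

Inverting a specular measurement with the diffuse relation (or the reverse) gives a different angle again, which is the zenith model mismatch discussed above.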

Solving Model Mismatch
To solve model mismatch error, consider the dichromatic reflection model, where the radiance from a single scene point is expressed as

$$I = I_d + I_s, \tag{5}$$

where I_d and I_s refer to the radiant intensity of diffusely and specularly reflected light. The prior work in SfP calculates ρ from I_max and I_min, the maximum and minimum intensities observed when rotating the polarizer. Following the work of Nayar et al. [16], it can be assumed that only the specular component causes appreciable variation, such that the measured degree of polarization for a mixed surface is expressed as

$$\rho = \frac{I^s_{max} - I^s_{min}}{I^s_{max} + I^s_{min} + I_d}, \tag{6}$$

where I^s_max and I^s_min denote maximum and minimum irradiance observed from specularly reflected light (see Figure 2). Under this simplification, taking I_s = I^s_max + I^s_min so that the denominator of Equation 6 equals I, it is possible to substitute Equation 3 into Equation 6 to express the diffuse intensity as

$$I_d = I\left(1 - \rho\,\frac{n^2 - 2\sin^2\theta + \tan^2\theta}{2\sin\theta\tan\theta\sqrt{n^2 - \sin^2\theta}}\right). \tag{7}$$
This equation contains two unknowns: the intensity of diffuse reflection, I_d, and the refractive index, n. Under a Lambertian approximation, the former quantity is constant across different viewpoints. Additionally, since the refractive index is a physical property of the material, it is also constant across different viewpoints. The proposed strategy is to estimate the quantities on the right-hand side of Equation 7 at different viewpoints, such that

$$I_d = I_i\left(1 - \rho_i\,\frac{n^2 - 2\sin^2\theta_i + \tan^2\theta_i}{2\sin\theta_i\tan\theta_i\sqrt{n^2 - \sin^2\theta_i}}\right) \tag{8}$$

for the i-th view of N total views. To recover I_d, a non-linear least squares problem can be solved of the form

$$\min_{I_d,\,n}\ \sum_{i=1}^{N}\left[I_d - I_i\left(1 - \rho_i\,\frac{n^2 - 2\sin^2\theta_i + \tan^2\theta_i}{2\sin\theta_i\tan\theta_i\sqrt{n^2 - \sin^2\theta_i}}\right)\right]^2. \tag{9}$$

Fig. 3. The proposed approach is able to separate reflectance for a variety of object textures (simulated example). Using the Mitsuba raytracer we render the Stanford bunny from three viewpoints, under three different material conditions (diffuse, glossy, and a spatially varying texture). The proposed technique is quantitatively compared with the previous work of [10]. By incorporating additional polarization information, we demonstrate a significant reduction in error.
In this paper a sequential quadratic program is used to perform the minimization. Please refer to the supplement for implementation details.
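The multi-view estimation can be illustrated for a single surface point with the toy sketch below. It is not the authors' implementation: an exhaustive grid search over the refractive index stands in for the sequential quadratic program, the per-view zenith angles are assumed to be known, and the synthetic measurements follow the relation ρ = ρ_spec · I_s / I implied by Equation 7. All function and variable names are hypothetical.

```python
import math


def dop_specular(theta, n):
    """Specular degree of polarization (the relation labeled Equation 3)."""
    s, t = math.sin(theta), math.tan(theta)
    return 2.0 * s * t * math.sqrt(n * n - s * s) / (n * n - 2.0 * s * s + t * t)


def diffuse_from_view(intensity, rho, theta, n):
    """Diffuse intensity implied by one view for a trial index n (Equation 7)."""
    return intensity * (1.0 - rho / dop_specular(theta, n))


def estimate_diffuse_and_index(views, n_grid):
    """Minimize the sum over views of (I_d - prediction_i(n))**2.

    views: list of (intensity, rho, theta) tuples, one per viewpoint.
    For a fixed n the optimal I_d is the mean of the per-view predictions,
    so only n has to be searched.
    """
    best = None
    for n in n_grid:
        predictions = [diffuse_from_view(i, r, th, n) for i, r, th in views]
        i_d = sum(predictions) / len(predictions)
        cost = sum((p - i_d) ** 2 for p in predictions)
        if best is None or cost < best[0]:
            best = (cost, i_d, n)
    return best[1], best[2]


# Synthetic point seen from three viewpoints: constant diffuse intensity and
# index, view-dependent zenith angle and specular intensity.
true_i_d, true_n = 0.4, 1.5
views = []
for theta, i_s in [(0.5, 0.3), (0.65, 0.5), (0.8, 0.2)]:
    total = true_i_d + i_s
    rho = dop_specular(theta, true_n) * i_s / total
    views.append((total, rho, theta))

n_grid = [1.2 + 0.01 * k for k in range(81)]   # candidate indices 1.2 .. 2.0
i_d_est, n_est = estimate_diffuse_and_index(views, n_grid)
```

Because the objective is quadratic in I_d for a fixed n, the search reduces to one dimension; a gradient-based solver, such as the sequential quadratic program used in the paper, would replace the grid in practice.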

Experimental Results
Reflection models for SfP are not geared to handle mixed reflections. Existing solutions use a sequential approach: first, a robust algorithm is used to separate reflection components, following which SfP is performed. We provide a comparison to this sequential approach, using the multiview, reflection separation technique of Nishino et al. [10] as our point of comparison.
Implementation details: All simulations were performed using the Mitsuba raytracer [28]. The raytracer has been modified to acquire depth information and includes a Matlab script to simulate polarization measurements. The object remains static throughout all experiments as viewpoint diversity was acquired by moving the camera. Physical experiments were performed with a Canon Rebel T3i camera with EF-S 18mm-55mm f/3.5-5.6 IS II SLR lens and a linear polarizer with quarter-wave plate, model Hoya CIR-PL. Three viewpoints were collected at 10 degree increments.
Diffuse-specular separation: As shown in Figure 3, the Stanford bunny is rendered with three different materials: clay, gloss, and a mixture of clay and gloss. Reflection separation is shown for the specular image component, for both the proposed technique and Nishino's method. Both techniques recover specular outliers, but the proposed technique recovers detail in regions that are of moderate specularity. As illustrated in the bar graph, the quantitative error of the proposed method is lower for all tested material configurations. Because the proposed technique relies on multiple viewpoints, classic multiview artifacts like occlusions or lack of texture can lead to registration issues. This explains why our proposed method performs worse on the mixed-material bunny than on the other materials (although the result is still an improvement over Nishino's method).
Surface normal recovery: Figure 4 uses a rendered sphere to show that the recovered surface normals are not accurate using naive shape from polarization (Figure 4b). Simple pre-processing with Nishino's method, as shown in Figure 4d, does not allow for robust surface normal recovery. The Nishino method, as a general method that does not account for polarization information, recovers a degree of polarization that does not conform with the physical scene. This leads to a poor estimate of zenith angle and, ultimately, surface normals. In comparison, the proposed technique, shown in Figure 4c, shows clear recovery of the surface normals, as well as the degree of polarization and zenith angle.
Refractive index recovery: A benefit of the proposed technique is the ability to simultaneously recover per-pixel refractive index. The rendered sphere in Figure 4 has a ground-truth refractive index of 1.5. The proposed technique estimates the refractive index as 1.49, for a mean absolute error of 0.01. Without applying the proposed correction for mixed reflections, the error in refractive index estimation is 0.07. Interestingly, pre-processing with the Nishino method leads to a much greater error in estimating refractive index (0.18).
Fig. 5. Validating the proposed technique with a physical experiment and comparisons to [10]. The uncorrected SfP result leaves something to be desired, as the normal map is noisy. The proposed correction algorithm reduces the MSE, while applying the technique from [10] increases the error.
Physical experiment: To validate our technique in the wild, a physical scene was set up in a similar fashion to the simulated examples. As illustrated in Figure 5, a camera and polarizing filter are placed 50 cm in front of a glossy sphere. Three viewpoints, at 10-degree increments, were captured. At each viewpoint, three polarized photos were captured, for a total of nine photographs. The uncorrected surface normals, obtained from naive SfP, are poor. In particular, note the reflections of the ceiling lights in the normal map. The proposed technique mitigates this issue, and reduces the MSE from 0.06 to 0.03. The Nishino method, while it does mitigate the dramatic specular reflections from ceiling lights, results in a greater MSE. This is likely due to the significant holes in the normal map.

Discussion
In summary, we have proposed a new technique to separate reflection components from a scene using a combination of passive polarization and viewpoint. To our knowledge, this is the first paper to do so. While there are alternate ways to address the reflection separation problem (for instance, through the use of color channels), we believe that the proposed technique is a complementary approach that can be combined with previous methods.
Benefits: The proposed technique may find direct application in improving the quality of SfP and related algorithms. Prior art in SfP has not analyzed in depth the impact of mixed reflections. This paper has shown that it may not be sufficient to sequentially apply an existing algorithm to first separate reflection components. Rather, the physics of polarization that are used to obtain shape can also be used to separate reflection components. With the increased interest in multiview methods (e.g. KinectFusion [29]), it seems logical to consider the inclusion of the proposed technique within such frameworks. In addition, the proposed technique has shown recovery of refractive index, which is a challenging problem in computational imaging often addressed with calibrated optical setups [30]. The ability to estimate refractive index is shown to greatly improve the accuracy of SfP, but may also find use in other applications like object detection.
Limitations: We follow previous work in using the unpolarized world assumption: the light incident on an object is initially unpolarized. In scenes with significant specular reflections, like a house of mirrors, the unpolarized world assumption is violated. It should be noted that prior art has empirically observed the validity of the unpolarized world assumption in realistic scene conditions [2]. Although our paper also acquires refractive index, for the sole purpose of reflection separation other strategies that use fewer images (e.g. Tan and Ikeuchi [4]) may be preferable. Specifically, the proposed technique uses three polarized images at a minimum of two viewpoints, so at least 6 images are required. However, the intended application of this technique is SfP, where multiple images are expected to be captured and where it is desirable to estimate the refractive index.
Open challenges: While the proposed technique forges a strong link between shape, passive polarization, and reflectance, several open topics remain. For example, would the technique improve if more viewpoints were captured; would circular polarization allow for more information to be gleaned; and could this method be combined with other frameworks (for example, color, viewpoint, and polarization)? In conclusion, we hope that this paper may improve the practicality of SfP, allowing surface normals to be estimated on surfaces with mixed reflective properties.