Detection of Polarization due to Cloud Bands in the Nearby Luhman 16 Brown Dwarf Binary

Brown dwarfs exhibit patchy or spatially varying banded cloud structures that are inferred through photometric and spectroscopic variability modeling techniques. However, these methods are insensitive to rotationally invariant structures, such as the bands seen in Jupiter. Here, we present H-band Very Large Telescope/NaCo linear polarization measurements of the nearby Luhman 16 L/T transition binary, which suggest that Luhman 16A exhibits constant longitudinal cloud bands. The instrument was operated in pupil tracking mode, allowing us to unambiguously distinguish between a small astrophysical polarization and the ∼2% instrumental linear polarization. We measure the degree and angle of linear polarization of Luhman 16A and B to be pA = 0.031% ± 0.004% and ψA = −32° ± 4°, and pB = 0.010% ± 0.004% and , respectively. Using known physical parameters of the system, we demonstrate that an oblate homogeneous atmosphere cannot account for the polarization measured in Luhman 16A, but could be responsible for that of the B component. Through a nonexhaustive search of banded cloud morphologies, we demonstrate a two-banded scenario that can achieve a degree of linear polarization of p = 0.03% and conclude that the measured polarization of the A component must be predominantly due to cloud banding. For Luhman 16B, either oblateness or cloud banding could be the dominant source of the measured polarization. The misaligned polarization angles of the two binary components tentatively suggest spin–orbit misalignment. These measurements provide new evidence for the prevalence of cloud banding in brown dwarfs while at the same time demonstrating a new method—complementary to photometric and spectroscopic variability methods—for characterizing the cloud morphologies of substellar objects without signs of variability.


Introduction
Brown dwarfs occupy a unique parameter space, with effective temperatures (T eff ), masses, and radii in between those of giant exoplanets and stars. After their initial formation, they radiatively cool over time, moving from late-M through L, T, then Y spectral types, experiencing both chemical and atmospheric evolution. Brown dwarfs at the L/T spectral-type transition are believed to undergo an evolution from extremely dusty/cloudy atmospheres, where the clouds are mostly made of corundum, iron, and silicates, to nearly clear atmospheres that eventually begin to form clouds from other families of condensates such as Cr, MnS, Na 2 S, ZnS, and KCl (Burgasser et al. 2002;Marley et al. 2010;Morley et al. 2012). This theory is bolstered by photometric and spectroscopic variability studies that have revealed increased variability across the transition (e.g., Radigan et al. 2012;Crossfield et al. 2014;Biller 2017;Artigau 2018), suggestive of patchy clouds (Karalidi et al. 2016) or longitudinally varying cloud bands (Apai et al. 2017). Understanding cloud morphology in brown dwarfs is important as clouds affect their disk-integrated spectra and colors, and directly relate to the radiative, advective, and chemical processes taking place within their atmospheres (Showman & Kaspi 2013). Studies of brown dwarf clouds can also serve as probes of cloud formation and transport on directly imaged gas giants, which can have similar effective temperatures and surface gravities (e.g., Bowler 2016).
Polarimetry is a useful tool for studying clouds and hazes in brown dwarfs and is highly complementary to photometry and spectroscopy. As the emitted light of a brown dwarf is scattered by clouds and hazes in its atmosphere, it can locally acquire a preferred linear polarization as it gets redirected toward the observer (Sengupta & Krishan 2001;Sengupta & Marley 2009. This preferred polarization will cancel itself out in an unresolved measurement (which is always the case for brown dwarfs) and result in a net zero polarization unless there is some type of asymmetry in the atmosphere, such as rotationally induced oblateness, or patchy/banded clouds (de Kok et al. 2011). In addition to its oblateness, the measured degree of linear polarization for a given brown dwarf depends on the cloud particles in two ways: their scattering properties (determined by their size, shape, and composition) and their spatial distribution (Stolker et al. 2017). In this respect, polarization can act as a very effective diagnostic of cloud properties in brown dwarfs. These same effects are also present for giant extrasolar planets, where lower surface gravities can result in higher levels of oblateness for the same rotational periods (Marley & Sengupta 2011).
Polarization has been measured in over two dozen brown dwarfs (Ménard et al. 2002;Zapatero Osorio et al. 2005Goldman et al. 2009;Tata et al. 2009;Miles-Páez et al. 2013). These measurements have revealed linear degrees of polarization between ∼0.1% and 2.5%, spanning R to J bands. While many of these detections could possibly be explained by oblateness (e.g., Sengupta & Marley 2010), polarimetric monitoring of a handful of sources has revealed both shortterm and long-term variability, suggesting that the time-varying morphology of the clouds also plays a significant role (Miles-Páez et al. 2015, 2017. Although in some circumstances polarization can be attributed to the presence of a circumbrown dwarf disk (Hashimoto et al. 2009;Zapatero Osorio et al. 2011;Miles-Páez et al. 2013;Ginski et al. 2018), for the vast majority of polarimetric measurements, the true origin of the net polarization remains unknown, due in part to the degeneracies in the atmospheric model parameters and the limited amount of physical characterization available from other measurements.
Here, we present near-infrared (NIR) linear polarimetric observations of the Luhman16 brown dwarf binary system, obtained with the NaCo imager (Lenzen et al. 2003;Rousset et al. 2003) at the ESO Very Large Telescope (VLT). We measure a linear degree of polarization of p A =0.031%±0.004% and p B =0.010%±0.04% in Luhman16A and B, respectively, corresponding to detection significances of 8σ and 2.5σ. The existing extensive characterization of the system allows us to explore possible sources of polarization in greater detail than for all previous polarimetric measurements of brown dwarfs to date.

Luhman 16
The closest brown dwarf system to Earth is Luhman16 (WISE J104915.57−531906.1AB; Luhman 2013), a brown dwarf binary at a distance of ∼1.99pc (Lazorenko & Sahlmann 2018). The binary is of particular interest because the two components span the L/T transition, with spectral types of L7.5±1 and T0.5±1 for components A and B, respectively (Burgasser et al. 2013;Kniazev et al. 2013). The system's relative brightness compared to more distant brown dwarfs has resulted in many detailed studies that have been able to constrain the mass, rotation period, and inclination of both components ( Table 1). Previous linear polarization measurements have put an upper limit of 0.07% in the I band on the unresolved binary (Kniazev et al. 2013).
Numerous photometric and spectroscopic variability studies have revealed that both components are variable, with variability amplitudes that change significantly from epoch to epoch (Biller et al. 2013;Gillon et al. 2013;Burgasser et al. 2014;Buenzli et al. 2015aBuenzli et al. , 2015bKaralidi et al. 2016). In all cases, Luhman16B is found to be the more variable of the two components, with peak to peak amplitudes up to 11% (I + z band; Gillon et al. 2013) and Luhman16A having a maximum measured variability of only ∼4.5% (between 0.8 and 1.15 μm; Buenzli et al. 2015a). Further, high-resolution spectroscopic monitoring of Luhman16B has produced the first two-dimensional Doppler-imaging cloud map of a brown dwarf, revealing patchy variations in the cloud cover .
In general, the variability of both components of Luh-man16 has been attributed to patchy clouds, but recently longitudinally varying cloud bands with planetary-scale brightness variations have been used to explain the photometric variability of three other L/T transition objects, suggesting that this phenomenon may provide a more complete explanation (Apai et al. 2017). In the Luhman 16 system, preliminary evidence already hints at the possibility of cloud bands. Both the near-exact repetition of a light curve feature seen in two Hubble Space Telescope (HST) data sets separated by over a year (Karalidi et al. 2016), as well as the change in state of Luhman 16A from low variability to high variability (Buenzli et al. 2015a) could both be explained with variable cloud bands with slightly different periods, beating over time.

Observations and Data Reduction
Observations of Luhman16 were obtained with VLT/NaCo in visitor mode, starting at 2018 April 11 23:22 UT and lasting until 2018 April 12 06:26 UT, covering a total of 7 hr and 4 minutes ( Table 2). The observations span ∼1.4 or ∼0.9 rotation periods for Luhman 16A, assuming a rotation period of 5 or 8 hr, respectively, and ∼1.4 rotations for Luhman 16B. The 4.5-5.5 4.87±0.01 Buenzli et al. (2015a), Gillon et al. (2013) or 8 or 5.05±0. 10 Mancini et al. (2015), Burgasser et al. (2014) 56±5 (for 5 hr period) 26±8 Karalidi et al. (2016), Crossfield (2014) 18±8 (for 8 hr period) Karalidi et al. (2016) Distance (pc) 1.9937±0.0003 Lazorenko & Sahlmann (2018) instrument was operated in NaCo's dual-channel polarimetry mode, using the SL27 camera, which provides a pixel scale of 27 mas/pixel and a 27×27″ field of view. At the time of the observations, NaCo was mounted on UT1 on the Nasmyth A platform. NaCo's polarimetry mode includes a rotatable halfwave plate (HWP), a focal plane mask, and a Wollaston prism that splits the incident beam along the detector y-axis into ordinary (I P ) and extraordinary (I ⊥ ) beams with a splitting angle that shifts the beams on the detector by the on-sky equivalent of 3 5. We have labeled the two beams "ordinary" and "extraordinary" to be consistent with previous NaCo documentation; however, the modulation of the HWP means each beam effectively acts in both capacities. To prevent beam overlap, the focal plane mask blocks strips of 27×3 5, alternating between blocked regions and transmitted regions with a width of 3 5 along the detector y-axis.
Observations were obtained in four-image groups with the HWP cycling between position angles of 0°, 45°, 67°.5, and 135°. Note that this was unintentionally different from the standard 0°, 45°, 22°. 5, and 67°. 5, and the result of a user error. This sequence of HWP angles is not recommended. At each HWP position, we obtained one image with an exposure time of 3 s (DIT=3 s) and 20 coadds (NDIT=20) in NaCo's cube mode. After each HWP cycle, the telescope was dithered along the detector's x-direction by ±5″, such that the observations were taken in an ABAB dither pattern.
The NAOS adaptive-optics (AO) system was operated using the K-band dichroic, which transmits 1.9-2.5 μm light to the wavefront sensor with 90% efficiency, while sending 0.45-1.8 μm light to the CONICA imaging system, also with 90% efficiency.
To minimize time-varying instrumental polarization effects, the instrument was operated in pupil tracking mode in a "crossed configuration." In this configuration, the entrance fold mirror of NaCo is oriented such that any instrumental polarization it introduces will be of opposite sign to that of M3, thus minimizing the cumulative instrumental polarization reaching the HWP (e.g., Witzel et al. 2011;de Juan Ovelar 2013). The combined use of the pupil tracking mode with NaCo's polarimetry mode was first employed by de Juan Ovelar (2013), who attempted to measure the polarization of the directly imaged planets of HR 8799. In NaCo's pupil tracking mode, the entire instrument rotates as the telescope altitude changes, such that the orientation of M3 relative to the instrument remains constant throughout the observations. Thus, under this configuration, the instrumental polarization is both stable and minimized. Over the course of our observations, the A and B components rotated with parallactic angle relative to their center of light due to the pupil tracking configuration. Our one-minute exposures and four-exposure modulation cycles are significantly faster than the parallactic rotation. Figure 1 provides a summary of the parallactic angle, seeing, and air mass as a function of time. The seeing across the observations ranged from 0 24 to 1 09. During the observations, the AO loops opened several times, causing two minor interruptions (i.e., less than 10 minutes) and one ∼45 minute interruption. After just over seven hours of observing, a fatal mechanical malfunction in the HWP unit forced us to stop  Center left, center right, and right-the flux ratio of A to B as a function of y-detector position, air mass, and seeing. Although slight correlations are measured between the flux ratio and these three parameters (Section 2.2), it is difficult to explain the variations in the ratio seen in Figure 3 as being significantly influenced by the y-detector position, air mass, or seeing.
observing. We obtained a total of 276 individual exposures, amounting to a total integration time of 4 hr and 36 minutes.
Observations of the evening twilight sky were obtained on 2018 April 11 UT, with the telescope pointed with an altitude of 90°and an azimuth angle of 45°east of north. Although predicting the exact degree of linear polarization is difficult, the angle of linear polarization is expected to be oriented 90°away from the vector connecting the telescope pointing location and the Sun (e.g., Harrington et al. 2011;deBoer et al. 2014). We obtained five sets of four HWP positions. In order to mimic the Luhman16 observing configuration, the K-band dichroic was inserted into the beam and the instrument was oriented in the same "crossed configuration" as above (verified by checking that the ESO ADA PUPILPOS header keyword was equal to the expected position of 90°for the crossed configuration). See Table 2 for a summary of the observations.
Due to the failure of the HWP rotation mechanism on 2018 April 11, observations of a polarized and unpolarized standard were not possible until after an intervention to repair the mechanism, which occurred in late 2018 May. The failure occurred because an axle that drives the rotation of the HWP snapped and part of the mechanism had to be replaced. The replacement process lost the known calibration between the motor encoder position and angle of the HWP. As a result, the observations after the intervention display a systematic offset in the measured polarization angle from the Luhman16 data. This offset was fit for as part of our data analysis (see Section 3).
After the repair, observations of the polarized standard Elias2-25 (p=4.13%±0.02%, ψ PA =24°±1°in the H band; Whittet et al. 1992) and the unpolarized standard HD162973 (p=0.09%±0.055%, ψ PA =104°±17°in the B-band; Mathewson & Ford 1970) were obtained in queue mode, with each target observed both before and after meridian passage for one HWP cycle in each of the same two dither positions obtained for Luhman16. For both of these targets, one HWP cycle consisted of HWP positions of 0°, 45°, 67°.5, 135°, and 22°. 5. The observations were obtained in exactly the same observing mode as Luhman16: pupil tracking, H-band filter, field mask inserted, and using the K-dichroic. See Table 2 for a summary of the observations.

Luhman 16 Data Reduction
A master dark and a master flat field (using lamp flats) were created using the NaCo_img_dark and NaCo_img_lampflat recipes, respectively, from the ESO Gasgano pipeline with the default calibration files provided by ESO. Each individual exposure was then dark subtracted and divided by the master flat field. Although the data was taken in NaCo's cube mode, we opted to carry out our analysis on the "single" frame images (i.e., the average of all the NDITS in each image). Background subtraction was carried out by subtracting from each frame an image at the opposite dither position but with the same HWP angle, from either the following (for the "A" dither position) or the preceding (for the "B" dither position) HWP cycle.
In each exposure, both Luhman16A and B components were detected with a signal-to-noise ratio (S/N) between ∼1000 and ∼1500, in both the ordinary and extraordinary beams ( Figure 2). The two objects were clearly resolved in all images. However, the two point spread functions (PSFs) can be seen to overlap with each other slightly.
For each image, the photutils (Bradley et al. 2017) Python package was used to measure the location of each of the two binary components in both the ordinary and extraordinary beams and extract their photometry. Each of the four sources was first found using the DAOStarFinder routine and then photometry was extracted using a circular aperture. Several different radii for the aperture were tested (6, 10, 12, 13, 16, and 17 pixels), and ultimately, we found a radius of 17 pixels (0 46) to provide a balance between the formal S/N measured by photutils and the true scatter of the data points. This radius was the largest possible radius without the circular apertures from A and B overlapping. Uncertainties on each photometric measurement were provided by photutils, assuming a read noise of 4.2 ADU and a gain of 11 electrons/ADU.

Total Intensity Data Reduction
To obtain the total intensity for each component at each time step, the aperture sums for the ordinary and extraordinary beams were added together (see Table A1). The absolute variability of the individual components appears to be highly correlated and is likely due to changing atmospheric conditions and imperfect AO correction. Around the 4 hr mark, the measured flux dropped by 30%-40%, due in part to poor AO correction and an enlarged PSF. The apertures sizes could not be increased further to compensate because of the separation between the two components and the potential for overlap.
It was not possible to derive accurate absolute photometry due to changing atmospheric properties and adaptive-optics systematics. The flux ratio between the two components displays quasi-periodic variability on timescales of less than 2 hr, with the values of the peaks, troughs, and the peak-tovalley distance changing over the length of the observations (Figure 3). In order to explore the fidelity of this signal, we measured the Spearman-r and Kendall-Tau correlation coefficients (and the corresponding null hypothesis p-values) of the flux ratio against the detector x and y positions, the FWHM of each component, air mass, and seeing ( Table 3). We measure no significant correlations, but some low-level correlations with y-position, air mass, and seeing. Although these small correlations may be real, when the flux ratio is plotted against these quantities, it is clear that the main quasi-periodic photometric signal cannot be explained by these correlations (see Figure 1).   (Lomb 1976;Scargle 1982) of the flux ratio data, excluding the measurements between 4 and 4.85 hr where the AO correction was poor. A strong peak can be seen at 1.64 hr, roughly a factor of 3 less than the previously measured 4.87±0.01hr rotation period of Luhman 16B (Gillon et al. 2013). In Figure 3, we also explore whether our selection of aperture size affected the peak in the periodogram. For all apertures sizes, we find the same significant peak at 1.64 hr.
Using the relative positions of each source measured by the DAOStarFinder, we measured a separation of 934±2 mas (34.52 ± 0.07 pixels) by averaging over all measurements in our observing sequence, including both dither positions and both the ordinary and extraordinary beams. The relative positions in pixels were converted to an on-sky separation using a pixel scale for the S27 camera of 27.053±0.019 mas/ pixel. 13 The quoted errors represent the standard deviation of the measurements added in quadrature to the error due to the plate scale uncertainty. For each image, we also measured a relative angle between the two sources in detector coordinates, which was then converted to an angle on sky by adding the parallactic angle from the header for each observation (taken as the average of the ESO TEL PARANG START and ESO TEL PARANG END header keywords). By averaging all measurements, we obtained a relative position angle between A and B of 147°.0±0°. 1, measured east of north. Neither north angle correction nor distortion correction has been applied.

Polarimetry Data Reduction
In order to extract the measured polarization from the photometric measurements of each component, we built a Mueller matrix model of the system to describe how on-sky polarization relates to the intensities measured on the detector. The on-sky Stokes vector, , , sky sky sky sky sky , is related to the measured Stokes vector at the detector, S det , through a system Mueller matrix, M sys :  Goldstein (2003). We follow the IAU definitions of Q, U, and V for our on-sky frame of reference. We define the Stokes vector seen by the waveplate, S NaCo , as

HWP HWP NaCo
Most detectors, including NaCo's, are only sensitive to the Stokes I term. By evaluating the Mueller matrices for the Wollaston prism and the HWP, it can be shown that º q Q I NaCo NaCo NaCo can be measured from the Stokes I det values of the ordinary and extraordinary beams by taking the normalized difference of the two beams (i.e., a "single difference"): where the angle in parentheses refers to the angle of the HWP. However, noncommon path errors and different detector systematics between the ordinary and extraordinary beams can introduce an error/bias term, ò, rewriting Equation (  Quasi-periodic variation can be seen throughout the sequence. The noisier region near the 4 hr mark is associated with poor adaptive-optics correction. Bottom: a Lomb-Scargle periodogram of the relative photometry. The data obtained between 4 and 4.75 hr after the start of the sequence have been excluded. A prominent peak can be seen at 1.64 hr. enabling the measurement of an unbiased q NaCo via a "double difference": The error term ò can be calculated as In general, the bias term ò represents the noncommon path errors between the ordinary and extraordinary beams that do not depend on the incident polarization state (and hence the HWP position). This error changes with time due to timevarying instrument and detector drifts, motivating the standard θ HWP =[0°, 45°22°.5, 67°.5] HWP sequence, where ò is calculated for every pair of q and u measurements. In our case, for the twilight and Luhman16 data, without measurements with θ HWP =22.5, each u NaCo measurement (obtained with θ HWP =67.5) is corrected for the ò term using the bias measured from the preceding measurements with θ HWP =0°a nd 45°(Equation (9); i.e., not standard double-differencing). The time-varying nature of this systematic means that this nonstandard correction is not perfect and is expected to result in additional systematic noise in u NaCo relative to q NaCo . We attempt to compensate for this extra added noise in our modeling (the σ SD term in Section 3).
A 90°degeneracy in HWP angles means that the measurements at 45°and 135°should be nearly identical, modulo changes in ò over time. Before calculating q NaCo or the ò term for a given HWP cycle, we average together the two measurements at 45°and 135°to increase the S/N.
For the Luhman16A and B data, using the photometry measured for the o and e beams, we calculated q NaCo and u NaCo for each HWP cycle from Equations (8) and (10) (hereafter q A and u A for Luhman 16A and q B and u B for Luhman 16B). A visual inspection of the resultant values revealed a systematic offset in q NaCo and u NaCo between the two dither positions in both components, possibly suggesting a spatial dependence in the instrumental polarization along the x spatial direction (on the order of ∼0.02%). To compensate for this spatial dependence, we average the q NaCo and u NaCo values for each AB dither pair. Figure 4 (and Figures A1 and A2) shows the q NaCo and u NaCo for Luhman16A and B after averaging the two dither positions (also see Table A2). Our correlation analysis (Section 2.4.2) suggests that after this averaging, there is no longer any significant spatial dependence in the signal. Errors on q NaCo and u NaCo were propagated from the original photometric errors through Equations (6)-(11).
A detection of polarization in Luhman16A can be seen as a near-sinusoid in the q NaCo frame. In the instrument frame, the instrumental polarization holds a constant value over time (because of the cross-configuration), and any astrophysical polarization modulates between q NaCo and u NaCo as the sky (and therefore the angle of polarization) rotates relative to the instrument according to the parallactic angle. For Luhman16A, this can most easily be seen by comparing the first half of the data to the second half. Although it cannot as easily be identified by immediate inspection, polarization is also detected for Luhman16B, but at a lower significance. In Figure A3, we show the same data but combined to calculate the linear degree of polarization ( = + p q u NaCo NaCo 2 NaCo 2 ) and the angle of linear polarization ( ( ) q = u q 0.5 arctan 2 NaCo NaCo NaCo ) in the instrument frame as a function of parallactic angle. Before calculating p NaCo and θ NaCo , we subtracted the mean q NaCo and u NaCo values from each measurement. This step acts as a preliminary instrumental polarization subtraction; because any real signal presents itself as a modulation on top of the instrumental polarization and our observations are relatively well centered on meridian passage, there should be nearly equal signal above and below the instrumental polarization values of both q NaCo and u NaCo . The p NaCo measurements suffer from the squaring of the noise of q NaCo and u NaCo , and can be difficult to interpret given the S/N of each measurement. However, for Luhman16A, the angle of linear polarization (which does not suffer from the same increase in noise) shows a clear linear trend with the parallactic angle, indicating the presence of an astrophysical signal that is rotating with the sky relative to the fixed instrument frame. The same signal is not seen for Luhman 16B. Translating the instrument-frame measurements into calibrated sky-frame polarization measurements (i.e., S sky from Equation (1)) requires further instrument modeling, as described below (Section 3).

Real Detection or Residual Instrumental Polarization?
The signal presented in Figures 4 and 15 for Luhman 16A represents a significant jump in accuracy compared to previous polarimetry measurements obtained with NaCo, and it is therefore prudent to question whether this signal could be due to other systematic effects. Here we consider previous efforts to model the instrumental polarization of NaCo to put our data set in context, and we also discuss potential systematic effects that could affect the interpretation of our data.

Previous Work
Two papers have previously developed detailed Mueller matrix models for NaCo's Wollaston/HWP polarimetry mode, both using a field-stabilized observing mode: Witzel et al. The Mueller matrices of M3 and NAOS were constructed as linear diattenuating retarders, where the linear diattenuation and retardation were calculated using assumed material properties for aluminum (for M3) and Silflex (for two NAOS 45°fold mirrors), and the known angles of incidence upon the different mirrors in the system. This system model was then compared to measurements of four polarized sources in the IRS 16 cluster of the Galactic center. In order to match the observations, the authors tweak the imaginary index of refraction for aluminum and the retardance of the Silflex while simultaneously fitting for linear degrees and angles of polarization for the IRS 16 sources until a minimum χ 2 value is obtained. They quote a final error on their polarization estimates of the IRS 16 sources to be 0.8%. Unfortunately, the details of the fitting procedure and the derivation of this final error are not included in the paper and thus make the error difficult to compare to our data. They also compare their tweaked model to observations of three unpolarized standards (obtained under a different instrument configuration, with the HWP out of beam) and one polarized standard, and find that their model is accurate to ∼0.3%±0.2%, where the uncertainty represents the median deviation of their residuals. deBoer et al. (2014) carry out a strategy of using twilight polarization measurements, and different telescope and instrument orientations in order to constrain the Mueller matrices of M3 and NaCo in the H band. By rotating the telescope to different azimuthal positions and NaCo to different derotator angles, they take advantage of the known twilight sky polarization angle to partially recover the Mueller matrices of both M3 and NaCo. Using this strategy, to fully describe the Mueller matrices of both components assumptions on their analytical forms must be made (e.g., Fresnel reflections with imposed material properties). Nonetheless, the authors were able to obtain errors on the instrumental polarization of 0.4%. Although this method was able to provide a good first step at developing an accurate instrument model, the authors note several discrepancies in their fitting that suggest their model is not fully self-consistent and suggest future avenues for a more The degrees of polarization of A and B are related to the peak-to-trough distance, and the angle of linear polarization is related to the parallactic angle at which the peaks and troughs occur. The displayed error bars include both the original read-noise and photometric errors (black), as well as the extra systematic errors (red; see Methods) included in our MCMC fitting (Section 3.2). Increased scatter and larger systematic error bars are seen for the u measurements, due to the HWP angles that were inadvertently used in the observation set. Overlaid on top of the data are accepted MCMC fits to the instrumental polarization (orange; IP Q and IP U ), as well as the best-fit models for A and B (blue). We have also included the results of our simple six-parameter fit (see Section 3) as a purple line.
in-depth characterization. In particular, their estimate of the instrumental polarization appears to be systematically different from that of de Juan Ovelar (2013), obtained with observations of an unpolarized standard, and may therefore be called into question.
The apparent signal in Figure 4 is of order ∼0.03%, an order of magnitude less than the model accuracy presented in Witzel et al. (2011) anddeBoer et al. (2014). In comparison to Witzel et al. (2011), the Luhman 16 data set presented herein is of significantly higher quality than their IRS 16 data (with individual error bars of 0.3% in both Q and U) and their unpolarized standard star data (with a median deviation relative to the instrumental polarization model of 0.2%); our formal photometric errors on each Q and U measurement are on the order of 0.005% and 0.008%, respectively, nearly two orders of magnitude better than the IRS 16 data. Further, the standard deviation of our Luhman 16A/B measurements are 0.03%/ 0.02% and 0.03%/0.03% (even in the presence of a potential astrophysical signal) for Q and U, respectively, demonstrating the increased accuracy and stability of our data set relative to 0.2% measured by Witzel et al. (2011). Given the preliminary nature of the deBoer et al. (2014) system model and their final error bars on the instrumental polarization, we also consider our data to also be of superior quality given our error bars and the scatter of our data points relative to their 0.4% errors. Overall, the stability of our data set is well below the errors quoted in both these previous works and emphasizes the superior quality of our data set.

Instrumental and Data Extraction Systematics
While our data set may have the statistical power to detect a signal of ∼0.03%, before considering the signal real, instrumental systematics must be ruled out as a cause of the observed change in polarization. Because the detection of a signal in our data relies on a time-variable signal, here we focus on potential systematics that change on the same timescale as the signal: on the order of one or more hours. Instrumental or data extraction biases occurring on faster timescales are fit for in our analysis as an additional noise source (see Section 3), and the best-fit values for this noise are visualized in Figure 4 as the red error bars superimposed upon the photometric error bars in black. These "fast" varying biases do not affect what appears to be the signal in Luhman 16A. Such fast-varying biases could include, for example, changing atmospheric conditions (i.e., seeing) between two subsequent frames, which has the potential to affect double-difference measurements, among other effects.
One major strength of this data set is the nearly equalmagnitude binary nature of our targets. This allows us to rule out many potential sources of bias, because they would affect the measured polarization of both sources in the same way. For example, this includes slowly varying misalignments in the optical train, second-order flexure effects on the telescope and/ or instrument, and possible polarization induced by thin layers of clouds. In particular, this also includes changes in the systematic alignment between M3 and NaCo's rotation ring.
Spatially varying instrumental polarization is another potential effect to be explored. In Section 2.3, we noticed a spatially dependent instrumental polarization offset between the two dither positions that we corrected for by averaging together the q and u values from back-to-back dithers. We searched for residual spatial dependence after this ditheraveraging by comparing q measurements against the measured x and y detector positions of each source (using the x and y positions in the ordinary beam) using the Spearman-r and Kendall-Tau correlation coefficients. Because of the ditheraveraging, we calculated the correlation coefficients using the mean x position for each dither pair. The results are summarized in Table 3. The table also includes p-values for the null hypothesis that each coefficient (or greater) would be obtained with a random data set. However, the individual q data sets for each source only contain 35 data points, and so these values may not be fully reliable. Here we limit ourselves to searching for correlations only in q because that is where the more significant (potential) signal is seen.
The correlation coefficients immediately suggest that there is a significant spatial correlation for the q measurements of Luhman 16A. However, because we were operating in pupil tracking and the two objects were rotating about their common center, the source's spatial location is correlated with the parallactic angle-as is the expected signal-and so it is hard to draw any conclusions from this correlation. On the other hand, there does not appear to be a significant correlation between the q measurements of Luhman 16B and its detector position. If there were a significant correlation, it would be difficult to distinguish between an astrophysical signal and a spatially dependent instrumental polarization, but the lack of correlation strongly suggests that there is no significant residual spatially dependent instrumental polarization after our dither-averaging. As a secondary check for the lack of spatial dependence in B's q values, we randomly shuffled the q measurements 1000 times and measured the Spearman-r and Kendall-Tau values for each shuffle. The correlation values for B presented in Table 3 fall within 1σ of the mean of the sample for both Spearman-r and Kendall-Tau metrics for both x and y detector positions, confirming the lack of significant spatial dependence in B and implying that the signal seen for A is also not due to any spatially dependent instrumental polarization. We also searched for correlations as a function of distance and angle from the detector center, as well as distance and angle from the mean position of the binary. In all cases, we measured a significant spatial correlation for the Luhman 16A, but not for B, again suggesting that there is no remaining spatial dependence in the instrumental polarization. Finally, we revisit the spatial dependence in Section 3.2, where we demonstrate that our best-fit models are sufficient to explain the observed spatial correlations.
In our data extraction procedure, the only tunable parameter is the aperture size used to extract the flux of each star. Figure 8 displays the q and u measurements for A and B, extracted from a range of aperture sizes. Although the individual data points appear to vary slightly for different aperture sizes, the largescale trends appear to be consistent for all of the aperture sizes explored. This also suggests that the perceived trends in q cannot be attributed to a varying PSF overlap between the two stars, as the amount of intensity that leaks into one object's aperture from the other object changes with aperture size, although the signal appears relatively constant.
Finally, we searched for correlations between our measurements and the FWHM of each source. We first fit each PSF in each image to a 2D Moffat profile and then extracted the FWHM of the Moffat profile. We then took the average FWHM across eight frames (the number of frames needed to get one set of dither-average q and u measurements). Table 3 displays the Spearman-r and Kendall-Tau correlation coefficients for our q measurements against the FWHM in the ordinary beam. Here, we have compared the FWHM of A against the q measurements of A, and similarly for B. No significant correlations are found.
We conclude that the long-term variations seen in our data cannot be attributed to changing instrument systematics, observing conditions, or our own data extraction. Further, when compared to previous efforts to characterize the NaCo system, our data set represents a new standard in terms of depth and stability. Without any evidence to the contrary, hereafter we consider the the long-term trends seen in the data to be astrophysical in nature.

Twilight Data Reduction
Flat-fielding and dark correction for the twilight data were carried out as described for Luhman16. For each twilight sky observation, the intensity in the ordinary and extraordinary beam was measured in the same region of the detector as the Luhman16 observations by summing the counts in a rectangular aperture. , where the extra 45°was added to compensate for the telescope azimuth position of 45°east of north. The measured angle of linear polarization can be seen in Figure 5, along with the expected position angle for a range of accepted system models from the Markov Chain Monte Carlo (MCMC) fitting (Section 3.2). Errors on each angle measurement were estimated by propagating the standard deviation in each rectangular aperture through Equations (6)-(11) using standard error propagation techniques.

Elias 2-25 and HD 162973 Data Reduction
Flat-fielding, dark correction, and background subtraction were carried out on the Elias2-25 (polarized standard) and HD162973 (unpolarized standard) data as described for Luhman16. Photometry was extracted from all observations of Elias2-25 and HD162973 using a similar method as for Luhman16 (Section 2.2). Measurements of q NaCo and u NaCo were then calculated using a similar method to Luhman16, except that both q NaCo and u NaCo were both calculated using double differencing because observations with the HWP at 22°. 5 were obtained for these two targets. As with Luhman16, the two dither positions were averaged. For HD162973, we average together the measurements from the two different dates, as we expect the instrumental polarization to strongly dominate over any potential stellar polarization. The q NaCo and u NaCo measurements for Elias2-25 and HD162973 can be seen in Figures 6 and 7, respectively. Errors on q NaCo and u NaCo were propagated from the original photometric errors through Equations (6)-(11).

Analysis
The ultimate goal of our analysis was to determine the degree and angle of linear polarization of both Luhman16A (p A and ψ A ) and B (p B and ψ B ). We began our analysis with a simple instrument model to characterize the degree of polarization measured by the instrument for each object, where we were not (yet) concerned with the true on-sky degree of polarization or angle of polarization. We assumed an instrument and telescope (i.e., M inst and M tel from Equation (3)) model that included instrumental polarization, but assumed perfect efficiencies and no « Q U crosstalks. With this model, the measured q NaCo and u NaCo at each parallactic angle (θ PA ) can be described with only six parameters (q A , u A , q B , u B , IP Q , and IP U ) by expanding the rotation matrix in Equation (3)  In this set of equations, we applied a negative sign to the righthand sign for both u A and u B to compensate for the sign flips between q NaCo and u NaCo seen by Witzel et al. (2011) anddeBoer et al. (2014). We fit for a joint solution for all six parameters from our full set of measurements with the SciPy least_squares function, using a "Cauchy" loss function to account for outlier data points. We display the results of this fit in Figure 4 as the "Simple Model," demonstrating the detection of polarization in both A and B. The fit returned q A =−0.007%, u A =0.027%, q B =0.007%, u B =−0.006%, IP Q =−1.921%, and IP U = −0.324. The degrees of linear polarization for A and B are p A =0.028% and p B =0.009%. While this simple model serves to highlight the detection of polarization in the two components, it fails to capture several important aspects when trying to accurately estimate the degrees and angles of linear polarization. First, in assuming a perfect system, we have failed to account for polarimetric efficiencies and crosstalks, as well as instrument angular offset, all of which will affect the estimates of the on-sky values. The polarimetric efficiencies and crosstalks would typically be measured using polarized standards observed on the same night as the observations. However, because our standards were observed after the HWP mechanism was replaced, a more complicated approach is required. From Figure 4 it is also clear that the formal photometric error bars do not represent the scatter in the data, and parameter errors estimated from the least-squares approach above would be inaccurate (for this reason we do not report them). Further, because we were required to apply a nonideal correction of the u NaCo measurements, we expect their errors to be larger than for q NaCo and there may be nonstandard correlations between q NaCo and u NaCo . In order to fully account for these effects, we developed an MCMC-based strategy to simultaneously fit for the polarization of Luhman 16A and B and instrumental effects, and to account for increased errors and correlations in the data.

Polarimetric Instrument Model
We first built up a Mueller matrix model for the combination of the instrument (M inst ) and the telescope (M tel ) to translate onsky polarization to the q NaCo and u NaCo values from each observation. Although Mueller matrices for NaCo have been measured before both in the H and K bands (Witzel et al. 2011;deBoer et al. 2014), both were obtained while NaCo was mounted on the UT4 telescope and neither used the pupil tracking mode. The move to a new telescope (UT1), the change in observing mode, and the increased depth of our data set motivated the development of a new system model.
In pupil tracking mode, the orientations of M inst and M tel do not change relative to each other, and M inst and M tel was therefore combined for all of our observations into a single Mueller matrix, which we call M inst+tel : where the IP Q,U elements represent the instrumental polarization (i.e., the total intensity that gets polarized, even for an unpolarized source), the η terms represent efficiency terms, and the  X terms represent crosstalk between Q and U. For completeness, we included the fourth row that corresponds to circular polarization, to which NaCo is insensitive.
Technically, all 16 elements of the 4×4 matrix need to be calculated in order to back out the true on-sky Stokes vector of a source. However, a number of reasonable assumptions were made to limit the number of free variables. First, because NaCo is only sensitive to linear polarization, the bottom row of M inst+tel can be effectively set to zero. Second, we assumed that our target is not significantly circularly polarized, which is reasonable for brown dwarfs and most stars, allowing us to set the fourth column to be all zeros (e.g., Clarke 2010). Third, we assumed that any polarization does not affect the total intensity, setting all but the [1, 1] element of the first row to be zero. Thus, we were left with only six free parameters in M inst+tel that had to be found: IP Q , IP U , η Q , η U ,  X U Q , and  X Q U . We also included in our system model two additional free parameters: (i) a rotational offset between NaCo's frame of reference and the sky, δθ PA , that was included in the rotation matrix in Equation (3), M rot (θ PA +δθ PA ), and (ii) an extra rotation Figure 6. The measured q NaCo and u NaCo for the Elias2-25 polarized standard observations. Also shown are the modeled q NaCo and u NaCo values for a random selection of walker positions from the MCMC fit, for the median posterior parameters from the MCMC fit, and for a perfect M inst+tel Mueller matrix (although including instrumental polarization). Error bars on the measurements are shown, including the extra error term from the MCMC fit, but are smaller than symbols. For each randomly selected walker position, a system Mueller matrix is generated and the modeled polarization is calculated by propagating the known polarization of Elias2-25 through Equation (3) (which includes all instrumental polarization effects). Deviation from the perfect model is due to nonperfect efficiencies (η) and crosstalks (X). Figure 7. The measured (black) and modeled (blue) instrumental polarization for the HD162973 unpolarized data set. The expected instrumental polarization is shown as a 2D histogram and is generated by picking values of IP Q and IP U from 30,000 randomly selected walker positions in the final MCMC chain and then rotating them with the associated δθ HWP value for that walker position. The model and the data agree to within 1σ. angle for the HWP, δθ HWP , that was applied in Equation (4) to the HWP Mueller matrix, M HWP (θ HWP +δθ HWP ). This second offset, δθ HWP , represents the unknown encoder offset between the April and May/June data and was only applied to the data obtained after the intervention (i.e., observations of Elias 2-25 and HD 162973).
In our fitting procedure, the instrumental polarization is constrained mainly by the Luhman 16 observations themselves. The instrumental polarization is essentially the mean of the q and u measurements of the source, with any astrophysical signal presented as a modulation on top of that (modulated in Parallactic angle). Under this scheme, the primary function of including the unpolarized standard is to constrain the relative offset of the encoder between the April data and the later data, δθ HWP . This offset is constrained by the difference in the instrumental polarization angle measured by the Luhman 16 data set in April and the instrumental polarization angle measured by the unpolarized standard. This offset's main purpose is to connect the constraints on the q and u efficiencies and crosstalks obtained by the polarized standard to the April Luhman 16 data.
The instrumental polarization has the potential to evolve slightly on one-to two-month timescales (e.g., due to degradation of mirror coatings); this evolution could affect the δθ HWP value, the crosstalks, and the efficiencies, and in turn the final derived degrees and angles of polarization for A and B. If the unpolarized standard were being used to constrain the instrumental polarization in the Luhman 16 data set, the potential for this evolution would be of significant concern in interpreting our signal. However, in our case, the unpolarized standard is not being used in such a manner. The evolution of the polarimetric efficiencies and crosstalks on this timescale is likely to be on the order of a few percent or smaller. These effects are relative to the polarization signal itself, and given the low S/N of the polarization signal of Luhman 16A and B, we anticipate that the final degrees and angles of polarizations will be dominated by statistical errors rather than any systematic offsets introduced by this temporal evolution.
Residual fast-varying polarimetric systematics may unduly increase the scatter of our data beyond the read-out and photon noise, especially for the u NaCo measurements of Luhman16, where double differencing was not possible. To characterize these systematics, we included two extra error terms, one for the single-differenced data (i.e., u NaCo for Luhman 16 and the twilight observations), and one for the double-differenced data: σ SD and σ DD , respectively. In practice, we fit for ( ) s log SD 2 and ( ) s log DD 2 . Finally, to account for possible covariance between measurements taken within the same HWP cycle in the Luhman 16 data set, we include four covariance terms in our fit: (i) C q u , covariance between q NaCo and u NaCo (the same value for both Luhman 16A and B measurements), (ii) C q A q B , , , covariance between q A and q B , (iii) C q A u B , , , covariance between q A and u B , and (iv) C u A u B , , , covariance between u A and u B . Although typically one would expect the q and u measurements to be independent, the correction of the systematic bias term ò for the u NaCo measurements in a given HWP cycle relies on the ò measurements of q NaCo , motivating the inclusion of the covariance terms.

MCMC Model Fitting and Results
To extract accurate polarization measurements and errors, we adopted a strategy of simultaneously fitting for Luh-man16ʼs polarization and our system model parameters with a Bayesian MCMC approach, using all four of our data sets as input (Luhman 16, Twilight polarization, Elias 2-25, and HD 162973). Our full model includes 18 parameters, summarized in Table 4. Our choice of modeling strategy was largely driven by the many relationships between the model parameters and our different data sets, in addition to the need to compensate for the extra systematic error terms and covariances. For each model parameter, Table 4 summarizes the most constraining data set, as well as the other data sets whose interpretations are affected by that model parameter.
We constructed our log-likelihood function ( ( | ) Q p y ln ) for the MCMC fitting as the sum of four components as follows: Twilight where each sum is over the data (y), model ( f (Θ)), and covariance matrix (C) associated with each subscript for a given parameter set, Θ. For the Luhman16, Elias2-25, and HD162973 data sets, y included both q NaCo and u NaCo measurements. The model q NaCo and u NaCo measurements ( f (Θ)) were calculated from Equation (3) for a given input q sky and u sky and included the M inst+tel Mueller matrix (Equation (12) Rather than fitting for q sky and u sky and calculating p and ψ afterwards, we chose to fit for p and ψ directly and forward model into q sky and u sky , which allowed us to directly obtain a posterior distribution of p, avoiding the positive bias associated with calculating a single value of = + p q u 2 2 in the presence of noise on q and u.
For the Elias2-25 data set, the model values for q NaCo and u NaCo (i.e., ( f (Θ)) were calculated from q sky and u sky , using the known p=4.13%±0.02% and ψ PA =24°±1° (Whittet et al. 1992). HD 162973 has a measured B-band polarization of p=0.09%±0.055% (Mathewson & Ford 1970). Assuming a Serkowski polarization law (Serkowski 1973), with a maximum polarization of 0.09 and a peak wavelength of 0.55 μm, we estimated a polarization of ∼0.02% at 1.6 μm. Given that the error bars on the measurement are significantly larger than this value, we considered the intrinsic source polarization to be negligible and set q sky and u sky both to zero (further justifying our choice to average together the two measurements at different parallactic angles). Our choice of 0.55 μm sits roughly in the center of the range of values found in Whittet et al. (1992). If the true value of the peak polarization wavelength is smaller, then the polarization of HD 162983 in the H band will also be smaller. Whittet et al. (1992) find peak polarization wavelengths as high as ∼0.8 μm. If it were that high for HD 162983, the polarization in the H band would be ∼0.05%, which is still smaller than our error bars.
In the twilight portion of Equation (13), the data y is the measured position angle and the model position angle ( f (Θ)) was calculated from the model q NaCo and u NaCo , where the input q sky and u sky were calculated with Equation (14) assuming p=1.0 (this has no effect on the results) and ψ PA is 90°away from the solar azimuth at the time of observation.
For all data sets except the twilight position angle measurements, the covariance matrices were populated with diagonal variance terms calculated as the square of the original photon and read-out noise estimates for q NaCo and u NaCo plus either s SD 2 or s DD 2 , depending on whether the data were measured as a single difference or double difference. For the Luhman 16 data set, the four covariance terms (C q u , , ) multiplied by the square of the two associated diagonal terms were included as off-diagonal terms. With this definition, the expected range of the covariance parameters should range from −1 to 1. The covariance matrix for the twilight data was populated with diagonal terms corresponding to the square of errors on the angles of polarization.
To sample our parameter posterior distributions we employed the python-based ensemble-sampling MCMC package emcee (Foreman-Mackey et al. 2013). Priors for all the model parameters can be found in Table 5. The emcee ensemble-sampler was run for 50,000 steps with 256 walkers, after a burn-in of 1000 steps. After the run, we measured a maximum autocorrelation across all parameters of 48.4 steps, verifying that the chains should have reached equilibrium within the burn-in phase ( ( ) 10 autocorrelation times are needed for convergence; Foreman-Mackey et al. 2013).
Posterior distributions were estimated using 1 out of every 49 chains, to ensure statistical independence. The full "corner" plot showing the marginalized and joint probability distributions can be seen in Figure A4 (Foreman-Mackey 2016). All parameters appear single-peaked and mostly Gaussian in the marginalized posterior distributions, with the exception of η U and η Q , which both show a slightly skewed tail to smaller absolute values. The joint posterior distributions show correlations between η Q , η U ,  X Q U ,  X U Q , δθ PA , and δθ HWP , but all parameters appear well constrained in the marginalized posteriors. Table 5 summarizes the fitting results, where the median value from each marginalized posterior distribution is taken as the best-fit value, and the upper and lower 1σ values were taken as the 16% and 84% percentile values (corresponding to a confidence interval of 68%). As demonstrated in Figures 4-7, the model appears to fit all of our input data well.
From the results of the fitting, we detect polarization in both Luhman16A and B at the 8σ and 2.5σ levels, respectively. For Luhman16A, we find a linear degree of polarization of p A =0.031%±0.004% with an angle of ψ A =−32°±4°. For Luhman16B, we find a linear degree of polarization of p B =0.010%±0.004% with an angle of y =  -+ 73 B 11 13 . As expected, we find a higher value of σ SD than σ DD . In all cases, we find that the data is well fit by the model. Our fitting process recovers a weak covariance between the q NaCo and u NaCo measurements in the Luhman 16 data set, C q u , that we attribute to the correction of the ò systematic for the u NaCo measurements. We also recover to higher significance a covariance between u A and u B , which we attribute to residual uncorrected systematics in both u measurements as a result of the nonoptimal correction of ò. Random walker positions were selected from the final parameter chains and have been overplotted on the measurements of all four data sets in Figures 4-7 and A1 and A2.
Our Mueller matrix system model parameters qualitatively agree well with those found by previous characterization (Witzel et al. 2011;deBoer et al. 2014), modulo several sign flips that may be attributable to different sign conventions. Exact agreement was never expected, due to the change of telescope and aging mirror coatings. Our modeling strategy is able to achieve errors on the efficiencies and crosstalks similar to those of deBoer et al. (2014), where they are able to constrain them. However, the accuracy of our constraints on the instrumental polarization is two orders of magnitude greater than either of these two works. This can be attributed to the depth and stability of our data set (see Section 2.4.2), the simplicity of our instrument setup, and our self-calibration strategy. In contrast to previous modeling efforts that had to model M3 and NaCo (upstream of the HWP) separately, operating in pupil-tracking mode has allowed us to consider only a single Mueller matrix when modeling our system, therefore simplifying the observations required to constrain it. Our data-driven modeling strategy also contrasts with that of Witzel et al. (2011;and in part deBoer et al. 2014) in that we leave the relevant elements of our Mueller matrix as completely free parameters that we fit to the data, whereas they rely on specific function forms and assumed material parameters, giving our model more freedom to fit the data.
Although the 2.5σ detection of Luhman16B is of low S/N, we believe that it is real to within the accuracy of our system model architecture. To verify this, we explored several different model comparison metrics for different model setups. For all model setups, we considered the reduced χ 2 , the Aikake Information Criterion (AIC), and the Bayesian Information Criterion (BIC). For our default model, we calculated all three parameters for the fit described above, as well as the same fit, but setting p B =0.0. For this second case, we reduced the number of parameters by two, as both p B and ψ B need to be removed. We also considered a model where the instrumental polarizations (IP Q and IP U ) were separate free parameters for the A and B measurements. After rerunning the MCMC fit under this new assumption, we calculated all three metrics both with and without p B . Under this new assumption, we measured the same polarizations of A and B as those shown in Table 5. Finally, we consider a model where the polarization of B is forced to be zero from the start of the fitting procedure. The goodness-of-fit parameters for all model setups can be found in Table 6, with the best-fit models shown in bold When fitting for separate IP values for A and B, we find the best-fit instrumental polarization values for each component to be within 1−σ of the values reported in Table 5, and the instrumental polarization measured from A and B are within 2−σ of each other. For all model comparison metrics, we find that the default model that includes a polarization of B is the best model to describe the data. Using the best-fit system model, we inverted Equation (3) to obtain the sky-plane Q and U values as a function of time for A and B. The residuals displayed similar features in both Luhman 16A and B, which we attribute to uncorrected time-varying systematics likely related to the larger systematics in u (and encapsulated in σ SD ). We searched for variability using a Lomb-Scargle periodogram, but found no significant peaks. We conclude that we have not detected any polarimetric variability.
As a secondary check on our fitting results, we split our data in half and reran the fitting procedure. The data was split such that one of the fits included the first and third quarters of the data, and the second fit included the second and fourth quarters. Rather than splitting the data at the halfway mark, splitting the data up by quarters was done to ensure that each data set had sufficient diversity in parallactic angle. The new fits resulted in values of p A , ψ A , p B , ψ B , IP Q , and IP U that agreed with the mean values presented in Table 5 to within 1−σ (using the newly fit error bars). As expected, the new error bars werẽ 2 times worse than the original error bars.

Spatial Correlations
Due to our pupil tracking observing mode, the two binary components rotate around each other's center of light throughout our observation set. This naturally introduces a correlation between detector position and parallactic angle, and in turn, between detector position and the instrument-frame q and u measurements, q NaCo and u NaCo . Here we revisit the question of whether or not the spatial correlation of q A measured in Section 2.4.2 can be explained by an astrophysical signal.
We began by generating fake q data sets for A and B, given the best-fit model of Table 5, sampling the model at the parallactic angles corresponding to our observations. We then injected noise into the fake data sets by replacing each data point with a draw from a Gaussian distribution centered on the model value and with a width equal to the photometric errors and best-fit σ DD value added in quadrature. This procedure was repeated 1000 times, and Spearman-r and Kendall-Tau coefficients were measured for each fake data set with respect to the real detector x and y positions. Figure 9 displays the resulting distributions of the coefficients for this "noisy model." These histograms represent the range of correlation coefficients one might expect given the final errors in our model fitting (e.g., Curran 2014).
Next, we estimated the distribution of the Spearman-r and Kendall-Tau values for the q measurements of Luhman 16A/B against the detector x and y positions by perturbing the original q measurements 1000 times. For each iteration, each q measurement (for both A and B) was replaced with a new value drawn from a Gaussian distribution with a mean equal to the measurement value and a width equal to the photometric error ( Figure 9). As with the model data, this perturbation approach should represent the range of values expected in the two correlation coefficients given our errors. The Spearman-r and Kendall-Tau distributions for the perturbed data overlap significantly with the model distributions, suggesting that the posterior distributions recovered from the model fitting procedure are sufficient to explain the measured spatial correlation of the data.

Atmospheric Modeling
In this section, we consider what possible physical phenomena could result in the measured polarizations of Luhman 16A and B. The constant polarization that we measure is integrated over our entire 7hr observing window, corresponding to greater than one rotation period for each component, or nearly a full period if Luhman 16A's period is 8 hr. The measured signals cannot be attributed to longitudinally varying features, such as patchy clouds or spatially varying bands, because they would cause the degrees and angles of polarization to change in time as the features rotated in and out of view and would manifest as variability in our residuals. However, we cannot rule out lower-level variable features below our detection limits that may be superimposed upon the constant signal. Polarization from a circum-brown dwarf disk can also be ruled out due to the lack of any previous evidence of extra dust in the SED. Thus, oblateness (Section 4.1) and constant cloud bands (Section 4.2) are the only remaining possible sources of polarization.

Polarization due to Oblateness
We first considered the polarization signal that would be caused by oblateness in the case of a homogeneous atmosphere and cloud cover. Oblateness as a function of effective temperature was calculated using updated evolutionary models for objects with masses of 27.2, 29.3, 31.4, and 34.6M Jup (M. S. Marley et al. 2020, in preparation). Each model track provided the radius, moment of inertia, and effective temperature as a function of age for a given mass. The spin angular velocity was calculated for 5 and 8 hr periods, for each radius in each evolutionary track. Oblateness as a function of effective temperature was calculated using the Darwin-Radau formula that relates an object's spin angular velocity, mass, radius, and moment of inertia to its oblateness (Barnes & Fortney 2003). These curves were then interpolated to the known mass ranges Note. The best-fit models are shown in bold.
for Luhman16A and B (Figure 10). The 5 hr period was considered for both Luhman 16A and B, and the 8 hr period was only considered for Luhman16A. Using the measured effective temperatures from Faherty et al. (2014), we constrained Luhman16B's oblateness to -+ 0.0131 0.0015 0.0016 and Luhman16A's oblateness to either 0.0092±0.0009 for a 5 hr period or -+ 0.0036 0.0004 0.0003 for an 8 hr period. We note that all of these values are smaller than Jupiter's oblateness of 0.0649 for a rotation period of 9.9 hr (e.g., Dutta et al. 2009).
We then calculated polarization as a function of oblateness and line-of-sight inclination using a radiative transfer code previously applied to model polarized brown dwarfs (Sengupta & Marley 2009Marley & Sengupta 2011). We define an inclination of i=0°to be when a brown dwarf is viewed equator-on and i=90°is when viewed pole-on. To explore a representative parameter space for Luhman16A and B, we considered models for 1200 and 1300 K objects, each with a sedimentation parameter f sed =1 (thicker clouds) and f sed =3 (thinner clouds), generated from the Ackerman & Marley (2001) cloud code. Figure 11 displays the expected polarization due to oblateness for the f sed =1 case at the inclinations measured for Luhman16A (Karalidi et al. 2016) and Luhman16B . For all inclinations and masses, the f sed =3 models result in a polarization an order of magnitude less than what was measured. Given the oblateness ranges calculated above, the f sed =1 models predict a polarization in the range p B =0.026%-0.033% for Luh-man16B, and either p A =0.008%-0.010% or p A = 0.007%-0.009% for Luhman16A for a rotation period of 8 hr and 5 hr, respectively.
It is clear that for both Luhman16A and B, the f sed =1 models do not reproduce the measurements; the models Figure 9. Spearman-r and Kendall-Tau correlation values for q A and q B against the x and y detector position. The green and red distributions were calculated from our best-fit model, and the blue and orange distributions were calculated from the data. The solid and dashed black lines represent the values reported in Table 3 for q A and q B , respectively. underpredict the polarization of Luhman16A and overpredict the polarization of Luhman16B. For Luhman16B, this discrepancy could potentially be attributed to the presence of clouds thinner than those produced by the f sed =1 models. Models with higher values of f sed (i.e., thinner clouds) result in a decrease in the expected polarization (Sengupta & Marley 2010), and previous fits to the observed spectra of Luhman 16B do indeed suggest values higher than f sed =1.
The best-fit model to the HST/WFC3 0.8-1.6 μm timeresolved varying spectra is a linear combination of two models with f sed =1 and f sed =3 (each component with its own effective temperature), with the thinner cloud component ( f sed =3) always having a higher coverage fraction than the thicker clouds (Buenzli et al. 2015a(Buenzli et al. , 2015b. Although our polarimetric observations of Luhman 16B are consistent with thinner clouds, our data alone cannot definitively distinguish between oblateness-induced polarization with thinner clouds and polarization due to banded clouds (see Section 4.2).
On the other hand, we cannot resolve the underprediction of the oblateness model (with f sed =1) relative to the Luhmnan 16A measurement by invoking thicker clouds (i.e., f sed <1) without contradicting previous spectroscopic measurements. Model fitting to HST/WFC3 spectra always results in f sed 1; depending on the wavelength region analyzed, HST/WFC3 data of Luhman 16A are best fit by either homogeneous atmospheres with f sed =2 (Buenzli et al. 2015b), or a patchy model with a combination of f sed =1 and f sed =3 (Buenzli et al. 2015a). There is no evidence that a thicker, f sed <1, cloud model applies to Luhman 16A. If anything, the f sed >1 values found in these studies should reduce the expected polarization due to oblateness. Another mechanism other than oblateness is needed to explain the polarization of Luhman 16A.

Polarization due to Banded Clouds
Cloud banding in brown dwarfs has recently been inferred from modeling of quasi-periodic photometric variability (Apai et al. 2017) and has been predicted by general circulation models (Zhang & Showman 2014). Within the solar system, cloud banding can easily be seen in Jupiter, Saturn, and Neptune. de Kok et al. (2011) first showed that a single cloud band in a giant exoplanet/brown dwarf can easily cause degrees of linear polarizations greater than 0.5%. Further modeling by Stolker et al. (2017) demonstrated that depending on the cloud band distributions and oblateness of the source, one could achieve degrees of linear polarizations up to ∼1.33%.
For a given object, the net polarization at a given wavelength will depend on the line-of-sight inclination, the oblateness, and the properties of the cloud bands (i.e., the number of bands, their widths, their longitudes, and their relative polarization compared to the background atmosphere). Polarization from banded clouds can either supplement the net polarization from oblateness or work against it. For example, for a brown dwarf viewed equator-on, if polarizing clouds are found in an equatorial band, the angle of polarization from the clouds would be parallel to the direction of the pole projected on the sky and would "cancel out" some of the polarized intensity coming from oblateness, which is oriented perpendicular to the spin axis (Stolker et al. 2017). Depending on the relative strength of the polarization from the cloud bands versus that from oblateness, the polarization can (a) align itself with the spin axis, if equatorial clouds dominate the polarization, (b) be zero if the polarization from clouds and oblateness perfectly match, or (c) align itself perpendicular to the spin axis if oblateness dominates. Alternatively, if the polarizing clouds are concentrated at the poles (e.g., similar to the hazes in Jupiter poles), the clouds will only add to the oblateness-induced polarization, and the angle of polarization will always be perpendicular to the spin axis (Stolker et al. 2017). In Jupiter, these polar hazes are thought to originate from interactions with its aurora (West et al. 2004). For Luhman 16, Hα ) and radio (Osten et al. 2015) emission nondetections suggest negligible auroral activity, and as a result, we rule out polar hazes or clouds as possible sources of polarization.
Here we explore banding in a spherical atmosphere and demonstrate that using known characteristics of the system, we can recreate the measured polarization, without the need for significant model tuning. To estimate the number of bands to consider, we calculated the expected number of atmospheric jets given a characteristic horizontal wind speed, assumed radius, and rotation rate using a simple atmospheric scaling from Showman et al. (2010) based on 2D models. We assumed a radius of 1R Jup , and considered both 5 and 8 hr rotation periods. We calculated the number of jets for wind speeds of 10, 50, 100, and 200 m s −1 , consistent with the expected ranges ( Table 7; Showman & Kaspi 2013). We find that for a five-hour period, we can expect at least a single jet (i.e., more than 0.5 jets) for wind speeds <200 m s −1 and over two jets if the wind speeds are <10 m s −1 . For an eight-hour rotation period, we get one jet for wind speeds of <120 m s −1 for rotation periods of 8 hr and 1.8 jets for wind speeds of 10 m s −1 .
Given these results, we calculated the expected degree of linear polarization for cloud configurations with one, two, and three bands at brown dwarf inclinations of 20°and 56°( roughly corresponding to the possible inclinations of Luhman 16A and B). Although the above calculations do not predict three jets for the wind speeds we explored, we include a threeband model as a proxy for the cloud bands inferred for 2MASS J13243553+6358281 (Apai et al. 2017). We assumed that the atmospheres of Luhman 16A and B are composed of two components: a "background" atmosphere at 1100 K with thicker clouds ( f sed =1) and a "bands" atmosphere component at 1300 K with thinner clouds ( f sed =3; Buenzli et al. 2015b).  Faherty et al. (2014). Shown in gray are the original evolutionary models that were interpolated to obtain the dashed colored regions.
We first used the Marley et al. (1996) radiative transfer code to calculate the temperature-pressure and composition profiles of the two atmosphere components, as well as their cloud properties. We injected these profiles into the polarizationsensitive radiative transfer code used in de Kok et al. (2011), which uses a doubling-adding method that fully includes all orders of scattering and polarization to calculate the polarization of the two atmosphere components in the H band. Clouds were modeled using Mie theory, as is standard in brown dwarf and exoplanet cloud modeling.
We modeled the atmospheres of Luhman 16A and B, assuming either one band spanning −10°to +10°of latitude, or two bands spanning −50°to −30°, and +30°to +50°d egrees of latitude. The models with three bands included the bands from both the one-band and two-band models. Net degrees of polarization were calculated for an inclination of ∼20°for the models with one, two, or three bands and ∼56°f or the two-band model. In order to explain the quasi-periodic photometric variability, previous modeling relied on a sinusoidal modulation of the band brightness with longitude (Apai et al. 2017). Because our observations were averaged over a rotation period, we assumed that the bands were rotationally invariant to the extent of our time resolution and sensitivity. Our model atmosphere was split in a grid of 2°×2°resolution in latitude-longitude, and each pixel was assigned one model ("band" or "background atmosphere"). We assumed that our pixels were large enough to be able to ignore adjacency effects, i.e., light that is scattered within more than one pixel. For each configuration, we integrated the spatially resolved polarized fluxes across the observed disk and divided by the total intensity fluxes to get the degree of linear polarization (Table 8, Figure 12).
With these models, we can recreate the measured values to within a factor of ∼1.5. By adjusting our two-banded model to have bands with widths of 32°centered at latitudes of −35°a nd +35°, we can reproduce the measured degree of linear polarization of 0.03% (shown in bold in Table 8). This configuration was found via a nonexhaustive, manual exploration of possible cloud configurations, and we expect that it is not a unique solution. Further, these models do not currently simultaneously include oblateness and cloud bands. Nonetheless, these models demonstrate that banded clouds are easily able to explain the level of polarization measured in Luhman 16A.
Considering the disagreement between the measurements and the homogeneous oblate polarization model, we conclude that cloud banding is the dominant source of the measured polarization in Luhman16A. For Luhman16B, we can neither rule out nor confirm the presence of banded clouds; the measurements could be explained as due to oblateness alone with clouds thinner than in the f sed =1 models, or by a combination of oblateness and banded clouds. Although we model the bands and fit the polarization as constant over longitude and time, there is the possibility of variability with amplitudes below our sensitivity limits. Indeed, the bands of Jupiter display variability, yet the bulk features remain relatively constant over time (Ge et al. 2019). Similar variability superimposed on the bands discovered here may account for the previously detected photometric variability seen in Luhman 16A.

Spin Axis Orientations from the Angle of Linear Polarization
The angle of polarization for Luhman16A and B differs by ∼105°(or equivalently, ∼75°, because of a 180°degeneracy in the definition of the angle of polarization). For each object, the angle of polarization could be aligned either parallel or perpendicular to the position angle of the projected on-sky spin axis, as discussed in Section 4.2. The misalignment of the angle of polarization immediately suggests a misalignment of the two components' spin axes of either ∼15°or ∼75°, depending on the source of polarization of Luhman16B. Neither component has an angle of polarization aligned parallel or perpendicular to the longitude of the ascending node of the Figure 11. Polarization as a function of oblateness for T eff =1200 K (dashed lines) and 1300 K (solid lines) for the inclinations relevant for Luhman 16A (left) and Luhman 16B (right). All models use a sedimentation parameter f sed =1. Thickened solid lines highlight the oblateness ranges determined in Figure 10 for the 1300K curves. Also shown as horizontal bands are the ±1σ ranges for the polarization measured from Luhman16A (orange) and B (blue).  (Garcia et al. 2017). Luhman16A's polarization is at best ∼20°misaligned, and Luhman16B is ∼40°-60°, indicating that neither object has a spin axis aligned with the orbital plane, regardless of the oblateness or banded interpretation. The misalignment may imply that the two objects did not form alone together within the same disk, but instead may have instead experienced a more raucous dynamical history such as having originally been formed in a triple system (Reipurth & Mikkola 2015). To the best of the authors' knowledge, this is the first measurement of the projected spin axis for any brown dwarf (albeit with a 90°d egeneracy). Given the low-S/N detection of Luhman16B, we consider any interpretation of its position angle to be tentative.

Total Intensity Variability Interpretation
We were unable to measure absolute photometric variability, but the flux ratio of the two objects exhibits quasi-periodic variations, with a period of 1.64 hr. We attribute this signal to Luhman 16B alone (H-band variability has never been detected in A; e.g., Biller et al. 2013;Buenzli et al. 2015b) and suggest that it may represent a similar three-band scenario to that inferred for other L/T transition objects, where slight differences in the rotational periods of each band results in photometric variations that beat over time (Apai et al. 2017). In this scenario our data would represent a time when the three bands were nearly completely out of phase, but each band still rotates with a ∼5 hr period. However, unambiguously distinguishing between a model composed of only cloud patches and that which includes cloud bands would require a baseline of >2 rotation periods.

Conclusions
Here we have presented new H-band linear polarization measurements of the Luhman 16 binary brown dwarf system obtained with VLT/NaCo. The measurements of Luhman16B could be explained by oblateness or cloud banding, but the polarization of Luhman16A can only be explained by bands of clouds, similar to those seen throughout the solar system. Previous photometric and spectroscopic variability studies of Luhman 16A have either suggested patchy clouds or have been Figure 12. Schematic images displaying a selection of the different cloud banding and oblateness scenarios that were explored. The modeled degree of linear polarization p and angle of linear polarization ψ (and blue arrows) are shown above each model. The degrees of polarization calculated for the oblateness models ( Figure 10) are displayed as upper limits, as thinner clouds could reduce the polarization. The measured values for A and B are displayed for reference, and a banded cloud model that matches the polarization of Luhman 16A has been highlighted. The angles of polarization for the measurements are given in the sky frame measured east of north, but the models are given relative to the on-sky projected rotation axis. unable to constrain its cloud morphology due to nondetections of variability. In contrast, our polarimetric measurements detect bands that could not have been found using these techniques, as both methods rely on cloud morphologies or variations that rotate in and out of view. Although our data are inconclusive about the presence of bands in Luhman 16B, bands cannot be ruled out by our data or previous studies; the Doppler-imaging technique used with Luhman 16B was unable to recover longitudinal bands artificially injected into the data ) and cloud spot mapping using HST data did not consider banded structures (Karalidi et al. 2016).
The polarization measurements of Luhman 16A represent the state of the cloud morphology at a fixed point in time. The absolute variability of previous polarization measurements of brown dwarfs suggest that cloud morphologies may vary significantly over time. Whether or not the polarization of Luhman 16A varies in time is yet to be determined. We have obtained follow-up observations of Luhman 16 over four nights in 2019 April in order to search for both short-term and longterm variability in the polarization. These data have not yet been analyzed and will be published in a follow-up study.
Cloud bands in brown dwarfs have only been previously inferred for three L/T transition brown dwarfs via photometric monitoring (Apai et al. 2017). The discovery of cloud bands in Luhman 16A provides a critical independent confirmation of banded clouds near the L/T transition and suggests that their presence is common, if not ubiquitous. Whether or not these bands persist outside of the L/T transition remains an open question that should be pursued with further studies. We have demonstrated that polarimetry provides a promising avenue to answer this question; variability occurrence rates and amplitudes decline outside of the L/T transition, and therefore many targets are unsuitable to cloud mapping via variability monitoring. High-accuracy targeted polarimetric studies sensitive to cloud bands similar to those found in this study may therefore be critical to furthering our understanding of the cloud dynamics in substellar objects. The characterization presented herein relies heavily on previous characterization of Luhman16. Similar polarimetric studies should be pursued for other brown dwarfs with comparable levels of characterization (e.g., brown dwarfs with well-constrained masses and/or inclinations).
We note that the exquisite accuracy of our polarimetric measurements is due not only to the brightness of Luhman16, but also to the specific instrumental setup that allowed us to easily distinguish between instrumental polarization and astrophysical signal. Unfortunately, NaCo was decommissioned in 2019, preempting further studies with the same instrument. Other NIR polarimeters, such as SPIRou (Artigau et al. 2014), MMT-Pol (Packham & Jones 2008), and/or WIRC+Pol (Tinyanont et al. 2018) (among others) may provide other opportunities for new brown dwarf polarimetric studies. The discovery of cloud bands in Luhman 16A via polarimetry sets the stage for a wide range of new 3D atmospheric dynamics studies, not only for brown dwarfs like Luhman 16, but also for giant exoplanets.
We thank D. Saumon for updated brown dwarf evolutionary models. Support for this work was provided by NASA through the NASA Hubble Fellowship grant HST-HF2-51378.      Note. The error bars represent the 1σ photometric errors, not neccesarily the ultimate photometric stability of the relative measurements. Note. The errors presented here are the 1σ photometric errors (i.e., the black error bars in Figure 4) and do not include the extra error term that was fit for (i.e., the red error bars in Figure 4).