Closure Statistics in Interferometric Data


Published 2020 May 1 © 2020. The American Astronomical Society. All rights reserved.
Citation: Lindy Blackburn et al. 2020 ApJ 894 31. DOI: 10.3847/1538-4357/ab8469


Abstract

Interferometric visibilities, reflecting the complex correlations between signals recorded at antennas in an interferometric array, carry information about the angular structure of a distant source. While unknown antenna gains in both amplitude and phase can prevent direct interpretation of these measurements, certain combinations of visibilities called closure phases and closure amplitudes are independent of antenna gains and provide a convenient set of robust observables. However, these closure quantities have subtle noise properties and are generally both linearly and statistically dependent. These complications have obstructed the proper use of closure quantities in interferometric analysis, and they have obscured the relationship between analysis with closure quantities and other analysis techniques such as self calibration. We review the statistics of closure quantities, noting common pitfalls that arise when approaching low signal to noise due to the nonlinear propagation of statistical errors. We then develop a strategy for isolating and fitting to the independent degrees of freedom captured by the closure quantities through explicit construction of linearly independent sets of quantities along with their noise covariance in the Gaussian limit, valid for moderate signal to noise, and we demonstrate that model fits have biased posteriors when this covariance is ignored. Finally, we introduce a unified procedure for fitting to both closure information and partially calibrated visibilities, and we demonstrate both analytically and numerically the direct equivalence of inference based on closure quantities to that based on self calibration of complex visibilities with unconstrained antenna gains.


1. Introduction

Interferometric observations allow diffraction-limited resolution on angular scales that are inaccessible to single-element systems. However, interferometers have the limitation of only sparsely sampling information in the so-called visibility domain. While measured visibilities have simple and deterministic thermal noise, they also have complex systematic errors. These systematic errors manifest as variations in visibility amplitudes and phases on many timescales, representing limitations imposed by a broad range of sources including the constituent interferometer elements, reference frequencies, and atmosphere.

The dominant systematic errors are station-based effects, corresponding to multiplicative complex gain factors. In this case, "closure" quantities can be constructed, which are independent of the station-based systematic calibration errors. Specifically, closure phases consist of a directed sum of visibility phases around a closed triangle joining three stations (Jennison 1958; Rogers et al. 1974), while closure amplitudes are the quotient of two visibility products involving four stations (Twiss et al. 1960; Readhead et al. 1980). These quantities have found particular utility in very long baseline radio interferometry (VLBI), where array sparsity motivates the use of model-independent observables. For an array with N stations, one can form ∼N³ closure phases and ∼N⁴ closure amplitudes from the original set of ∼N² visibilities. However, there are at most (N − 1)(N − 2)/2 degrees of freedom in the closure phases and N(N − 3)/2 degrees of freedom in the closure amplitudes. The necessary degeneracy between the full sets of closure quantities is captured by the structure of their covariance.
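These counting arguments are straightforward to verify numerically. The following sketch (our own construction; the function name is arbitrary) tabulates the full and nonredundant counts for a fully connected array:

```python
from math import comb

def closure_counts(N):
    """Full vs. nonredundant counts for a fully connected N-station array."""
    baselines = comb(N, 2)           # ~N^2 visibilities
    triangles = comb(N, 3)           # ~N^3 closure phases
    quadrangles = 3 * comb(N, 4)     # ~N^4 closure amplitudes (3 per 4-station subset)
    cp_dof = (N - 1) * (N - 2) // 2  # closure phase degrees of freedom
    ca_dof = N * (N - 3) // 2        # closure amplitude degrees of freedom
    return baselines, triangles, quadrangles, cp_dof, ca_dof

print(closure_counts(8))  # (28, 56, 210, 21, 20)
```

Already at N = 8 the full sets of closure quantities are several times larger than the number of independent degrees of freedom they contain.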

Closure quantities are useful for interferometric analysis, especially model fitting and imaging (e.g., Readhead & Wilkinson 1978; Chael et al. 2018), because they eliminate the need to model station gain systematics, and their error budget can be determined from first principles. Yet, despite the fundamental importance of closure quantities for interferometry, there is widespread variation in the literature concerning their properties, best practices when utilizing closure quantities, their relationship with standard analyses such as self calibration, and the role of linearly independent sets of closure quantities (which are not necessarily statistically independent). Moreover, most analyses to date ignore the covariance between closure quantities, which can be significant; although, covariance of closure phases has been studied in the optical interferometry community (e.g., Kulkarni et al. 1991; Martinache 2010; Ireland 2013).

Here, we provide a rigorous foundation for analysis using closure quantities, and we give procedures for selecting nonredundant sets of closure phases and amplitudes. We demonstrate that, when covariance is correctly accounted for, these nonredundant sets carry the full information of the complete sets of closure quantities. Moreover, in the limit of completely unconstrained station gains, we show that analysis of closure quantities is identical to analysis of complex visibilities with gain marginalization. We also give procedures for selecting nonredundant sets that minimize covariance, and we demonstrate the effects of covariance among closure products using simulated data from simple models.

We begin, in Section 2, by discussing thermal and systematic errors in interferometric measurements, and we assess the conditions under which errors on closure quantities can be approximated as Gaussian. Next, in Section 3, we evaluate the covariance among closure quantities and give prescriptions for selecting nonredundant sets of closure quantities. Then, in Section 4, we apply our results to simple model fits using closure quantities and demonstrate the role of nonredundant sets and covariance among closure products. We summarize our results in Section 5. The notation used throughout the paper is described in Table D1.

2. Closure Quantities and Errors

2.1. Interferometric Visibility and Gain

An interferometric array aims to measure the complex coherence function of the electric field, or visibility ${V}_{{ij}}=E\left[{{ \mathcal E }}_{i}{{ \mathcal E }}_{j}^{\ast }\right]$ (represented here in the frequency domain, with ${ \mathcal E }$ in units such that expectation value ${S}_{\nu }\sim E\left[| {{ \mathcal E }}_{i}{| }^{2}\right]$ is the electromagnetic flux spectral density), from a distant source at two locations i and j in the plane of propagation. Vij samples a Fourier component of the brightness distribution on the sky (via the van Cittert–Zernike theorem, van Cittert 1934; Zernike 1938; Thompson et al. 2017), with spatial frequency corresponding to the projected baseline length in units of observing wavelength. In the idealized case, the field is measured without any attenuation or propagation delays (e.g., through atmosphere). In practice, the measured complex signal vi at an antenna i can be modeled as the idealized incident aligned electric field ${{ \mathcal E }}_{i}$ subject to a linear complex gain factor γi, plus additive zero-mean circularly symmetric complex Gaussian noise ni

Equation (1): ${v}_{i}={\gamma }_{i}{{ \mathcal E }}_{i}+{n}_{i}$

In continuum-VLBI, the noise power typically exceeds signal power by a large factor: $E\left[| n{| }^{2}\right]\gg E\left[| \gamma { \mathcal E }{| }^{2}\right]$. The gain for a particular antenna feed is a function of time and frequency γ = γ(t, f), and while a variety of simplifying assumptions and factorizations can be made, the gain is often not known a priori to a high degree of precision. Thus, the fundamental observable is not source visibility Vij but cross-covariance rij between pairs of antennas:

Equation (2): ${r}_{{ij}}=E\left[{v}_{i}^{}{v}_{j}^{* }\right]={\gamma }_{i}^{}{\gamma }_{j}^{* }{V}_{{ij}}$

If the signals vi and vj are normalized by their noise power such that $E\left[| {n}_{i}{| }^{2}\right]=1$, rij is the correlation coefficient, and ${\mathrm{SEFD}}_{i}=1/| {\gamma }_{i}{| }^{2}$ represents the system-equivalent flux density (noise power in units of flux above the atmosphere).

Relating the measured correlation coefficients rij to source visibilities Vij is the process of calibration and may include estimating the magnitude of $| {\gamma }_{i}| $ through the observation of bright flux calibrators, measuring differential phase $\mathrm{Arg}\left[{\gamma }_{i}^{}{\gamma }_{j}^{* }\right]$ by observing phase calibrators with known structure, or by process of self calibration where the gains γi are solved simultaneously with unknown source model parameters. For a VLBI array at millimeter wavelengths, calibration is made difficult by the strong and rapidly changing atmospheric effects and by the lack of bright compact calibration sources of known structure. Amplitude and phase gain systematics often dominate over the thermal (statistical) noise that arises from estimating rij over finite time and bandwidth.

2.2. Closure Phase and Closure Amplitude

Closure quantities are special combinations of correlation measurements taken over closed loops in an antenna network. They are able to cancel out station-based gains γi, giving observables that depend only on intrinsic source parameters.

A closure phase is the sum of measured phases around a closed triangle of baselines,

Equation (3): ${\psi }_{123}={\phi }_{12}+{\phi }_{23}+{\phi }_{31}$

where ϕ12 is the phase on baseline 1–2,

Equation (4): ${\phi }_{12}=\mathrm{Arg}\left[{r}_{12}\right]$

Written as the phase of the complex bispectrum (triple product) V123 = V12V23V31, we see that a closure phase is independent of arbitrary phase gain $\mathrm{Arg}\left[{\gamma }_{i}\right]$,

Equation (5): ${\psi }_{123}=\mathrm{Arg}\left[{r}_{12}{r}_{23}{r}_{31}\right]=\mathrm{Arg}\left[{\gamma }_{1}^{}{\gamma }_{2}^{* }{V}_{12}\,{\gamma }_{2}^{}{\gamma }_{3}^{* }{V}_{23}\,{\gamma }_{3}^{}{\gamma }_{1}^{* }{V}_{31}\right]$

Equation (6): ${\psi }_{123}=\mathrm{Arg}\left[{V}_{12}{V}_{23}{V}_{31}\right]=\mathrm{Arg}\left[{V}_{123}\right]$

since every gain term on the right-hand side is multiplied by its complex conjugate.
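This cancellation is easy to confirm numerically. The noise-free sketch below (our own construction) applies arbitrary complex station gains to a triangle of visibilities and checks that the closure phase is unchanged:

```python
import numpy as np

rng = np.random.default_rng(42)

def cnormal(n):
    """Circularly symmetric complex Gaussian samples."""
    return rng.normal(size=n) + 1j * rng.normal(size=n)

V12, V23, V31 = cnormal(3)   # underlying source visibilities on the triangle
g1, g2, g3 = cnormal(3)      # arbitrary complex station gains

# noise-free measured correlations r_ij = g_i g_j^* V_ij
r12 = g1 * np.conj(g2) * V12
r23 = g2 * np.conj(g3) * V23
r31 = g3 * np.conj(g1) * V31

psi_meas = np.angle(r12 * r23 * r31)   # closure phase from measurements
psi_true = np.angle(V12 * V23 * V31)   # intrinsic closure phase
print(np.isclose(psi_meas, psi_true))  # True: gains cancel in the bispectrum
```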

A set of correlation coefficients connecting four sites in a closed quadrangle can be used to calculate a closure amplitude,

Equation (7): $| {A}_{1234}| =\dfrac{| {r}_{12}| | {r}_{34}| }{| {r}_{13}| | {r}_{24}| }=\dfrac{| {V}_{12}| | {V}_{34}| }{| {V}_{13}| | {V}_{24}| }$

In this case, we see that closure amplitude is independent of arbitrary amplitude gain $| {\gamma }_{i}| $ since each station gain amplitude term appears in both the numerator and denominator.
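The amplitude-gain cancellation can be verified the same way as for closure phase; this sketch (our own, noise-free, with sites labeled 0–3) forms a closure amplitude from gain-corrupted correlations:

```python
import numpy as np

rng = np.random.default_rng(7)
g = rng.normal(size=4) + 1j * rng.normal(size=4)  # complex station gains
pairs = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]
V = {p: rng.normal() + 1j * rng.normal() for p in pairs}  # source visibilities
r = {(i, j): g[i] * np.conj(g[j]) * V[(i, j)] for (i, j) in pairs}

def closure_amplitude(x):
    """|x_01 x_23| / (|x_02| |x_13|): one quadrangle over sites 0-3."""
    return abs(x[(0, 1)]) * abs(x[(2, 3)]) / (abs(x[(0, 2)]) * abs(x[(1, 3)]))

print(np.isclose(closure_amplitude(r), closure_amplitude(V)))  # True: |g_i| cancel
```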

By canceling station gains, the closure quantities are able to isolate measurement degrees of freedom that are independent from gains, and they provide observables that are accurate to the thermal-noise limit or to residual baseline errors (Massi et al. 1991), which are typically much smaller than the station errors. Thus, they are particularly valuable when systematic gain uncertainty is much larger than statistical uncertainty. We will see in subsequent sections that the closure phases and closure amplitudes capture all of the gain-invariant degrees of freedom from the baseline visibilities, at the cost of removing any prior information about the gains.

2.3. Statistical Thermal Noise

When estimated from actual data, the closure quantities and associated correlation coefficients from Equations (2)–(7) must be averaged over a finite time and bandwidth in order to accumulate signal-to-noise ratio (S/N). A measurement of rij taken over integration time Δt and bandwidth Δν averages ${\rm{\Delta }}t\,{\rm{\Delta }}\nu $ independent complex samples (finite average denoted with $\langle \rangle $), and includes contributions from both the source and the independent zero-mean (and normalized) thermal noise at each antenna,

Equation (8): ${r}_{{ij}}=\langle {v}_{i}^{}{v}_{j}^{* }\rangle \approx {\breve{r}}_{{ij}}+\langle {n}_{i}^{}{n}_{j}^{* }\rangle $

We have introduced a breve accent ${\breve{r}}_{{ij}}={\breve{\gamma }}_{i}{\breve{\gamma }}_{j}^{* }{\breve{V}}_{{ij}}$ to distinguish underlying (ground-truth) values from those that are subject to statistical or systematic errors. Under the previously adopted normalization $E\left[| {n}_{i}{| }^{2}\right]=1$ (see Equation (2)), the variance for one sample of correlated complex noise is $E\left[| {n}_{i}^{}{n}_{j}^{* }{| }^{2}\right]=1$, and the variance in one component (real or imaginary) of the averaged complex noise correlation $\langle {n}_{i}^{}{n}_{j}^{* }\rangle $ is then

Equation (9): ${\sigma }_{r}^{2}=\dfrac{1}{2\,{\rm{\Delta }}t\,{\rm{\Delta }}\nu }$

where the amount of time-frequency averaging to reduce ${\sigma }_{r}^{2}$ is ultimately constrained by assumptions regarding station gain variability and source model variability.

The underlying signal-to-noise $\breve{\rho }$ of a correlation coefficient amplitude $| \breve{r}| $ under time-frequency averaging, and assuming negligible evolution of model visibility due to source structure or residual systematics, is

Equation (10): $\breve{\rho }=\dfrac{| \breve{r}| }{{\sigma }_{r}}=| \breve{r}| \sqrt{2\,{\rm{\Delta }}t\,{\rm{\Delta }}\nu }$

This is taken in the moderate-to-high $\breve{\rho }$ limit where it is meaningful to measure noise along just one of the complex components. A correlated flux density of 1 Jy with a typical geometric mean SEFD of 10⁴ Jy would give an expected correlation coefficient of 10⁻⁴ and an S/N of 4.5 over 1 GHz of bandwidth in 1 s of integration time. At $\breve{\rho }\lt 1$, the ability to measure correlation phase and amplitude degrades rapidly (Rogers et al. 1995), so that the minimum acceptable integration time is fundamentally limited by a combination of source strength, bandwidth, and collecting area.
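The arithmetic of this worked example can be sketched directly. The helper below is our own; it assumes the single-component noise convention ${\sigma }_{r}=1/\sqrt{2\,{\rm{\Delta }}t\,{\rm{\Delta }}\nu }$ used above and the approximation $| r| \sim S/\mathrm{SEFD}$ for a geometric-mean SEFD:

```python
import math

def correlation_snr(flux_jy, sefd_jy, bandwidth_hz, t_sec):
    """S/N of a vector-averaged correlation coefficient: rho = |r| sqrt(2 dt dnu)."""
    r = flux_jy / sefd_jy          # expected correlation coefficient
    return r * math.sqrt(2 * bandwidth_hz * t_sec)

# the example from the text: 1 Jy source, 10^4 Jy SEFD, 1 GHz bandwidth, 1 s
print(round(correlation_snr(1.0, 1e4, 1e9, 1.0), 1))  # 4.5
```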

At the same time, any uncompensated complex gain $\breve{\gamma }$ (from Equations (1) and (2)) must be stable over the averaging timescale to both ensure a meaningful measurement and to avoid phase decoherence while vector averaging complex baseline visibility. At millimeter and submillimeter observing wavelengths, phase decoherence due to atmospheric turbulence occurs on timescales of seconds. The requirement that $\breve{\rho }\gt 1$ over an averaging time Δt and bandwidth Δν where gain variation remains negligible sets the observational constraints where the use of closure quantities is particularly effective. Gain variation over the frequency bandpass is generally a stable instrumental effect that can be well measured and calibrated out. Similarly, relative complex gain between two orthogonal feeds in an antenna, e.g., γR/γL or γX/γY, is generally stable and can be calibrated, so that synthesized combinations of correlation products such as Stokes I = VRR + VLL are also characterized by station-based residual gains that close.

For observations at high radio frequencies, rapid phase gain variability in time due to the atmosphere is a primary driver of efforts to expand the collecting area and instantaneous bandwidth of mm-VLBI arrays such as the Event Horizon Telescope (EHT; Event Horizon Telescope Collaboration et al. 2019a, 2019b). So long as there exists at least one high-S/N baseline to a given site connecting it to the array phase center, the atmospheric phase variations can generally be solved for and removed. This allows for longer coherent integration on weak baselines to that site (Blackburn et al. 2019). In the case of the EHT, this condition is generally satisfied by the presence of the highly sensitive ALMA in the array.

In the following subsections, we discuss the consequences of low S/N on characterization of errors in phase and amplitude, and also the propagation of these errors across derived closure quantities. However, we do not explore optimal averaging strategies for the generation of closure phase and closure amplitudes. This would require assuming a prior model for gain variability. Rather, we assume there exist some Δt and Δν such that $\breve{\rho }\gt 1$ is maintained on all baselines, and over which $\breve{\gamma }$ can be made reasonably stable.

2.4. Non-Gaussian Errors at Low S/N

The observed correlation coefficients (Equations (8) and (9)) are subject to measurement noise that is complex and independent in real and imaginary components. While this implies that complex visibility is the natural measurement space for correlation observables, gain systematics are largely separable into amplitude factors (e.g., aperture efficiency) and phase factors (e.g., variable path delay), and this is reflected in the way closure amplitude and closure phase are formed.

The transformation from errors in real and imaginary coefficients to errors in amplitude and phase is only effectively linear for $\breve{\rho }\gg 1$. A consequence is that the statistical error budget of closure quantities becomes progressively non-Gaussian as $\breve{\rho }$ becomes small. This is particularly severe for the case of reciprocal amplitude, which is a necessary component of closure amplitude (Equation (7)). Heavy tails in the distribution for reciprocal amplitude are one motivation to move to log-closure amplitudes, which place the numerator and denominator of a closure amplitude on equal footing.
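The tail behavior is easy to reproduce by Monte Carlo. The sketch below (our own, mirroring the setup of Figure 2 with Rice-distributed amplitudes at S/N 8, 8, 5, 5) compares the excess kurtosis of the amplitude ratio with that of the log ratio:

```python
import numpy as np

rng = np.random.default_rng(3)
n = 200_000

def rice_amplitude(rho, n):
    """|rho + unit-variance-per-component complex noise|: Rice with S/N rho."""
    return np.abs(rho + rng.normal(size=n) + 1j * rng.normal(size=n))

A, B, C, D = (rice_amplitude(rho, n) for rho in (8, 8, 5, 5))

ratio = (A * B) / (C * D)                                  # closure amplitude
log_ratio = np.log(A) + np.log(B) - np.log(C) - np.log(D)  # log-closure amplitude

def excess_kurtosis(x):
    """Rough tail measure: 0 for a Gaussian, positive for heavy tails."""
    z = (x - x.mean()) / x.std()
    return float((z**4).mean() - 3.0)

# the plain ratio is far heavier-tailed than its log counterpart
print(excess_kurtosis(ratio) > excess_kurtosis(log_ratio))  # True
```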

We will primarily assume that measured amplitudes, log amplitudes, and phases for visibilities and for closure quantities can each be approximately, yet adequately, characterized as a Gaussian random process with assumed model mean and variance. This is the case for $\breve{\rho }\gtrsim \mathrm{few}$, which is typically achieved in continuum radio interferometry through sufficient time-frequency averaging in the weak-signal limit. Examples of closure phase and closure amplitude distributions, along with corresponding high-S/N normal distribution approximations, are shown in Figures 1 and 2.

Figure 1. Distribution of closure phases vs. Gaussian approximation from the high-S/N theoretical limit (dashed lines). For each closure phase, all three baseline visibilities are drawn from a complex normal distribution with mean value $\breve{\rho }$ and unity variance in each complex component.


Figure 2. Distributions of closure amplitudes vs. Gaussian approximations from the high-S/N theoretical limit. Baseline amplitudes A, B, C, D are drawn from a Rice distribution with noncentral amplitude 1 and ${\breve{\rho }}_{A},{\breve{\rho }}_{B},{\breve{\rho }}_{C},{\breve{\rho }}_{D}$ = (8, 8, 5, 5). There are large tails in the standard closure amplitude ratio due to amplitudes in the denominator that approach zero (top panel). The tail is mitigated somewhat by placing the lower-S/N measurements in the numerator (middle panel). However, using log-closure amplitude provides a better-behaved distribution overall (bottom panel).


These ensemble distributions for measured phase and amplitude are exactly calculable for a given model $\breve{\rho }$, even in the low $\breve{\rho }$ limit where the distributions become non-Gaussian. However, in practice, the underlying intrinsic signal-to-noise $\breve{\rho }$ is generally not known, which means the distribution from which a single measured ρ is drawn is also not known precisely. Unless $\breve{\rho }$ is either assumed under a complete forward model (incorporating model visibility and all forward gains) or based on additional averaging beyond the single measurement of r, any estimate will be subject to thermal noise. In addition to a general mischaracterization of errors, this can also lead to a self-selection bias if realizations that are randomly low amplitude are assigned larger errors or if they are preferentially flagged from the data.

An expanded description of phase and amplitude distributions is given in Appendix A. The distributions for phase, amplitude, and log amplitude can be reasonably approximated as Gaussian for $\breve{\rho }$ above 2–5. For log amplitude, a full characterization of the distribution under incoherent averaging of amplitudes is given in terms of moments. This is useful for estimating the a priori amplitude noise bias that becomes significant at low S/N. However, if a significant amount of informative data has low S/N, it may be advantageous to forward model complex gains and explicitly marginalize over their uncertainties, at least for the affected stations. This keeps data in the complex domain and their errors Gaussian.

3. Independence of Closure Quantities

The ∼N³ possible closure phases and ∼N⁴ closure amplitudes are formed using the original ∼N² baseline visibilities and become highly redundant at large N, where a much smaller subset of nonredundant quantities captures all source degrees of freedom (Readhead et al. 1980; Pearson & Readhead 1984). The codependence of redundant closure quantities and their initial construction from common baseline quantities leads to a general lack of statistical independence in their residual thermal noise (Kulkarni 1989). For closure phases and log-closure amplitudes in the Gaussian limit, the statistical dependence is fully characterized by a nonzero covariance.

In the following subsections, we detail the covariance structure for closure phases and log-closure amplitudes, and we demonstrate the relationship of the covariance to the unique and statistically independent degrees of freedom present in the quantities. We then present strategies for the construction of nonredundant but complete sets of quantities, and we discuss proper accounting of the number of gain-invariant degrees of freedom.

3.1. Closure Covariance due to Thermal Noise

Closure phases and log-closure amplitudes are formed from sums and differences of shared baseline quantities, so that the closure quantities do not have independent noise. Under the approximation that baseline observables are Gaussian random variables, the joint distribution of $T$ nonredundant closure phases ${\psi }_{{ijk}}$, for example, is characterized by a multivariate Gaussian distribution,

Equation (11): $p(\tilde{{\boldsymbol{\psi }}})={(2\pi )}^{-T/2}{\left(\det {{\boldsymbol{\Sigma }}}_{{\boldsymbol{\psi }}}\right)}^{-1/2}\exp \left(-\tfrac{1}{2}{\tilde{{\boldsymbol{\psi }}}}^{\top }{{\boldsymbol{\Sigma }}}_{{\boldsymbol{\psi }}}^{-1}\tilde{{\boldsymbol{\psi }}}\right)$

where residual closure phases $\tilde{{\boldsymbol{\psi }}}={\boldsymbol{\psi }}-\hat{{\boldsymbol{\psi }}}$ are taken about model values $\hat{{\boldsymbol{\psi }}}=\{{\hat{\psi }}_{{ijk}}\}$ and have covariance matrix ${{\boldsymbol{\Sigma }}}_{{\boldsymbol{\psi }}}$. This corresponds to the likelihood of observing the residuals $\tilde{{\boldsymbol{\psi }}}$ under the model hypothesis.

For a collection of all baseline phases measured among four sites, ${\boldsymbol{\phi }}=\{{\phi }_{12},{\phi }_{13},{\phi }_{14},{\phi }_{23},{\phi }_{24},{\phi }_{34}\}$ (Figure 3), the first three closure phases are,

Equation (12): ${\psi }_{123}={\phi }_{12}+{\phi }_{23}-{\phi }_{13},\quad {\psi }_{124}={\phi }_{12}+{\phi }_{24}-{\phi }_{14},\quad {\psi }_{134}={\phi }_{13}+{\phi }_{34}-{\phi }_{14}$

The final closure phase is redundant with the other three,

Equation (13): ${\psi }_{234}={\phi }_{23}+{\phi }_{34}-{\phi }_{24}={\psi }_{123}-{\psi }_{124}+{\psi }_{134}$
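This linear dependence can be checked numerically under the sign convention ${\psi }_{{ijk}}={\phi }_{{ij}}+{\phi }_{{jk}}-{\phi }_{{ik}}$ (the convention we assume in this sketch):

```python
import numpy as np

rng = np.random.default_rng(0)
pairs = [(1, 2), (1, 3), (1, 4), (2, 3), (2, 4), (3, 4)]
phi = {p: rng.uniform(-np.pi, np.pi) for p in pairs}  # arbitrary baseline phases

def psi(i, j, k):
    # closure phase with the convention psi_ijk = phi_ij + phi_jk - phi_ik
    return phi[(i, j)] + phi[(j, k)] - phi[(i, k)]

# the fourth triangle is a linear combination of the three containing site 1
print(np.isclose(psi(2, 3, 4), psi(1, 2, 3) - psi(1, 2, 4) + psi(1, 3, 4)))  # True
```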

Figure 3. Network of four sites. There are six baselines, three nonredundant closure phases, and two nonredundant closure amplitudes.


We can represent the generation of closure phases as a linear operator (closure phase design matrix ${\boldsymbol{\Psi }}$) applied to the baseline phases: ${\boldsymbol{\psi }}={\boldsymbol{\Psi }}{\boldsymbol{\phi }}$,

Equation (14): $\left(\begin{array}{c}{\psi }_{123}\\ {\psi }_{124}\\ {\psi }_{134}\end{array}\right)=\left(\begin{array}{rrrrrr}1&-1&0&1&0&0\\ 1&0&-1&0&1&0\\ 0&1&-1&0&0&1\end{array}\right)\left(\begin{array}{c}{\phi }_{12}\\ {\phi }_{13}\\ {\phi }_{14}\\ {\phi }_{23}\\ {\phi }_{24}\\ {\phi }_{34}\end{array}\right)$

This closure phase design matrix is equivalent to the "phase closure operator" of Lannes (1990b) and the "phase compilation operator" of Lannes (1991).

The covariance matrix for the nonredundant set is ${{\boldsymbol{\Sigma }}}_{{\boldsymbol{\psi }}}={\boldsymbol{\Psi }}{{\boldsymbol{\Sigma }}}_{{\boldsymbol{\phi }}}{{\boldsymbol{\Psi }}}^{\top }$, where ${{\boldsymbol{\Sigma }}}_{{\boldsymbol{\phi }}}$ is the covariance of the measured baseline phases. In general, ${{\boldsymbol{\Sigma }}}_{{\boldsymbol{\phi }}}$ has a diagonal contribution from $B$ independent baseline thermal-noise contributions ${\boldsymbol{S}}=\mathrm{diag}({\sigma }_{00}^{2},\ ...,{\sigma }_{{BB}}^{2})$, plus diagonal and off-diagonal contributions from common systematic gain errors ${\sigma }_{\theta ,i}^{2}$. However, the common gain errors are ultimately eliminated through the formation of closure quantities. Therefore, ${\boldsymbol{\Psi }}{{\boldsymbol{\Sigma }}}_{{\boldsymbol{\phi }}}{{\boldsymbol{\Psi }}}^{\top }={\boldsymbol{\Psi }}\,{\boldsymbol{S}}\,{{\boldsymbol{\Psi }}}^{\top }$, and

Equation (15): ${{\boldsymbol{\Sigma }}}_{{\boldsymbol{\psi }}}=\left(\begin{array}{ccc}{\sigma }_{12}^{2}+{\sigma }_{13}^{2}+{\sigma }_{23}^{2}&{\sigma }_{12}^{2}&-{\sigma }_{13}^{2}\\ {\sigma }_{12}^{2}&{\sigma }_{12}^{2}+{\sigma }_{14}^{2}+{\sigma }_{24}^{2}&{\sigma }_{14}^{2}\\ -{\sigma }_{13}^{2}&{\sigma }_{14}^{2}&{\sigma }_{13}^{2}+{\sigma }_{14}^{2}+{\sigma }_{34}^{2}\end{array}\right)$

The cross terms of ${{\boldsymbol{\Sigma }}}_{{\boldsymbol{\psi }}}$ are nonzero and are based on the sign of the shared baseline components of each closure phase.
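In code, the covariance follows directly from the design matrix. The sketch below assumes the baseline ordering $({\phi }_{12},{\phi }_{13},{\phi }_{14},{\phi }_{23},{\phi }_{24},{\phi }_{34})$ and the sign convention ${\psi }_{{ijk}}={\phi }_{{ij}}+{\phi }_{{jk}}-{\phi }_{{ik}}$; the per-baseline variances are illustrative:

```python
import numpy as np

# closure phase design matrix for psi_123, psi_124, psi_134
Psi = np.array([[1, -1,  0, 1, 0, 0],
                [1,  0, -1, 0, 1, 0],
                [0,  1, -1, 0, 0, 1]], dtype=float)

# diagonal per-baseline thermal phase variances (illustrative values)
S = np.diag([1.0, 2.0, 0.5, 1.5, 3.0, 0.8])

Sigma_psi = Psi @ S @ Psi.T
print(Sigma_psi)
# off-diagonal terms are +/- the variance of the shared baseline,
# e.g. Sigma_psi[0, 1] = sigma_12^2 and Sigma_psi[0, 2] = -sigma_13^2
```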

For the same network of four sites, the first two log-closure amplitudes are also based on sums and differences of log-baseline amplitudes ${\boldsymbol{a}}=\{{a}_{12},{a}_{13},{a}_{14},{a}_{23},{a}_{24},{a}_{34}\}$,

Equation (16): ${c}_{1234}={a}_{12}+{a}_{34}-{a}_{13}-{a}_{24},\quad {c}_{1342}={a}_{13}+{a}_{24}-{a}_{14}-{a}_{23}$

with a third closure amplitude that is redundant,

Equation (17): ${c}_{1423}={a}_{14}+{a}_{23}-{a}_{12}-{a}_{34}=-{c}_{1234}-{c}_{1342}$

By using log amplitude, the redundancy in closure amplitudes can be cast in terms of linear dependence, as is already the case for closure phases. Covariance terms are formed according to shared baselines, as was done for closure phases,

Equation (18): ${{\boldsymbol{\Sigma }}}_{{\boldsymbol{c}}}=\left(\begin{array}{cc}{\sigma }_{12}^{2}+{\sigma }_{34}^{2}+{\sigma }_{13}^{2}+{\sigma }_{24}^{2}&-{\sigma }_{13}^{2}-{\sigma }_{24}^{2}\\ -{\sigma }_{13}^{2}-{\sigma }_{24}^{2}&{\sigma }_{13}^{2}+{\sigma }_{24}^{2}+{\sigma }_{14}^{2}+{\sigma }_{23}^{2}\end{array}\right)$

The likelihood of observing a set of measured residual log-closure amplitudes $\tilde{{\boldsymbol{c}}}={\boldsymbol{c}}-\hat{{\boldsymbol{c}}}$, given measurements ${\boldsymbol{c}}$ and model hypothesis $\hat{{\boldsymbol{c}}}$, parallels Equation (11) for closure phases,

Equation (19): $p(\tilde{{\boldsymbol{c}}})={(2\pi )}^{-Q/2}{\left(\det {{\boldsymbol{\Sigma }}}_{{\boldsymbol{c}}}\right)}^{-1/2}\exp \left(-\tfrac{1}{2}{\tilde{{\boldsymbol{c}}}}^{\top }{{\boldsymbol{\Sigma }}}_{{\boldsymbol{c}}}^{-1}\tilde{{\boldsymbol{c}}}\right)$

The covariance matrix ${{\boldsymbol{\Sigma }}}_{{\boldsymbol{c}}}$ must be formed from a nonredundant set of $Q\leqslant {Q}_{\mathrm{minimal}}$ closure quantities—otherwise, the matrix will be rank deficient and not invertible. ${Q}_{\mathrm{minimal}}$ is the minimum size set that captures all available degrees of freedom, as well as the largest nonredundant set that can be formed (this is demonstrated in Section 3.2). The value ${\tilde{{\boldsymbol{c}}}}^{\top }{{\boldsymbol{\Sigma }}}_{{\boldsymbol{c}}}^{-1}\tilde{{\boldsymbol{c}}}$ will then follow a χ2 distribution with Q degrees of freedom.

If we write the inverse covariance matrix as ${{\boldsymbol{\Sigma }}}_{{\boldsymbol{c}}}^{-1}\,={{\boldsymbol{U}}}^{\top }{{\boldsymbol{\Sigma }}}_{{\boldsymbol{c}},\mathrm{diag}}^{-1}{\boldsymbol{U}}$, we see that matrix U transforms a nonredundant set of ${Q}_{\mathrm{minimal}}$ closure quantities into a space of combinations of closure quantities with independent noise, and characterized by diagonal covariance matrix ${{\boldsymbol{\Sigma }}}_{{\boldsymbol{c}},\mathrm{diag}}$. When applied to closure phases, this generates the so-called "kernel phases," first noted by Martinache (2010). The closure basis formed in this manner can be arbitrarily rotated by different choices of U, but all rotations capture the same ${Q}_{\mathrm{minimal}}$ degrees of freedom. Additional redundant closure quantities to this set will be perfectly degenerate with linear combinations of the closure basis, and they will not add additional information to the likelihood of a set of observations. Thus, the calculation of χ2 is unique and does not depend on the particular set of nonredundant closure quantities used (specific examples of this invariance are provided in Appendix C).
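A sketch of this change of basis (our own illustrative numbers): diagonalizing ${\boldsymbol{\Sigma }}$ yields combinations with independent noise, and the value of χ² is invariant under the rotation, as claimed:

```python
import numpy as np

# covariance of a nonredundant closure phase set (illustrative variances)
Psi = np.array([[1, -1,  0, 1, 0, 0],
                [1,  0, -1, 0, 1, 0],
                [0,  1, -1, 0, 0, 1]], dtype=float)
Sigma = Psi @ np.diag([1.0, 2.0, 0.5, 1.5, 3.0, 0.8]) @ Psi.T

w, U = np.linalg.eigh(Sigma)   # Sigma = U diag(w) U^T
# rotated quantities U^T psi have independent (diagonal) noise
assert np.allclose(U.T @ Sigma @ U, np.diag(w))

rng = np.random.default_rng(5)
resid = rng.normal(size=3)     # some residual closure phases

chi2_full = resid @ np.linalg.inv(Sigma) @ resid
rotated = U.T @ resid
chi2_rotated = np.sum(rotated**2 / w)
print(np.isclose(chi2_full, chi2_rotated))  # True: chi^2 is basis-independent
```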

In terms of the closure (log amplitude) design matrix ${\boldsymbol{C}}$, the factorization can also be written

Equation (20)

where ${{\boldsymbol{C}}}^{+}$ is the pseudo-inverse of ${\boldsymbol{C}}$, and ${{\boldsymbol{S}}}^{-1}$ is a diagonal matrix containing the reciprocal baseline thermal variances ${\sigma }_{{ij}}^{2}$. ${{\boldsymbol{C}}}^{+}$ itself does not depend on the actual baseline noise and can be readily computed via singular value decomposition (SVD). Redundant degrees of freedom will be reflected by singular values of zero and can be avoided by first removing the redundant closure quantities by matrix reduction or explicit construction (Section 3.2). In that case, the pseudo-inverse will be a true inverse. The advantage to inverting the design matrix rather than the covariance matrix (as in Equations (11) or (19)) is that the operation on the design matrix can be done once and then applied to different baseline noise prescriptions with little computational cost.
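The zero singular values associated with redundant rows are easy to exhibit. This sketch (our own) uses the four-station closure phase design matrix, including the redundant fourth triangle:

```python
import numpy as np

# all four triangles over four stations; the last row is a combination of the others
Psi_full = np.array([[1, -1,  0, 1,  0, 0],   # psi_123
                     [1,  0, -1, 0,  1, 0],   # psi_124
                     [0,  1, -1, 0,  0, 1],   # psi_134
                     [0,  0,  0, 1, -1, 1]],  # psi_234 (redundant)
                    dtype=float)

s = np.linalg.svd(Psi_full, compute_uv=False)
print(np.sum(s > 1e-10))   # 3: one singular value vanishes for the redundant row

# dropping the redundant row leaves a full-row-rank matrix, and its
# pseudo-inverse acts as a true (right) inverse
Psi_min = Psi_full[:3]
Psi_plus = np.linalg.pinv(Psi_min)
print(np.allclose(Psi_min @ Psi_plus, np.eye(3)))  # True
```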

3.2. Minimal Complete Sets

The total number, $T$, of triangles that can be constructed from a fully connected set of baselines across $N$ sites is

Equation (21): $T=\left(\genfrac{}{}{0em}{}{N}{3}\right)=\dfrac{N(N-1)(N-2)}{6}$

while the number of closure phase degrees of freedom is only the total number of baseline phases ($N(N-1)/2$) minus the number of degrees of freedom contained in site phase differences ($N-1$). These degrees of freedom should be captured by a nonredundant subset of closure phases of size

Equation (22): ${T}_{\mathrm{minimal}}=\dfrac{N(N-1)}{2}-(N-1)=\dfrac{(N-1)(N-2)}{2}$

For a large network, the set of all closure triangles will quickly outpace the number of independent measurements, resulting in a highly redundant set. One method for choosing a minimal set of closure triangles is given by Thompson et al. (2017) and shown in Figure 4. It involves selecting a single reference station and selecting the set of all triangles that contain it. Triangles that do not contain the reference station are formed as combinations of triangles from the minimal set.
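A minimal sketch of this reference-station construction (the function name is ours):

```python
from itertools import combinations

def minimal_triangles(stations, ref):
    """Minimal closure phase set: all triangles containing the reference station."""
    others = [s for s in stations if s != ref]
    return [(ref, a, b) for a, b in combinations(others, 2)]

stations = list(range(1, 8))            # a 7-station array
tris = minimal_triangles(stations, ref=1)
print(len(tris))                        # (N-1)(N-2)/2 = 15
```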

Figure 4. Construction of a minimal set of closure phases. Site 1 is used as a reference, from which there are $(N-1)(N-2)/2$ choices for the other two sites that build the set of all triangles containing site 1. Combinations of closure phases from this set can be used to form arbitrary triangles that do not contain the reference station, proving that the set is complete. This prescription is described in Thompson et al. (2017).


We introduce a corresponding diagrammatic procedure for selecting a minimal set of closure amplitudes (Figure 5). The independent closure amplitudes are formed by arranging $N$ sites on a ring and selecting all pairs of two adjacent nonoverlapping sites. One closure amplitude is formed from each four-site arrangement (quadrangle) from the baselines that span the pair. Because the order of the pair does not matter, this results in the formation of

Equation (23): ${Q}_{\mathrm{minimal}}=\dfrac{N(N-3)}{2}$

total closure amplitudes, equal to the $N(N-1)/2$ baseline degrees of freedom minus the $N$ unknown site gain factors. Combinations of closure amplitudes from the basis can be used to construct all remaining possible closure amplitudes, showing that the remaining closure amplitudes are redundant. Note that because adding a new station to the ring will necessarily break up one previous pair, the minimal set formed this way across $N$ stations is not a proper subset of that formed across $N+1$ stations. An alternative strategy for building the set of closure amplitudes in a staged matrix-driven approach is presented in Appendix B.3.

Figure 5. Construction of a minimal set of closure amplitudes. We begin with the set of closure amplitudes defined by choosing all sets of two nonoverlapping pairs of adjacent sites and forming one closure amplitude from each collection of four sites according to the baselines shown on the top left. The solid and dashed lines determine which baselines go in the numerator and denominator of the closure amplitude. Since there are $N$ choices for the placement of the first adjacent pair and $N-3$ choices for the placement of the second pair, there are $N(N-3)/2$ nonredundant closure amplitudes formed. By multiplying closure amplitudes from our set, we can construct arbitrary closure amplitudes containing nonadjacent sites. The set is therefore complete.


In practice, the full set of $N(N-1)/2$ baseline visibilities may not be available due to processing issues or by choice, which complicates the generation of a minimal set of closure quantities. For closure phases, so long as a missing baseline does not include the reference station (Figure 4), the closure triangle containing the missing baseline can be excluded from the minimal set. For closure amplitudes, missing exterior baselines between adjacent sites along the ring (Figure 5) appear in only one of the closure amplitude basis quadrangles. Removing the quadrangle with the missing exterior baseline will correctly exclude all derivative closure amplitudes from the set. For more complicated baseline unavailability, a site-based procedure for forming nonredundant closure quantities may not work. One alternative method for extracting the unique degrees of freedom from a partially redundant set of closure quantities is through SVD of the covariance matrix or design matrix (Equation (20), Figure 6). Alternatively, a minimal set of the original closure quantities can be identified and extracted by matrix reduction of the design matrix.

Figure 6. Singular value decomposition of the covariance matrix formed from a full set of 378 closure amplitudes over nine sites. The baseline noise prescription is random. There are 27 nonzero singular values corresponding to the $9\times (9-3)/2=27$ independent degrees of freedom represented in the closure amplitudes. SVD is particularly useful in situations of arbitrary missing baselines, which complicates the direct generation of a minimal set of closure quantities.
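The 27 independent degrees of freedom quoted in the caption can be reproduced directly by building the full 378-row log-closure-amplitude design matrix over nine sites and computing its rank. The construction below is our own (baseline and quadrangle orderings are arbitrary):

```python
import numpy as np
from itertools import combinations

N = 9
baselines = list(combinations(range(N), 2))
col = {bl: k for k, bl in enumerate(baselines)}   # 36 baseline columns

rows = []
for i, j, k, l in combinations(range(N), 4):
    # the three quadrangle closure amplitudes on each 4-station subset,
    # each a (+1, +1, -1, -1) pattern over four log baseline amplitudes
    for num, den in [(((i, j), (k, l)), ((i, k), (j, l))),
                     (((i, k), (j, l)), ((i, l), (j, k))),
                     (((i, l), (j, k)), ((i, j), (k, l)))]:
        row = np.zeros(len(baselines))
        for bl in num:
            row[col[bl]] += 1
        for bl in den:
            row[col[bl]] -= 1
        rows.append(row)

C = np.array(rows)
print(C.shape, np.linalg.matrix_rank(C))   # (378, 36) 27
```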


An upper limit on the number ${n}_{\psi }$ of different minimal closure phase subsets that exist for a fully connected array with N stations is given by the binomial coefficient

Equation (24)

$${n}_{\psi }\leqslant \left(\genfrac{}{}{0em}{}{\left(\genfrac{}{}{0em}{}{N}{3}\right)}{(N-1)(N-2)/2}\right)$$

This expression yields only an upper limit because for a given maximal set and N > 4, some selections of subsets with size equal to that of the minimal set will contain redundant closure phases, and so they will not themselves be valid minimal sets. An analogous upper limit holds for the number of nonredundant sets of closure amplitudes,

Equation (25)

$${n}_{c}\leqslant \left(\genfrac{}{}{0em}{}{3\left(\genfrac{}{}{0em}{}{N}{4}\right)}{N(N-3)/2}\right)$$

Both nψ and nc grow super-exponentially with N (see Table 1), and as the number of stations increases beyond a few, it quickly becomes prohibitive to search through all possible nonredundant subsets for the one that minimizes covariance. The minimal-covariance subset for both closure phases and log-closure amplitudes will generically depend on the specific baseline S/N distribution of the array, and we do not know of a general-purpose algorithm for selecting the optimal set. Instead, we consider rules of thumb for two limiting cases that approximate realistic array configurations: an array with uniform S/N on all baselines, and an array with S/N dominated by strong baselines to a single station or with other means to clearly identify weak baselines.

Table 1.  Unique Minimal Sets

N nψ nc
3 1 ...
4 4 3
5 125 1518
6 46620 351117922

Note. The number of unique minimal sets of closure phases and log-closure amplitudes for small arrays.

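The counts in Table 1 can be checked by brute force for small arrays: a candidate subset of closure triangles of the minimal-set size $(N-1)(N-2)/2$ is a valid minimal set exactly when its design matrix has full rank. A numpy sketch (the helper name is ours):

```python
import itertools
import numpy as np

def count_minimal_sets(n):
    """Count subsets of closure triangles, of the minimal-set size
    (n-1)(n-2)/2, whose design matrices have full rank."""
    baselines = {b: k for k, b in enumerate(itertools.combinations(range(n), 2))}
    rows = []
    for i, j, k in itertools.combinations(range(n), 3):
        row = np.zeros(len(baselines))
        row[baselines[(i, j)]] = 1    # psi_ijk = phi_ij + phi_jk - phi_ik
        row[baselines[(j, k)]] = 1
        row[baselines[(i, k)]] = -1
        rows.append(row)
    rows = np.array(rows)
    size = (n - 1) * (n - 2) // 2
    return sum(int(np.linalg.matrix_rank(rows[list(s)]) == size)
               for s in itertools.combinations(range(len(rows)), size))

print([count_minimal_sets(n) for n in (3, 4, 5)])  # [1, 4, 125] as in Table 1
```

The same enumeration with quadrangle rows reproduces the $n_c$ column, although the subset count grows quickly with $N$.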

We use the determinant λ of the correlation matrix ${\boldsymbol{\varrho }}$ to quantify the degree of independence for any specific choice of minimal subset,

Equation (26)

$$\lambda =\det {\boldsymbol{\varrho }}$$

Equation (27)

$${\varrho }_{{ij}}=\displaystyle \frac{{{\rm{\Sigma }}}_{{ij}}}{\sqrt{{{\rm{\Sigma }}}_{{ii}}{{\rm{\Sigma }}}_{{jj}}}}$$

where the elements of ${\boldsymbol{\varrho }}$ are related to the elements of the covariance matrix ${\boldsymbol{\Sigma }}$. The value of λ varies between zero and one, with λ = 1 corresponding to no correlation and λ = 0 corresponding to complete correlation.
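As a concrete illustration, λ can be computed directly from a covariance matrix, assuming the standard normalization ${\varrho }_{{ij}}={{\rm{\Sigma }}}_{{ij}}/\sqrt{{{\rm{\Sigma }}}_{{ii}}{{\rm{\Sigma }}}_{{jj}}}$ (a sketch with arbitrary example values):

```python
import numpy as np

def correlation_determinant(cov):
    """lambda = det(rho), where rho_ij = Sigma_ij / sqrt(Sigma_ii Sigma_jj)."""
    d = 1.0 / np.sqrt(np.diag(cov))
    rho = cov * np.outer(d, d)
    return np.linalg.det(rho)

# Uncorrelated measurements give lambda = 1 ...
print(correlation_determinant(np.diag([2.0, 5.0, 9.0])))            # 1.0
# ... while shared noise drives lambda toward 0.
print(correlation_determinant(np.array([[2.0, 1.0], [1.0, 2.0]])))  # 0.75
```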

For an array with a uniform S/N on all baselines (e.g., a homogeneous array observing a point source), the covariance is minimized (i.e., λ is maximized) when all stations are represented as nearly equally as possible in the minimal set of either closure phases or log-closure amplitudes. For example, an array with N = 6 stations has a minimal closure phase set size of 10, but of the nψ = 46,620 different choices of minimal set, only 12 equally represent all baselines. Similarly, an array with N = 5 stations has a minimal log-closure amplitude set size of five, but only six out of nc = 1518 minimal sets equally represent all baselines.

For an array with a high S/N on baselines to only one station (e.g., a heterogeneous array containing one highly sensitive station), the closure phase covariance is minimized when the minimal set is constructed using only triangles containing the reference station; that is, using the minimal set construction algorithm described earlier in this section produces an optimal set when the reference station dominates the array sensitivity. This is because weak baselines between two non-reference stations are then used only once in the construction. For log-closure amplitudes, placing the lowest S/N baselines on the ring as adjacent sites (as in Figure 5) accomplishes the same goal; the weakest baselines are used only once in the minimal set and, thus, do not contribute to the overall covariance.

3.3. Redundant Baselines

Some interferometric arrays have multiple baselines that are effectively redundant (dense arrays are often designed with this redundancy to aid calibration). A common case in VLBI is to have multiple sites that can effectively be considered colocated. For example, the CSO, JCMT, and SMA are all on Maunakea and have participated in EHT experiments; likewise, the APEX telescope is located within a few kilometers of the ALMA phased-array center. Baselines to these redundant sites sample the same visibility and source structure, and they can be combined to reduce thermal noise and to improve calibration. For example, the addition of ALMA to an EHT array that already includes APEX provides no new baselines, but it significantly reduces the thermal noise on baselines to Chile.

We have so far focused on the unique statistical degrees of freedom contained in the closure quantities, which do not depend on array geometry. Baseline redundancy does have a dramatic effect, however, on the unique source structure degrees of freedom measured by the array. For example, the addition of colocated sites to a VLBI network does not sample new nontrivial source information via closure phases even as the statistical degrees of freedom grow according to Equation (23), but it does increase the amount of source information measured via closure amplitudes. In the limit where every site has a redundant partner, all source visibility amplitude information is sampled via closure amplitudes apart from a single unknown degree of freedom for the total flux density.

To assess the independent degrees of freedom for an array with baseline redundancy, we introduce a redundancy matrix ${\boldsymbol{R}}$ of dimensions ${B}_{\mathrm{NR}}\times B$ that links multiple measurements from redundant baselines into a single degree of freedom, such that ${B}_{\mathrm{NR}}\leqslant B$ is the number of nonredundant geometric baselines that sample unique source structure. For each row corresponding to a unique geometric baseline, ${\boldsymbol{R}}$ contains a "1" in each column corresponding to a matching station pair. If there are no redundant baselines, ${\boldsymbol{R}}$ is the identity matrix. For the four-site network in Figure 3, if stations 1 and 2 are taken to be colocated, then of the six measured baselines $\{{V}_{12},{V}_{13},{V}_{14},{V}_{23},{V}_{24},{V}_{34}\}$, ${V}_{13}\sim {V}_{23}$ sample the same geometric baseline, as do ${V}_{14}\sim {V}_{24}$, so that

Equation (28)

$${\boldsymbol{R}}=\left(\begin{array}{cccccc}1 & 0 & 0 & 0 & 0 & 0\\ 0 & 1 & 0 & 1 & 0 & 0\\ 0 & 0 & 1 & 0 & 1 & 0\\ 0 & 0 & 0 & 0 & 0 & 1\end{array}\right)$$

with the "zero baseline" ${V}_{12}$ serving as one of the four unique geometric baselines.

The number of unique source degrees of freedom captured by closure quantities is found by taking the rank of the compound design matrix, which converts nonredundant amplitudes to closure quantities. For the previous four-station example with one colocated pair, this gives $\mathrm{rank}({\boldsymbol{\Psi }}{{\boldsymbol{R}}}^{\top })=2$ gain-independent phase structure degrees of freedom, and $\mathrm{rank}({{\boldsymbol{CR}}}^{\top })=1$ gain-independent amplitude structure degrees of freedom. A four-site array arranged in a square satisfies different constraints with ${V}_{12}\sim {V}_{34}$ and ${V}_{14}\sim {V}_{23}$. While there are still four unique geometric baselines, there are now $\mathrm{rank}({\boldsymbol{\Psi }}{{\boldsymbol{R}}}^{\top })=3$ structure closure phases and $\mathrm{rank}({{\boldsymbol{CR}}}^{\top })=2$ structure closure amplitudes, both equal to the corresponding number of linearly independent closure quantities.
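These rank computations can be verified explicitly. The sketch below builds one choice of minimal design matrices Ψ and C for a four-station array (other valid choices give the same ranks) along with the two redundancy matrices discussed above:

```python
import numpy as np

# Baseline order: 12, 13, 14, 23, 24, 34.
# A minimal closure phase design (reference station 1) and a minimal
# log-closure amplitude design for a four-station array.
Psi = np.array([[1, -1,  0, 1, 0, 0],   # phi_12 + phi_23 - phi_13
                [1,  0, -1, 0, 1, 0],   # phi_12 + phi_24 - phi_14
                [0,  1, -1, 0, 0, 1]])  # phi_13 + phi_34 - phi_14
C = np.array([[1, -1,  0,  0, -1, 1],   # ln|V12 V34 / (V13 V24)|
              [0,  1, -1, -1,  1, 0]])  # ln|V13 V24 / (V14 V23)|

# Stations 1 and 2 colocated: V13 ~ V23 and V14 ~ V24.
R_coloc = np.array([[1, 0, 0, 0, 0, 0],
                    [0, 1, 0, 1, 0, 0],
                    [0, 0, 1, 0, 1, 0],
                    [0, 0, 0, 0, 0, 1]])
# Square array: V12 ~ V34 and V14 ~ V23.
R_square = np.array([[1, 0, 0, 0, 0, 1],
                     [0, 1, 0, 0, 0, 0],
                     [0, 0, 0, 0, 1, 0],
                     [0, 0, 1, 1, 0, 0]])

for R in (R_coloc, R_square):
    print(np.linalg.matrix_rank(Psi @ R.T), np.linalg.matrix_rank(C @ R.T))
# colocated pair: 2 1
# square array:   3 2
```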

The analysis indicates that by judicious use of redundancy, an interferometric array can reduce the overall complexity of the measurements (with $\mathrm{rank}({\boldsymbol{R}})$ as an indication of complexity) while not sacrificing measured gain-independent structure degrees of freedom. For example, $\mathrm{rank}({\boldsymbol{R}})-\mathrm{rank}({{\boldsymbol{CR}}}^{\top })$ can be taken as an indication of the number of "amplitude gains" that remain unconstrained. As some level of a priori gain information is generally available, a sparse array that samples the maximum number of unique geometric baselines is likely preferable over one that utilizes geometric redundancy for most situations. Colocated sites in particular can cause a significant loss in measured information. However, they do provide a link to zero baseline quantities (such as total flux), which are often known a priori and, thus, inform model independent calibration (Blackburn et al. 2019).

4. Model Fitting with Unknown Gains

In this section, we apply the closure construction procedures detailed in the appendices to perform a series of simple model fits to different simulated data products generated from the same underlying truth image. The goal of these tests is to demonstrate that the same model parameter posteriors can be recovered using different representations of the data products, so long as covariances between measurements are properly accounted for.

4.1. Visibility Covariance due to Gain Error

To connect model fitting to closure quantities with model fitting to baseline visibilities, we first introduce a parallel construction (to Section 3.1) for the covariance of visibility measurements in the presence of uncertainty in the station gains. In both cases, we characterize the covariance in a residual quantity ($\tilde{{\boldsymbol{\psi }}}$ or $\tilde{{\boldsymbol{c}}}$ for closure quantities, $\tilde{{\boldsymbol{\phi }}}$ or $\tilde{{\boldsymbol{a}}}$ for visibilities—see Table D1), reflecting the difference between the measured quantity and the model prediction. However, while the covariance between residual closure quantities arises from thermal error on shared baselines, the baseline thermal noise is independent across visibility quantities in the weak-source limit; there, the covariance instead arises from systematic error in the model gains at shared stations. A visibility measurement ${V}_{{ij}}$ contains contributions from both the source and the station gains,

Equation (29)

$${V}_{{ij}}={G}_{i}{G}_{j}^{* }\,{{ \mathcal V }}_{{ij}}$$

The multiplicative complex gains manifest as additive terms modifying the visibility phases and log visibility amplitudes,

Equation (30a)

$$\arg {V}_{{ij}}=\arg {{ \mathcal V }}_{{ij}}+{\theta }_{i}-{\theta }_{j},\qquad {\theta }_{i}\equiv \arg {G}_{i}$$

Equation (30b)

$$\mathrm{ln}| {V}_{{ij}}| =\mathrm{ln}| {{ \mathcal V }}_{{ij}}| +{g}_{i}+{g}_{j},\qquad {g}_{i}\equiv \mathrm{ln}| {G}_{i}| $$

where the sign differences in the second gain terms arise because complex conjugation negates phases but leaves amplitudes unchanged.
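A quick numerical check of this additivity (a sketch, with arbitrary example values):

```python
import numpy as np

rng = np.random.default_rng(42)
# Complex station gains near unity (example values).
g = 1 + 0.1 * rng.normal(size=2) + 0.1j * rng.normal(size=2)
V = 0.8 * np.exp(0.3j)              # true source visibility
V_meas = g[0] * np.conj(g[1]) * V   # gain-corrupted measurement

# Phase: the gain phases enter with opposite signs (conjugated station negated).
phase = np.angle(V) + np.angle(g[0]) - np.angle(g[1])
# Log amplitude: the gain log amplitudes enter with the same sign.
logamp = np.log(np.abs(V)) + np.log(np.abs(g[0])) + np.log(np.abs(g[1]))

assert np.isclose(np.angle(V_meas), phase)
assert np.isclose(np.log(np.abs(V_meas)), logamp)
```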

More generally, we can express the gain contributions to a collection of visibility phases or log visibility amplitudes in terms of design matrices ${\boldsymbol{\Phi }}$ or ${\boldsymbol{A}}$ operating on the vector of gain phases or log gain amplitudes,

Equation (31a)

$${{\boldsymbol{\phi }}}_{\mathrm{gain}}={\boldsymbol{\Phi }}{\boldsymbol{\theta }}$$

Equation (31b)

$${{\boldsymbol{a}}}_{\mathrm{gain}}={\boldsymbol{A}}{\boldsymbol{g}}$$

For example, the visibility phases measured on the baselines in Figure 3 can be expressed using

Equation (32)

$${\boldsymbol{\Phi }}=\left(\begin{array}{cccc}1 & -1 & 0 & 0\\ 1 & 0 & -1 & 0\\ 1 & 0 & 0 & -1\\ 0 & 1 & -1 & 0\\ 0 & 1 & 0 & -1\\ 0 & 0 & 1 & -1\end{array}\right)$$

while the log visibility amplitudes can be similarly expressed using

Equation (33)

$${\boldsymbol{A}}=\left(\begin{array}{cccc}1 & 1 & 0 & 0\\ 1 & 0 & 1 & 0\\ 1 & 0 & 0 & 1\\ 0 & 1 & 1 & 0\\ 0 & 1 & 0 & 1\\ 0 & 0 & 1 & 1\end{array}\right)$$

The visibility phase design matrix is equivalent to the "phase aberration operator" of Lannes (1990b), while the log visibility amplitude design matrix matches the "amplitude aberration operator" of Lannes (1990a, 1991).

This additivity makes it convenient to model the gain phases and log gain amplitudes as Gaussian distributed, so that their variances simply add to those of the corresponding visibility quantities. The baseline-based thermal variances are uncorrelated across baselines, and in the absence of gains, they would fully describe the visibility covariances via the diagonal matrices ${{\boldsymbol{S}}}_{{\boldsymbol{\phi }}}$ for visibility phases and Sa for log visibility amplitudes (see Tables B1 and B2 in Appendix B.1). The station-based gain variances do drive covariances in the visibility residuals for baselines that share a station, with the design matrices serving to map stations to baselines. The covariance matrices are then constructed as the sum of the baseline-based and station-based contributions,

Equation (34a)

$${{\boldsymbol{\Sigma }}}_{{\boldsymbol{\phi }}}={{\boldsymbol{S}}}_{{\boldsymbol{\phi }}}+{\boldsymbol{\Phi }}{{\boldsymbol{\Sigma }}}_{{\boldsymbol{\theta }}}{{\boldsymbol{\Phi }}}^{\top }$$

Equation (34b)

$${{\boldsymbol{\Sigma }}}_{{\boldsymbol{a}}}={{\boldsymbol{S}}}_{{\boldsymbol{a}}}+{\boldsymbol{A}}{{\boldsymbol{\Sigma }}}_{{\boldsymbol{g}}}{{\boldsymbol{A}}}^{\top }$$

with the off-diagonal elements consisting of only station-based terms while the diagonal elements combine both station-based and baseline-based terms. The covariance matrix corresponding to the visibility phases in Equation (32) is given by

Equation (35)

$${{\boldsymbol{\Sigma }}}_{{\boldsymbol{\phi }}}=\left(\begin{array}{cccccc}{\sigma }_{12}^{2}+{\sigma }_{\theta ,1}^{2}+{\sigma }_{\theta ,2}^{2} & {\sigma }_{\theta ,1}^{2} & {\sigma }_{\theta ,1}^{2} & -{\sigma }_{\theta ,2}^{2} & -{\sigma }_{\theta ,2}^{2} & 0\\ {\sigma }_{\theta ,1}^{2} & {\sigma }_{13}^{2}+{\sigma }_{\theta ,1}^{2}+{\sigma }_{\theta ,3}^{2} & {\sigma }_{\theta ,1}^{2} & {\sigma }_{\theta ,3}^{2} & 0 & -{\sigma }_{\theta ,3}^{2}\\ {\sigma }_{\theta ,1}^{2} & {\sigma }_{\theta ,1}^{2} & {\sigma }_{14}^{2}+{\sigma }_{\theta ,1}^{2}+{\sigma }_{\theta ,4}^{2} & 0 & {\sigma }_{\theta ,4}^{2} & {\sigma }_{\theta ,4}^{2}\\ -{\sigma }_{\theta ,2}^{2} & {\sigma }_{\theta ,3}^{2} & 0 & {\sigma }_{23}^{2}+{\sigma }_{\theta ,2}^{2}+{\sigma }_{\theta ,3}^{2} & {\sigma }_{\theta ,2}^{2} & -{\sigma }_{\theta ,3}^{2}\\ -{\sigma }_{\theta ,2}^{2} & 0 & {\sigma }_{\theta ,4}^{2} & {\sigma }_{\theta ,2}^{2} & {\sigma }_{24}^{2}+{\sigma }_{\theta ,2}^{2}+{\sigma }_{\theta ,4}^{2} & {\sigma }_{\theta ,4}^{2}\\ 0 & -{\sigma }_{\theta ,3}^{2} & {\sigma }_{\theta ,4}^{2} & -{\sigma }_{\theta ,3}^{2} & {\sigma }_{\theta ,4}^{2} & {\sigma }_{34}^{2}+{\sigma }_{\theta ,3}^{2}+{\sigma }_{\theta ,4}^{2}\end{array}\right)$$

while the covariance matrix corresponding to the log visibility amplitudes in Equation (33) is structurally identical except for the off-diagonal term signs,

Equation (36)

$${{\boldsymbol{\Sigma }}}_{{\boldsymbol{a}}}=\left(\begin{array}{cccccc}{\sigma }_{12}^{2}+{\sigma }_{g,1}^{2}+{\sigma }_{g,2}^{2} & {\sigma }_{g,1}^{2} & {\sigma }_{g,1}^{2} & {\sigma }_{g,2}^{2} & {\sigma }_{g,2}^{2} & 0\\ {\sigma }_{g,1}^{2} & {\sigma }_{13}^{2}+{\sigma }_{g,1}^{2}+{\sigma }_{g,3}^{2} & {\sigma }_{g,1}^{2} & {\sigma }_{g,3}^{2} & 0 & {\sigma }_{g,3}^{2}\\ {\sigma }_{g,1}^{2} & {\sigma }_{g,1}^{2} & {\sigma }_{14}^{2}+{\sigma }_{g,1}^{2}+{\sigma }_{g,4}^{2} & 0 & {\sigma }_{g,4}^{2} & {\sigma }_{g,4}^{2}\\ {\sigma }_{g,2}^{2} & {\sigma }_{g,3}^{2} & 0 & {\sigma }_{23}^{2}+{\sigma }_{g,2}^{2}+{\sigma }_{g,3}^{2} & {\sigma }_{g,2}^{2} & {\sigma }_{g,3}^{2}\\ {\sigma }_{g,2}^{2} & 0 & {\sigma }_{g,4}^{2} & {\sigma }_{g,2}^{2} & {\sigma }_{24}^{2}+{\sigma }_{g,2}^{2}+{\sigma }_{g,4}^{2} & {\sigma }_{g,4}^{2}\\ 0 & {\sigma }_{g,3}^{2} & {\sigma }_{g,4}^{2} & {\sigma }_{g,3}^{2} & {\sigma }_{g,4}^{2} & {\sigma }_{34}^{2}+{\sigma }_{g,3}^{2}+{\sigma }_{g,4}^{2}\end{array}\right)$$
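The sum of baseline-based and station-based contributions can be assembled programmatically; the $N=3$ result reproduces the structure of Table B1 (a sketch with arbitrary example variances; helper names are ours):

```python
import itertools
import numpy as np

def design_matrices(n):
    """Phase (Phi) and log-amplitude (A) design matrices mapping n station
    gain terms onto the n(n-1)/2 baseline terms."""
    baselines = list(itertools.combinations(range(n), 2))
    Phi = np.zeros((len(baselines), n))
    A = np.zeros((len(baselines), n))
    for r, (i, j) in enumerate(baselines):
        Phi[r, i], Phi[r, j] = 1.0, -1.0  # theta_i - theta_j
        A[r, i] = A[r, j] = 1.0           # g_i + g_j
    return Phi, A

Phi, A = design_matrices(3)
S = np.diag([0.1, 0.2, 0.3])           # baseline thermal variances (example values)
Sigma_gain = np.diag([1.0, 2.0, 3.0])  # station gain variances (example values)

# Diagonal: thermal plus both station variances; off-diagonal: shared station only,
# with signs that differ between the phase and log-amplitude cases.
Sigma_phi = S + Phi @ Sigma_gain @ Phi.T
Sigma_a = S + A @ Sigma_gain @ A.T
print(Sigma_phi)
print(Sigma_a)
```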

Table B1.  Visibility Phase Design and Covariance Matrices for Two- and Three-element Arrays, along with Matrices Relevant for Their Construction

    Number of Stations ($N$)
Matrix Shape $N$ = 2 $N=3$
${{\boldsymbol{\Sigma }}}_{{\boldsymbol{\theta }}}$ $N$ × $N$ $\left(\begin{array}{cc}{\sigma }_{\theta ,1}^{2} & 0\\ 0 & {\sigma }_{\theta ,2}^{2}\end{array}\right)$ $\left(\begin{array}{ccc}{\sigma }_{\theta ,1}^{2} & 0 & 0\\ 0 & {\sigma }_{\theta ,2}^{2} & 0\\ 0 & 0 & {\sigma }_{\theta ,3}^{2}\end{array}\right)$
${{\boldsymbol{S}}}_{{\boldsymbol{\phi }}}$ $B$ × $B$ $\left(\begin{array}{c}{\sigma }_{12}^{2}\end{array}\right)$ $\left(\begin{array}{ccc}{\sigma }_{12}^{2} & 0 & 0\\ 0 & {\sigma }_{13}^{2} & 0\\ 0 & 0 & {\sigma }_{23}^{2}\end{array}\right)$
${\boldsymbol{\Phi }}$ $B$ × $N$ $\left(\begin{array}{cc}1 & -1\end{array}\right)$ $\left(\begin{array}{ccc}1 & -1 & 0\\ 1 & 0 & -1\\ 0 & 1 & -1\end{array}\right)$
${{\boldsymbol{\Sigma }}}_{{\boldsymbol{\phi }}}$ $B$ × $B$ $\left(\begin{array}{c}{\sigma }_{12}^{2}+{\sigma }_{\theta ,1}^{2}+{\sigma }_{\theta ,2}^{2}\end{array}\right)$ $\left(\begin{array}{ccc}{\sigma }_{12}^{2}+{\sigma }_{\theta ,1}^{2}+{\sigma }_{\theta ,2}^{2} & {\sigma }_{\theta ,1}^{2} & -{\sigma }_{\theta ,2}^{2}\\ {\sigma }_{\theta ,1}^{2} & {\sigma }_{13}^{2}+{\sigma }_{\theta ,1}^{2}+{\sigma }_{\theta ,3}^{2} & {\sigma }_{\theta ,3}^{2}\\ -{\sigma }_{\theta ,2}^{2} & {\sigma }_{\theta ,3}^{2} & {\sigma }_{23}^{2}+{\sigma }_{\theta ,2}^{2}+{\sigma }_{\theta ,3}^{2}\end{array}\right)$

Note. Here, $B=\left(\displaystyle \genfrac{}{}{0em}{}{N}{2}\right)$ is the number of baselines.


Table B2.  Log Visibility Amplitude Design and Covariance Matrices for Two- and Three-station Arrays, along with Matrices Relevant for Their Construction

    Number of Stations ($N$)
Matrix Shape $N=2$ $N=3$
${{\boldsymbol{\Sigma }}}_{{\boldsymbol{g}}}$ $N$ × $N$ $\left(\begin{array}{cc}{\sigma }_{g,1}^{2} & 0\\ 0 & {\sigma }_{g,2}^{2}\end{array}\right)$ $\left(\begin{array}{ccc}{\sigma }_{g,1}^{2} & 0 & 0\\ 0 & {\sigma }_{g,2}^{2} & 0\\ 0 & 0 & {\sigma }_{g,3}^{2}\end{array}\right)$
${{\boldsymbol{S}}}_{{\boldsymbol{a}}}$ $B$ × $B$ $\left(\begin{array}{c}{\sigma }_{12}^{2}\end{array}\right)$ $\left(\begin{array}{ccc}{\sigma }_{12}^{2} & 0 & 0\\ 0 & {\sigma }_{13}^{2} & 0\\ 0 & 0 & {\sigma }_{23}^{2}\end{array}\right)$
${\boldsymbol{A}}$ $B$ × $N$ $\left(\begin{array}{cc}1 & 1\end{array}\right)$ $\left(\begin{array}{ccc}1 & 1 & 0\\ 1 & 0 & 1\\ 0 & 1 & 1\end{array}\right)$
${{\boldsymbol{\Sigma }}}_{{\boldsymbol{a}}}$ $B$ × $B$ $\left(\begin{array}{c}{\sigma }_{12}^{2}+{\sigma }_{g,1}^{2}+{\sigma }_{g,2}^{2}\end{array}\right)$ $\left(\begin{array}{ccc}{\sigma }_{12}^{2}+{\sigma }_{g,1}^{2}+{\sigma }_{g,2}^{2} & {\sigma }_{g,1}^{2} & {\sigma }_{g,2}^{2}\\ {\sigma }_{g,1}^{2} & {\sigma }_{13}^{2}+{\sigma }_{g,1}^{2}+{\sigma }_{g,3}^{2} & {\sigma }_{g,3}^{2}\\ {\sigma }_{g,2}^{2} & {\sigma }_{g,3}^{2} & {\sigma }_{23}^{2}+{\sigma }_{g,2}^{2}+{\sigma }_{g,3}^{2}\end{array}\right)$

Note. Here, $B=\left(\displaystyle \genfrac{}{}{0em}{}{N}{2}\right)$ is the number of baselines.


The likelihood of observing a collection of $B=N(N-1)/2$ residual visibility phases under a given source and gain model is then

Equation (37)

$${ \mathcal L }(\tilde{{\boldsymbol{\phi }}})={\left[{(2\pi )}^{B}\det {{\boldsymbol{\Sigma }}}_{{\boldsymbol{\phi }}}\right]}^{-1/2}\exp \left(-\frac{1}{2}{\tilde{{\boldsymbol{\phi }}}}^{\top }{{\boldsymbol{\Sigma }}}_{{\boldsymbol{\phi }}}^{-1}\tilde{{\boldsymbol{\phi }}}\right)$$

with a similar construction for log visibility amplitudes ${\boldsymbol{a}}$. This likelihood reduces to the simple case of statistically independent measured visibilities in the limit of zero systematic gain error (i.e., perfectly calibrated data).
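In the Gaussian limit, this likelihood is evaluated with standard linear algebra; a minimal sketch (the helper name is ours):

```python
import numpy as np

def gaussian_loglike(resid, cov):
    """ln L for a residual vector under a zero-mean multivariate Gaussian."""
    sign, logdet = np.linalg.slogdet(2 * np.pi * cov)
    return -0.5 * (resid @ np.linalg.solve(cov, resid) + logdet)

# Two residual visibility phases whose covariance includes a shared gain term.
cov = np.array([[2.0, 0.5],
                [0.5, 1.0]])
resid = np.array([0.3, -0.2])
print(gaussian_loglike(resid, cov))
```

With a diagonal covariance (perfectly calibrated data), the same expression reduces to a sum of independent Gaussian terms.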

4.2. Model Specifications

We consider the simple geometric truth image shown in the left panel of Figure 7. This image is constructed from the sum of two elliptical Gaussian components that are symmetrically positioned about the origin with a mutual separation of $\xi =50$ μas and a position angle of η = 75 degrees east of north. Both components have major and minor axis Gaussian σ-values of 9 μas and 6 μas, respectively, and each has a flux density of 0.5 Jy. The major axis of the eastern component is oriented at 30 degrees east of north, while the western component has a −30 degree orientation. These specific choices of parameter values are largely arbitrary, and they serve primarily to give the image sufficient asymmetry to produce nontrivial visibility phases and sufficient compactness to produce nonzero visibility amplitudes.
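Because the model is a sum of elliptical Gaussians, its visibility function is analytic: each component contributes $F\exp (-2{\pi }^{2}\,{{\boldsymbol{u}}}^{\top }{\boldsymbol{C}}{\boldsymbol{u}})\,{e}^{-2\pi i\,{\boldsymbol{u}}\cdot {{\boldsymbol{x}}}_{0}}$ for image-domain covariance ${\boldsymbol{C}}$ and centroid ${{\boldsymbol{x}}}_{0}$. The sketch below implements one plausible convention (the rotation sense and east-of-north axis assignments are our assumptions, not taken from the paper) and recovers the total flux density of 1 Jy on the zero baseline:

```python
import numpy as np

UAS = np.pi / (180 * 3600e6)  # one microarcsecond in radians

def gauss_vis(uv, flux, sigma_maj, sigma_min, pa, x0):
    """Analytic visibility of an elliptical Gaussian:
    flux * exp(-2 pi^2 u^T C u) * exp(-2 pi i u . x0)."""
    c, s = np.cos(pa), np.sin(pa)
    R = np.array([[c, -s], [s, c]])  # assumed rotation convention
    C = R @ np.diag([sigma_maj**2, sigma_min**2]) @ R.T
    return flux * np.exp(-2 * np.pi**2 * (uv @ C @ uv) - 2j * np.pi * (uv @ x0))

# Two 0.5 Jy components separated by 50 uas at position angle 75 degrees.
eta = np.deg2rad(75.0)
offset = 25 * UAS * np.array([np.sin(eta), np.cos(eta)])  # east-of-north (assumed axes)

def model_vis(uv):
    return (gauss_vis(uv, 0.5, 9 * UAS, 6 * UAS, np.deg2rad(30.0), offset)
            + gauss_vis(uv, 0.5, 9 * UAS, 6 * UAS, np.deg2rad(-30.0), -offset))

print(abs(model_vis(np.zeros(2))))  # zero-baseline amplitude = total flux = 1.0
```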


Figure 7. Truth image (left panel) and (u, v) coverage (right panel) for the model considered in Section 4. The truth image contains two elliptical Gaussian components with an arbitrary but specific separation and relative orientation angle; all defining parameters are fixed during model fitting except for the component separation ξ and position angle η. The (u, v) coverage represents that from a single simultaneous observation with mutual visibility to all stations; we consider both an array with N = 8 stations of equal sensitivity as well as a subset containing only N = 5. The resulting baseline S/N measurements span a factor of ∼20.


To produce synthetic visibility data, we sample the Fourier transform of the truth image at discrete locations in (u, v) space. We consider two sets of (u, v) coverage, corresponding to (1) a single snapshot from an N = 8 station array with mutual visibility to all stations and (2) an N = 5 station subset of that array. Both sets of coverage are shown in the right panel of Figure 7. Visibility amplitudes and phases are given by the magnitude and argument of the complex visibilities from each (u, v) point. The visibilities are then multiplied by their associated station gains, which are simulated as complex Gaussian-distributed random variables with unit mean and standard deviation of 0.1 along each dimension. We add a single realization of Gaussian thermal noise to the visibilities corresponding to a median S/N of 10.2 and spanning an S/N range from 2.3 to 44.9. Closure phases are constructed from the visibility phases using Equation (C10), and log-closure amplitudes are constructed from the visibility amplitudes using Equation (C17).

We model the data as the sum of two elliptical Gaussians, with all parameters except for ξ and η held fixed at their corresponding truth values. By restricting the model to this two-dimensional subspace of its natural 12-dimensional parameter space, we simplify the fitting process while retaining enough model complexity to provide nontrivial parameter correlations. We perform parameter estimation using Gaussian likelihoods analogous to Equation (37) for all data products. Unless otherwise specified, we apply uniform priors on the range [40, 60] μas for ξ and [0, 180] degrees for η; when fitting gains, our "maximally uninformative" priors are log-uniform on the range [${10}^{-5}$, ${10}^{5}$] for all gain amplitudes and uniform priors on the range [0, 360) degrees for all gain phases. We use the Python nested sampling code dynesty (Speagle 2020) to produce parameter posteriors for all model fits.

4.3. Phase and Amplitude Modeling

We fit the model to our single realization of synthetic data represented in a variety of ways, starting with visibility phases under the assumption that the gain phases are perfectly known (or equivalently, that they are perfectly calibrated). The likelihood function for this representation is given by Equation (37), and because the gain phases are known, the visibility phase covariance matrix is diagonal. Figure 8 shows the two-dimensional (ξ, η) posteriors for such fits to the N = 8 array and N = 5 array data in black contours.


Figure 8. Joint posterior distributions for residual separation (ξ − ξ0) and position angle (η − η0) when fitting the model described in Section 4 to visibility phases with perfectly known gain phases (black contours), visibility phases with completely unknown gain phases (gray contours), and closure phases with covariant structure accounted for (red dashed contours). We also show the results from numerically marginalizing over the gains (thin black contours), which match the covariant treatment as expected (see Appendix C.5). The model is fitted to the eight-station array data on the left and to the five-station array data on the right; we can see that the relative loss of information when going from perfectly calibrated phases to closure phases increases for smaller arrays. In both cases, the closure phase fits accurately recover the posteriors derived from visibility phase fits, within sampling uncertainties. Contours enclose 50%, 90%, and 99% of the posterior probability.


We also fit the model to visibility phase data without assuming any a priori knowledge of the gain phases. The likelihood function remains Equation (37), but in this case, the covariance matrix is no longer diagonal. The gray contours in Figure 8 show the corresponding joint posteriors for (ξ, η), which exhibit the expected loss of constraining power compared to the posteriors derived from calibrated visibility phases. We can see that this loss becomes less severe as the number of stations increases, a consequence of the fact that the fraction of the visibility phase information required to constrain the gain phases decreases as 2/N.

The other phase data representations we consider are closure phases. For a minimal subset of closure phases described by covariance matrix ${{\boldsymbol{\Sigma }}}_{{\boldsymbol{\psi }}}$ (see Appendix B.2), the likelihood function is given by Equation (11). Posteriors derived from this likelihood are shown as red dashed contours in Figure 8. Within the ∼1% numerical sampling uncertainties of our posterior contours, the closure phases provide parameter constraints that are identical to those imposed by the uncalibrated visibility phases.

We perform a corresponding set of model fits to visibility amplitudes and log-closure amplitudes rather than visibility phases and closure phases. The black contours in Figure 9 show the (ξ, η) posteriors for fits to the N = 8 and N = 5 array visibility amplitude data, using a likelihood analogous to Equation (37) under the assumption of perfectly calibrated gain amplitudes. The gray contours show fits to visibility amplitudes using the same likelihood but assuming no knowledge of the gain amplitudes. We again see the relative loss of information increasing as the number of stations decreases, becoming particularly severe for the case of N = 5 (in which there are only three degrees of freedom remaining in the data to constrain the model, compared to eight degrees of freedom when the gain amplitudes are calibrated).


Figure 9. Same as Figure 8 but for visibility amplitudes and log-closure amplitudes rather than visibility phases and closure phases. Contours enclose 50%, 90%, and 99% of the posterior probability.


We compare the visibility amplitude results to those obtained from fitting to log-closure amplitudes. For a minimal subset of log-closure amplitudes described by a covariance matrix ${{\boldsymbol{\Sigma }}}_{{\boldsymbol{c}}}$ (see Appendix B.3), the likelihood function is given by Equation (19). The posteriors derived using this likelihood expression are plotted in Figure 9 as red dashed contours. As with the closure phases, we find that the log-closure amplitudes provide constraints that are identical to those provided by the uncalibrated visibility amplitudes.

We also consider two alternative treatments of the closure phases and log-closure amplitudes that attempt to avoid accounting for covariances, and we show here that these efforts fail. In the first such treatment, we use a minimal closure phase subset but assume all measurements are independent. This assumption amounts to using only the diagonal elements of ${{\boldsymbol{\Sigma }}}_{{\boldsymbol{\psi }}}$ (i.e., all off-diagonal elements are set to zero), and the likelihood function remains Equation (11). The red and light blue contours in the left panel of Figure 10 show posteriors derived under this assumption, for two different choices of minimal closure phase subset constructed by ordering the stations from lowest to highest (red) and highest to lowest (light blue) mean baseline S/N. We can see that these contours systematically deviate from the visibility phase contour. In the second treatment, we use the maximal (redundant) set of closure phases (see Appendix B.2), but we retain the assumption that all measurements are independent. The likelihood is then simply the product of the individual measurement likelihoods taken over all closure phases in the maximal set. The dotted black contour in the left panel of Figure 10 represents the resulting posterior after scaling the individual measurement variances by

Equation (38)

$$\displaystyle \frac{\left(\genfrac{}{}{0em}{}{N}{3}\right)}{(N-1)(N-2)/2}=\frac{N}{3},$$

which is a redundancy factor that accounts for the fact that the maximal set contains an increased number of measurements without a corresponding increase in the number of degrees of freedom. Even after accounting for this redundancy, however, we see a similar systematic discrepancy in the posterior relative to those derived from the visibility phases. Note that for the unusual case of equal S/N on all baselines, this redundancy factor scaling does produce the correct likelihood (see Appendix C).
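Interpreting the redundancy factor as the ratio of the maximal-set size to the number of independent degrees of freedom, it can be computed for any $N$ (a sketch; the closed forms in the comments follow from the binomial counts):

```python
from math import comb

def closure_phase_redundancy(n):
    """Maximal-set size C(n, 3) over the (n-1)(n-2)/2 phase degrees of freedom."""
    return comb(n, 3) / ((n - 1) * (n - 2) // 2)   # == n / 3

def closure_amp_redundancy(n):
    """Maximal-set size 3*C(n, 4) over the n(n-3)/2 amplitude degrees of freedom."""
    return 3 * comb(n, 4) / (n * (n - 3) // 2)     # == (n - 1)(n - 2) / 4

for n in (5, 8):
    print(n, closure_phase_redundancy(n), closure_amp_redundancy(n))
```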


Figure 10. Left panel: same as the left panel of Figure 8 but showing additional posteriors as calculated for different closure phase (CP) likelihood constructions. The posteriors obtained when including closure phase covariance (dashed contours) demonstrate the most consistency with the uncalibrated visibility phase posterior (yellow contour). The solid blue and red contours show posteriors constructed from fitting a minimal closure phase set but ignoring covariance, while the dotted black contour shows the posterior constructed from a maximal (redundant) set of closure phases that has been corrected for redundancy factor (see Equation (38)). These three contours that do not account for covariance show artificial distortions in the confidence regions. Right panel: analogous to the left panel but for the log-closure amplitudes (LCA) rather than closure phases; the redundancy factor for the maximal set is given by Equation (39). The contours in both panels enclose 90% of the posterior probability.


In the right panel of Figure 10, we again compare the posteriors obtained using (1) a minimal set of log-closure amplitudes without accounting for covariance, and (2) a maximal set of log-closure amplitudes. In both cases, the likelihood function is the product of the individual measurement likelihoods, where the product is taken over all log-closure amplitudes in the minimal or maximal set, as appropriate. We again consider two choices of minimal subset, constructed via the same station-ordering scheme used for phases. For the posteriors derived from the maximal set, shown using a dotted black contour in the right panel of Figure 10, we have scaled the measurement variances by the redundancy factor

Equation (39)

$$\displaystyle \frac{3\left(\genfrac{}{}{0em}{}{N}{4}\right)}{N(N-3)/2}=\frac{(N-1)(N-2)}{4}.$$

Regardless of the redundancy correction, we find in both cases that the posterior distributions do not match those expected from fits to the visibility amplitudes.

4.4. Gain Uncertainty Modeling

We have shown that our level of knowledge about the gains dictates how much source information the closure quantities contain relative to the visibility quantities. For perfectly known (or, equivalently, perfectly calibrated) gains, the visibility quantities provide more information about the source than the closure quantities; for small arrays, this difference may be quite large (see, e.g., Figure 9). When the gains are completely unknown (or, equivalently, when the gains must be fully determined along with the source information), both the visibility and closure quantities contain identical source information. We now explore the case of partially known gains.

We quantify how well we know the gains by comparing our gain uncertainty, σi, to the uncertainties in the data (i.e., in the visibilities), σij, using

Equation (40)

$$\varepsilon \equiv \displaystyle \frac{\left\langle {\sigma }_{{ij}}\right\rangle }{\left\langle {\sigma }_{i}\right\rangle },$$

where $\left\langle \right\rangle $ denotes a sample average; the averages are taken over all stations and all baselines for the gain uncertainties and visibility uncertainties, respectively. The quantity ε tracks our knowledge of the gains; $\varepsilon \to 0$ when we have no information about the gains, and $\varepsilon \to \infty $ when the gains are perfectly calibrated. Note that both σi and σij refer to logarithmic uncertainties when considering amplitude data products, meaning that ε can also be thought of as the ratio of the "gain S/N" to the data S/N.

Within the context of our model-fitting procedure, the assumed level of gain knowledge can be straightforwardly incorporated using Gaussian priors on the gain parameters. Figure 11 shows the results of fitting to visibility quantities while varying the value of ε. We find that noticeable improvements in the posterior constraints start to occur for ε ≳ 1, and that for ε ≳ 10, the posteriors better approximate the perfect-knowledge case (black contours in Figure 11) than they do the no-knowledge case (gray contours). This matches our expectation that knowledge of gains begins to inform an overconstrained model as soon as its precision approaches that of the thermal uncertainties. For an underconstrained problem such as imaging using a sparse array, typical regularization imposes much weaker relationships across points in the (u, v) domain, and partial gain calibration can matter much earlier by providing unique information not sampled by the closure quantities.


Figure 11. Left panel: posterior contours for fits to visibility phases with varying degrees of prior gain phase knowledge assumed. The gray contour matches the posterior recovered when fitting to closure phases (see Figure 8). Right panel: same as the left panel but fitting to visibility amplitudes rather than visibility phases. The contours in both panels enclose 90% of the posterior probability.


While the demonstrations presented here are all done using simulated observations, the recent parameterization of the horizon-scale emission and shadow of the supermassive black hole in M87 by the Event Horizon Telescope Collaboration et al. (2019c) utilized cross-validation of results across several techniques to handle gain uncertainty. These included explicit semi-analytic marginalization of amplitude gains by Laplace approximation (via Themis; Broderick et al. 2020), minimization of closure phase covariance through selection of a highly sensitive reference antenna (ALMA), and use of diagonalized closure phases and log-closure amplitudes by accounting for covariance (via dynesty, as described in this work). The multiple approaches resulted in a high degree of consistency as reflected by their posterior parameter distributions. A detailed study of the effects of covariant interferometric errors on imaging and on parameter reconstruction is forthcoming (D. W. Pesce et al. 2020, in preparation).

5. Summary

We have explored in detail the statistics of closure phase and closure amplitude for S/N ≳ 1, characteristic of high-frequency radio interferometry where both phase and amplitude calibration have significant uncertainties, and where phase coherence timescales are short relative to the length of a continuous observation. The analysis unifies and clarifies several concepts that have been previously discussed in the literature regarding the independence of closure quantities, the nature and number of statistical degrees of freedom, best practices for constructing and fitting to closure quantities, and the relationship of closure quantities to self calibration and marginalization over unknown gains. Due to the large number of topics covered, we delineate the main statements and findings from this work across three primary topics.

(1) Formation of closure quantities and non-Gaussian errors:

  • 1.  
    Non-Gaussian errors become significant for S/N below ∼2–5 for phase, amplitude, and log amplitude. Reciprocal amplitude is unstable below S/N ∼5, which provides motivation to use log amplitude instead when there is a chance for low-S/N amplitudes to appear in the denominator of an amplitude ratio. (Appendix A)
  • 2.  
    The ensemble distribution of measured log amplitude for known S/N is fully characterized in terms of moments, from which expected distributions for log-closure amplitude can be derived. (Appendix A)
  • 3.  
    In practice, a noisy estimate of S/N prevents a reliable characterization of phase and amplitude errors, particularly for weak signals, and can lead to significant bias from self-selection of data. (Section 2.4)
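The first point above can be illustrated with a short Monte Carlo sketch (a hypothetical example assuming NumPy; the S/N values and sample size are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(42)

def sample_rho(snr, n=200_000):
    """Measured visibility amplitudes rho = |r| / sigma_r for a given intrinsic
    S/N, with unit thermal noise in each complex component."""
    r = snr + rng.normal(size=n) + 1j * rng.normal(size=n)
    return np.abs(r)

for snr in (2.0, 5.0, 10.0):
    rho = sample_rho(snr)
    print(f"S/N={snr:4.1f}  std(log rho)={np.log(rho).std():.3f} (~1/snr={1/snr:.3f})"
          f"  std(1/rho)={(1/rho).std():.3f} (~1/snr^2={1/snr**2:.3f})")
```

At S/N = 2, occasional near-zero amplitudes inflate the spread of 1/ρ far beyond the nominal ${\breve{\rho }}^{-2}$, while log ρ remains well behaved with spread close to $1/\breve{\rho }$.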

(2) The covariance structure for closure quantities—fitting those quantities to a model and characterizing their fundamental degrees of freedom:

  • 1.  
    Closure quantities formed from a common set of baseline visibilities are covariant due to shared thermal noise. This covariance must be included in order to recover both proper χ2 statistics and a correct likelihood for a particular noise realization. (Section 3.1)
  • 2.  
    When covariant errors are included, both the χ2 and the likelihood (in the Gaussian limit) are independent of the specific minimal set of closure quantities used for the calculation. (Section 3.1)
  • 3.  
    In the limit of equal S/N on all baselines, the χ2 for a specific minimal set reduces to an evaluation over all closure quantities weighted equally, scaled to the appropriate degrees of freedom. (Appendix C)
  • 4.  
    If closure quantities are assumed to be independent and the covariance structure is ignored, results do depend on the specific choice of minimal set. Certain selections can be chosen to minimize off-diagonal terms in the covariance matrix, but this choice depends on the specific arrangement of baseline S/N. (Section 3.2, Figure 10 of Section 4)
  • 5.  
    Two different direct constructions for selecting nonredundant sets of closure amplitudes are given. They verify explicitly the expected N(N − 3)/2 independent degrees of freedom contained in the closure amplitudes. (Section 3.2 and Appendix B.3)
  • 6.  
    A unified matrix construction for creating visibilities and closure quantities is given, which systematically builds up design matrices for increasing station number. These are used to derive the covariance and other relationships across different quantities. (Appendix B)
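As a concrete sketch of points 1 and 2, the following hypothetical NumPy example builds two different minimal sets of closure phases for a four-station array from the maximal design matrix of Table B3 (the per-baseline variances and residuals are arbitrary illustrative numbers). With the full covariance the χ2 values agree exactly; the diagonal approximation generally depends on the chosen set:

```python
import numpy as np

rng = np.random.default_rng(0)

# Maximal closure phase design matrix for N = 4 (baseline order
# phi_12, phi_13, phi_14, phi_23, phi_24, phi_34), as in Table B3
Psi_max = np.array([
    [1., -1.,  0., 1.,  0., 0.],   # triangle (1,2,3)
    [1.,  0., -1., 0.,  1., 0.],   # triangle (1,2,4)
    [0.,  1., -1., 0.,  0., 1.],   # triangle (1,3,4)
    [0.,  0.,  0., 1., -1., 1.],   # triangle (2,3,4)
])

S = np.diag(rng.uniform(0.5, 2.0, 6))   # thermal phase variances per baseline
dphi = rng.normal(size=6)               # one residual phase realization

def chi2(Psi):
    """chi^2 over a set of closure phases with the full noise covariance."""
    d = Psi @ dphi
    Sigma = Psi @ S @ Psi.T
    return d @ np.linalg.solve(Sigma, d)

set_a = Psi_max[[0, 1, 2]]   # minimal set: triangles containing station 1
set_b = Psi_max[[0, 1, 3]]   # a different minimal set
print(chi2(set_a), chi2(set_b))   # identical when covariance is included

def chi2_diag(Psi):
    """Same, but (incorrectly) ignoring the off-diagonal covariance."""
    d = Psi @ dphi
    return d @ (d / np.diag(Psi @ S @ Psi.T))

print(chi2_diag(set_a), chi2_diag(set_b))   # these generally differ
```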

(3) Relationship of closure information to station gain information, and behavior in the limit of completely known gains, partially known gains, and completely unknown gains:

  • 1.  
    Under a model for systematic station-based gain error, residual visibility phase and log amplitude also acquire a covariance structure with nonzero off-diagonal elements due to gain model error. (Section 4.1)
  • 2.  
    Using this covariance structure for visibilities is equivalent to explicitly marginalizing over additional free gain parameters under a Gaussian prior. We note however that wide log-amplitude gain priors will often be a poor characterization of expected telescope performance, which is bounded. A standard Bayesian approach of direct numerical marginalization over nuisance gain parameters would be needed to take full advantage of more realistic priors. (Appendix C.5)
  • 3.  
    In the limit of small thermal error compared to gain error, the χ2 derived from visibility measurements reduces to the χ2 derived from only closure quantities, after accounting for covariance. Thus, the closure quantities contain all non-station-based information. (Appendix C.4)
  • 4.  
    We apply the likelihood constructions introduced in this paper toward direct sampling of the posterior distribution of a simple source model and simulated observation. We confirm that the inferred parameter posterior derived using closure quantities matches that derived using baseline visibilities in the limit of unknown gains and that the uncertainties are larger than those derived under known gain calibration, reflecting the relative loss of information. (Section 4)
  • 5.  
    Under modeling of partially known gains, with systematic station gain uncertainty comparable to that from baseline thermal noise, we see that the model posterior distribution transitions smoothly between the case of perfectly calibrated, corresponding to zero gain error, and completely unknown calibration, using only closure quantities. (Section 4.4)
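The equivalence in point 2 can be sketched numerically for a three-station, phase-only example (hypothetical variances and residuals; phases are treated as unwrapped Gaussian variables, and the gains are marginalized by Monte Carlo over their Gaussian prior):

```python
import numpy as np

rng = np.random.default_rng(1)

# Phase design matrix for N = 3 (baseline order phi_12, phi_13, phi_23)
Phi = np.array([[1., -1., 0.],
                [1., 0., -1.],
                [0., 1., -1.]])

S = np.diag([0.3, 0.5, 0.4])         # thermal phase variances (example values)
Sig_th = np.diag([0.2, 0.1, 0.3])    # Gaussian gain phase prior variances
dphi = np.array([0.4, -0.2, 0.7])    # a residual visibility phase vector

def gauss(x, C):
    """Zero-mean multivariate normal density at x with covariance C."""
    norm = np.sqrt((2 * np.pi) ** len(x) * np.linalg.det(C))
    return np.exp(-0.5 * x @ np.linalg.solve(C, x)) / norm

# Covariance form: dphi ~ N(0, S + Phi Sig_th Phi^T)
analytic = gauss(dphi, S + Phi @ Sig_th @ Phi.T)

# Explicit Monte Carlo marginalization over station gain phases theta
theta = rng.multivariate_normal(np.zeros(3), Sig_th, size=400_000)
resid = dphi[None, :] - theta @ Phi.T
chi2 = np.einsum('ij,ij->i', resid, resid / np.diag(S))
numeric = np.exp(-0.5 * chi2).mean() / np.sqrt((2 * np.pi) ** 3 * np.linalg.det(S))

print(analytic, numeric)   # agree to Monte Carlo precision
```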

The authors thank Jim Moran, Geoff Bower, Ramesh Narayan, Kazunori Akiyama, Katie Bouman, Christiaan Brinkerink, Avery Broderick, Andre Young, Josh Speagle, and the anonymous referee for helpful discussions, comments, and ideas. We thank the National Science Foundation (AST-1440254, AST-1614868) and the Gordon and Betty Moore Foundation (GBMF-5278) for financial support of this work. This work was supported in part by the Black Hole Initiative at Harvard University, which is supported by a grant from the John Templeton Foundation and the Gordon and Betty Moore Foundation.

Appendix A: Distributions due to Thermal Noise

Here, we discuss the statistical distributions of measured amplitude and phase quantities used for model fitting, the quality of the normal distribution approximations, and the influence of the estimation of an intrinsic signal-to-noise parameter $\breve{\rho }$. In the thermal-noise-dominated regime, the fundamental measured quantity, the complex correlation coefficient r, follows a circularly symmetric complex normal distribution with mean $\breve{\rho }{\sigma }_{r}$. Without loss of generality, we choose our coordinates such that the mean of the complex correlation distribution is real. An associated standard deviation of both real and imaginary components σr can be computed from first principles (Thompson et al. 2017). Hence, it is useful to work with the normalized complex random variable r/σr, with unit standard deviation (Figure A1). Probability densities for closure quantities, as shown in Figures 1 and 2, can then be derived from the ones presented here with elementary operations such as convolution.


Figure A1. Two thousand random realizations of the measured complex correlation coefficient r/σr given intrinsic signal-to-noise $\breve{\rho }$ = 2 and 5.


A.1. Phase and Amplitude Distributions

The correlation coefficient phase ϕ is the argument of a circular complex normal variable and, as such, obeys the following circular distribution (Thompson et al. 2017):

Equation (A1)

We choose the coordinates in such a way that the true visibility phase is zero, and we denote the error function with Erf. Examples showing the probability density $p(\phi | \breve{\rho })$ for different values of $\breve{\rho }$ are shown in the top left panel of Figure A2. This somewhat complicated distribution can be approximated either by a normal distribution (dashed lines in Figure A2),

Equation (A2)

or by the von Mises distribution (dotted lines in Figure A2; Christian & Psaltis 2019),

Equation (A3)

which outperforms the normal distribution for low S/N.


Figure A2. Analytic distributions of phase and amplitude quantities (continuous lines). Normal distribution approximations, exact in the $\breve{\rho }\to \infty $ limit, are shown with dashed lines. For the visibility phase (top left panel), the von Mises distribution approximation is shown with a dotted line. All presented approximations assume knowledge of a hidden parameter $\breve{\rho }$, which in general must be estimated from noisy measurements.


For a given model $\breve{\rho }$, the measured normalized correlation coefficient amplitude $\rho =| r/{\sigma }_{r}| \geqslant 0$ follows a Rice distribution,

Equation (A4)

where I0 is a modified Bessel function of the first kind with order zero. Distributions of visibility amplitude are shown in Figure A2 (top right panel). The dashed lines represent a normal distribution approximation with mean $m={({\breve{\rho }}^{2}+1)}^{1/2}$ and unit standard deviation, which is accurate in the limit $\breve{\rho }\to \infty $.

The normal approximation cannot properly handle strictly nonnegative random variables, which becomes a problem at low S/N. The mean of the correlation amplitude is also positively biased with respect to $\breve{\rho }$ due to its noise contribution: we find $E\left[\rho \right]=2.272$ for $\breve{\rho }=2$ and $E\left[\rho \right]=5.101$ for $\breve{\rho }=5$, which illustrates why debiasing is important for incoherent averaging over many realizations and for estimating $\breve{\rho }$ from low-S/N data.
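These bias values can be checked directly against the Rice distribution; a sketch using SciPy, whose `rice` distribution has shape parameter b equal to $\breve{\rho }$ when scale = 1:

```python
from scipy import stats

# Mean measured amplitude E[rho] under the Rice distribution, compared
# against the quoted values; also the debiased estimate sqrt(E[rho]^2 - 1)
for snr, quoted in [(2.0, 2.272), (5.0, 5.101)]:
    m = stats.rice.mean(snr)
    print(f"snr={snr}: E[rho]={m:.3f} (quoted {quoted}),"
          f" sqrt(E[rho]^2 - 1)={(m**2 - 1)**0.5:.3f}")
```

The debiased value lands close to the intrinsic $\breve{\rho }$ in both cases.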

When working with closure amplitudes, we need to utilize the reciprocal amplitudes y = 1/ρ, distributed according to

Equation (A5)

Although this distribution can be approximated at high S/N as a normal distribution with mean $m={({\breve{\rho }}^{2}+1)}^{-1/2}$ and standard deviation ${\breve{\rho }}^{-2}$ (Figure A2 bottom left panel), the probability distribution exhibits heavy tails at low S/N, related to the inversion of potentially arbitrarily small amplitudes. The fact that amplitude is always positive is one indication that log amplitude might be a more natural space in which to characterize the distribution. Another benefit of using log amplitude is that amplitude and squared amplitude (a more natural quantity for incoherent sums of Gaussian components) are simply related. The logarithm of the correlation amplitude $z=\mathrm{log}\rho $ ($\mathrm{log}$ denotes the natural logarithm) obeys the following log–Rice distribution:

Equation (A6)

The distributions of the logarithm of amplitude for different $\breve{\rho }$ are shown in Figure A2, bottom right panel. Moments of the log–Rice distribution are analytically tractable, and the distribution can be approximated with a normal distribution of mean $m=0.5\mathrm{log}({\breve{\rho }}^{2}+1)$ and standard deviation $1/\breve{\rho }$. A more general exact treatment of incoherent averages of M amplitude measurements follows.
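A quick Monte Carlo check of this normal approximation (assuming NumPy; the S/N values and sample size are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(2)

# Monte Carlo check of the normal approximation to the log-Rice distribution:
# mean ~ 0.5*log(snr^2 + 1), standard deviation ~ 1/snr
for snr in (2.0, 5.0, 10.0):
    r = snr + rng.normal(size=500_000) + 1j * rng.normal(size=500_000)
    z = np.log(np.abs(r))
    print(f"snr={snr:5.1f}  mean={z.mean():+.4f} vs {0.5*np.log(snr**2+1):+.4f}"
          f"   std={z.std():.4f} vs {1/snr:.4f}")
```

The approximation degrades toward $\breve{\rho }=2$, where the sample mean sits visibly below $0.5\mathrm{log}({\breve{\rho }}^{2}+1)$.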

A.2. Log-amplitude Ensemble Distribution

Amplitude gain is a property of antenna efficiency and system noise and is generally quite stable when compared to variations in atmospheric phase. This leads to the concept of incoherent averaging for a series of amplitude measurements (Rogers et al. 1995; Johnson et al. 2015). Consider a set of M independent complex visibility measurements vi, where each complex component has unit thermal noise. Thus, ${v}_{i}={\breve{\rho }}_{i}+{n}_{i}$, where ${\breve{\rho }}_{i}$ is the expected signal-to-noise ratio of each measurement and ni is a complex Gaussian random variable with σ = 1 for each component. The sum of squared amplitudes $x={\sum }_{i}{\left|{v}_{i}\right|}^{2}$ follows a χ2 distribution with 2M degrees of freedom; this becomes a noncentral χ2 distribution when there is a nonzero expected source contribution,

Equation (A7)

where λ is the non-centrality parameter,

Equation (A8)

The expectation value of $\mathrm{log}x$ is

Equation (A9)

and g(·) is the function (Lapidoth & Moser 2003)

Equation (A10)

where γ ≈ 0.577 is the Euler–Mascheroni constant and $\mathrm{Ei}$ is the exponential integral. We have introduced σ for the case where amplitudes are uniformly scaled away from σ = 1. From this, the expectation value $E\left[\mathrm{log}\sqrt{x}\right]=E\left[\mathrm{log}x\right]/2$ is easy to calculate, for example, in the case of a single Rice-distributed complex visibility (where "measured" $\rho =| v| $)

Equation (A11)
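This expectation can be checked numerically for the M = 1 case. The sketch below assumes the normalization used here (unit noise per complex component, so $x={\rho }^{2}$ is a noncentral χ2 with two degrees of freedom and $\lambda ={\breve{\rho }}^{2}$), under which the expectation reduces to $E[\mathrm{log}x]=\mathrm{log}2+g(\lambda /2)$ with $g(\xi )=\mathrm{log}\xi -\mathrm{Ei}(-\xi )$; this is a sketch under those assumptions, not a restatement of Equation (A11):

```python
import numpy as np
from scipy.special import expi

rng = np.random.default_rng(3)

def g(xi):
    """g(xi) = log(xi) - Ei(-xi); g(0) = -gamma (Euler-Mascheroni constant)."""
    return np.log(xi) - expi(-xi) if xi > 0 else -np.euler_gamma

# M = 1: x = rho^2 is a noncentral chi^2 with 2 dof and lambda = snr^2,
# for which E[log x] = log(2) + g(lambda / 2) in this normalization
for snr in (0.5, 2.0, 5.0):
    x = np.abs(snr + rng.normal(size=500_000) + 1j * rng.normal(size=500_000)) ** 2
    print(f"snr={snr}: MC={np.log(x).mean():.4f}"
          f"  analytic={np.log(2) + g(snr**2 / 2):.4f}")
```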

The log-closure amplitude c is formed from the linear combination of four log amplitudes A, B, C, D,

Equation (A12)

so that the expectation value (from which a bias is derived) is trivial,

Equation (A13)

To characterize the distribution of measured closure amplitudes, we require additional moments beyond the first moment (bias). For a multivariate Gaussian approximation suitable for a least-squares fitting of log-closure quantities with known covariance, we need to estimate the second moment of each log amplitude. High-order moments of the log-noncentral χ2 distribution can be derived as Poisson-weighted infinite series of polygamma functions ψ(m)(z) (Pav 2015),

Equation (A14)

where ${\mu }_{k,2M+2j}^{{\prime} }$ is the kth moment (not the central moment) of a log chi-square (λ = 0) distribution with 2M + 2j degrees of freedom,

Equation (A15)

Equation (A16)

in terms of cumulants κn. Note that the second and third cumulants are equal to the corresponding central moments. The cumulants in terms of noncentral moments are

Equation (A17)

For a single log-central χ2 distribution of two degrees of freedom (i.e., an exponential distribution with mean value 2), the first and second cumulants are particularly simple,

Equation (A18)

Equation (A19)

which are the same as the cumulants for a log-Rayleigh distribution scaled appropriately by a factor of two.

Recurrence relationships for the polygamma functions can be used to quickly derive cumulants, including higher-order cumulants, of the log-central χ2 distribution with 2M degrees of freedom. Aside from κ1, the higher-order cumulants approach zero as $M\to \infty $. Calculating cumulants for an increasing number of degrees of freedom then amounts to simply adding one more term to the series:

Equation (A20)

Equation (A21)

Equation (A22)

Equation (A23)

or, more generally,

Equation (A24)
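One compact form consistent with Equations (A18) and (A19) writes these cumulants as ${\kappa }_{1}=\psi (M)+\mathrm{log}2$ and ${\kappa }_{n}={\psi }^{(n-1)}(M)$ for n ≥ 2; a hedged Monte Carlo check of that form (assuming SciPy's polygamma conventions):

```python
import numpy as np
from scipy.special import digamma, polygamma

rng = np.random.default_rng(4)

def log_chi2_cumulants(M, nmax=4):
    """Cumulants of log X, with X a central chi^2 with 2M degrees of freedom:
    kappa_1 = psi(M) + log 2, kappa_n = psi^(n-1)(M) for n >= 2."""
    return ([digamma(M) + np.log(2)]
            + [float(polygamma(n - 1, M)) for n in range(2, nmax + 1)])

for M in (1, 3):
    # Monte Carlo: sum of squares of 2M standard normals is chi^2 with 2M dof
    z = np.log((rng.normal(size=(400_000, 2 * M)) ** 2).sum(axis=1))
    k = log_chi2_cumulants(M)
    print(f"M={M}: MC mean/var = {z.mean():.4f}/{z.var():.4f}"
          f"  analytic = {k[0]:.4f}/{k[1]:.4f}")
```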

These relatively simple expressions are for the cumulants of a log central χ2 distribution. They must be converted into moments of a noncentral distribution by summing over the appropriate Poisson mixture (Equation (A14)), which depends on the non-centrality parameter. The noncentral moments can then be converted back into cumulants of the noncentral distribution to build cumulants of the log-closure amplitude distribution. For a log-closure amplitude c = A + B − C − D, the cumulants of c are formed as

Equation (A25)

Equation (A26)

Equation (A27)

Equation (A28)

and so on. Figure A3 shows the first four moments calculated this way, using a finite number of nonzero terms from the Poisson mixture, and compares them to a Monte Carlo estimate.


Figure A3. Moments of the log-noncentral χ2 distribution (two degrees of freedom) as a function of signal-to-noise. Blue dots correspond to a Monte Carlo estimation, while orange lines correspond to the moment expansion (Equation (A14)) over a finite number of Poisson terms. The noncentral χ2 distribution itself is not well captured by a small number of moments (due to the tail), but for log-closure amplitude, the propagated moments (Equations (A25)–(A28)) can be used to fit to good approximations such as an exponentially modified Gaussian distribution.


A.3. Quality of Distribution Approximations

The true underlying value of $\breve{\rho }$ remains generally unknown, and our ability to estimate $\breve{\rho }$ will influence the quality of our derived distribution for the measured value. This contributes a source of error in addition to any mismatch due to the approximations used. In Figure A4, we evaluate the influence of both effects using a χ2 test and by calculating the Kullback–Leibler divergence between the ground truth and a normal distribution characterized by the two approximated moments. At low S/N, uncertainties are typically underestimated, leading to large χ2 values. In the context of inferred model parameters, this leads to erroneously narrow derived posteriors.


Figure A4. The four panels show the quality of the normal approximation for different phase and amplitude distributions as a function of model S/N. The solid lines show the expected squared value of the normalized residual quantity (expected reduced χ2) from the legend, while the dashed lines show the relative entropy (Kullback–Leibler divergence) between the true distribution of each quantity and a standard normal distribution. For example, the values at $\breve{\rho }=2$ reflect an ensemble of complex visibilities with intrinsic $\breve{\rho }=2$ and measured $\rho =| r/{\sigma }_{r}| $ for each random realization (see Figure A1). The orange line in the top left panel thus corresponds to an expected squared deviation in measured phase $\phi =\mathrm{Arg}\left[r/{\sigma }_{r}\right]$ away from the truth value $\breve{\phi }$, where the deviation is normalized by an empirical error estimate σϕ = 1/ρ. Other curves show different error estimates based on the model $\breve{\rho }$ (which is typically not known in a real observation), or a noise-debiased estimate ${\rho }_{\mathrm{deb}}=\sqrt{{\rho }^{2}-1}$. For log amplitude, μ corresponds to the small expected bias from Equation (A11).


Given knowledge of the true $\breve{\rho }$, it is possible in principle to achieve perfect statistics through full knowledge of the distribution, rather than the high-S/N approximations used in the figure. However, this does not extend to the empirical (realistic) estimators. Furthermore, we see that the estimator with reduced ${\chi }^{2}$ closest to one is not always the estimator with the best Gaussianity according to the Kullback–Leibler divergence. Lastly, although reciprocal amplitude is very difficult to characterize due to values near zero, motivating the use of log amplitude, visibility amplitude itself is comparatively well behaved and easy to approximate, even at low S/N.

Appendix B: Design and Covariance Matrix Construction

B.1. Baseline Phase and Amplitude Matrices

A pair of complex visibilities may share a station, so station-based gain effects result in covariances between visibility measurements. The covariance between two visibility phase measurements ${\phi }_{{ij}}$ and ${\phi }_{k{\ell }}$ can be expressed as

Equation (B1)

where ${\sigma }_{{ij}}^{2}$ is the thermal variance of the visibility phase measurement ${\phi }_{{ij}}$, ${\sigma }_{\theta ,i}^{2}$ is the gain phase variance for station i, and δij is the Kronecker delta. A similar expression holds for the covariance between two log visibility amplitude measurements ${a}_{{ij}}$ and ${a}_{k{\ell }}$,

Equation (B2)

where ${\sigma }_{{ij}}^{2}$ is the thermal variance of the log visibility amplitude measurement ${a}_{{ij}}$ and ${\sigma }_{g,i}^{2}$ is the log gain amplitude variance for station i.

We can see from Equations (B1) and (B2) that the visibility measurement covariances separate into baseline-based and station-based terms. We can thus write the visibility phase covariance matrix ${{\boldsymbol{\Sigma }}}_{{\boldsymbol{\phi }}}$ as

Equation (B3)

The ${{\boldsymbol{\Sigma }}}_{{\boldsymbol{\theta }}}$ and ${{\boldsymbol{S}}}_{{\boldsymbol{\phi }}}$ matrices are $N$ × $N$ and $B$ × $B$ diagonal matrices constructed from the individual station gain phase variances and visibility phase variances, respectively. The visibility phase "design matrix" ${\boldsymbol{\Phi }}$ is rectangular in general, with $B$ rows and $N$ columns, and provides a mapping from the station-based representation to the baseline-based representation. Each row of ${\boldsymbol{\Phi }}$ contains only two nonzero entries, the first being a 1 and the second being a −1. There are $B$ different ways of writing a length-$N$ row in this fashion, and these constitute the $B$ rows of the matrix. The ordering of these rows depends on the chosen baseline ordering scheme. In this section, we assume a "second station first" baseline ordering scheme, which increments the visibility phases via a nested loop method. The "inner loop" iterates through the second station in increasing order, and the "outer loop" iterates through the first station in increasing order. An example of such an ordering is (${\phi }_{12}$, ${\phi }_{13}$, ${\phi }_{14}$, ..., ${\phi }_{1N}$, ${\phi }_{23}$, ${\phi }_{24}$, ..., ${\phi }_{2N}$, ${\phi }_{34}$, ..., ${\phi }_{N-1,N}$), where the indices here correspond to the two stations forming each baseline.

For a general $N$-station array with $N$ > 2, we present the following recursive relationship for the visibility phase design matrix:

Equation (B4)

where $1$ is an ($N$ − 1) × 1 vector containing only 1s, $0$ is an $\left(\displaystyle \genfrac{}{}{0em}{}{N-1}{2}\right)\times 1$ vector containing only 0s, ${{\boldsymbol{I}}}_{N-1}$ is the identity matrix of rank $N-1$, and ${{\boldsymbol{\Phi }}}_{N-1}$ is the visibility phase design matrix for an array with $N-1$ stations. The rank of ${{\boldsymbol{\Phi }}}_{N}$ is equal to $N-1$. Table B1 lists examples of ${\boldsymbol{\Phi }}$ and the corresponding ${{\boldsymbol{\Sigma }}}_{{\boldsymbol{\phi }}}$ matrices.

The log visibility amplitude design matrix ${\boldsymbol{A}}$ shares the same structure as the visibility phase design matrix, with the only difference being that the negative elements of ${\boldsymbol{\Phi }}$ become positive for ${\boldsymbol{A}}$. As a result, for $N\gt 2$, the rank of ${{\boldsymbol{A}}}_{N}$ for the log visibility amplitudes is equal to $N$. Table B2 lists examples of log visibility amplitude design and covariance matrices.
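The recursion of Equation (B4) and its all-positive amplitude counterpart can be sketched directly (a hypothetical NumPy implementation with rank checks):

```python
import numpy as np

def phase_design(N):
    """Visibility phase design matrix Phi_N built by the recursion of
    Equation (B4), with baselines in 'second station first' order."""
    if N == 2:
        return np.array([[1., -1.]])
    prev = phase_design(N - 1)
    top = np.hstack([np.ones((N - 1, 1)), -np.eye(N - 1)])
    bottom = np.hstack([np.zeros((prev.shape[0], 1)), prev])
    return np.vstack([top, bottom])

def amplitude_design(N):
    """Log visibility amplitude design matrix A_N: as Phi_N but all positive."""
    return np.abs(phase_design(N))

for N in (3, 4, 5, 6):
    Phi, A = phase_design(N), amplitude_design(N)
    assert Phi.shape == (N * (N - 1) // 2, N)    # B rows, N columns
    # gain phases enter as differences (rank N-1);
    # log gain amplitudes enter as sums (full rank N)
    print(N, np.linalg.matrix_rank(Phi), np.linalg.matrix_rank(A))
```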

B.2. Closure Phase Matrices

It is possible for two closure triangles to have a baseline in common, so in general, two closure phase measurements may be covariant. The covariance between two closure phase measurements ${\psi }_{{ijk}}$ and ${\psi }_{{\ell }{mn}}$ can be expressed as

Equation (B5)

This lengthy expression encodes two symmetries of closure phases. The first symmetry is a "cycling invariance",

Equation (B6)

which indicates that the choice of starting baseline does not affect the value of the closure phase. The second symmetry is a sign flip imparted upon reversing the direction of the sequence,

Equation (B7)

These symmetries are illustrated in Figure B1.

Figure B1.

Figure B1. Diagrams of closure phase symmetries for a single triangle containing stations i, j, and k, with baselines numbered in the sequence used to construct the closure phase. All closure phases in the left block of diagrams have the same value, and all closure phases in the right block of diagrams have the same value; however, the values corresponding to the two blocks of diagrams differ in sign. The left three diagrams illustrate closure phases constructed in a clockwise manner using a different starting baseline each time; the value of the closure phase is invariant to the choice of starting baseline (Equation (B6)). The left and right blocks of diagrams differ by a reversal in the direction of closure phase construction; the value of the closure phase changes sign upon direction reversal (Equation (B7)).

Standard image High-resolution image

As with the visibilities (see Appendix B.1), we can construct a design matrix ${\boldsymbol{\Psi }}$ that maps from the visibility phase space to the closure phase space,

Equation (B8)

where ${\boldsymbol{\phi }}$ and ${\boldsymbol{\psi }}$ are vectors of visibility phases and closure phases, respectively. This design matrix allows us to express the closure phase covariance matrix ${{\boldsymbol{\Sigma }}}_{{\boldsymbol{\psi }}}$ in terms of the visibility phase covariance matrix ${{\boldsymbol{\Sigma }}}_{{\boldsymbol{\phi }}}$,

Equation (B9)

By construction, the closure phases use combinations of visibility phases for which the gain contributions cancel, a property referred to as "phase aberration annihilation" by Lannes (1991). This cancellation manifests in the design matrices as well: the product of closure phase and visibility phase design matrices evaluates to the zero matrix,

Equation (B10)

We can thus express the closure phase covariance matrix more simply in terms of the diagonal matrix containing only visibility phase thermal variances,

Equation (B11)

For a general $N$-station array with $N$ > 3, we present the following recursive relationship for constructing the design matrix corresponding to a maximal set of closure phases:

Equation (B12)

where ${{\boldsymbol{\Phi }}}_{N-1}$ is the visibility phase design matrix for an array with $N-1$ stations (see Equation (B4)), $0$ is an $\left(\displaystyle \genfrac{}{}{0em}{}{N-1}{3}\right)\times (N-1)$ matrix containing only 0s, ${{\boldsymbol{I}}}_{\left(\displaystyle \genfrac{}{}{0em}{}{N-1}{2}\right)}$ is the identity matrix of rank $\left(\displaystyle \genfrac{}{}{0em}{}{N-1}{2}\right)$, and ${{\boldsymbol{\Psi }}}_{N-1,\max }$ is the maximal closure phase design matrix for an array with $N-1$ stations.

To obtain a minimal (nonredundant) set of closure phases for an $N$-station array, we can use a modified design matrix:

Equation (B13)

Table B3 lists example closure phase design and covariance matrices.

Table B3.  Closure Phase Design and Covariance Matrices for Three- and Four-element Arrays

    Number of Stations ($N$)
Matrix Shape $N=3$ $N=4$
${{\boldsymbol{\phi }}}^{\top }$ $1\times B$ $\left(\begin{array}{ccc}{\phi }_{12} & {\phi }_{13} & {\phi }_{23}\end{array}\right)$ $\left(\begin{array}{cccccc}{\phi }_{12} & {\phi }_{13} & {\phi }_{14} & {\phi }_{23} & {\phi }_{24} & {\phi }_{34}\end{array}\right)$
${{\boldsymbol{\Psi }}}_{\max }$ T × B $\left(\begin{array}{ccc}1 & -1 & 1\end{array}\right)$ $\left(\begin{array}{cccccc}1 & -1 & 0 & 1 & 0 & 0\\ 1 & 0 & -1 & 0 & 1 & 0\\ 0 & 1 & -1 & 0 & 0 & 1\\ 0 & 0 & 0 & 1 & -1 & 1\end{array}\right)$
${\boldsymbol{\Psi }}$ t × B $\left(\begin{array}{ccc}1 & -1 & 1\end{array}\right)$ $\left(\begin{array}{cccccc}1 & -1 & 0 & 1 & 0 & 0\\ 1 & 0 & -1 & 0 & 1 & 0\\ 0 & 1 & -1 & 0 & 0 & 1\end{array}\right)$
${{\boldsymbol{\Sigma }}}_{{\boldsymbol{\psi }},\max }$ T × T $\left(\begin{array}{c}{\sigma }_{12}^{2}+{\sigma }_{13}^{2}+{\sigma }_{23}^{2}\end{array}\right)$ $\left(\begin{array}{cccc}{\sigma }_{12}^{2}+{\sigma }_{13}^{2}+{\sigma }_{23}^{2} & {\sigma }_{12}^{2} & -{\sigma }_{13}^{2} & {\sigma }_{23}^{2}\\ {\sigma }_{12}^{2} & {\sigma }_{12}^{2}+{\sigma }_{14}^{2}+{\sigma }_{24}^{2} & {\sigma }_{14}^{2} & -{\sigma }_{24}^{2}\\ -{\sigma }_{13}^{2} & {\sigma }_{14}^{2} & {\sigma }_{13}^{2}+{\sigma }_{14}^{2}+{\sigma }_{34}^{2} & {\sigma }_{34}^{2}\\ {\sigma }_{23}^{2} & -{\sigma }_{24}^{2} & {\sigma }_{34}^{2} & {\sigma }_{23}^{2}+{\sigma }_{24}^{2}+{\sigma }_{34}^{2}\end{array}\right)$
${{\boldsymbol{\Sigma }}}_{{\boldsymbol{\psi }}}$ t × t $\left(\begin{array}{c}{\sigma }_{12}^{2}+{\sigma }_{13}^{2}+{\sigma }_{23}^{2}\end{array}\right)$ $\left(\begin{array}{ccc}{\sigma }_{12}^{2}+{\sigma }_{13}^{2}+{\sigma }_{23}^{2} & {\sigma }_{12}^{2} & -{\sigma }_{13}^{2}\\ {\sigma }_{12}^{2} & {\sigma }_{12}^{2}+{\sigma }_{14}^{2}+{\sigma }_{24}^{2} & {\sigma }_{14}^{2}\\ -{\sigma }_{13}^{2} & {\sigma }_{14}^{2} & {\sigma }_{13}^{2}+{\sigma }_{14}^{2}+{\sigma }_{34}^{2}\end{array}\right)$

Note. Here, $T=\left(\displaystyle \genfrac{}{}{0em}{}{N}{3}\right)$ is the number of triangles in a maximal set, $t=\left(\displaystyle \genfrac{}{}{0em}{}{N-1}{2}\right)$ is the number of triangles in a minimal set, and $B=\left(\displaystyle \genfrac{}{}{0em}{}{N}{2}\right)$ is the number of baselines.

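A sketch of the Equation (B12) recursion (hypothetical NumPy code; the per-baseline variances are arbitrary examples), verifying the gain cancellation of Equation (B10) and reproducing the covariance structure of Table B3:

```python
import numpy as np

def phase_design(N):
    """Visibility phase design matrix Phi_N (Equation (B4) recursion)."""
    if N == 2:
        return np.array([[1., -1.]])
    prev = phase_design(N - 1)
    return np.vstack([np.hstack([np.ones((N - 1, 1)), -np.eye(N - 1)]),
                      np.hstack([np.zeros((prev.shape[0], 1)), prev])])

def closure_phase_design(N):
    """Maximal closure phase design matrix Psi_N,max via the recursion of
    Equation (B12): triangles containing station 1, then the rest."""
    if N == 3:
        return np.array([[1., -1., 1.]])
    prev = closure_phase_design(N - 1)
    PhiNm1 = phase_design(N - 1)
    t = PhiNm1.shape[0]                     # triangles containing station 1
    top = np.hstack([PhiNm1, np.eye(t)])
    bottom = np.hstack([np.zeros((prev.shape[0], N - 1)), prev])
    return np.vstack([top, bottom])

Psi = closure_phase_design(4)
Phi = phase_design(4)
assert np.allclose(Psi @ Phi, 0)            # gain phases cancel (Equation (B10))

# Covariance of a maximal closure phase set from thermal variances only
sig2 = np.array([1., 2., 3., 4., 5., 6.])   # example per-baseline phase variances
Sigma_psi = Psi @ np.diag(sig2) @ Psi.T
print(Sigma_psi)                            # matches the pattern of Table B3
```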

B.3. Log-closure Amplitude Matrices

A pair of closure quadrangles can have up to two baselines in common, meaning that, in general, log-closure amplitudes will be covariant. The covariance between log-closure amplitude measurements ${c}_{{ijk}{\ell }}$ and ${c}_{{mnpq}}$ can be expressed as

Equation (B14)

where ${\sigma }_{{ij}}^{2}$ is the variance in the log visibility amplitude measurement ${a}_{{ij}}$. There are three symmetries encoded in the above expression. The first of these is a cycling invariance,

Equation (B15)

indicating that, as for the closure phases, the log-closure amplitude value does not change with choice of starting baseline. The second symmetry is a direction invariance,

Equation (B16)

showing that, unlike for closure phases, the log-closure amplitude value does not change when the sequence of baselines is reversed. The third symmetry is a sign flip imparted on the value of the log-closure amplitude upon swapping the numerator and denominator,

Equation (B17)

These symmetries are illustrated in Figure B2.


Figure B2. Diagrams of log-closure amplitude symmetries for a single quadrangle containing stations i, j, k, and ℓ, with baselines numbered in the sequence used to construct the log-closure amplitude; baselines in the numerator of the closure amplitude are filled in black, while those in the denominator are filled in white. All log-closure amplitudes in the left block of diagrams have the same value, and all log-closure amplitudes in the right block of diagrams have the same value; however, the values corresponding to the two blocks of diagrams differ in sign. Within a single block of diagrams, each row illustrates log-closure amplitudes constructed in the same cycle direction but using a different starting baseline; the value of the log-closure amplitude is invariant to the choice of starting baseline (Equation (B15)). Within a single block of diagrams, each column illustrates a reversal in the cycle direction of log-closure amplitude construction; the value of the log-closure amplitude is invariant upon direction reversal (Equation (B16)). The left and right blocks of diagrams differ by a swap of numerator and denominator; the value of the log-closure amplitude changes sign upon swapping numerator and denominator (Equation (B17)).


We construct a minimal design matrix ${\boldsymbol{C}}$ that maps from the log visibility amplitude space to the log-closure amplitude space,

Equation (B18)

where ${\boldsymbol{a}}$ and ${\boldsymbol{c}}$ are vectors of log visibility amplitudes and log-closure amplitudes, respectively. This log-closure amplitude design matrix is equivalent to the "amplitude closure operator" of Lannes (1990a) and the "alternate amplitude compilation operator" of Lannes (1991). We express the log-closure amplitude covariance matrix ${{\boldsymbol{\Sigma }}}_{{\boldsymbol{c}}}$ in terms of this design matrix ${\boldsymbol{C}}$ and the log visibility amplitude covariance matrix ${{\boldsymbol{\Sigma }}}_{{\boldsymbol{a}}}$,

Equation (B19)

where, as with Equation (B10), we have used the fact that ${\boldsymbol{CA}}=0$ to simplify the construction. This cancellation is referred to as "amplitude aberration annihilation" by Lannes (1991).

For an $N$-station array with $N\gt 4$, the design matrix for a minimal set of log-closure amplitudes can be constructed using

Equation (B20)

where ${{\boldsymbol{C}}}_{N-1}$ is the design matrix for an array with $N-1$ stations, $0$ is an $\left(\tfrac{(N-1)(N-4)}{2}\right)\times \left(N-1\right)$ matrix of all zeros,

Equation (B21)

and ${{\boldsymbol{Y}}}_{N}$ is an $(N-2)\times \left(\displaystyle \genfrac{}{}{0em}{}{N-1}{2}\right)$ matrix constructed by "cycling" through pairs of baselines that do not contain the first station:

Equation (B22)

Here, matrix elements represented by a dot indicate zero-valued entries. Table B4 lists example log-closure amplitude design and covariance matrices.

Table B4.  Minimal Log-closure Amplitude Design and Covariance Matrices for a Four-element Array

    Number of Stations ($N$)
Matrix Shape $N=4$
${{\boldsymbol{a}}}^{\top }$ $1\times B$ $\left({a}_{12}\ {a}_{13}\ {a}_{14}\ {a}_{23}\ {a}_{24}\ {a}_{34}\right)$
${\boldsymbol{C}}$ q × $B$ $\left(\begin{array}{cccccc}0 & 1 & -1 & -1 & 1 & 0\\ 1 & 0 & -1 & -1 & 0 & 1\end{array}\right)$
${{\boldsymbol{\Sigma }}}_{{\boldsymbol{c}}}$ q × q $\left(\begin{array}{cc}{\sigma }_{13}^{2}+{\sigma }_{14}^{2}+{\sigma }_{23}^{2}+{\sigma }_{24}^{2} & {\sigma }_{14}^{2}+{\sigma }_{23}^{2}\\ {\sigma }_{14}^{2}+{\sigma }_{23}^{2} & {\sigma }_{12}^{2}+{\sigma }_{14}^{2}+{\sigma }_{23}^{2}+{\sigma }_{34}^{2}\end{array}\right)$

Note. Here, $q=\tfrac{N(N-3)}{2}$ is the number of quadrangles in a minimal set, and $B=\left(\displaystyle \genfrac{}{}{0em}{}{N}{2}\right)$ is the number of baselines.

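The four-station matrices of Table B4 can be verified directly (hypothetical NumPy code; the variances are arbitrary examples):

```python
import numpy as np

# Minimal log-closure amplitude design matrix for N = 4 (Table B4), acting
# on log amplitudes in the order (a_12, a_13, a_14, a_23, a_24, a_34)
C = np.array([[0., 1., -1., -1., 1., 0.],   # log(|V13||V24| / |V14||V23|)
              [1., 0., -1., -1., 0., 1.]])  # log(|V12||V34| / |V14||V23|)

# Log visibility amplitude design matrix A_4: each baseline picks up the
# sum of its two station log gain amplitudes
A = np.array([[1., 1., 0., 0.],
              [1., 0., 1., 0.],
              [1., 0., 0., 1.],
              [0., 1., 1., 0.],
              [0., 1., 0., 1.],
              [0., 0., 1., 1.]])

assert np.allclose(C @ A, 0)    # gain amplitudes cancel ("aberration annihilation")

sig2 = np.array([1., 2., 3., 4., 5., 6.])   # example thermal log amplitude variances
Sigma_c = C @ np.diag(sig2) @ C.T
print(Sigma_c)                  # matches the 2x2 covariance pattern of Table B4
```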

Appendix C: Worked Examples of Information Content

For an array of $N$ stations, an accounting of the nonredundant closure phases reveals that their number differs from the number of visibility phases by $N-1$. This offset, which is equal to the number of unique gain phases in the array, suggests that closure phases contain all of the source phase information and that the additional degrees of freedom afforded by the visibility phases only describe the gains. A similar situation holds for the closure amplitudes, where the number of nonredundant quadrangles differs from the number of visibility amplitudes by an amount equal to the number of gain amplitudes, $N$. In the limit where we have no a priori information about the gains, then, the information content in the visibility quantities should be identical to that contained within the closure quantities. In this section, we demonstrate this equality explicitly and show that it does not depend on the specific choice of nonredundant closure subset for some selected test cases.
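The counting above can be summarized explicitly in terms of the number of baselines $B$:

```latex
\begin{align}
  B &= \binom{N}{2} = \frac{N(N-1)}{2} && \text{(visibility phases or amplitudes)}, \\
  B - (N-1) &= \binom{N-1}{2} && \text{(independent closure phases)}, \\
  B - N &= \frac{N(N-3)}{2} && \text{(independent closure amplitudes)}.
\end{align}
```

For example, a four-station array has $B=6$ baselines, three independent closure phases, and two independent closure amplitudes, matching Tables B3 and B4.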

C.1. Closure Phase for N = 3 Stations

We consider a three-station interferometer with measured visibility phases (${\phi }_{12}$, ${\phi }_{13}$, ${\phi }_{23}$) and model visibility phases (${\hat{\phi }}_{12}$, ${\hat{\phi }}_{13}$, ${\hat{\phi }}_{23}$) related by the station gain phases (${\hat{\theta }}_{1}$, ${\hat{\theta }}_{2}$, ${\hat{\theta }}_{3}$) as

Equation (C1)

The information contained in the measured visibilities is captured by their joint likelihood distribution, ${ \mathcal L }$. If the visibility phases have Gaussian thermal variances (${\sigma }_{12}^{2}$, ${\sigma }_{13}^{2}$, ${\sigma }_{23}^{2}$), and if we assume that the gain contributions are also Gaussian distributed with variances (${\sigma }_{\theta ,1}^{2}$, ${\sigma }_{\theta ,2}^{2}$, ${\sigma }_{\theta ,3}^{2}$), then the likelihood of the measured visibility phases can be expressed as a multivariate Gaussian,

Equation (C2)

where

Equation (C3)

is the vector of visibility phase residuals, and ${{\boldsymbol{\Sigma }}}_{{\boldsymbol{\phi }}}$ is the visibility phase covariance matrix (see Appendix B.1). Because the likelihood is Gaussian and the variances are constant-valued, the quantity

Equation (C4)

contains the same information as ${ \mathcal L }$ in a more compact form; we will thus proceed through the use of ${\chi }_{\phi }^{2}$ rather than ${ \mathcal L }$.

Our expectation is that in the high-S/N limit (i.e., when the thermal noise is negligible compared to gain variations), the information content in the visibility phases will be identical to that in the closure phases; equivalently, ${\chi }_{\phi }^{2}$ for the visibility phases should equal ${\chi }_{\psi }^{2}$ for the closure phases. To simplify the mathematics and notation, let us now suppose that the array is perfectly homogeneous such that we can denote ${\sigma }_{\theta ,1}^{2}\,={\sigma }_{\theta ,2}^{2}={\sigma }_{\theta ,3}^{2}\equiv {\sigma }^{2}$ and ${\sigma }_{12}^{2}={\sigma }_{13}^{2}={\sigma }_{23}^{2}\equiv {\varepsilon }^{2}{\sigma }^{2}$. The high-S/N limit thus corresponds to ${\varepsilon }^{2}\ll 1$. To leading order in ${\varepsilon }^{2}$, the inverse of the covariance matrix is

Equation (C5)

which, in the same limit, corresponds to a ${\chi }_{\phi }^{2}$ of

Equation (C6)

Because the array contains only $N=3$ stations, $\left(\displaystyle \genfrac{}{}{0em}{}{N}{3}\right)=\left(\displaystyle \genfrac{}{}{0em}{}{N-1}{2}\right)$ and the complete set of closure phases is equal to the nonredundant set, both of which contain only a single element. We can write the model closure phase as ${\hat{\psi }}_{123}={\hat{\phi }}_{12}-{\hat{\phi }}_{13}+{\hat{\phi }}_{23}$ and the measured closure phase as ${\psi }_{123}={\phi }_{12}-{\phi }_{13}+{\phi }_{23}$, with corresponding thermal noise given by ${\sigma }_{123}^{2}={\sigma }_{12}^{2}+{\sigma }_{13}^{2}+{\sigma }_{23}^{2}=3\,{\varepsilon }^{2}{\sigma }^{2}$. The value of ${\chi }_{\psi }^{2}$ is then written simply as

Equation (C7)

which we can see is identical to Equation (C6).
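This limit can also be illustrated numerically. The sketch below (not from the paper; numerical values are arbitrary and numpy is assumed) builds the full visibility phase covariance ${{\boldsymbol{\Sigma }}}_{{\boldsymbol{\phi }}}$ for a homogeneous three-station array and verifies that ${\chi }_{\phi }^{2}\to {\chi }_{\psi }^{2}$ as ${\varepsilon }^{2}\to 0$:

```python
import numpy as np

sigma2 = 1.0   # gain-phase variance sigma^2
eps2 = 1e-8    # thermal-to-gain variance ratio epsilon^2 << 1 (high S/N)

# Visibility phase design matrix: phi_ij picks up theta_i - theta_j
# (baseline order 12, 13, 23).
Phi = np.array([[1, -1,  0],
                [1,  0, -1],
                [0,  1, -1]], dtype=float)

# Sigma_phi = thermal noise + gain-phase contributions.
Sigma_phi = eps2 * sigma2 * np.eye(3) + sigma2 * (Phi @ Phi.T)

rng = np.random.default_rng(0)
phi_res = rng.normal(size=3)               # arbitrary residual visibility phases

chi2_phi = phi_res @ np.linalg.solve(Sigma_phi, phi_res)

psi_res = phi_res[0] - phi_res[1] + phi_res[2]   # closure phase residual
chi2_psi = psi_res**2 / (3 * eps2 * sigma2)      # sigma_123^2 = 3 eps^2 sigma^2

assert np.isclose(chi2_phi, chi2_psi, rtol=1e-4)
```

The agreement improves as `eps2` shrinks, since the gain-dominated directions of ${{\boldsymbol{\Sigma }}}_{{\boldsymbol{\phi }}}$ contribute negligibly to ${\chi }_{\phi }^{2}$ in that limit.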

C.2. Closure Phase for N = 4 Stations

We consider now a four-station interferometer with measured visibility phases (${\phi }_{12}$, ${\phi }_{13}$, ${\phi }_{14}$, ${\phi }_{23}$, ${\phi }_{24}$, ${\phi }_{34}$) and model visibility phases (${\hat{\phi }}_{12}$, ${\hat{\phi }}_{13}$, ${\hat{\phi }}_{14}$, ${\hat{\phi }}_{23}$, ${\hat{\phi }}_{24}$, ${\hat{\phi }}_{34}$) related by the station gain phases (${\hat{\theta }}_{1}$, ${\hat{\theta }}_{2}$, ${\hat{\theta }}_{3}$, ${\hat{\theta }}_{4}$) as specified in Equation (C1). Following the same procedure as in the previous section, the inverse of the covariance matrix in the high-S/N limit is

Equation (C8)

corresponding to

Equation (C9)

The four-station array has four closure phases in total, of which three are nonredundant. We specify a measured closure phase ${\psi }_{{ijk}}$ as

Equation (C10)

with an analogous specification for the corresponding model closure phase ${\hat{\psi }}_{{ijk}}$. For a particular choice of nonredundant closure phase subset, the value of ${\chi }_{\psi }^{2}$ will depend on the covariance matrix ${{\boldsymbol{\Sigma }}}_{{\boldsymbol{\psi }}}$ for the closure phases (see Appendix B.2 and Equation (B9)) and on the vector $\tilde{{\boldsymbol{\psi }}}$ of closure phase residuals,

Equation (C11)

After computing the inverse of ${{\boldsymbol{\Sigma }}}_{\psi }$,

Equation (C12)

it is a tedious but straightforward algebraic exercise to obtain

Equation (C13)

Because closure phases are constructed purely from sums and differences of visibility phases, ${\tilde{\psi }}_{{ijk}}={\tilde{\phi }}_{{ij}}-{\tilde{\phi }}_{{ik}}+{\tilde{\phi }}_{{jk}}$, and thus, Equation (C13) is equivalent to Equation (C9). Furthermore, Equation (C13) no longer shows any signature of the original nonredundant closure phase subset choice; rather, each element of the full redundant set of four closure phases is represented equally, and the ${\chi }_{\psi }^{2}$ includes a 3/4 redundancy correction factor (see Equation (38)) corresponding to the ratio of linearly independent to total closure phases (note that ${\sigma }_{{ijk}}^{2}=3\,{\varepsilon }^{2}{\sigma }^{2}$).
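The independence of ${\chi }_{\psi }^{2}$ from the choice of nonredundant subset can be verified numerically, even for heterogeneous thermal noise. A sketch (not from the paper; numpy assumed, variances arbitrary) for the $N=4$ array:

```python
import numpy as np

# Closure phase design rows psi_ijk = phi_ij - phi_ik + phi_jk
# (baseline order 12, 13, 14, 23, 24, 34).
rows = {
    (1, 2, 3): [1, -1,  0, 1,  0, 0],
    (1, 2, 4): [1,  0, -1, 0,  1, 0],
    (1, 3, 4): [0,  1, -1, 0,  0, 1],
    (2, 3, 4): [0,  0,  0, 1, -1, 1],
}

rng = np.random.default_rng(1)
S = np.diag(rng.uniform(0.5, 2.0, size=6))   # heterogeneous thermal variances
phi_res = rng.normal(size=6)                 # residual visibility phases

def chi2(triangles):
    Psi = np.array([rows[t] for t in triangles], dtype=float)
    Sigma_psi = Psi @ S @ Psi.T              # closure phase covariance
    psi_res = Psi @ phi_res
    return psi_res @ np.linalg.solve(Sigma_psi, psi_res)

# Three different nonredundant subsets of the four closure phases.
sets = [[(1, 2, 3), (1, 2, 4), (1, 3, 4)],
        [(1, 2, 3), (1, 2, 4), (2, 3, 4)],
        [(1, 2, 4), (1, 3, 4), (2, 3, 4)]]
vals = [chi2(s) for s in sets]
assert np.allclose(vals, vals[0])
```

Any minimal set spans the same three-dimensional closure subspace (e.g., ${\psi }_{234}={\psi }_{123}-{\psi }_{124}+{\psi }_{134}$), so the resulting ${\chi }_{\psi }^{2}$ is identical.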

C.3. Closure Amplitude for N = 4 Stations

We consider again a four-station interferometer with measured log visibility amplitudes (${a}_{12}$, ${a}_{13}$, ${a}_{14}$, ${a}_{23}$, ${a}_{24}$, ${a}_{34}$) and model log visibility amplitudes (${\hat{a}}_{12}$, ${\hat{a}}_{13}$, ${\hat{a}}_{14}$, ${\hat{a}}_{23}$, ${\hat{a}}_{24}$, ${\hat{a}}_{34}$) related by the log station gain amplitudes (${\hat{g}}_{1}$, ${\hat{g}}_{2}$, ${\hat{g}}_{3}$, ${\hat{g}}_{4}$) as

Equation (C14)

As in Appendix C.1, if the measured log visibility amplitudes have Gaussian thermal variances (${\sigma }_{12}^{2}$, ${\sigma }_{13}^{2}$, ${\sigma }_{14}^{2}$, ${\sigma }_{23}^{2}$, ${\sigma }_{24}^{2}$, ${\sigma }_{34}^{2}$) and the log gain amplitude contributions are also Gaussian distributed with variances (${\sigma }_{g,1}^{2}$, ${\sigma }_{g,2}^{2}$, ${\sigma }_{g,3}^{2}$, ${\sigma }_{g,4}^{2}$), then the joint distribution of the measured log visibility amplitudes can be expressed as a multivariate Gaussian. The covariance matrix for this distribution can be constructed using the procedure described in Appendix B.1.

If we once again treat the array as perfectly homogeneous and take the high-S/N limit, then to leading order in ${\varepsilon }^{2}$, we find

Equation (C15)

The corresponding ${\chi }_{a}^{2}={\tilde{{\boldsymbol{a}}}}^{\top }{{\boldsymbol{\Sigma }}}_{{\boldsymbol{a}}}^{-1}\tilde{{\boldsymbol{a}}}$ can then be written

Equation (C16)

The four-station array has three closure amplitudes in total, of which two are nonredundant. We specify a measured log-closure amplitude ${c}_{{ijk}{\ell }}$ as

Equation (C17)

with an analogous specification for the corresponding model log-closure amplitude ${\hat{c}}_{{ijk}{\ell }}$. For a particular choice of nonredundant closure amplitude subset, the value of ${\chi }_{c}^{2}$ will depend on the covariance matrix ${{\boldsymbol{\Sigma }}}_{{\boldsymbol{c}}}$ for the log-closure amplitudes (see Appendix B.3 and Equation (B19)) and on the vector $\tilde{{\boldsymbol{c}}}$ of log-closure amplitude residuals,

Equation (C18)

Written out more explicitly, the covariance matrix is given by

Equation (C19)

with corresponding inverse

Equation (C20)

We thus obtain

Equation (C21)

which is equal to Equation (C16). As with the closure phases, we see that the initial choice of minimal log-closure amplitude subset has no bearing on the value of ${\chi }_{c}^{2}$, which accounts for the redundancy factor of total versus linearly independent closure amplitudes.
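The log-closure amplitude covariance construction ${{\boldsymbol{\Sigma }}}_{{\boldsymbol{c}}}={\boldsymbol{C}}\,{\rm{diag}}({\sigma }_{{ij}}^{2})\,{{\boldsymbol{C}}}^{\top }$ underlying Table B4 can be checked in a few lines (a sketch, not from the paper; numpy assumed, variance values arbitrary):

```python
import numpy as np

# Thermal variances sigma_ij^2 in baseline order 12, 13, 14, 23, 24, 34.
s = np.array([1.1, 0.7, 1.3, 0.9, 1.5, 0.8])

# Minimal log-closure amplitude design matrix C for N = 4 (Table B4).
C = np.array([[0, 1, -1, -1, 1, 0],
              [1, 0, -1, -1, 0, 1]], dtype=float)

Sigma_c = C @ np.diag(s) @ C.T

# Entries quoted in Table B4:
# [[s13+s14+s23+s24, s14+s23], [s14+s23, s12+s14+s23+s34]]
expected = np.array([[s[1] + s[2] + s[3] + s[4], s[2] + s[3]],
                     [s[2] + s[3], s[0] + s[2] + s[3] + s[5]]])
assert np.allclose(Sigma_c, expected)
```

The off-diagonal terms arise from the baselines (14 and 23 here) shared between the two quadrangles.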

C.4. Closure Quantities for Arbitrary N

We introduce the notion of "mixed phases," which retain all of the information contained in the visibility phases but separate that information into two components: one captured by the closure phases, and a second capturing the remaining station-based effects. For an array with $N$ stations, the mixed phase design matrix operates on the $B$ baseline phases and is given by

Equation (C22)

${{\boldsymbol{I}}}_{N-1,B}$ is an $(N-1)\times B$ "rectangular identity matrix" that extracts the first $N-1$ baseline phases by combining ${{\boldsymbol{I}}}_{N-1}$, a standard square identity matrix of rank $N-1$, with $0$, an $(N-1)\times (B-N+1)$ matrix of all zeros. ${{\boldsymbol{\Psi }}}_{N}$ is the minimal closure phase design matrix for $N$ stations (see Equation (B13)), which can be expanded into the visibility phase design matrix ${{\boldsymbol{\Phi }}}_{N-1}$ for $N-1$ stations (see Equation (B4)) and a standard square identity matrix of rank $\left(\displaystyle \genfrac{}{}{0em}{}{N-1}{2}\right)$. This design matrix maps from the visibility phase space to the mixed phase space,

Equation (C23)

For example, the mixed phase design matrix for an array with N = 4 stations is given by

Equation (C24)

and the corresponding mixed phase vector is

Equation (C25)

The mixed phase covariance matrix is given by

Equation (C26)

Using the inverse of the mixed phase design matrix,

Equation (C27)

we can invert Equation (C26) to obtain an expression for the visibility phase covariance matrix in terms of the mixed phase covariance matrix,

Equation (C28)

where we note that the inverse transpose is equal to the transposed inverse for the mixed phase design matrix. We can use the above to substitute for ${{\boldsymbol{\Sigma }}}_{{\boldsymbol{\phi }}}$ in our expression for the visibility phase ${\chi }^{2}$ (see Equation (C4)),

Equation (C29)

revealing that the ${\chi }^{2}$ constructed from mixed phases is equal to that constructed from visibility phases when all covariances are taken into account. This equality holds because the mixed phases are generated through a non-singular linear transformation of the visibility phases.
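The underlying fact is generic: ${\chi }^{2}$ is invariant under any non-singular linear transformation of the residual vector, provided the covariance is transformed consistently. A short numerical sketch (not from the paper; numpy assumed):

```python
import numpy as np

rng = np.random.default_rng(2)
n = 6
x = rng.normal(size=n)                    # residual vector
L = rng.normal(size=(n, n))
Sigma = L @ L.T + n * np.eye(n)           # a positive-definite covariance

M = rng.normal(size=(n, n))               # random square matrix:
                                          # non-singular with probability 1

chi2_x = x @ np.linalg.solve(Sigma, x)

# Transform both the residuals and the covariance.
y = M @ x
chi2_y = y @ np.linalg.solve(M @ Sigma @ M.T, y)

assert np.isclose(chi2_x, chi2_y)
```

This is why the mixed phase construction loses nothing relative to the visibility phases.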

To see how the mixed phases reduce to purely closure phases in the high-S/N limit, it is convenient to consider the following decomposition of the mixed phase covariance matrix:

Equation (C30)

where ${{\boldsymbol{\Sigma }}}_{{\boldsymbol{\phi }}}^{{\prime} }$ is the first (N − 1) × (N − 1) upper left subset of the full visibility phase covariance matrix ${{\boldsymbol{\Sigma }}}_{{\boldsymbol{\phi }}}$, ${{\boldsymbol{\Sigma }}}_{{\boldsymbol{\psi }}}$ is the closure phase covariance matrix, and ${\boldsymbol{W}}={{\boldsymbol{\Phi }}}_{N-1}\,{{\boldsymbol{S}}}_{{\boldsymbol{\phi }}}^{{\prime} }$ is the covariance between the closure phases and the first $N-1$ visibility phases. Since the closure phases are independent of station gain, both ${\boldsymbol{W}}$ and ${{\boldsymbol{\Sigma }}}_{{\boldsymbol{\psi }}}$ include only baseline thermal noise.

Using a strategy analogous to that employed in the previous sections, where the parameter ${\varepsilon }^{2}\sim { \mathcal O }({\sigma }_{{ij}}^{2}/{\sigma }_{\theta ,i}^{2})$ relates the statistical error in visibility phase to that from gain uncertainty, we examine the behavior of ${{\boldsymbol{\Sigma }}}_{{\boldsymbol{\psi }}+}^{-1}$ as $\varepsilon \to 0$. The sub-matrices of ${{\boldsymbol{\Sigma }}}_{{\boldsymbol{\psi }}+}$ scale with $\varepsilon $ as ${{\boldsymbol{\Sigma }}}_{{\boldsymbol{\phi }}}^{{\prime} }\sim { \mathcal O }({\varepsilon }^{0})$ and ${\boldsymbol{W}},{{\boldsymbol{\Sigma }}}_{{\boldsymbol{\psi }}}\sim { \mathcal O }({\varepsilon }^{2})$, since ${\boldsymbol{W}}$ and ${{\boldsymbol{\Sigma }}}_{{\boldsymbol{\psi }}}$ contain only baseline thermal noise.

From the block matrix form of Equation (C30), we can write the inverse mixed phase covariance matrix as

Equation (C31)

These four sub-matrices scale with $\varepsilon $ as ${ \mathcal O }({\varepsilon }^{0})$ for the upper left and off-diagonal blocks and as ${ \mathcal O }({\varepsilon }^{-2})$ for the lower right block, so the ${{\boldsymbol{\Sigma }}}_{{\boldsymbol{\psi }}}^{-1}$ term in the lower right sub-matrix dominates ${{\boldsymbol{\Sigma }}}_{{\boldsymbol{\psi }}+}^{-1}$. The product of ${{\boldsymbol{\Sigma }}}_{{\boldsymbol{\psi }}+}^{-1}$ with ${\tilde{{\boldsymbol{\psi }}}}_{+}$ in this limit will therefore serve to isolate the last $\left(\displaystyle \genfrac{}{}{0em}{}{N-1}{2}\right)$ terms of ${\tilde{{\boldsymbol{\psi }}}}_{+}$ (which are just the closure phases $\tilde{{\boldsymbol{\psi }}}$) and multiply them by ${{\boldsymbol{\Sigma }}}_{{\boldsymbol{\psi }}}^{-1}$:

Equation (C32)

The final result is that the visibility phase ${\chi }^{2}$, in the limit where the uncertainty in the baseline-based quantities is much smaller than the uncertainty in the station-based quantities, is equal to the closure phase ${\chi }^{2}$.

The equivalence between ${\chi }_{a}^{2}$ derived from a complete set of log visibility amplitudes in the $\varepsilon \to 0$ limit and ${\chi }_{c}^{2}$ derived from log-closure amplitudes can be demonstrated in the same way. Here, the corresponding design matrix and mixed log amplitudes are

Equation (C33)

which draws from the first $N$ visibility log amplitudes followed by a minimal set of log-closure amplitudes. The transformation to mixed quantities is non-singular, as long as the first $N$ baselines drawn do not form any closed quadrangles (or the first $N-1$ baselines do not form any closed triangles, in the case of mixed phases). This condition is met by the baseline ordering convention used in this paper. The reduction in Equations (C30)–(C32) then follows under substitution of phase with log-amplitude quantities.
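The non-singularity of the mixed transformation can be made concrete for $N=4$ phases. In the sketch below (not from the paper; numpy assumed, baseline order 12, 13, 14, 23, 24, 34), the first $N-1=3$ rows extract ${\phi }_{12}$, ${\phi }_{13}$, ${\phi }_{14}$ (no closed triangles among them), and the remaining rows are a minimal closure phase set:

```python
import numpy as np

Psi_plus = np.array([[1,  0,  0, 0, 0, 0],    # phi_12
                     [0,  1,  0, 0, 0, 0],    # phi_13
                     [0,  0,  1, 0, 0, 0],    # phi_14
                     [1, -1,  0, 1, 0, 0],    # psi_123
                     [1,  0, -1, 0, 1, 0],    # psi_124
                     [0,  1, -1, 0, 0, 1]],   # psi_134
                    dtype=float)

# Block lower-triangular with identity diagonal blocks, so det = 1:
# the map from visibility phases to mixed phases is invertible.
assert np.isclose(np.linalg.det(Psi_plus), 1.0)
```

If the extracted baselines did form a closed triangle, the corresponding closure phase row would be linearly dependent on them and the determinant would vanish.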

C.5. Explicit Gain Marginalization

In Appendices C.1–C.4, we have shown that the information content of the closure quantities is equivalent to that of the baseline visibilities in the limit of completely unconstrained gains, so long as the covariance structure of the corresponding observables is taken into account. We now relate the use of covariance in the residual visibility likelihood construction (Equation (C2)) to explicit analytic marginalization over Gaussian uncertainties in station gain phase or log amplitude. In the limit of completely unconstrained gains, the use of closure quantities should therefore give identical results to explicit numerical marginalization over all possible gains; and, for the case of finite Gaussian uncertainties in station gain phase or log amplitude, the use of the residual visibility covariance should give identical results to explicit numerical marginalization over Gaussian priors on the gains.

For an array with N stations under modeled gain corrections, we can write Equation (C14) for all baselines using

Equation (C34)

where ${\boldsymbol{A}}\tilde{{\boldsymbol{g}}}$ is a residual vector of log visibility amplitude correction factors. For example, a three-station array would have

Equation (C35)

The Gaussian likelihood for calibrated log visibility amplitudes is then expressed as

Equation (C36)

where we note that ${{\boldsymbol{S}}}_{{\boldsymbol{a}}}$ contains only baseline thermal noise and is diagonal.

If we further impose independent zero-mean Gaussian priors on each of the model gain correction factors, we can express the joint prior as

Equation (C37)

We can use this prior to marginalize the log visibility amplitudes over the log gain amplitudes,

Equation (C38)

where ${{ \mathcal L }}_{a}$ represents the marginalized likelihood.

The integrand in Equation (C38) is a product of exponentials, which together contain several terms that depend on $\tilde{{\boldsymbol{g}}}$. To evaluate the integral, we would like to consolidate these terms. By defining

Equation (C39)

Equation (C40)

completing the square, and then pulling terms that do not depend on $\tilde{{\boldsymbol{g}}}$ out of the integral, we obtain

Equation (C41)

The integrand now contains only a single multivariate Gaussian in $\tilde{{\boldsymbol{g}}}$, with mean ${{\boldsymbol{M}}}^{-1}{\boldsymbol{\mu }}$ and covariance ${{\boldsymbol{M}}}^{-1}$. Integrating over all $\tilde{{\boldsymbol{g}}}$ thus yields the volume $\sqrt{\det \left(2\pi {{\boldsymbol{M}}}^{-1}\right)}$, so that

Equation (C42)

Upon expanding ${{\boldsymbol{\mu }}}^{\top }{{\boldsymbol{M}}}^{-1}{\boldsymbol{\mu }}$ and directly applying the Woodbury matrix inverse identity, we obtain

Equation (C43)

where we have obtained the log visibility amplitude covariance ${{\boldsymbol{\Sigma }}}_{{\boldsymbol{a}}}={{\boldsymbol{S}}}_{{\boldsymbol{a}}}+{\boldsymbol{A}}{{\boldsymbol{\Sigma }}}_{{\boldsymbol{g}}}{{\boldsymbol{A}}}^{\top }$ analogous to Equation (B3).

For the determinants in the normalization constant,

Equation (C44)

Equation (C45)

Equation (C46)

Equation (C47)

Here, we have used the Weinstein–Aronszajn matrix identity $\det \left({\boldsymbol{I}}+{\boldsymbol{XY}}\right)=\det \left({\boldsymbol{I}}+{\boldsymbol{YX}}\right)$. The marginalized likelihood can thus be written as

Equation (C48)

showing that the marginalization over Gaussian priors in log gain amplitude is fully captured through the use of visibility covariance as in Equation (C2). The derivation applies to any linear transformation of independent Gaussian observables. In particular, it is the same for partially known visibility phases under the substitution $({\boldsymbol{a}},{\boldsymbol{A}},{\boldsymbol{g}})\to ({\boldsymbol{\phi }},{\boldsymbol{\Phi }},{\boldsymbol{\theta }})$.
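The algebra in Equations (C43)–(C47) rests on the Woodbury inverse and the Weinstein–Aronszajn determinant identity, both of which can be checked numerically. A sketch (not from the paper; numpy assumed, variance values arbitrary) for $N=4$ log amplitudes:

```python
import numpy as np

rng = np.random.default_rng(3)
N, B = 4, 6

# Log-amplitude design matrix A: a_ij picks up g_i + g_j
# (baseline order 12, 13, 14, 23, 24, 34).
A = np.array([[1, 1, 0, 0],
              [1, 0, 1, 0],
              [1, 0, 0, 1],
              [0, 1, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 1]], dtype=float)

S_a = np.diag(rng.uniform(0.5, 2.0, size=B))      # thermal-only, diagonal
Sigma_g = np.diag(rng.uniform(0.5, 2.0, size=N))  # Gaussian gain priors

# Marginalized covariance (analog of Equation (B3)).
Sigma_a = S_a + A @ Sigma_g @ A.T

# Woodbury route: M = Sigma_g^{-1} + A^T S_a^{-1} A.
S_inv = np.linalg.inv(S_a)
M = np.linalg.inv(Sigma_g) + A.T @ S_inv @ A
Sigma_a_inv = S_inv - S_inv @ A @ np.linalg.inv(M) @ A.T @ S_inv
assert np.allclose(Sigma_a_inv, np.linalg.inv(Sigma_a))

# Determinant factorization from the normalization constant:
# det(Sigma_a) = det(S_a) det(Sigma_g) det(M).
assert np.isclose(np.linalg.det(Sigma_a),
                  np.linalg.det(S_a) * np.linalg.det(Sigma_g) * np.linalg.det(M))
```

Together, these two identities reproduce the full marginalized Gaussian likelihood of Equation (C48).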

Appendix D: Notation

Table D1 lists the notation used throughout this paper. Vector quantities reflect values taken over a common set of recorded signals at $N$ antennas. Throughout, we distinguish measured values (no accent) from forward-model values (hat accent) and from model residuals (measured minus model value, tilde accent).

Table D1.  Notation Used in This Paper

    Measured Value   Model Parameter   Residual   Residual Error
Quantity Number Single Vector   Single Vector   Single Vector   Variance Covariance Design
Complex gain $N$ ${\gamma }_{i}$ $...$   ${\hat{\gamma }}_{i}$ $...$   ${\tilde{\gamma }}_{i}$ $...$   $...$ $...$ $...$
Complex visibility $B$ ${V}_{{ij}}$ $...$   ${\hat{V}}_{{ij}}$ $...$   ${\tilde{V}}_{{ij}}$ $...$   $2\,{\sigma }_{V,{ij}}^{2}$ (thermal only) $...$ $...$
Gain phase $N$ ${\theta }_{i}$ ${\boldsymbol{\theta }}$   ${\hat{\theta }}_{i}$ $\hat{{\boldsymbol{\theta }}}$   ${\tilde{\theta }}_{i}$ $\tilde{{\boldsymbol{\theta }}}$   ${\sigma }_{\theta ,i}^{2}$ ${{\boldsymbol{\Sigma }}}_{{\boldsymbol{\theta }}}$ $...$
Visibility phase $B$ ${\phi }_{{ij}}$ ${\boldsymbol{\phi }}$   ${\hat{\phi }}_{{ij}}$ $\hat{{\boldsymbol{\phi }}}$   ${\tilde{\phi }}_{{ij}}$ $\tilde{{\boldsymbol{\phi }}}$   ${\sigma }_{{ij}}^{2}+{\sigma }_{\theta ,i}^{2}+{\sigma }_{\theta ,j}^{2}$ ${{\boldsymbol{\Sigma }}}_{{\boldsymbol{\phi }}}$ ${\boldsymbol{\Phi }}$
Closure phase $T$ ${\psi }_{{ijk}}$ ${\boldsymbol{\psi }}$   ${\hat{\psi }}_{{ijk}}$ $\hat{{\boldsymbol{\psi }}}$   ${\tilde{\psi }}_{{ijk}}$ $\tilde{{\boldsymbol{\psi }}}$   ${\sigma }_{{ijk}}^{2}$ ${{\boldsymbol{\Sigma }}}_{{\boldsymbol{\psi }}}$ ${\boldsymbol{\Psi }}$
Gain amplitude $N$ ${G}_{i}$ $...$   ${\hat{G}}_{i}$ $...$   ${\tilde{G}}_{i}$ $...$   $...$ $...$ $...$
Visibility amplitude $B$ ${A}_{{ij}}$ $...$   ${\hat{A}}_{{ij}}$ $...$   ${\tilde{A}}_{{ij}}$ $...$   $...$ $...$ $...$
Closure amplitude $Q$ ${C}_{{ijkl}}$ $...$   ${\hat{C}}_{{ijkl}}$ $...$   ${\tilde{C}}_{{ijkl}}$ $...$   $...$ $...$ $...$
Log gain amplitude $N$ ${g}_{i}$ ${\boldsymbol{g}}$   ${\hat{g}}_{i}$ $\hat{{\boldsymbol{g}}}$   ${\tilde{g}}_{i}$ $\tilde{{\boldsymbol{g}}}$   ${\sigma }_{g,i}^{2}$ ${{\boldsymbol{\Sigma }}}_{{\boldsymbol{g}}}$ $...$
Log visibility amplitude $B$ ${a}_{{ij}}$ ${\boldsymbol{a}}$   ${\hat{a}}_{{ij}}$ $\hat{{\boldsymbol{a}}}$   ${\tilde{a}}_{{ij}}$ $\tilde{{\boldsymbol{a}}}$   ${\sigma }_{{ij}}^{2}+{\sigma }_{g,i}^{2}+{\sigma }_{g,j}^{2}$ ${{\boldsymbol{\Sigma }}}_{{\boldsymbol{a}}}$ ${\boldsymbol{A}}$
Log-closure amplitude $Q$ ${c}_{{ijkl}}$ ${\boldsymbol{c}}$   ${\hat{c}}_{{ijkl}}$ $\hat{{\boldsymbol{c}}}$   ${\tilde{c}}_{{ijkl}}$ $\tilde{{\boldsymbol{c}}}$   ${\sigma }_{{ijkl}}^{2}$ ${{\boldsymbol{\Sigma }}}_{{\boldsymbol{c}}}$ ${\boldsymbol{C}}$

Note. Measured values reflect contributions from thermal noise and systematic errors. The residual error represents the expected covariance across a set of simultaneously observed residual quantities (measured minus model values). For visibilities, the residual covariance includes contributions from both thermal (statistical) and gain (systematic) errors in the general case. At high S/N, the variances for phase and log amplitude are equal; thus, we use the same symbol ${\sigma }_{{ij}}^{2}\approx {\sigma }_{V,{ij}}^{2}/{A}_{{ij}}^{2}$ for notational simplicity. ${\sigma }_{V,{ij}}^{2}$ reflects the thermal noise in one component of the complex visibility, and it is known a priori to high precision.


Footnotes

  • This quadrangle can be easily cast as a complex closure quantity, but doing so provides no phase information beyond the set of complex bispectra. When taken over the four polarization feeds of a single baseline, however, e.g., $({V}_{\mathrm{LR}}\,{V}_{\mathrm{RL}})/({V}_{\mathrm{LL}}\,{V}_{\mathrm{RR}})$, such a construction can provide some information about delay closure and/or polarization fraction.

  • In principle, the source contribution $E\left[{{ \mathcal E }}_{i}^{}{{ \mathcal E }}_{j}^{\ast }\right]$ is also subject to statistical fluctuations, i.e., the self noise of Kulkarni (1989), but these are strongly subdominant to uncertainties from thermal noise in the weak-signal limit $E\left[| n{| }^{2}\right]\gg E\left[| \gamma { \mathcal E }{| }^{2}\right]$ (Section 2.1).

  • Although we focus here on the covariance of thermal noise, which contributes to covariance in the residual measured quantities under a true source model, we note that the same relationships also hold for non-closing baseline errors (Massi et al. 1991). Such errors are particularly straightforward to incorporate into the analysis if modeled as additional independent Gaussian systematic error in baseline quantities. The same covariance relationships also hold for variations in structure closure phases, and the analysis is relevant for isolating independent structural variability degrees of freedom that are measured across the array.

  • The Gaussian limit is appropriate for S/N ≳ a few (Section 2.4). Even at low S/N, closure quantities formed from a single set of baseline visibilities will be dependent by construction, but their statistical dependence cannot be characterized by a multivariate Gaussian due to nonlinear effects. However, in the special case of coherent ensemble averages over a very large number of closure quantities that have more than one low-S/N baseline, the closure quantities become approximately statistically independent. This is often the case for bispectral averaging in optical interferometry, where the atmospheric coherence time is extremely short (e.g., Kulkarni et al. 1991).

  • 10 

    Note that it is almost never the case that strictly equal representation of all baselines is possible; for closure phases, only the N = 3 and N = 6 arrays can achieve perfect balance (with each baseline represented exactly once or twice, respectively), while log-closure amplitudes are limited to only the N = 5 array (with each baseline represented exactly twice) and N = 9 array (with each baseline represented exactly three times).

  • 11 

    An example such set is {${\psi }_{123}$, ${\psi }_{124}$, ${\psi }_{135}$, ${\psi }_{146}$, ${\psi }_{156}$, ${\psi }_{236}$, ${\psi }_{245}$, ${\psi }_{256}$, ${\psi }_{345}$, ${\psi }_{346}$}.

  • 12 

    An example such set is {${c}_{1234}$, ${c}_{1245}$, ${c}_{1352}$, ${c}_{1453}$, ${c}_{2345}$}.

  • 13 

    The construction does not impose a trivial phase for the zero baseline, but it can be assumed by explicitly removing the corresponding row from ${\boldsymbol{R}}$. Doing so leaves one remaining phase structure degree of freedom corresponding to the single open triangle. There is also a trivial closure amplitude for the case of a colocated pair of sites where each baseline in the numerator has a matching baseline in the denominator with the same amplitude. In this case, the trivial behavior is already fully captured by the closure amplitude design matrix ${\boldsymbol{C}}$ and does not need to be taken a priori.
