First Sagittarius A* Event Horizon Telescope Results. IV. Variability, Morphology, and Black Hole Mass

Event Horizon Telescope Collaboration; Kazunori Akiyama; Antxon Alberdi; Walter Alef; Juan Carlos Algaba; Richard Anantua; Keiichi Asada; Rebecca Azulay; Uwe Bach; Anne-Kathrin Baczko; David Ball; Mislav Baloković; John Barrett; Michi Bauböck; Bradford A. Benson; Dan Bintley; Lindy Blackburn; Raymond Blundell; Katherine L. Bouman; Geoffrey C. Bower; Hope Boyce; Michael Bremer; Christiaan D. Brinkerink; Roger Brissenden; Silke Britzen; Avery E. Broderick; Dominique Broguiere; Thomas Bronzwaer; Sandra Bustamante; Do-Young Byun; John E. Carlstrom; Chiara Ceccobello; Andrew Chael; Chi-kwan Chan; Koushik Chatterjee; Shami Chatterjee; Ming-Tang Chen; Yongjun Chen; Xiaopeng Cheng; Ilje Cho; Pierre Christian; Nicholas S. Conroy; John E. Conway; James M. Cordes; Thomas M. Crawford; Geoffrey B. Crew; Alejandro Cruz-Osorio; Yuzhu Cui; Jordy Davelaar; Mariafelicia De Laurentis; Roger Deane; Jessica Dempsey; Gregory Desvignes; Jason Dexter; Vedant Dhruv; Sheperd S. Doeleman; Sean Dougal; Sergio A. Dzib; Ralph P. Eatough; Razieh Emami; Heino Falcke; Joseph Farah; Vincent L. Fish; Ed Fomalont; H. Alyson Ford; Raquel Fraga-Encinas; William T. Freeman; Per Friberg; Christian M. Fromm; Antonio Fuentes; Peter Galison; Charles F. Gammie; Roberto García; Olivier Gentaz; Boris Georgiev; Ciriaco Goddi; Roman Gold; Arturo I. Gómez-Ruiz; José L. Gómez; Minfeng Gu; Mark Gurwell; Kazuhiro Hada; Daryl Haggard; Kari Haworth; Michael H. Hecht; Ronald Hesper; Dirk Heumann; Luis C. Ho; Paul Ho; Mareki Honma; Chih-Wei L. Huang; Lei Huang; David H. Hughes; Shiro Ikeda; C. M. Violette Impellizzeri; Makoto Inoue; Sara Issaoun; David J. James; Buell T. Jannuzi; Michael Janssen; Britton Jeter; Wu Jiang; Alejandra Jiménez-Rosales; Michael D. Johnson; Svetlana Jorstad; Abhishek V. Joshi; Taehyun Jung; Mansour Karami; Ramesh Karuppusamy; Tomohisa Kawashima; Garrett K. 
Keating; Mark Kettenis; Dong-Jin Kim; Jae-Young Kim; Jongsoo Kim; Junhan Kim; Motoki Kino; Jun Yi Koay; Prashant Kocherlakota; Yutaro Kofuji; Patrick M. Koch; Shoko Koyama; Carsten Kramer; Michael Kramer; Thomas P. Krichbaum; Cheng-Yu Kuo; Noemi La Bella; Tod R. Lauer; Daeyoung Lee; Sang-Sung Lee; Po Kin Leung; Aviad Levis; Zhiyuan Li; Rocco Lico; Greg Lindahl; Michael Lindqvist; Mikhail Lisakov; Jun Liu; Kuo Liu; Elisabetta Liuzzo; Wen-Ping Lo; Andrei P. Lobanov; Laurent Loinard; Colin J. Lonsdale; Ru-Sen Lu; Jirong Mao; Nicola Marchili; Sera Markoff; Daniel P. Marrone; Alan P. Marscher; Iván Martí-Vidal; Satoki Matsushita; Lynn D. Matthews; Lia Medeiros; Karl M. Menten; Daniel Michalik; Izumi Mizuno; Yosuke Mizuno; James M. Moran; Kotaro Moriyama; Monika Moscibrodzka; Cornelia Müller; Alejandro Mus; Gibwa Musoke; Ioannis Myserlis; Andrew Nadolski; Hiroshi Nagai; Neil M. Nagar; Masanori Nakamura; Ramesh Narayan; Gopal Narayanan; Iniyan Natarajan; Antonios Nathanail; Santiago Navarro Fuentes; Joey Neilsen; Roberto Neri; Chunchong Ni; Aristeidis Noutsos; Michael A. Nowak; Junghwan Oh; Hiroki Okino; Héctor Olivares; Gisela N. Ortiz-León; Tomoaki Oyama; Daniel C. M. Palumbo; Georgios Filippos Paraschos; Jongho Park; Harriet Parsons; Nimesh Patel; Ue-Li Pen; Dominic W. Pesce; Vincent Piétu; Richard Plambeck; Aleksandar PopStefanija; Oliver Porth; Felix M. Pötzl; Ben Prather; Jorge A. Preciado-López; Hung-Yi Pu; Venkatessh Ramakrishnan; Ramprasad Rao; Mark G. Rawlings; Alexander W. Raymond; Luciano Rezzolla; Angelo Ricarte; Bart Ripperda; Freek Roelofs; Alan Rogers; Eduardo Ros; Cristina Romero-Cañizales; Arash Roshanineshat; Helge Rottmann; Alan L. Roy; Ignacio Ruiz; Chet Ruszczyk; Kazi L. J. Rygl; Salvador Sánchez; David Sánchez-Argüelles; Miguel Sánchez-Portal; Mahito Sasada; Kaushik Satapathy; Tuomas Savolainen; F. 
Peter Schloerb; Jonathan Schonfeld; Karl-Friedrich Schuster; Lijing Shao; Zhiqiang Shen; Des Small; Bong Won Sohn; Jason SooHoo; Kamal Souccar; He Sun; Fumie Tazaki; Alexandra J. Tetarenko; Paul Tiede; Remo P. J. Tilanus; Michael Titus; Pablo Torne; Efthalia Traianou; Tyler Trent; Sascha Trippe; Matthew Turk; Ilse van Bemmel; Huib Jan van Langevelde; Daniel R. van Rossum; Jesse Vos; Jan Wagner; Derek Ward-Thompson; John Wardle; Jonathan Weintroub; Norbert Wex; Robert Wharton; Maciek Wielgus; Kaj Wiik; Gunther Witzel; Michael F. Wondrak; George N. Wong; Qingwen Wu; Paul Yamaguchi; Doosoo Yoon; André Young; Ken Young; Ziri Younsi; Feng Yuan; Ye-Fei Yuan; J. Anton Zensus; Shuo Zhang; Guang-Yao Zhao; Shan-Shan Zhao; Dominic O. Chang

doi:10.3847/2041-8213/ac6736

1. Introduction

Sagittarius A* (Sgr A*), the radio source associated with the supermassive black hole (SMBH) at the center of the Milky Way, is thought to subtend the largest angular size of all black holes in the sky. At a distance of D ≈ 8 kpc and with a mass of M ≈ 4 × 10⁶ M_⊙ (Do et al. 2019; Gravity Collaboration et al. 2019, 2020), Sgr A* has a Schwarzschild radius of ∼10 μas. Models of optically thin spherical accretion flows around SMBHs generically predict that they will appear to distant observers as bright rings of emission surrounding a darker central "shadow" (e.g., Bardeen 1973; Luminet 1979; de Vries 2000; Falcke et al. 2000; Broderick & Loeb 2006; Broderick & Narayan 2006; Broderick et al. 2011, 2016; Narayan et al. 2019), and a variety of more general accretion flow simulations have demonstrated that the diameter of this ring is typically ∼5 times larger than the Schwarzschild radius (e.g., Event Horizon Telescope Collaboration et al. 2019e). The Event Horizon Telescope (EHT) collaboration provided observational verification of this picture, using a global very long baseline interferometry (VLBI) network of radio telescopes observing at a frequency of ∼230 GHz to resolve the ∼40 μas ring of emission around the M87* SMBH (Event Horizon Telescope Collaboration et al. 2019a, 2019b, 2019c, 2019d, 2019e, 2019f, 2021a, 2021b, hereafter M87* Papers I–VIII).

The predicted ring diameter for Sgr A* is ∼50 μas, about 25% larger than what the EHT observed for M87*. However, because Sgr A* is more than three orders of magnitude less massive than M87*, all dynamical timescales in the system are correspondingly shorter. In particular, the typical gravitational timescale for Sgr A* is GM/c³ ≈ 20 s, implying that the source structure can vary substantially over the several-hour duration of a single EHT observation. Consistent with this expectation, Sgr A* exhibits broadband variability on timescales of minutes to hours (e.g., Genzel et al. 2003; Ghez et al. 2004; Fish et al. 2011; Neilsen et al. 2013; Goddi et al. 2021; Wielgus et al. 2022). The multiwavelength properties of Sgr A* during the 2017 EHT observing campaign are described in Event Horizon Telescope Collaboration et al. (2022b, hereafter Paper II).
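The angular and temporal scales quoted above follow from the mass and distance alone. The following is a minimal sketch of that arithmetic, assuming the rounded values M = 4 × 10⁶ M_⊙ and D = 8 kpc and rounded SI constants; the variable names are illustrative and not part of the original analysis.

```python
import math

# Physical constants (SI), rounded
G = 6.674e-11        # gravitational constant, m^3 kg^-1 s^-2
C = 2.998e8          # speed of light, m s^-1
M_SUN = 1.989e30     # solar mass, kg
KPC = 3.086e19       # kiloparsec, m
RAD_TO_UAS = 180.0 / math.pi * 3600.0 * 1e6  # radians -> microarcseconds

# Assumed round-number values from the text
mass = 4e6 * M_SUN
distance = 8 * KPC

r_g = G * mass / C**2                        # gravitational radius GM/c^2, m
t_g = r_g / C                                # gravitational timescale GM/c^3, s
theta_s = 2 * r_g / distance * RAD_TO_UAS    # Schwarzschild radius, uas
ring = 5 * theta_s                           # ring diameter of ~5 Schwarzschild radii, uas

print(f"GM/c^3 = {t_g:.1f} s")
print(f"Schwarzschild radius = {theta_s:.1f} uas")
print(f"ring diameter = {ring:.0f} uas")
```

With these inputs the timescale comes out near 20 s and the ring diameter near 50 μas, in line with the values quoted in the text.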

The potential for rapid structural variability complicates the analysis of EHT observations of Sgr A*. A standard strategy for ameliorating the sparsity of VLBI data sets is Earth-rotation aperture synthesis, whereby Fourier coverage of the array is accumulated as Earth rotates and baselines change their orientation with respect to the source (Thompson et al. 2017). This strategy is predicated on the source remaining static throughout the observing period, in which case the accumulated data measure a single image structure. However, Sgr A* violates this assumption on timescales as short as minutes. After several hours, the variable components of the image structure in Sgr A* are expected to be uncorrelated (Georgiev et al. 2022; Wielgus et al. 2022). Thus, image reconstructions from the EHT Sgr A* data are focused on reconstructing time-averaged source structures (Event Horizon Telescope Collaboration et al. 2022c, hereafter Paper III).

Despite the necessity of reconstructing an average source structure, the data collected within a single multihour observation epoch are associated with many specific instances of the variable emission from Sgr A*, i.e., they represent an amalgam of observations of instantaneous images. The imaging strategy pursued for the EHT observations of Sgr A* aims to mitigate the impact of this changing source structure through the introduction of a "variability noise budget," which absorbs the structural evolution into inflated uncertainties and thereby permits imaging algorithms to reconstruct a time-averaged image under the usual static source assumption.¹⁴⁹ The image reconstruction procedure is described in detail in Paper III, and the results confirm that the Sgr A* data are consistent with being produced by a ring-like emission structure with a diameter of ∼50 μas.

For the EHT observations of M87*, morphological properties of the observed ring (e.g., diameter, thickness, orientation) were quantified using both imaging and geometrical modeling analyses (M87* Paper VI), and the measured ring diameter was calibrated using general relativistic magnetohydrodynamic (GRMHD) simulations from M87* Paper V to constrain the mass of the SMBH. The current paper applies a conceptually similar strategy to the analysis of the EHT Sgr A* data, though significant alterations have been made to meet the new challenges posed by Sgr A* and to tailor the analyses appropriately. In this paper, we first characterize the variability seen in the Sgr A* data, and we develop a framework for mitigating the impact of variability when imaging or modeling the data. We then make measurements of the ring size and other structural properties using both imaging and geometrical modeling analyses, and we derive and apply a GRMHD-based calibration to bring ring size measurements made using different techniques to a common physical scale.

This paper is organized as follows. Section 2 provides an overview of the Sgr A* observations and data processing. In Section 3, we quantify the variability on different spatial scales, and we outline the strategies used to mitigate its impact during imaging and modeling. In Section 4, we discuss salient data properties in the context of a ring-like emission structure, and we describe our procedure for using GRMHD simulations to calibrate different ring size measurement techniques to a common physical scale. Sections 5, 6, and 7 detail our three primary strategies for measuring the ring size and describe their application to the Sgr A* data. Our results are presented in Section 8, and we summarize and conclude in Section 9. This paper is the fourth in a series that describes the analysis of the 2017 EHT observations of Sgr A*. The series is summarized in Event Horizon Telescope Collaboration et al. (2022a, hereafter Paper I). The data processing and calibration are described in Paper II, imaging is carried out in Paper III, physical simulations are described in Event Horizon Telescope Collaboration et al. (2022d, hereafter Paper V), and tests of gravity are presented in Event Horizon Telescope Collaboration et al. (2022e, hereafter Paper VI).

2. Observations and Data Products

In this section, we briefly review the interferometric data products used for analyses in this paper (Section 2.1), and we summarize the observations (Section 2.2) and data processing (Section 2.3) that precede these analyses. A more comprehensive description of the Sgr A* data collection, correlation, and calibration can be found in Paper II, M87* Paper III, and references therein.

2.1. VLBI Data Products

As a radio interferometer, the EHT is natively sensitive to the Fourier transform of the sky-plane emission structure. For a source of emission I(x, t), the complex visibility ${ \mathcal V }({\boldsymbol{u}},t)$ is given by

$\begin{eqnarray}&&{ \mathcal V }({\boldsymbol{u}},t)=\iint {e}^{-2\pi i{\boldsymbol{u}}\cdot {\boldsymbol{x}}}I({\boldsymbol{x}},t){d}^{2}{\boldsymbol{x}},\end{eqnarray} \tag{ 1 }$

where t is time, x = (x, y) are angular coordinates on the sky, and u = (u, v) are projected baseline coordinates in units of the observing wavelength (see, e.g., Thompson et al. 2017).
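Equation (1) can be evaluated numerically by replacing the integral with a sum over image pixels. The sketch below does this for a single snapshot and checks the result against the analytic transform of a circular Gaussian. The test image (a 40 μas FWHM Gaussian with unit flux), the grid, and the function name `visibility` are assumptions chosen for illustration only.

```python
import numpy as np

UAS = np.pi / 180 / 3600 / 1e6  # one microarcsecond in radians

def visibility(image, x, y, u, v):
    """Direct-sum approximation of Eq. (1) for one (u, v) point.

    image : 2D array of pixel fluxes, shape (len(y), len(x))
    x, y  : 1D pixel-centre coordinates in radians
    u, v  : baseline coordinates in wavelengths
    """
    xx, yy = np.meshgrid(x, y)
    phase = -2j * np.pi * (u * xx + v * yy)
    return np.sum(image * np.exp(phase))

# Hypothetical test image: circular Gaussian, 40 uas FWHM, unit total flux
n = 128
x = np.linspace(-100, 100, n) * UAS
y = x.copy()
xx, yy = np.meshgrid(x, y)
sigma = 40 * UAS / (2 * np.sqrt(2 * np.log(2)))
image = np.exp(-(xx**2 + yy**2) / (2 * sigma**2))
image /= image.sum()

v_zero = visibility(image, x, y, 0.0, 0.0)   # zero baseline: total flux
v_long = visibility(image, x, y, 3e9, 0.0)   # a 3 Glambda baseline
v_analytic = np.exp(-2 * np.pi**2 * sigma**2 * (3e9) ** 2)  # Gaussian transform
```

The zero-baseline value recovers the total flux, and the 3 Gλ value should agree closely with the analytic Gaussian transform because the grid comfortably samples and encloses the source.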

The ideal visibilities ${ \mathcal V }$ are not directly observable because they are corrupted by both statistical errors and a variety of systematic effects. For the EHT, the dominant systematics are complex station-based gain corruptions. The relationship between an ideal visibility ${{ \mathcal V }}_{{ij}}$ and the observed visibility V_ij on a baseline connecting stations i and j is given by

$\begin{eqnarray}&&{V}_{{ij}}={g}_{i}{g}_{j}^{* }{{ \mathcal V }}_{{ij}}+{\sigma }_{\mathrm{th},{ij}}\equiv | {V}_{{ij}}| {e}^{i{\phi }_{{ij}}},\end{eqnarray} \tag{ 2 }$

where σ_th,ij is the statistical (or "thermal") error on the baseline, g_i and g_j are the station gains, and we have defined the visibility amplitude ∣V_ij∣ and phase ϕ_ij. The statistical error is well described as a zero-mean circularly symmetric complex Gaussian random variable with a variance determined (per the radiometer equation) by the station sensitivities, integration time, and frequency bandwidth (Thompson et al. 2017). The station gains vary in time at every site and must in general be either calibrated out or determined alongside the source structure.

The presence of station-based systematics motivates the construction and use of "closure quantities" that are invariant to such corruptions. A closure phase ψ_ijk (Jennison 1958) is the sum of visibility phases around a closed triangle of baselines connecting stations i, j, and k,

$\begin{eqnarray}&&{\psi }_{{ijk}}={\phi }_{{ij}}+{\phi }_{{jk}}+{\phi }_{{ki}}.\end{eqnarray} \tag{ 3 }$

Closure phases are invariant to station-based phase corruptions, such that the measured closure phase is equal to the ideal closure phase, up to statistical errors. Similarly, a closure amplitude A_ijkℓ (Twiss et al. 1960) is the ratio of pairs of visibility amplitudes on a closed quadrangle of baselines connecting stations i, j, k, and ℓ,

$\begin{eqnarray}&&{A}_{{ijk}{\ell }}=\displaystyle \frac{| {V}_{{ij}}| | {V}_{k{\ell }}| }{| {V}_{{ik}}| | {V}_{j{\ell }}| }.\end{eqnarray} \tag{ 4 }$

Analogous with closure phases, closure amplitudes are invariant to station-based amplitude corruptions. Because closure quantities are constructed from nonlinear combinations of complex visibilities, they have correlated and non-Gaussian error statistics; a detailed discussion is provided in Blackburn et al. (2020).
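The gain invariance of Equations (3) and (4) can be verified numerically. The sketch below applies random complex station gains to a set of made-up ideal visibilities following Equation (2), with the thermal noise term omitted, and compares the closure quantities before and after corruption. The array layout and function names are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n_stations = 4

# Hypothetical ideal visibilities on every baseline (i, j), Hermitian by construction
ideal = rng.normal(size=(n_stations, n_stations)) + 1j * rng.normal(size=(n_stations, n_stations))
ideal = ideal + ideal.conj().T   # enforces V_ji = conj(V_ij)

# Random complex station gains g_i (amplitude and phase errors)
gains = rng.uniform(0.5, 1.5, n_stations) * np.exp(1j * rng.uniform(-np.pi, np.pi, n_stations))
observed = gains[:, None] * gains[None, :].conj() * ideal   # Eq. (2) without sigma_th

def closure_phase(vis, i, j, k):
    """Eq. (3): phase of the triple product V_ij V_jk V_ki, in radians."""
    return np.angle(vis[i, j] * vis[j, k] * vis[k, i])

def closure_amplitude(vis, i, j, k, l):
    """Eq. (4): |V_ij||V_kl| / (|V_ik||V_jl|)."""
    return abs(vis[i, j]) * abs(vis[k, l]) / (abs(vis[i, k]) * abs(vis[j, l]))

cp_ideal = closure_phase(ideal, 0, 1, 2)
cp_obs = closure_phase(observed, 0, 1, 2)
ca_ideal = closure_amplitude(ideal, 0, 1, 2, 3)
ca_obs = closure_amplitude(observed, 0, 1, 2, 3)
```

Each station gain enters the triple product once as g and once as g*, so only its squared modulus survives and the phase is unchanged. In the amplitude ratio every gain modulus appears once in the numerator and once in the denominator.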

2.2. EHT Observations of Sgr A*

The EHT observed Sgr A* on 2017 April 5, 6, 7, 10, and 11 with the phased Atacama Large Millimeter/submillimeter Array (ALMA) and the Atacama Pathfinder Experiment (APEX) on the Llano de Chajnantor in Chile, the Large Millimeter Telescope Alfonso Serrano (LMT) on Volcán Sierra Negra in Mexico, the James Clerk Maxwell Telescope (JCMT) and phased Submillimeter Array (SMA) on Maunakea in Hawai'i, the IRAM 30 m telescope (PV) on Pico Veleta in Spain, the Submillimeter Telescope (SMT) on Mt. Graham in Arizona, and the South Pole Telescope (SPT) in Antarctica (M87* Paper II). Only the April 6, 7, and 11 observations included the highly sensitive ALMA station, and the April 11 light curve exhibits strong variability (Wielgus et al. 2022) that is presumably associated with an X-ray flare that occurred shortly before the start of the track (Paper II). In this paper, we thus analyze primarily the April 6 and April 7 data sets. We note that while Paper III focuses on the April 7 data set, with the April 6 data set used for secondary validation, most of the analyses carried out in this paper instead focus on a joint data set that combines the April 6 and April 7 data.

At each site the data were recorded in two 1.875 GHz wide frequency bands, centered around sky frequencies of 227.1 GHz (low band; LO) and 229.1 GHz (high band; HI), and in each of two polarization modes. For all telescopes except ALMA and JCMT, the data were recorded in a dual circular polarization mode: right-hand circular polarization (RCP; R) and left-hand circular polarization (LCP; L). ALMA recorded using linear feeds, and the data were later converted to a circular polarization basis during the DiFX (Deller et al. 2011) correlation (Martí-Vidal et al. 2016; Matthews et al. 2018; Goddi et al. 2019). The JCMT observed only a single hand of circular polarization at a time, with the specific handedness (RCP or LCP) changing from day to day. All other stations observed in a standard dual-polarization mode, which allows the construction of RR, RL, LR, and LL correlation products. The analyses in this paper use only the parallel-hand correlations (i.e., RR and LL), which are averaged to form Stokes I data products. Because JCMT records only a single hand at a time, we instead form "pseudo-I" data products for JCMT baselines, using whichever parallel-hand correlation is available as a stand-in for Stokes I.¹⁵⁰
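The construction of Stokes I and "pseudo-I" products described above can be sketched as follows. Marking an unrecorded hand with NaN is an assumed convention for this example, and `stokes_i` is a hypothetical helper rather than a function from the EHT pipelines.

```python
import numpy as np

def stokes_i(rr, ll):
    """Average the parallel hands; fall back to the available hand (pseudo-I).

    rr, ll : complex arrays of RR and LL visibilities, with NaN marking
             a hand that was not recorded (an assumed convention).
    """
    rr = np.asarray(rr, dtype=complex)
    ll = np.asarray(ll, dtype=complex)
    both = ~np.isnan(rr) & ~np.isnan(ll)
    out = np.where(np.isnan(rr), ll, rr)      # pseudo-I: whichever hand exists
    out[both] = 0.5 * (rr[both] + ll[both])   # Stokes I where both hands exist
    return out

# First entry has both hands; the other two mimic single-hand baselines
example = stokes_i([1 + 1j, np.nan, 2 + 0j], [1 - 1j, 3 + 0j, np.nan])
```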

2.3. Data Reduction

After correlation, residual phase and bandpass errors are corrected with two independent processing pipelines: EHT-HOPS (Blackburn et al. 2019) producing "HOPS" (Whitney et al. 2004) data and rPICARD (Janssen et al. 2018, 2019) producing "CASA" (McMullin et al. 2007) data. Relative phase gains between RCP and LCP have been corrected based on the assumption of zero circular polarization on baselines between ALMA and other EHT stations. Absolute flux density scales are based on a priori measurements of each station's sensitivity, resulting in a ∼10% typical uncertainty in the amplitude gains (M87* Paper II). The amplitude gains of the colocated ALMA/APEX and SMA/JCMT stations have been further refined via time-variable network calibration (M87* Paper III) using a light curve of the compact Sgr A* flux measured by ALMA and SMA (Wielgus et al. 2022). For the remaining stations, gross amplitude gain errors have been corrected by a transfer of gain solutions from the J1924–2914 and NRAO 530 calibrator sources as described in Paper II.

Following the completion of the above calibration pipelines, additional preprocessing of the data has been carried out as described in Paper III, including calibration of the LMT and JCMT station gains and normalization of the visibility amplitudes by the total light curve. The characterization of residual calibration effects (e.g., polarization leakage) into a systematic error budget, as well as a more comprehensive description of the overall EHT Sgr A* data reduction, is provided in Paper II.

3. Variability Extraction and Mitigation

The statistical errors quoted in Paper II and summarized in the preceding section do not account for three additional sources of uncertainty that can otherwise substantially bias any analysis efforts. First, unaccounted-for nonclosing (i.e., baseline-based) systematic errors are present in the data at a level of ∼1% of the visibility amplitude, which is often larger than the formal statistical errors (for a discussion of their magnitude and potential origins, see Paper II). Second, significant refractive scattering in the interstellar medium produces additional substructure within the image that is not present in the intrinsic emission map (Johnson et al. 2018). Third, there is intraday variability in the source itself. Source variability is theoretically expected to arise on a broad range of timescales, and it is explicitly seen in GRMHD simulations on timescales as short as minutes (Georgiev et al. 2022). Such variability was also observed in the light curve of Sgr A* during the 2017 EHT campaign on timescales from 1 minute to several hours (Wielgus et al. 2022).

In this section, we summarize the theoretical expectations for and characteristics of the variability based on GRMHD simulations, present an estimate for the degree of structural variability in Sgr A* directly from the visibility amplitude data, and describe the strategies pursued here and in Paper III to mitigate the impact of the three components of additional error listed above.

3.1. Expectations from Theory

In low-luminosity SMBH systems such as Sgr A*, we expect the emission to originate from the immediate vicinity of the black hole, i.e., on scales comparable to the event horizon size. Here, all characteristic speeds of the hot relativistic gas approach the speed of light. The timescales associated with these processes are therefore set by the gravitational timescale, GM/c³, which is ∼20 s for Sgr A*. This timescale is ∼3 orders of magnitude shorter than the nightly observations carried out by the EHT, so a single observation contains many realizations of the underlying source variability.

GRMHD simulations can model the dynamical processes in Sgr A* and, using ray-tracing and radiative transfer, provide a theoretical expectation for the observed emission. Paper V provides a library of GRMHD simulations and associated movies, which have been scaled to the conditions during the EHT 2017 observations (e.g., the average total 230 GHz flux is set to the EHT measurement). We use the variability characteristics of these simulations as our expectation for the Sgr A* variability seen by the EHT.

The variability in GRMHD simulations is universally described by a "red–red" power spectrum, with the largest fluctuations in the emission occurring on the longest timescales and the largest spatial scales (Georgiev et al. 2022). Spatially, the largest scale for variability is limited to the size of the emitting region, which for an observing frequency of 230 GHz is typically several GM/c² and for the EHT Sgr A* data is constrained to be ≲87 μas (Paper II). Temporally, the simulations exhibit a red power spectrum that flattens on timescales ≳1000 GM/c³. Observations of the total flux variability in Sgr A* corroborate this expectation, finding a red-noise spectrum extending to timescales of several hours and flattening on longer timescales (Wielgus et al. 2022).

We can, without loss of generality, express the time-variable image structure I in terms of some static mean image I_avg and a zero-mean time-variable component δ I that captures all of the variation,

$\begin{eqnarray}&&I({\boldsymbol{x}},t)={I}_{\mathrm{avg}}({\boldsymbol{x}})+\delta I({\boldsymbol{x}},t).\end{eqnarray} \tag{ 5 }$

The linearity of the Fourier transform ensures that an analogous decomposition holds for ${ \mathcal V }$, which is thus simply the sum of an analogous ${{ \mathcal V }}_{\mathrm{avg}}$ and $\delta { \mathcal V }$. The variation $\delta { \mathcal V }$ represents the component of the data we wish to mitigate.
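The decomposition in Equation (5) and its counterpart for the visibilities can be illustrated with a toy movie. In the sketch below the movie (a static blob with random frame-to-frame fluctuations) is invented for the example, and a discrete FFT stands in for the continuous Fourier transform of Equation (1).

```python
import numpy as np

rng = np.random.default_rng(1)
n_frames, n_pix = 50, 32

# Hypothetical movie: a static Gaussian blob plus frame-to-frame fluctuations
yy, xx = np.mgrid[:n_pix, :n_pix]
static = np.exp(-((xx - 16) ** 2 + (yy - 16) ** 2) / 30.0)
movie = static[None] * (1 + 0.2 * rng.normal(size=(n_frames, n_pix, n_pix)))

# Eq. (5): static mean image and zero-mean time-variable component
i_avg = movie.mean(axis=0)
delta_i = movie - i_avg

# Linearity of the Fourier transform carries the split over to the visibilities
vis = np.fft.fft2(movie, axes=(-2, -1))
vis_avg = np.fft.fft2(i_avg)
delta_vis = np.fft.fft2(delta_i, axes=(-2, -1))
```

By construction `delta_i` averages to zero over the frames, and `vis` equals `vis_avg + delta_vis` to numerical precision.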

The EHT stations ALMA and SMA are themselves interferometric arrays capable of separating out extended structure (such as the Galactic center "minispiral"; Lo & Claussen 1983; Goddi et al. 2021; Wielgus et al. 2022) from the Sgr A* light curve,

$\begin{eqnarray}&&L(t)={ \mathcal V }({\bf{0}},t)=\iint I({\boldsymbol{x}},t){d}^{2}{\boldsymbol{x}},\end{eqnarray} \tag{ 6 }$

on the largest spatial scales, predicted by GRMHD simulations to be the most variable. Using this motivation, the light-curve-normalized image is defined to be

$\begin{eqnarray}&&\hat{I}({\boldsymbol{x}},t)\equiv \displaystyle \frac{I({\boldsymbol{x}},t)}{L(t)},\end{eqnarray} \tag{ 7 }$

with ${\hat{I}}_{\mathrm{avg}}$ and $\delta \hat{I}$ similarly defined; here, the "hat" diacritic denotes light-curve normalization. From GRMHD simulations, the expected noise is well approximated by a broken power law,

$\begin{eqnarray}&&{\sigma }_{\mathrm{var}}^{2}\equiv \langle \delta {\hat{{ \mathcal V }}}^{2}\rangle \approx \displaystyle \frac{{a}^{2}{\left(| {\boldsymbol{u}}| /{u}_{0}\right)}^{c}}{1+{\left(| {\boldsymbol{u}}| /{u}_{0}\right)}^{b+c}},\end{eqnarray} \tag{ 8 }$

along any radial direction (Georgiev et al. 2022). This broken power law is described by four parameters: a break at u₀, an amplitude a representing the amount of noise at the break location, and long- and short-baseline power-law indices b and c, respectively. Typically, we expect that c ≳ 2, due to the compact nature of the source.
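A direct transcription of Equation (8) makes its limiting behavior easy to inspect. The function name and the example parameter values below are illustrative assumptions, not fitted values.

```python
import numpy as np

def sigma_var_sq(u, a, u0, b, c):
    """Broken power law of Eq. (8); u and u0 must share the same units."""
    r = np.abs(u) / u0
    return a**2 * r**c / (1.0 + r ** (b + c))

# Illustrative parameters only
a, u0, b, c = 0.1, 2.0, 2.5, 2.0
at_break = sigma_var_sq(u0, a, u0, b, c)           # equals a**2 / 2
short = sigma_var_sq(0.01 * u0, a, u0, b, c)       # ~ a**2 * (u/u0)**c
long = sigma_var_sq(100.0 * u0, a, u0, b, c)       # ~ a**2 * (u/u0)**(-b)
```

With Equation (8) as written, the variance rises as ∣u∣^c well below the break, falls as ∣u∣^(−b) well above it, and equals a²/2 exactly at ∣u∣ = u₀.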

In Figure 1, red lines show ${\sigma }_{\mathrm{var}}^{2}$ measured for an example GRMHD simulation about average images that have been constructed on observationally relevant timescales. The variability has been averaged in azimuth and across different black hole spin orientations. As the timescale over which the average image is constructed increases, the location of the break u₀ decreases and the amount of power at the break increases.¹⁵¹ This behavior can intuitively be understood as the GRMHD simulations changing less for short timescales. For comparison, we show the thermal, systematic, and refractive scattering noise. For timescales longer than ∼10 minutes, the variability noise dominates on EHT VLBI baselines.

3.2. Intraday Variability in the Sgr A* Data

The intraday variability expected from theoretical considerations can be observed directly in the Sgr A* data. Figure 2 shows the combined baseline coverage for the EHT's 2017 Sgr A* campaign, including the observations on April 5, 6, 7, and 10. The upper limit on the source size of 87 μas (see the second-moment analysis in Paper II) implies that the complex visibilities will be correlated in regions of the (u,v)-plane smaller than ∼2 Gλ. In practice, the visibility amplitudes exhibit variations on scales smaller than this and otherwise appear strongly correlated on scales of 1 Gλ (see Paper III, Figure 3). Adopting 1 Gλ as the correlation scale, we identify four regions among the baseline tracks in Figure 2 where the (u,v)-coverage is redundant, i.e., multiple baselines pass within 1 Gλ of the same (u,v)-position. We separate the redundant baseline combinations into "crossing tracks," in which two baseline tracks intersect at a single (u,v)-point, and "following tracks," in which two baselines follow a nearly identical extended track in the (u,v)-plane. Both sets of redundant baselines provide an opportunity to directly probe the degree of intraday variability in the visibilities at specific locations in the (u,v)-plane.
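The correlation scale quoted above for an 87 μas source can be reproduced with a one-line conversion. This sketch assumes the correlation length in the (u,v)-plane is simply the reciprocal of the angular size; order-unity factors depend on how the size is defined, so the result is only an estimate.

```python
import math

UAS_TO_RAD = math.pi / 180 / 3600 / 1e6  # microarcseconds -> radians

def correlation_scale_glambda(size_uas):
    """Reciprocal of the angular size, expressed in Glambda."""
    return 1.0 / (size_uas * UAS_TO_RAD) / 1e9

scale = correlation_scale_glambda(87.0)
print(f"correlation scale = {scale:.2f} Glambda")
```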

Prior to making comparisons, we apply the data preprocessing steps outlined in Section 2.3 to mitigate unphysical sources of variability. To avoid addressing the unknown atmospheric phase delays, we focus exclusively on visibility amplitudes. Because source structure will produce additional variations in the visibility amplitudes that are hard to visualize in projection and obscure the relative degree of variability, we detrend the visibility amplitudes with a linear model. The crossing and following tracks discussed below are shown in the top and bottom subpanels of Figure 2, respectively.

Chile–PV versus Chile–SPT: The first crossing track we consider contains baselines between the Chile stations (ALMA, APEX) and PV and SPT, which both cross near (u,v) = (4 Gλ, 3.5 Gλ) at times separated by 6.2 hr. The concurrent ALMA and APEX baselines are consistent within the reported statistical errors, and thus there is no evidence for dominant, unaddressed baseline-specific systematic errors. The normalized visibility amplitudes for the Chile–PV and Chile–SPT baselines individually vary smoothly with time. Nevertheless, they differ significantly at the crossing point, and this difference is consistent in magnitude with the variation found across days (indicated by the gray band in the relevant panel of Figure 2).

Chile–SMT versus Chile–SPT: The second crossing track we consider contains baselines between the Chile stations (ALMA, APEX) and SMT and SPT, which both cross near (u,v) = (3 Gλ, 4.5 Gλ) at times separated by 5.2 hr. Again, we find excellent agreement between ALMA and APEX baselines, individually smooth variations on the Chile–SMT and Chile–SPT baselines, and significant differences in the visibility amplitudes between those baselines.

SMA–SPT versus LMT–SPT: The first following track we consider contains baselines between the SPT, which is located at the South Pole, and SMA and LMT, which have similar latitudes. Because the baseline tracks are coincident across a large range of locations in the (u,v)-plane, this following track permits many direct comparisons at a baseline length of 8 Gλ at times separated by 3.4 hr. As with both crossing tracks, significant differences exist between the two sets of baselines, consistent with the range across multiple days.

SMT–SPT versus PV–SPT: The final following track we consider again involves the SPT, and now the SMT and PV, which also have similar latitudes. This is the longest set of baselines that we consider, with a length of roughly 8.5 Gλ and covering similar regions in the (u,v)-plane at times separated by 6.7 hr. Again, significant variations are exhibited, consistent with those across days.

In summary, intraday variability is observed on multiple baselines with lengths ranging from 5 to 8.5 Gλ and on timescales as short as 3.4 hr. In all cases, this variability is broadly consistent with that observed on interday timescales. Furthermore, the variability behavior is consistent with theoretical expectations from GRMHD simulations and empirical expectations from the Sgr A* light curve, both of which imply that the variable elements of the Sgr A* emission should be uncorrelated beyond a timescale of a few hours (Georgiev et al. 2022; Wielgus et al. 2022). Any average image of Sgr A* reconstructed from data spanning a time range longer than several hours captures the long-timescale asymptotic source structure; the intrinsic image averaged over a single day or multiple days is thus expected to exhibit similar structure.
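The lower end of the baseline-length range in this summary follows from the two crossing points quoted above. The short check below uses only the (u,v) coordinates stated in the text; the dictionary keys are labels for this example.

```python
import math

# (u, v) crossing points quoted in the text, in Glambda
crossings = {
    "Chile-PV vs Chile-SPT": (4.0, 3.5),
    "Chile-SMT vs Chile-SPT": (3.0, 4.5),
}
lengths = {name: math.hypot(u, v) for name, (u, v) in crossings.items()}
for name, length in lengths.items():
    print(f"{name}: |u| = {length:.1f} Glambda")
```

Both crossing tracks sit at baseline lengths slightly above 5 Gλ, while the following tracks at 8 and 8.5 Gλ set the upper end of the range.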

3.3. Model-agnostic Variability Quantification

To quantify the variability observed in the EHT Sgr A* data, we make use of the procedure described in Broderick et al. (2022). This procedure provides an estimate of the excess variability—i.e., the visibility amplitude variance in excess of that caused by known sources, such as average source structure, statistical and systematic uncertainties, and scattering—as a function of baseline length. We apply the same data preparation steps summarized in Section 3.2 and described in Broderick et al. (2022), combining the visibility amplitudes measured on April 5, 6, 7, and 10 in both observing bands. All data points are weighted equally.

The procedure is illustrated in Figure 3. We again make use of the strong correlations induced by the finite source size, and for every location in the (u,v)-plane we consider only those data points falling within a circular region of diameter 1 Gλ centered at that point (red circular region in the top panel of Figure 3). Within each such region containing at least three data points, we linearly detrend the light-curve-normalized visibility amplitudes with respect to u and v to remove variations due to physical structure (bottom panel of Figure 3), and we compute the variance of the residuals. This variance is then debiased to remove the contributions from the reported statistical errors, as described in Broderick et al. (2022). Finally, the variances from all regions having a common baseline length are averaged to produce an azimuthally averaged set of variances. The uncertainty in the variance estimates is obtained via Monte Carlo sampling of the unknown gains, leakage terms, and statistical errors.
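The per-region step of this procedure can be sketched as below. This is a simplified illustration and not the implementation of Broderick et al. (2022): the debiasing here just subtracts the mean reported thermal variance, no degrees-of-freedom correction is applied (so a region with exactly three points yields zero residuals), and the Monte Carlo treatment of gains and leakage is omitted. The function name and argument layout are assumptions.

```python
import numpy as np

def local_excess_variance(u, v, amp, sigma_th, center, diameter=1.0, min_points=3):
    """Detrended, debiased amplitude variance in one circular (u, v) region.

    u, v     : baseline coordinates in Glambda
    amp      : light-curve-normalized visibility amplitudes
    sigma_th : reported statistical errors on amp
    center   : (u, v) centre of the region in Glambda
    Returns None when fewer than `min_points` data points fall in the region.
    """
    u, v, amp, sigma_th = map(np.asarray, (u, v, amp, sigma_th))
    inside = np.hypot(u - center[0], v - center[1]) <= diameter / 2
    if inside.sum() < min_points:
        return None
    # Least-squares plane amp ~ p0 + p1*u + p2*v removes smooth source structure
    design = np.column_stack([np.ones(inside.sum()), u[inside], v[inside]])
    coeffs, *_ = np.linalg.lstsq(design, amp[inside], rcond=None)
    residuals = amp[inside] - design @ coeffs
    # Simplified debiasing (assumption): subtract the mean thermal variance
    return np.mean(residuals**2) - np.mean(sigma_th[inside] ** 2)
```

Repeating this over a grid of region centres and averaging the results in bins of baseline length would give an azimuthally averaged variance profile of the kind described above.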

Figure 4 shows the results of applying this procedure to the Sgr A* data, with the normalized visibility amplitude variance measurements given by the black points. For baselines shorter than 2.5 Gλ, the LMT calibration procedure precludes an accurate estimate of the variance, and thus these baselines have been excluded. For baselines between ∼2.5 and 6 Gλ in length, our empirical estimates of the noise exceed the typical contributions from statistical errors and refractive scattering, indicating the presence of an additional source of structural variability. The degree of inferred variability is consistent with that seen in prior millimeter-VLBI data sets, which is discussed further in Appendix A. For baselines longer than 6 Gλ, our measurements are consistent with the degree of variability expected from the statistical uncertainties in the data; we thus do not directly constrain the source variability on these long baselines.

To characterize the variability behavior within the (u,v)-plane, we fit a broken power law of the form in Equation (8) to the normalized variance measurements. As indicated by the filled black circles in Figure 4, significant measurements exist only in the range of baselines with lengths ∼2.5–6 Gλ; on baselines longer than 6 Gλ, we are unable to distinguish the variability from its associated measurement uncertainties. We thus perform the broken power-law fit only to the ∼2.5–6 Gλ range of baselines, where we have significant measurements, and we find no evidence for a break in the power law in this region. As a result, only an upper limit can be placed on u₀, and we are not able to constrain the short-baseline power-law index, c. The range of permitted broken power-law fits is illustrated in Figure 4 by the orange shaded region, with several samples from the posterior distribution explicitly plotted as orange lines.

Because the location of the broken power-law break is poorly constrained, the parameters u₀ and a (describing the location of the break and the amplitude of the power law at the break, respectively) are strongly correlated and highly uncertain. However, it is clear from the orange shaded region in Figure 4 that there is only a narrow range of variances permitted over the 2.5–6 Gλ range of baselines over which the data are constraining. We thus choose to characterize the amplitude of the excess variability noise at ∣ u ∣ = 4 Gλ, which we denote as a₄. Joint posteriors for a₄, the break location u₀, and the long-baseline power-law index b are shown in Figure 5. These constraints are used to inform the prior distributions for the full-track geometric modeling described in Section 7; the associated prior ranges on each parameter are indicated by the purple shaded regions in Figure 5.

**Figure 5.** Joint posteriors of the constrained parameters after fitting a broken power law to the model-agnostic normalized variance estimates. Because the amplitude is well constrained within the range of baseline lengths for which good estimates of the variability exist, we set the normalization at ∣ u ∣ = 4 Gλ, denoted as a₄. Contours show the enclosed 50th, 90th, and 99th percentiles. The purple bands indicate the ranges used as priors during the full-track modeling, associated with the interquartile ranges.

3.4. Description of Variability Mitigation Approaches

Having established the existence of structural variability and quantified its magnitude in the Sgr A* data, we now turn to strategies for mitigating its impact on downstream analyses. We employ the light-curve-normalized visibility data, which eliminates large-scale variations and correlations by construction. In principle, there are four methods that we might pursue to address the remaining structural variability:

1. Analyze time-averaged data products.
2. Employ explicitly time-variable models.
3. Analyze short time segments of the data and combine the results afterward to characterize the average source structure.
4. Simultaneously reconstruct the average source structure and a statistical characterization of the structural variability.

The first of these options is complicated substantially by the uncertain visibility phases, which limit our ability to coherently average the data on timescales longer than several minutes. The second option can be employed either when a descriptive low-dimensional model for the source structure can be constructed (e.g., Miller-Jones et al. 2019; Kim et al. 2020) or when there is sufficient (u,v)-coverage for nonparametric dynamical imaging algorithms to be successful (e.g., Johnson et al. 2017; Bouman et al. 2018; Arras et al. 2022). The latter approach is explored in the dynamical imaging analyses described in Paper III, ultimately demonstrating that the Sgr A* (u,v)-coverage is insufficient to permit unambiguous reconstructions of the variable source structure.

We dub the third option "snapshot" modeling, whereby a simple geometric model of the source structure is fit to segments of the data that are short enough in duration (≲3 minutes) for the impact of structural variability to be subdominant to other sources of visibility uncertainty (e.g., refractive scattering; see Figure 1). Though the data sparsity is exacerbated by restricting the reconstructions to only a single snapshot at a time, the model itself is also correspondingly restricted in its parameterization of the source structure. The results of the fits to each individual snapshot are then combined across the entire data set, effectively averaging over the source variability. Details of our snapshot modeling analyses as applied to Sgr A* are presented in Section 6.

The fourth option we refer to as "full-track" modeling, which aims to simultaneously reconstruct both the average source structure and a set of parameters describing the contribution of the structural variability to the visibility data variances (Broderick et al. 2022). In contrast to the snapshot modeling, full-track modeling considers the entire data set at once and uses a parameterized "variability noise" model to appropriately modify the data uncertainties as part of the fitting procedure. In this way, the full-track modeling retains access to sufficient (u,v)-coverage to permit fitting a nonparametric image model to the data (see Paper III), though in Section 7 we also pursue full-track geometric modeling to provide a cross-comparison with the results from the snapshot geometric modeling. Our parameterization of the variability noise follows Equation (8), with the amplitude specified at a baseline length of 4 Gλ as described in Section 3.3. A detailed description of our full-track modeling approach as applied to Sgr A* is presented in Section 7.

Both the snapshot and full-track modeling approaches focus on describing the average source structure and treating the structural variability in a statistical manner. This goal is formally mismatched with what the EHT data measure for a single day, which is instead a collection of complex visibilities that sample different instantaneous realizations of the intrinsic Sgr A* source structure. The nature of this mismatch impacts the full-track analyses significantly.

The variability mitigation scheme employed by full-track modeling presumes that the variability may be modeled as excess uncorrelated fluctuations in the complex visibility data. This assumption is well justified on timescales exceeding a few hours, but significant correlations between visibilities exist on shorter timescales. Within a single day, subhour correlations that are localized in the (u,v)-plane can induce significant biases in the source structure reconstructed from the sparse EHT (u,v)-coverage of Sgr A*. The noise model is thus fundamentally misspecified for EHT data, with the level of misspecification increasing as shorter-in-time segments of data are analyzed; Appendix B describes pathological behavior that can arise when analyzing EHT Sgr A* data from only a single day. While prominent artifacts associated with these subhour correlations are present in the April 6 reconstructions shown in Appendices B and C, we note that the underlying origin of these artifacts is no less present on April 7.

The impact of unmodeled correlations on the reconstructed source structure can be ameliorated by combining multiple days, which provides visibility samples associated with independent realizations of the source structure. This additional sampling rapidly brings the statistical properties of the data into better agreement with the assumptions underpinning the full-track analyses; even the combination of just 2 days is often sufficient to mitigate the subhour correlations in analysis experiments that make use of GRMHD simulation data. For this reason, we combine both the April 6 and April 7 data sets during the analysis of the Sgr A* data. For comparison, Appendix C presents the results of equivalent analyses applied to the April 6 and April 7 data sets individually.

4. Ring Characterization and Calibration

We have a strong prior expectation—from both prior millimeter-VLBI observations of a different black hole (i.e., the EHT images of M87*; see M87* Paper IV) and theoretical simulations of the accretion flow around Sgr A* itself (see Paper V)—that Sgr A* ought to contain a ring of emission, and we thus aim to determine the characteristics of the ring-like image structure that best describes the Sgr A* data. In this section, we first review the evidence from the Sgr A* data for a ring-like image structure, and we then present a geometric model for fitting parameters of interest and describe our procedure for bringing ring size measurements made using different techniques to a common physical scale.

4.1. Evidence for a Ring

In reconstructing images of Sgr A*, Paper III explores a large space of imaging algorithms and associated assumptions. The resulting "top sets" of images contain primarily "ring-like" image structures, though a small fraction of the images are morphologically ambiguous. These "nonring" images still nominally provide a reasonable fit to the Sgr A* data and so are not ruled out from the Paper III results.

We can quantify the preference for a ring-like image structure by fitting the data with a set of simple geometrical models. Employing the snapshot geometric modeling technique detailed in Section 6, we compare the Bayesian evidence, ${ \mathcal Z }$ , between these different geometric models. The value of ${ \mathcal Z }$ serves as a model comparison metric that naturally balances improvements in fit quality against increases in model complexity, with larger values of ${ \mathcal Z }$ indicating preferred models (see, e.g., Trotta 2008). Figure 6 shows the results of a survey over simple geometric models with varying complexity, captured here by the number of parameters required to specify the model. At all levels of complexity, ring-like models outperform the other tested models. This disparity is most stark for the simplest models but continues to hold as the models increase in complexity.

**Figure 6.** Comparison of the relative Bayesian evidence, ${\rm{\Delta }}\mathrm{ln}{ \mathcal Z }$ , for a series of increasingly complex geometric models fitted using closure amplitudes and closure phases within the snapshot modeling formalism described in detail in Section 6. The fits have been carried out using `eht-imaging` on the HOPS April 7 Sgr A* data, and each point in the figure is colored according to the number of free parameters in the model; the number of free parameters in each model is also indicated in the horizontal axis labels. The panel on the right shows a zoom-in to the highest-evidence region of the left panel. Ring-like models are indicated with circles, and nonring models are indicated with crosses. All Bayesian evidence values are quoted relative to the highest value attained across all models. The parameter counts reflect the fact that all models are normalized to have unit total flux density and are centered at the image origin. The crescent model consists of a smaller disk subtracted from an offset larger disk. In the crescent+floor model, the smaller disk may have a nonzero flux density. The m-ring and mG-ring models are defined in Section 4.3. The maximum value of $\mathrm{ln}{ \mathcal Z }$ among the models explored in this figure is obtained for an m = 2 mG-ring model, in agreement with the `DPI` analysis described in Section 6.

The remainder of this paper proceeds with analyses that presuppose a ring-like emission structure for Sgr A*.

4.2. Salient Features in the Context of a Ring Model

The overall structure of the Sgr A* visibility amplitudes (see the left panel of Figure 7) exhibits at least three distinct regions:

1. A "short-baseline" region containing baselines shorter than ∼2 Gλ. The effects of data calibration and preprocessing, particularly the light-curve normalization and LMT calibration procedures (Paper III), are evident in the unit total flux density and the Gaussian structure of the visibility amplitudes in this region.
2. An "intermediate-baseline" region containing baselines between ∼2 and 6 Gλ. The visibility amplitudes in this region exhibit a general rise and then fall with increasing baseline length, peaking at a flux density ∼20% of the total at a baseline length of ∼4 Gλ.
3. A "long-baseline" region containing baselines with lengths in excess of ∼6 Gλ. The visibility amplitudes in this region generally rise with increasing baseline length from a deep minimum near ∼6.5 Gλ, approximately flattening out at longer baselines to a level that is ∼3%–10% of the total flux density.

The visibility amplitudes exhibit indications of asymmetric source structure, particularly on baselines with lengths of ∼3 Gλ that fall near the first minimum. Here, the baselines between the SMT and Hawai'i stations (oriented approximately in the east–west direction) have systematically higher correlated flux densities than the similar-length baselines between the LMT and Chile stations (oriented approximately in the north–south direction). The implication for the source morphology is that we would expect to see more symmetric structure in the north–south than in the east–west direction. Detailed geometric modeling analyses that are able to capture this asymmetry are described in Sections 6 and 7; here, we consider only a simple azimuthally symmetric toy model that captures some salient features of interest.

We attempt to understand the visibility behavior in light of expectations for a ring-like emitting structure. Specifically, we consider a geometric construction whereby an infinitesimally thin circular ring bordering an inner disk of emission is convolved with a Gaussian blurring kernel. The visibility function V produced by such an emission structure is given by

$\begin{eqnarray}\begin{array}{rcl}V & = & {F}_{0}{V}_{\mathrm{Gauss}}\left[{f}_{d}{V}_{\mathrm{disk}}+(1-{f}_{d}){V}_{\mathrm{ring}}\right]\\ & = & {F}_{0}\exp \left(-\displaystyle \frac{{w}^{2}{\xi }^{2}}{4\mathrm{ln}(2)}\right)\left[\displaystyle \frac{2{f}_{d}}{\xi }{J}_{1}(\xi )+(1-{f}_{d}){J}_{0}(\xi )\right],\end{array}\end{eqnarray} \tag{ 9 }$

where F₀ is the total flux in the image, f_d is the fraction of that flux that is contained in the disk component, w = W/d is a fractional ring width, W is the FWHM of the Gaussian convolving kernel, d is the diameter of the ring and disk components, ξ ≡ π∣ u ∣d is a normalized radial visibility-domain coordinate, and J_n(ξ) is a Bessel function of the first kind of order n.
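Equation (9) is simple enough to evaluate directly. The sketch below does so with only the Python standard library; the helper names `bessel_j` and `toy_visibility` and the example parameter values are introduced here for illustration and are not taken from the paper. The Bessel functions are computed from Bessel's integral with a midpoint rule, which is one convenient choice among many.

```python
import math

MUAS = math.pi / (180 * 3600 * 1e6)  # one microarcsecond in radians

def bessel_j(n, x, steps=400):
    """Integer-order J_n(x) from Bessel's integral,
    J_n(x) = (1/pi) * integral_0^pi cos(n*t - x*sin(t)) dt (midpoint rule)."""
    h = math.pi / steps
    total = 0.0
    for k in range(steps):
        t = (k + 0.5) * h
        total += math.cos(n * t - x * math.sin(t))
    return total * h / math.pi

def toy_visibility(u_glambda, d_muas, w, f_d, F0=1.0):
    """Equation (9): Gaussian-blurred thin ring plus inner disk at |u|."""
    xi = math.pi * (u_glambda * 1e9) * (d_muas * MUAS)
    blur = math.exp(-((w * xi) ** 2) / (4 * math.log(2)))
    disk = 1.0 if xi == 0 else 2 * bessel_j(1, xi) / xi  # limit is 1 at xi = 0
    ring = bessel_j(0, xi)
    return F0 * blur * (f_d * disk + (1 - f_d) * ring)

# Illustrative parameter values only (not fitted values from the paper).
for u in (0.0, 2.0, 4.0, 6.0, 8.0):
    print(u, abs(toy_visibility(u, d_muas=55.0, w=0.3, f_d=0.2)))
```

Scanning `u_glambda` over a fine grid and locating the minima of the absolute value is one way to explore how d, w, and f_d move the nulls and peaks.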

The three regions of Sgr A* data identified above are separated by apparent minima in the visibility amplitudes, and they can be approximately characterized by the baseline locations of those minima and the peak flux density levels achieved at the visibility maxima between them. Figure 7 illustrates how this characterization manifests as constraints on the defining parameters of the geometric toy model. The cyan and purple shaded regions in the left panel indicate the approximate ranges of baseline lengths corresponding to the locations of the first and second visibility minima, respectively. The locations of these minima constrain the diameter of the emitting structure, as shown in the top right panel of Figure 7. To be consistent with both a first minimum falling between ∼2.5 and 3.5 Gλ and a second minimum falling between ∼6 and 7 Gλ, the emitting region must be between ∼50 and 60 μas across. The amplitudes of two visibility maxima—one falling between the first and second visibility minima, and the second following the second minimum—constrain a combination of the fractional disk flux f_d and the fractional ring width w. The bottom right panel of Figure 7 shows the constraints from the first and second visibility maxima in red and orange, respectively, and from the ratio of the two in green.
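The diameter constraint can be checked by hand in a limiting case. Assuming f_d = 0 (a pure ring, an assumption made here only for illustration), the nulls of Equation (9) coincide with the zeros of J₀(ξ) regardless of w, so each quoted range of null locations maps to a range of diameters through d = ξ/(π∣u∣). The sketch below carries out that arithmetic; a nonzero disk fraction shifts the nulls, so this is a rough consistency check rather than a reproduction of Figure 7.

```python
import math

MUAS = math.pi / (180 * 3600 * 1e6)  # one microarcsecond in radians
J0_ZEROS = (2.404826, 5.520078)      # first two zeros of J_0

def diameter_muas(xi_null, u_glambda):
    """Ring diameter (muas) that places a null with xi = pi*|u|*d at |u|."""
    return xi_null / (math.pi * u_glambda * 1e9) / MUAS

# Baseline ranges (Glambda) of the first and second minima quoted in the text.
first_min, second_min = (2.5, 3.5), (6.0, 7.0)

# A null at a longer baseline implies a smaller diameter, hence the swap.
d_first = (diameter_muas(J0_ZEROS[0], first_min[1]),
           diameter_muas(J0_ZEROS[0], first_min[0]))
d_second = (diameter_muas(J0_ZEROS[1], second_min[1]),
            diameter_muas(J0_ZEROS[1], second_min[0]))

# Diameters compatible with both minima.
d_allowed = (max(d_first[0], d_second[0]), min(d_first[1], d_second[1]))

print(f"first minimum : {d_first[0]:.1f} to {d_first[1]:.1f} muas")
print(f"second minimum: {d_second[0]:.1f} to {d_second[1]:.1f} muas")
print(f"overlap       : {d_allowed[0]:.1f} to {d_allowed[1]:.1f} muas")
```

In this pure-ring limit the overlap comes out at roughly 52 to 60 μas, in line with the ∼50 to 60 μas range quoted above.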

Taken together, even these few, simple, and only modestly constrained visibility features result in a rather narrow permitted range of model parameter values for d, w, and f_d; an example of a "best-fit" model from within the permitted range is shown by the gray curve in the left panel of Figure 7. However, we stress that the above constraints only strictly hold within the context of the specific toy model used to derive them. More general and robust constraints on the emission structure require a model that can accommodate more than just the gross features; such models are produced as part of the imaging (Paper III and Section 5) and geometric modeling analyses (see Sections 6 and 7) carried out in this paper series.

4.3. Geometric Ring Model Specification

The ring-like images reconstructed in Paper III are not azimuthally symmetric, but instead show pronounced azimuthal brightness variations that we would like to capture in our geometric modeling analyses. In this section, we specify the "mG-ring" model that we use in Sections 6 and 7 to quantify the morphological properties of the observed Sgr A* emission.

4.3.1. Image-domain Representation of mG-ring Model

Adopting the construction developed by Johnson et al. (2020), we can model an infinitesimally thin circular ring with azimuthal brightness variations using a sum over angular Fourier modes indexed by integer k,

$\begin{eqnarray}&&{I}_{\mathrm{ring}}(r,\phi )=\displaystyle \frac{{F}_{\mathrm{ring}}}{\pi d}\,\delta \,\left(r-\displaystyle \frac{d}{2}\right)\sum _{k=-m}^{m}{\beta }_{k}{e}^{{ik}\phi }.\end{eqnarray} \tag{ 10 }$

Here r is the image radial coordinate, ϕ is the azimuthal coordinate (east of north), d is the ring diameter, {β_k} is the set of (dimensionless) complex azimuthal mode coefficients, and m sets the order of the expansion. Because the image is real, ${\beta }_{-k}={\beta }_{k}^{* }$; we enforce β₀ ≡ 1 so that F_ring sets the total flux density of the ring. Given that the images from Paper III show a ring of radius ∼25 μas and the diffraction-limited EHT resolution is ∼20 μas, we expect the data to primarily constrain ring modes with $m\lesssim \pi \left(25/20\right)\approx 4$. We refer to this asymmetric ring as an "m-ring" of order m.

For the purposes of constraining additional image structures, we augment this m-ring in two ways. First, we convolve the m-ring with a circular Gaussian kernel of FWHM W,

$\begin{eqnarray}&&{I}_{\mathrm{ring}}(r,\phi ;W)={I}_{\mathrm{ring}}(r,\phi )* \left[\displaystyle \frac{4\mathrm{ln}(2)}{\pi {W}^{2}}\exp \,\left(-\displaystyle \frac{4\mathrm{ln}(2){r}^{2}}{{W}^{2}}\right)\right].\end{eqnarray} \tag{ 11 }$

Second, we add a circular Gaussian component that is concentric with the ring, which serves to provide a nonzero brightness floor interior to the ring. The Gaussian component has a total flux density of F_Gauss and an FWHM of W_Gauss,

$\begin{eqnarray}&&{I}_{\mathrm{Gauss}}(r,\phi )=\displaystyle \frac{4\mathrm{ln}(2){F}_{\mathrm{Gauss}}}{\pi {W}_{\mathrm{Gauss}}^{2}}\exp \,\left(-\displaystyle \frac{4\mathrm{ln}(2){r}^{2}}{{W}_{\mathrm{Gauss}}^{2}}\right).\end{eqnarray} \tag{ 12 }$

We refer to the resulting composite model I(r, ϕ), where

$\begin{eqnarray}&&I(r,\phi )={I}_{\mathrm{ring}}(r,\phi ;W)+{I}_{\mathrm{Gauss}}(r,\phi ),\end{eqnarray} \tag{ 13 }$

as an "mG-ring." An example mG-ring is shown in Figure 8.

An mG-ring of order m has 5 + 2m model parameters: the flux density in the ring (F_ring), the diameter of the ring (d), the flux density in the central Gaussian (F_Gauss), the FWHM of the central Gaussian (W_Gauss), the FWHM of the ring convolving kernel (W), and two parameters for each complex Fourier coefficient β_k with 1 ≤ k ≤ m.

4.3.2. Visibility-domain Representation of mG-ring Model

To aid in efficient parameter space exploration, the mG-ring model is intentionally constructed using components and transformations that permit analytic Fourier transformations. The Fourier transform of the m-ring image (Equation (10)) is given by

$\begin{eqnarray}&&{V}_{\mathrm{ring}}(| {\boldsymbol{u}}| ,{\phi }_{u})={F}_{\mathrm{ring}}\sum _{k=-m}^{m}{\beta }_{k}{J}_{k}(\pi | {\boldsymbol{u}}| d){e}^{{ik}({\phi }_{u}-\pi /2)},\end{eqnarray} \tag{ 14 }$

where (∣ u ∣, ϕ_u) are polar coordinates in the Fourier domain. The convolution with a circular Gaussian in the image plane corresponds to multiplication of this function by the Fourier transform of the convolving kernel,

$\begin{eqnarray}&&{V}_{\mathrm{ring}}(| {\boldsymbol{u}}| ,{\phi }_{u};W)=\exp \left(-\displaystyle \frac{{\pi }^{2}{W}^{2}| {\boldsymbol{u}}{| }^{2}}{4\mathrm{ln}(2)}\right){V}_{\mathrm{ring}}(| {\boldsymbol{u}}| ,{\phi }_{u}).\end{eqnarray} \tag{ 15 }$

The Fourier transform of the Gaussian image (Equation (12)) is given by

$\begin{eqnarray}&&{V}_{\mathrm{Gauss}}(| {\boldsymbol{u}}| ,{\phi }_{u})={F}_{\mathrm{Gauss}}\exp \left(-\displaystyle \frac{{\pi }^{2}{W}_{\mathrm{Gauss}}^{2}| {\boldsymbol{u}}{| }^{2}}{4\mathrm{ln}(2)}\right).\end{eqnarray} \tag{ 16 }$

By the linearity of the Fourier transform, the visibility-domain representation of the mG-ring model is then simply the sum of these two components,

$\begin{eqnarray}&&V(| {\boldsymbol{u}}| ,{\phi }_{u})={V}_{\mathrm{ring}}(| {\boldsymbol{u}}| ,{\phi }_{u};W)+{V}_{\mathrm{Gauss}}(| {\boldsymbol{u}}| ,{\phi }_{u}).\end{eqnarray} \tag{ 17 }$
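As a concrete reading of Equations (14)–(17), the following sketch evaluates the mG-ring visibility at a single (u,v) point. The function names, argument order, and example values are assumptions introduced here; the modeling codes used in this paper series have their own interfaces. The Bessel functions are again evaluated from Bessel's integral, which holds for negative integer orders as well.

```python
import cmath
import math

MUAS = math.pi / (180 * 3600 * 1e6)  # one microarcsecond in radians

def bessel_j(n, x, steps=400):
    """Integer-order J_n(x) via Bessel's integral (midpoint rule)."""
    h = math.pi / steps
    return sum(math.cos(n * (k + 0.5) * h - x * math.sin((k + 0.5) * h))
               for k in range(steps)) * h / math.pi

def mg_ring_visibility(u_abs, phi_u, F_ring, d, W, F_gauss, W_gauss, betas):
    """mG-ring visibility, Equations (14)-(17).

    u_abs          : baseline length in wavelengths
    phi_u          : Fourier-domain polar angle in radians
    d, W, W_gauss  : angular sizes in radians
    betas          : [beta_1, ..., beta_m]; beta_0 = 1, beta_{-k} = conj(beta_k)
    """
    m = len(betas)
    x = math.pi * u_abs * d
    ring = 0j
    for k in range(-m, m + 1):
        if k == 0:
            beta = 1.0 + 0j
        elif k > 0:
            beta = complex(betas[k - 1])
        else:
            beta = complex(betas[-k - 1]).conjugate()
        ring += beta * bessel_j(k, x) * cmath.exp(1j * k * (phi_u - math.pi / 2))
    # Equation (15): blur the ring with the Gaussian kernel of FWHM W.
    ring *= F_ring * math.exp(-((math.pi * W * u_abs) ** 2) / (4 * math.log(2)))
    # Equation (16): central Gaussian floor component.
    gauss = F_gauss * math.exp(-((math.pi * W_gauss * u_abs) ** 2) / (4 * math.log(2)))
    return ring + gauss  # Equation (17)

# Illustrative m = 2 example (parameter values are made up).
params = dict(F_ring=0.8, d=52 * MUAS, W=15 * MUAS,
              F_gauss=0.2, W_gauss=40 * MUAS, betas=[0.2 + 0.1j, 0.05j])
print(abs(mg_ring_visibility(3e9, 0.3, **params)))
```

Because the image is real, the returned visibility satisfies V(∣u∣, ϕ_u + π) = V*(∣u∣, ϕ_u), which provides a quick sanity check on any implementation.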

When interpreting model-fitting results in subsequent sections, we are interested in a number of derivative quantities. We will typically work with the fractional thickness of the ring, w, defined to be

$\begin{eqnarray}&&w\equiv \displaystyle \frac{W}{d}.\end{eqnarray} \tag{ 18 }$

Similarly, we are typically interested in fractional representations of flux densities. We define

$\begin{eqnarray*}&&{F}_{0}\equiv {F}_{\mathrm{ring}}+{F}_{\mathrm{Gauss}}\end{eqnarray*}$

to be the total flux density, and then

$\begin{eqnarray}&&{f}_{\mathrm{ring}}\equiv \displaystyle \frac{{F}_{\mathrm{ring}}}{{F}_{0}}\end{eqnarray} \tag{ 19 }$

and

$\begin{eqnarray}&&{f}_{\mathrm{Gauss}}\equiv \displaystyle \frac{{F}_{\mathrm{Gauss}}}{{F}_{0}}\end{eqnarray} \tag{ 20 }$

are the fraction of the total flux density that is contained in the ring and in the Gaussian components, respectively. Note that F₀ is typically close to or fixed to unity as a consequence of normalizing the data by the light curve. We also define a fractional central flux as

$\begin{eqnarray}&&{f}_{{\rm{c}}}\equiv \displaystyle \frac{{F}_{\mathrm{Gauss}}(r\lt d/2)}{{F}_{\mathrm{ring}}+{F}_{\mathrm{Gauss}}(r\lt d/2)},\end{eqnarray} \tag{ 21 }$

where F_Gauss(r < d/2) is the integrated flux density of the central Gaussian component interior to the ring radius, given by

$\begin{eqnarray}&&{F}_{\mathrm{Gauss}}(r\lt d/2)={F}_{\mathrm{Gauss}}\left[1-\exp \left(-\displaystyle \frac{{d}^{2}\mathrm{ln}(2)}{{W}_{\mathrm{Gauss}}^{2}}\right)\right].\end{eqnarray} \tag{ 22 }$

Following M87* Paper IV, the m-ring position angle η and degree of azimuthal asymmetry A are both determined by the coefficient of the m = 1 mode,

$\begin{eqnarray}\begin{array}{rcl}\eta & \equiv & \arg \left({\displaystyle \int }_{0}^{2\pi }I(\phi ){e}^{i\phi }d\phi \right)\\ & = & -\arg \left({\beta }_{1}\right),\end{array}\end{eqnarray} \tag{ 23 }$

$\begin{eqnarray}\begin{array}{rcl}A & \equiv & \displaystyle \frac{\left|{\displaystyle \int }_{0}^{2\pi }I(\phi ){e}^{i\phi }d\phi \right|}{{\displaystyle \int }_{0}^{2\pi }I(\phi )d\phi }\\ & = & | {\beta }_{1}| .\end{array}\end{eqnarray} \tag{ 24 }$

A number of these derivative quantities are illustrated in the example mG-ring shown in Figure 8.
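The derived quantities in Equations (18)–(24) follow directly from the fitted mG-ring parameters. The helper below collects them in one place; the function name and the dictionary keys are choices made for this sketch, and the input values in the usage example are illustrative only.

```python
import cmath
import math

def derived_quantities(F_ring, d, W, F_gauss, W_gauss, beta_1):
    """Derived mG-ring quantities; d, W, W_gauss share the same angular unit."""
    F0 = F_ring + F_gauss
    # Equation (22): Gaussian flux density interior to the ring radius d/2.
    F_gauss_inner = F_gauss * (1 - math.exp(-d**2 * math.log(2) / W_gauss**2))
    return {
        "w": W / d,                                      # Equation (18)
        "f_ring": F_ring / F0,                           # Equation (19)
        "f_gauss": F_gauss / F0,                         # Equation (20)
        "f_c": F_gauss_inner / (F_ring + F_gauss_inner), # Equation (21)
        "eta": -cmath.phase(beta_1),                     # Equation (23), radians
        "A": abs(beta_1),                                # Equation (24)
    }

# Illustrative inputs (sizes in muas, flux densities normalized to unity).
print(derived_quantities(F_ring=0.8, d=50.0, W=15.0,
                         F_gauss=0.2, W_gauss=50.0, beta_1=0.2j))
```

When W_Gauss equals d, exactly half of the Gaussian flux density lies inside the ring radius, which makes a convenient test case for Equation (22).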

4.4. Calibrating Ring Size Measurements to a Common Physical Scale

The parameters returned by the geometric modeling and feature extraction analyses used in this paper to describe the Sgr A* emission structure do not correspond directly to physical quantities. Instead, the relationship between measured and physical quantities must be calibrated using data for which we know the correct underlying physical system's defining parameters. For ring size measurements, the associated physical quantity of interest is related to the angular size of the gravitational radius,

$\begin{eqnarray}&&{\theta }_{g}=\displaystyle \frac{{GM}}{{c}^{2}D},\end{eqnarray} \tag{ 25 }$

which sets the absolute scale of the system.

Under the assumption that the emission near the black hole originates from some "typical" radius, a measurement of the angular diameter d of the emitting region will be related to θ_g by a scaling factor α,

$\begin{eqnarray}&&d=\alpha {\theta }_{g}.\end{eqnarray} \tag{ 26 }$

If the observations were directly sensitive to the critical curve bounding the black hole shadow, then α could be determined analytically and would take on a value ranging from ∼9.6 to 10.4 depending on the black hole spin and inclination (Bardeen 1973; Takahashi 2004). For more realistic emission structures and measurement strategies, the value of α cannot be determined from first principles and must instead be calibrated.
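To give a sense of scale for Equations (25) and (26), the sketch below evaluates θ_g and the corresponding range of critical-curve diameters. The mass of 4 × 10⁶ M☉ and distance of 8 kpc are round illustrative inputs chosen here, not values taken from this section, and the physical constants are standard approximate values.

```python
import math

MUAS = math.pi / (180 * 3600 * 1e6)  # one microarcsecond in radians
G = 6.674e-11        # m^3 kg^-1 s^-2
C = 299792458.0      # m s^-1
M_SUN = 1.989e30     # kg
KPC = 3.0857e19      # m

def theta_g_muas(mass_msun, distance_kpc):
    """Equation (25): angular gravitational radius in microarcseconds."""
    return G * mass_msun * M_SUN / (C**2 * distance_kpc * KPC) / MUAS

# Round illustrative inputs (assumptions, not measurements from this paper).
theta_g = theta_g_muas(4.0e6, 8.0)

# Equation (26) with the analytic critical-curve range of alpha.
d_low, d_high = 9.6 * theta_g, 10.4 * theta_g
print(f"theta_g = {theta_g:.2f} muas, d = {d_low:.1f} to {d_high:.1f} muas")
```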

Our α calibration strategy generally follows the procedure developed in M87* Paper VI. Using the library of GRMHD simulations described in Paper V, we generate a suite of 100 synthetic data sets that emulate the cadence and sensitivity of the 2017 EHT observations and that contain a realistic character and magnitude of data corruption; Appendix D describes the generation of these synthetic data sets. In the analyses described in Sections 5, 6, and 7, 90 of these 100 synthetic data sets are used to derive the α calibration for each analysis pathway, while the remaining 10 data sets are used to validate the calibration.

After carrying out ring size measurements on each of the data sets in the suite, we determine α (for each specific combination of data set and measurement technique) by dividing the measured ring diameter by the known value of θ_g (per Equation (26)). For a given measurement technique, the distribution of α values that results from applying this procedure to the entire suite of synthetic data sets then provides a measure of α and its theoretical uncertainty. The α value associated with each measurement technique can then be used to translate Sgr A* ring size measurements into their corresponding θ_g constraints. We note that this calibration strategy assumes that the images contained in the GRMHD library provide a reliable representation of the emission structure in the vicinity of Sgr A*; a separate calibration strategy that relaxes this GRMHD assumption is presented in Paper VI.
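The bookkeeping of this calibration can be sketched as follows. Summarizing the α distribution by its mean and sample standard deviation, and propagating that scatter to θ_g at first order, are simplifications assumed here; the function names and the numbers in the usage example are placeholders rather than values from the synthetic data suite.

```python
import statistics

def calibrate_alpha(measured_d_muas, true_theta_g_muas):
    """Per-data-set alpha = d / theta_g (Equation (26)), summarized by
    the mean and sample standard deviation across the suite."""
    alphas = [d / t for d, t in zip(measured_d_muas, true_theta_g_muas)]
    return statistics.mean(alphas), statistics.stdev(alphas)

def theta_g_from_ring(d_muas, alpha, alpha_err):
    """Convert a measured ring diameter to theta_g, propagating only the
    calibration scatter in alpha (first-order, assumed here)."""
    theta = d_muas / alpha
    return theta, theta * alpha_err / alpha

# Placeholder calibration inputs (made-up numbers, not simulation results).
measured = [49.0, 52.0, 50.5, 53.0]
truth = [5.0, 5.0, 5.0, 5.0]
alpha, alpha_err = calibrate_alpha(measured, truth)
print(theta_g_from_ring(51.0, alpha, alpha_err))
```

In practice the measurement uncertainty on d would be combined with the calibration scatter; that step is left out of this sketch.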

Appendix I describes elements of the calibration and validation strategy that are specific to each of the analysis pathways detailed in Sections 5, 6, and 7.

5. Image-domain Feature Extraction

The imaging carried out in Paper III permits very flexible emission structures to be reconstructed from the Sgr A* data, but the majority of these images exhibit a ring-like morphology whose properties we seek to characterize. In this section, we describe our image-domain feature extraction (IDFE) procedure, which uses a topological classification scheme to identify the presence of a ring-like structure in an image and quantifies the parameters that best describe this ring using two different algorithms. We apply this IDFE procedure to the Sgr A* image reconstructions from Paper III.

5.1. Imaging Methods and Products

The imaging analyses carried out in Paper III use four different algorithms classified into three categories: one sampling-based posterior exploration algorithm (Themis; Broderick et al. 2020a, 2020b), one CLEAN-based deconvolution algorithm (DIFMAP; Shepherd 1997), and two "regularized maximum likelihood" (RML) algorithms (eht-imaging, Chael et al. 2016, 2018; SMILI, Akiyama et al. 2017a, 2017b). All methods produce image reconstructions using band-combined data (i.e., both low band and high band), and the latter three are run on two versions of the Sgr A* data: a "descattered" version that attempts to deconvolve the effects of the diffractive scattering kernel from the data, and a "scattered" version that applies no such deconvolution. The posterior exploration imaging method Themis instead applies the effects of diffractive scattering as part of its internal forward model, rather than deconvolving the data; the analogous "scattered" and "descattered" versions of the Themis images thus correspond simply to those for which the scattering kernel has been applied or not, respectively. The posterior exploration imaging jointly reconstructs the combined April 6 and April 7 data sets (see Appendix B), while the CLEAN and RML imaging reconstructs each day individually, focusing primarily on the April 7 data and using the April 6 data for cross-validation. Example fits and residuals for each of the imaging pipelines are shown in Figure 9,¹⁵² and χ² statistics for each image are provided in Appendix E; detailed descriptions of the data preprocessing and imaging procedures for each imaging algorithm are provided in Paper III.

**Figure 9.** Representative examples of imaging results for each of the four imaging pipelines used in Paper III; `DIFMAP` is shown in the top left panel, `eht-imaging` in the top right panel, `SMILI` in the bottom left panel, and Themis in the bottom right panel. The top section of each panel shows the light-curve-normalized complex visibility data (in blue) as a function of baseline length; the light-curve-normalized visibilities are denoted as $\hat{V}$ . The real parts of the complex visibilities are plotted as filled markers, and the imaginary parts of the complex visibilities are plotted as open markers; the corresponding model visibilities are overplotted as red points. The plotted data have been through the pre-analysis and pre-imaging calibration procedures described in M87* Paper III, Paper II, and Paper III. The bottom section of each panel shows the normalized residuals—i.e., the difference between the model and data visibilities, normalized by the data uncertainties—as a function of baseline length. The solid red horizontal line marks zero residual, and the two dotted horizontal red lines mark ± one standard deviation. The blue histogram on the right side of each bottom panel shows the distribution of normalized residuals, with the solid red curve showing a unit-variance normal distribution and the dotted green curve showing a normal distribution with variance equal to that of the normalized residuals. We note that the visibilities for the `DIFMAP`, `eht-imaging`, and `SMILI` pipelines have been "descattered" and so have somewhat larger typical amplitudes than the visibilities for the Themis pipeline (for which the scattering is incorporated as part of the forward model; see Equation (5.1) and Paper III). We also note that the different imaging pipelines make different choices about data averaging: `DIFMAP` and `eht-imaging` average the data over 60 s intervals, `SMILI` averages over 120 s intervals, and Themis averages over scans. 
Detailed descriptions of each of the imaging methods are provided in Paper III.

For the CLEAN and RML imaging methods, there are a number of tunable hyperparameters associated with each algorithm whose values are determined through extensive "parameter surveys" carried out on synthetic data sets. During a parameter survey, images of each synthetic data set are reconstructed using a broad range of possible values for each hyperparameter. Settings that produce high-fidelity image reconstructions across all synthetic data sets are collected into a "top set" of hyperparameters, and these settings are then applied for imaging the Sgr A* data. The resulting top sets of Sgr A* images capture emission structures that are consistent with the data, and we use these top-set images for the feature extraction analyses in this paper.

The Themis imaging algorithm explores a posterior distribution over the image structure, and there are no hyperparameters that require synthetic data surveys to determine. Rather than producing a top set of images, Themis instead produces a sample of images drawn from the posterior determined from the Sgr A* data. We use these posterior image samples for the feature extraction analyses in this paper.

5.2. Image-domain Feature Extraction Methods

Given the top-set and posterior images from Paper III, we carry out IDFE analyses using two separate tools: REx and VIDA. An independent cross-validation of both IDFE tools has been carried out in P. Tiede et al. (2022, in preparation). In this section, we provide a brief overview of each method and specify the details relevant for the analyses presented in this paper.

5.2.1. `REx`

The Ring Extractor (REx) is an IDFE tool for quantifying the morphological properties of ring-like images. It is available as part of the eht-imaging software library and is described in detail in Chael (2019). REx was the main tool used in M87* Paper IV to extract ring properties from the M87* images, and detailed definitions of the various REx parameters are provided in that paper.

For the majority of the REx-derived ring parameters, we retain the same definitions as used in M87* Paper IV. REx first defines a ring center (x₀, y₀), which is determined to be the point in the image from which radial intensity profiles have a minimum dispersion in their peak intensity radii. The ring radius, r₀, is then taken to be the average of these peak intensity radii over all angles, and the ring thickness w is taken to be the angular average of the FWHM about the peak measured along each radial intensity profile. To avoid biases associated with a nonzero floor to the image brightness outside of the ring, we subtract out the quantity

$\begin{eqnarray}&&{I}_{\mathrm{floor}}=\frac{1}{2\pi }{\int }_{0}^{2\pi }I({r}_{\max }=60\,\mu \mathrm{as},\phi )d\phi \end{eqnarray} \tag{ 27 }$

when computing the FWHM, i.e., we compute the average FWHM of I(r, ϕ) − I_floor. For all other ring parameters, the definitions remain the same as those used in M87* Paper IV.
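The effect of the floor subtraction on a single radial profile can be sketched as follows. This is a minimal illustration and not the REx implementation: the function name `fwhm_above_floor`, the linear interpolation of the half-maximum crossings, and the synthetic Gaussian profile are all assumptions made here, and the floor is passed in as a number rather than computed from the azimuthal average at 60 μas.

```python
import math

def fwhm_above_floor(radii, profile, floor):
    """FWHM of (profile - floor) about its peak, using linear interpolation."""
    shifted = [p - floor for p in profile]
    k = max(range(len(shifted)), key=shifted.__getitem__)  # index of the peak
    half = 0.5 * shifted[k]

    def crossing(i, j):
        # Radius at which the shifted profile equals `half` between samples i and j.
        t = (half - shifted[i]) / (shifted[j] - shifted[i])
        return radii[i] + t * (radii[j] - radii[i])

    lo = k
    while lo > 0 and shifted[lo] > half:
        lo -= 1
    hi = k
    while hi < len(shifted) - 1 and shifted[hi] > half:
        hi += 1
    return crossing(hi - 1, hi) - crossing(lo, lo + 1)

# Illustrative profile (values are hypothetical): a Gaussian ring profile with
# a true FWHM of 15 uas peaking at r = 25 uas, sitting on a constant floor of 0.2.
radii = [0.5 * i for i in range(121)]  # 0 to 60 uas
profile = [0.2 + math.exp(-4 * math.log(2) * (r - 25.0) ** 2 / 15.0 ** 2) for r in radii]

width_with_floor_removed = fwhm_above_floor(radii, profile, floor=0.2)
width_without_subtraction = fwhm_above_floor(radii, profile, floor=0.0)
```

In this toy case the floor-subtracted width recovers the input FWHM, whereas skipping the subtraction returns a larger value, which is the kind of bias the subtraction is meant to avoid.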

REx defines the ring position angle η and asymmetry A as the argument and amplitude, respectively, of the first circular mode,

$\begin{eqnarray}&&{\beta }_{1}={\left\langle \frac{{\int }_{0}^{2\pi }I(\phi ){e}^{-i\phi }d\phi }{{\int }_{0}^{2\pi }I(\phi )d\phi }\right\rangle }_{r},\end{eqnarray} \tag{ 28 }$

where the angled brackets denote a radial average between r₀ − w/2 and r₀ + w/2.¹⁵³ These definitions are analogous to those used to define the corresponding position angle and asymmetry of the mG-ring model (Equations (23) and (24), respectively). The fractional central brightness f_c is defined to be the ratio of the mean brightness within 5 μas of the center to the azimuthally averaged brightness along the ring (i.e., along r = r₀).

As in Paper III, we replace the negative pixels in Themis images with zero values before performing REx analyses.

5.2.2. `VIDA`

Variational Image-Domain Analysis (VIDA; P. Tiede et al. 2022, in preparation) is an IDFE tool for quantifying the parameters describing a specifiable image morphology; it is written in Julia (Bezanson et al. 2017) and contained in the package VIDA.jl.¹⁵⁴ VIDA employs a template-matching approach for image analysis, using parameterized templates to approximate an image and adjusting the parameters of the templates until a specified cost function is minimized. Within VIDA, the cost function takes the form of a probability divergence, which provides a distance metric between the image and template; the template parameters that minimize this divergence are taken to provide the best description of the image. The VIDA optimization strategy and additional details are provided in P. Tiede et al. (2022, in preparation).

For the IDFE analyses in this paper, we use VIDA's SymCosineRingwFloor template and the least-squares divergence (for details, see Section 8 of Paper III). This template describes an image structure that is similar to the mG-ring model (Section 4.3), and it is characterized by a ring center (x₀, y₀), a ring diameter d = 2r₀, an FWHM fractional ring thickness w, and a cosine expansion describing the azimuthal brightness distribution S(ϕ),

$\begin{eqnarray}&&S(\phi )=1-2\sum _{k=1}^{m}{A}_{k}\cos \left[k(\phi -{\eta }_{k})\right].\end{eqnarray} \tag{ 29 }$

To maintain consistency with the geometric modeling analyses (see Sections 6 and 7), we use m = 4. We also restrict the value of the A₁ parameter to be <0.5 to avoid negative flux in the template. As with the mG-ring model, the orientation η is equal to the first-order phase η₁, and the asymmetry A is equal to the first-order coefficient A₁.
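As a small numerical illustration of Equation (29), the sketch below evaluates S(ϕ) on a grid and reports its minimum. The function names and the example amplitudes are hypothetical and are not taken from VIDA.jl; the sketch only evaluates the cosine expansion as written.

```python
import math

def azimuthal_profile(phi, amplitudes, phases):
    """S(phi) = 1 - 2 * sum_k A_k cos[k (phi - eta_k)], with k starting at 1."""
    total = 0.0
    for k, (a_k, eta_k) in enumerate(zip(amplitudes, phases), start=1):
        total += a_k * math.cos(k * (phi - eta_k))
    return 1.0 - 2.0 * total

def min_on_grid(amplitudes, phases, n=3600):
    """Minimum of S(phi) over n equally spaced angles in [0, 2 pi)."""
    return min(azimuthal_profile(2.0 * math.pi * i / n, amplitudes, phases)
               for i in range(n))

# First-order-only examples (illustrative values): the minimum is 1 - 2 * A_1.
allowed = min_on_grid([0.4], [1.0])   # A_1 < 0.5, profile stays positive
excluded = min_on_grid([0.6], [1.0])  # A_1 > 0.5, profile dips below zero
```

For a first-order-only expansion the minimum of S(ϕ) is 1 − 2A₁, which is why the bound A₁ < 0.5 keeps that term from driving the template negative.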

To permit the presence of a central brightness floor, the SymCosineRingwFloor template contains an additional component in the form of a circular disk whose center point is fixed to coincide with that of the ring. The disk radius is fixed to be r₀. A Gaussian falloff is stitched to the outer edge of the disk, such that for radii larger than r₀ the intensity profile becomes a Gaussian with mean r₀ and an FWHM that matches the ring thickness. The flux of this disk component is a free parameter in the template. We then retain the same definition of the fractional central brightness f_c as used by REx.

5.3. Identifying Rings via Topological Classification

The output of the IDFE analysis is a set of distributions for the ring parameters from each imaging method; Figure 10 shows an example set of results from applying both IDFE software packages to the descattered Sgr A* posterior and top-set images. However, both REx and VIDA implicitly assume that the images fed into them contain a ring-like emission structure. If the input image does not contain a ring, then the output measurements may not be meaningful. For each input image, we thus wish to determine both whether the image contains a ring-like structure and how sensitive the IDFE results are to the specific manner in which "a ring-like structure" is defined.

To determine whether the images we are analyzing with REx and VIDA contain ring-like structures, we use metronization,¹⁵⁵ a software package that preprocesses the images into a form suitable for topological analysis and extracts topologically relevant features with the help of the open-source computational topology code Dionysus 2¹⁵⁶ (Morozov 2017). A detailed description of metronization can be found in Christian et al. (2022).

The metronization preprocessing procedure consists of the following steps:

1. First, the image undergoes a "robust" thresholding step, in which the pixels are sorted by brightness in a cumulative sequence, and all pixels below a certain threshold in this sequence have their values set to zero and the rest are set to a value of one.
2. Next, in a process called "skeletonization," the Boolean image produced in the first step is reduced to its topological skeleton that preserves the topological characteristics of the original shape. This step thins large contiguous areas of flagged pixels and enlarges the "holes."
3. The topological skeleton is rebinned and downsampled. Holes smaller than the rebinning resolution are preserved by the skeletonization in the previous step.
4. The downsampled image undergoes skeletonization once more.

The resulting output is a low-resolution image that preserves the topologically relevant information from the original image, speeding up the application of computationally expensive topological algorithms that follow. A technique known as persistent homology is then used to convert this low-resolution image into a topological space that preserves features that are topologically invariant. It computes a quantity known as the first Betti number that provides a metric for measuring the number of holes present in the image.
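The idea of thresholding an image and then counting its holes can be illustrated with a toy sketch. This is not the metronization or Dionysus 2 algorithm: it skips skeletonization and rebinning, it replaces persistent homology with a simple flood fill that counts enclosed background regions, and the interpretation of the cumulative threshold as a fraction of total flux is an assumption made here. All function names are hypothetical.

```python
import math
from collections import deque

def threshold_by_cumulative_flux(image, fraction):
    """Set to 0 the faintest pixels that together hold `fraction` of the flux; others to 1."""
    flat = sorted((v, r, c) for r, row in enumerate(image) for c, v in enumerate(row))
    total = sum(v for v, _, _ in flat)
    mask = [[1] * len(row) for row in image]
    running = 0.0
    for v, r, c in flat:
        running += v
        if running > fraction * total:
            break
        mask[r][c] = 0
    return mask

def count_holes(mask):
    """Count background regions (4-connected) that do not touch the image border."""
    n_rows, n_cols = len(mask), len(mask[0])
    seen = [[False] * n_cols for _ in range(n_rows)]
    holes = 0
    for r0 in range(n_rows):
        for c0 in range(n_cols):
            if mask[r0][c0] or seen[r0][c0]:
                continue
            touches_border = False
            queue = deque([(r0, c0)])
            seen[r0][c0] = True
            while queue:
                r, c = queue.popleft()
                if r in (0, n_rows - 1) or c in (0, n_cols - 1):
                    touches_border = True
                for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    rr, cc = r + dr, c + dc
                    if (0 <= rr < n_rows and 0 <= cc < n_cols
                            and not mask[rr][cc] and not seen[rr][cc]):
                        seen[rr][cc] = True
                        queue.append((rr, cc))
            if not touches_border:
                holes += 1
    return holes

# Synthetic test images (illustrative): a Gaussian ring and a Gaussian blob.
size, centre = 21, 10
ring = [[math.exp(-(math.hypot(r - centre, c - centre) - 6.0) ** 2 / (2 * 1.5 ** 2))
         for c in range(size)] for r in range(size)]
blob = [[math.exp(-((r - centre) ** 2 + (c - centre) ** 2) / (2 * 3.0 ** 2))
         for c in range(size)] for r in range(size)]

ring_holes = count_holes(threshold_by_cumulative_flux(ring, 0.2))
blob_holes = count_holes(threshold_by_cumulative_flux(blob, 0.2))
```

For a two-dimensional binary image, the number of enclosed background regions coincides with the first Betti number, so the ring image yields one hole and the blob yields none. Persistent homology additionally tracks how such holes appear and disappear across threshold levels, which this sketch does not attempt.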

The metronization software contains a number of tunable parameters that determine how closely the emission structure in the input image must resemble that of a topological ring, and for how many cumulative threshold levels it must persist, for it to be classified as a ring. We identify three modes for these parameters—a "permissive" mode, a "moderate" mode, and a "strict" mode—and explore the impact on the REx and VIDA measured parameter distributions when the input top sets and posterior images are restricted only to those that are classified as containing rings. We compare these results with those of a fourth, default setting in which the top sets and posterior images are not filtered by the classification prescribed by metronization.

We note that metronization differs from the ring identification methods presented in Paper III in that it searches for the presence of a topological ring in the input image. Figure 11 compares the mean ring and nonring descattered images for each imaging pipeline as classified by metronization in the "permissive" mode and the clustering analysis from Paper III. Both methods classify all the posterior imaging samples as rings, while the top-set imaging samples contain both ring and nonring images. We find that the mean ring and nonring images for each imaging pipeline are broadly consistent between the two classification methods.

**Figure 11.** Comparison of two ring classification procedures. Each panel shows a mean Sgr A* ring and nonring image for a single imaging pipeline, with the top row showing how the images are classified by `metronization` in the "permissive" mode and the bottom row showing the classification determined by the clustering analysis from Paper III. All of the images have been produced using descattered Sgr A* data from the HOPS calibration pipeline. The results correspond to combined April 6 and 7 data for posterior imaging and April 7 data for top-set imaging. All of the images share a common brightness color scale; the absolute brightness scale is arbitrary because each image has been normalized to have unit total flux density.

The definition of what constitutes a ring is subjective, and there will always be images that are ambiguous to the human eye. Different automated methods will classify these images differently. Hence, it is important to verify that the ring parameters measured by REx and VIDA are robust against the specifics of the ring identification scheme used. Figure 12 shows the resulting diameter distributions from ring fitting to the descattered Sgr A* images from all imaging pipelines, split out by metronization setting. As we move from the most to the least permissive classification scheme, the tails in the distributions are diminished while the primary peaks are sharpened, but the mean and general shape of the distribution remain largely unchanged. This trend indicates that while metronization penalizes images with emission structures deviating from a topological ring, the distributions of the REx and VIDA measurements are robust against the choice of metronization mode employed.

**Figure 12.** Diameter distributions determined by `REx` and `VIDA` for all descattered Sgr A* images from the HOPS pipeline, organized by `metronization` mode. Each panel shows the fraction of images found to possess topological ring structure. For posterior imaging we use the combined April 6+7 results, and for top-set imaging we use the April 7 results.

6. Snapshot Geometric Modeling

Because the Sgr A* data are observed to be time-variable (see Section 3), a static model cannot reproduce the observed data. As described in Section 3.4, one method for mitigating the effects of this variability on the reconstructed source structure is through the use of an inflated variability noise budget, as pursued during the imaging (Paper III; Section 5) and full-track geometric modeling (Section 7) analyses. In this section we instead pursue "snapshot" geometric modeling, whereby we fit a geometric model—the mG-ring model described in Section 4.3—for which the parameters are allowed to vary as a piecewise constant function of time. To this end, we divide the Sgr A* data into many independent and short "snapshots" over which the source is assumed to be static. In this section, we detail our formalism for fitting the mG-ring model to snapshots of data and for combining the fits from across snapshots into a global posterior distribution.

6.1. Data Preparation

Prior to fitting the mG-ring model to real or synthetic data, we process the data using the pre-imaging pipeline described in Paper III. This preprocessing procedure entails light-curve normalization and an inflation of the error budget to account for residual calibration uncertainties and the effects of refractive scattering in the interstellar medium toward Sgr A*. Specifically, the total error budget σ_sb for a visibility measured on the baseline b during snapshot s is given by

$\begin{eqnarray}&&{\sigma }_{{sb}}^{2}={\sigma }_{\mathrm{th},{sb}}^{2}+{f}^{2}| V{| }_{{sb}}^{2}+{\sigma }_{\mathrm{ref},{sb}}^{2}.\end{eqnarray} \tag{ 30 }$

Here the first term corresponds to the baseline-specific thermal noise (see Equation (2)), the second term is a component that is multiplicative in the visibility amplitude and is intended to capture residual (nongain) calibration errors (e.g., residual polarization leakage), and the third term is the J18model1 refractive scattering noise from Paper III. For the snapshot modeling, we fix f = 0.02 per the analyses carried out in Paper II. The pre-imaging pipeline also mitigates the impact of diffractive scattering by "deblurring" the data using the Johnson et al. (2018) model.
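Equation (30) amounts to a quadrature sum of three terms, as in the short sketch below. The function name and the numerical inputs are illustrative assumptions; in practice the thermal and refractive terms come from the data and from the scattering model, respectively.

```python
import math

def total_sigma(sigma_thermal, vis_amplitude, sigma_refractive, f=0.02):
    """Quadrature sum of thermal, fractional (f * |V|), and refractive noise terms."""
    return math.sqrt(sigma_thermal ** 2
                     + (f * vis_amplitude) ** 2
                     + sigma_refractive ** 2)

# Hypothetical values in normalized flux units: with f = 0.02 and |V| = 2,
# the fractional term is 0.04 and adds in quadrature to a thermal noise of 0.03.
example = total_sigma(sigma_thermal=0.03, vis_amplitude=2.0, sigma_refractive=0.0)
```

Because the fractional term scales with ∣V∣, it matters most on short baselines where the amplitudes are largest, while the thermal term dominates where the visibilities are weak.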

Following the application of the pre-imaging pipeline, we split the data into 120 s segments, or "snapshots," and coherently average the visibilities in each snapshot over the 120 s window. Finally, we flag snapshots that contain fewer than four unique stations, so as to retain snapshots during which closure amplitudes can be formed.
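The snapshot construction described above can be sketched as a simple binning step. The record layout, the function name `make_snapshots`, the placeholder station labels, and the use of an unweighted complex mean for the coherent average are assumptions made for this illustration and may differ from the actual pipeline.

```python
from collections import defaultdict

def make_snapshots(records, window=120.0, min_stations=4):
    """Group (time_s, station_1, station_2, complex_vis) records into fixed windows.

    Returns {snapshot_index: {(station_1, station_2): mean complex visibility}},
    keeping only snapshots with at least `min_stations` distinct stations.
    """
    bins = defaultdict(lambda: defaultdict(list))
    for t, s1, s2, vis in records:
        bins[int(t // window)][(s1, s2)].append(vis)

    snapshots = {}
    for index, baselines in bins.items():
        stations = {s for pair in baselines for s in pair}
        if len(stations) < min_stations:
            continue  # too few stations to form a closure amplitude
        # Coherent average: mean of the complex visibilities on each baseline.
        snapshots[index] = {pair: sum(v) / len(v) for pair, v in baselines.items()}
    return snapshots

# Placeholder records: the first window has four stations, the second only three.
records = [
    (10.0, "A", "B", 1 + 1j),
    (70.0, "A", "B", 3 - 1j),
    (20.0, "C", "D", 2 + 0j),
    (130.0, "A", "B", 1 + 0j),
    (140.0, "A", "C", 1 + 0j),
]
snapshots = make_snapshots(records)
```

Averaging the complex visibilities, rather than their amplitudes, preserves phase information within the window, which is what makes the average coherent.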

6.2. Snapshot Fitting Procedure

The first step of our snapshot modeling procedure is to determine the posterior distribution for the mG-ring model parameters on each snapshot of data. The observation is divided up into N_s independent snapshots, which we label using a snapshot index s. Within each snapshot we fit the mG-ring model described in Section 4.3, whose parameter vector we denote as θ_s. For a single snapshot, the posterior is given by Bayes's theorem,

$\begin{eqnarray}&&{P}_{s}({{\boldsymbol{\theta }}}_{s}| {{\boldsymbol{D}}}_{s})=\displaystyle \frac{{{ \mathcal L }}_{s}({{\boldsymbol{D}}}_{s}| {{\boldsymbol{\theta }}}_{s}){\pi }_{s}({{\boldsymbol{\theta }}}_{s})}{{{ \mathcal Z }}_{s}({{\boldsymbol{D}}}_{s})},\end{eqnarray} \tag{ 31 }$

where D_s denotes the data available on snapshot s, ${{ \mathcal L }}_{s}$ is the likelihood, π_s is the prior distribution, and ${{ \mathcal Z }}_{s}$ is the Bayesian evidence.

In our snapshot modeling analyses we make use of three different classes of interferometric data products: visibility amplitudes ∣V∣, log closure amplitudes $\mathrm{ln}A$ , and closure phases ψ. Each analysis uses only a single amplitude data product (either visibility amplitudes or log closure amplitudes) along with the closure phases. For analyses that use visibility amplitudes and closure phases, the likelihood is given by

$\begin{eqnarray}&&{{ \mathcal L }}_{s}={{ \mathcal L }}_{| V| ,s}{{ \mathcal L }}_{\psi ,s},\end{eqnarray} \tag{ 32 }$

while for those that use log closure amplitudes, we instead have

$\begin{eqnarray}&&{{ \mathcal L }}_{s}={{ \mathcal L }}_{A,s}{{ \mathcal L }}_{\psi ,s}.\end{eqnarray} \tag{ 33 }$

Here ${{ \mathcal L }}_{| V| ,s}$ , ${{ \mathcal L }}_{\psi ,s}$ , and ${{ \mathcal L }}_{A,s}$ are components of the likelihood on snapshot s associated with the visibility amplitudes, closure phases, and log closure amplitudes, respectively. We assume Gaussian likelihood functions for the amplitude data components and a von Mises likelihood function for the closure phases; the detailed expressions for each likelihood function are provided in Appendix F.

6.3. Averaging the Snapshot Results

The output of a snapshot fitting analysis is a set of posterior samples for the model parameters from each individual snapshot; Figure 13 shows an example set of posterior distributions for the mG-ring diameter parameter on each snapshot in the April 6 and April 7 data sets. To arrive at a single posterior on these parameters that combines the information from all snapshots across both days, we use a Bayesian hierarchical model similar to the one used in Baronchelli et al. (2020). This approach treats the model fit to each snapshot as a realization from some average model or "hypermodel."

6.3.1. Averaging Procedure

We denote the parameters of the average model as $\displaystyle \bar{{\boldsymbol{\theta }}\,}$ and the distribution of the snapshot model conditioned on the average model by ${\pi }_{s}({{\boldsymbol{\theta }}}_{s}| \displaystyle \bar{{\boldsymbol{\theta }}})$ . Given this conditional probability, the joint snapshot and average parameter posterior is given by

$\begin{eqnarray}&&P({\boldsymbol{\Theta }},\displaystyle \bar{{\boldsymbol{\theta }}\,}\,| {\boldsymbol{ \mathcal D }})=\displaystyle \frac{\pi (\displaystyle \displaystyle \bar{{\boldsymbol{\theta }}\,}\,)}{{ \mathcal Z }({\boldsymbol{ \mathcal D }})}\prod _{s}{{ \mathcal L }}_{s}({{\boldsymbol{D}}}_{s}| {{\boldsymbol{\theta }}}_{s}){\pi }_{s}({{\boldsymbol{\theta }}}_{s}| \displaystyle \displaystyle \bar{{\boldsymbol{\theta }}\,}\,),\end{eqnarray} \tag{ 34 }$

where $\pi (\displaystyle \bar{{\boldsymbol{\theta }}\,}\,)$ is the prior distribution for the hypermodel parameters (the "hyperprior"), ${\boldsymbol{\Theta }}=({{\boldsymbol{\theta }}}_{1},{{\boldsymbol{\theta }}}_{2},...,{{\boldsymbol{\theta }}}_{{N}_{s}})$ is the parameter vector across all snapshots, and ${\boldsymbol{ \mathcal D }}=({{\boldsymbol{D}}}_{1},{{\boldsymbol{D}}}_{2},...,{{\boldsymbol{D}}}_{{N}_{s}})$ is the data vector across all snapshots. To find the marginal average parameter posterior, we integrate this expression over all the snapshot parameters,

$\begin{eqnarray}&&P(\displaystyle \bar{{\boldsymbol{\theta }}\,}\,| {\boldsymbol{ \mathcal D }})=\displaystyle \frac{\pi (\displaystyle \bar{{\boldsymbol{\theta }}\,}\,)}{{ \mathcal Z }({\boldsymbol{ \mathcal D }})}\int \left(\prod _{s}{{ \mathcal L }}_{s}({{\boldsymbol{D}}}_{s}| {{\boldsymbol{\theta }}}_{s}){\pi }_{s}({{\boldsymbol{\theta }}}_{s}| \displaystyle \bar{{\boldsymbol{\theta }}\,}\,)\right){\rm{d}}{\boldsymbol{\Theta }}.\end{eqnarray} \tag{ 35 }$

In general, this integral is analytically intractable. However, a bit of manipulation permits us to use the posterior samples from the individual snapshot fits to make headway. Because the snapshots are independent, we can swap the order of the integral and product in Equation (35) and use Bayes's theorem to substitute in for the snapshot likelihood (Equation (31)), giving

$\begin{eqnarray}&&P(\displaystyle \bar{{\boldsymbol{\theta }}\,}\,| {\boldsymbol{ \mathcal D }})=\pi (\displaystyle \bar{{\boldsymbol{\theta }}\,}\,)\prod _{s}\left(\int P({{\boldsymbol{\theta }}}_{s}| {{\boldsymbol{D}}}_{s})\displaystyle \frac{{\pi }_{s}({{\boldsymbol{\theta }}}_{s}| \displaystyle \bar{{\boldsymbol{\theta }}\,}\,)}{{\pi }_{s}({{\boldsymbol{\theta }}}_{s})}{\rm{d}}{{\boldsymbol{\theta }}}_{s}\right).\end{eqnarray} \tag{ 36 }$

Note that the evidence term from the prefactor denominator in Equation (35) has now been subsumed into the posterior term P(θ_s∣D_s) inside of the integral. To evaluate Equation (36), we make use of the fact that the snapshot posterior samples ${{\boldsymbol{\theta }}}_{s}^{(i)}$ permit us to approximate the integral by a sum,

$\begin{eqnarray}&&P(\displaystyle \bar{{\boldsymbol{\theta }}\,}\,| {\boldsymbol{ \mathcal D }})\approx \pi (\displaystyle \bar{{\boldsymbol{\theta }}\,}\,)\prod _{s}\sum _{i}\displaystyle \frac{{\pi }_{s}({{\boldsymbol{\theta }}}_{s}^{(i)}| \displaystyle \bar{{\boldsymbol{\theta }}\,}\,)}{{\pi }_{s}({{\boldsymbol{\theta }}}_{s}^{(i)})}.\end{eqnarray} \tag{ 37 }$

We can use this expression to sample from the posterior distribution over just the hypermodel parameters $\displaystyle \bar{{\boldsymbol{\theta }}}$ , having fully marginalized over the parameters from each individual snapshot.
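A minimal sketch of this sample-based estimate for a single parameter is given below. It assumes an untruncated normal hypermodel with mean `mu` and standard deviation `sigma`, a flat snapshot prior, and synthetic posterior samples; the function names and all numerical values are hypothetical. The sum over samples is written as a mean, which differs from the plain sum only by a constant factor.

```python
import math
import random

def log_normal_pdf(x, mu, sigma):
    return -0.5 * ((x - mu) / sigma) ** 2 - math.log(sigma * math.sqrt(2.0 * math.pi))

def log_hyperposterior(mu, sigma, snapshot_samples, log_snapshot_prior, log_hyperprior):
    """log pi(mu, sigma) + sum_s log mean_i [pi_s(theta_s^(i) | mu, sigma) / pi_s(theta_s^(i))]."""
    total = log_hyperprior(mu, sigma)
    for samples in snapshot_samples:
        log_ratios = [log_normal_pdf(x, mu, sigma) - log_snapshot_prior(x) for x in samples]
        peak = max(log_ratios)  # log-sum-exp for numerical stability
        total += peak + math.log(sum(math.exp(v - peak) for v in log_ratios) / len(log_ratios))
    return total

# Synthetic example: 20 snapshots whose true values scatter about 50 with
# standard deviation 3, each with 200 posterior samples of width 2.
rng = random.Random(0)
truths = [rng.gauss(50.0, 3.0) for _ in range(20)]
snapshot_samples = [[rng.gauss(t, 2.0) for _ in range(200)] for t in truths]

flat_prior = lambda x: -math.log(60.0)       # uniform prior on an interval of width 60
flat_hyperprior = lambda mu, sigma: 0.0      # improper flat hyperprior, for illustration

near = log_hyperposterior(50.0, 3.6, snapshot_samples, flat_prior, flat_hyperprior)
far = log_hyperposterior(60.0, 3.6, snapshot_samples, flat_prior, flat_hyperprior)
```

The function can be passed to any sampler that accepts a log density over the hypermodel parameters. In this toy setup the log posterior is higher near the value about which the snapshot truths were generated than at a displaced value.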

We note that the averaging procedure described here is simply a generalization of standard inverse-variance weighting. If we consider a delta-function hypermodel that contains only a single parameter (i.e., the to-be-determined mean value) for each snapshot model parameter, then in the limit where the individual snapshot posteriors P(θ_s∣D_s) are Gaussian and the priors on $\displaystyle \bar{{\boldsymbol{\theta }}\,}$ and θ_s are uninformative, the posterior maximum for $P(\displaystyle \bar{{\boldsymbol{\theta }}\,}\,| {\boldsymbol{ \mathcal D }})$ is equal to the mean of the snapshot posterior means weighted by their inverse posterior variances. However, because the model we employ (described in the following section) does not conform to the necessary conditions (see, e.g., the non-Gaussian snapshot posteriors shown in Figure 13), we proceed with the more general averaging procedure.
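For reference, the inverse-variance limit can be written out explicitly. The following is a sketch in notation introduced only for this illustration: $m_s$ and $\varsigma_s$ denote the mean and standard deviation of a Gaussian snapshot posterior for a single parameter, and μ is the corresponding mean hyperparameter. With a delta-function hypermodel, $\pi_s(\theta_s|\mu)=\delta(\theta_s-\mu)$, and flat priors, Equation (36) reduces to

$\begin{eqnarray}&&P(\mu | {\boldsymbol{ \mathcal D }})\propto \prod _{s}\exp \left[-\frac{{(\mu -{m}_{s})}^{2}}{2{\varsigma }_{s}^{2}}\right],\qquad \hat{\mu }=\frac{{\sum }_{s}{m}_{s}/{\varsigma }_{s}^{2}}{{\sum }_{s}1/{\varsigma }_{s}^{2}},\end{eqnarray}$

where $\hat{\mu}$ follows from setting the derivative of the log posterior with respect to μ to zero.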

6.3.2. Hypermodel Specification

We now need to specify the hypermodel $\pi ({{\boldsymbol{\theta }}}_{s}| \displaystyle \bar{{\boldsymbol{\theta }}\,}\,)$ that determines the distribution from which the individual snapshot models are drawn; for simplicity, we choose a hypermodel that is approximately Gaussian. Let $\displaystyle \bar{{\boldsymbol{\theta }}\,}\,=({\boldsymbol{\mu }},{\boldsymbol{\sigma }})$ , where μ is a vector of the mean parameter values and σ is a vector containing their standard deviations across scans. We assign most hypermodel parameters to be distributed according to a truncated normal distribution,

$\begin{eqnarray}&&{\pi }_{s,\mathrm{tN}}({\theta }_{s,i}| {\mu }_{i},{\sigma }_{i})={ \mathcal T }({\theta }_{s,i}| {\mu }_{i},{\sigma }_{i};{a}_{i},{b}_{i}).\end{eqnarray} \tag{ 38 }$

Here ${ \mathcal T }(\theta | \mu ,\sigma ;a,b)$ denotes the density for a truncated normal distribution with mean μ and standard deviation σ, and whose lower and upper bounds are given by a and b, respectively; we index the separate parameters by i. This truncation is necessary to ensure that the support of the hypermodel parameters matches that of the individual snapshot model parameters. However, for angular parameters—i.e., those with values that are periodic in [0, 2π)—we instead use a von Mises distribution,

$\begin{eqnarray}&&{\pi }_{s,\mathrm{vM}}({\theta }_{s,j}| {\mu }_{j},{\sigma }_{j})=\displaystyle \frac{1}{2\pi {I}_{0}({\sigma }_{j}^{-2})}\exp \left[\displaystyle \frac{\cos ({\theta }_{s,j}-{\mu }_{j})}{{\sigma }_{j}^{2}}\right].\end{eqnarray} \tag{ 39 }$

The subscripted "tN" in Equation (38) indicates that the corresponding parameters use a truncated normal prior, while the subscripted "vM" in Equation (39) similarly indicates that the corresponding parameters use a von Mises prior. The total hypermodel is then given by

$\begin{eqnarray}&&\pi ({{\boldsymbol{\theta }}}_{s}| {\boldsymbol{\mu }},{\boldsymbol{\sigma }})=\prod _{i}{\pi }_{s,\mathrm{tN}}({\theta }_{s,i}| {\mu }_{i},{\sigma }_{i})\prod _{j}{\pi }_{s,\mathrm{vM}}({\theta }_{s,j}| {\mu }_{j},{\sigma }_{j}),\end{eqnarray} \tag{ 40 }$

where the first product runs over nonangular parameters and the second runs over angular parameters.
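The two hypermodel densities and their product can be sketched with the Python standard library alone. The function names are hypothetical, and the power-series evaluation of the modified Bessel function I₀ is a convenience chosen here that is adequate only for moderate concentrations (i.e., σ not too small); it is not a statement about how the hypermodel is evaluated in practice.

```python
import math

def log_truncated_normal(theta, mu, sigma, a, b):
    """Log density of a normal(mu, sigma) truncated to [a, b]."""
    if not a <= theta <= b:
        return -math.inf
    cdf = lambda x: 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))
    log_norm = math.log(sigma * math.sqrt(2.0 * math.pi) * (cdf(b) - cdf(a)))
    return -0.5 * ((theta - mu) / sigma) ** 2 - log_norm

def bessel_i0(x, terms=200):
    """Modified Bessel function I_0(x) from its power series (moderate x only)."""
    total, term = 1.0, 1.0
    for k in range(1, terms):
        term *= (0.5 * x / k) ** 2
        total += term
    return total

def log_von_mises(theta, mu, sigma):
    """Log density of the von Mises form in Equation (39); concentration is 1 / sigma**2."""
    kappa = 1.0 / sigma ** 2
    return kappa * math.cos(theta - mu) - math.log(2.0 * math.pi * bessel_i0(kappa))

def log_hypermodel(linear, angular):
    """Sum log densities over (theta, mu, sigma, a, b) and (theta, mu, sigma) tuples."""
    return (sum(log_truncated_normal(*p) for p in linear)
            + sum(log_von_mises(*p) for p in angular))
```

Working with log densities keeps the product in Equation (40) numerically stable when many parameters are combined. A value outside the truncation bounds returns negative infinity, so the hypermodel assigns it zero probability.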

We set the hyperpriors for μ to be equal to the corresponding snapshot priors, which are specified in Table 1. For σ we instead use a half-normal hyperprior,

$\begin{eqnarray}&&\pi ({\boldsymbol{\sigma }})=\prod _{i}{ \mathcal T }({\sigma }_{i}| 0,{L}_{i}/4;\ 0,{L}_{i}),\end{eqnarray} \tag{ 41 }$

where L_i = (b_i − a_i) is the breadth of support for parameter θ_s,i. Appendix G describes the level of consistency between these selected hyperpriors and the priors for the individual snapshot model parameters.

Table 1. Snapshot Modeling mG-ring Priors

| Parameter | `Comrade` | `DPI` |
| --- | --- | --- |
| F₀ | ${ \mathcal U }(0.8,1.2)$ | δ(1) |
| d (μas) | ${ \mathcal U }(25,85)$ | ... |
| ${d}^{{\prime} }$ (μas) | ... | ${ \mathcal U }(25,85)$ |
| W (μas) | ${ \mathcal U }(1,40)$ | ... |
| ${W}^{{\prime} }$ (μas) | ... | ${ \mathcal U }(1,40)$ |
| ∣β_m∣ | ${ \mathcal U }(0,0.5)$ | ${ \mathcal U }(0,0.5)$ |
| $\arg ({\beta }_{m})$ (deg) | ${ \mathcal U }(-180,180)$ | ${ \mathcal U }(-180,180)$ |
| f_Gauss | ${ \mathcal U }(0,1)$ | ${ \mathcal U }(0,1)$ |
| W_Gauss (μas) | ${ \mathcal U }(40,200)$ | ${ \mathcal U }(40,200)$ |

Note. Prior distributions for Comrade and DPI snapshot geometric modeling analyses. ${ \mathcal U }(a,b)$ denotes a uniform prior on the interval [a, b], and δ(a) denotes a delta-function (i.e., fixed-value) prior, with the parameter value fixed at a. For the definitions of the parameters see Section 4.3.


6.4. Software Implementations

We use three different software packages to carry out snapshot geometric modeling on the Sgr A* data and a fourth software to perform the hypermodel sampling. In this section, we specify the relevant implementation specifics for these different tools. Cross-validation tests are detailed in Appendix H.

6.4.1. `Comrade`

Our primary snapshot fitting software is the modeling framework Comrade (P. Tiede 2022, in preparation), which is written in the dynamic programming language Julia (Bezanson et al. 2017). Comrade does not natively include functionality for constructing a joint probability describing both observations and model. Instead, it interfaces with existing probabilistic programming languages present in Julia. For the analyses presented in this paper, we use the probabilistic programming package Soss¹⁵⁷ to construct the joint probability. This interface is specified in the package ComradeSoss.jl.¹⁵⁸ To sample from the posterior, we use the nested sampling package dynesty, which also produces estimates of the Bayesian evidence (Speagle 2020).

Given a model specification, Comrade can fit a variety of interferometric data products, including visibility amplitudes, closure phases, and log closure amplitudes. Unless otherwise specified, for the snapshot modeling analyses performed in this paper, we use Comrade to fit the mG-ring model to visibility amplitudes and closure phases; the snapshot likelihood is thus given by Equation (32). Prior to fitting, all time stamps that contain fewer than four baselines are flagged.

When fitting to visibility amplitudes, we include the station gain amplitudes as model parameters alongside the geometric parameters that describe the mG-ring model. For the gain amplitudes we use a lognormal prior with a log-mean of zero (i.e., corresponding to unit gain amplitude) and a log standard deviation of 0.1 on all stations except for LMT, for which we use a log standard deviation of 0.2 to accommodate its larger variations (M87* Paper III; Paper II).

6.4.2. `eht-imaging`

We also utilize the geometric model-fitting tools developed within the eht-imaging Python library (Chael et al. 2016, 2018). This library enables visibility-domain fitting to arbitrary combinations of simple analytic models, including the mG-ring model, and it can do so using a variety of interferometric data products, including visibility amplitudes, closure phases, and log closure amplitudes. eht-imaging is also able to interface with a variety of external packages to perform parameter optimization or posterior exploration.

For the snapshot modeling analyses performed in this paper, we match the operation of eht-imaging with that of Comrade. Unless otherwise specified, we use eht-imaging to fit to visibility amplitudes and closure phases, so that the snapshot likelihood is given by Equation (32), and we use dynesty (Speagle 2020) for posterior exploration and evidence estimation. We also specify the same priors for the station gain parameters as used in the Comrade fits.

Given that eht-imaging and Comrade use identical model specifications, priors, and samplers, we expect all results produced by these two packages to be identical up to sampling precision. We thus use only Comrade fits for all Sgr A* snapshot geometric analyses in this paper.

6.4.3. `DPI`

The third software package we use for snapshot geometric modeling is the Python code Deep Probabilistic Imaging/Inference (DPI/α-DPI; Sun & Bouman 2021; Sun et al. 2022). DPI approximates the posterior over all model parameters by fitting a normalizing flow neural network (Rezende & Mohamed 2015) to the data using a Rényi α-divergence variational inference technique (Li & Turner 2016). DPI is an optimization-based posterior estimation framework, and it uses the auto-differentiation package PyTorch (Paszke et al. 2017) to optimize the neural network weights. The posterior estimation accuracy is further improved post-optimization through importance reweighting of the samples generated by the normalizing flow neural network.

DPI supports fitting to multiple data products, including visibility amplitudes and closure quantities, but it does not currently support the inclusion of station gain amplitudes as model parameters. We thus use DPI to fit to closure phases and log closure amplitudes; the snapshot likelihood is given by Equation (33). Prior to fitting, all time stamps that are unable to form at least one closure phase and at least one closure amplitude are flagged.

DPI differs from both Comrade and eht-imaging in that it defines geometric models in the image domain rather than in the visibility domain, and it uses a nonuniform fast Fourier transform (NFFT) to compute the necessary data products. For the analyses carried out in this paper, we discretize the model as an image containing 32 × 32 pixels spanning a 160 μas field of view.

Because the pixel size is finite, DPI cannot support a model containing infinitesimally thin rings such as that in Equation (10); furthermore, convolutions in the image domain are computationally expensive. The DPI fits in this paper thus employ a modified version of the mG-ring model specification,

$\begin{eqnarray}\begin{array}{rcl}{I}_{\mathrm{ring}}^{{\prime} }(r,\phi ) & = & {F}_{\mathrm{ring}}\displaystyle \frac{\exp \left[-\tfrac{4\mathrm{ln}2}{{{W}^{{\prime} }}^{2}}{\left(r-\tfrac{{d}^{{\prime} }}{2}\right)}^{2}\right]\sum _{k=-m}^{m}{\beta }_{k}{e}^{{ik}\phi }}{2\pi {\displaystyle \int }_{0}^{\infty }r\exp \left[-\tfrac{4\mathrm{ln}2}{{{W}^{{\prime} }}^{2}}{\left(r-\tfrac{{d}^{{\prime} }}{2}\right)}^{2}\right]\,{dr}},\\ {I}^{{\prime} }(r,\phi ) & = & {I}_{\mathrm{ring}}^{{\prime} }(r,\phi )+{I}_{\mathrm{Gauss}}(r,\phi ),\\ {F}_{0} & = & {F}_{\mathrm{ring}}+{F}_{\mathrm{Gauss}}=1,\end{array}\end{eqnarray} \tag{ 42 }$

where we note that ${d}^{{\prime} }$ and ${W}^{{\prime} }$ are conceptually distinct from d and W. The quantities d and W in the mG-ring model from Section 4.3 determine the diameter of the infinitesimally thin ring and the FWHM of its convolving kernel, respectively. In contrast, the quantities ${d}^{{\prime} }$ and ${W}^{{\prime} }$ determine the intensity peak and FWHM, respectively, of a radial Gaussian function. These two specifications converge only in the limit of large d and small W. In addition, the total flux of the DPI model implementation is fixed to be 1 Jy because DPI fits only to closure quantities and closure amplitudes are not sensitive to the absolute flux scale.
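As a rough illustration of Equation (42), the following Python sketch renders the modified mG-ring on a 32 × 32 pixel grid spanning 160 μas. It is not the DPI implementation: the function name `dpi_mring_image`, the choice β₀ = 1 with β₋ₖ = β*ₖ, and the form of the central Gaussian (a circular Gaussian of FWHM `w_gauss` carrying the remaining flux) are assumptions made here for illustration, and the parameter values are arbitrary.

```python
import numpy as np

def dpi_mring_image(d_prime, w_prime, betas, f_ring, w_gauss,
                    npix=32, fov=160.0):
    """Render a modified mG-ring on an npix x npix grid (angles in microarcseconds).

    betas holds beta_1..beta_m; beta_0 = 1 and beta_{-k} = conj(beta_k) are
    assumed so that the image is real. All names here are hypothetical.
    """
    pix = fov / npix
    # Pixel-center coordinates, symmetric about the image center.
    x = (np.arange(npix) - (npix - 1) / 2.0) * pix
    xx, yy = np.meshgrid(x, x)
    r = np.hypot(xx, yy)
    phi = np.arctan2(yy, xx)

    # Radial Gaussian profile peaking at r = d'/2 with FWHM W'.
    def radial(rr):
        return np.exp(-4.0 * np.log(2.0) * (rr - d_prime / 2.0) ** 2 / w_prime ** 2)

    # Azimuthal modulation sum_{k=-m}^{m} beta_k exp(i k phi), written in real form.
    azimuth = np.ones_like(phi)
    for k, beta in enumerate(betas, start=1):
        azimuth += 2.0 * np.real(beta * np.exp(1j * k * phi))

    # Normalization 2*pi * int_0^inf r * radial(r) dr, via a fine Riemann sum.
    r_fine = np.linspace(0.0, 10.0 * fov, 200001)
    dr = r_fine[1] - r_fine[0]
    norm = 2.0 * np.pi * np.sum(r_fine * radial(r_fine)) * dr

    ring = f_ring * radial(r) * azimuth / norm

    # Assumed central circular Gaussian with flux F_Gauss = 1 - F_ring.
    sigma = w_gauss / np.sqrt(8.0 * np.log(2.0))
    gauss = (1.0 - f_ring) * np.exp(-0.5 * (r / sigma) ** 2) / (2.0 * np.pi * sigma ** 2)

    # Convert surface brightness to flux per pixel.
    return (ring + gauss) * pix ** 2

# Arbitrary illustrative parameters (not fitted values).
image = dpi_mring_image(d_prime=50.0, w_prime=15.0,
                        betas=[0.2 * np.exp(1j * 0.5)],
                        f_ring=0.8, w_gauss=40.0)
```

With the total flux fixed to unity, the pixel values of `image` should sum to approximately 1, which gives a quick check on the normalization.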

6.4.4. Sampling the Hypermodel Posterior

To sample from the hypermodel posterior $P(\displaystyle \bar{{\boldsymbol{\theta }}}| {\boldsymbol{ \mathcal D }})$ (Equation (37)), we use the adaptive Metropolis sampler from Vihola (2010) via its implementation in the Julia package RobustAdaptiveMetropolisSampler.jl.¹⁵⁹ The sampler is initialized by first running an adaptive genetic algorithm from the Julia package BlackBoxOptim.jl,¹⁶⁰ which provides a starting point near the maximum posterior density. We run the sampler for a minimum of 2 million Markov Chain Monte Carlo (MCMC) steps or until we have effective sample sizes of 500 for all parameters.

6.5. Model Selection

The mG-ring model described in Section 4.3 is not actually a single model but rather a class of models, delineated by the order m (Equation (10)). To determine the m-order that is preferred by the Sgr A* data, we carry out a series of snapshot mG-ring fits to the Sgr A* data using different values of m and compare Bayesian evidence estimates. Given a set of log-evidences $\mathrm{ln}{{ \mathcal Z }}_{s}$ computed for every snapshot s in a single observation, the total evidence for the entire observation is simply given by their sum,

$\begin{eqnarray}&&\mathrm{ln}{ \mathcal Z }=\sum _{s}\mathrm{ln}{{ \mathcal Z }}_{s}.\end{eqnarray} \tag{ 43 }$

The Comrade snapshot fitting analyses directly estimate the Bayesian evidence on every snapshot, and so the total evidence across an entire observation can be computed directly using Equation (43). The results of a Comrade m-order survey covering m = {1, 2, 3, 4, 5} are shown in Figure 14. We find that the m = 4 order is preferred in both bands and across both calibration pipelines. We thus use the m = 4 mG-ring as our fiducial model for all Comrade Sgr A* analyses in this paper.
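The bookkeeping behind Equation (43) and the relative evidences of an m-order survey can be sketched in a few lines of Python. The per-snapshot values below are invented purely for illustration and are not results from this paper; the function names are likewise hypothetical.

```python
import numpy as np

def total_log_evidence(snapshot_log_evidences):
    """Equation (43): sum the per-snapshot log-evidences of one observation."""
    return float(np.sum(snapshot_log_evidences))

def relative_log_evidence(log_z_by_order):
    """Reference each total log-evidence to the maximum over all m-orders."""
    best = max(log_z_by_order.values())
    return {m: log_z - best for m, log_z in log_z_by_order.items()}

# Made-up per-snapshot log-evidences, keyed by m-order (illustration only).
per_snapshot = {
    1: [-12.0, -15.5, -11.0],
    2: [-10.0, -13.0, -10.5],
    3: [-9.5, -12.0, -10.0],
}
totals = {m: total_log_evidence(v) for m, v in per_snapshot.items()}
relative = relative_log_evidence(totals)
preferred = max(totals, key=totals.get)  # m-order with the largest total evidence
```

Referencing every value to the per-data-set maximum means the preferred order always sits at zero and all others are negative.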

**Figure 14.** Relative Bayesian evidence and best-fit mG-ring model diameter vs. m-order, from the snapshot+averaging geometric modeling from the `Comrade` pipeline applied to the Sgr A* data. Because the absolute values of $\mathrm{ln}{ \mathcal Z }$ can be substantially different for each data set (i.e., each combination of observing day, frequency band, and calibration pipeline), and because only the relative values carry information about model specification preferences, we reference all $\mathrm{ln}{ \mathcal Z }$ values to the maximum value attained at any m for that data set.

Unlike Comrade, DPI does not directly estimate the Bayesian evidence during each fit. Instead, we use the evidence lower bound (ELBO) to determine the m-order preference. The ELBO equals the true log-evidence minus a relative entropy term that encodes the performance of the variational approximation,

$\begin{eqnarray}&&\mathrm{ELBO}(m)=\mathrm{log}p({{\boldsymbol{D}}}_{s}| m)-{D}_{{KL}}[q({{\boldsymbol{\theta }}}_{s})\parallel p({{\boldsymbol{\theta }}}_{s}| {{\boldsymbol{D}}}_{s},m)],\end{eqnarray} \tag{ 44 }$

where D_KL[A∥B] is the Kullback–Leibler divergence of A from B, and q( θ _s) is the optimized DPI normalizing flow distribution. The relative entropy term is zero when the DPI distribution q( θ _s) and the true posterior p( θ _s∣ D _s) are identical, so the ELBO provides a rough estimate of the log-evidence. The results of a DPI m-order survey covering m = {1, 2, 3, 4} indicate that either m = 1 or m = 2 is preferred, depending on the day and band. We choose to err on the side of increased model flexibility and use the m = 2 mG-ring as our fiducial model for all DPI Sgr A* analyses in this paper.
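The relation in Equation (44) can be checked numerically on a toy problem where the evidence is known in closed form. The sketch below assumes a one-parameter conjugate Gaussian model (θ ~ N(0, 1), D | θ ~ N(θ, 1)), which is unrelated to the mG-ring model and to the DPI code; it estimates the ELBO by Monte Carlo as the average of log p(D, θ) − log q(θ) over samples from q.

```python
import numpy as np

def log_normal_pdf(x, mean, var):
    """Log-density of a univariate normal distribution."""
    return -0.5 * (np.log(2.0 * np.pi * var) + (x - mean) ** 2 / var)

def elbo_estimate(data, q_mean, q_var, n_samples=20000, seed=0):
    """Monte Carlo ELBO = E_q[log p(D, theta) - log q(theta)] for the toy model.

    Toy model (an assumption for illustration): theta ~ N(0, 1), D | theta ~ N(theta, 1).
    The variational distribution q is Gaussian with the given mean and variance.
    """
    rng = np.random.default_rng(seed)
    theta = rng.normal(q_mean, np.sqrt(q_var), size=n_samples)
    log_joint = log_normal_pdf(theta, 0.0, 1.0) + log_normal_pdf(data, theta, 1.0)
    log_q = log_normal_pdf(theta, q_mean, q_var)
    return float(np.mean(log_joint - log_q))

data = 1.3
# Exact log-evidence: marginally, D ~ N(0, 2).
log_evidence = float(log_normal_pdf(data, 0.0, 2.0))
# q equal to the true posterior N(D/2, 1/2): the KL term vanishes.
elbo_exact_q = elbo_estimate(data, q_mean=data / 2.0, q_var=0.5)
# A deliberately mismatched q: the ELBO falls below the log-evidence.
elbo_poor_q = elbo_estimate(data, q_mean=0.0, q_var=2.0)
```

When q matches the posterior, the ELBO reproduces the log-evidence; a poorer variational approximation lowers it, which is why ELBO comparisons across m-orders are only a rough proxy for evidence comparisons.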

Figure 15 shows representative mG-ring fits to the Sgr A* HOPS low-band data for both the Comrade and DPI pipelines. In all cases, we find that the normalized residuals are distributed around a value of zero with a subunity variance, and there is no evidence of systematic structure. The χ² statistics for each of these fits are provided in Appendix E.

7. Full-track Geometric Modeling

The snapshot modeling analysis presented in the previous section addresses the variability of the Sgr A* data by explicitly permitting the source structure to vary in time. As described in Section 3.4, an alternative approach to fitting variable data is to statistically capture the impact of variability, treating it as an additional source of uncertainty modifying data that otherwise describe a static (or average) source structure. We pursue such an approach here in the form of "full-track" geometric modeling, whereby we fit the mG-ring model (see Section 4.3) to an entire data set at once and account for the variability by simultaneously fitting a parameterized noise model. In this section we detail our formalism for fitting the mG-ring geometric model alongside a model that captures the noise budget inflation associated with source variability.

7.1. Data Preparation

The data preparation for the full-track geometric modeling analyses is similar to that used for snapshot geometric modeling analyses (see Section 6.1). The data are first processed through the pre-imaging pipeline described in Paper III, which applies light-curve normalization and performs some a priori gain calibration. However, unlike in Paper III and Section 6.1, we do not modify the data uncertainties at all beyond their thermal noise values; neither a systematic error term nor a refractive scattering term is added to the error budget. Additionally, no "deblurring" is applied to the data; instead, the blurring is applied directly to the model as described in the next section.

Following the application of the pre-imaging pipeline, we coherently average the visibilities from each baseline on a per-scan basis. A scan length (∼10 minutes) is approximately the amount of time over which we expect structural variability to be subdominant to other sources of uncertainty (see Section 3.1, in particular Figure 1). Furthermore, the station gains are expected to be constant in time across a single scan but not from one scan to the next (M87* Paper III; Paper II), meaning that a scan length is also the longest coherent integration time that the a priori calibration can support.
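A minimal sketch of per-scan coherent averaging is given below. The inverse-variance weighting and the function and argument names are assumptions made here for illustration; the weighting actually used in the pre-imaging pipeline is not specified in this section.

```python
import numpy as np

def scan_average(vis, sigma, scan_id, baseline_id):
    """Coherently average complex visibilities per (scan, baseline) group.

    Uses inverse-variance weights (an assumption). Returns a dict mapping
    (scan, baseline) -> (averaged visibility, thermal noise of the average).
    """
    vis = np.asarray(vis, dtype=complex)
    sigma = np.asarray(sigma, dtype=float)
    labels = list(zip(scan_id, baseline_id))
    out = {}
    for key in sorted(set(labels)):
        mask = np.array([label == key for label in labels])
        w = 1.0 / sigma[mask] ** 2
        v_avg = np.sum(w * vis[mask]) / np.sum(w)   # vector (coherent) average
        s_avg = 1.0 / np.sqrt(np.sum(w))            # noise of the weighted mean
        out[key] = (v_avg, s_avg)
    return out

# Three made-up visibilities on one baseline: two in scan 0, one in scan 1.
averaged = scan_average(
    vis=[1.0 + 1.0j, 1.0 - 1.0j, 2.0 + 0.0j],
    sigma=[0.1, 0.1, 0.2],
    scan_id=[0, 0, 1],
    baseline_id=["ALMA-LMT", "ALMA-LMT", "ALMA-LMT"],
)
```

Because the average is taken over complex values rather than amplitudes, phase fluctuations within a scan reduce the averaged amplitude, which is why the averaging interval must not exceed the time over which gains and source structure are stable.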

While the full-track modeling is necessarily focused on reconstructing a time-averaged image structure, the underlying data remain a collection of complex visibilities that sample different instantaneous realizations of the intrinsic Sgr A* source structure. As a consequence, the Sgr A* data exhibit subhour correlations that over a single day are localized in the (u,v)-plane. As previously noted in Section 3.4 and detailed in Appendix B, these unmodeled correlations can result in significant biases in the reconstructed properties of Sgr A*. However, by fitting to multiple days of Sgr A* data, and thus combining multiple samplings of the variable source structure at each location in the (u,v)-plane, we better match the statistical properties of the data to those assumed by the full-track analysis. An additional benefit of combining days is that the multiday analyses more clearly distinguish the static signatures of gravitational lensing from the spurious astrophysical variability. For these reasons, all full-track analyses presented in Section 8 make use of the combined April 6 and April 7 Sgr A* data. For comparison we provide single-day analysis results in Appendix C.

7.2. Model Specification and Implementation

The goal of the full-track geometric modeling procedure is to determine the posterior distribution for the parameters of the static mG-ring model and parameterized noise model that best describe an entire Sgr A* data set. Our specification for the mG-ring model is described in Section 4.3, and we retain the same notation and terminology in this section. Additionally, we incorporate the blurring effects of scattering in the same manner described in Section 4 of Paper III, through multiplication of the mG-ring visibilities by the Fourier transform of the scattering kernel. We note, however, that all images shown in the figures in this paper correspond to the underlying (i.e., nonscattered) image.

7.2.1. Parameterized Noise Model

Our parameterized noise model for a complex visibility V_i measured on a baseline u _i is given by

$\begin{eqnarray}&&{\sigma }_{i}^{2}={\sigma }_{\mathrm{th},i}^{2}+{f}^{2}| V{| }_{i}^{2}+{\sigma }_{\mathrm{ref}}^{2}+{\sigma }_{\mathrm{var}}^{2}(| {{\boldsymbol{u}}}_{i}| ).\end{eqnarray} \tag{ 45 }$

Here the first term is the thermal noise in the measurement (see Equation (2)), the second term is a component that is multiplicative in the visibility amplitude ∣V∣_i and is intended to capture residual (nongain) calibration errors (e.g., residual polarization leakage), the third term is a component that is additive and is intended to account for refractive scattering noise, and the fourth term is a component that is a function of the baseline length ∣ u _i∣ and is intended to capture the effects of source variability. With the exception of the variability term, Equation (45) is similar to the noise budget used in the snapshot modeling (see Equation (30)); the only difference is that now f and σ_ref enter into the model as free parameters.

The variability noise σ_var is described in Section 3 (see Equation (8)) and consists of a broken power law in ∣ u ∣ specified by four parameters: an overall amplitude a₄ specified at a baseline length of 4 Gλ, a falling long-baseline power-law index b, a rising short-baseline power-law index c, and a baseline length u₀ at which the power-law breaks. Informative prior bounds for each of these parameters are determined from the model-agnostic variability quantification described in Section 3.3, and these bounds are listed in Table 2.
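The noise budget of Equation (45) can be sketched as follows. Equation (8) is not reproduced in this section, so the broken power-law form used here is an assumption: it is one parameterization consistent with the description above (rising with index c below the break, falling with index b above it, with amplitude a₄ at 4 Gλ), applied to the variance. The function names are hypothetical.

```python
import numpy as np

def variability_variance(u, a4, b, c, u0):
    """Assumed broken power law in |u| (G-lambda), normalized so sigma_var(4) = a4.

    Rises as |u|^c well below the break u0 and falls as |u|^-b well above it.
    """
    u = np.asarray(u, dtype=float)
    shape = (u / 4.0) ** c / (1.0 + (u / u0) ** (b + c))
    norm = 1.0 + (4.0 / u0) ** (b + c)
    return a4 ** 2 * shape * norm

def total_variance(sigma_th, vis_amp, u, f, sigma_ref, a4, b, c, u0):
    """Equation (45): thermal + fractional + refractive + variability terms."""
    return (np.asarray(sigma_th, dtype=float) ** 2
            + (f * np.asarray(vis_amp, dtype=float)) ** 2
            + sigma_ref ** 2
            + variability_variance(u, a4, b, c, u0))

# Arbitrary illustrative values (not fitted results).
var_at_4 = variability_variance(4.0, a4=0.02, b=2.5, c=2.0, u0=1.0)
var_total = total_variance(sigma_th=0.01, vis_amp=0.5, u=4.0,
                           f=0.01, sigma_ref=0.004,
                           a4=0.02, b=2.5, c=2.0, u0=1.0)
```

Normalizing at 4 Gλ rather than at the break keeps the amplitude parameter tied to a baseline length where the data constrain the variability well.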

Table 2. Full-track Modeling mG-ring Priors

Parameter	Prior
f_ring	${ \mathcal U }(0.05,4.0)$
d (μas)	${ \mathcal U }(20,85)$
W (μas)	${ \mathcal U }(1,40)$
∣β_m∣	${ \mathcal U }(0.0,0.5)$
$\arg ({\beta }_{m})$ (deg)	${ \mathcal U }(-180,180)$
f_Gauss	${ \mathcal U }(0.05,4.0)$
W_Gauss (μas)	${ \mathcal U }(20,200)$

σ_ref	${{ \mathcal N }}_{L}(\mathrm{log}(0.004),1)$
f	${{ \mathcal N }}_{L}(\mathrm{log}(0.01),1)$
a₄	${{ \mathcal U }}_{L}({10}^{-4.39784},{10}^{-4.27339})$
b	${ \mathcal U }(2.35213,3.37849)$
c	${ \mathcal U }(1.5,2.5)$
u₀ (Gλ)	${{ \mathcal U }}_{L}({10}^{-1.09771},{10}^{0.236534})$

Note. Prior distributions for Themis full-track geometric modeling analyses. The top section lists priors for the mG-ring model parameters, and the bottom section lists priors for the parameterized noise model. ${ \mathcal U }(a,b)$ denotes a uniform prior on the interval [a, b], ${{ \mathcal U }}_{L}(a,b)$ denotes a log-uniform prior on the interval [a, b], and ${{ \mathcal N }}_{L}(\mu ,{\sigma }^{2})$ denotes a lognormal prior with mean μ and variance σ². Priors for the variability noise parameters a₄, b, c, and u₀ are informed by the model-agnostic variability quantification analysis described in Section 3.3.

7.2.2. Themis Implementation

We have implemented the combined mG-ring plus noise parameterization as a model within the sampling-based parameter estimation framework Themis developed for the EHT (Broderick et al. 2020a, 2020b). Given a model specification and a data set, Themis works within a Bayesian formalism to produce a set of samples from the posterior distribution of the model parameters. Themis uses an MCMC sampling scheme to explore the posterior space, employing a parallel tempering scheme (Syed et al. 2022) to ensure traversal over the entire prior volume and the Hamiltonian Monte Carlo sampling kernel from the Stan package (Carpenter et al. 2017) to efficiently sample within each tempering level. A detailed description of the Themis sampling framework can be found in Tiede (2021).

The full-track geometric modeling analyses carried out in this paper fit to complex visibility data. Given a vector of geometric model parameters p and a vector of noise model parameters n , the Themis likelihood function for complex visibilities is Gaussian,

$\begin{eqnarray}&&\mathrm{ln}{ \mathcal L }=-\displaystyle \frac{1}{2}\sum _{i}\left(\displaystyle \frac{| {V}_{i}-\displaystyle \hat{V}({{\boldsymbol{u}}}_{i};{\boldsymbol{p}}){| }^{2}}{{\sigma }_{i}^{2}({\boldsymbol{n}})}+\mathrm{ln}\left[2\pi {\sigma }_{i}^{2}({\boldsymbol{n}})\right]\right).\end{eqnarray} \tag{ 46 }$

Here V_i is a measured visibility, u_i is its baseline vector, $\displaystyle \hat{V}$ is the corresponding modeled visibility, and the sum is taken over all data points i. The noise σ_i(n) in each visibility is specified as in Equation (45), with the noise model parameters n = {f, σ_ref, a₄, b, c, u₀}. Themis internally solves for and marginalizes over the full set of complex gain parameters (i.e., one complex gain per station per time stamp) at every sampling step using a Laplace approximation (see Broderick et al. 2020a). It also applies the Johnson et al. (2018) diffractive scattering kernel directly to the model prior to computing visibilities. A validation test of the Themis mG-ring plus noise model implementation is described along with other tests in a dedicated paper on the noise modeling approach (Broderick et al. 2022).
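Equation (46) translates directly into code. The sketch below evaluates it for given observed visibilities, model visibilities, and per-point variances; it omits the gain marginalization and scattering kernel that Themis applies, and the function name is hypothetical.

```python
import numpy as np

def log_likelihood(vis_obs, vis_model, variance):
    """Gaussian log-likelihood of Equation (46) for complex visibilities.

    variance is sigma_i^2 from the noise model, one value per data point.
    Station gains and scattering are not handled in this sketch.
    """
    vis_obs = np.asarray(vis_obs, dtype=complex)
    vis_model = np.asarray(vis_model, dtype=complex)
    variance = np.asarray(variance, dtype=float)
    resid2 = np.abs(vis_obs - vis_model) ** 2
    return float(-0.5 * np.sum(resid2 / variance + np.log(2.0 * np.pi * variance)))

# A perfect fit with unit variances leaves only the normalization term.
lnl_perfect = log_likelihood([1 + 1j, 2 - 1j, 0.5j], [1 + 1j, 2 - 1j, 0.5j], [1.0, 1.0, 1.0])
# A single point with |residual|^2 = 2 and variance 2.
lnl_single = log_likelihood([1 + 1j], [0.0], [2.0])
```

Because the variances depend on the noise parameters, the logarithmic normalization term cannot be dropped: it is what penalizes inflating the noise budget to absorb residuals.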

We assess MCMC convergence through both visual inspection of the traces and a number of quantitative chain statistics, including the integrated autocorrelation time, split- $\displaystyle \hat{R}$ , and parameter rank distributions (Vehtari et al. 2021). The number of tempering levels is selected to ensure efficient communication between the highest and lowest levels (per Syed et al. 2022), which typically requires about 20 levels. We run the sampler for between 5 × 10⁴ and 10⁵ steps per tempering level.
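For reference, a basic split-$\hat{R}$ statistic for a single parameter can be computed as sketched below. This is the classical variance-ratio form; the rank-normalized variant of Vehtari et al. (2021) adds a rank transformation that is not included here, and the sketch is not taken from the Themis code.

```python
import numpy as np

def split_rhat(chains):
    """Basic split-R-hat for one parameter; chains has shape (n_chains, n_steps)."""
    chains = np.asarray(chains, dtype=float)
    half = chains.shape[1] // 2
    # Split every chain into first and second halves, doubling the chain count.
    split = np.concatenate([chains[:, :half], chains[:, half:2 * half]], axis=0)
    n = split.shape[1]
    chain_means = split.mean(axis=1)
    chain_vars = split.var(axis=1, ddof=1)
    within = chain_vars.mean()                 # W: mean within-chain variance
    between = n * chain_means.var(ddof=1)      # B: between-chain variance
    var_plus = (n - 1) / n * within + between / n
    return float(np.sqrt(var_plus / within))

rng = np.random.default_rng(1)
mixed = rng.normal(size=(4, 1000))                   # chains sampling one distribution
offset = mixed + 2.0 * np.arange(4)[:, None]         # chains stuck at different means
rhat_mixed = split_rhat(mixed)
rhat_offset = split_rhat(offset)
```

Values close to 1 indicate that the chains agree with one another and with their own halves, while chains that have not mixed give values well above 1.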

To compute the Bayesian evidence, Themis uses thermodynamic integration (e.g., Lartillot & Philippe 2006), which computes the log-evidence through

$\begin{eqnarray}&&\mathrm{ln}{ \mathcal Z }={\int }_{0}^{1}{\rm{d}}\beta {\left\langle \mathrm{ln}{ \mathcal L }\right\rangle }_{\beta },\end{eqnarray} \tag{ 47 }$

where ${\left\langle \mathrm{ln}{ \mathcal L }\right\rangle }_{\beta }$ is the expectation of the log-likelihood taken over the distribution

$\begin{eqnarray}&&{P}^{(\beta )}({\boldsymbol{p}},{\boldsymbol{n}})={\left(\displaystyle \frac{P({\boldsymbol{p}},{\boldsymbol{n}})}{{\pi }_{\mathrm{ref}}({\boldsymbol{p}},{\boldsymbol{n}})}\right)}^{\beta }{\pi }_{\mathrm{ref}}({\boldsymbol{p}},{\boldsymbol{n}}).\end{eqnarray} \tag{ 48 }$

Note that Themis does not take π_ref to be the prior distribution. Instead, Themis uses a uniform distribution whose support matches the support of the priors given in Table 2. To compute Equation (47), we compute the average log-likelihood for each tempering level and then use trapezoidal integration to numerically compute the integral.
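The trapezoidal evaluation of Equation (47) can be sketched as below. For the check, the per-level expectation values are taken from a toy conjugate Gaussian model (θ ~ N(0, 1), D | θ ~ N(θ, 1)) in which the reference distribution is the prior, a simplification relative to the uniform reference that Themis uses; in practice the expectation values would come from the tempered chains.

```python
import numpy as np

def thermodynamic_log_evidence(betas, mean_log_like):
    """Trapezoidal estimate of ln Z = int_0^1 <ln L>_beta d(beta)."""
    betas = np.asarray(betas, dtype=float)
    mean_log_like = np.asarray(mean_log_like, dtype=float)
    order = np.argsort(betas)
    b, y = betas[order], mean_log_like[order]
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(b)))

# Toy model with analytic <ln L>_beta: the tempered posterior is Gaussian with
# mean beta*D/(1+beta) and variance 1/(1+beta).
data = 1.0
betas = np.linspace(0.0, 1.0, 20)   # 20 tempering levels, uniformly spaced here
mean_log_like = -0.5 * (np.log(2.0 * np.pi)
                        + data ** 2 / (1.0 + betas) ** 2
                        + 1.0 / (1.0 + betas))
log_z = thermodynamic_log_evidence(betas, mean_log_like)
log_z_exact = -0.5 * np.log(4.0 * np.pi) - data ** 2 / 4.0   # D ~ N(0, 2)
```

The accuracy of the estimate depends on how well the tempering ladder resolves the integrand, which typically changes fastest near β = 0.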

Priors for all mG-ring model and noise model parameters are listed in Table 2. We impose mean-zero lognormal priors for all station gain amplitudes, with a log standard deviation of 0.01 for all network calibrated stations (ALMA, APEX, JCMT, SMA), 0.2 for the LMT, and 0.1 for the remaining stations (PV, SMT, SPT); these gain priors are motivated by the expected performance of each station after the post-processing described in Section 7.1 (see also Paper II). All station gain phase priors are uniform on the unit circle.

7.3. Model Selection

As with the snapshot geometric analyses (see Section 6.5), the mG-ring model used for the full-track analyses is really a class of models that increases in complexity with m. The results of a Themis m-order survey covering m = {2, 3, 4, 5, 6} are shown in Figure 16. In contrast to the snapshot geometric modeling (see Section 6.5), we find that the full-track analysis is able to support more complex model specifications, exhibiting a strong preference for m > 4 over m = 4. However, we find that increasing the m-order does not significantly impact the values of the primary morphological parameters of choice (see the similar diameter measurements and uncertainties for m ≥ 4 in Figure 16). Thus, to maintain consistency among model specifications and to facilitate comparison with the snapshot geometric modeling analyses, we proceed with the m = 4 mG-ring as our fiducial model for all full-track Sgr A* analyses in this paper.

**Figure 16.** Relative Bayesian evidence and median posterior mG-ring model diameter vs. m-order, from the full-track geometric modeling applied to the Sgr A* data; error bars on the diameter measurements show 1σ credible intervals. Because the absolute values of $\mathrm{ln}{ \mathcal Z }$ can be substantially different for each data set (i.e., each combination of frequency band and calibration pipeline), and because only the relative values carry information about model specification preferences, we reference all $\mathrm{ln}{ \mathcal Z }$ values to the maximum value attained at any m for that data set.

A representative m = 4 mG-ring fit to the Sgr A* HOPS low-band data is shown in Figure 17. We find that the normalized residuals are distributed around a value of zero with near-unity variance and that there is no evidence of systematic structure. The χ² statistics for this fit are discussed in Appendix E.

**Figure 17.** Results of full-track modeling using an m = 4 mG-ring fit to the Sgr A* HOPS low-band data on April 6 and 7, arranged analogously to the individual panels of Figure 9. As in Figure 9, $\displaystyle \hat{V}$ denotes light-curve-normalized complex visibilities.

8. Results

In this section we aggregate and present the results from the analyses described in Sections 3–7 as applied to the 2017 EHT Sgr A* data.

8.1. Structural Variability Measurements

The model-agnostic variability quantification analysis carried out in Section 3 demonstrates that the Sgr A* data exhibit variability—quantified here in terms of a normalized visibility amplitude variance—that is significantly in excess of that expected from thermal noise, station gains, and refractive scattering. As illustrated in Figure 4, the measured variability can be broken down into three regions with qualitatively distinct behavior:

1. On short baselines with lengths ∣u∣ ≲ 2.5 Gλ, corresponding to spatial scales ≳100 μas, limitations in our calibration and the subsequent choices made in our preprocessing procedure preclude meaningful constraints on the variability. The light-curve normalization procedure removes all variability on intrasite baselines and suppresses variability on short intersite baselines that are highly correlated with the light curve; the variability of the light curve itself is thoroughly characterized in Wielgus et al. (2022). The source size constraint used to perform gain calibration of the LMT-SMT baseline (see Paper II; Paper III) imposes a further, more artificial suppression of the variability on this baseline. We thus do not obtain any variability measurements for baselines shorter than 2.5 Gλ.
2. On intermediate baselines with lengths 2.5 Gλ ≲ ∣u∣ ≲ 6 Gλ, corresponding to spatial scales between ∼30 and ∼100 μas, we measure significant variability that exhibits an approximately power-law decline with increasing baseline length. The power-law index is between ∼2 and 3, and the magnitude of the variability ranges from a peak rms of ∼5% of the total flux density (∼120 mJy) near 2.5 Gλ down to ∼1% of the total flux density (∼25 mJy) near 6 Gλ.
3. On long baselines with lengths ∣u∣ ≳ 6 Gλ, corresponding to spatial scales ≲30 μas, the measured variability is comparable in magnitude to that expected from statistical errors and refractive scattering. These measurements thus do not contain statistically significant detections of structural variability.

These measurements describe the level of excess variance that the data exhibit about an underlying average source model. The parameters describing a broken power-law noise model fit to these measurements are thus used as a variability noise budget during image reconstruction (Paper III) and to define priors on the corresponding parameters in the full-track modeling analyses (Section 7).

Determining the intrinsic (i.e., infinite-time) source variability from these measurements requires an additional debiasing step to remove the impact of correlations between data points that are closely spaced in time. The analysis carried out in Section 3 involves binning the visibility data in the (u,v)-plane for the purpose of computing variances. However, many data points within a single bin are from measurements taken close in time, which can introduce correlations that bias the computed variance. A procedure for removing this bias is detailed in Broderick et al. (2022), whereby the factor relating the measured variability to the intrinsic variability at every baseline length is calibrated using synthetic measurements of GRMHD simulations. In practice, Broderick et al. (2022) derive the debiasing function using the same 90 GRMHD simulations we use in this paper for θ_g calibration (see Section 4.4 and Appendix D), and the resulting debiasing factor is close to unity everywhere except between ∼6 and 7.5 Gλ (see Figure 5 in Broderick et al. 2022). Applying this debiasing function to the variability measurements from the Sgr A* data yields the results shown in Figure 18 and reported in Table 3.

**Figure 18.** Similar to Figure 4, but after the direct visibility estimates have been debiased to account for the short-time temporal correlations as described in Broderick et al. (2022). These estimates are directly comparable to the power spectra implied by GRMHD simulations (Georgiev et al. 2022). A single example and the range associated with the library presented in Paper V are shown by the red line and band, respectively.

Table 3. Debiased Variability Parameters

Baseline Length (Gλ)	Normalized Variance	Fit
2.75	(17.2 ± 14.3) × 10⁻⁴	Y
3.25	(4.9 ± 2.3) × 10⁻⁴	Y
3.75	(3.7 ± 1.8) × 10⁻⁴	Y
4.25	(2.6 ± 1.6) × 10⁻⁴	Y
4.75	(2.0 ± 0.9) × 10⁻⁴	Y
5.25	(3.2 ± 1.7) × 10⁻⁴	Y
5.75	(1.5 ± 0.8) × 10⁻⁴	Y
6.25	(0.5 ± 0.4) × 10⁻⁴	N
6.75	(0.5 ± 0.4) × 10⁻⁴	N
7.25	(0.4 ± 0.4) × 10⁻⁴	N
7.75	(0.3 ± 0.2) × 10⁻⁴	N
8.25	(0.3 ± 0.2) × 10⁻⁴	N
8.75	(0.3 ± 0.3) × 10⁻⁴	N

Quantity	Symbol	Estimate^a

Excess noise at ∣ u ∣ = 4 Gλ	a₄	1.90 ± 0.2 %
Long-baseline power-law index	b	2.4 ± 0.8
Break baseline length	u₀	< 1.3 Gλ

Note. Measurements of intrinsic variability from the EHT Sgr A* data. The top portion of the table lists the visibility amplitude variance measurements from Section 8.1 after applying the debiasing procedure described in Broderick et al. (2022); these values correspond to the black points plotted in Figure 18. The bottom portion of the table lists the best-fit broken power-law model parameters for the variability measurements.

^aQuoted are the median values and 1σ ranges. Upper limits are 1σ limits.

Due to the near-unit debiasing factor, the variability measurements shown in Figure 18 are similar to those from Figure 4. Quantitative constraints on the parameters of the noise model that are well constrained are presented in Figure 19 and Table 3. Where strong constraints on the excess noise exist (i.e., on baseline lengths between ∼2 and 6 Gλ), it continues to be well described by a single power law with index $b={2.4}_{-0.8-1.6}^{+0.8+2.1}$ and amplitude (evaluated at a baseline length of 4 Gλ) of ${a}_{4}={1.9}_{-0.2-0.3}^{+0.2+0.5} \%$ , where in each value the 1σ and 2σ ranges are indicated. The measured variability magnitude is between ∼2 and 10 times higher than that expected from refractive scattering alone. The lack of an observable break places an upper limit on its location of u₀ < 1.3 Gλ at 1σ and u₀ < 3.1 Gλ at 2σ.

**Figure 19.** Joint posteriors of the constrained parameters after fitting a broken power law to the debiased model-agnostic normalized variance estimates, specifically the normalization at ∣ u ∣ = 4 Gλ, denoted by a₄, the break location u₀, and the long-baseline power-law index b. Contours enclose 50%, 90%, and 99% probability.

The excess variability is broadly consistent with that due to structural fluctuations anticipated by the GRMHD simulations discussed in Paper V and Georgiev et al. (2022). The magnitude of the excess variability lies within the range of that predicted by GRMHD simulations for ∣ u ∣ > 2 Gλ, though it does appear to marginally favor less variable models. The long-baseline power-law index is consistent with all GRMHD simulations. A detailed discussion of the implications for GRMHD models is contained within Paper V.

8.2. Image Morphology Measurements

Both the snapshot (Section 6) and full-track (Section 7) geometric modeling analyses produce reconstructions of the Sgr A* emission structure, and the posterior distributions determined for the parameters describing these geometric model reconstructions provide a quantification of the morphological properties directly from the EHT interferometric data. Similarly, the IDFE analyses carried out in Section 5 quantify the morphological properties of the top-set and posterior images reconstructed in Paper III.

Figure 20 compares the geometric models and image reconstructions determined for Sgr A*; for the geometric modeling analyses we show posterior mean images (i.e., the mean of many images sampled from the posterior distribution), while for the image reconstructions we show averages over the top sets (for eht-imaging, SMILI, and DIFMAP) or posterior means (for Themis). We see that the snapshot and full-track geometric modeling analyses each recover a grossly similar overall structure across frequency bands and that this structure is also similar between the two analyses. The image reconstructions allow much more flexibility in the image structure, and so we see correspondingly more variation both within the imaging methods and between the imaging and geometric modeling. The primary point of consistency between the ring-like structures recovered from imaging and the rings fit via geometric modeling seems to be their sizes.

We use the geometric modeling and IDFE analyses to quantify a number of morphological parameters of interest, which are shown in Figure 21 and listed in Table 4.

Table 4. Parameters Describing Sgr A* Image Morphology

				CASA					HOPS
Analysis class	Software	Day	Band	$\displaystyle \hat{d}$ (μas)	W (μas)	A	η (deg)	f_c	$\displaystyle \hat{d}$ (μas)	W (μas)	A	η (deg)	f_c

Imaging	`DIFMAP`+ `REx`	April 7	HI+LO	${50.6}_{-3.0}^{+1.1}$	${31.9}_{-3.1}^{+4.2}$	${0.09}_{-0.04}^{+0.07}$	48.4±57.1	${0.43}_{-0.10}^{+0.12}$	${50.1}_{-2.2}^{+1.2}$	${31.8}_{-2.7}^{+3.3}$	${0.05}_{-0.02}^{+0.09}$	−22.8±69.6	${0.42}_{-0.09}^{+0.10}$
	`DIFMAP`+ `VIDA`	April 7	HI+LO	${51.2}_{-2.8}^{+1.2}$	${33.4}_{-1.8}^{+4.1}$	${0.10}_{-0.04}^{+0.11}$	49.8±61.1	${0.40}_{-0.09}^{+0.14}$	${51.3}_{-1.4}^{+1.3}$	${33.1}_{-1.1}^{+2.2}$	${0.07}_{-0.03}^{+0.18}$	−28.1±57.1	${0.39}_{-0.08}^{+0.15}$

	`eht-imaging`+ `REx`	April 7	HI+LO	${53.0}_{-2.5}^{+1.9}$	${25.7}_{-3.2}^{+3.6}$	${0.13}_{-0.06}^{+0.08}$	−33.9±56.3	${0.33}_{-0.18}^{+0.12}$	${53.6}_{-1.1}^{+1.8}$	${27.2}_{-2.6}^{+1.6}$	${0.11}_{-0.03}^{+0.05}$	−81.0±62.6	${0.21}_{-0.10}^{+0.11}$
	`eht-imaging`+ `VIDA`	April 7	HI+LO	${52.7}_{-2.1}^{+5.6}$	${28.0}_{-3.2}^{+16.3}$	${0.19}_{-0.11}^{+0.26}$	−40.5±65.2	${0.30}_{-0.16}^{+0.22}$	${53.5}_{-1.5}^{+3.6}$	${27.4}_{-2.1}^{+2.5}$	${0.11}_{-0.05}^{+0.23}$	−94.4±60.9	${0.19}_{-0.05}^{+0.13}$

	`SMILI`+ `REx`	April 7	HI+LO	${52.0}_{-3.0}^{+5.5}$	${25.1}_{-4.4}^{+1.1}$	${0.16}_{-0.06}^{+0.04}$	130.9±70.3	${0.29}_{-0.20}^{+0.09}$	${51.4}_{-4.8}^{+4.0}$	${26.7}_{-3.0}^{+1.2}$	${0.12}_{-0.03}^{+0.04}$	135.3±90.0	${0.23}_{-0.13}^{+0.08}$
	`SMILI`+ `VIDA`	April 7	HI+LO	${49.3}_{-1.7}^{+13.6}$	${28.5}_{-3.7}^{+34.0}$	${0.15}_{-0.04}^{+0.21}$	124.4±77.0	${0.26}_{-0.07}^{+0.48}$	${50.6}_{-1.1}^{+6.1}$	${27.2}_{-1.3}^{+0.8}$	${0.07}_{-0.01}^{+0.11}$	163.0±90.4	${0.19}_{-0.05}^{+0.09}$

	Themis + `REx`	April 6+7	HI+LO	...	...	...	...	...	${51.7}_{-0.3}^{+0.4}$	${23.6}_{-0.3}^{+0.3}$	${0.07}_{-0.01}^{+0.08}$	−152.7±20.9	${0.02}_{-0.01}^{+0.01}$
	Themis + `VIDA`	April 6+7	HI+LO	...	...	...	...	...	${53.7}_{-1.7}^{+0.4}$	${24.7}_{-0.4}^{+0.4}$	${0.08}_{-0.01}^{+0.05}$	−137.1±15.7	${0.08}_{-0.01}^{+0.01}$

Snapshot	`Comrade`	April 6+7	HI	${53.9}_{-0.5}^{+0.6}$	${16.4}_{-0.2}^{+0.3}$	${0.23}_{-0.02}^{+0.01}$	3.8±5.3	${0.23}_{-0.01}^{+0.01}$	${52.3}_{-0.6}^{+0.7}$	${17.7}_{-0.2}^{+0.2}$	${0.23}_{-0.04}^{+0.03}$	−8.1±6.7	${0.15}_{-0.01}^{+0.01}$
		April 6+7	LO	${51.4}_{-0.6}^{+0.5}$	${17.0}_{-0.3}^{+0.2}$	${0.22}_{-0.03}^{+0.02}$	−11.4±9.0	${0.20}_{-0.01}^{+0.01}$	${53.9}_{-0.7}^{+0.8}$	${17.7}_{-0.2}^{+0.2}$	${0.22}_{-0.02}^{+0.02}$	−0.2±6.6	${0.18}_{-0.01}^{+0.01}$

	`DPI`	April 6+7	HI	${50.9}_{-1.0}^{+1.0}$	${16.1}_{-0.7}^{+0.7}$	${0.21}_{-0.14}^{+0.18}$	−75.6±43.2	${0.18}_{-0.02}^{+0.02}$	${47.7}_{-1.1}^{+1.1}$	${19.1}_{-0.7}^{+0.7}$	${0.31}_{-0.20}^{+0.13}$	−44.9±11.7	${0.12}_{-0.01}^{+0.02}$
		April 6+7	LO	${51.6}_{-0.9}^{+0.9}$	${16.2}_{-0.7}^{+0.7}$	${0.14}_{-0.10}^{+0.07}$	−97.5±9.5	${0.14}_{-0.01}^{+0.02}$	${45.3}_{-1.7}^{+1.6}$	${21.6}_{-0.9}^{+0.9}$	${0.31}_{-0.19}^{+0.13}$	−73.2±8.2	${0.10}_{-0.01}^{+0.01}$

Full-track	Themis	April 6+7	HI	${52.1}_{-0.4}^{+0.4}$	${19.7}_{-0.4}^{+0.4}$	${0.16}_{-0.02}^{+0.02}$	−4.0±11.8	${0.09}_{-0.02}^{+0.02}$	${52.0}_{-0.5}^{+0.4}$	${21.5}_{-0.5}^{+0.5}$	${0.14}_{-0.02}^{+0.02}$	8.9±8.0	${0.05}_{-0.02}^{+0.02}$
		April 6+7	LO	${52.7}_{-0.5}^{+0.5}$	${20.3}_{-0.4}^{+0.4}$	${0.15}_{-0.02}^{+0.02}$	−74.0±10.5	${0.16}_{-0.03}^{+0.03}$	${51.4}_{-0.5}^{+0.4}$	${22.5}_{-0.5}^{+0.4}$	${0.14}_{-0.02}^{+0.02}$	1.6±7.5	${0.03}_{-0.01}^{+0.02}$

Note. Median values and 68% credible intervals for the morphological quantities of interest, measured from the EHT Sgr A* data. Because medians and quantiles are not well defined for angular variables, for the position angle η we instead quote the circular mean and standard deviation.
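
The circular statistics quoted for η can be illustrated with a minimal Python sketch. The function name `circular_mean_std` and the sample values are hypothetical, and the circular standard deviation is taken here to be $\sqrt{-2\ln R}$ (with R the mean resultant length), which is a common convention but is only assumed to match the one used for the table.

```python
import math

def circular_mean_std(angles_deg):
    """Return the circular mean and circular standard deviation, in degrees."""
    n = len(angles_deg)
    # Map each angle to a unit vector and average the components.
    c = sum(math.cos(math.radians(a)) for a in angles_deg) / n
    s = sum(math.sin(math.radians(a)) for a in angles_deg) / n
    mean = math.degrees(math.atan2(s, c))   # wrapped to (-180, 180]
    R = min(1.0, math.hypot(c, s))          # mean resultant length, clipped to 1
    if R == 0.0:
        return mean, math.inf               # directions cancel exactly
    std = math.degrees(math.sqrt(-2.0 * math.log(R)))
    return mean, std

# Hypothetical samples that straddle the +/-180 degree wrap.
samples = [170.0, -175.0, 178.0, -172.0]
mean, std = circular_mean_std(samples)
```

An arithmetic mean of these samples would land near 0°, on the opposite side of the circle from every sample, which is why the linear median and quantiles are avoided for η.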

8.2.1. Ring Size

The parameter of most interest for gravitational studies (e.g., M87* Paper VI; Paper VI) is the ring size, which we quantify using its diameter. For the mG-ring modeling results from both snapshot and full-track analyses, we report a debiased diameter $\hat{d}$, given by

$\hat{d}=d-\frac{1}{4\ln(2)}\frac{W^{2}}{d}, \tag{49}$

where d and W are the mG-ring diameter and thickness, respectively (see Section 4.3). This debiasing corrects for the lowest-order impact of the Gaussian blurring kernel on the radial location of the peak intensity, which is shifted inward with respect to the radius of the pre-convolved ring (see Appendix G of M87* Paper IV). It thus permits a more direct comparison of the geometric modeling diameter values with those obtained from IDFE.
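
As a quick numerical illustration of Equation (49), the following Python sketch evaluates the debiased diameter. The function name `debiased_diameter` and the input values are illustrative assumptions, not numbers taken from the tables.

```python
import math

def debiased_diameter(d, W):
    """Equation (49): d_hat = d - W**2 / (4 ln(2) d), with d and W in the same units."""
    return d - W**2 / (4.0 * math.log(2.0) * d)

# Illustrative inputs in microarcseconds (hypothetical, not fitted values).
d_hat = debiased_diameter(54.0, 18.0)
```

Because the correction scales as $W^{2}/d$, it is only a few percent of the diameter for a ring whose thickness is about one-third of its diameter, and it vanishes for an infinitesimally thin ring.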

The diameter measurements from the geometric modeling and IDFE analyses are compared in the top row of Figure 21 across frequency bands and calibration pipelines. We find that the diameter is the best constrained of the geometric parameters we attempt to quantify, with both a typical measurement uncertainty and a scatter between measurement types that are substantially smaller than the value itself. An average of the geometric modeling results across both frequency bands and calibration pipelines yields a debiased diameter of 51.9 μas, with a corresponding symmetrized uncertainty of 2.0 μas. The quoted uncertainty is the 68% (i.e., approximately 1σ) credible interval, computed with the samples from each measurement weighted equally. The IDFE measurements are broadly consistent with the results from geometric modeling, yielding $\hat{d}=51.8\pm 2.6$ μas. The corresponding joint constraint from both geometric modeling and IDFE analyses yields $\hat{d}=51.8\pm 2.3$ μas.
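
One way to form such an equally weighted average from posterior samples is sketched below in Python. This is a minimal sketch under stated assumptions: the function name `pooled_summary`, the resampling scheme, the synthetic chains, and the definition of the symmetrized uncertainty as half the width of the 16th to 84th percentile interval are all illustrative and are not confirmed to be the procedure used for the quoted values.

```python
import random
import statistics

def pooled_summary(chains, n_per_chain=2000, seed=0):
    """Median and symmetrized 68% half-width of equally weighted sample sets."""
    rng = random.Random(seed)
    pooled = []
    for chain in chains:
        # Resample the same number of draws from every measurement so that
        # each one carries equal weight regardless of its original length.
        pooled.extend(rng.choices(chain, k=n_per_chain))
    cuts = statistics.quantiles(pooled, n=100, method="inclusive")
    lo, med, hi = cuts[15], cuts[49], cuts[83]   # 16th, 50th, 84th percentiles
    return med, 0.5 * (hi - lo)

# Hypothetical posterior samples (microarcseconds) for two measurements
# with very different chain lengths.
rng = random.Random(1)
chain_a = [rng.gauss(51.0, 1.0) for _ in range(500)]
chain_b = [rng.gauss(53.0, 1.0) for _ in range(5000)]
median, half_width = pooled_summary([chain_a, chain_b])
```

In this sketch the pooled interval is wider than that of either chain alone, because it absorbs the offset between the two measurements as well as their individual widths.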

We note that, even after debiasing, the diameter measurements from different analysis pathways remain interpretationally distinct quantities. To ensure mutual consistency between different measurement methods, in Section 8.3 we calibrate the diameter measurements to a common physical scale using the GRMHD synthetic data sets generated for this purpose (see Section 4.4).

8.2.2. Ring Thickness

The thickness of the ring is of interest for its ability to constrain the location and size of the emitting region near the black hole (e.g., Lockhart & Gralla 2022). The ring thickness measurements from the geometric modeling and IDFE analyses are compared in the second row of Figure 21 across frequency bands and calibration pipelines. We find that the geometric modeling methods recover similar ring thicknesses, with the snapshot analyses obtaining ∼16–22 μas and the full-track analyses obtaining ∼19–23 μas. The IDFE analyses obtain consistently thicker rings, with ∼30 μas being a more typical value and a somewhat larger scatter (from ∼25 to 35 μas) seen both across and within pipelines. The increased ring thickness recovered from the IDFE analyses likely arises in part from image smoothing introduced by some reconstruction algorithms (e.g., the CLEAN algorithm used by DIFMAP).

The thickness of the M87* ring was a parameter that the geometric modeling analyses carried out in M87* Paper VI had difficulty constraining, and it showed substantial variation across days and between modeling methods; only a relatively weak upper limit for the fractional thickness of $W/\hat{d}\lesssim 0.5$ was obtained. In contrast, we find for Sgr A* that the thickness parameter is relatively well constrained by geometric modeling approaches. The fractional thickness is $W/\hat{d}=0.35\pm 0.05$, where the uncertainties quoted are symmetrized 1σ. IDFE analyses obtain systematically larger fractional thicknesses, finding $W/\hat{d}=0.53\pm 0.10$.

Unlike with the diameter measurements, we do not debias the ring thicknesses obtained from different analysis pathways or attempt to calibrate them to a common scale. The ring thicknesses from geometric modeling and IDFE thus represent two interpretationally distinct quantities, and we do not produce an analysis-agnostic measurement of the ring thickness.

8.2.3. Position Angle and Asymmetry

The magnitude and orientation of any asymmetry in the azimuthal brightness distribution around the ring is of interest because it can be related to the spin and inclination of the black hole (e.g., M87* Paper V; Paper V). As described in Section 4.3, the mG-ring position angle (η) and degree of azimuthal asymmetry (A) are both determined by the coefficient of the m = 1 mode as specified in Equations (23) and (24), respectively. These definitions match closely the corresponding IDFE quantities defined in Section 5.2. The asymmetry and position angle measurements we obtain for Sgr A* are shown in the third and fourth rows of Figure 21, respectively.

Unlike for M87*, where the image morphology exhibits a clearly defined asymmetry axis whose magnitude and orientation can be consistently quantified using either geometric modeling or IDFE analyses (M87* Paper IV; M87* Paper VI), the image structure for Sgr A* is less amenable to such a description. The asymmetry magnitude measurements show a large scatter across methods, spanning ∼0.15–0.3 for the geometric modeling methods and ∼0.04–0.20 for the IDFE methods. The IDFE methods recover systematically smaller median levels of asymmetry than the geometric modeling methods, but the uncertainties are large; several of the measurement methods have statistical uncertainties that cover nearly the entire 0–0.5 range of the prior distribution for A.

The position angle measurements show similarly little consistency between different analysis methods, spanning essentially the full (−180°, 180°) range when compared across all data sets and measurement techniques. The geometric modeling analyses find position angles that are loosely confined to a region between ∼−100° and 0°, but the IDFE analyses show a large (>100°) scatter between methods and a similar magnitude of uncertainty for individual measurements.

8.2.4. Brightness Depression

The depth of the brightness depression interior to the ring is a key signature of the presence of a black hole. Additionally, it can be used to constrain the presence of an emitting or reflecting surface, as a potential alternative to a horizon (e.g., Broderick et al. 2015; Paper VI). For the mG-ring model, the fractional central flux f_c is given by Equation (21). For the IDFE analyses, we retain the definitions from Section 5.2 for f_c.

The f_c measurements for Sgr A* are shown in the bottom row of Figure 21. We find a large spread in values across analysis methods, ranging from ∼0.1 to 0.25 for the geometric modeling analyses and from ∼0.0 to 0.5 for the IDFE analyses. Compared against the constraints from geometric modeling of the M87* ring structure, which consistently found f_c ≲ 0.1 (M87* Paper VI), the results obtained here for Sgr A* allow for the possibility of substantially more emission interior to the ring.

8.3. Gravitational Radius and Mass

The ring size measurements presented in Section 8.2.1 have been made using a variety of different analysis techniques, with different inherent assumptions and biases. To bring these otherwise disparate measurement techniques to a common scale, we follow a strategy similar to that developed for M87* in M87* Paper VI and calibrate the diameter measurements using simulations from the Paper V GRMHD library. As described in Section 4.4, our calibration suite consists of synthetic data sets constructed from 90 GRMHD simulations spanning a range of accretion flow and black hole parameters, and for which an absolute reference size scale (i.e., the angular size of the gravitational radius θ_g) is known. We apply the same data processing and ring diameter measurement strategies as used for the Sgr A* data to each of these synthetic data sets, and we use the resulting distribution of diameter measurements to derive the value and uncertainty in the scaling factor (α) between θ_g and d for every method (Equation (26)). We note that a conceptually similar calibration is carried out in the companion Paper VI, in which the GRMHD assumption is relaxed and a more diverse set of spacetimes and accretion flow models is used to calibrate ring size measurements.

8.3.1. Calibrated Scaling Factors

When applied to the calibration suite data sets, each diameter measurement technique produces a discrete distribution of α scaling factors. We use a kernel density estimator (KDE) from the scikit-learn package (Pedregosa et al. 2011) to produce a nonparametric estimate of the continuous distribution corresponding to these discrete samples, and we use this KDE to construct the θ_g distribution for Sgr A* described below (Section 8.3.2).
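
The step from discrete α samples to a continuous density can be sketched as follows. This Python example deliberately uses a hand-written Gaussian kernel estimator from the standard library as a stand-in for the scikit-learn KDE, and it assumes that Equation (26) has the form d = α θ_g. The function name `gaussian_kde`, the bandwidth, and all numerical values are hypothetical.

```python
import math

def gaussian_kde(samples, bandwidth):
    """Return a function that evaluates a Gaussian kernel density estimate."""
    norm = 1.0 / (len(samples) * bandwidth * math.sqrt(2.0 * math.pi))

    def pdf(x):
        # Sum one Gaussian kernel centered on each discrete sample.
        return norm * sum(math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples)

    return pdf

# Hypothetical calibration output: ring diameters recovered from synthetic data
# sets (uas) and the known gravitational radius used to generate them (uas).
diameters = [52.0, 57.5, 49.0, 61.0, 55.0]
theta_g_true = 5.0
alphas = [d / theta_g_true for d in diameters]   # discrete alpha samples

pdf = gaussian_kde(alphas, bandwidth=0.8)         # bandwidth picked by hand here
step = 0.01
grid = [6.0 + step * i for i in range(1001)]      # alpha from 6 to 16
density = [pdf(a) for a in grid]
area = sum(density) * step                        # close to 1 if the grid covers the support
```

With only a handful of samples the estimate depends noticeably on the bandwidth, which is one source of the sampling noise that a finite calibration suite imprints on the resulting distribution.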

Table 5 lists the derived α value and its uncertainty—as computed from the KDE distribution—for each of the ring size measurement methods used in this paper. The uncertainty in α contains two main components: a statistical uncertainty ${\sigma }_{\alpha }^{(\mathrm{stat})}$ associated with the fidelity of the ring measurement from each data set, and a theoretical uncertainty ${\sigma }_{\alpha }^{(\mathrm{theory})}$ associated with the intrinsic scatter in α as measured across different GRMHD calibration data sets. The total uncertainty ${\sigma }_{\alpha }^{(\mathrm{tot})}$ is a combination of both the statistical and theoretical uncertainties, meaning that in practice we do not have access to ${\sigma }_{\alpha }^{(\mathrm{theory})}$ in isolation, and thus we do not report it in Table 5. Nevertheless, the relative values of ${\sigma }_{\alpha }^{(\mathrm{tot})}$ and ${\sigma }_{\alpha }^{(\mathrm{stat})}$ indicate that the theoretical uncertainty is typically the dominant component.

Table 5. α Calibration Parameters

Analysis Class	Software	Day	α	${\sigma }_{\alpha }^{(\mathrm{stat})}$	${\sigma }_{\alpha }^{(\mathrm{tot})}$
Snapshot	`Comrade`	April 6+7	12.0	(+0.2, −0.2)	(+1.6, −1.4)
	`DPI`	April 6+7	11.0	(+0.8, −0.8)	(+2.2, −4.3)

Full-track	Themis	April 6+7	11.7	(+0.1, −0.1)	(+1.3, −1.3)

Imaging	`DIFMAP` + `REx`	April 7	10.5	(+0.9, −1.4)	(+2.0, −2.3)
	`DIFMAP` + `VIDA`	April 7	10.6	(+1.0, −1.3)	(+1.7, −3.1)

	`eht-imaging` + `REx`	April 7	11.0	(+1.4, −1.3)	(+2.1, −2.5)
	`eht-imaging` + `VIDA`	April 7	11.0	(+1.2, −1.3)	(+1.7, −3.2)

	`SMILI` + `REx`	April 7	10.3	(+2.4, −2.1)	(+2.8, −4.4)
	`SMILI` + `VIDA`	April 7	10.4	(+1.4, −1.4)	(+1.8, −3.7)

	Themis + `REx`	April 6+7	10.3	(+0.5, −0.4)	(+1.5, −2.7)
	Themis + `VIDA`	April 6+7	10.6	(+0.4, −0.4)	(+1.2, −3.9)

Note. Median values and 68% credible intervals for the calibrated α values, averaged over frequency bands and calibration pipelines.

The calibrated α values span a range of ∼10–12, depending on the specific measurement technique. Statistical uncertainties in the α values determined using the geometric modeling techniques are a few percent, while for the IDFE techniques the statistical uncertainty is typically larger and reaches as high as ∼20% in the worst cases. Folding in both the statistical and the theoretical components, the total uncertainties are more comparable across methods, though they still exhibit a large range spanning ∼15%–35%. Overall, the calibrated α values show similar magnitudes to what M87* Paper VI derived from fits to M87*, but the calibration uncertainty in the case of Sgr A* is substantially larger. This increased uncertainty reflects the increased flexibility that has been built into the ring size measurement techniques to capture structural variability in the source, as well as the increased morphological diversity of the GRMHD calibration suite that is necessary to accommodate the a priori unknown inclination of Sgr A*.

8.3.2. Sgr A* Angular Gravitational Radius

We apply the calibrated α values to the measured Sgr A* ring diameters for each measurement technique, which produces a distribution of θ_g values that captures the uncertainties in both the ring measurements and the GRMHD calibration. The resulting θ_g distribution exhibits sampling noise from KDE of the α distribution (due to the finite size of the calibration suite), and it can exhibit secondary low-probability modes at large values of θ_g (see Appendix I). To provide a smooth unimodal estimate of the θ_g distribution, we fit a generalized lambda distribution (GλD) to the KDE distribution for each measurement technique. The GλD is a unimodal distribution representing a diverse family of probability density functions. We use the GλD parameterization from Freimer et al. (1988), and we use the GLDEX package in R (Su 2007a, 2007b) to carry out the fitting.
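
The way in which the α calibration and the diameter measurement combine into a θ_g distribution can be sketched in Python as below. This is an illustrative sketch only: it assumes that Equation (26) has the form d = α θ_g, that the d and α samples are independent, and that both are roughly Gaussian. The function name `theta_g_summary` and all sample values are hypothetical, and the GλD fitting step is not reproduced here.

```python
import random
import statistics

def theta_g_summary(d_samples, alpha_samples, n_draws=20000, seed=0):
    """Median and lower/upper 68% uncertainties of theta_g = d / alpha."""
    rng = random.Random(seed)
    # Pair independent draws of the diameter and the scaling factor.
    theta = [rng.choice(d_samples) / rng.choice(alpha_samples) for _ in range(n_draws)]
    cuts = statistics.quantiles(theta, n=100, method="inclusive")
    med = cuts[49]
    return med, med - cuts[15], cuts[83] - med

# Hypothetical inputs: a well-measured diameter (uas) and a broader alpha.
rng = random.Random(1)
d_samples = [rng.gauss(52.0, 0.5) for _ in range(2000)]
alpha_samples = [rng.gauss(11.5, 1.3) for _ in range(2000)]
median, err_minus, err_plus = theta_g_summary(d_samples, alpha_samples)
```

In this sketch the upper uncertainty exceeds the lower one even though both inputs are symmetric, because dividing by α stretches the tail toward large θ_g.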

The resulting θ_g distributions are shown in Figure 22 and listed in Table 6. An average across all methods and data sets yields a joint constraint of θ_g = ${4.8}_{-0.7}^{+1.4}$ μas, where the uncertainties are quoted at the 68% (i.e., 1σ) probability level and the systematic uncertainty is taken to be the standard deviation across all measurement methods. This value is consistent with the considerably more precise constraints obtained from measurements of stellar orbits (Do et al. 2019; Gravity Collaboration et al. 2019, 2020), and the gravitational implications of this consistency are explored in Paper VI.

Table 6. θ_g Values and Uncertainties across Analyses

			HOPS		CASA
Analysis	Software	Day	LO	HI	LO	HI
Snapshot	`Comrade`	April 6+7	${4.6}_{-0.5}^{+0.7}$	${4.5}_{-0.5}^{+0.7}$	${4.4}_{-0.4}^{+0.7}$	${4.6}_{-0.5}^{+0.7}$
	`DPI`	April 6+7	${4.5}_{-0.8}^{+1.7}$	...	${4.9}_{-0.8}^{+1.8}$	...

Full-track	Themis	April 6+7	${4.7}_{-0.4}^{+0.8}$	${4.7}_{-0.4}^{+0.8}$	${4.7}_{-0.4}^{+0.8}$	${4.7}_{-0.4}^{+0.8}$

Imaging	`DIFMAP`+ `REx`	April 7	${4.8}_{-0.9}^{+1.2}$		${4.8}_{-0.9}^{+1.2}$
	`DIFMAP`+ `VIDA`	April 7	${4.8}_{-0.7}^{+1.4}$		${4.8}_{-0.8}^{+1.5}$

	`eht-imaging`+ `REx`	April 7	${4.9}_{-0.8}^{+1.2}$		${4.9}_{-0.9}^{+1.3}$
	`eht-imaging` + `VIDA`	April 7	${5.0}_{-0.8}^{+1.6}$		${5.0}_{-0.9}^{+1.7}$

	`SMILI` + `REx`	April 7	${5.0}_{-1.1}^{+2.8}$		${5.2}_{-1.2}^{+2.8}$
	`SMILI` + `VIDA`	April 7	${5.1}_{-0.9}^{+1.9}$		${5.2}_{-1.3}^{+2.6}$

	Themis + `REx`	April 6+7	${5.1}_{-0.7}^{+1.1}$		...
	Themis + `VIDA`	April 6+7	${5.0}_{-0.5}^{+1.3}$		...

Note. The angular gravitational radius, θ_g, measured across analysis schemes and data sets; all values are quoted as medians and 68% credible intervals in units of μas. Entries that straddle both low-band and high-band columns have been computed from fits to band-combined data sets.

8.3.3. Sgr A* Mass

The angular size of the gravitational radius is proportional to the ratio of the mass and distance to Sgr A*, per Equation (25). Our constraints on θ_g can thus be mapped directly to constraints on the black hole mass M by incorporating an independent distance measurement to Sgr A*. Reid et al. (2019) report a distance of D = 8.15 ± 0.15 kpc (at 1σ probability) to the Galactic center, measured using trigonometric VLBI parallaxes of a large number (∼200) of masers. Using this distance measurement along with our measurement of θ_g from Section 8.3.2 yields a constraint on the mass of Sgr A* of M = ${4.0}_{-0.6}^{+1.1}\times {10}^{6}$ M_⊙, where we again quote 1σ uncertainties and the systematic component is taken to be the weighted standard deviation across methods. This measurement is once again consistent with the more precise constraints obtained from stellar orbits (Do et al. 2019; Gravity Collaboration et al. 2019, 2020), and the uncertainty in the mass remains dominated by our uncertainty in θ_g.
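
As a rough numerical check of this mapping, the following Python sketch converts the central values of θ_g and D into a mass, assuming the standard relation θ_g = GM/(c²D) for Equation (25). The function name `mass_from_theta_g` is hypothetical, and the physical constants are standard reference values rather than quantities quoted in this paper.

```python
import math

G = 6.67430e-11                    # gravitational constant, m^3 kg^-1 s^-2
C = 2.99792458e8                   # speed of light, m s^-1
M_SUN = 1.98847e30                 # solar mass, kg
PARSEC = 3.0856775814913673e16     # parsec, m
MICROARCSEC = math.pi / (180.0 * 3600.0 * 1.0e6)   # one microarcsecond, rad

def mass_from_theta_g(theta_g_uas, distance_kpc):
    """Black hole mass in solar masses from theta_g = G M / (c^2 D)."""
    theta_rad = theta_g_uas * MICROARCSEC
    distance_m = distance_kpc * 1.0e3 * PARSEC
    return theta_rad * distance_m * C**2 / (G * M_SUN)

# Central values of theta_g (uas) and D (kpc) quoted in the text.
mass = mass_from_theta_g(4.8, 8.15)
```

The central values give a mass close to 4 × 10^6 M_⊙. The fractional uncertainty in D (0.15/8.15, or about 2%) is much smaller than that in θ_g, so the mass uncertainty follows almost entirely from θ_g.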

9. Summary and Conclusions

In this paper we quantify the temporal variability and morphological properties of the horizon-scale emission from Sgr A*, using data taken by the EHT in 2017 April. Our primary morphological quantity of interest is the diameter of the observed ring of emission, which we quantify using multiple independent analysis pathways. We then use the ring diameter to place constraints on the angular size of the gravitational radius (θ_g) and on the mass (M) of Sgr A*. The analyses presented here have been carried out using data taken on April 6 and April 7, across two frequency bands and two data calibration pipelines.

Motivated by theoretical expectations that the dynamical timescales in Sgr A* should be much shorter than the duration of EHT observing tracks, we employ a new method developed in Broderick et al. (2022) for quantifying the time variability observed in the visibilities in a manner that is agnostic to the specifics of the average underlying source structure. We find that the visibility amplitudes exhibit a light-curve-normalized variance that is in excess of that expected from thermal noise, station gains, or refractive scattering effects, and we attribute this excess variance to intrinsic structural changes in the source. The detected variability is most statistically significant on baselines with lengths between 2.5 and 6 Gλ, where it exhibits an approximately power-law decline with increasing baseline length, with a power-law index of ∼2–3. The magnitude of this variability on baselines near 3 Gλ in length exceeds 0.1 Jy, which is roughly equal to the value of the correlated flux density on these same baselines.

Through an exploration of potential simple geometric source structures, we demonstrate that the EHT Sgr A* data statistically prefer ring-like morphologies over other morphologies with comparable complexity. We develop and deploy two new methods for fitting the time-variable Sgr A* data using static geometric ring models with azimuthally modulated brightness structures. In the first method, called "snapshot" geometric modeling, we first fit the models to short segments of data over which the source variability is subdominant to other sources of uncertainty. The fits from individual segments are then combined via a hierarchical model averaging scheme to provide parameter constraints across the entire observation. In the second method, called "full-track" geometric modeling, we fit a static geometric model to the entire data set alongside parameters that describe the statistical fluctuations that time variability induces in the data. Our parameterization for the variability "noise" is motivated by the work of Georgiev et al. (2022) and takes the form of a broken power law in baseline length that contributes to the data uncertainties.

We compare the results from snapshot and full-track geometric modeling, both with one another and with the results of IDFE from the images reconstructed in Paper III, to constrain the horizon-scale morphology of Sgr A*. The ring diameter is well constrained and stable across both frequency bands and calibration pipelines, with geometric modeling and IDFE techniques jointly determining a value of 51.8 ± 2.3 μas (68% credible intervals). We find that the magnitude and orientation of the ring asymmetry, as well as the depth of its central brightness depression, are poorly constrained and have values that can depend sensitively on the measurement method employed. The thickness of the ring is well measured by individual analysis methods but takes on a value that depends on the specifics of each method; geometric modeling methods find an FWHM ring thickness of 35% ± 5% of the ring diameter, while IDFE methods find an FWHM of 53% ± 10%.

Using a suite of synthetic data sets generated from the Paper V GRMHD simulation library, we calibrate the diameter measurements from both geometric modeling and IDFE methods to a common physical scale. The resulting constraint on the angular size of the Sgr A* gravitational radius, combined across all methods and data sets, is θ_g = ${4.8}_{-0.7}^{+1.4}$ μas. This large uncertainty arises from both the model flexibility necessary to capture structural variability in the source and the broad morphological diversity of the GRMHD calibration suite that reflects the a priori unknown inclination of Sgr A*. Combining our θ_g constraint with an independent distance measurement from Reid et al. (2019), we determine the mass of Sgr A* to be M = ${4.0}_{-0.6}^{+1.1}\times {10}^{6}$ M_⊙. Though the uncertainties are large compared to those derived using other techniques (e.g., stellar orbit modeling), our measurement represents the first time that the mass of Sgr A* has been constrained by observations of light bending near the horizon.

We thank an anonymous referee for insightful and constructive comments that helped improve the quality of this paper.

The Event Horizon Telescope Collaboration thanks the following organizations and programs: the Academia Sinica; the Academy of Finland (projects 274477, 284495, 312496, 315721); the Agencia Nacional de Investigación y Desarrollo (ANID), Chile via NCN19_058 (TITANs) and Fondecyt 1221421, the Alexander von Humboldt Stiftung; an Alfred P. Sloan Research Fellowship; Allegro, the European ALMA Regional Centre node in the Netherlands, the NL astronomy research network NOVA and the astronomy institutes of the University of Amsterdam, Leiden University and Radboud University; the ALMA North America Development Fund; the Black Hole Initiative, which is funded by grants from the John Templeton Foundation and the Gordon and Betty Moore Foundation (although the opinions expressed in this work are those of the author(s) and do not necessarily reflect the views of these Foundations); Chandra DD7-18089X and TM6-17006X; the China Scholarship Council; China Postdoctoral Science Foundation fellowship (2020M671266); Consejo Nacional de Ciencia y Tecnología (CONACYT, Mexico, projects U0004-246083, U0004-259839, F0003-272050, M0037-279006, F0003-281692, 104497, 275201, 263356); the Consejería de Economía, Conocimiento, Empresas y Universidad of the Junta de Andalucía (grant P18-FR-1769), the Consejo Superior de Investigaciones Científicas (grant 2019AEP112); the Delaney Family via the Delaney Family John A. 
Wheeler Chair at Perimeter Institute; Dirección General de Asuntos del Personal Académico-Universidad Nacional Autónoma de México (DGAPA-UNAM, projects IN112417 and IN112820); the Dutch Organization for Scientific Research (NWO) VICI award (grant 639.043.513) and grant OCENW.KLEIN.113; the Dutch National Supercomputers, Cartesius and Snellius (NWO Grant 2021.013); the EACOA Fellowship awarded by the East Asia Core Observatories Association, which consists of the Academia Sinica Institute of Astronomy and Astrophysics, the National Astronomical Observatory of Japan, Center for Astronomical Mega-Science, Chinese Academy of Sciences, and the Korea Astronomy and Space Science Institute; the European Research Council (ERC) Synergy Grant "BlackHoleCam: Imaging the Event Horizon of Black Holes" (grant 610058); the European Union Horizon 2020 research and innovation programme under grant agreements RadioNet (No 730562) and M2FINDERS (No 101018682); the Generalitat Valenciana postdoctoral grant APOSTD/2018/177 and GenT Program (project CIDEGENT/2018/021); MICINN Research Project PID2019-108995GB-C22; the European Research Council for advanced grant 'JETSET: Launching, propagation and emission of relativistic jets from binary mergers and across mass scales' (Grant No. 884631); the Institute for Advanced Study; the Istituto Nazionale di Fisica Nucleare (INFN) sezione di Napoli, iniziative specifiche TEONGRAV; the International Max Planck Research School for Astronomy and Astrophysics at the Universities of Bonn and Cologne; DFG research grant "Jet physics on horizon scales and beyond" (Grant No. 
FR 4069/2-1); Joint Princeton/Flatiron and Joint Columbia/Flatiron Postdoctoral Fellowships, research at the Flatiron Institute is supported by the Simons Foundation; the Japan Ministry of Education, Culture, Sports, Science and Technology (MEXT; grant JPMXP1020200109); the Japanese Government (Monbukagakusho: MEXT) Scholarship; the Japan Society for the Promotion of Science (JSPS) Grant-in-Aid for JSPS Research Fellowship (JP17J08829); the Joint Institute for Computational Fundamental Science, Japan; the Key Research Program of Frontier Sciences, Chinese Academy of Sciences (CAS, grants QYZDJ-SSW-SLH057, QYZDJSSW-SYS008, ZDBS-LY-SLH011); the Leverhulme Trust Early Career Research Fellowship; the Max-Planck-Gesellschaft (MPG); the Max Planck Partner Group of the MPG and the CAS; the MEXT/JSPS KAKENHI (grants 18KK0090, JP21H01137, JP18H03721, JP18K13594, 18K03709, JP19K14761, 18H01245, 25120007); the Malaysian Fundamental Research Grant Scheme (FRGS) FRGS/1/2019/STG02/UM/02/6; the MIT International Science and Technology Initiatives (MISTI) Funds; the Ministry of Science and Technology (MOST) of Taiwan (103-2119-M-001-010-MY2, 105-2112-M-001-025-MY3, 105-2119-M-001-042, 106-2112-M-001-011, 106-2119-M-001-013, 106-2119-M-001-027, 106-2923-M-001-005, 107-2119-M-001-017, 107-2119-M-001-020, 107-2119-M-001-041, 107-2119-M-110-005, 107-2923-M-001-009, 108-2112-M-001-048, 108-2112-M-001-051, 108-2923-M-001-002, 109-2112-M-001-025, 109-2124-M-001-005, 109-2923-M-001-001, 110-2112-M-003-007-MY2, 110-2112-M-001-033, 110-2124-M-001-007, and 110-2923-M-001-001); the Ministry of Education (MoE) of Taiwan Yushan Young Scholar Program; the Physics Division, National Center for Theoretical Sciences of Taiwan; the National Aeronautics and Space Administration (NASA, Fermi Guest Investigator grant 80NSSC20K1567, NASA Astrophysics Theory Program grant 80NSSC20K0527, NASA NuSTAR award 80NSSC20K0645); NASA Hubble Fellowship grant HST-HF2-51431.001-A awarded by the Space Telescope 
Science Institute, which is operated by the Association of Universities for Research in Astronomy, Inc., for NASA, under contract NAS5-26555; the National Institute of Natural Sciences (NINS) of Japan; the National Key Research and Development Program of China (grant 2016YFA0400704, 2017YFA0402703, 2016YFA0400702); the National Science Foundation (NSF, grants AST-0096454, AST-0352953, AST-0521233, AST-0705062, AST-0905844, AST-0922984, AST-1126433, AST-1140030, DGE-1144085, AST-1207704, AST-1207730, AST-1207752, MRI-1228509, OPP-1248097, AST-1310896, AST-1440254, AST-1555365, AST-1614868, AST-1615796, AST-1715061, AST-1716327, AST-1716536, OISE-1743747, AST-1816420, AST-1935980, AST-2034306); NSF Astronomy and Astrophysics Postdoctoral Fellowship (AST-1903847); the Natural Science Foundation of China (grants 11650110427, 10625314, 11721303, 11725312, 11873028, 11933007, 11991052, 11991053, 12192220, 12192223); the Natural Sciences and Engineering Research Council of Canada (NSERC, including a Discovery Grant and the NSERC Alexander Graham Bell Canada Graduate Scholarships-Doctoral Program); the National Youth Thousand Talents Program of China; the National Research Foundation of Korea (the Global PhD Fellowship Grant: grants NRF-2015H1A2A1033752, the Korea Research Fellowship Program: NRF-2015H1D3A1066561, Brain Pool Program: 2019H1D3A1A01102564, Basic Research Support Grant 2019R1F1A1059721, 2021R1A6A3A01086420, 2022R1C1C1005255); Netherlands Research School for Astronomy (NOVA) Virtual Institute of Accretion (VIA) postdoctoral fellowships; Onsala Space Observatory (OSO) national infrastructure, for the provisioning of its facilities/observational support (OSO receives funding through the Swedish Research Council under grant 2017-00648); the Perimeter Institute for Theoretical Physics (research at Perimeter Institute is supported by the Government of Canada through the Department of Innovation, Science and Economic Development and by the Province of Ontario 
through the Ministry of Research, Innovation and Science); the Spanish Ministerio de Ciencia e Innovación (grants PGC2018-098915-B-C21, AYA2016-80889-P, PID2019-108995GB-C21, PID2020-117404GB-C21); the University of Pretoria for financial aid in the provision of the new Cluster Server nodes and SuperMicro (USA) for a SEEDING GRANT approved towards these nodes in 2020; the Shanghai Pilot Program for Basic Research, Chinese Academy of Science, Shanghai Branch (JCYJ-SHFY-2021-013); the State Agency for Research of the Spanish MCIU through the "Center of Excellence Severo Ochoa" award for the Instituto de Astrofísica de Andalucía (SEV-2017-0709); the Spinoza Prize SPI 78-409; the South African Research Chairs Initiative, through the South African Radio Astronomy Observatory (SARAO, grant ID 77948), which is a facility of the National Research Foundation (NRF), an agency of the Department of Science and Innovation (DSI) of South Africa; the Toray Science Foundation; Swedish Research Council (VR); the US Department of Energy (USDOE) through the Los Alamos National Laboratory (operated by Triad National Security, LLC, for the National Nuclear Security Administration of the USDOE, Contract 89233218CNA000001); and the YCAA Prize Postdoctoral Fellowship.

We thank the staff at the participating observatories, correlation centers, and institutions for their enthusiastic support. This paper makes use of the following ALMA data: ADS/JAO.ALMA#2016.1.01154.V. ALMA is a partnership of the European Southern Observatory (ESO; Europe, representing its member states), NSF, and National Institutes of Natural Sciences of Japan, together with National Research Council (Canada), Ministry of Science and Technology (MOST; Taiwan), Academia Sinica Institute of Astronomy and Astrophysics (ASIAA; Taiwan), and Korea Astronomy and Space Science Institute (KASI; Republic of Korea), in cooperation with the Republic of Chile. The Joint ALMA Observatory is operated by ESO, Associated Universities, Inc. (AUI)/NRAO, and the National Astronomical Observatory of Japan (NAOJ). The NRAO is a facility of the NSF operated under cooperative agreement by AUI. This research used resources of the Oak Ridge Leadership Computing Facility at the Oak Ridge National Laboratory, which is supported by the Office of Science of the U.S. Department of Energy under Contract No. DE-AC05-00OR22725. We also thank the Center for Computational Astrophysics, National Astronomical Observatory of Japan. The computing cluster of Shanghai VLBI correlator supported by the Special Fund for Astronomy from the Ministry of Finance in China is acknowledged.

APEX is a collaboration between the Max-Planck-Institut für Radioastronomie (Germany), ESO, and the Onsala Space Observatory (Sweden). The SMA is a joint project between the SAO and ASIAA and is funded by the Smithsonian Institution and the Academia Sinica. The JCMT is operated by the East Asian Observatory on behalf of the NAOJ, ASIAA, and KASI, as well as the Ministry of Finance of China, Chinese Academy of Sciences, and the National Key Research and Development Program (No. 2017YFA0402700) of China and Natural Science Foundation of China grant 11873028. Additional funding support for the JCMT is provided by the Science and Technology Facilities Council (UK) and participating universities in the UK and Canada. The LMT is a project operated by the Instituto Nacional de Astrofísica, Óptica, y Electrónica (Mexico) and the University of Massachusetts at Amherst (USA). The IRAM 30-m telescope on Pico Veleta, Spain is operated by IRAM and supported by CNRS (Centre National de la Recherche Scientifique, France), MPG (Max-Planck-Gesellschaft, Germany) and IGN (Instituto Geográfico Nacional, Spain). The SMT is operated by the Arizona Radio Observatory, a part of the Steward Observatory of the University of Arizona, with financial support of operations from the State of Arizona and financial support for instrumentation development from the NSF. Support for SPT participation in the EHT is provided by the National Science Foundation through award OPP-1852617 to the University of Chicago. Partial support is also provided by the Kavli Institute of Cosmological Physics at the University of Chicago. The SPT hydrogen maser was provided on loan from the GLT, courtesy of ASIAA.

This work used the Extreme Science and Engineering Discovery Environment (XSEDE), supported by NSF grant ACI-1548562, and CyVerse, supported by NSF grants DBI-0735191, DBI-1265383, and DBI-1743442. XSEDE Stampede2 resource at TACC was allocated through TG-AST170024 and TG-AST080026N. XSEDE JetStream resource at PTI and TACC was allocated through AST170028. This research is part of the Frontera computing project at the Texas Advanced Computing Center through the Frontera Large-Scale Community Partnerships allocation AST20023. Frontera is made possible by National Science Foundation award OAC-1818253. This research was carried out using resources provided by the Open Science Grid, which is supported by the National Science Foundation and the U.S. Department of Energy Office of Science. Additional work used ABACUS2.0, which is part of the eScience center at Southern Denmark University. Simulations were also performed on the SuperMUC cluster at the LRZ in Garching, on the LOEWE cluster in CSC in Frankfurt, on the HazelHen cluster at the HLRS in Stuttgart, and on the Pi2.0 and Siyuan Mark-I at Shanghai Jiao Tong University. The computer resources of the Finnish IT Center for Science (CSC) and the Finnish Computing Competence Infrastructure (FCCI) project are acknowledged. This research was enabled in part by support provided by Compute Ontario (http://computeontario.ca), Calcul Quebec (http://www.calculquebec.ca) and Compute Canada (http://www.computecanada.ca).

The EHTC has received generous donations of FPGA chips from Xilinx Inc., under the Xilinx University Program. The EHTC has benefited from technology shared under open-source license by the Collaboration for Astronomy Signal Processing and Electronics Research (CASPER). The EHT project is grateful to T4Science and Microsemi for their assistance with Hydrogen Masers. This research has made use of NASA's Astrophysics Data System. We gratefully acknowledge the support provided by the extended staff of the ALMA, both from the inception of the ALMA Phasing Project through the observational campaigns of 2017 and 2018. We would like to thank A. Deller and W. Brisken for EHT-specific support with the use of DiFX. We thank Martin Shepherd for the addition of extra features in the Difmap software that were used for the CLEAN imaging results presented in this paper. We acknowledge the significance that Maunakea, where the SMA and JCMT EHT stations are located, has for the indigenous Hawaiian people. IMV acknowledges the use of LLuis Vives HPC resources of the University of Valencia.

Facility: EHT.

Software: Comrade, DIFMAP, DPI, eht-imaging, metronization, REx, SMILI, Stan, Themis, VIDA.

Appendix A: Comparison of Sgr A* Variability in 2013 and 2017

Additional prior epochs of millimeter-VLBI observations of Sgr A* provide a means to explore a wider range of baseline lengths and assess the consistency of the variability across many years. Johnson et al. (2015) reported 1.3 mm VLBI observations of Sgr A* from 2013 March 21, 22, 23, 26, and 27 with an array that included JCMT, SMA, SMT, and the Combined Array for Research in Millimeter-wave Astronomy (CARMA, which has since been decommissioned). By virtue of the small number of participating stations, these observations were much more limited in their (u,v)-coverage than the 2017 EHT observations. Nevertheless, the 2013 observations provide a second multiday data set for which normalized variance estimates may be produced and compared to those reported in this paper.

Johnson et al. (2015) reported only the visibility amplitudes, which we average in time on a per-scan basis. This averaging presumes that there is no intrinsic phase evolution in the visibilities over the approximately 10-minute scan lengths; this approximation is well justified for the source sizes inferred from the 2017 EHT observations. The visibility data are normalized by the intrasite SMA-JCMT, SMA-SMA, and CARMA-CARMA baselines. Note that this normalization eliminates the need to perform a phase calibration like that applied to JCMT and LMT for the 2017 EHT data.
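
The per-scan averaging and intrasite normalization described above can be sketched as follows. This is a minimal illustration only: the record layout `(scan_id, baseline, amplitude)`, the function names, and the choice to divide by the mean intrasite amplitude within each scan are assumptions made for the example, not the procedure's actual implementation.

```python
from collections import defaultdict
from statistics import mean


def scan_average(records):
    """Average visibility amplitudes over each (scan, baseline) pair.

    `records` is an iterable of (scan_id, baseline, amplitude) tuples.
    This layout is hypothetical and chosen only for illustration.
    """
    groups = defaultdict(list)
    for scan_id, baseline, amplitude in records:
        groups[(scan_id, baseline)].append(amplitude)
    return {key: mean(values) for key, values in groups.items()}


def normalize_by_intrasite(averaged, intrasite_baselines):
    """Divide each scan-averaged amplitude by the intrasite amplitude.

    Assumption: the reference is the mean over all intrasite baselines
    present in the same scan. Scans without any intrasite measurement
    are dropped, since they cannot be normalized.
    """
    reference = defaultdict(list)
    for (scan_id, baseline), amplitude in averaged.items():
        if baseline in intrasite_baselines:
            reference[scan_id].append(amplitude)

    normalized = {}
    for (scan_id, baseline), amplitude in averaged.items():
        if baseline in intrasite_baselines or scan_id not in reference:
            continue
        normalized[(scan_id, baseline)] = amplitude / mean(reference[scan_id])
    return normalized
```

Because both the numerator and the reference come from the same scan, any common scaling of the amplitudes cancels in the ratio.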

Figure 23 shows the reconstructed normalized variances from the 2013 observations in comparison to those from the 2017 EHT observations. Baselines between 2.5 and 4 Gλ provide variability estimates that are broadly consistent between the two sets of observations. This agreement suggests that the degree of structural variability exhibited by Sgr A* during the 2017 EHT campaign is not anomalous.

In addition, because the station gains were well characterized for all stations during the 2013 observations, it is not necessary to make assumptions about the source size on short baselines. The 2013 observations therefore provide estimates of the normalized variance on baselines shorter than 2 Gλ. The large statistical errors of these measurements preclude strong constraints on the variability below 1 Gλ, but there are nevertheless hints of a turnover in the variability power between 1 and 2 Gλ.

Appendix B: Origin and Mitigation of Biases in Single-day Analyses

As discussed in Section 3, there is compelling evidence that Sgr A* exhibits structural variability on timescales ranging from minutes to a full observation night. The degree of variability is estimated in Sections 3.3 and 8.1 and found to be the dominant contribution to the difference between the observed visibilities and those associated with a mean image for baselines with lengths between ∼3 and 6 Gλ. Underlying the noise modeling mitigation method (see Section 3.4) is the assumption that the added "variability noise" modifies the data in a stochastic manner, i.e., coherent deviations do not persist throughout large patches of the (u,v)-plane. However, Earth-rotation aperture synthesis naturally results in correlated variability between visibilities that are nearby in the (u,v)-plane, because visibilities that have small (u,v) separations tend to also have small temporal separations. Furthermore, the sparsity of the EHT array prevents most locations in the (u,v)-plane from being sampled more than once in a single observation (though see Section 3.2 for several exceptions), meaning that multiple observing days must be combined to access more than a single instantiation of the source variability.

The impact of structural variability can be seen most prominently on the Chile-LMT baselines, which exhibit coherent deviations on ∼1 hr timescales that are evident in the visibility amplitudes presented in Figure 24. These deviations are most pronounced near 1 hr GMST (∼12 UTC), where they are evident on both April 6 and April 7. The residual gain uncertainties of ∼10%–20% (Paper II) are insufficient to explain the dramatic drop near 4 Gλ around 1 hr GMST on April 6. At the same GMST on April 7, the visibility amplitudes fluctuate upward by a similar amount. On the remaining days the amplitudes at this GMST lie between the April 6 and April 7 values, indicating that the variations are associated with a process that is uncorrelated on interday timescales.

**Figure 24.** Visibility amplitudes from the HOPS low-band Sgr A* data set, averaged coherently over 120 s segments, on April 6 (red), April 7 (blue), and April 5 and 10 (gray) on the Chile-LMT baselines as functions of baseline length (left) and observing time (right). Error bars indicate the error implied by the mean noise model and are intended to account for fluctuations due to variability in addition to statistical and known systematic error components.

Similar coherent deviations are also observed in the synthetic data sets produced from GRMHD simulations for the purposes of calibrating ring size measurements to a common physical scale (see Section 4.4 and Appendix D). Figure 25 shows a similar set of visibility amplitudes on the Chile-LMT baselines for one of these synthetic data sets (data set 092 in Table 7). The date and time of the spurious feature differ from those seen in the real Sgr A* data, with the feature arising in the simulated data set near 3 hr GMST (∼14 UTC), but a dramatic deviation is present and persists for ∼0.5 hr. In both the Sgr A* data and GRMHD simulations, these observed deviations are ≳3σ outliers (per the variance expected from the variability quantification scheme detailed in Section 3.3), they are nearly exclusively confined to the Chile-LMT baseline, and they are rare. However, due to their coherent nature (i.e., many data points on a given day are similarly displaced), these fluctuations violate the assumption of statistical independence made by the noise modeling mitigation scheme on a single day.

**Figure 25.** Visibility amplitudes from the HOPS low-band GRMHD validation synthetic data set (data set 092 in Table 7), averaged coherently over 120 s segments, on April 10 (red), April 7 (blue), and April 5 and 10 (gray) on the Chile-LMT baselines as functions of baseline length (left) and observing time (right). Error bars indicate the error implied by the mean noise model and are intended to account for fluctuations due to variability in addition to statistical and known systematic error components. Vertical dotted green lines indicate the positions at which frames are shown in Figure 26.

Table 7. GRMHD Synthetic Data Set Parameters

Data Set Index	Accretion State	Spin	Inclination (deg)	R_high	Position Angle (deg)	Input θ_g (μas)
000	MAD	−0.94	10	10	155	4.477
001	MAD	−0.94	10	40	−114	4.140
002	MAD	−0.94	10	160	24	5.317
003	MAD	−0.94	50	10	168	5.493
004	MAD	−0.94	50	40	90	5.493
005	MAD	−0.94	50	160	89	6.423
006	MAD	−0.94	90	10	−177	3.843
007	MAD	−0.94	90	40	−73	5.503
008	MAD	−0.94	90	160	112	6.152
009	MAD	−0.5	10	10	168	5.704
010	MAD	−0.5	10	40	51	5.684
011	MAD	−0.5	10	160	−12	4.502
012	MAD	−0.5	50	10	−22	5.724
013	MAD	−0.5	50	40	178	5.563
014	MAD	−0.5	50	160	105	4.039
015	MAD	−0.5	90	10	−171	5.935
016	MAD	−0.5	90	40	146	3.596
017	MAD	−0.5	90	160	−3	5.110
018	MAD	0.0	10	10	35	3.677
019	MAD	0.0	10	40	143	5.719
020	MAD	0.0	10	160	115	5.030
021	MAD	0.0	50	10	112	3.813
022	MAD	0.0	50	40	−101	4.301
023	MAD	0.0	50	160	−12	4.909
024	MAD	0.0	90	10	76	4.059
025	MAD	0.0	90	40	11	4.029
026	MAD	0.0	90	160	97	6.323
027	MAD	0.5	10	10	40	3.974
028	MAD	0.5	10	40	−4	4.658
029	MAD	0.5	10	160	126	6.272
030	MAD	0.5	50	10	−42	4.472
031	MAD	0.5	50	40	25	4.089
032	MAD	0.5	50	160	−135	5.598
033	MAD	0.5	90	10	108	5.251
034	MAD	0.5	90	40	171	5.432
035	MAD	0.5	90	160	140	5.015
036	MAD	0.94	10	10	−54	5.679
037	MAD	0.94	10	40	1	4.200
038	MAD	0.94	10	160	−92	5.915
039	MAD	0.94	50	10	−2	6.282
040	MAD	0.94	50	40	161	5.131
041	MAD	0.94	50	160	−89	5.699
042	MAD	0.94	90	10	−48	5.025
043	MAD	0.94	90	40	−99	4.587
044	MAD	0.94	90	160	54	4.467
045	SANE	−0.94	10	10	97	5.880
046	SANE	−0.94	10	40	127	6.388
047	SANE	−0.94	10	160	−142	6.267
048	SANE	−0.94	50	10	−59	6.016
049	SANE	−0.94	50	40	144	3.652
050	SANE	−0.94	50	160	−110	4.411
051	SANE	−0.94	90	10	46	3.783
052	SANE	−0.94	90	40	−129	5.075
053	SANE	−0.94	90	160	68	6.106
054	SANE	−0.5	10	10	53	5.276
055	SANE	−0.5	10	40	76	4.281
056	SANE	−0.5	10	160	144	4.854
057	SANE	−0.5	50	10	−173	6.418
058	SANE	−0.5	50	40	55	5.070
059	SANE	−0.5	50	160	66	5.000
060	SANE	−0.5	90	10	154	5.080
061	SANE	−0.5	90	40	−154	5.236
062	SANE	−0.5	90	160	42	6.363
063	SANE	0.0	10	10	−31	4.321
064	SANE	0.0	10	40	−145	4.990
065	SANE	0.0	10	160	−13	3.612
066	SANE	0.0	50	10	70	5.684
067	SANE	0.0	50	40	83	4.773
068	SANE	0.0	50	160	−175	6.262
069	SANE	0.0	90	10	104	4.019
070	SANE	0.0	90	40	−68	5.367
071	SANE	0.0	90	160	−49	3.994
072	SANE	0.5	10	10	−116	6.142
073	SANE	0.5	10	40	−76	5.287
074	SANE	0.5	10	160	−17	4.763
075	SANE	0.5	50	10	138	5.613
076	SANE	0.5	50	40	−80	3.717
077	SANE	0.5	50	160	−109	6.333
078	SANE	0.5	90	10	128	6.403
079	SANE	0.5	90	40	−162	5.271
080	SANE	0.5	90	160	−7	3.586
081	SANE	0.94	10	10	−46	4.768
082	SANE	0.94	10	40	38	5.548
083	SANE	0.94	10	160	122	5.875
084	SANE	0.94	50	10	−36	5.920
085	SANE	0.94	50	40	142	4.311
086	SANE	0.94	50	160	177	6.096
087	SANE	0.94	90	10	84	4.592
088	SANE	0.94	90	40	138	6.142
089	SANE	0.94	90	160	165	3.521

090	MAD	0.0	150	160	19	5.694
091	MAD	0.5	70	160	−32	5.271
092	MAD	−0.5	30	160	46	4.497
093	MAD	0.94	30	10	86	4.069
094	MAD	0.5	150	160	11	5.437
095	SANE	−0.94	70	10	−51	4.778
096	SANE	0.5	110	40	176	5.835
097	SANE	−0.94	130	160	−118	3.984
098	SANE	0.0	150	40	70	5.065
099	SANE	0.5	110	10	−71	5.020

Note. Simulation parameters for each of the GRMHD-based synthetic data sets used for θ_g calibration (top; indices 000–089) and validation (bottom; indices 090–099). The sign of the spin follows the convention of M87* Paper V, where negative values indicate that the angular momentum of the accretion flow is antialigned with that of the black hole. The inclination angle is given in degrees, with 90° indicating an edge-on system and 0° indicating a system whose spin vector is pointed toward us. The position angle is given in degrees east of north and refers to the orientation of the black hole spin vector. For each simulation, the input value of θ_g is given in μas.

The origin of the visibility amplitude deviations for the GRMHD simulations can be identified with coherent variable structures moving about the ring. As shown in Figure 26, instantaneous images from the GRMHD simulations can differ qualitatively from the average image, with the former sometimes dominated by small, bright patches of emission. When aligned in the NW–SE direction and separated by the ∼50 μas ring diameter, these bright emission regions significantly impact the visibilities on the |u| ≈ 4 Gλ Chile-LMT baselines. Days that do not exhibit large variations correspond to periods less impacted by such patchy emission structures.

**Figure 26.** Average (left) and snapshot (right) images from the GRMHD simulation used to generate the validation synthetic data set shown in Figure 25. Three frames from the portion of the simulation used to generate the April 10 (top, outlined in red) and April 7 (bottom, outlined in blue) synthetic data sets are shown at the GMST times specified (corresponding to the times indicated by the vertical dotted green lines in Figure 25). All of the images share a common brightness color scale; the absolute brightness scale is arbitrary because each image has been normalized to have unit total flux density, and a modest amount of saturation has been permitted in the brightest regions to enhance the visibility of low-brightness features.

The coherent deviations seen on the Chile-LMT baselines can manifest as pathological behaviors in the single-day Sgr A* analyses carried out in this paper and in Paper III, particularly when applied to the sparser April 6 data set (see left column of Figure 27). The specific manner in which the visibility fluctuations impact the reconstructed source structure depends on the details of the analysis scheme and the freedom each underlying image model has to accommodate a subset of visibilities that are discrepant with the time-averaged structure.

**Figure 27.** Comparison of the results from Themis imaging (top row), full-track geometric modeling (middle row), and snapshot modeling (bottom row) across different combinations of April 6 and 7 data sets. The first column shows results from fitting to the April 6 data, the second column shows results from fitting to the April 6 data after flagging baselines between Chile and LMT, the third column shows results from fitting to the April 7 data, and the fourth column shows results from fitting to the combined April 6 and 7 data. In all panels, we show an image corresponding to the posterior mean; for the Themis imaging results, each sample image has been shifted during averaging so as to maximize the normalized cross-correlation computed with respect to a reference sample. The full-track and snapshot modeling results show fits to the HOPS pipeline low-band data, while the imaging results show fits to the HOPS pipeline combined low- and high-band data. All of the images share a common brightness color scale; the absolute brightness scale is arbitrary because each image has been normalized to have unit total flux density, and a modest amount of saturation has been permitted in the brightest regions to enhance the visibility of low-brightness features.

Images reconstructed from the April 6 Sgr A* data exhibit parallel NE–SW streaks, typical of baseline artifacts associated with miscalibration of a single baseline, and a natural consequence of visibility amplitudes on the Chile-LMT baseline that are discrepant with the time-averaged image. The top left panel of Figure 27 shows such artifacts in a Themis image reconstruction; similar image artifacts are observed in the RML (eht-imaging and SMILI) and, to a lesser extent, the CLEAN (DIFMAP) reconstructions presented in Paper III.

By virtue of their specification, the mG-ring source models are not capable of introducing streak-like features into the image structure. When applied to Sgr A*, full-track mG-ring fits to the April 6 data instead exhibit smaller ring sizes than those applied to the April 7 data (see the first and third panels in the second row of Figure 27), with the April 6 fits preferring rings with a ∼40 μas diameter and the April 7 fits preferring rings with a ∼55 μas diameter. This discrepancy in ring size may be associated with a shift in the location of the visibility minimum from ∼3 to ∼4 Gλ between April 7 and April 6, respectively. However, we note that the behavior of the full-track mG-ring model when applied to the April 6 data is a sensitive function of the m-order; changing the m-order can cause the model to prefer a ∼55 μas diameter. Snapshot mG-ring fits exhibit qualitatively similar behavior to the full-track fits, as shown in the third row of Figure 27. We again find that the April 6 data prefer a somewhat smaller ring diameter than the April 7 fits, though for snapshot fits the posterior distributions for the diameter parameter are consistent between the 2 days.

Despite their disparate forms, the various artifacts observed in April 6 reconstructions are effectively ameliorated by flagging the Chile-LMT baselines prior to carrying out the analyses (see the second column in Figure 27), providing further evidence that the origins of the artifacts are confined to (or at least dominated by) the Chile-LMT baselines. However, though flagging of these baselines is successful in preventing the specific analysis pathologies discussed above, such flagging is not otherwise motivated; there is no evidence for atypical data calibration issues on these baselines, and thus no reason to believe that the observed variability excess is anything other than intrinsic to the source. Rather than flagging data, we proceed instead with the noise modeling scheme described in Section 3.4, which is itself intended to mitigate the effects of intrinsic variability on the source reconstructions.

As described above, the apparent failure of the noise modeling method to produce consistent results on some individual single-day analyses can be attributed to the coherent (rather than stochastic) nature of the variability sampled at any single location in the (u,v)-plane, which violates the key assumption underlying the noise modeling approach that each data point represent an independent sample of the source variability. This assumption would be more faithfully adhered to by a data set containing a larger number of independent variability realizations; within the context of the EHT Sgr A* observations, additional variability realizations are most naturally acquired by combining data sets across observing days. Multiday analyses are also more consistent in spirit with the model-agnostic variability estimates in Sections 3.3 and 8.1. Combining the independent realizations of the structural variability from multiple days improves the estimate of the mean visibilities, as evident in Figure 24, in which the multiday mean of the visibility amplitudes is both smoother than and intermediate between those of April 6 and 7 individually.

The fourth column in Figure 27 shows reconstructions made using the combined April 6 and 7 data sets. The improved behavior of the visibility means is reflected in better consistency across methods and a reduction in image artifacts.

Appendix C: Single-day Fits

In this section we present the results from analyses carried out on the April 6 and April 7 data sets individually. Figures 28 and 29 show single-day images from each of the analysis pipelines (analogous to those shown in Figure 20), and Figure 30 shows the corresponding measurements of morphological properties (analogous to those shown in Figure 21).

**Figure 28.** Same as Figure 20, but showing results obtained from using only the April 6 data.

**Figure 29.** Same as Figure 20, but showing results obtained from using only the April 7 data.

**Figure 30.** Similar to Figure 21, but showing results obtained from fitting to the April 6 and 7 data sets separately. Each column from Figure 21 has now been split into a pair of columns showing April 6 results on the left and April 7 results on the right.

Appendix D: GRMHD Synthetic Data Set Generation

To calibrate measurements of the angular gravitational radius θ_g (see Section 4.4), we rely on the library of GRMHD simulations and associated GRRT synthetic movies produced and described in Paper V. In this appendix we provide an overview of our model selection and data generation procedures, which are conceptually similar to the calibration analysis carried out in M87* Paper VI.

We select 90 simulations from the GRMHD library to be used for θ_g calibration and another 10 simulations to be used to validate this calibration. The 90 calibration data sets uniformly grid a range of GRMHD parameters: every combination of the two MAD and SANE accretion states, five black hole spin values of [−0.94, −0.5, 0, 0.5, 0.94], three inclinations of [10°, 50°, 90°], and three R_high values of [10, 40, 160] are represented.¹⁶¹ The 10 validation data sets are split evenly between MAD and SANE, but the black hole spins are randomly selected from [−0.94, −0.5, 0, 0.5, 0.94], the inclinations are randomly selected from [30°, 70°, 110°, 130°, 150°], and the R_high values are randomly selected from [10, 40, 160]. The resulting model images contain a large variety of possible image morphologies, along with self-consistent dynamics as governed by the equations of GRMHD and GRRT. The GRMHD parameters corresponding to each selected calibration and validation model are listed in Table 7, and some example average images are shown in Figure 31.
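
The uniform calibration grid described above can be enumerated with a short sketch. The variable names and the dictionary layout are illustrative assumptions; only the parameter values come from the text. With the nesting order used here (accretion state outermost, R_high innermost), the enumeration index matches the data set indices 000–089 in Table 7.

```python
from itertools import product

# Parameter values as listed in the text.
ACCRETION_STATES = ("MAD", "SANE")
SPINS = (-0.94, -0.5, 0.0, 0.5, 0.94)
INCLINATIONS_DEG = (10, 50, 90)
R_HIGH_VALUES = (10, 40, 160)

# One entry per combination; the key names are hypothetical.
calibration_grid = [
    {
        "index": f"{i:03d}",
        "state": state,
        "spin": spin,
        "inclination_deg": inclination,
        "r_high": r_high,
    }
    for i, (state, spin, inclination, r_high) in enumerate(
        product(ACCRETION_STATES, SPINS, INCLINATIONS_DEG, R_HIGH_VALUES)
    )
]

# 2 states x 5 spins x 3 inclinations x 3 R_high values
print(len(calibration_grid))
```

The position angles and input θ_g values in Table 7 are drawn randomly and are therefore not part of this grid.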

**Figure 31.** Example average images from some of the GRMHD movies selected for synthetic data generation; each movie has been light-curve-normalized prior to averaging. The data set indices are labeled in the upper left corner of each panel, and the corresponding GRMHD parameters are listed in Table 7. All of the images share a common brightness color scale; the absolute brightness scale is arbitrary because each image has been normalized to have unit total flux density, and a modest amount of saturation has been permitted in the brightest regions to enhance the visibility of low-brightness features. We note that these average images tend to have much smoother structure than the individual frames of the movies that were averaged to produce them (see, e.g., Figure 26 for several example frames from one movie); the synthetic data sets themselves are produced from the movies and not from the average images.

After selecting the GRMHD models and prior to generating synthetic data, we first modify their orientations and angular sizes from their default simulation values. We rotate each simulated movie by a position angle that is a uniformly chosen integer in the range [−180°, 180°]. Each of the simulations from Paper V was produced assuming a mass of M = 4.14 × 10⁶ M_⊙ and a distance of 8.127 kpc (Do et al. 2019; Gravity Collaboration et al. 2019), corresponding to an angular gravitational radius of θ_g = 5.03 μas. To avoid biasing our calibration in favor of any one value of θ_g, we modify the overall spatial scale of each simulated movie by a random factor that is uniformly drawn from the range [0.7, 1.3]. The input position angles and gravitational radii for each movie are listed in Table 7.
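
As a worked check of the numbers above, the angular gravitational radius θ_g = GM/(c²D) can be evaluated for the quoted mass and distance, and the random rotation and rescaling can be sketched alongside it. The physical constants are standard values supplied here as assumptions, and the function names and use of Python's `random` module are illustrative rather than the collaboration's actual code.

```python
import math
import random

# Standard physical constants (SI); supplied for this sketch.
G = 6.67430e-11               # gravitational constant, m^3 kg^-1 s^-2
C = 2.99792458e8              # speed of light, m s^-1
M_SUN = 1.98847e30            # solar mass, kg
KPC = 3.0856775814913673e19   # kiloparsec, m
RAD_TO_MUAS = math.degrees(1.0) * 3600.0 * 1.0e6  # radians -> microarcseconds


def theta_g_muas(mass_msun, distance_kpc):
    """Angular gravitational radius GM / (c^2 D) in microarcseconds."""
    r_g = G * mass_msun * M_SUN / C**2          # gravitational radius, m
    return r_g / (distance_kpc * KPC) * RAD_TO_MUAS


def draw_orientation_and_scale(rng, theta_g_default):
    """Draw an integer position angle in [-180, 180] degrees and rescale
    theta_g by a factor drawn uniformly from [0.7, 1.3]."""
    position_angle_deg = rng.randint(-180, 180)
    scale = rng.uniform(0.7, 1.3)
    return position_angle_deg, scale * theta_g_default


theta_default = theta_g_muas(4.14e6, 8.127)
print(f"default theta_g = {theta_default:.2f} muas")
print(draw_orientation_and_scale(random.Random(0), theta_default))
```

The rescaling range implies input θ_g values between roughly 0.7 and 1.3 times the default, which brackets the values listed in Table 7.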

Once the GRMHD models are selected and their movies rotated and scaled, we generate synthetic data sets in the same manner as for the synthetic data sets used in Paper III. We use the eht-imaging software to first apply artificial scattering to the source structure per the scattering model from Johnson et al. (2018) and then sample the Fourier transform of each movie at a cadence and at (u,v) locations identical to those of the EHT observations of Sgr A*. The resulting visibilities are then corrupted with thermal noise and station-based gain and leakage effects at a level that is consistent with the Sgr A* data (Paper II). Eight synthetic data sets are generated for each GRMHD model, corresponding to the (u,v)-coverage on four observing nights—2017 April 5, 6, 7, and 10—and two frequency bands (see Section 2).

Appendix E: Representative χ² Values for Each Analysis Method

In this section we provide representative χ² values and associated quantities for each of the analysis methods used in this paper. Specifically, we report values corresponding to the example fits shown in Figures 9, 15, and 17.

For any fitted data quantity q with modeled counterpart $\hat{q}$ and associated measurement uncertainty σ, we determine the χ² as

$\begin{eqnarray}&&{\chi }^{2}=\sum _{i}^{{N}_{\mathrm{data}}}{\left(\displaystyle \frac{{q}_{i}-{\displaystyle \hat{q}}_{i}}{{\sigma }_{i}}\right)}^{2},\end{eqnarray} \tag{ E1 }$

where the sum is understood to be taken over all N_data fitted data points. We also define a reduced-χ² value,

$\begin{eqnarray}&&{\chi }_{\mathrm{red}}^{2}=\displaystyle \frac{{\chi }^{2}}{{N}_{\mathrm{dof}}},\end{eqnarray} \tag{ E2 }$

where N_dof is the number of degrees of freedom remaining in the data after accounting for the free parameters in the model. However, we note that despite their familiarity, the interpretation of either of these χ² statistics is complicated by several aspects of the analyses presented in this paper.
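
Equations (E1) and (E2) translate directly into code. The following is a minimal sketch for real-valued data quantities; the function names are illustrative, and the number of degrees of freedom is taken as an input precisely because, as discussed in this appendix, it is not straightforward to determine.

```python
def chi_squared(data, model, sigma):
    """Equation (E1): sum of squared, uncertainty-normalized residuals."""
    if not (len(data) == len(model) == len(sigma)):
        raise ValueError("data, model, and sigma must have equal lengths")
    return sum(((q - q_hat) / s) ** 2 for q, q_hat, s in zip(data, model, sigma))


def reduced_chi_squared(chi2, n_dof):
    """Equation (E2): chi-squared divided by the degrees of freedom.

    n_dof must be supplied by the caller; this sketch makes no attempt
    to estimate it.
    """
    if n_dof <= 0:
        raise ValueError("n_dof must be positive")
    return chi2 / n_dof


# Toy values, not taken from any EHT data set.
chi2 = chi_squared([1.0, 2.0, 3.0], [1.0, 1.0, 5.0], [1.0, 0.5, 2.0])
print(chi2, reduced_chi_squared(chi2, 3))
```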

The first complication is that the number of degrees of freedom is generically unknown, rendering ${\chi }_{\mathrm{red}}^{2}$ difficult to define in practice. The use of informative priors and the presence of correlations among model parameters mean that N_dof cannot be determined as simply the difference between N_data and the number of free parameters in the model. For example, in the RML imaging methods (eht-imaging and SMILI), the number of effective image parameters is implicitly reduced—relative to the number that would be assumed by simply counting the total number of image pixels—by a large factor by the imposition of regularization terms (e.g., smoothness, sparseness) in the objective function (Paper III). Additionally, for all methods that fit to complex visibilities or visibility amplitudes, station gains are reconstructed as part of the fitting process; for those methods that simultaneously reconstruct high- and low-band data, the gains for the two bands are necessarily strongly correlated. Furthermore, strong priors are imposed by network calibration (Paper II; M87* Paper III), further reducing the number of effective model parameters and growing the effective number of degrees of freedom.

The second complication is that, with the exception of the snapshot modeling presented in Section 6 (Comrade and DPI), all models make use of an added uncertainty budget to account for source variability. For the Themis imaging and full-track modeling analyses, the parameters describing this excess variability noise are simultaneously fit alongside those describing the image structure. In both cases, the impact is to drive the ${\chi }_{\mathrm{red}}^{2}$ value toward unity, rendering the resulting ${\chi }_{\mathrm{red}}^{2}$ value not particularly meaningful as a metric of fit quality.

Nevertheless, in Table 8 we present the χ² values and relevant properties of the data sets and models used for representative examples from each analysis pathway. In lieu of a well-defined number of degrees of freedom, we consider two limits. An optimistic estimate is given by the procedure adopted in Paper III, in which N_dof ≈ N_data, appropriate when the total number of effective model parameters is much less than the number of data points. In this limit, ${\chi }_{\mathrm{red}}^{2}$ ranges from 0.32 to 0.79 for the fits listed in Table 8. A more pessimistic accounting is given by N_dof ≈ N_data − N_gains − N_params. This quantity is negative for some analyses, a consequence of the strong correlations that limit the effective number of model parameters in practice (e.g., for RML imaging and for certain individual snapshot models). Among those analysis methods that do not exhibit this pathology in the N_dof, the pessimistic ${\chi }_{\mathrm{red}}^{2}$ estimates range from 0.9 to 1.4.

Table 8. Representative χ² Values and Ancillary Model Information

Method	Day	Band	χ²	N_data	N_gains ^a	N_params ^a
`eht-imaging` imaging	Apr 7	HI+LO	4174.5	12082	6358	6400
`SMILI` imaging	Apr 7	HI+LO	2192.9	6806	3562	5625
`DIFMAP` imaging	Apr 7	HI+LO	4194.5	12082	6358	2262
Themis imaging	Apr 6+7	HI+LO	1444.5	2990	1323	58
Themis full-track modeling	Apr 6+7	LO	1237.0	1562	666	19
`Comrade` snapshot modeling	Apr 6+7	LO	1987.2	4977	1345	3328
`DPI` snapshot modeling	Apr 6+7	LO	1752.7	3176	...	1664

Note. The χ² values, number of data points fit (N_data), number of station gain parameters (N_gains), and number of parameters describing the image structure (N_params) for representative example fits from each of the analysis methods used in this paper. For the eht-imaging and SMILI analyses, N_params is equal to the number of pixels in the image (Paper III); for the DIFMAP analyses, N_params is determined by the number of CLEAN components, with each component contributing three parameters. For the Themis imaging and full-track modeling analyses, N_params is equal to the number of model parameters used to specify the image structure (see Paper III and Section 7, respectively). For the Comrade and DPI snapshot modeling analyses, each of the N_s snapshots contributes 13 and 8 parameters, respectively, to the total N_params count (see Section 6). Note that because the DPI analysis fits only to closure quantities, it does not contain any station gain parameters.

^aNote that the listed numbers of gains and model parameters are solely those necessary to forward-compute the model data values, i.e., these values characterize a property of the model specification. Importantly, due to model nonlinearity and strong correlations between parameters, these numbers are not generally suitable for determining N_dof (see Appendix E).
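
The two limiting accountings of N_dof discussed in Appendix E can be evaluated directly from the Table 8 entries. The sketch below is illustrative: the tuple layout and function name are assumptions, and the DPI gain count is entered as zero because that analysis has no station gain parameters. It uses only the totals listed in the table, so the per-snapshot accounting mentioned in the text for the snapshot models is not reproduced here.

```python
# (method, chi2, N_data, N_gains, N_params), copied from Table 8.
TABLE_8 = [
    ("eht-imaging imaging", 4174.5, 12082, 6358, 6400),
    ("SMILI imaging", 2192.9, 6806, 3562, 5625),
    ("DIFMAP imaging", 4194.5, 12082, 6358, 2262),
    ("Themis imaging", 1444.5, 2990, 1323, 58),
    ("Themis full-track modeling", 1237.0, 1562, 666, 19),
    ("Comrade snapshot modeling", 1987.2, 4977, 1345, 3328),
    ("DPI snapshot modeling", 1752.7, 3176, 0, 1664),  # closure-only fit: no gains
]


def chi2_red_limits(chi2, n_data, n_gains, n_params):
    """Return (optimistic, pessimistic) reduced chi-squared values.

    optimistic:  N_dof ~ N_data
    pessimistic: N_dof ~ N_data - N_gains - N_params, returned as None
                 when that count is not positive.
    """
    optimistic = chi2 / n_data
    n_dof = n_data - n_gains - n_params
    pessimistic = chi2 / n_dof if n_dof > 0 else None
    return optimistic, pessimistic


for method, chi2, n_data, n_gains, n_params in TABLE_8:
    optimistic, pessimistic = chi2_red_limits(chi2, n_data, n_gains, n_params)
    pess_text = "undefined" if pessimistic is None else f"{pessimistic:.2f}"
    print(f"{method:28s} optimistic={optimistic:.2f} pessimistic={pess_text}")
```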

Appendix F: Snapshot Modeling Likelihood Functions

In this appendix we provide specific expressions for the likelihood functions used during the snapshot geometric modeling analyses described in Section 6. We assume the high signal-to-noise ratio limit for all data products, which is not strictly satisfied for the relatively short integration times (120 s) employed in the snapshot modeling analyses, but which has the benefit of reducing all likelihood functions to Gaussians.

F.1. Visibility Amplitude Likelihood

For each snapshot and baseline, the visibility amplitudes are distributed according to a Rice distribution, which in the high signal-to-noise ratio limit becomes Gaussian (e.g., Wardle & Kronberg 1974; Broderick et al. 2020a). We can thus write the visibility amplitude likelihood function as

$\begin{eqnarray}&&{{ \mathcal L }}_{| V| ,{sb}}=\displaystyle \frac{1}{\sqrt{2\pi {\sigma }_{{sb}}^{2}}}\exp \left[-\displaystyle \frac{{(| {\displaystyle \hat{V}}_{{sb}}| -| {g}_{i}| | {g}_{j}| | {\displaystyle \hat{{ \mathcal V }}}_{{sb}}| )}^{2}}{2{\sigma }_{{sb}}^{2}}\right],\end{eqnarray} \tag{ F1 }$

where b is a baseline index that runs over all station pairs {i, j} in snapshot s. Here $| {\displaystyle \hat{{ \mathcal V }}}_{{sb}}|$ is the model visibility amplitude and ∣g_i∣ and ∣g_j∣ are the individual station gain amplitudes (see Equation (2)). We use this Gaussian approximation to the Rice distribution for all of the snapshot geometric modeling analyses. For a snapshot s, the joint visibility amplitude likelihood across all baselines is then given by

$\begin{eqnarray}&&{{ \mathcal L }}_{| V| ,s}=\prod _{b}{{ \mathcal L }}_{| V| ,{sb}},\end{eqnarray} \tag{ F2 }$

where the product is taken over all baselines b.
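As a minimal illustration of Equations (F1) and (F2), the following Python sketch evaluates the log of the Gaussian amplitude likelihood for a single snapshot. The function names, the dictionary layout, the station labels, and all numerical values are assumptions made for this example; they do not correspond to the interface of any of the fitting codes used in this paper.

```python
import math

def log_like_amp(obs_amp, model_amp, sigma, gain_i, gain_j):
    """Natural log of Eq. (F1) for one baseline in one snapshot."""
    # The station gain amplitudes rescale the model amplitude before
    # it is compared with the measured amplitude.
    resid = obs_amp - abs(gain_i) * abs(gain_j) * model_amp
    return -0.5 * math.log(2.0 * math.pi * sigma**2) - resid**2 / (2.0 * sigma**2)

def log_like_amp_snapshot(data, gains):
    """Natural log of Eq. (F2): the product over baselines becomes a sum."""
    return sum(
        log_like_amp(obs, model, sigma, gains[i], gains[j])
        for (i, j), (obs, model, sigma) in data.items()
    )

# Made-up values: (measured amplitude, model amplitude, sigma) per baseline.
# "A", "B", "C" are placeholder station labels.
data = {
    ("A", "B"): (1.10, 1.00, 0.05),
    ("A", "C"): (0.52, 0.50, 0.04),
    ("B", "C"): (0.27, 0.25, 0.03),
}
gains = {"A": 1.05, "B": 1.02, "C": 1.00}
print(log_like_amp_snapshot(data, gains))
```

Working with the log-likelihood rather than the likelihood itself avoids numerical underflow when many baselines are multiplied together.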

F.2. Closure Phase Likelihood

The visibility phases in EHT data sets are heavily corrupted by atmospheric fluctuations (M87* Paper II; M87* Paper III), so all of our snapshot geometric modeling analyses work instead with closure phases ψ (see Equation (3)). In the high signal-to-noise ratio limit, the variance in the closure phase on the triangle containing stations i, j, and k is given by

$\begin{eqnarray}&&{\sigma }_{\psi ,{ijk}}^{2}={\sigma }_{\mathrm{ln}| V| ,{ij}}^{2}+{\sigma }_{\mathrm{ln}| V| ,{jk}}^{2}+{\sigma }_{\mathrm{ln}| V| ,{ki}}^{2},\end{eqnarray} \tag{ F3 }$

where

$\begin{eqnarray}&&{\sigma }_{\mathrm{ln}| V| ,{ij}}=\displaystyle \frac{{\sigma }_{{ij}}}{| {V}_{{ij}}| }\end{eqnarray} \tag{ F4 }$

is the uncertainty in the log visibility amplitude in the same limit and σ_ij is the uncertainty in V_ij. Hereafter we replace the triangle indices ijk with a single multi-index t for clarity. We approximate the closure phase likelihood for a single triangle t and snapshot s by a von Mises distribution,

$\begin{eqnarray}&&{{ \mathcal L }}_{\psi ,{st}}=\displaystyle \frac{1}{2\pi {I}_{0}({\sigma }_{\psi ,t}^{-2})}\exp \left[\displaystyle \frac{\cos ({\psi }_{t}-{\displaystyle \hat{\psi }}_{t})}{{\sigma }_{\psi ,t}^{2}}\right],\end{eqnarray} \tag{ F5 }$

where $\displaystyle \hat{\psi }$ denotes a measured closure phase and I₀(x) is a modified Bessel function of the first kind of order 0. In the high signal-to-noise ratio limit, the von Mises becomes a Gaussian distribution with mean ${\hat{\psi }}_{t}$ and standard deviation σ_ψ,t. Note that, up to a normalization, $\mathrm{ln}{{ \mathcal L }}_{\psi }$ is equal to the closure phase χ² defined in Paper III.
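The closure phase expressions in Equations (F3) through (F5) can be sketched in Python as follows. This is an illustrative sketch only: the function names are hypothetical, the numerical values are made up, and ln I₀ is evaluated with a simple power series for small arguments and the standard large-argument asymptotic expansion otherwise, which is a choice made here to keep the example free of external dependencies.

```python
import math

def sigma_closure_phase(baselines):
    """Eqs. (F3)-(F4): `baselines` holds three (sigma_ij, |V_ij|) pairs.

    Returns the closure phase uncertainty in radians.
    """
    return math.sqrt(sum((sigma / amp) ** 2 for sigma, amp in baselines))

def log_bessel_i0(x):
    """ln I0(x) for x >= 0."""
    if x < 20.0:
        # Power series: I0(x) = sum_k (x^2 / 4)^k / (k!)^2
        term, total = 1.0, 1.0
        for k in range(1, 80):
            term *= (0.5 * x / k) ** 2  # ratio of successive terms
            total += term
        return math.log(total)
    # Asymptotic expansion, evaluated in log form to avoid overflow.
    corr = 1.0 + 1.0 / (8.0 * x) + 9.0 / (128.0 * x**2) + 225.0 / (3072.0 * x**3)
    return x - 0.5 * math.log(2.0 * math.pi * x) + math.log(corr)

def log_like_closure_phase(psi_model, psi_obs, sigma_psi):
    """Natural log of Eq. (F5); all angles in radians."""
    kappa = sigma_psi ** -2
    return kappa * math.cos(psi_model - psi_obs) - math.log(2.0 * math.pi) - log_bessel_i0(kappa)

def log_like_gauss(psi_model, psi_obs, sigma_psi):
    """Gaussian limit of Eq. (F5), shown for comparison."""
    return (-0.5 * math.log(2.0 * math.pi * sigma_psi**2)
            - (psi_model - psi_obs) ** 2 / (2.0 * sigma_psi**2))

# Made-up triangle with 5% fractional amplitude uncertainty on each baseline.
sigma_psi = sigma_closure_phase([(0.05, 1.0), (0.025, 0.5), (0.0125, 0.25)])
print(log_like_closure_phase(0.10, 0.0, sigma_psi), log_like_gauss(0.10, 0.0, sigma_psi))
```

For small σ_ψ the two printed values should be close, reflecting the Gaussian limit described above; unlike the Gaussian form, the von Mises form remains periodic in the phase difference.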

Because a full set of closure phases is highly redundant (i.e., the set is not linearly independent; see Blackburn et al. 2020), we use eht-imaging (Chael et al. 2016, 2018) to construct a minimal nonredundant set within every snapshot. The minimal set of closure phases is constructed by selecting all triangles that contain the most sensitive station in the array (typically ALMA), which ensures that the resulting closure phases are minimally covariant. We ignore the remaining covariances and approximate the joint closure phase likelihood for a snapshot s as

$\begin{eqnarray}&&{{ \mathcal L }}_{\psi ,s}=\prod _{t}{{ \mathcal L }}_{\psi ,{st}},\end{eqnarray} \tag{ F6 }$

where the product is taken over all triangles t in the minimal set.
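The selection rule for the minimal set described above can be sketched in a few lines of Python. This is not the eht-imaging implementation; the function name is hypothetical and the station labels other than ALMA are placeholders. For N stations the rule yields (N − 1)(N − 2)/2 triangles, which is the number of linearly independent closure phases.

```python
from itertools import combinations

def minimal_triangles(stations, reference):
    """Return every triangle that includes the reference station."""
    others = [s for s in stations if s != reference]
    return [(reference, a, b) for a, b in combinations(others, 2)]

# "S1" ... "S4" are placeholder station labels.
stations = ["ALMA", "S1", "S2", "S3", "S4"]
tris = minimal_triangles(stations, "ALMA")
print(len(tris))  # 6 triangles for 5 stations
for tri in tris:
    print(tri)
```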

F.3. Log Closure Amplitude Likelihood

Some of our snapshot modeling fits use log closure amplitudes in place of visibility amplitudes. In the high signal-to-noise ratio limit, the variance in the log closure amplitude on the quadrangle containing stations i, j, k, and ℓ is given by

$\begin{eqnarray}&&{\sigma }_{\mathrm{ln}A,{ijk}{\ell }}^{2}={\sigma }_{\mathrm{ln}| V| ,{ij}}^{2}+{\sigma }_{\mathrm{ln}| V| ,k{\ell }}^{2}+{\sigma }_{\mathrm{ln}| V| ,{ik}}^{2}+{\sigma }_{\mathrm{ln}| V| ,j{\ell }}^{2}.\end{eqnarray} \tag{ F7 }$

Hereafter we replace the quadrangle indices ijk ℓ with a single multi-index q for clarity. For a single log closure amplitude on quadrangle q, we thus have the Gaussian likelihood

$\begin{eqnarray}&&{{ \mathcal L }}_{\mathrm{ln}A,{sq}}=\displaystyle \frac{1}{\sqrt{2\pi {\sigma }_{\mathrm{ln}A,{sq}}^{2}}}\exp \left[-\displaystyle \frac{{(\mathrm{ln}{A}_{{sq}}-\mathrm{ln}{\displaystyle \hat{A}}_{{sq}})}^{2}}{2{\sigma }_{\mathrm{ln}A,{sq}}^{2}}\right],\end{eqnarray} \tag{ F8 }$

in the same limit. We use eht-imaging to construct a minimal nonredundant set of log closure amplitudes within every snapshot, and for the joint likelihood over all of the linearly independent quadrangles within that snapshot we again ignore covariance and treat the data products as being statistically independent. The joint likelihood is then given by

$\begin{eqnarray}&&{{ \mathcal L }}_{A,s}=\prod _{q}{{ \mathcal L }}_{A,{sq}},\end{eqnarray} \tag{ F9 }$

where the product is taken over all quadrangles q in the minimal set.
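A corresponding Python sketch for Equations (F7) and (F8) is given below. It assumes that the closure amplitude on a quadrangle is formed as A = |V_ij||V_kℓ| / (|V_ik||V_jℓ|), which is the combination of baselines suggested by Equation (F7); the function names and numerical values are hypothetical and serve only to illustrate the calculation.

```python
import math

def log_closure_amplitude(v_ij, v_kl, v_ik, v_jl):
    """ln A for the assumed convention A = |V_ij||V_kl| / (|V_ik||V_jl|)."""
    return (math.log(abs(v_ij)) + math.log(abs(v_kl))
            - math.log(abs(v_ik)) - math.log(abs(v_jl)))

def sigma_log_closure_amplitude(baselines):
    """Eq. (F7) with Eq. (F4): `baselines` holds four (sigma, |V|) pairs."""
    return math.sqrt(sum((sigma / amp) ** 2 for sigma, amp in baselines))

def log_like_lnA(lnA_model, lnA_obs, sigma_lnA):
    """Natural log of Eq. (F8) for one quadrangle in one snapshot."""
    resid = lnA_model - lnA_obs
    return -0.5 * math.log(2.0 * math.pi * sigma_lnA**2) - resid**2 / (2.0 * sigma_lnA**2)

# Made-up complex visibilities and thermal uncertainties on four baselines.
vis = [1.0 + 0.2j, 0.4 - 0.1j, 0.6 + 0.3j, 0.5 + 0.0j]
sig = [0.05, 0.03, 0.04, 0.03]
lnA_obs = log_closure_amplitude(*vis)
sigma_lnA = sigma_log_closure_amplitude([(s, abs(v)) for s, v in zip(sig, vis)])
print(lnA_obs, sigma_lnA, log_like_lnA(0.1, lnA_obs, sigma_lnA))
```

Because the station gain amplitudes cancel in this ratio, no gain parameters appear in the likelihood.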

Appendix G: Snapshot Modeling Prior Distribution Consistency

In specifying the hypermodel for combining geometric modeling results from individual snapshots (see Section 6.3), we have selected a set of snapshot priors π( θ _s), an average model prior $\pi (\displaystyle \bar{{\boldsymbol{\theta }}})$ , and a hypermodel $\pi ({{\boldsymbol{\theta }}}_{s}| \displaystyle \bar{{\boldsymbol{\theta }}})$ . A self-consistent set of hypermodel and snapshot priors should satisfy the relation

$\begin{eqnarray}&&\pi ({{\boldsymbol{\theta }}}_{s})=\int \pi ({{\boldsymbol{\theta }}}_{s}| {\boldsymbol{\mu }},{\boldsymbol{\sigma }})\pi ({\boldsymbol{\mu }})\pi ({\boldsymbol{\sigma }}){\rm{d}}{\boldsymbol{\mu }}{\rm{d}}{\boldsymbol{\sigma }}.\end{eqnarray} \tag{ G1 }$

However, because we have carried out the hierarchical modeling from Section 6 in two stages—i.e., first fitting the mG-ring model to individual snapshots, then combining the results from all snapshots—the priors on the model parameters during each stage are selected for local convenience, and the relation in Equation (G1) does not strictly hold.

As a result, the final posteriors for each parameter behave as though the individual snapshot fits were carried out using effective priors that differ from the priors we actually imposed (see Table 1). Figure 32 illustrates how the original priors differ from the effective priors induced by the snapshot-combining procedure. The net result is a modest (≲20% maximum probability density deviation) bias for parameter values to fall toward the center of the original (flat) prior range and away from the edges of that range. Because the widths of the individual snapshot posteriors are typically much narrower than the full prior range (see, e.g., Figure 13), we expect the impact of this centralizing bias on our reported values to be negligible.
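The right-hand side of Equation (G1) can be explored numerically by drawing from the hypermodel after marginalizing over μ and σ. The Python sketch below is a toy illustration under loudly stated assumptions: a Gaussian hypermodel truncated to a unit prior range, a uniform prior on μ over that range, and a uniform prior on σ up to an arbitrary maximum. These choices and all names are hypothetical and are not the hypermodel specification of Section 6.3.2; the sketch is intended only to show how a centralizing effect of this kind can arise.

```python
import random

def sample_effective_prior(n, lo=0.0, hi=1.0, sigma_max=0.5, seed=0):
    """Monte Carlo draws from the marginalized hypermodel of Eq. (G1).

    Assumed toy setup: mu ~ U(lo, hi), sigma ~ U(0, sigma_max), and
    theta ~ Normal(mu, sigma) truncated to [lo, hi].
    """
    rng = random.Random(seed)
    draws = []
    for _ in range(n):
        mu = rng.uniform(lo, hi)
        sigma = rng.uniform(0.0, sigma_max)
        theta = rng.gauss(mu, sigma)
        # Redraw at fixed (mu, sigma) until theta falls in range, which
        # samples the truncated normal for that (mu, sigma).
        while not (lo <= theta <= hi):
            theta = rng.gauss(mu, sigma)
        draws.append(theta)
    return draws

def density_in(draws, a, b):
    """Histogram estimate of the probability density over [a, b)."""
    return sum(a <= x < b for x in draws) / (len(draws) * (b - a))

draws = sample_effective_prior(20000)
edge = 0.5 * (density_in(draws, 0.0, 0.1) + density_in(draws, 0.9, 1.0))
center = density_in(draws, 0.45, 0.55)
# A flat prior on the unit range would give a density of 1 everywhere.
print(center, edge)
```

In this toy setup the estimated density is higher near the middle of the range than near its edges, which is qualitatively the behavior described above; the size of the effect depends on the assumed hyperpriors.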

**Figure 32.** Illustration of the centralizing bias induced by specifying both the snapshot priors and hypermodel priors separately, using the diameter parameter as an example. The orange curve shows the diameter prior specified during mG-ring model fitting of an individual snapshot, and the blue curve shows the effective prior on this parameter after snapshots are combined via the procedure specified in Section 6.3.

Appendix H: Snapshot Geometric Modeling Validation Tests

We cross-validate the results of each snapshot modeling software using a single snapshot of the low-band Sgr A* data from April 7 processed with the HOPS pipeline. The specific snapshot selected begins at 12.65 hr UTC, where the (u,v)-coverage of the observation is maximized (Farah et al. 2022). Given the differences in model specification and fitted data products between Comrade/eht-imaging and DPI, we run two separate cross-validation tests:

1. The first test compares the results between Comrade and eht-imaging when fitting an m = 3 mG-ring model to visibility amplitudes (including gain amplitudes as model parameters) and closure phases. We do not include DPI in this comparison because it cannot currently fit for station gain parameters.
2. The second test compares the results between Comrade, DPI, and eht-imaging when fitting an m = 2 mG-ring model to log closure amplitudes and closure phases.

Figure 33 shows the diameter, thickness, and fractional central flux posteriors obtained from performing the tests described above. The posteriors show generally good agreement across codes.

**Figure 33.** Comparison of the 2D joint posterior distributions obtained from fitting an mG-ring model to a 120 s snapshot starting at 12.65 hr UT in the Sgr A* April 7 HOPS low-band data set. The left plot compares the results from `Comrade` (blue) and `eht-imaging` (orange), fitting to visibility amplitudes (including gains) and closure phases with an m = 3 mG-ring. The right triangle plot shows the results from `Comrade` (blue), `eht-imaging` (orange), and `DPI` (green), fitting to closure amplitudes and closure phases for an m = 2 mG-ring. In both cases we only show the results for the diameter, width, and fractional Gaussian component flux parameters. Since `DPI` fits the diameter of the blurred m-ring $d^{\prime}$ (Equation (42)), the `DPI` diameter was debiased so that it corresponds to the infinitesimally thin m-ring diameter that is fit by `eht-imaging` and `Comrade` (see also M87* Paper VI). The contours show 1σ, 2σ, and 3σ levels of the posterior distributions.

Appendix I: Analysis Specifics and Validation of Calibration Strategy

Our calibration strategy for determining the scaling factor α that relates measured ring diameters to intrinsic angular gravitational radii θ_g is described in Section 4.4. In this appendix, we summarize the elements of this strategy that are specific to the different analysis pathways described in Sections 5, 6, and 7. We also validate the calibration procedure by applying the calibrated α values to ring diameter measurements from 10 synthetic GRMHD data sets. For these data sets we know the underlying ground-truth θ_g values, and so we can use them to verify whether our measurement and calibration strategy is working as intended.

I.1. IDFE Specifics

To perform an IDFE-based θ_g calibration, top-set and posterior images are produced for each of the 90 synthetic GRMHD-based data sets described in Section 4.4 (see also Appendix D). Each of these images is run through both REx and VIDA in the same manner as described in Section 5 for the Sgr A* data.

I.2. Snapshot Geometric Modeling Specifics

We carry out snapshot geometric modeling of the GRMHD calibration and validation data sets in the same manner as described in Section 6 for the Sgr A* data. For all synthetic data sets we use the same data preparation and snapshot timescale as for the fits to the Sgr A* data. We also retain the same model specification, fitting an m = 4 mG-ring for all Comrade analyses and an m = 2 mG-ring for all DPI analyses. The DPI analyses are carried out on the low-band data sets only, while the Comrade analyses are carried out on both low- and high-band data sets.

I.3. Full-track Geometric Modeling Specifics

We carry out full-track geometric modeling of the GRMHD calibration and validation data sets in the same manner as described in Section 7 for the Sgr A* data. For all synthetic data sets we use the same data preparation as for the fits to the Sgr A* data (see Section 7.1). In particular, we derive appropriately individualized priors on the noise model parameters by performing model-agnostic variability quantification (per Section 3.3) on multiday instantiations (corresponding to April 5, 6, 7, and 10; see Appendix D) of each of the synthetic data sets. We also retain the same geometric model specification, fitting an m = 4 mG-ring for all analyses.

I.4. Validation

For each of the analysis pathways, we validate the θ_g calibration using an additional 10 synthetic GRMHD-based data sets. Figure 34 shows the results of carrying out ring diameter measurements and subsequent θ_g conversions on these 10 validation suite data sets, for each of the IDFE, snapshot, and full-track analyses. All analysis pathways are able to successfully recover the correct value of θ_g to within their determined level of calibration uncertainty.
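The logic of the calibration and validation steps can be sketched as follows, assuming that the scaling factor is defined through the simple proportionality d = α θ_g (see Section 4.4). The function names are hypothetical, the interval check is only one simple way of expressing "recovered to within the calibration uncertainty," and all numbers are placeholders rather than values from this paper.

```python
def alpha_from_calibration(diameter, theta_g_true):
    """Scaling factor implied by one calibration data set (d = alpha * theta_g)."""
    return diameter / theta_g_true

def theta_g_samples(diameter, alphas):
    """Convert a measured diameter into theta_g, one value per calibrated alpha."""
    return [diameter / a for a in alphas]

def credible_interval(samples, lo=0.16, hi=0.84):
    """Crude central interval from order statistics of the samples."""
    s = sorted(samples)
    n = len(s)
    return s[round(lo * (n - 1))], s[round(hi * (n - 1))]

def recovers_truth(samples, truth):
    """True if the known input value lies inside the central interval."""
    low, high = credible_interval(samples)
    return low <= truth <= high

# Placeholder numbers: a set of calibrated alpha values and one validation
# data set with a measured diameter and a known input theta_g.
alphas = [9.0 + 0.02 * i for i in range(101)]
theta = theta_g_samples(50.0, alphas)
print(credible_interval(theta), recovers_truth(theta, 5.0))
```

Because θ_g is inversely proportional to α in this convention, a tail toward small α maps directly onto a tail toward large θ_g.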

I.5. Origin and Nature of Calibration Outliers

The distributions of calibrated α values from each ring diameter measurement technique show heavy tails toward small α, which manifest as heavy tails to large θ_g in Figure 34 (see also Figure 22). This behavior appears to be generic across all classes of geometric modeling and IDFE analyses used in this paper, and it implies that some fraction of the calibration data sets are reconstructed to have systematically smaller rings than would be predicted from the known values of θ_g in each of the input ground-truth simulations.

The left half of Figure 35 shows average images from the input GRMHD calibration suite simulations corresponding to the five smallest α values recovered by each analysis pathway. The number of simulations for which the reconstructed ring corresponds to a "small" value of α depends on the analysis method; for instance, the fraction of reconstructions having median α < 7.5 ranges from ∼4% for snapshot geometric modeling with Comrade up to ∼14% for imaging with SMILI. We can see in Figure 35 that many of these small-α simulations have structures that are not obviously ring-like. Common morphologies in the small-α simulations include images dominated by compact regions of bright emission (typical of highly edge-on systems), or images containing prominent diffuse emission extending well outside the shadow region of interest. Such structures cannot necessarily be well measured by, e.g., the mG-ring model or IDFE techniques aimed at extracting signatures of a ring-like emission morphology, and attempts to apply these techniques to such data sets can yield results that are difficult to interpret. However, we note that not all of the small-α simulations exhibit such morphological difficulties; some of the poor reconstructions are obtained from simulations with readily apparent ring-like structures, indicating that other difficulties (e.g., strong variability) may be playing a more important role in these cases.

The right half of Figure 35 shows average images from the input GRMHD calibration suite simulations corresponding to five "typical" α values recovered by each analysis pathway; each of these images has been randomly selected from the set whose reconstructed α falls within one standard deviation of the mean. In contrast to the small-α simulations, these images more commonly exhibit ring-like morphologies of the sort that we would expect to be amenable to mG-ring modeling or ring extraction techniques. Furthermore, for most analysis methods a large fraction (∼75%) of the data sets are contained within one standard deviation of the mean; the fact that this fraction is larger than the ∼68% that we would expect for a Gaussian distribution is another manifestation of the heavy tails in the α distributions, and it indicates that the majority of reconstructions are narrowly peaked around the mean (see also Figure 34). However, even among the well-reconstructed images we still find a small number of less obvious ring structures, including some that are dominated by compact emission regions like many of the small-α simulations. Again, the presence of such simulations indicates that the ground-truth emission morphology is not the sole driver of whether or not the underlying ring structure can be successfully reconstructed.
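The connection between heavy tails and a one-standard-deviation fraction above ∼68% can be illustrated with a toy Monte Carlo experiment. In the Python sketch below, the "heavy-tailed" sample is a Gaussian core plus a 10% one-sided tail toward low values; the mixture weights, widths, and names are arbitrary assumptions chosen for illustration and are not fitted to the α distributions in this paper.

```python
import random
import statistics

def fraction_within_one_sigma(samples):
    """Fraction of samples within one standard deviation of the mean."""
    mean = statistics.fmean(samples)
    std = statistics.pstdev(samples)
    return sum(abs(x - mean) <= std for x in samples) / len(samples)

rng = random.Random(1)
n = 20000

# Pure Gaussian sample: the fraction should be close to 0.68.
gaussian = [rng.gauss(0.0, 1.0) for _ in range(n)]

# Toy heavy-tailed sample: 90% narrow core, 10% broad tail toward low values.
heavy = [
    rng.gauss(-5.0, 3.0) if rng.random() < 0.1 else rng.gauss(0.0, 1.0)
    for _ in range(n)
]

print(fraction_within_one_sigma(gaussian), fraction_within_one_sigma(heavy))
```

The tail inflates the standard deviation without moving most of the samples, so a larger share of the narrow core falls inside the one-standard-deviation window.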

The ring measurement analysis techniques developed in this paper are designed to be appropriate for application to the EHT Sgr A* data. When applying these techniques to a suite of GRMHD simulations containing very diverse image morphologies, we find that a fraction of the reconstructed rings have unreliable diameter measurements. These poor reconstructions contribute to the uncertainty in our α calibration, where they manifest as heavy tails in our calibrated α distribution. The corresponding large uncertainty in α is a consequence of the fact that many of the images in the calibration suite do not resemble Sgr A*, and thus analysis techniques designed for the latter do not necessarily function well when applied to the former. A calibration suite that was more directly tailored to match the properties of the EHT Sgr A* observations may result in smaller α calibration uncertainties and a correspondingly tighter constraint on θ_g.
