First Sagittarius A* Event Horizon Telescope Results. VI. Testing the Black Hole Metric

Astrophysical black holes are expected to be described by the Kerr metric. This is the only stationary, vacuum, axisymmetric metric, without electromagnetic charge, that satisfies Einstein's equations and does not have pathologies outside of the event horizon. We present new constraints on potential deviations from the Kerr prediction based on 2017 EHT observations of Sagittarius A* (Sgr A*). We calibrate the relationship between the geometrically defined black hole shadow and the observed size of the ring-like images using a library that includes both Kerr and non-Kerr simulations. We use the exquisite prior constraints on the mass-to-distance ratio for Sgr A* to show that the observed image size is within ∼10% of the Kerr predictions. We use these bounds to constrain metrics that are parametrically different from Kerr, as well as the charges of several known spacetimes. To consider alternatives to the presence of an event horizon, we explore the possibility that Sgr A* is a compact object with a surface that either absorbs and thermally reemits incident radiation or partially reflects it. Using the observed image size and the broadband spectrum of Sgr A*, we conclude that a thermal surface can be ruled out and a fully reflective one is unlikely. We compare our results to the broader landscape of gravitational tests. Together with the bounds found for stellar-mass black holes and the M87 black hole, our observations provide further support that the external spacetimes of all black holes are described by the Kerr metric, independent of their mass.


Introduction
Horizon-scale images of supermassive black holes provide a conceptually new avenue for testing the theory of general relativity. These images are formed by photons that originate in the deep gravitational fields of black holes and therefore carry imprints of the spacetime properties in the strong-field regime (Jaroszynski & Kurpiewski 1997). In this series of papers, we report the first horizon-scale images of Sgr A*, the black hole at the center of our Galaxy, obtained with the Event Horizon Telescope (EHT), a global interferometric array observing at 1.3 mm wavelength (Event Horizon Telescope Collaboration et al. 2022a and 2022b, hereafter Paper II and Paper III). This paper in the series explores new constraints on the potential deviations from general relativity imposed by these images.
General relativity has been tested in numerous settings with different observational tools and with different astrophysical systems (see the review by Will 2014 and references therein). Traditionally, tests have been carried out in the solar system, with the periastron precession of Mercury (Verma et al. 2014), the deflection of light observed during solar eclipses (Lambert & Le Poncin-Lafitte 2011), and the detection of Shapiro delays in photons grazing the solar surface (Bertotti et al. 2003). Radio observations of pulsars in binary systems expanded these tests, probing the radiative aspects of the theory and the strong-field coupling of the matter to the gravitational field (see Stairs 2003, for a review; see Wex & Kramer 2020; Kramer et al. 2021, for some recent examples). Cosmological observations of the accelerated expansion of the universe probed gravity at the largest scales in the cosmos and gave evidence for the presence of dark matter and dark energy (see Ferreira 2019 for a review).
As is clear from this overview, each test probes a different aspect of the theory of general relativity. First, different astrophysical objects possess widely different mass and length scales and hence map to a very broad range of gravitational potentials and curvatures (Baker et al. 2015). Second, some tests probe the stationary spacetimes of objects, while others probe the dynamic and radiative aspects of the theory. Third, some settings involve vacuum spacetimes, while others are affected by the coupling of matter and radiation to gravity (see, e.g., Damour & Esposito-Farese 1993). Because modifications to the theory of gravity can be introduced independently in each of these aspects, without necessarily affecting the others, each of these tests brings a unique ability to constrain such modifications.
Although general relativistic predictions have shown a high degree of consistency with the aforementioned tests, there remain unresolved questions at the fundamental level, e.g., whether curvature singularities are generally covered by event horizons (cosmic censorship conjecture) or can be naked. These become most urgent for black holes, as those objects have the strongest gravitational fields in the universe and possess a curvature singularity in their center. The combination with quantum theory could tame curvature singularities but at the same time predicts inherent randomness for quantum particles at the event horizon, leading to the black hole information loss paradox (see, e.g., Harlow 2016, for a review). All the concerns involve the presence of event horizons and are, therefore, accessible only to tests with black holes. Until recently, however, precision tests of gravity with black holes have not been possible. This situation has changed dramatically in recent years with the detection of gravitational waves from coalescing stellar-mass black holes with LIGO/Virgo (Abbott et al. 2016, 2021), the detection of relativistic effects in the orbits of stars around Sgr A* (Gravity Collaboration et al. 2018a; Do et al. 2019), and the imaging observations of the black hole in the center of the M87 galaxy (Event Horizon Telescope Collaboration et al. 2019c; Psaltis et al. 2020a; Kocherlakota et al. 2021).
Tests of gravity with black holes benefit from a remarkable general relativistic prediction encapsulated in the so-called no-hair theorem: the only vacuum spacetime that is stationary, is axisymmetric, is asymptotically flat, contains a horizon, and is free of pathologies is the one described by the Kerr metric (Kerr 1963; Israel 1967, 1968; Carter 1968, 1971; Hawking 1972; Price 1972a, 1972b; Robinson 1975). Testing this prediction involves using spacetimes that introduce deviations from this metric and applying observational constraints to place bounds on the magnitudes of the deviations. In order for these spacetimes to evade the no-hair theorem while remaining free of pathologies, they cannot be solutions to the vacuum general relativistic field equations but instead involve additional fields or parametric deviations that are agnostic to the underlying theory of gravity. In either case, measuring conclusively a deviation from the Kerr metric, while demonstrating that the compact object has a horizon, will constitute a demonstration of a violation of the no-hair theorem and, therefore, of the general relativistic field equations.
Horizon-scale images of Sgr A* offer a distinct set of advantages in testing general relativistic predictions with black holes (Psaltis et al. 2016; Goddi et al. 2017; Cunha & Herdeiro 2018; Psaltis 2019). At $4 \times 10^6\,M_\odot$, this black hole has a mass that bridges those of the stellar-mass black holes observed with LIGO/Virgo ($\sim 10^1$-$10^2\,M_\odot$) and that of the M87 black hole ($\sim 6.5 \times 10^9\,M_\odot$) and therefore probes a curvature scale that is different from those of other tests. Perhaps more importantly, it enables an approach that is different from other tests in its methodology. Because of the detection of relativistic effects in the stellar orbits around this black hole, its mass and distance are accurately known, resulting in precise predictions of its spacetime properties (Gravity Collaboration et al. 2022). As a result, contrary to other tests, where the mass of the black hole is measured from the same data simultaneously with the other spacetime properties (or possesses significant astrophysical uncertainties as in M87), tests with Sgr A* rely on mass priors with completely orthogonal systematics and potential biases. In addition, the very small uncertainties in the prior mass measurement lead to a parameter-free prediction on the gravitational effects in the images, which can be tested precisely with the EHT observations.
The most prominent gravitational effect on black hole images is the black hole shadow. The boundary of the shadow on the image plane of a distant observer is marked by the impact parameters of photons, which, when traced back toward the black hole, become tangent to the spherical photon orbits close to the horizon (Bardeen 1973; Luminet 1979). Although we define the shadow as a purely geometric feature that does not depend on astrophysical effects, we relate this feature to the brightness depression in observed images. Photons with impact parameters smaller than this critical value have paths that cross the horizon and, hence, have small optical paths through this spacetime. These reduced optical paths lead to much smaller radiation intensities compared to photons at larger impact parameters and, therefore, to the brightness depression (Jaroszynski & Kurpiewski 1997; Johannsen & Psaltis 2010; Narayan et al. 2019; Özel et al. 2021; Bronzwaer & Falcke 2021; Kocherlakota & Rezzolla 2022).
In the Kerr metric, because of a cancellation between the effects of frame dragging and the quadrupole moment of the spacetime, the shape and size of the shadow boundary have a very weak dependence on black hole spin and the observer's inclination (i.e., the radius ranges from $\sim 4.8\,GM/c^2$ to $\sim 5.2\,GM/c^2$; see Johannsen & Psaltis 2010 for a detailed study of the dependence on spin). Instead, they are determined predominantly by the mass-to-distance ratio of the black hole, which is known precisely for Sgr A*, making the shadow a direct probe of the metric properties (see, e.g., Psaltis et al. 2015). For the black hole shadow to become observable, two conditions need to be satisfied. First, a sufficiently bright source of photons needs to be present close to the horizon such that these photons experience strong gravitational lensing. Second, this source needs to be optically thin (i.e., transparent) at the observing wavelength such that the shadow is not enshrouded by the material generating this radiation. Both of these conditions are satisfied at 1.3 mm in the radiatively inefficient accretion flow around Sgr A* (Özel et al. 2000; see also Event Horizon Telescope Collaboration et al. 2022d, hereafter Paper V).
For such a configuration, the predicted image of the black hole is a bright ring of emission surrounding the shadow. The imaging observations with the EHT capture this ring and allow us to measure its properties, such as its diameter and fractional width. Earlier work has shown that, when this ring of emission is observed, the ring diameter can be used, with proper calibration, as a proxy for the shadow diameter itself (Event Horizon Telescope Collaboration et al. 2019b, 2019c; Narayan et al. 2019; Özel et al. 2021; Younsi et al. 2021; Kocherlakota & Rezzolla 2022). This is the approach that we follow in this paper to compare the predictions of general relativity for the size of the black hole shadow to the observed measurement of the ring in the images of Sgr A*.
The presence of a brightness depression also allows us to explore different possibilities for the nature of the compact object itself. In particular, if Sgr A* contained a reflecting surface at 1.3 mm instead of a horizon or a naked singularity, we would have observed a less pronounced brightness depression. Alternatively, if it contained a surface that was fully absorptive and thermally reemitted the accreted energy, it would still create a depression in the EHT image but would generate bright emission at wavelengths shorter than 1.3 mm. We use the EHT images in conjunction with the broadband spectrum of Sgr A* to place strong constraints on such alternatives.
In Section 2, we summarize the prior information on the mass-to-distance ratio and the spectrum of Sgr A*. In Section 3, we quantify the measurements of the image diameter and the relationship between image and shadow diameters using extensive simulations and synthetic data. We combine these to place bounds on potential deviations between the predicted and inferred shadow size for Sgr A*. In Section 4, we constrain alternatives to the black hole nature of the compact object that involve reflecting or absorbing surfaces. In Section 5, we impose constraints on the potential metric deviations from Kerr and address the possibility that Sgr A* contains a naked singularity. In Section 6, we combine our gravity tests with those that involve other compact objects and solar system bodies in order to draw general conclusions about the theory of gravity. We summarize our findings in Section 7.

Priors on $\theta_g$
The mass and distance of Sgr A* have been extensively studied by analyzing the dynamics of the central stellar cluster in the innermost 10″ of the Galactic center (Genzel et al. 2000, 2003a, 2010; Ghez et al. 1998, 2000; Eckart et al. 1999, 2002; Gezari et al. 2002; Schödel et al. 2007; Martins et al. 2008; Morris et al. 2012; Do et al. 2013; Jia et al. 2019). Near-infrared (NIR) observations with 8-10 m class telescopes supported by adaptive optics (AO) revealed the orbits of individual stars in the innermost arcsec (the so-called S stars), in particular the star S0-2. For this star, the combined fit for orbital elements and black hole parameters (mass, distance, projected position in the sky, proper motions, and radial velocity) has provided the most precise estimates for Sgr A*'s mass and distance so far (Schödel et al. 2003; Eisenhauer et al. 2005; Ghez et al. 2005b, 2008; Gillessen et al. 2009a, 2009b, 2017; Meyer et al. 2012; Boehle et al. 2016; Chu et al. 2018; Hees et al. 2019; O'Neil et al. 2019; Gravity Collaboration et al. 2021a).
S0-2 is a star with an apparent 2.2 μm (NIR K-band) magnitude of $m_K = 14$, an orbital period of P ≈ 16 yr, and a semimajor axis of a = 125 mas (or $\sim 10^3$ au at an 8 kpc distance); thus, it is the brightest star with a comparatively close orbit and short period at the Galactic center. The study of S0-2's orbit has predominantly been conducted with two sets of instruments, the two 10 m telescopes of the Keck Observatory, and the Very Large Telescope (VLT) of the European Southern Observatory (ESO), using its individual telescopes as well as GRAVITY, an interferometer combining all four 8.2 m telescopes of the VLT (VLTI). The orbit of S0-2 provides some of the best evidence for the existence of a black hole. S0-2 has concluded an entire revolution between 2002 and 2018 covered by observations and has allowed the Keck and VLTI teams to test relativistic effects like the gravitational redshift or the Schwarzschild precession (Gravity Collaboration et al. 2018a; Amorim et al. 2019; Do et al. 2019) and to constrain alternative theories of gravity (Hees et al. 2017; Della Monica et al. 2022; De Martino et al. 2021) and variations of the fine-structure constant (Hees et al. 2020).
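As a rough consistency check on the orbital parameters quoted above, Kepler's third law in units of au, yr, and solar masses gives the enclosed mass directly. The sketch below assumes a purely Keplerian orbit, neglects the mass of the star, and uses the round 8 kpc distance from the text; the function names are illustrative and not taken from any of the cited analyses.

```python
def semimajor_axis_au(a_mas: float, distance_pc: float) -> float:
    """Convert an angular semimajor axis to au (1 arcsec at 1 pc subtends 1 au)."""
    return (a_mas / 1000.0) * distance_pc


def kepler_mass_msun(a_au: float, period_yr: float) -> float:
    """Kepler's third law in solar units: M [Msun] = a^3 [au^3] / P^2 [yr^2]."""
    return a_au ** 3 / period_yr ** 2


# Round values quoted in the text: a = 125 mas, P ~ 16 yr, distance ~ 8 kpc.
a_au = semimajor_axis_au(125.0, 8000.0)
mass_msun = kepler_mass_msun(a_au, 16.0)
print(f"a = {a_au:.0f} au, enclosed mass = {mass_msun:.2e} Msun")
```

With these round inputs the estimate comes out near $3.9 \times 10^6$ solar masses, the same scale as the fitted black hole masses discussed in this section. The full orbital fits are far more detailed than this back-of-the-envelope check.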
In order to measure S0-2's projected position in the sky, an astrometric reference frame has to be established. Menten et al. (1997) proposed using a group of SiO maser stars at the Galactic center (visible both in the radio and in the NIR) with positions and proper motions determined by interferometric astrometry at radio wavelengths. These masers make it possible to establish a reference frame in the NIR. For both imaging instruments (Keck II/NIRC2, VLT/NACO) the field of view (10″ and 14″, respectively) is not large enough to capture the S stars and the seven masers in the same pointing. Instead, a dither pattern of pointings overlapping with one another is observed. Astrometric measurements in the central field are then executed via secondary astrometric standards, either in the form of matching coordinate lists or by generating mosaic images. In this process, systematic astrometric errors occur owing to the geometric distortions of the camera optics and field dependence of the point-spread function caused by anisoplanatism of the AO correction and higher-order aberrations of the optics (Yelda et al. 2010; Plewa et al. 2015; Sakai et al. 2019).
The VLTI team included interferometric data in their analysis starting in 2016. VLTI/GRAVITY provided high-precision distance measurements to Sgr A* during S0-2's closest approach with ∼1 mas resolution and ∼40 μas astrometric precision.
This subset of the interferometric data is not affected by the systematic uncertainties of the reference frame because the projected position of S0-2 is directly referenced to the projected (center of light) position of Sgr A*. However, the VLTI/GRAVITY data also have systematic uncertainties, mainly due to aberrations of the optical trains of the individual telescopes (Gravity Collaboration et al. 2021b).
Information on the third dimension of the stellar orbits is obtained in the form of radial velocities, which in the case of S0-2 can be determined by observing the 2.167 μm H I (Brγ) and the 2.11 μm He I lines with integral field spectrographs like VLT/SINFONI or Keck/OSIRIS.
In Table 1 of their latest publication on the measurement of the gravitational redshift, the Keck team found for the distance a value of $R_0 = 7959 \pm 59 \pm 32$ pc (for the fit that leaves the redshift parameter free). They also published a posterior version with the assumption that general relativity is true (redshift parameter set to unity), $R_0 = 7935 \pm 50$ pc, which is practically equivalent within the uncertainties. Their estimates for the black hole mass are $M = (3.975 \pm 0.058 \pm 0.026) \times 10^6\,M_\odot$ and $M = (3.951 \pm 0.047) \times 10^6\,M_\odot$, respectively.
In the publication on the detection of the Schwarzschild precession (Gravity Collaboration et al. 2020a), the VLTI team found $R_0 = 8246.7 \pm 9.3$ pc and $M = (4.261 \pm 0.012) \times 10^6\,M_\odot$. Their latest paper on the mass distribution in the Galactic center (Gravity Collaboration et al. 2022) changes these values slightly. Their Table B.1 is an overview of recently published VLTI values for the black hole mass and distance; they also give an estimate of their systematics due to aberrations in GRAVITY's optics: $R_0 = 8277 \pm 9 \pm 33$ pc (Gravity Collaboration et al. 2021b). For the mass they find $M = (4.297 \pm 0.012 \pm 0.040) \times 10^6\,M_\odot$. Additionally, the team provided a file with the posterior chains of their Bayesian analysis (S. Gillessen 2022, private communication), assuming general relativity to be true, which has median values of $R_0 = 8278 \pm 10$ pc and $M = (4.298 \pm 0.013) \times 10^6\,M_\odot$.
It is interesting to point out that a third, independent estimate for the distance to Sgr A* has been provided by the Bessel project, a study of the Milky Way structure with VLBI astrometry: $R_0 = 8.15 \pm 0.15$ kpc (Reid et al. 2009, 2014, 2019). This value for the distance is in marginally better agreement with the VLTI results. Here, the two values considered for the distance are $R_0 = 7935 \pm 50 \pm 32$ pc and $R_0 = 8277 \pm 9 \pm 33$ pc. Mass and distance set a characteristic scale of the orbit in its projection on the sky and are highly correlated. The values for $\theta_g \equiv GM/(Dc^2)$ as derived from the posterior distributions are $\theta_g = 5.125 \pm 0.009 \pm 0.020$ μas (VLTI) and $\theta_g = 4.92 \pm 0.03 \pm 0.01$ μas (Keck). For the VLTI value, the systematics were derived by error propagation according to $M \propto R_0^2$ (Gravity Collaboration et al. 2022). For the Keck value, a dedicated jackknife analysis was conducted to quantify the systematics stemming from the reference system (T. Do 2022, private communication). We show the posteriors in Figure 1. The discrepancy between the values of the two studies is about 4%.
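The angular gravitational radius can be cross-checked by plugging the quoted central values of mass and distance into its definition, GM/(Dc^2). The sketch below is only a plug-in estimate: the published values come from the full, correlated posteriors, so small differences are expected. The physical constants are standard values, and the function name is an illustrative choice.

```python
import math

G_MSUN = 1.32712440018e20       # G * M_sun in m^3 s^-2
C_LIGHT = 299792458.0           # speed of light in m s^-1
PARSEC = 3.0856775814913673e16  # one parsec in m
RAD_TO_MUAS = math.degrees(1.0) * 3600.0 * 1.0e6  # radians to microarcseconds


def theta_g_muas(mass_msun: float, distance_pc: float) -> float:
    """Angular gravitational radius GM/(D c^2) in microarcseconds."""
    r_g = G_MSUN * mass_msun / C_LIGHT ** 2  # gravitational radius in m
    return r_g / (distance_pc * PARSEC) * RAD_TO_MUAS


# Central values quoted in the text for the two teams.
theta_vlti = theta_g_muas(4.297e6, 8277.0)
theta_keck = theta_g_muas(3.951e6, 7935.0)
discrepancy = (theta_vlti - theta_keck) / theta_keck

print(f"VLTI: {theta_vlti:.3f} muas, Keck: {theta_keck:.3f} muas")
print(f"fractional discrepancy: {discrepancy:.1%}")
```

The plug-in values land within about 0.01 μas of the quoted posterior values, and the fractional difference between the two teams is about 4%, as stated above.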

Priors on the Spectral Energy Distribution of Sgr A*
The spectral energy distribution (SED) of Sgr A* is shown in Figure 2. It has been compiled from the large body of literature, starting as early as 1992. Points show SED values taken from Zylka et al. (1992), Telesco et al. (1996), Falcke et al. (1998), Cotera et al. (1999), An et al. (2005), Dodds-Eden et al. … et al. (2020b). We present the radio part of the SED (Falcke et al. 1998; An et al. 2005; Bower et al. 2015, 2019; Liu et al. 2016) in a binned version (for a more detailed version showing all historic literature values in the radio to submillimeter regime, including some epochs of heightened variability, see Paper II). The steepening of the SED slope at centimeter wavelengths (Falcke et al. 1998) is clearly visible between 10 and 20 GHz and the submillimeter. From THz frequencies to the mid-IR Sgr A* has not been detected, and we have included lower and upper limits.
Here we are focusing on the NIR properties of Sgr A*, in particular on limits for a steady component that is not varying on timescales of minutes and hours. Figure 2 shows the percentiles of the observed flux density distributions at 2.2 μm (VLT/NACO and KECK/NIRC2) and 4.5 μm (SPITZER/IRAC), as well as the corresponding spectral indices that change with flux density. Additionally, we present the same percentiles for the flux density distribution measured with VLTI/GRAVITY at 2.2 μm (Gravity Collaboration et al. 2020b). While the VLT and KECK data are confusion limited and noise dominated at the low end of flux density distribution, resulting in nondetections of the source against the background, Gravity Collaboration et al. (2020b) report a clear detection of Sgr A* at all times. Because this detected source is variable at all times, their 5th percentile of the variable flux density distribution represents a conservative upper limit for any steady source component that may lie underneath.

EHT Observations and Error Budget
The EHT observations of Sgr A* show a bright ring of emission surrounding a brightness depression that we have identified with the black hole shadow (Event Horizon Telescope Collaboration et al. 2022b). In principle, the diameter of this ring, $d_m$, can be used to measure the properties of the black hole metric and to assess its compatibility with the Kerr solution in general relativity for a black hole of given angular size $\theta_g$. In practice, this comparison first requires establishing a quantitative relation (i.e., a calibration factor) between the diameter of a bright ring feature and that of the corresponding shadow. We can then use this relationship, in combination with the measured ring diameter, to infer any potential deviations from the general relativistic predictions.
To accomplish this, we write
$$\hat{d}_m = \frac{\hat{d}_m}{d_{\rm sh}}\,\frac{d_{\rm sh}}{d_{\rm sh,Sch}}\,d_{\rm sh,Sch} = \alpha_c\,(1+\delta)\,d_{\rm sh,Sch}. \qquad (1)$$
In this expression, $\hat{d}_m$ is the ring diameter measured from imaging and model fitting to the Sgr A* data, where the hat signifies the fact that this is a measured quantity that may differ from the true value because of measurement biases. The quantity $\alpha_c \equiv \hat{d}_m/d_{\rm sh}$ is the calibration factor, defined as the ratio of the measured diameter of the image to that of the shadow, which addresses the extent to which the ring diameter can be used as a proxy for the shadow diameter. The shadow diameter depends on the metric and its properties, such as the black hole spin and potential charges, as well as on the observer inclination.
The calibration factor $\alpha_c$ is determined primarily by the physics of image formation near the horizon and quantifies the degree to which the image diameter tracks that of the shadow, for any underlying metric and for different realistic models of the accreting plasma. For example, the calibration factor would be $\alpha_c = 1.1$ whether the image diameter is $11\,GM/c^2$ and the shadow diameter is $10\,GM/c^2$ or, for some non-Kerr black hole, the image diameter is $110\,GM/c^2$ and the shadow diameter is $100\,GM/c^2$.
The quantity $\delta \equiv (d_{\rm sh}/d_{\rm sh,Sch}) - 1$, on the other hand, quantifies any deviation between the inferred shadow diameter and that of a Schwarzschild black hole of angular size $\theta_g$, given by $d_{\rm sh,Sch} = 6\sqrt{3}\,\theta_g$. Note that, for the Kerr metric, the Schwarzschild limit provides the largest possible value for the shadow diameter. Black holes with nonzero spin observed at different inclinations can have shadow sizes that are smaller by up to ∼7.5% from this limit (Takahashi 2004; Chan et al. 2013). As a result, values of $\delta$ in the range [−0.075, 0] are consistent with the Kerr predictions, while values outside this range can be considered to be in tension with it. We also note the small differences in the definitions of these quantities with respect to earlier work (see, e.g., Event Horizon Telescope Collaboration et al. 2019c; Psaltis et al. 2021), which simply scaled the image diameter to $\theta_g$ and hence did not cleanly separate the effects of different spacetimes from other astrophysical effects. We will use Equation (1) to infer the posterior on the deviation parameter $\delta$ given the EHT measurements and prior information.
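Combining the definitions of the calibration factor and of δ given above, the deviation parameter follows from the measured ring diameter as δ = d̂_m / (α_c · 6√3 · θ_g) − 1. The sketch below implements that rearrangement together with the Kerr-consistency range. The numerical inputs in the example are hypothetical placeholders chosen for illustration, not measurements from this paper, and the function names are not from any EHT software.

```python
import math


def shadow_diameter_sch(theta_g: float) -> float:
    """Schwarzschild shadow diameter, 6*sqrt(3)*theta_g (same units as theta_g)."""
    return 6.0 * math.sqrt(3.0) * theta_g


def calibration_factor(d_image: float, d_shadow: float) -> float:
    """alpha_c: ratio of the measured image diameter to the shadow diameter."""
    return d_image / d_shadow


def deviation_delta(d_hat_m: float, alpha_c: float, theta_g: float) -> float:
    """delta = d_sh / d_sh,Sch - 1, with d_sh = d_hat_m / alpha_c."""
    return d_hat_m / (alpha_c * shadow_diameter_sch(theta_g)) - 1.0


def consistent_with_kerr(delta: float) -> bool:
    """Kerr shadows span delta in [-0.075, 0] according to the text."""
    return -0.075 <= delta <= 0.0


# The two cases from the text give the same calibration factor (1.1).
alpha_a = calibration_factor(11.0, 10.0)
alpha_b = calibration_factor(110.0, 100.0)

# HYPOTHETICAL inputs for illustration only (microarcseconds, dimensionless).
delta_example = deviation_delta(d_hat_m=50.0, alpha_c=1.0, theta_g=5.0)
print(f"alpha_c = {alpha_a:.2f}, delta = {delta_example:+.4f}, "
      f"Kerr-consistent: {consistent_with_kerr(delta_example)}")
```

Keeping α_c as an explicit input mirrors the separation made in the text between the spacetime-dependent shadow size and the astrophysical and measurement effects that relate the ring to the shadow.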
Even though we used, for simplicity, a single calibration factor in writing Equation (1), in reality this factor has two components that are multiplicative in nature, i.e., $\alpha_c = \alpha_1 \times \alpha_2$. This is because the calibration factor encompasses both a theoretical bias ($\alpha_1$) and potential measurement biases ($\alpha_2$), which are generally independent of each other and need to be quantified separately. As a result, there are four sources of uncertainty in total that contribute to the error budget in the measurement of the deviation parameter $\delta$. These are as follows:
1. the uncertainty in the measurement of $\theta_g$ from stellar dynamics, as described in the previous section;
2. the formal uncertainties obtained from measuring the diameter $\hat{d}_m$ of the bright ring from the data (see Section 3.1);
3. the theoretical uncertainties in the ratio $\alpha_1 \equiv d_m/d_{\rm sh}$ between the true diameter $d_m$ of the bright ring of emission and the diameter of the shadow $d_{\rm sh}$, given a model for the black hole spacetime and emissivity in the surrounding plasma (see Section 3.2); and
4. the uncertainties in the ratio $\alpha_2 \equiv \hat{d}_m/d_m$ between the measured ring diameter $\hat{d}_m$ and its true value $d_m$ that result from fitting analytic or pixel-based models to EHT data and arise, e.g., from the limited u-v coverage, model complexity, and incomplete prior knowledge of telescope gains (see Section 3.3).
We present below our quantitative inference of the formal measurement uncertainties and of the various calibration factors.
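Because the four ingredients listed above enter the relation for 1 + δ as a product and quotient, their fractional uncertainties combine in quadrature to first order when they are independent. The sketch below illustrates that linearized bookkeeping only. It assumes small, independent, Gaussian errors, whereas the analysis in this paper works with full posteriors. All numerical inputs are hypothetical placeholders, and the function name is an illustrative choice.

```python
import math

SCH_FACTOR = 6.0 * math.sqrt(3.0)  # Schwarzschild shadow diameter in units of theta_g


def delta_with_uncertainty(d_hat_m, sigma_d, theta_g, sigma_theta,
                           alpha_1, sigma_a1, alpha_2, sigma_a2):
    """Return (delta, sigma_delta) from a first-order quadrature error budget.

    1 + delta = d_hat_m / (alpha_1 * alpha_2 * 6*sqrt(3) * theta_g), so the
    fractional errors of the four independent inputs add in quadrature.
    """
    one_plus_delta = d_hat_m / (alpha_1 * alpha_2 * SCH_FACTOR * theta_g)
    fractional = math.sqrt(
        (sigma_d / d_hat_m) ** 2          # formal ring-diameter uncertainty
        + (sigma_theta / theta_g) ** 2    # prior on theta_g from stellar dynamics
        + (sigma_a1 / alpha_1) ** 2       # theoretical calibration (alpha_1)
        + (sigma_a2 / alpha_2) ** 2       # measurement calibration (alpha_2)
    )
    return one_plus_delta - 1.0, one_plus_delta * fractional


# HYPOTHETICAL inputs for illustration only (diameters and theta_g in muas).
delta, sigma_delta = delta_with_uncertainty(
    d_hat_m=50.0, sigma_d=2.0,
    theta_g=5.0, sigma_theta=0.1,
    alpha_1=1.0, sigma_a1=0.03,
    alpha_2=1.0, sigma_a2=0.04,
)
print(f"delta = {delta:+.3f} +/- {sigma_delta:.3f}")
```

A budget of this kind makes it easy to see which of the four terms dominates for a given set of inputs, which is the practical reason for quantifying α_1 and α_2 separately.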

Measurement Uncertainties
We focus here on the 2017 April 7 data set because it satisfies three important criteria: the ALMA array, which leads to the highest signal-to-noise ratio data, participated in the observation; there is no evidence for an X-ray flare or large excursion in the 1.3 mm flux; and the interferometric coverage samples the visibility amplitude minima in the u-v plane, which are critical for establishing an accurate image size measurement. We also note that the analysis presented in Paper III for the 2017 April 6 data provides consistent results. The measurement uncertainties are obtained from modeling these data with imaging and model-fitting tools, as discussed in Event Horizon Telescope Collaboration et al. 2022b and 2022c, hereafter Papers III and IV, respectively. Here we quantify these results using characterization tools, as we describe below.
Figure 2. Sgr A* radio to X-ray SED. Blue points show rebinned observed flux densities (Falcke et al. 1998; An et al. 2005; Bower et al. 2015, 2019; Liu et al. 2016); the faint colored points show upper limits for the median flux density from the papers listed on the right. Solid gray lines from 4.5 to 2.2 μm show model SEDs for the 5th, 50th, and 95th percentiles and demonstrate the predicted slope change as a function of flux density. The green points at 2.2 μm represent the most recent analysis of the 5th, 50th, and 95th percentiles of NIR flux density distribution based on VLTI/GRAVITY data. The orange-shaded envelope represents an estimate of the range of flux densities in 90% of the observed time.
We use the CHaracterization Algorithm for Radius Measurements (CHARM), which is based on the feature extraction algorithm that was employed in Event Horizon Telescope Collaboration et al. (2019c) and improved further in Özel et al. (2021). Briefly, the algorithm (i) chooses a trial center for a potential ring-like feature; (ii) uses a rectangular bivariate spline interpolation to obtain radial cross sections of the filtered image brightness at 128 equidistant azimuthal orientations starting from the trial center; (iii) measures, in each radial cross section, the distance of the location of peak brightness from the trial center and identifies the ring diameter as two times the median value of this distance; (iv) iterates the location of the trial center and steps (i)−(iii) such that the variance in the diameter along different cross sections is minimized; and (v) measures a median FWHM of the ring by fitting an equivalent asymmetric Gaussian to each radial cross section such that the corresponding integrated brightness of the cross section of the filtered image is equal to that of the Gaussian. We then define the fractional width as the FWHM of the ring in units of the ring diameter.
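Steps (i) through (iv) of the procedure described above can be sketched in a much simplified form. This is not the CHARM implementation: the image is represented by an analytic brightness function rather than a bivariate spline of a filtered pixel image, the center is chosen by a brute-force search over a small hypothetical grid of trial centers, and the width measurement of step (v) is omitted. All names are illustrative.

```python
import math
import statistics


def peak_radii(brightness, center, r_max, n_angles=128, n_radii=240):
    """Distance of the brightness peak from `center` along each radial cut."""
    cx, cy = center
    radii = []
    for k in range(n_angles):
        phi = 2.0 * math.pi * k / n_angles
        best_r, best_val = 0.0, -math.inf
        for j in range(n_radii + 1):
            r = r_max * j / n_radii
            val = brightness(cx + r * math.cos(phi), cy + r * math.sin(phi))
            if val > best_val:
                best_r, best_val = r, val
        radii.append(best_r)
    return radii


def ring_diameter(brightness, r_max, trial_centers):
    """Pick the trial center minimizing the variance of the peak radii and
    return (center, diameter), with diameter = 2 * median peak radius."""
    best = None
    for center in trial_centers:
        radii = peak_radii(brightness, center, r_max)
        variance = statistics.pvariance(radii)
        if best is None or variance < best[0]:
            best = (variance, center, 2.0 * statistics.median(radii))
    return best[1], best[2]


def gaussian_ring(x0, y0, radius, sigma):
    """Synthetic test image: a ring with a Gaussian radial profile."""
    def brightness(x, y):
        r = math.hypot(x - x0, y - y0)
        return math.exp(-0.5 * ((r - radius) / sigma) ** 2)
    return brightness


# Synthetic ring of radius 25 centered at (1, -2), in arbitrary angular units.
image = gaussian_ring(1.0, -2.0, 25.0, 5.0)
centers = [(float(i), float(j)) for i in range(-2, 3) for j in range(-2, 3)]
center, diameter = ring_diameter(image, r_max=60.0, trial_centers=centers)
print(f"best center = {center}, ring diameter = {diameter:.2f}")
```

Using the median of the peak radii, as in step (iii), keeps the diameter estimate insensitive to a few cross sections that pick up a spurious bright feature.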
We show in Figure 3 the fractional width and diameter measurements obtained for eht-imaging, SMILI, and DIFMAP top-set images for the April 7 Sgr A* data (see Paper IV for the details of these three imaging algorithms). Even though we apply CHARM to all of the top-set images, without employing clustering filters (e.g., to select only ringlike images), we find that the 68th and 95th percentile contours for the ring parameters form compact regions for each algorithm. This indicates that there is a discernible brightness depression in each image that is surrounded by a bright region that has a robust characteristic size.
The gray bands in Figure 3 mark the effective limit of the fractional width that can be measured with imaging methods because of the finite resolution of the EHT array. The pink bands show the expected anticorrelation between the ring FWHM and the measured diameter $\hat{d}_m$ that arises from the Gaussian broadening of an infinitesimally thin ring of diameter $d$. To first order in the fractional width,
$$\hat{d}_m \simeq d\left[1 - \frac{1}{4\ln 2}\left(\frac{\mathrm{FWHM}}{d}\right)^2\right].$$
Because some of the inferred fractional widths are relatively large, in calculating the actual shaded areas in Figure 3 we do not make this first-order approximation but rather employ a numerical evaluation of the complete expression.
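The difference between a first-order estimate and a complete numerical evaluation can be illustrated for a thin ring blurred by a circular Gaussian. The sketch assumes that the first-order approximation meant above is the standard leading-order result d[1 − (FWHM/d)²/(4 ln 2)]; the numerical routine locates the peak of the azimuthally integrated blurred profile directly. Function names and the example values are illustrative and are not taken from the analysis pipeline.

```python
import math


def blurred_ring_profile(r, d, fwhm, n_phi=720):
    """Radial brightness of an infinitesimally thin ring of diameter d after
    convolution with a circular Gaussian, via a sum over ring azimuth."""
    sigma = fwhm / (2.0 * math.sqrt(2.0 * math.log(2.0)))
    r0 = 0.5 * d
    total = 0.0
    for k in range(n_phi):
        phi = 2.0 * math.pi * k / n_phi
        dist2 = r * r + r0 * r0 - 2.0 * r * r0 * math.cos(phi)
        total += math.exp(-dist2 / (2.0 * sigma * sigma))
    return total / n_phi


def peak_diameter(d, fwhm, n_iter=60):
    """Diameter of peak brightness, found by ternary search in radius.
    Assumes the profile has a single maximum on [0, d]."""
    lo, hi = 0.0, d
    for _ in range(n_iter):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if blurred_ring_profile(m1, d, fwhm) < blurred_ring_profile(m2, d, fwhm):
            lo = m1
        else:
            hi = m2
    return lo + hi  # twice the midpoint radius


def first_order_diameter(d, fwhm):
    """Leading-order peak diameter of a Gaussian-blurred thin ring."""
    return d * (1.0 - (fwhm / d) ** 2 / (4.0 * math.log(2.0)))


d_true = 50.0  # illustrative ring diameter in microarcseconds
for frac_width in (0.2, 0.3, 0.4):
    fwhm = frac_width * d_true
    print(f"FWHM/d = {frac_width:.1f}: numerical = {peak_diameter(d_true, fwhm):.3f}, "
          f"first order = {first_order_diameter(d_true, fwhm):.3f}")
```

For narrow rings the two estimates nearly coincide, while the gap grows with the fractional width, which is the regime in which a full numerical evaluation is preferable.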
In Figure 4, we compare the fractional widths and mean diameters inferred for Sgr A* with the three imaging algorithms. Even though there appear to be small differences in the mean diameter, all of the contours lie along the expected anticorrelation. This suggests that the differences are simply caused by the various algorithmic choices and do not reflect inconsistencies between them.
We also use the image diameter and fractional width obtained from fitting analytic models to the visibility data (see Paper IV). In particular, we focus on the mG-ring model described in Paper IV, which comprises a Gaussian broadened ring with flux enhancements on the ring with m-fold azimuthal symmetry and an additional central Gaussian floor component. We use the posteriors obtained from the fitting algorithm Comrade (P. Tiede 2022, in preparation). In Figure 5, we show the posterior over the diameter and the fractional width obtained from fitting the mG-ring model to the April 7 data. The narrow posterior in diameter for this model primarily reflects its insufficient complexity, as can be seen in the synthetic data analysis below (see also discussion in Psaltis et al. 2020b). Nevertheless, the inferred diameter is consistent with those of the imaging methods, given the expected anticorrelations.

The $\alpha_1$ Calibration Factor and Its Uncertainties
Figure 3. The distributions of fractional ring widths and diameters for the top-set images for the Sgr A* April 7 data obtained using eht-imaging (top), SMILI (middle), and DIFMAP (bottom) algorithms (see Paper III). Each image in these top sets is characterized using CHARM. The two contour levels correspond to 68th and 95th percentiles of the images. The histograms are the projections of the distributions on the mean diameter axis. The gray shaded area corresponds to a nominal 15 μas resolution of the telescope array. The pink shaded area shows the expected anticorrelation between diameter and width that is caused by Gaussian broadening of a thin ring.
In this section, we use simulated black hole images to quantify the correction factor $\alpha_1$, which is the ratio between the diameter of the peak brightness of the image and the diameter of the black hole shadow. We employ three different types of models to explore a range of effects related to the plasma properties, spacetime characteristics, and different numerical realizations of the turbulent flow.
The first category of images comprises ∼180,000 snapshots of GRMHD accretion flow simulations discussed in Paper V. The simulations cover a range of black hole spins (a = −0.94, −0.5, 0.0, 0.5, 0.94), observer inclinations (i = 10°, 30°, 50°, 70°, and 90°), magnetically arrested disk (MAD) and standard and normal evolution (SANE) magnetic field configurations, and thermal electron distributions with temperature prescriptions characterized by R high = 10, 40, and 160. For each combination in these sets of parameters, we also considered snapshots calculated with two different GRMHD simulation algorithms, KHARMA (Prather et al. 2021) and BHAC (Porth et al. 2017), and corresponding images calculated using two different covariant radiation transport schemes, ipole (Mościbrodzka & Gammie 2018) and BHOSS (Younsi et al. 2012, 2016).

The second set comprises ∼4000 images from covariant plasma models in the Kerr metric that go beyond some assumptions of GRMHD. These employ analytic calculations that are agnostic to the particular microprocesses responsible for angular momentum transport and particle heating. The particular parameters of these models are discussed in detail in Özel et al. (2021).
The third category includes ∼200,000 images from analytic models that explore a range of black hole metrics that either are parametrically different from the Kerr metric or represent other known solutions to the field equations. For the former, we employ the Johannsen-Psaltis (JP) metric (Johannsen 2013b), which enables parametric deviations from Kerr and recovers the Kerr spacetime when its deviation parameters vanish, while still guaranteeing many of the basic properties of the Kerr metric (i.e., it is Petrov Type-D, free of pathologies, etc.). For the latter, we utilize the EMDA (Kerr-Sen) metric (García et al. 1995), which is a solution to the field equations of a modified gravity theory with additional scalar degrees of freedom. The plasma model is the same covariant analytic model of Özel et al. (2021), and the model library spans different black hole spins, observer inclinations, magnetic field configurations, plasma parameters, and, where appropriate, metric parameters, as discussed in Younsi et al. (2021). We refer to these models as analytic non-Kerr.
Using the covariant radiation transport code BHOSS (Younsi et al. 2012, 2016), Figure 6 presents a selection of illustrative simulated 1.3 mm Sgr A * images from five different non-Kerr spacetimes, together with an image from a GRMHD simulation of a Kerr black hole. The field of view in all panels is 150 μas in both directions, with the brightest pixel value in each panel normalized to unity. We show in the top row mean images from covariant MHD simulations averaged over a time window of 5000 GM/c 3 , with snapshots every 10 GM/c 3 (∼3.5 minutes for Sgr A * ). The Kerr GRMHD simulation parameters are as follows: MAD magnetic field configuration, a = 0.9375, i = 30°, R low = 1, and R high = 10 (see Paper V for further details of the modeling). The top middle panel shows an image of accretion onto a nonrotating dilaton black hole. The top right panel presents the image from a simulation of accretion onto a boson star (Olivares et al. 2020; Fromm et al. 2021). The boson star image represents one example of a compact object without an event horizon or an unstable photon orbit, thereby lacking a central brightness depression or a photon ring in its image. We do not consider such configurations in the calibration procedure discussed here but explore them in detail in Section 4.
We present in the bottom row of Figure 6 images from non-Kerr spacetimes with the background semianalytic accretion flow model as specified in Özel et al. (2021) and Younsi et al. (2021). These spacetimes are the JP and the Kerr-Sen (EMDA) metrics, as well as a spinning traversable wormhole spacetime (Teo 1998; Harko et al. 2009). The JP metric for this example is nonspinning, with deformation parameters chosen to push the unstable photon orbit very close to the event horizon (hence the smaller central brightness depression). The Kerr-Sen spacetime parameters (axion and dilaton field couplings) have been chosen to produce an image with a photon ring larger than is possible with a Kerr black hole. Finally, the rotating wormhole spacetime is chosen to have a throat radius equal to the event horizon radius of a Kerr black hole with the same spin (a = 0.9375). In all of the examples with a central brightness depression, the size of the ring-like image scales with that of the black hole shadow.

Figure 4. A comparison of the fractional widths and mean diameters measured in the top sets of the three imaging algorithms for the reconstruction of the Sgr A * April 7 data. The contours show the 68th and 95th percentiles of the top-set images, as before. The pink shaded area shows the expected anticorrelation as in Figure 3. The small differences in the inferred parameters from each algorithm lie along this expected anticorrelation. The dashed and dotted lines correspond to (ring diameter + ring width) = 90 and 80 μas, respectively (see the discussion in Section 4.1.1).

Figure 5. The fractional ring width and diameter measurements obtained from fitting mG-ring models to the April 7 visibility-domain data for Sgr A * . The shaded areas are the same as in Figure 4.
We convolve all of the images in the three categories with an n = 2, 15 Gλ Butterworth filter to mimic the resolution of the EHT array. We then apply the characterization algorithm CHARM to all of these images to measure the median diameter D im of the bright ring of emission, with respect to the analytically calculated center of the black hole shadow. We also calculate the shadow diameter in each spacetime; for Kerr, we use the analytic approximation derived in Chan et al. (2013). We then define the calibration factor α 1 as the ratio of the median diameter to the diameter of the shadow. We will refer to the difference α 1 − 1 as the fractional diameter difference. If the peak emission in the bright ring coincides with the shadow boundary, then the fractional diameter difference would be equal to zero.
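The two quantities used in this step can be sketched as follows. This is a minimal illustration, not the CHARM implementation: the filter response below is the standard Butterworth amplitude form, which is assumed here rather than taken from the text, and the function names and example diameters are hypothetical.

```python
import math

def butterworth_response(baseline_glambda, cutoff_glambda=15.0, order=2):
    """Standard Butterworth amplitude response (assumed form).

    Equals 1 at zero baseline and 1/sqrt(2) at the cutoff baseline.
    """
    return 1.0 / math.sqrt(1.0 + (baseline_glambda / cutoff_glambda) ** (2 * order))

def fractional_diameter_difference(ring_diameter, shadow_diameter):
    """Return alpha_1 - 1, where alpha_1 = ring diameter / shadow diameter."""
    return ring_diameter / shadow_diameter - 1.0

# Hypothetical diameters in microarcseconds, for illustration only.
example_ring, example_shadow = 52.0, 50.0
alpha1_minus_one = fractional_diameter_difference(example_ring, example_shadow)
```

A positive value of `alpha1_minus_one` means that the bright ring peaks outside the shadow boundary, and a value of zero means that the two coincide.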
We show in Figure 7 the distribution over the fractional diameter difference for the three types of images. As discussed in Özel et al. (2021) and Younsi et al. (2021), the distribution peaks at small positive values of α 1 − 1, indicating that the peak of the bright ring is slightly larger than the boundary of the black hole shadow.

Figure 7. The fractional diameter difference between the diameter of peak emission in the image of a black hole and that of its shadow obtained from three different types of simulations. The blue histogram shows the result from 180,000 snapshots from time-dependent GRMHD simulations in the Kerr metric, spanning a broad range of spins, inclinations, and plasma parameters. The orange histogram shows the same for analytic plasma models in the Kerr metric that relax some of the assumptions of the GRMHD simulations, while the green histogram shows the results for analytic plasma models in metrics that deviate from Kerr either parametrically (JP) or through different solutions to the field equations (EMDA). All distributions peak at small positive values.

The α 2 Calibration Factor and Its Uncertainties
We turn to quantifying the correction factor α 2 and its uncertainty that arises from applying imaging and model-fitting tools to infer the size of a ring-like image. To this end, we first characterized all simulations discussed in Section 3.2 based on image morphology and size, degree of variability, spacetime metric, and plasma model. We then randomly selected segments and snapshots from each category. We assigned a random position angle in the sky to each image and generated synthetic EHT data from them using the VLBI synthetic data generation pipeline SYMBA (Janssen et al. 2019;Roelofs et al. 2020b;Natarajan et al. 2022). SYMBA accounts for the effects of interstellar scattering through the Galactic disk, as well as several realistic atmospheric, instrumental, and calibration effects. In addition, we designated a last category in which synthetic data were generated from a small number of snapshots but with several different realizations of all the measurement uncertainties. This yielded a total of 145 synthetic data sets.
We carried out blind image reconstructions and mG-ring fits to all the synthetic data using the same EHT imaging pipelines as those applied to Sgr A * data, separating into teams that did not have any prior knowledge of the synthetic data characteristics. As for the case of the real data, imaging teams generated a top set of reconstructions for each synthetic data set, using the exact same set of algorithmic parameters as those used for the real Sgr A * data. We applied CHARM to the entire top-set image reconstructions (for a total of 145 data sets × 2000 top-set parameters × 3 algorithms) and to the ground-truth images to measure the calibration factor α 2 . Modeling teams applied the snapshot fitting procedure with an mG-ring model and returned their posteriors for the model diameter, which we used to calculate the α 2 calibration factor.
In the majority of cases, the set of reconstructions that correspond to the full range of top-set parameters or posteriors yielded a narrow range of diameters and widths for the ring features, indicating a robust inference of the prevalent features with little sensitivity to the choice of regularizers. However, in <30% of the data sets, the features of the images varied significantly within the top-set parameters, leading to an uncertainty in the ring diameter that is ∼3-8 times larger than what is measured in the Sgr A * data (see Figure 3). This primarily happens when the image size, position angle, and asymmetry of the ground-truth image that led to the particular synthetic data set conspire to remove any salient features in the visibility domain, so that the image reconstruction is dominated by the priors rather than by any unique features in the data that an imaging or model-fitting algorithm can pick up on.

More quantitatively, we define the spread in the diameter for all the reconstructions of a given synthetic data set by using a metric based on d 85 , d 50 , and d 15 , which refer to the 85th, 50th, and 15th percentiles of diameters in a given distribution, respectively. The spread in the top-set reconstructions of the actual Sgr A * data using this metric is 0.06-0.1 (see Figure 3). We place a conservative limit of diameter spread of less than 0.2 for the synthetic data reconstructions and include only the data sets that fulfill this criterion in our derivation of the α 2 calibration parameters.

Figures 8 and 9 show the distributions of the fractional diameter difference α 2 − 1 for the imaging reconstructions and mG-ring model fits, respectively, of the synthetic data sets discussed above. The trend in Figure 8, i.e., the slight offset between the peaks of the distributions calculated for the different imaging methods, follows the one we see in the reconstruction of the actual EHT Sgr A * data very closely (see Figure 3). This result reinforces our conclusion that the small differences in the inferred diameters between different algorithms are primarily caused by the different methodologies, prior, and regularizer choices in those methods (see Paper III). The same is true for the trend in the mG-ring results, albeit corresponding to more marked differences.
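The percentile-based selection described above can be sketched as follows. The exact expression for the spread is not reproduced in the text, so the form used here, (d85 − d15)/d50, is an assumption chosen only because it is dimensionless and built from the same three percentiles; the function names and the sample diameters are hypothetical.

```python
import numpy as np

def diameter_spread(diameters):
    """Assumed spread metric: (d85 - d15) / d50 from diameter percentiles."""
    d15, d50, d85 = np.percentile(diameters, [15, 50, 85])
    return (d85 - d15) / d50

def passes_spread_cut(diameters, max_spread=0.2):
    """Keep a synthetic data set only if its diameter spread is below the limit."""
    return diameter_spread(diameters) < max_spread

# Hypothetical top-set diameters in microarcseconds, for illustration only.
narrow_top_set = np.linspace(45.0, 55.0, 101)
broad_top_set = np.linspace(30.0, 70.0, 101)
```

With these made-up inputs, the narrow set would be retained and the broad set would be excluded from the α 2 calibration.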

The Diameter of the Black Hole Shadow
We use the combination of the measurements and calibrations discussed in the previous sections to infer the diameter of the boundary of the black hole shadow, $d_{\rm sh} = \hat{d}_m/(\alpha_1 \alpha_2)$. The posterior over the shadow diameter is given by

$$P(d_{\rm sh} \mid \hat{d}) = C \int d\alpha_1 \int d\alpha_2 \, \mathcal{L}[\hat{d} \mid d_{\rm sh}, \alpha_1, \alpha_2] \, P(\alpha_1) \, P(\alpha_2) \, P(d_{\rm sh}),$$

where $\mathcal{L}[\hat{d} \mid d_{\rm sh}, \alpha_1, \alpha_2]$ is the likelihood of measuring a ring diameter $\hat{d}$ given the model parameters, P(α 1 ) and P(α 2 ) are the distributions of the calibration parameters, and C is an appropriate normalization constant. P(d sh ) is the prior over the shadow diameter, which we assume to be flat over a range that is much broader than that of the posteriors.

Figure 8. The fractional diameter difference between the diameter of peak emission in ground-truth images and those reconstructed through the three different imaging methods used for EHT analyses. Synthetic data cover 145 sets selected from numerical and analytic Kerr and non-Kerr models, while the image diameters were inferred for all of the top-set images for the image reconstructions of each data set. The small offsets in the calibration parameter mimic those seen in the analysis of the actual Sgr A * data.

Figure 9. The fractional diameter difference between the diameter of peak emission in ground-truth images and those reconstructed through fitting the mG-ring model to the visibility-domain synthetic data used for EHT analyses.
We show in Figure 10 the posteriors over the shadow diameter as inferred from the three image-domain algorithms and for the different theoretical calibrations discussed in Section 3.2. In Table 2, we report the most likely values of the black hole shadow diameter for Sgr A * , as well as the 68th percentile credible levels. Finally, in Figure 11 we overlay the inferred shadow boundaries on the average EHT image of Sgr A * obtained from the 2017 April 7 data (Event Horizon Telescope Collaboration et al. 2022b). In this plot, the solid lines show the range 46.9-50.0 μas of the most likely values, and the dashed lines show the envelope of the 68th percentile credible intervals across the different methods, spanning 41.7-55.6 μas.
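As a rough illustration of how the measurement and the two calibration factors combine, the sketch below propagates their uncertainties by Monte Carlo sampling, using the relation d sh = d m /(α 1 α 2 ) implied by the definitions of the calibration factors. This is a simplification under stated assumptions, not the analysis used here: all three inputs are treated as independent Gaussians, the ring diameter 51.3 ± 2.0 μas is the averaged value quoted from Paper III, and the means and widths adopted for α 1 and α 2 are hypothetical placeholders.

```python
import numpy as np

def sample_shadow_diameter(d_mean, d_sigma, a1_mean, a1_sigma,
                           a2_mean, a2_sigma, n_samples=200_000, seed=0):
    """Draw shadow diameters d_sh = d_m / (alpha_1 * alpha_2).

    All inputs are modeled as independent Gaussians (an assumption).
    """
    rng = np.random.default_rng(seed)
    d_m = rng.normal(d_mean, d_sigma, n_samples)
    alpha1 = rng.normal(a1_mean, a1_sigma, n_samples)
    alpha2 = rng.normal(a2_mean, a2_sigma, n_samples)
    return d_m / (alpha1 * alpha2)

# Ring diameter from the text; calibration-factor values are hypothetical.
samples = sample_shadow_diameter(51.3, 2.0, 1.05, 0.05, 1.00, 0.05)
lower, median, upper = np.percentile(samples, [16, 50, 84])
```

The 16th and 84th percentiles bracket a 68% credible interval, which is the kind of summary reported for the shadow diameter.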

Constraints on the Deviation Parameter
Using the uncertainties discussed above, we obtain the posterior over the deviation parameter δ from Equation (5). In that expression, C is an appropriate normalization constant; the likelihood of measuring a ring diameter d̂ given the model parameters is identified with the distributions of measurements from the imaging and visibility-domain methods; P(θ g ) denotes the prior in θ g given by stellar-dynamics measurements; and P(α 1 ) and P(α 2 ) are obtained from the calibration procedures outlined in Sections 3.2 and 3.3.
As discussed earlier, we consider two separate priors for θ g denoted by Keck and VLTI; three different measurements of the ring diameter from imaging methods (together with their corresponding α 2 calibrations) denoted by eht-imaging, SMILI, and DIFMAP; and three different sets of snapshot images for the α 1 calibration denoted by GRMHD, Analytic Kerr, and Analytic JP. We assume a flat prior in the fractional deviation δ, with limits that cover a range that is sufficiently broad not to affect the posteriors. We perform the two integrals in Equation (5) numerically and show the resulting posteriors in the deviation parameter δ in Figure 12.

Figure 10. Posteriors over the shadow diameter inferred using the measurements of the ring diameter size d m based on three image-domain algorithms, as well as the two factors α 1 and α 2 that quantify the theoretical and measurement calibrations.
We repeat the same procedure for the measurements obtained from mG-ring fits to the Sgr A * data. We show the corresponding result for the deviation parameter δ in Figure 13.
We present in Table 2 the means and 68th percentile credible levels for the posteriors we obtain for the deviation parameter δ using different combinations of black hole mass priors, theoretical models used for calibration, and the imaging and model-fitting methods used on the Sgr A * data. All of the posteriors are consistent with each other and with no deviation from the general relativistic predictions. We choose the eht-imaging+Keck+GRMHD and eht-imaging+VLTI+GRMHD combinations as the two fiducial cases to calculate constraints on the individual metric parameters in the remainder of this paper.
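To make the meaning of δ concrete, the sketch below converts a shadow diameter into a fractional deviation from the Schwarzschild prediction, whose shadow diameter is 6√3 θ g . The definition δ = d sh /(6√3 θ g ) − 1 is assumed here from the description of δ as a fractional deviation, and θ g is set to the rounded conversion factor of 5 μas quoted later in the text rather than to the Keck or VLTI priors, so the resulting numbers are purely illustrative.

```python
import math

def schwarzschild_shadow_diameter(theta_g):
    """Shadow diameter of a Schwarzschild black hole, 6*sqrt(3)*theta_g."""
    return 6.0 * math.sqrt(3.0) * theta_g

def deviation_parameter(shadow_diameter, theta_g):
    """Assumed definition: fractional deviation from the Schwarzschild size."""
    return shadow_diameter / schwarzschild_shadow_diameter(theta_g) - 1.0

THETA_G_UAS = 5.0  # rounded GM/(c^2 D) in microarcseconds (approximate)

# Range of most likely shadow diameters quoted in the text (microarcseconds).
delta_low = deviation_parameter(46.9, THETA_G_UAS)
delta_high = deviation_parameter(50.0, THETA_G_UAS)
```

Because δ depends linearly on 1/θ g , a few percent change in the adopted θ g shifts δ by a comparable amount, which is why the two stellar-dynamics priors are treated separately.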

Are There Viable Alternatives to an Event Horizon?
While there is overwhelming evidence that Sgr A * contains a large amount of mass confined within a very small volume, the question of whether it is a true black hole remains unresolved. The defining characteristic of a black hole is the presence of an event horizon. While it is relatively easy to show that observations of Sgr A * are consistent with the presence of an event horizon (e.g., the many black-hole-based models discussed in Paper V), proving that all alternatives are ruled out is well-nigh impossible. Here we discuss what EHT observations of Sgr A * are able to add to this question.
If Sgr A * does not have an event horizon, it is likely to have some kind of a surface. Alternatively, the object might be a boson star, naked singularity, or some other exotic solution of gravitational physics (see Cardoso & Pani 2019 for a review of exotic compact object models). If we could rule out some of these possibilities using observational data, then the case for Sgr A * having an event horizon would become significantly stronger. We discuss below two arguments against Sgr A * possessing a radiating surface. One argument (Section 4.1) is well developed in the literature (Narayan et al. 1998; Narayan 2002; Broderick & Narayan 2006; Narayan & McClintock 2008; Broderick et al. 2009), while the other (Section 4.2) is new. Models involving boson stars and certain kinds of naked singularities are considered in Section 4.3, and other exotic possibilities, including wormholes, are discussed in Section 5.2.

Thermalizing Surface
Accretion in Sgr A * is believed to occur via a hot accretion flow (Yuan & Narayan 2014). Now that the EHT image of Sgr A * (Paper III) has revealed a brightness temperature well in excess of 10 9 K, the evidence for the presence of very hot gas is particularly compelling.
The radiative luminosities of hot accretion flows are generally far less than Ṁc 2 (Yuan & Narayan 2014), where Ṁ is the mass accretion rate. Therefore, the accreting gas in these systems reaches the compact object at the center with a considerable amount of thermal and kinetic energy. If the compact object is a black hole, this energy simply disappears through the event horizon. On the other hand, if the object has a surface, the energy will be thermalized and reradiated (once the system reaches steady state), giving a large surface luminosity that should be visible to a distant observer. Observations can thus tell the difference between an event horizon and a thermalizing surface.

Figure 12. Posteriors on the parameter δ that measures the deviation of the black hole shadow size obtained for Sgr A * from the Schwarzschild predictions. The top panel uses as a prior the angular size θ g obtained with Keck observations, while the bottom uses the same quantity from VLT(I). The various curves correspond to different sets of theoretical models used for calibration and the measurements obtained with the various imaging methods. The purple shaded area shows the ∼8% range predicted for the Kerr metric, depending on the black hole spin and observer inclination.

Figure 13. Same as Figure 12, but for the measurements obtained from fitting mG-ring models to the visibility-domain data.
In the previous paragraph, and also in the rest of Section 4, we assume that (i) matter in the compact object at the center of Sgr A * satisfies energy conservation; (ii) it obeys the laws of thermodynamics, in particular, it approaches statistical equilibrium in steady state; and (iii) it couples to and radiates in all electromagnetic modes. These assumptions can be considered "natural" minimal principles, but they can be violated in extreme models. For example, the shell-like black hole mimicker described in Danielsson et al. (2021) can be designed either not to produce any electromagnetic radiation or to radiate only in a handful of modes, thereby violating assumption (iii). It is not possible to constrain such models using astronomical observations in electromagnetic bands, though in certain cases it may be possible to distinguish them via gravitational waves (Abbott et al. 2021; see, e.g., Chirenti & Rezzolla 2007, for the case of gravastars). Note that even very exotic objects would satisfy our assumptions, including (iii), if only a small fraction of the accreted baryonic gas survives on their surface as normal matter. To be optically thick in the electromagnetic bands of interest to us, the skin of normal matter should have a surface density as little as 1 g cm −2 , which corresponds to just 10 −14 of the total mass of Sgr A * . An exotic object would need to convert all accreted gas on its surface to electromagnetically inactive material if it is to escape detection by electromagnetic observations.
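The order of magnitude of the skin-mass estimate above can be checked with a short calculation. The sketch assumes a mass of 4 × 10 6 solar masses for Sgr A * and a surface at an areal radius of 3 GM/c 2 ; both values are illustrative assumptions rather than numbers taken from this section, and the function name is hypothetical.

```python
import math

G_CGS = 6.674e-8       # gravitational constant, cm^3 g^-1 s^-2
C_CGS = 2.998e10       # speed of light, cm s^-1
M_SUN_G = 1.989e33     # solar mass, g

def skin_mass_fraction(mass_msun, radius_rg, surface_density=1.0):
    """Mass of a thin skin (g cm^-2) at an areal radius, as a fraction of M."""
    mass_g = mass_msun * M_SUN_G
    r_g_cm = G_CGS * mass_g / C_CGS**2          # gravitational radius GM/c^2
    radius_cm = radius_rg * r_g_cm
    skin_mass_g = 4.0 * math.pi * radius_cm**2 * surface_density
    return skin_mass_g / mass_g

# Assumed mass (4e6 solar masses) and surface radius (3 GM/c^2).
fraction = skin_mass_fraction(4.0e6, 3.0)
```

For radii between a few and about eight gravitational radii, the fraction stays within roughly an order of magnitude of 10 −14 .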
For a spherically symmetric spacetime, matter that starts from rest at infinity and then accretes via a radiatively inefficient mode to come to rest on a surface at radius R * will release thermal energy as measured at infinity equal to a fraction η of the rest-mass energy of the gas, where (the following expression is obtained for the Schwarzschild metric, and we use geometrized units: G = c = 1)

$$\eta = 1 - \sqrt{1 - \frac{2M}{R_*}}. \qquad (6)$$

If the released thermal energy is radiated back to infinity (we emphasize that this is unavoidable once the object reaches steady state), the extra luminosity from the thermalizing surface will be typically much larger than the luminosity radiated by the hot accretion flow itself. This feature can be exploited to distinguish black holes, which by definition have an event horizon, from other kinds of compact objects that have a surface. In the context of stellar-mass black holes, this argument provides a convenient way of distinguishing black holes from neutron stars (Narayan et al. 1997; Garcia et al. 2001).

In the case of Sgr A * , the argument proceeds differently. In essence, the observed submillimeter radiation provides a lower limit on the mass accretion rate, Ṁ min , regardless of whether the radiation is produced by inflowing hot gas or an outflowing jet. Therefore, given an assumed radius R * of the surface, we can estimate the minimum surface luminosity that should be observed at infinity,

$$L_\infty \geq \eta \, \dot{M}_{\rm min} c^2. \qquad (7)$$

As we show below, the surface radiation should appear in the infrared, where observations provide strong upper limits on the luminosity of Sgr A * . These limits lie far below the predicted minimum surface luminosity, implying that Sgr A * does not have a radiating surface. Versions of this argument have been made in previous papers in the context of Sgr A * (see Narayan & McClintock 2008, for a review). A similar argument also applies to the supermassive black hole in M87 (Broderick et al. 2015; Event Horizon Telescope Collaboration et al. 2019c). In related work, Lu et al. (2017) argued that the absence of flashes of radiation from stars crashing on supermassive black hole candidates in galactic nuclei requires these candidates to be true black holes with event horizons.
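The efficiency η can be evaluated directly for the Schwarzschild case. The sketch below uses the standard result that a particle at rest at radius R * has energy at infinity equal to √(1 − 2M/R * ) per unit rest mass, so that η = 1 − √(1 − 2M/R * ) in geometrized units; the function name and the sample radii are chosen here for illustration.

```python
import math

def surface_efficiency(radius_over_m):
    """Fraction of rest-mass energy released by gas settling at R*/M.

    Schwarzschild metric, gas starting at rest at infinity (G = c = 1).
    """
    if radius_over_m <= 2.0:
        raise ValueError("the surface must lie outside the horizon radius 2M")
    return 1.0 - math.sqrt(1.0 - 2.0 / radius_over_m)

# Efficiencies for a few illustrative surface radii in units of M.
efficiencies = {r: surface_efficiency(r) for r in (2.5, 3.0, 4.0, 6.0, 8.0)}
```

The efficiency falls from about 0.55 at R * = 2.5M to about 0.13 at R * = 8M, and approaches M/R * at large radii, which is why the predicted surface luminosity can only be made small by placing the surface far out.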

EHT Limit on the Radius of the Surface
In the case of Sgr A * , a somewhat weak link in the argument outlined above was the hitherto lack of a strong upper limit on the radius R * of a putative surface in Sgr A * . Since the surface luminosity for a given Ṁ scales as ∼M/R * at large R * (Equations (6) and (7)), one could make the predicted luminosity small by arbitrarily increasing R * /M, thereby evading observational limits. This loophole has now been closed by EHT observations.
Using a maximally conservative analysis of EHT 2017 visibility data, and without any model assumptions, Paper II estimates the FWHM of the image of Sgr A * to lie in the range 39-87 μas. With a conversion factor, GM/c 2 D ≈ 5 μas (see Figure 1), this corresponds to an image diameter <18M.
The observed 230 GHz radiation in Sgr A * is from the hot accretion flow, not from the surface (which should radiate in the infrared). Any surface must lie interior to the 230 GHz emitting hot accretion flow and should have an apparent diameter smaller than 18M. Thus, from the analysis in Paper II, we set the following upper limit on the apparent radius of the surface as viewed by a distant observer: R app < 9M.
Paper III presents image reconstructions of Sgr A * based on the EHT 2017 data. Table 7 in that paper summarizes the results of fitting a ring model to image reconstructions based on several methods. Using the imaging results from DIFMAP, eht-imaging, and SMILI and combining the ring analyses with REx and VIDA (see Paper III for details), the average ring diameter estimate is d = 51.3 ± 2.0 μas, and the ring width estimate is w = 29.6 ± 3.6 μas (these results correspond to descattered images from April 7 data). We take (d + w) = 80.9 ± 4.1 μas as a reasonable proxy for the apparent outer diameter of the source. Using the 95% confidence upper limit, (d + w) < 88 μas, we obtain R app < 8.8M (95% CL). Paper III obtains a tighter constraint using the Bayesian imaging method THEMIS, while Paper IV similarly reports tighter constraints by fitting mG-ring models (based on Johnson et al. 2020) directly to visibility data. To be conservative, we do not use these limits.
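The arithmetic behind these limits can be reproduced in a few lines. The sketch assumes that d and w are independent Gaussian quantities, that the 95% limit is one-sided (z ≈ 1.645), and that the conversion factor is the rounded 5 μas per M quoted above; these choices reproduce the quoted numbers but are assumptions about how they were obtained.

```python
import math

d_uas, sigma_d = 51.3, 2.0      # ring diameter and uncertainty (microarcseconds)
w_uas, sigma_w = 29.6, 3.6      # ring width and uncertainty (microarcseconds)
THETA_G_UAS = 5.0               # rounded GM/(c^2 D) in microarcseconds
Z_95_ONE_SIDED = 1.645          # one-sided 95% Gaussian quantile (assumed)

# Outer diameter proxy and its uncertainty, added in quadrature.
outer_diameter = d_uas + w_uas
sigma_outer = math.hypot(sigma_d, sigma_w)

# One-sided 95% upper limit, converted to an apparent radius in units of M.
upper_95 = outer_diameter + Z_95_ONE_SIDED * sigma_outer
r_app_limit = upper_95 / (2.0 * THETA_G_UAS)
```

Under these assumptions the upper limit comes out just below 88 μas, corresponding to an apparent radius just below 8.8M.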
The analyses described in the previous paragraph treat d and w as uncorrelated quantities. However, as the careful analysis in Section 3.1 of the present paper shows, there is a strong anticorrelation between the estimated values of d and w, such that their sum (d + w) is quite tightly constrained. The dotted and dashed curves in Figure 4 correspond to (d + w) = 90 and 80 μas, or equivalently R app = 9M and 8M, respectively. Clearly, from this analysis, R app < 8M is a safe upper limit (at about 95% confidence).
To be very safe, we choose as a conservative upper limit on the apparent radius of a surface in Sgr A * R app < 9M. For a Schwarzschild spacetime, gravitational deflection of rays causes the apparent radius of a spherical surface as viewed by an observer at infinity to be larger than the true areal radius R * . The relation between the two is

$$R_{\rm app} = \begin{cases} 3\sqrt{3}\,M, & 2M < R_* \leq 3M, \\ R_* \left(1 - \dfrac{2M}{R_*}\right)^{-1/2}, & R_* > 3M. \end{cases}$$

Our upper limit, R app < 9M, then corresponds to R * < 8M. In the discussion below, we consider the full range of allowed R * values, from the event horizon radius R H = 2M to the upper limit, namely, 2M < R * ≲ 8M.
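The conversion between the apparent and true radius can be sketched numerically. The code below uses the standard Schwarzschild result that a surface inside the photon sphere (R * ≤ 3M) appears with the critical radius 3√3 M, while a surface outside it appears with radius R * /√(1 − 2M/R * ); the function names and the bisection-based inversion are illustrative choices, with all radii in units of M.

```python
import math

PHOTON_SPHERE_APPARENT = 3.0 * math.sqrt(3.0)  # apparent radius for R* <= 3M

def apparent_radius(r_star):
    """Apparent radius (units of M) of a surface at areal radius r_star."""
    if r_star <= 2.0:
        raise ValueError("the surface must lie outside the horizon radius 2M")
    if r_star <= 3.0:
        return PHOTON_SPHERE_APPARENT
    return r_star / math.sqrt(1.0 - 2.0 / r_star)

def surface_radius_from_apparent(r_app, tol=1e-10):
    """Invert apparent_radius for R* > 3M by bisection."""
    if r_app <= PHOTON_SPHERE_APPARENT:
        raise ValueError("apparent radius must exceed 3*sqrt(3) M to be invertible")
    low, high = 3.0, r_app  # apparent_radius(r) > r, so the root lies below r_app
    while high - low > tol:
        mid = 0.5 * (low + high)
        if apparent_radius(mid) < r_app:
            low = mid
        else:
            high = mid
    return 0.5 * (low + high)

r_star_limit = surface_radius_from_apparent(9.0)
```

An apparent radius of 9M maps to a true radius of about 7.75M, which the text rounds to the limit R * < 8M.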

Predicted Spectrum of Surface Radiation
Paper V discusses hot accretion flow models of Sgr A * based on extensive GRMHD simulations. The models indicate that the mass accretion rate in Sgr A * is typically M M 10 yr (similar to estimates reported in, e.g., Falcke et al. 1993; Yuan et al. 2003; Chael et al. 2018; Ressler et al. 2020), but with a broad distribution that extends from M M 10 yr The models at the lower end of this range are actually ruled out by various constraints (see Paper V); nevertheless, we stick to Ṁ = 10 −9 M ⊙ yr −1 as a safe and conservative lower limit on the mass accretion rate. Equations (6) and (7), combined with our upper limit on R * , then show that the surface luminosity measured at infinity must be ≳10 37 erg s −1 .
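The luminosity estimate can be checked with a short calculation. The sketch below combines the Schwarzschild efficiency η = 1 − √(1 − 2M/R * ) with the conservative accretion rate of 10 −9 solar masses per year and evaluates η Ṁ c 2 in cgs units; the physical constants are rounded and the function name is hypothetical.

```python
import math

M_SUN_G = 1.989e33      # solar mass, g
YEAR_S = 3.156e7        # one year, s
C_CM_S = 2.998e10       # speed of light, cm s^-1

def surface_luminosity(mdot_msun_per_yr, radius_over_m):
    """Surface luminosity at infinity, eta * Mdot * c^2, in erg s^-1."""
    eta = 1.0 - math.sqrt(1.0 - 2.0 / radius_over_m)
    mdot_g_per_s = mdot_msun_per_yr * M_SUN_G / YEAR_S
    return eta * mdot_g_per_s * C_CM_S**2

# Conservative accretion rate and the largest allowed surface radius.
lum_at_limit = surface_luminosity(1e-9, 8.0)
```

At R * = 8M this gives about 7.6 × 10 36 erg s −1 , that is, of order 10 37 erg s −1 ; smaller surface radii give larger values because η increases inward.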
Meanwhile, we know that the hot accretion flow in Sgr A * produces synchrotron radiation at submillimeter wavelengths with a luminosity ∼5 × 10 35 erg s −1 , shown by the green curves in Figure 14. Even in the absence of any independent estimate of Ṁ, just the fact that accretion results in this much radiation implies a certain minimum energy flow onto the surface. Since the accreting gas generally moves radially inward, relativistic beaming causes more radiation to impinge on the central object compared to what escapes to infinity. Thermalization of this infalling radiation would then give a surface luminosity greater than 5 × 10 35 erg s −1 . Any additional energy released by the mechanical and thermal energy of the infalling gas (this is expected to dominate in most scenarios) would further increase the surface luminosity. We therefore treat 5 × 10 35 erg s −1 as an even more conservative lower bound on the surface luminosity of Sgr A * than that discussed in the previous paragraph.
Figure 14. Predicted SEDs of surface radiation from Sgr A * are shown as solid black curves. The dashed blue curve corresponds to a model with a surface luminosity of 5 × 10 35 erg s −1 , a conservative lower bound. Observational data in various wave bands are plotted as filled circles. The thick red line and large green filled circle correspond to the 5th percentile of the variable infrared emission (Gravity Collaboration et al. 2020b), which we treat as an upper limit on any quiescent infrared luminosity from a potential surface in Sgr A * . This upper limit from the observations lies well below the theoretical SEDs and therefore rules out these models. The green curve on the left is an empirical fit to the radio and submillimeter part of the observed SED; the radiation in this region of the spectrum is produced by synchrotron emission from the hot accretion flow, and the corresponding luminosity is ∼5 × 10 35 erg s −1 .

A key feature of radiation emitted from a central surface in a hot accretion flow is that it will appear in a different region of the electromagnetic spectrum than the emission from the hot accreting gas and jet. The latter dominates in the radio and submillimeter bands (Figure 14, green curves). Meanwhile, the radiating gas at the surface, being optically thick, will radiate like a blackbody to a very good approximation (McClintock et al. 2004; Broderick & Narayan 2006). The temperature of this radiation, measured at infinity, is given by

$$T_\infty = \left(\frac{L_\infty}{4\pi R_{\rm app}^2 \sigma_{\rm SB}}\right)^{1/4},$$

where σ SB is the Stefan-Boltzmann constant. For the estimates of L ∞ and R app derived earlier, the predicted radiation should be in the NIR and optical bands. If we define the characteristic frequency ν * of the blackbody radiation by hν * = kT ∞ , the SED at infinity takes the form

$$\nu L_\nu = \frac{15}{\pi^4} \, L_\infty \, \frac{(\nu/\nu_*)^4}{e^{\nu/\nu_*} - 1}.$$

The left panel in Figure 14 shows predicted SEDs of surface radiation from Sgr A * , if the object has a surface with a radius R * = 2.5M; we choose this radius as a fiducial model for illustration.
The three solid black curves correspond to mass accretion rates M M 10 , 10 , 10 yr , respectively, the last of which is the conservative lower limit from Paper V mentioned earlier. The dashed blue curve corresponds to the absolute lower limit on the surface luminosity, L ∞ = 5 × 10 35 erg s −1 , discussed above.
The right panel of Figure 14 shows another sequence of models in which we vary the surface radius R * . Taking the previously mentioned conservative mass accretion rate estimate of 10 −9 M ⊙ yr −1 , we consider surface radii R * = 2M, 3M, 4M, 6M, and 8M.
In all the models shown in the two panels of Figure 14, the predicted surface emission (black and blue curves) is spectrally well separated from the synchrotron emission of the hot accretion flow (green curve).
Therefore, this predicted signature of surface emission is easy to identify via observations, making it possible to develop a robust test for the presence of a thermalizing surface.
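The spectral separation can be illustrated by estimating the blackbody temperature and characteristic frequency for the fiducial surface at R * = 2.5M. The sketch below assumes a mass of 4 × 10 6 solar masses for Sgr A * (not a value taken from this section), uses the apparent radius 3√3 M appropriate for a surface inside the photon sphere, adopts the conservative accretion rate of 10 −9 solar masses per year, and applies L ∞ = 4π R app 2 σ SB T ∞ 4 . The SED shape is the standard Planck form normalized to the total luminosity; constants are rounded and all names are illustrative.

```python
import math
import numpy as np

G_CGS, C_CGS = 6.674e-8, 2.998e10          # cgs units
M_SUN_G, YEAR_S = 1.989e33, 3.156e7
SIGMA_SB, K_B, H_PLANCK = 5.670e-5, 1.381e-16, 6.626e-27

def temperature_at_infinity(l_inf, r_app_cm):
    """Blackbody temperature from L = 4 pi R_app^2 sigma_SB T^4."""
    return (l_inf / (4.0 * math.pi * r_app_cm**2 * SIGMA_SB)) ** 0.25

def blackbody_nu_l_nu(x, l_inf):
    """nu*L_nu at x = nu/nu_*, normalized so the integral of L_nu is l_inf."""
    return (15.0 / math.pi**4) * l_inf * x**4 / np.expm1(x)

# Assumed mass of 4e6 solar masses; fiducial surface at R* = 2.5M.
r_g_cm = G_CGS * 4.0e6 * M_SUN_G / C_CGS**2
r_app_cm = 3.0 * math.sqrt(3.0) * r_g_cm
eta = 1.0 - math.sqrt(1.0 - 2.0 / 2.5)
l_inf = eta * (1e-9 * M_SUN_G / YEAR_S) * C_CGS**2
t_inf = temperature_at_infinity(l_inf, r_app_cm)
nu_star = K_B * t_inf / H_PLANCK

# Check the normalization: integrate L_nu d(nu) in units of l_inf.
x_grid = np.linspace(1e-4, 60.0, 200_001)
l_nu_per_x = blackbody_nu_l_nu(x_grid, 1.0) / x_grid
norm = float(np.sum(0.5 * (l_nu_per_x[1:] + l_nu_per_x[:-1]) * np.diff(x_grid)))
```

Under these assumptions the temperature is of order 10 4 K and the characteristic frequency is of order 10 14 Hz, far above the submillimeter band where the synchrotron emission peaks.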

Observational Limit on Surface Luminosity
Observations of Sgr A * have improved substantially in recent years. The current status is summarized in Section 2.2 and Figure 2, and the data are shown again in Figure 14. The infrared data are of most interest to us and are highlighted by the red line segments and green filled circles, which correspond to the 5th (thick red line and large green filled circle at the bottom), 50th (thin line, small circle), and 95th (thin line, small circle) percentiles, respectively, of the variable infrared luminosity. Sgr A * exhibits frequent flares in its infrared light curve (Eckart et al. 2004; Eisenhauer et al. 2005; Hora et al. 2014; Witzel et al. 2018), which are interpreted as transient electron heating events in the hot accretion flow or jet. A few bright flares have been shown to come from gas orbiting the central object at a projected radius R flare ∼ 6M-10M (Gravity Collaboration et al. 2018b). This location is not very different from the region of the flow that produces the submillimeter radiation observed by the EHT.
If Sgr A * were an object with a thermalizing surface, then, given its large mass, we would expect it to have an enormous thermal capacity. Consequently, thermal emission from its surface is not expected to show violent flaring activity. The observed infrared flares are thus much more likely to be produced by the hot accretion flow, possibly in transient turbulent heating or magnetic reconnection events (Markoff et al. 2001; Yuan et al. 2004; Ball et al. 2016, 2021; Ressler et al. 2017; Davelaar et al. 2018; Dexter et al. 2020; Nathanail et al. 2020, 2021; Chatterjee et al. 2021; Porth et al. 2021; Ripperda et al. 2021).
Since any surface infrared emission in Sgr A * must be steady, we ignore the fluctuating flare emission and treat the 5th percentile (the thick red line and large filled green circle in Figure 14) as the maximum steady infrared emission from a surface. (Note that Paper V uses an upper limit of $\nu L_\nu < 10^{34}$ erg s$^{-1}$ in the infrared (50th percentile) when evaluating their GRMHD-based accretion-jet models.) As Figure 14 shows, the 5th-percentile upper limit (especially the large green filled circle) lies nearly two orders of magnitude below the strict lower bound on the predicted surface luminosity discussed earlier (dashed blue line) and three orders of magnitude below predictions of more realistic models (solid black lines). We thus conclude that Sgr A * cannot have a thermalizing surface with characteristics similar to any of the models considered in Figure 14; ergo, the case for an event horizon is much strengthened.

Discussion and Caveats
Compared to previous discussions of this topic, what has improved is that, thanks to the EHT image of Sgr A * , we are now able to limit ourselves safely to surface radii R * < 8M, whereas in earlier works much larger radii were considered (as large as 1000M in Narayan 2002 and 100M in Broderick et al. 2009). Moreover, the infrared constraints are also now very much stronger (Figure 14). Correspondingly, the argument for the absence of a thermalizing surface is substantially strengthened.
The discrepancy between the maximum steady infrared luminosity that Sgr A * can possibly have (the 5th percentile thick red line and large green circle in Figure 14) and the minimum possible luminosity it could theoretically have and still possess a thermalizing surface (the dashed blue line) is too large to be circumvented with small fixes to model details. This statement is true even if we use the 50th percentile of the infrared observations (the middle red line and middle green circle in Figure 14), which would be equivalent to counting all the observed infrared radiation, including the flares, as surface emission. If we wish to consider models of Sgr A * with a surface, we have to find a weakness in one of the links in the underlying logic of the argument. An easy way out is to give up one of the basic physics assumptions listed in the third paragraph of Section 4.1. Here we consider other less drastic possibilities.
Could the predicted infrared radiation from a hypothetical surface in Sgr A * be obscured by foreground matter such as dust? This is highly unlikely since the radiation from the infrared flares is clearly visible, and that radiation comes from hot external gas (not from the surface) at radii within 10M (Gravity Collaboration et al. 2018b). It is hard to imagine an obscuring medium that allows flare emission to make it through but blocks radiation from the surface.
Another minor worry may be quickly dealt with. Because of spacetime curvature, radiation from a surface at areal radius R * in a Schwarzschild spacetime takes longer to reach a distant observer compared to a ray that travels in flat spacetime. Could this delay be so large that surface radiation has not yet reached us? Let us write $R_* = (1 + \mu)R_{\rm H}$, where $R_{\rm H} = 2M$ is the horizon radius, so that the extra delay grows only logarithmically, as $\sim 2M \ln(1/\mu)$. Even if the logarithm is as large as 100 (corresponding to R * being located a Planck length above the event horizon), the extra time delay is only about an hour.
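The "about an hour" figure can be checked with a few lines of arithmetic. The sketch below assumes that the extra delay is of order $2GM/c^3$ times the logarithmic factor, which is the standard scaling for a ray leaving the near-horizon region of a Schwarzschild spacetime. That scaling, the rounded physical constants, and the function name are assumptions made for illustration; the mass of $4 \times 10^6\,M_\odot$ is the value quoted for Sgr A * in this paper.

```python
# Order-of-magnitude check of the extra light-travel delay.
# Assumption (illustrative): delay ~ 2 * (G M / c^3) * log_factor.

G = 6.674e-8        # gravitational constant, cm^3 g^-1 s^-2
C = 2.998e10        # speed of light, cm s^-1
M_SUN = 1.989e33    # solar mass, g


def extra_delay_seconds(mass_msun: float, log_factor: float) -> float:
    """Return the assumed extra delay, 2 (GM/c^3) * log_factor, in seconds."""
    t_g = G * mass_msun * M_SUN / C**3   # gravitational time GM/c^3
    return 2.0 * t_g * log_factor


if __name__ == "__main__":
    delay = extra_delay_seconds(mass_msun=4.0e6, log_factor=100.0)
    print(f"extra delay: {delay / 3600.0:.2f} hours")
```

With these inputs the delay comes out at roughly one hour, which is negligible compared with the age of the source.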
A related worry is that the gravitational redshift, $(1 + z) \sim \mu^{-1/2}$, between the surface and infinity might dilute the observed luminosity sufficiently to make the surface radiation invisible. Abramowicz et al. (2002) noted that this effect causes the radiation luminosity that reaches the observer to be reduced by a factor of $1/(1 + z)^4 \sim \mu^2$ compared to what is emitted at the surface. They claimed that, if (1 + z) were large enough, no detectable radiation would reach the observer and it would be impossible to distinguish an event horizon from a surface.
However, gravitational redshift is not an issue for the line of argument we have presented in this paper because we expressed everything in terms of energy and luminosity as measured at infinity; in such a framework, all redshift factors drop out. For instance, if the radiation observed at infinity has a temperature T ∞ , then the radiation emitted by the surface will have a temperature T local = (1 + z)T ∞ in the local rest frame. The radiation emerging from the surface will have a flux equal to $\sigma T_{\rm local}^4$, and the corresponding luminosity is larger than what reaches infinity by precisely the factor of $(1 + z)^4$ noted by Abramowicz et al. (2002). The only question is whether the system has enough time to heat up to such a high local temperature. We discuss this important issue next.
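The cancellation of the redshift factors can be written out explicitly. The lines below are a schematic bookkeeping sketch that assumes a fixed emitting area, so that luminosity scales with flux; $\sigma$ is the Stefan-Boltzmann constant, and $F_{\rm local}$, $L_{\rm local}$, and $L_\infty$ are labels introduced here for the surface flux, the luminosity at the surface, and the luminosity at infinity.

```latex
\begin{align}
T_{\rm local} &= (1+z)\,T_\infty, \\
F_{\rm local} &= \sigma T_{\rm local}^4 = (1+z)^4\,\sigma T_\infty^4, \\
L_\infty &= \frac{L_{\rm local}}{(1+z)^4} \;\propto\; \sigma T_\infty^4 .
\end{align}
```

Expressed in terms of $T_\infty$, the luminosity at infinity carries no explicit factor of $(1+z)$.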
Sgr A * is presumably as old as the Milky Way, i.e., several billion years old. Over much of that time, it must have accreted gas at a rate at least equal to the present $\dot{M}$ (see footnote 154). The time needed to achieve the steady-state condition implicit in Equation (9), or equivalently T local = (1 + z)T ∞ , is far shorter than the age of Sgr A * for almost any model.
The one exception is if $\mu \ll 1$, i.e., if the surface R * is extremely close to the event horizon. In this limit, as Lu et al. (2017) argued, the time required to achieve steady state scales as $\mu^{-1}$ and can become arbitrarily long. The physical reason is that the region between R = R * and the photon sphere, R ph = 3M, traps radiation. This volume has a large thermal capacity and therefore takes a long time to reach steady state. Applying this logic to Sgr A * , Lu et al. (2017) concluded that the absence of infrared radiation in Sgr A * rules out a thermalizing surface only if $\mu \gtrsim 10^{-14}$. If the surface is even closer to the horizon radius than this limit, i.e., if $R_* \lesssim R_{\rm H} + 10^{-2}$ cm, then the steady-state condition will be invalid. Their argument thus provides an upper limit on R * for surface models that evade the infrared constraint.
Using completely different reasoning, Carballo-Rubio et al. (2018) set a lower limit on μ. The argument goes as follows. Because Sgr A * accretes mass continuously, its horizon radius 2M increases with time. In order to maintain R * = (1 + μ)R H , the surface also needs to expand. However, if μ is too small, the required expansion speed is greater than the speed of light in the local frame, which is unphysical. Using a conservative estimate of $\dot{M} \sim 10^{-11}\,M_\odot$ yr$^{-1}$ (which is two orders of magnitude smaller than the lower limit given in Paper V and more than seven orders of magnitude less than the likely average accretion rate over the life of Sgr A * ), Carballo-Rubio et al. (2018) conclude that Sgr A * can avoid the faster-than-light conflict only if $\mu \gtrsim 10^{-23}$, i.e., if $R_* \gtrsim R_{\rm H} + 10^{-11}$ cm. Note that this rules out models in which the surface lies a Planck length ($10^{-33}$ cm) above the event horizon, as gravastar models (Mazur & Mottola 2001; Chapline 2003) often implicitly assume.
Combining the arguments in the previous two paragraphs, we are left with an interesting class of models with μ in the range $10^{-23} < \mu < 10^{-14}$ for which Sgr A * is currently allowed to have a thermalizing surface and yet not be ruled out by infrared constraints. This gap in model space merits further investigation.
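The conversion between the dimensionless parameter μ and a physical height above the horizon is simple arithmetic. The sketch below assumes $R_* = (1 + \mu)R_{\rm H}$ with $R_{\rm H} = 2GM/c^2$ and a mass of $4 \times 10^6\,M_\odot$; the rounded constants and function names are illustrative assumptions.

```python
# Convert the surface parameter mu into a height above the horizon,
# assuming R_* = (1 + mu) R_H with R_H = 2 G M / c^2 (Schwarzschild).

G = 6.674e-8        # gravitational constant, cm^3 g^-1 s^-2
C = 2.998e10        # speed of light, cm s^-1
M_SUN = 1.989e33    # solar mass, g


def horizon_radius_cm(mass_msun: float) -> float:
    """Schwarzschild horizon radius R_H = 2GM/c^2 in cm."""
    return 2.0 * G * mass_msun * M_SUN / C**2


def height_above_horizon_cm(mu: float, mass_msun: float = 4.0e6) -> float:
    """Height R_* - R_H = mu * R_H in cm."""
    return mu * horizon_radius_cm(mass_msun)


if __name__ == "__main__":
    for mu in (1.0e-14, 1.0e-23):
        print(f"mu = {mu:.0e} -> height = {height_above_horizon_cm(mu):.1e} cm")
```

For the assumed mass, $R_{\rm H}$ is about $10^{12}$ cm, so the two limits on μ correspond to heights of order $10^{-2}$ cm and $10^{-11}$ cm, matching the values quoted above.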
Another issue worth serious discussion is the assumption that the surface will radiate like a blackbody. Since we are considering an object that (i) is in steady state and therefore in thermal equilibrium (by our assumptions), (ii) is likely nearly isothermal in the sense that the redshifted temperature T ∞ is independent of radius inside the object, and (iii) has an enormous optical depth, it seems unavoidable that the emission must be close to a blackbody. (For instance, stars radiate roughly like blackbodies because of their large optical depths and would be perfect blackbodies if they were isothermal.) Any deviations from a perfect blackbody in the putative surface radiation in Sgr A * might thus be expected to be minor. However, the specific case of radiation produced by energy release from matter falling on the surface of a compact supermassive ($>10^6\,M_\odot$) object has not been studied and merits further attention. Models of spherical accretion on neutron stars ($M = 1.4\,M_\odot$) studied by Shapiro & Salpeter (1975) suggest that modest deviations from a perfect blackbody are expected in that case; their models show some hardening of the thermal spectrum plus the appearance of a power-law spectral component extending to higher frequencies. If the corresponding effects in the case of a surface in Sgr A * ($M = 4 \times 10^6\,M_\odot$) are similarly modest, then our blackbody assumption is quite safe. Note that Shapiro & Salpeter (1975) did not include ray deflections and strong lensing in their model.
The argument for a blackbody spectrum is very strong in one particular limit. When the surface has a radius R * close to R H , i.e., $\mu \ll 1$, the volume between R * and R ph = 3M acts like an enclosed cavity, with radiation allowed to escape only over a small solid angle ∼ μ. The cavity then behaves like a textbook isothermal "furnace" with a tiny pinhole for escaping radiation. In this limit, the radiation that reaches a distant observer will be indistinguishable from a perfect blackbody (Broderick & Narayan 2006).
If the quiescent infrared radiation in Sgr A * corresponds to blackbody emission from a surface, it should be completely unpolarized. On the other hand, if the radiation is produced by synchrotron emission in optically thin (weak) flares, we might expect a certain degree of linear polarization. Bright infrared flares in Sgr A * show clear evidence for strong linear polarization (Eckart et al. 2006;Gravity Collaboration et al. 2020c), but there is currently no information on the degree of polarization of the weak emission below the 5th percentile. Sensitive polarimetry could be used in the future to explore this regime and might help to reduce even further the maximum level of blackbody emission allowed in Sgr A * .

Reflecting Surface
In this subsection, we focus again on the possibility that Sgr A * may have a surface, but now we explore models in which the surface reflects incident radiation. We assume that, in the rest frame of the surface at some fixed areal radius R * , the following properties hold: (i) any inward-moving ray that is incident with wavevector $k^\mu$ becomes an outward-moving ray with $k^r$ reversed and the other components of $k^\mu$ unchanged; and (ii) if the intensity of the incoming ray is $I_\nu$, the outgoing ray has an intensity $A I_\nu$, where $A \leq 1$ is the albedo of the surface. The motivation for considering such a model is that it makes interesting predictions that an interferometer like the EHT might be able to observe.
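The two reflection rules above can be expressed as a small function. This is a minimal sketch of the stated boundary condition only, not of any ray-tracing code used for the images; the `Ray` container, its field layout, and the function name are hypothetical and introduced purely for illustration.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Ray:
    """Hypothetical ray record in the rest frame of the surface."""
    k: Tuple[float, float, float, float]  # (k^t, k^r, k^theta, k^phi)
    intensity: float                      # specific intensity I_nu


def reflect(ray: Ray, albedo: float) -> Ray:
    """Apply rules (i) and (ii): flip k^r and scale the intensity by A."""
    if not 0.0 <= albedo <= 1.0:
        raise ValueError("albedo must satisfy 0 <= A <= 1")
    k_t, k_r, k_theta, k_phi = ray.k
    if k_r >= 0.0:
        raise ValueError("only inward-moving rays (k^r < 0) are reflected")
    return Ray(k=(k_t, -k_r, k_theta, k_phi), intensity=albedo * ray.intensity)


if __name__ == "__main__":
    incoming = Ray(k=(1.0, -0.5, 0.2, 0.1), intensity=2.0)
    print(reflect(incoming, albedo=0.3))
```

The remaining fraction $(1 - A)$ of the incident intensity is not tracked here; in a fuller treatment it would be the part absorbed by the surface.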

Synthetic Images Based on GRMHD Simulations
As an illustration of the effects we expect from a reflecting surface, we use a long-duration simulation of a hot accretion flow in the MAD state around a black hole of spin a * = 0 (Narayan et al. 2021). We take the profiles of density, pressure, four-velocity, and magnetic field in the poloidal (r, θ)-plane, time-averaged over the simulation period t = 50,000M-100,000M. We set the electron temperature using the prescription given in Event Horizon Telescope Collaboration et al. (2019b, which is based on Mościbrodzka et al. 2016) with parameter values R high = 20 and R low = 1. We scale the density, and proportionately the gas pressure and magnetic energy density, such that the observed 230 GHz flux density is equal to 2.4 Jy, as measured during the 2017 EHT observations of Sgr A * (M. Wielgus et al. 2022, in preparation). We then compute a synthetic 230 GHz image for an observer at an inclination angle of i = 60°.
The top left panel in Figure 15 shows the 230 GHz image of the above model, assuming that the object at the center is a Schwarzschild black hole. The image is computed using the ray-tracing code HEROIC (Zhu et al. 2015;Narayan et al. 2016). The top middle panel shows the same image blurred with a Gaussian beam of FWHM equal to 15 μas; this beam size corresponds to the typical resolution that is achieved by the EHT using super-resolution image reconstruction techniques.
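Blurring with a Gaussian beam of a given FWHM can be sketched with NumPy alone. The example below is an illustrative stand-in, not the procedure used to produce Figure 15: the pixel scale, image size, and function names are assumptions, and the convolution is done separably along rows and columns.

```python
import numpy as np


def gaussian_kernel_1d(fwhm_uas: float, pixel_uas: float) -> np.ndarray:
    """Normalized 1D Gaussian kernel, truncated at 4 sigma."""
    sigma_pix = fwhm_uas / (2.0 * np.sqrt(2.0 * np.log(2.0))) / pixel_uas
    half = int(np.ceil(4.0 * sigma_pix))
    x = np.arange(-half, half + 1)
    kernel = np.exp(-0.5 * (x / sigma_pix) ** 2)
    return kernel / kernel.sum()


def blur(image: np.ndarray, fwhm_uas: float, pixel_uas: float) -> np.ndarray:
    """Convolve a 2D image with a circular Gaussian beam (separable)."""
    kernel = gaussian_kernel_1d(fwhm_uas, pixel_uas)
    if min(image.shape) < kernel.size:
        raise ValueError("image is smaller than the blurring kernel")
    rows = np.apply_along_axis(
        lambda r: np.convolve(r, kernel, mode="same"), 1, image)
    return np.apply_along_axis(
        lambda c: np.convolve(c, kernel, mode="same"), 0, rows)


if __name__ == "__main__":
    # Toy input: a single bright pixel on an assumed 1 uas/pixel grid.
    img = np.zeros((101, 101))
    img[50, 50] = 1.0
    blurred = blur(img, fwhm_uas=15.0, pixel_uas=1.0)
    print("total flux after blurring:", blurred.sum())
```

A point source is a convenient test input because the blurred result is the beam itself, so its width can be compared directly with the requested FWHM.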
The unblurred image in the top left panel of Figure 15 shows the usual features. The sharp circular ring is the photon ring produced by strong gravitational lensing by the black hole. The diffuse elliptical feature is the image of equatorial emission from the accretion flow, flattened in the vertical direction because of the 60° inclination of the observer. These two features are visible even in the 15 μas blurred image in the top middle panel (the features merge if we blur with a 20 μas beam, the nominal resolution of the EHT). Most importantly, a dark shadow region is clearly seen in the middle of even the blurred image.
The second row in Figure 15 shows the effect of including a reflecting surface with albedo A = 1 (100% reflection) at a radius R * = 2.5M (selected as an example). In addition to the diffuse disk emission and sharp photon ring already described in the top left image, we find additional components that are caused by reflection. The thick bright ring at the center of the image corresponds to radiation from the equatorial accretion flow that is reflected from the side of the surface facing the observer. The thin ring (close to the original photon ring) is from rays that reflect off the far side of the surface and are then lensed around the compact object. Interestingly, the new features from reflection, especially the first one, appear in the shadow region of the original black hole image. When blurred, the resulting image, shown in the middle panel of the second row, has much of the shadow region filled in. This fairly dramatic effect is potentially distinguishable by the EHT.
The third and fourth rows in Figure 15 correspond to models with albedos A = 0.3 and 0.1, respectively. The image of the A = 0.3 model, when blurred, is only marginally different from the black hole image (top middle panel), while the blurred A = 0.1 model is indistinguishable from the black hole image.
The implication of these test images is that models in which Sgr A * has a reflecting surface with perfect albedo, A = 1, could potentially be distinguished by the EHT 2017 observations, but models with only partial albedo, e.g., A = 0.3 and 0.1, are harder to distinguish from the case of a black hole. Interestingly, in the latter models, a fraction (1 − A) of the radiation that falls on the surface must be absorbed and will presumably be reradiated as part of the thermalized emission discussed in Section 4.1. For any value of $(1 - A) \gtrsim 0.1$, this thermally reprocessed emission will lie well above the infrared limits discussed in Section 4.1.3 and shown in Figure 14. These models could thus be ruled out by that argument.
Note that several arbitrary choices were made in the above models: spin a * = 0, temperature ratios, R high = 20, R low = 1, observer inclination i = 60°, and surface radius R * = 2.5M. The values of R high and i were chosen to lie near the center of the corresponding ranges considered in Paper V. As it happens, for spin a * = 0, GRMHD-based models with these parameter values are fairly consistent with observations (see Paper V). Varying the parameters will certainly affect the predictions for the effect of surface reflection. The results may not change excessively since we pin the 230 GHz flux to 2.4 Jy. Nevertheless, we caution that the results presented in Section 4.2.2 below are for a preliminary toy model and are in the nature of a proof of concept. More detailed investigations are needed before we can draw firm conclusions.
An additional caveat is that, in this toy model, we have taken the flow solution to be the same as in a simulation that was run with a black hole event horizon at the center (Narayan et al. 2021). We simply truncated that solution at R = R * . As mentioned in Section 4.1.4, the problem of self-consistently solving the gasdynamics and radiation field for a supermassive object with a surface has not yet been studied.
Another caveat is that we have considered only the case of specular reflection. Diffuse reflection, where radiation incident on the surface is reflected isotropically (or with a more complicated angular distribution), is also worth exploring. In that case, the surface reflected intensity will not be restricted to a few narrow features in the image, but will be spread more uniformly over the entire shadow region. This would eliminate any truly dark regions in the center of the image, conceivably making it easier to constrain such models.
Additionally, we considered a time-averaged steady image, whereas in reality we expect the image to fluctuate, which can lead to interesting time correlations of features. In addition, we have focused here on a spherically symmetric spacetime around a nonspinning object. Once we allow the central object to rotate, we will need to solve for the corresponding spacetime, which in general will not belong to the Kerr family of solutions.

Figure 15 (caption, in part). Comparison with the top right image shows that the shadow region, which is prominent in the upper image (black hole case), is largely filled in by surface reflection. This change is potentially distinguishable with EHT observations. Third row: similar to the second row, but for albedo A = 0.3. Compared to the A = 1 model, in this case the difference from the black hole image due to the presence of a surface is only marginally detectable. Bottom row: model with surface albedo A = 0.1. Here the image is indistinguishable from the black hole image (top row) at the resolution and sensitivity of the EHT 2017 data.

Figures 13, 14, and 17 in Paper III show a range of images of Sgr A * obtained by applying different image reconstruction techniques to the EHT 2017 data. The vast majority of images show a ring-like morphology with a pronounced dark shadow region at the center. These images are visibly different from the synthetic blurred image shown in the middle panel of the second row in our Figure 15. We can thus exclude this particular model using EHT observations.

Constraints from EHT Images
The right four panels in Figure 15 show image reconstructions of the four synthetic models shown in the left panels using one of the image reconstruction methods described in Paper III. For each model, synthetic visibility data were generated with the same (u, v)-coverage as in the April 7 EHT observations of Sgr A * , and the appropriate amount of noise was added to match the noise present in the real data. These synthetic visibilities were then analyzed using the SMILI top set (see Paper III for details), and the resulting images are shown in the panels on the right in Figure 15. The SMILI reconstructions are fairly similar to the 15 μas blurred images shown in the corresponding central panels, though the SMILI images appear to be slightly more blurred. More interestingly, the SMILI image of the A = 1 reflecting surface model (second row, right panel) is quite different from the reconstruction of the black hole model (top row, right panel). This confirms our expectation that a surface that reflects infalling radiation with 100% efficiency can potentially be ruled out by EHT 2017 observations of Sgr A * (modulo the many caveats mentioned in Section 4.2.1). In the case of the A = 0.3 model, and especially the A = 0.1 model, the SMILI reconstructions do not differ much from the black hole image; hence, it would be hard to distinguish such models using the current EHT data.
Another way of comparing models is to measure the brightness depression in the shadow region of the image. For instance, Table 13 in Paper III presents estimates of a parameter $f_c$, which measures the ratio of the brightness at the center of the ring image to the mean brightness around the ring. This quantity is a measure of the brightness depression in the central shadow region of the image. From Figure 22 in Paper III, the estimate based on SMILI is $f_c \approx 0.2$, with not much probability that $f_c > 0.3$. This implies that the image intensity in the shadow region in Sgr A * is very likely <30% of the mean intensity around the ring (see footnote 155). Such a degree of flux depression is inconsistent with the blurred synthetic image and SMILI reconstructed image in the second row of Figure 15. That particular model could be potentially ruled out. However, the dynamic range of images based on the current EHT 2017 data is too low, and its angular resolution is too poor, to constrain a weakly reflecting surface in Sgr A * , such as the models in the third and fourth rows of Figure 15.
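The brightness-depression ratio described above can be illustrated on a toy image. The sketch below follows only the verbal definition given here (central brightness divided by the mean brightness around the ring); the exact estimator used in Paper III may differ, and the toy ring parameters, pixel scale, and function names are all assumptions.

```python
import numpy as np


def toy_ring_image(n: int, pixel_uas: float, ring_radius_uas: float,
                   width_uas: float, floor: float) -> np.ndarray:
    """Gaussian ring of unit peak brightness on top of a uniform floor."""
    y, x = np.indices((n, n))
    c = (n - 1) / 2.0
    r = np.hypot(x - c, y - c) * pixel_uas
    return floor + np.exp(-0.5 * ((r - ring_radius_uas) / width_uas) ** 2)


def central_brightness_ratio(image: np.ndarray, pixel_uas: float,
                             ring_radius_uas: float,
                             n_angles: int = 360) -> float:
    """Brightness at the image center over the mean brightness on the ring."""
    ny, nx = image.shape
    cy, cx = (ny - 1) / 2.0, (nx - 1) / 2.0
    angles = np.linspace(0.0, 2.0 * np.pi, n_angles, endpoint=False)
    r_pix = ring_radius_uas / pixel_uas
    # Nearest-pixel sampling along a circle of the given radius.
    iy = np.rint(cy + r_pix * np.sin(angles)).astype(int)
    ix = np.rint(cx + r_pix * np.cos(angles)).astype(int)
    ring_mean = image[iy, ix].mean()
    center = image[int(round(cy)), int(round(cx))]
    return float(center / ring_mean)


if __name__ == "__main__":
    img = toy_ring_image(n=101, pixel_uas=1.0, ring_radius_uas=25.0,
                         width_uas=5.0, floor=0.2)
    print("f_c (toy image):", central_brightness_ratio(img, 1.0, 25.0))
```

In this toy setup the floor level controls the ratio: filling in the central region, as a reflecting surface would, raises the value toward unity.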
An array with more stations and with larger bandwidth (e.g., the proposed Next Generation Event Horizon Telescope; Doeleman et al. 2019; Raymond et al. 2021) would solve the sensitivity problem. However, to improve the angular resolution substantially, it will be necessary to expand the telescope array into space by sending one or more radio dishes into large-radius orbits around Earth (Palumbo et al. 2019; Fish et al. 2020; Roelofs et al. 2020a; Fromm et al. 2021; Gurvits et al. 2021; Kudriashov et al. 2021). With such an expanded array, we might be able to observe images with sufficient angular resolution to see some of the details revealed in the various panels in the left column of Figure 15 and to check for the presence of a surface in Sgr A * . Furthermore, if a surface is present, we might be able to determine the albedo and the surface radius.

Chael et al. (2021) show that, with better sensitivity and angular resolution than the present EHT is able to provide, the observed image of a hot accretion flow around a supermassive black hole could be used to delineate the inner edge of the accretion disk. Their interest is to use this technique to identify the edge of the event horizon. However, their method could equally well be used to measure the radius of a potential surface. This might provide a direct estimate of R * .

Additionally, it may be possible to observe time-delayed echoes from a reflecting surface. In the case of time-variable emission from the accretion flow, e.g., submillimeter or infrared flares, the observer would see both the primary signal from the emitting gas element and a delayed reflected copy of the same radiation. This is the electromagnetic analog of gravitational-wave echoes that have been searched for in LIGO/Virgo observations of merging stellar-mass black holes (Abedi et al. 2017; Westerweck et al. 2018). However, here it could be done with spatially resolved images, with all the rich detail they can provide.
We do not pursue electromagnetic echoes further, but we note that their presence could potentially be explored already using existing image-integrated lightcurve data.

Surfaceless Horizonless Compact Objects
As examples of horizonless compact objects without surfaces, we consider black hole mimickers such as (mini-)boson stars, for which synthetic images from covariant MHD simulations of radiatively inefficient accretion flows have recently been obtained in Olivares et al. (2020). It was shown there that a region with a central brightness depression could appear in the final observed image of an unstable boson star (model A there), despite the absence of an unstable photon orbit in that spacetime. These features are the result of an effective low-density region that appears in the center of the spacetime owing to a centrifugal barrier. The unstable boson star has a significantly smaller intrinsic source size than corresponding constraints of Sgr A * , thereby ruling it out as a candidate alternative. Similarly, for the stable boson star configuration considered in Olivares et al. (2020), there is a complete absence of a central brightness depression with the inner image being extremely bright, akin to a radiating surface (see, e.g., Fromm et al. 2021). Given the EHT constraints, a mini-boson star becomes an unlikely candidate as a black hole mimicker to describe Sgr A * , since their image morphologies are generally too compact and lack both a characteristic ringlike feature and a central brightness depression. However, a more extensive study of boson star spin, compactness, and astrophysical setup should be considered to make this argument conclusive (e.g., Vincent et al. 2021).

Footnote 155: The upper limit on $f_c$ is significantly lower when estimated via THEMIS (see Table 13 in Paper III) or by the modeling methods described in Paper IV. To be consistent with Figure 15, here we focus on only the SMILI results.
While the size of the bright emission ring/central brightness depression will be necessary to rule out or constrain the black hole and non-black-hole models considered in Section 5 below, the very presence of a central brightness depression in the 2017 image of Sgr A * (see Paper III; Paper IV; Paper V) is sufficient to rule out various models for compact objects that do not possess photon spheres. For example, these observations rule out the possibility that Sgr A * is a nonspinning Joshi-Malafarina-Narayan-2 (Joshi et al. 2014; JMN-2) naked singularity since these exotic compact objects do not cast shadows (Shaikh et al. 2019). We note that the JMN-2 spacetime is an exact solution of the Tolman-Oppenheimer-Volkoff equations of general relativity, it can form as the nonempty end state of the gravitational collapse of a (nonthermal) perfect fluid from regular Cauchy data, and photons in the spacetime move on null geodesics of the metric tensor (see also the associated discussion in Section 5.2). We assume that the naked singularity present at r = 0 does not interact with matter or radiation (classical gravity).

Metric Tests from Shadow Size
In Section 3.5, we used the prior information on the mass-to-distance ratio of the Sgr A * black hole to calculate the predicted size of its shadow and compared the result to the size inferred from the EHT images and visibility-domain model fitting. We based this prediction on the Kerr metric and found that there is no evidence for any violations of the theory of general relativity. Our goal in this section is to use these bounds on plausible deviations in the shadow size that are still consistent with the imaging data in order to place constraints on deviations of the parameters of the underlying black hole metric.
We will follow two complementary approaches. First, in Section 5.1, we will constrain the parameters of stationary metrics that are agnostic to the underlying physical theory. These have been designed in a way that they reduce to the Kerr metric, when the deviation parameters are set to zero, but remain free of pathologies for a wide range of parameter values (Psaltis et al. 2020a). Although these metrics do not arise from any particular modification to gravity, they allow us to explore phenomenologically a very broad range of possibilities, which can be mapped afterward to parameters of a fundamental theory. Second, in Section 5.2, we will constrain the parameters of stationary metrics that are generated by various matter distributions and/or those that arise from specific modifications to the theory of gravity, and which depend on additional generalized charges (Kocherlakota et al. 2021). Although the latter represent only particular types of deviations from general relativity, they allow us to translate directly the constraints from the EHT images to bounds on physical parameters. Finally, in Section 5.3, we will compare the constraints on the various metrics in terms of their asymptotic post-Newtonian parameters in order to demonstrate that, fundamentally, the bounds imposed by the EHT imaging observations of Sgr A * depend weakly on the particular metrics used to describe deviations from Kerr. Throughout this section, for clarity of presentation, we will use the bounds (VLTI) on the fractional deviation inferred from the predicted shadow size, as calculated for the fiducial analyses that use the Keck and VLTI priors on the mass-to-distance ratio, the eht-imaging imaging method, and the GRMHD library for quantifying the theoretical uncertainties. Where it is not possible to show both bounds, we will show the Keck bound as an example. This constraint depends weakly on the choice of priors and techniques.
In particular, the fiducial bounds correspond to the following constraints on shadow size: where d metric is the median shadow diameter, i.e., the locus of critical impact parameters.
Throughout this section we will not consider constraints on the circularity of the shadow. This is due to the sparse interferometric coverage of 2017 observations, which may lead to significant uncertainties in circularity measurements that we do not quantify here. In addition to measurement uncertainty, we also do not quantify the theoretical uncertainty between the circularity of the shadow and that of the observational feature. However, in future EHT observations with additional telescopes the circularity of the shadow may potentially be used for constraints on deviations from the Kerr metric.

Constraints on Metrics with Parametric Deviations
According to the no-hair theorem in general relativity, the only stationary, asymptotically flat, Ricci-flat spacetime that is free of pathologies (see footnote 156) is the one described by the Kerr metric. We do not consider here the astrophysically irrelevant case of black holes with a net electric charge (see Section 5.2). As a result, introducing simple phenomenological deviations to the Kerr metric leads to pathologies that severely constrain our ability to make predictions for the size of the black hole shadow, especially in the case of spinning black holes (see Johannsen 2013a for a detailed study of the pathologies of several parameterized metrics and Kocherlakota & Rezzolla 2022 for an analysis of theoretically allowed parameter spaces of the RZ metric). For this reason, several parameterized metrics have been developed in the past decade that allow for general deviations from the Kerr metric while minimizing pathologies, mostly by relaxing the assumption of Ricci flatness. These parameterized metrics are completely agnostic to an underlying physical theory; therefore, significant assumptions must be made for stability tests (see, e.g., Suvorov & Völkel 2021 for a quasi-normal-mode analysis of the RZ metric). Among these metrics, we choose three representative ones: the so-called JP metric (which was further developed to ensure the presence of a Carter-like integral of motion in Johannsen 2013b), the Modified Gravity Bumpy Kerr (MGBK) metric (Vigeland et al. 2011), and the so-called RZ metric (Rezzolla & Zhidenko 2014, which was further developed to include the effects of spin in Konoplya et al. 2016). We will derive analytic constraints for all three metrics and will use numerical calculations to derive spin-dependent constraints for both the JP and MGBK metrics.
Earlier studies have demonstrated that, because of a near cancellation between the effects of frame dragging and those of the quadrupole moment of the spacetime, the spin of the black hole affects the shadow size only marginally (see, e.g., Johannsen & Psaltis 2010; Psaltis et al. 2020a). Therefore, the bounds on the deviation parameters imposed by the measurement of the black hole shadow in Sgr A * are also expected to depend weakly on black hole spin. We demonstrate this in Figure 16, which shows the limits on different deviation parameters of the JP and MGBK metrics as a function of black hole spin, for various observer inclinations, and for different values of the secondary deviation parameters. The horizontal dashed curves show the bounds for nonspinning black holes when the secondary deviation parameters are zero. For this figure, we have set the parameters that affect the $g_{tt}$-component of the metric at the $r^{-2}$ order to zero (i.e., $\alpha_{12} = 0$ and $\gamma_{4,2} = -\frac{1}{2}\gamma_{1,2}$; see also Section 5.2) so we can focus on higher-order effects. The resulting constraints are of order unity and weakly dependent on spin, inclination angle, and the values of secondary deviation parameters.
The simulations used for this figure are described in Medeiros et al. (2020). In these simulations we assume that the geodesic equation holds for all metrics and solve for the trajectories of photons ignoring matter effects. We define the boundary of the black hole shadow as the critical impact parameter between photon trajectories that fall into the event horizon and those that escape to infinity. As was done in Section 3, we define the size of a shadow as its median radius and compare this measurement to the bounds on shadow size from the eht-imaging algorithm, the GRMHD simulation library, and the prior mass and distance measurements from Keck.
As found in Psaltis et al. (2020a), the measurement of the size of the black hole shadow places constraints of order unity primarily on parameters such as α 13 and γ 1,2 that depend weakly on the magnitude of the black hole spin. What is common between these parameters is that they describe deviations in the tt-component of the black hole metric, as expressed in areal coordinates.

Figure 16. Numerical constraints on various deviation parameters for the JP and MGBK metrics as a function of the dimensionless black hole spin placed by the requirement that the predicted size of the black hole shadow is consistent with the size inferred for Sgr A * . Here the shading shows the regions of the parameter space that are ruled out by the bound derived from the calibration based on the Keck mass measurement, the eht-imaging algorithm, and the GRMHD simulation library, as an example. The left (right) panels show constraints for the JP (MGBK) metrics, while the different symbols/colors in the top and bottom panels show the effect of the secondary deviation parameter and the inclination of the observer, respectively. In the top panels we set the inclination angle to i = 42°, and in the bottom panels we set the secondary deviation parameter to zero. In all panels the black dashed lines correspond to the analytic constraints for nonspinning black holes. The nonmonotonic behavior of the constraints is due to the fact that both the size and the shape of the shadows are affected by the deviation parameters. The measurement of the shadow size constrains primarily the parameters that quantify deviations in the tt-components of the various metrics, as expressed in areal coordinates, and the resulting bounds depend only weakly on black hole spin or observer inclination.
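Since spin has a relatively small effect on the predicted shadow size and, hence, on the metric constraints, we now focus on a more detailed exploration of the constraints on these three metrics when we set the spin to zero (i.e., in the limit of spherical symmetry). The radius of the shadow in this limit is given by (Psaltis et al. 2020a)

$$ r_{\rm sh} = \frac{r_{\rm ph}}{\sqrt{-g_{tt}(r_{\rm ph})}}, \qquad (15) $$

where the radius r ph of the photon orbit is the solution of

$$ r_{\rm ph} = \sqrt{-g_{tt}}\,\left(\frac{d\sqrt{-g_{tt}}}{dr_A}\right)^{-1}\Bigg|_{r_{\rm ph}} . $$

The tt-component of the nonspinning JP metric in areal coordinates is

$$ g_{tt}^{\rm JP} = -\left(1-\frac{2}{r_A}\right)\left[1+\sum_{i}\alpha_{1i}\left(\frac{1}{r_A}\right)^{i}\right]^{-2}, $$

where α 1i are deviation parameters and the subscript A denotes the fact that we use areal coordinates. The tt-component of the nonspinning MGBK metric in areal coordinates is (see Gair & Yunes 2011; Vigeland et al. 2011)

$$ g_{tt}^{\rm MGBK} = -\left(1-\frac{2}{r_A}\right)\left[1+\gamma_1(r_A)+2\gamma_4(r_A)\left(1-\frac{2}{r_A}\right)\right], $$

where γ 1 (r A ) and γ 4 (r A ) are defined by

$$ \gamma_k(r_A) = \sum_{n}\gamma_{k,n}\left(\frac{1}{r_A}\right)^{n}, \qquad k = 1, 4 . $$

Finally, the tt-component of the spherically symmetric RZ metric in areal coordinates is (see Rezzolla & Zhidenko 2014)

$$ g_{tt}^{\rm RZ} = -x\,A(x), \qquad x \equiv 1-\frac{r_0}{r_A}, $$

$$ A(x) = 1-\epsilon(1-x)+(a_0-\epsilon)(1-x)^2+\tilde{A}(x)(1-x)^3, \qquad \tilde{A}(x) = \cfrac{a_1}{1+\cfrac{a_2 x}{1+\cdots}}, $$

and r 0 is the coordinate radius of the infinite redshift surface (identified with the horizon if no pathologies exist). The parameters ϵ, a 1 , a 2 , … are the deviation parameters. As done in Psaltis et al. (2021), we write all radii in terms of the mass of the black hole at infinity, which fixes one of the parameters,

$$ \epsilon = -\left(1-\frac{2}{r_0}\right) . $$

For simplicity we assume r 0 = 2 throughout the rest of this section.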
Using these analytic expressions, we calculate the size of the black hole shadow as a function of the various deviation parameters using Equation (15). We then apply the bounds on the shadow size imposed by the Sgr A * images and obtain constraints on the deviation parameters of the various metrics. The dashed lines in Figure 16 compare the analytic bounds with those obtained numerically for the spinning spacetimes.
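The procedure described above can be sketched numerically as follows. This is a minimal illustration, not the pipeline used for the figures: it assumes the nonspinning JP form −g tt = (1 − 2/r A )(1 + α 13 /r A ³)⁻² with only α 13 nonzero, units G = c = M = 1, and the definition δ = r sh /√27 − 1 for the fractional deviation; all function names are hypothetical.

```python
import math

SCHWARZSCHILD_SHADOW = math.sqrt(27.0)  # shadow radius for alpha13 = 0, in units of M

def neg_gtt_jp(r, alpha13=0.0):
    """-g_tt of the ASSUMED nonspinning JP form at areal radius r (units of M)."""
    return (1.0 - 2.0 / r) / (1.0 + alpha13 / r**3) ** 2

def shadow_radius(neg_gtt, r_lo=2.2, r_hi=6.0, h=1e-6, iters=200):
    """Locate the photon orbit by bisection and return r_ph / sqrt(-g_tt(r_ph)).

    The photon orbit extremizes -g_tt / r^2, which is equivalent to the
    condition r_ph = sqrt(-g_tt) / (d sqrt(-g_tt) / dr).
    """
    def slope(r):
        # Central finite difference of d/dr [ -g_tt / r^2 ].
        f = lambda x: neg_gtt(x) / x**2
        return (f(r + h) - f(r - h)) / (2.0 * h)

    if slope(r_lo) <= 0.0 or slope(r_hi) >= 0.0:
        raise ValueError("bracket does not enclose a photon orbit")
    for _ in range(iters):
        r_mid = 0.5 * (r_lo + r_hi)
        if slope(r_mid) > 0.0:
            r_lo = r_mid
        else:
            r_hi = r_mid
    r_ph = 0.5 * (r_lo + r_hi)
    return r_ph / math.sqrt(neg_gtt(r_ph))

def fractional_deviation(alpha13):
    """delta = r_sh / sqrt(27) - 1 for the assumed JP form."""
    r_sh = shadow_radius(lambda r: neg_gtt_jp(r, alpha13))
    return r_sh / SCHWARZSCHILD_SHADOW - 1.0

if __name__ == "__main__":
    for alpha13 in (-1.0, 0.0, 1.0):
        print(alpha13, fractional_deviation(alpha13))
```

Scanning `fractional_deviation` over α 13 and keeping the values for which δ lies inside the measured bounds would give a one-parameter constraint of the kind shown by the dashed lines; the default bracket is only suitable for moderate values of the deviation parameter.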
Because we use only one measured quantity from the EHT image of Sgr A * , i.e., the size of the black hole shadow, but the tt-components of each metric depend on a series of deviation parameters, it follows that the EHT observations place, in fact, correlated constraints on these parameters. These can be thought of as subspaces in the multidimensional parameter space of deviations and are very difficult to visualize in full generality. In Figure 16, we showed one particular cross section of this parameter space in which the various parameters were combined such that deviations from Kerr appear only at the second or higher post-Newtonian order. Figure 17 shows a different cross section of these constraints, where we have varied only one parameter at a time, setting all others equal to zero. The constraints derived from the Keck bounds are also summarized in Table 3. The deviation parameters that correspond to higher-order corrections (as denoted by the second integer in their subscripts) affect the size of the shadow less strongly than the lower-order parameters and are, therefore, less constrained. We will return to the magnitudes of these constraints in Section 5.3, after we discuss the bounds on metrics that correspond to solutions of particular modified gravity theories.

Figure 17. Analytic constraints on deviation parameters for the JP, MGBK, and RZ metrics. The regions excluded by the Sgr A * constraints using Keck (VLTI) are shown in blue (magenta); see also Equations (12) and (13). For each curve we allow only one parameter to vary while setting the others to zero. The original g tt components of the metrics were used for this plot, not their expansions.

[…] Table 4 we use the bound derived from the calibration based on the Keck mass measurement, the eht-imaging algorithm, and the GRMHD simulation library, as an example. JP, MGBK, and RZ are parameterized metrics that deviate from Kerr (see Vigeland et al. 2011; Rezzolla & Zhidenko 2014 for details on these metrics).

Constraints on Specific Compact Object Spacetime Metrics
Detecting possible deviations from the Kerr metric using the agnostic approach discussed above can be used to infer constraints on multiple asymptotic expansion coefficients of the spacetime, test the no-hair theorem, assess the Ricci flatness of black hole metrics, etc. In this section, we follow a complementary approach to determining whether specific fundamental principles of the theory of gravity are violated by considering explicitly theories that incorporate such violations by design, finding (stationary) solutions to the associated field equations that describe supermassive compact objects such as Sgr A * , and determining whether their images, when undergoing similar accretion processes, are compatible with those observed with the EHT.
Adopting this approach helps us assess the necessity of including additional fundamental fields (such as dilatons or axions) in the description of the classical theory and yields quantitative constraints on the amount of buildup of various fields in the vicinity of supermassive compact objects (see, e.g., Kocherlakota et al. 2021). This could be informative about the astrophysical processes that may have produced them. Studying the images of available solutions also allows us to address questions related to the type of object that Sgr A * is, e.g., whether it is a naked singularity, a boson star, or a black hole. Finally, working with a specific theory enables a comparison of its predictions for a variety of other physical scenarios with already-existing or future observations, to assess its overall compatibility, as discussed in Section 6.
Alternative Black Holes.-As an example, the equivalence principle is a fundamental building block of the theory of gravity and has thus far been tested in various regimes by complementary experiments. It comprises three aspects (Dicke 2019; Will 2014): the weak equivalence principle (WEP), local Lorentz invariance (LLI), and local positional invariance (LPI). To demonstrate the scope of testing theories that violate the WEP and LPI with the EHT, we will consider here black hole solutions from two Einstein-Maxwell-dilaton-axion (EMda) theories (Gibbons & Maeda 1988; Garfinkle et al. 1991; Kallosh et al. 1992; Sen 1992; García et al. 1995; see also the discussion in, e.g., Magueijo 2003; Kocherlakota & Rezzolla 2020), which emerge as the low-energy effective descriptions of the heterotic string. This conservative choice allows us to be certain that (a) the form of the equations that describe the dynamics of accreting plasma flow around EMda black holes is identical to those in general relativity due to the minimal coupling of matter to Einstein-Hilbert gravity via the metric tensor, and (b) photons move on null geodesics of the metric tensor since electromagnetism is described by the (linear) Maxwell Lagrangian (see, e.g., Section 4.3 of Wald 1984).
Synthetic images of radiatively inefficient accretion flows onto Gibbons-Maeda-Garfinkle-Horowitz-Strominger black holes (Gibbons & Maeda 1988; Garfinkle et al. 1991), which describe charged, static black holes in one of the EMda theories (henceforth the EMd-1 for brevity), have been constructed in Mizuno et al. (2018), using MHD simulations (see also Figure 6). It was demonstrated there that the final images of these EMd-1 black holes are comparable to those of the Schwarzschild/Kerr black holes. More recently, properties of images of (Kerr-)Sen black holes (Sen 1992), which are the spinning generalizations of the EMd-1 black holes, have been calculated and characterized in Younsi et al. (2021), when undergoing accretion that is described by the semianalytic model of Özel et al. (2021; see also Figure 6). To compare the features of Sen black holes against their general relativistic counterparts, we consider the Reissner-Nordström (RN; Reissner 1916; Nordström 1918) and the Kerr-Newman (KN; Newman et al. 1965) solutions, which describe charged black holes without and with spin, respectively.
We also consider solutions arising from various attempts to regularize the central singularities of classical black holes within general relativity. In particular, we consider solutions by Bardeen (1968), Hayward (2006), and Frolov (2016).¹⁵⁷ Such solutions are typically nonempty and the matter present, in a stationary configuration, typically violates one or more energy conditions (Hawking & Ellis 1973; Curiel 2017). Studying the images of available solutions that can be used to model compact objects allows us to test for possible violations of components of the equivalence principle or of energy conditions. We also include here the static Kazakov & Solodukhin (1994; KS) solution, which attempts to smear out the central singularity onto a surface. We additionally consider the spinning counterparts of the Bardeen and Hayward metrics (Abdujabbarov et al. 2016). To conduct the analysis below, we have implicitly assumed that the "ordinary" matter in the accretion flow does not interact with the background matter in the nonempty spacetimes we consider here.

Figure 18 shows the dependence of the deviation of the shadow size from the Schwarzschild prediction on the parameters (the "generalized charges" or simply charges henceforth) of the various metrics discussed above (see also Section IV of Kocherlakota & Rezzolla 2020 for further details). In this figure, each relevant physical parameter has been normalized to its maximum theoretically allowed value (see also Table 5).¹⁵⁸ Similarly to the case of the Kerr and the parametric metrics, we find that the black hole spin introduces minor corrections to the size of the shadow, which allows us to focus on nonspinning spacetimes. Moreover, the current bounds imposed by the EHT images of Sgr A * place constraints of order unity on the charges of several of the spacetimes, which are comparable to their maximum values, by construction.
Figure 19 focuses on the constraints that we can set on the relevant parameter spaces of all the black holes from the two EMda theories considered here: the top panel shows the charged, nonspinning "EMd-2" solution from one theory (Kallosh et al. 1992). From the other theory, we show in the bottom panel the parameter space for the spinning Sen black hole (Sen 1992; the EMd-1 black hole corresponds to the a = 0 line). As can be seen from these figures, we find no evidence of violations of the equivalence principle or of the presence of energy-condition-violating matter within the present context.

¹⁵⁷ These spacetimes have also been obtained as solutions in other theories (see, e.g., Ayón-Beato & García 1998; Held et al. 2019).
¹⁵⁸ For the Kazakov-Solodukhin (KS; Kazakov & Solodukhin 1994) black hole, the theoretically permitted range of the relevant parameter is noncompact, a > 0, and we show the range 0 < a ≤ 2 in Figure 18.

Thus far we have considered in detail the possibility that Sgr A * is a supermassive black hole (described by different metrics), as well as the alternative that it possesses a material surface (Section 4). We have also considered the possibility that Sgr A * is a surfaceless horizonless compact object without a photon sphere, with focus on mini-boson stars and naked singularities, in Section 4.3. Here, using specific solutions, we will address whether the spacetime in its vicinity can be well modeled by that of a naked singularity with a photon sphere and, later on, by a wormhole. Since all of the naked singularity solutions we consider here arise from metric theories of gravity, with the electromagnetic sector being governed by the linear Maxwell Lagrangian, photons move on null geodesics of the metric tensor, as discussed above. Since we consider exact solutions to the classical theory, we assume that the background spacetimes are static and that the naked singularities at r = 0 do not interact with matter or radiation in any way.
Naked Singularities.-For an example of a naked singularity spacetime, we will consider the Reissner-Nordström metric (Reissner 1916; Nordström 1918), characterized by specific electromagnetic charges of q > 1, and denote it by RN * . These spacetimes admit photon spheres only for 1 < q ≤ √(9/8), and Figure 20 shows only this range, normalized to the maximum. The Janis-Newman-Winicour (JNW; Janis et al. 1968) naked singularity spacetime is a solution of the Einstein-Maxwell-scalar theory with a theoretically allowed scalar charge parameter range of 0 < ν < 1. However, the JNW naked singularities only cast shadows when 0 < ν ≤ 0.5, as indicated in Figure 20. We will also consider a new class of naked singularities within general relativity, namely, the Joshi-Malafarina-Narayan-1 (JMN-1; Joshi et al. 2011) naked singularities. This class of solutions describes a one-parameter (M 0 ) family of static spacetimes containing a compact region r A < r A,b ≡ 2M/M 0 filled by an anisotropic fluid, where M is the Arnowitt-Deser-Misner (ADM) mass of the spacetime. This spacetime can be attained at asymptotically late times as a result of gravitational collapse from regular initial data (Joshi et al. 2011; see also Section IV of Dey et al. 2019) and contains a photon sphere, contributed by the exterior Schwarzschild spacetime, when r A,b < 3M or equivalently when M 0 > 2/3.

Figure 18. The dependence of the fractional shadow diameter deviation from the Schwarzschild value on the relevant physical "charges" of various nonspinning black hole metrics (left) and on the black hole spins (right), for fixed values of the charges and for an observer inclination i = π/2. The white regions correspond to shadow sizes that are consistent at the 68% level with the 2017 EHT observations for Sgr A * . As in the case of the Kerr and the parametric metrics, the spin of the black hole introduces only minor corrections to the predicted shadow size and, hence, to the metric constraints. Current EHT imaging observations of Sgr A * are inconsistent with some metrics when their physical charges are comparable to their maximum theoretically allowed values. We use the median shadow diameter to characterize the size of the noncircular shadows cast by the spinning black holes (right), as done in Section 3.

Figure 19. We show here the constraints on two EMda solutions from different EMda theories. In the top panel we show the constraints on the parameter space of a nonspinning black hole from an EMda theory with two U(1) gauge fields (Kallosh et al. 1992), whereas in the bottom panel we show the constraints on the parameter space of a spinning Sen black hole from an EMda theory with a single U(1) gauge field (Sen 1992).

Figure 20. Same as Figure 18, but for metrics that describe various naked singularities (denoted by a star) and a wormhole.
Spherical Bondi-Michel accretion onto JMN-1 naked singularities has been studied in Shaikh et al. (2019), where it was found that the final images of JMN-1 naked singularities with photon spheres are indistinguishable from those of a Schwarzschild black hole. More strikingly, for the same approximate luminosity as Sgr A * at 200 GHz, accretion flows onto these singularities have spectra nearly identical to those of a Schwarzschild black hole (see Figure 6 therein), indicating that a JMN-1 naked singularity with a photon sphere may be one of the best possible black hole mimickers for Sgr A * (see Section 4). Figure 20 shows the bounds imposed by the EHT images of Sgr A * on the physical charges of these metrics that describe naked singularities. With the exception of the Reissner-Nordström metric, which predicts shadow sizes that are significantly smaller than what is observed, the possibility that Sgr A * is a naked singularity cannot be ruled out based on the metric tests we describe in this section.
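The behavior of the Reissner-Nordström case can be checked with a short calculation. The sketch below assumes the standard form −g tt = 1 − 2/r A + q²/r A ² in units G = c = M = 1, for which circular photon orbits satisfy r² − 3r + 2q² = 0, and it takes δ = r sh /√27 − 1 as the fractional deviation; the function names are hypothetical.

```python
import math

SCHWARZSCHILD_SHADOW = math.sqrt(27.0)  # units of M

def rn_photon_sphere(q):
    """Outer (unstable) circular photon orbit of Reissner-Nordstrom, or None.

    Roots of r^2 - 3 r + 2 q^2 = 0 exist only when 9 - 8 q^2 >= 0,
    i.e. for q <= sqrt(9/8).
    """
    disc = 9.0 - 8.0 * q * q
    if disc < 0.0:
        return None
    return 0.5 * (3.0 + math.sqrt(disc))

def rn_shadow_deviation(q):
    """Fractional deviation of the shadow radius from sqrt(27) M, or None."""
    r_ph = rn_photon_sphere(q)
    if r_ph is None:
        return None
    neg_gtt = 1.0 - 2.0 / r_ph + (q * q) / r_ph**2
    return r_ph / math.sqrt(neg_gtt) / SCHWARZSCHILD_SHADOW - 1.0

if __name__ == "__main__":
    # q <= 1: black hole; 1 < q <= sqrt(9/8): naked singularity with a photon sphere.
    for q in (0.0, 0.5, 1.0, 1.05, 1.1):
        print(q, rn_photon_sphere(q), rn_shadow_deviation(q))
```

Under these assumptions the extremal case q = 1 already gives r sh = 4M, so every RN * configuration with a photon sphere has a shadow more than 20% smaller than the Schwarzschild value, in line with the statement that these shadows are significantly smaller than what is observed.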
Wormholes.-As an example of a wormhole, we consider nonspinning, traversable Morris-Thorne (MT) metrics in general relativity, for which the tt-components of the metric are determined by the "redshift function" Φ as g tt = − exp(2Φ). For wormholes in general relativity to be traversable, the spacetime must necessarily contain energy-condition-violating matter and lack event horizons. The location(s) of the circular null geodesic(s) in this spacetime can be obtained by solving − 1 + r A dΦ/dr A = 0. If we restrict to the simplest case of an MT wormhole with a single unstable circular null geodesic,¹⁵⁹ which can then be identified as the location of the photon sphere, e.g., by setting Φ = − r A,t /r A as in Bambi (2013), we find that the (Keplerian/ADM) mass definition forces r A,t = M. This implies a shadow radius of r sh = e M ≈ 2.72 M, or equivalently δ ≈ − 0.48, which is immediately ruled out by the present considerations (see Figure 20).
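The numbers quoted for the MT wormhole follow from a short calculation, sketched here for the choice Φ = − r A,t /r A in units G = c = 1. The shadow radius is taken to be r ph /√(−g tt (r ph )), as for the other static spherically symmetric metrics, and the definition δ = r sh /(√27 M) − 1 is an assumption about the normalization used in the figures.

```latex
\begin{align}
-1 + r_A \frac{d\Phi}{dr_A} = 0, \quad \Phi = -\frac{r_{A,t}}{r_A}
  &\;\Rightarrow\; r_{\rm ph} = r_{A,t}, \\
-g_{tt} = e^{2\Phi} \simeq 1 - \frac{2\, r_{A,t}}{r_A} \quad (r_A \to \infty)
  &\;\Rightarrow\; M = r_{A,t}, \\
r_{\rm sh} = \frac{r_{\rm ph}}{\sqrt{-g_{tt}(r_{\rm ph})}}
  = \frac{r_{A,t}}{e^{-1}}
  &= e\,M \approx 2.72\,M, \\
\delta = \frac{r_{\rm sh}}{\sqrt{27}\,M} - 1
  = \frac{e}{\sqrt{27}} - 1
  &\approx -0.48 .
\end{align}
```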
Finally, as noted above in Section 4, improved angular resolution with space-VLBI would greatly help constrain possible metric deviations from the Kerr geometry. Notably, it would be possible to infer the spin of Sgr A * , when modeled as a Kerr black hole, if we are able to achieve a precision of |δ| < 0.07 (Johannsen & Psaltis 2010; see also Figure 18). Shadows of spinning MT wormholes (Teo 1998) have recently been considered in Shaikh (2018), where it was shown that their shape can vary considerably from that of a Kerr black hole and possibly be detected with future EHT or ngEHT measurements. Spacetimes admitting multiple circular null geodesics were considered by Wielgus et al. (2020), an example of which is given by black holes in a nonminimal Einstein-Maxwell-scalar theory (Gan et al. 2021). The presence of a persistent multi-ring structure in an EHT image would constitute a robust topological discriminant of this family of spacetimes, particularly with future observations at higher resolution and flux sensitivity.

Comparisons between Metric Constraints
In the discussion above, we used the inferred size of the black hole shadow in Sgr A * in order to place constraints on parameters of metrics that deviate from Kerr. For the metrics that are solutions to particular modifications to general relativity, these parameters (or "charges") correspond to particular properties of the theory or of the black hole itself. For the parameterized metrics, these parameters are phenomenological coefficients that are agnostic to any particular aspect of the underlying theory. Even though it might appear that these bounds are specific to the particular metric used, we will show here that they describe deviations from Kerr that are mathematically very similar to each other and nearly independent of the characteristics of the metric used.
First, as Figures 16 and 18 show, the constraints imposed by the measurement of the size of a black hole shadow depend weakly on the spin of the black hole, for all the metrics explored. As discussed in Section 5.1, for metrics with zero spin, it can be shown analytically that the measurements lead to constraints only on the parameters that enter the tt-component of the metric in areal coordinates. The consequence of these two statements is that, for all spins, the primary constraints imposed by the measurement of a shadow size will be only on one of the metric components, largely independent of the other metric details.
Translating directly the bounds on the parameters of one metric to those of another is nontrivial because of the usual coordinate and gauge ambiguities that are inherent to relativistic spacetimes. However, one avenue of making this comparison is by exploring the asymptotic behavior of these metrics toward radial infinity. In particular, we write the tt-component of each metric as a post-Newtonian expansion in powers of 1/r A and connect without ambiguity the post-Newtonian coefficients of each metric to each particular parameter. We then translate the bounds of the parameters to constraints on these post-Newtonian coefficients and compare the results obtained with different metrics. We emphasize that we do not use these post-Newtonian expansions in order to calculate black hole shadows, which would have been inappropriate given that the size of the shadow is comparable to the horizon radius. Instead, we calculate shadow sizes and place constraints on the particular parameters of each of the metrics that has been developed specifically for use in the strong-field regime. We only use the post-Newtonian coefficients as a mechanism to compare the asymptotic behavior of these metrics. Because each post-Newtonian coefficient at a given order N is proportional to the derivative of the metric coefficient with respect to 1/r A at order N + 1, comparing post-Newtonian coefficients is equivalent to comparing the detailed functional forms of the metrics.
Tables 4 and 5 summarize the post-Newtonian coefficients at the various orders for the different metrics used in the previous sections (see, e.g., Psaltis et al. 2021). Using this correspondence between metric parameters and post-Newtonian coefficients, we show in Figure 21 the fractional deviation δ of the shadow size from the Schwarzschild prediction but plotted against the equivalent post-Newtonian coefficients for each metric, at the first and second order.
As before, we show only two particular cross sections of the multidimensional parameter space for which the metric parameters were chosen such that only one of the first two post-Newtonian deviation coefficients has a nonzero value. As an example, for the JP metric we set α 13 = 2α 12 for the κ 1 plot in Figure 21 and all deviation parameters other than α 12 and α 13 to zero. This forces the κ 2 term to be zero for this metric but does not set higher-order terms to zero. Nevertheless, the influence on the shadow size of the higher-order terms for this metric decreases quite rapidly (see also Equations (29)-(33) of Psaltis et al. 2021). For the κ 2 plot for the JP metric we set α 12 = 0 to force κ 1 = 0 and allow only α 13 to be nonzero. For the MGBK metric we set γ 4,2 = − γ 1,2 /4 and set all parameters other than γ 1,2 and γ 4,2 to zero for the κ 1 plot. For the κ 2 plot we set γ 4,2 = − γ 1,2 /2 and all other parameters to zero. For the RZ metric we set r 0 = 2, which in turn sets ϵ = 0 as discussed in Section 5.1, and set a 0 = a 1 for the κ 1 plot and only a 1 to be nonzero for the κ 2 plot. For the metrics discussed in Section 5.2 we include only metrics for which it is possible to allow only one of the first two PPN parameters to be nonzero. The full tt-components of the metrics are used for these calculations, not their expansions.
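The parameter choices above can be checked by expanding −g tt in powers of u = 1/r A . The sketch below is illustrative only: it assumes the nonspinning JP form −g tt = (1 − 2u)(1 + α 12 u² + α 13 u³)⁻² and the RZ form −g tt = (1 − r 0 u)(1 + a 0 r 0 ² u² + a 1 r 0 ³ u³) with ϵ = 0 and a 2 = 0, and it identifies κ 1 and κ 2 with the u² and u³ coefficients only up to the normalization adopted in the paper, which is not reproduced here. All helper names are hypothetical.

```python
ORDER = 5  # keep terms through u**ORDER, with u = 1 / r_A

def pad(coeffs):
    """Return a coefficient list [c0, c1, ...] of length ORDER + 1."""
    return (list(coeffs) + [0.0] * (ORDER + 1))[: ORDER + 1]

def mul(a, b):
    """Product of two truncated power series in u."""
    out = [0.0] * (ORDER + 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            if i + j <= ORDER:
                out[i + j] += ai * bj
    return out

def inv(a):
    """Reciprocal of a truncated power series with a[0] != 0."""
    out = [0.0] * (ORDER + 1)
    out[0] = 1.0 / a[0]
    for n in range(1, ORDER + 1):
        s = sum(a[k] * out[n - k] for k in range(1, n + 1))
        out[n] = -s / a[0]
    return out

def jp_neg_gtt(alpha12, alpha13):
    """Series of -g_tt for the ASSUMED nonspinning JP form."""
    a1 = pad([1.0, 0.0, alpha12, alpha13])
    return mul(pad([1.0, -2.0]), inv(mul(a1, a1)))

def rz_neg_gtt(a0, a1, r0=2.0):
    """Series of -g_tt for the ASSUMED RZ form with epsilon = 0 and a2 = 0."""
    big_a = pad([1.0, 0.0, a0 * r0**2, a1 * r0**3])
    return mul(pad([1.0, -r0]), big_a)

if __name__ == "__main__":
    print("JP, alpha13 = 2 alpha12:", jp_neg_gtt(0.3, 0.6)[:4])
    print("JP, alpha12 = 0        :", jp_neg_gtt(0.0, 0.5)[:4])
    print("RZ, a0 = a1            :", rz_neg_gtt(0.1, 0.1)[:4])
```

With these assumed forms, the u³ coefficient vanishes for α 13 = 2α 12 and for a 0 = a 1 , while the u² coefficient vanishes for α 12 = 0, which is the pattern of cancellations described in the text.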
These figures demonstrate that the inferred size of the Sgr A * black hole shadow places bounds of order ∼1 and ∼5 on the first and second post-Newtonian coefficients of the underlying metric, with the specific values showing a weak dependence on the particular metric used to obtain these constraints. Constraints on higher-order PN components would be factors of a few less stringent at each increasing order (see Psaltis et al. 2021 for details).

The Gravitational Field Probed by the Image of Sgr A *
There exist a number of key qualitative differences between the aspects of the theory of gravity probed by various tests of general relativity. For example, as Figure 22 illustrates, some of the tests are sensitive primarily to the dynamics and propagating modes of the gravitational fields, as is the case with pulsar timing, gravitational waves, and cosmology.¹⁶⁰ Other tests involve primarily measurements of photons in stationary spacetimes, as is the case of black hole images and stellar orbits. Some tests involve orbits of massive particles, while others involve the propagation of photons in relativistic spacetimes. Moreover, tests performed with neutron stars and in cosmological settings also probe the coupling of matter to the gravitational field, whereas black hole and most solar system tests are only sensitive to the properties of vacuum spacetimes.
Even within these qualitative distinctions, different tests probe vastly different regimes of gravitational potential and curvature, because of the large range of masses and length scales involved. This is illustrated in Figure 23, following Baker et al. (2015). The horizontal axis in this figure shows the gravitational potential probed by each test; in the case of a test at distance r from a Newtonian object of mass M, this dimensionless potential is equal to ϵ = GM/rc². The vertical axis shows the spacetime curvature probed by each test, defined as the square root of the Kretschmann scalar; for a test in the Schwarzschild spacetime of an object, this is equal to ξ = √48 GM/(r³c²) (see Baker et al. 2015). In order to highlight explicitly the fact that any test may probe a range of potentials and curvatures, we use straight lines to connect the smallest and largest potential and curvature that, in principle, may affect the outcome of each test. For example, in the case of the solar system test with the Cassini spacecraft (Bertotti et al. 2003), the Shapiro delay of radio signals was measured between Earth and the spacecraft, when the latter was between Jupiter and Saturn and as these signals grazed the surface of the Sun; this test, therefore, probes the entire range of gravitational potentials and curvatures from the solar surface to the location of the Cassini spacecraft. We also use dashed lines to connect regions of the parameter space that may affect the outcome of a test, but in a theory-specific manner. For example, in double-pulsar tests, the evolution of the orbital period caused by the emission of gravitational waves probes directly the potential and curvature at the orbital separation.
However, in numerous modifications of the theory of gravity, an enhanced rate of emission of gravitational waves becomes possible because of the coupling of neutron star matter to the gravitational field at the highest potential and curvature (see, e.g., Damour & Esposito-Farese 1993). In other words, depending on the particular modification of gravity that is being tested, the test involving the evolution of the binary period may probe the entire range of field strengths covered by the solid and dashed cyan lines in Figure 23.
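To give a sense of the scales involved, the two quantities plotted in Figure 23 can be evaluated directly from the expressions ϵ = GM/rc² and ξ = √48 GM/(r³c²). The sketch below is illustrative: the physical constants are rounded, the Sgr A * mass of 4 × 10⁶ solar masses is an assumed round value rather than the measurement adopted in this series, and the function names are hypothetical.

```python
import math

G = 6.674e-11      # gravitational constant, m^3 kg^-1 s^-2 (rounded)
C = 2.998e8        # speed of light, m s^-1 (rounded)
M_SUN = 1.989e30   # solar mass, kg (rounded)
R_SUN = 6.957e8    # solar radius, m (rounded)

def potential(mass_kg, r_m):
    """Dimensionless Newtonian potential, epsilon = G M / (r c^2)."""
    return G * mass_kg / (r_m * C**2)

def curvature(mass_kg, r_m):
    """Square root of the Schwarzschild Kretschmann scalar, in m^-2."""
    return math.sqrt(48.0) * G * mass_kg / (r_m**3 * C**2)

if __name__ == "__main__":
    m_sgra = 4.0e6 * M_SUN                 # ASSUMED illustrative mass
    r_photon = 3.0 * G * m_sgra / C**2     # Schwarzschild photon orbit
    print("Sgr A* photon orbit:", potential(m_sgra, r_photon), curvature(m_sgra, r_photon))
    print("Solar surface      :", potential(M_SUN, R_SUN), curvature(M_SUN, R_SUN))
```

At the photon orbit the potential is 1/3 regardless of mass, whereas the curvature scales as 1/M², which is why black holes of different masses occupy the same range of potentials but very different curvatures in such a diagram.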
As we will discuss below, these qualitative and quantitative differences between tests of gravity complicate our ability to cross-compare and combine their results. However, these same differences also allow us to leverage the broad range of conditions that various tests probe in order to draw conclusions about the theory of gravity that could not have been reached by any test individually. For example, one of the key predictions of general relativity is that the spacetime properties of black holes scale with their mass. This is a prediction that we can test by comparing the results of gravitational-wave tests that probe stellar-mass black holes to those of the imaging tests that probe supermassive black holes. At the same time, general relativity predicts that, according to Birkhoff's theorem, the external spacetime of a slowly spinning object is independent of its internal structure and composition. We will test this prediction by comparing the results of black hole tests to those that involve pulsars or the Sun.

Comparing Gravitational Tests across Scales
Because of their qualitative differences, every type of test of general relativity is performed with a unique theoretical framework that is optimal for the system under study. Solar system tests use PPN expansions (Will 2014), pulsar tests use post-Keplerian parameterizations and also a strong-field equivalent of the PPN formulation (see, e.g., Wex & Kramer 2020), shadow tests use parametric post-Kerr metrics (Vigeland et al. 2011; Johannsen 2013b; Rezzolla & Zhidenko 2014; Konoplya et al. 2016), gravitational-wave tests with inspirals use post-Newtonian (Khan et al. 2016), effective-one-body (Buonanno & Damour 1999), and parameterized post-Einstein (ppE) frameworks (Yunes & Pretorius 2009), etc. Unfortunately, this plurality of methods restricts our ability to combine and leverage the results of different tests since, in many cases, the parameters of each framework are not directly related to each other.

[Table entries not recoverable; surviving cross-references: Figure 18, Figure 19.]
Note. We denote by crosses and ellipses spacetimes that are entirely ruled out and that are unaffected by the EHT measurements, respectively.

Figure 21. Constraints on the post-Newtonian coefficients κ 1 (top panel) and κ 2 (bottom panel) of the Sgr A * metric imposed by the EHT images. We include three parameterized metrics, as well as several metrics that are known solutions to particular modified gravity theories. For each curve, we only allow one of the first two post-Newtonian coefficients to vary and set the others to zero. The bounds on the post-Newtonian coefficients depend weakly on the specific properties of the metric used to obtain them.
In principle, there are two ways we can combine tests across different scales and systems. In one approach, we can use a particular class of theories (e.g., scalar-tensor gravity) and compare the constraints on the parameters of that class from different tests. This is the most direct approach and typically requires no additional assumptions. However, it is limited to the particular alternative to general relativity that is described by the theory under study. In a second approach, as a practical solution, we can make simplifying assumptions (e.g., that the dynamics of the theory are the same as in general relativity but the stationary spacetimes are not) and constrain phenomenological parameters of the metrics of the objects involved.
In order to make the latter approach independent of coordinate systems or gauges, and focusing here on tests of stationary metrics, we often convert the bounds on the parameters of a particular framework to constraints on the effective post-Newtonian parameters of the metrics of the objects involved. Within the particular assumptions inherent to each test, this approach is formally correct, even if one uses tests in the strong-field regime, as long as the framework used to obtain these constraints is itself applicable in that regime. Moreover, in doing so, there is no implicit assumption that the derived post-Newtonian parameters are universal constants. Indeed, in most modifications to general relativity, the values of these parameters are specific to the situation under consideration, as they may depend on the strength of the gravitational field (curvature or potential) probed, the nature of the compact object (binary or not, with matter or pure vacuum, etc.), and the boundary conditions (the coupling of matter to the field at the center of the system, the asymptotic cosmological boundary conditions, the cosmic time of the test, etc.). This is the reason why it is important to measure potential deviations of such parameters in different astrophysical and cosmological settings that span a wide range of masses and physical conditions, as shown in Figure 23.

Tests with the S2 Orbit
Sgr A * is unique in enabling us to probe the metric of the same black hole both at horizon scales, with the EHT images reported here, and at larger distances, with the orbits of S stars. In performing the imaging test in Section 3.5, we have already used the measurement of the mass-to-distance ratio for the black hole that was obtained through monitoring the S-star orbits. In spacetime terms, this Keplerian mass is simply the coefficient of the asymptotic, Newtonian expansion of the metric. However, recent measurements of relativistic effects in the stellar orbits resulted in constraints on the metric properties beyond the Newtonian regime, which we explore here.
The motions of several S stars have been monitored for almost three decades with adaptive optics instruments on VLT and Keck, and their orbits have been well determined (see, e.g., Ghez et al. 2008;Gillessen et al. 2009b). The detection of gravitational redshift (Gravity Collaboration et al. 2018a;Do et al. 2019) and of the precession of the periapsis (Gravity Collaboration et al. 2020a) in the S0-2 orbit has led to tests of the equivalence principle and of the Schwarzschild metric (Section 2; see also Hees et al. 2017;Amorim et al. 2019). Because in the gravitational test we report here with the Sgr A * images we explicitly assume the validity of the equivalence principle and only test the metric, we will focus on the connection of the imaging to the post-Newtonian tests of the metric using the precession of the S0-2 orbit.
In Gravity Collaboration et al. (2020a), the measured rate of precession of the S0-2 orbit was quantified through a phenomenological parameter $f_{\rm SP}$, such that the precession per orbit at the first post-Newtonian order can be written as

$$\delta\phi = f_{\rm SP}\,\frac{6\pi G M}{a(1-e^2)c^2}, \tag{25}$$

where a and e are the orbital separation and eccentricity of the orbit, respectively. The best-fit value for $f_{\rm SP}$ was found to be consistent with the predictions of the Schwarzschild metric, i.e., $f_{\rm SP} = 1.1 \pm 0.19$. In the PPN formalism, the phenomenological parameter $f_{\rm SP}$ is related to two of the first-order post-Newtonian parameters of the metric via (Will 2014)

$$f_{\rm SP} = \frac{1}{3}\left(2 + 2\gamma_{\rm S0\text{-}2} - \beta_{\rm S0\text{-}2}\right). \tag{26}$$

Figure 23. A parameter space of tests of gravity with astrophysical and cosmological systems (after Baker et al. 2015). For the solar system and pulsar tests, the straight lines connect the range of gravitational fields that could affect, in principle, the outcome of each test, from the location of the outermost probe to the location of the central massive object; the dashed region of the cyan line indicates that the connection to the largest curvatures is theory specific. The green lines connect the range of gravitational fields probed by two gravitational-wave tests with black hole inspirals. Filled areas show the typical range of gravitational fields probed by cosmological (orange), gravitational-wave (green), and black hole (magenta) imaging tests. Even though different tests explore, in principle, different aspects of the gravitational theory, as Figure 22 illustrates, they also probe vastly different scales. In particular, the horizon-scale images of Sgr A * that we report here probe a previously unexplored region of this parameter space of gravitational physics tests.
Here the subscripts explicitly denote the fact that these parameters are not universal constants but are specific to the metric of Sgr A * as measured at the location of the S0-2 orbit. In deriving this equation, we have also assumed that the mass of the S0-2 star is negligible with respect to the black hole mass (see Equation (32) below). We now assess the freedom these observations allow for possible deviations at higher post-Newtonian orders and hence the leverage of the strong-field imaging tests that we report here in constraining the metric of Sgr A * . We first write the precession per orbit at the second post-Newtonian order as

$$\delta\phi_{\rm 2PN} = 6\pi\left[\frac{GM}{2a(1-e^2)c^2}\right]^2\left(10 - e^2\right). \tag{27}$$

The ratio of the second- to the first-order post-Newtonian term for the S0-2 star is

$$\frac{GM}{4a(1-e^2)c^2}\left(10 - e^2\right) \simeq 7\times10^{-4}. \tag{28}$$

Because the first post-Newtonian term has been measured to an accuracy of ∼20% (see Equation (26)), the second post-Newtonian term would have to be $\sim 0.2/(7\times10^{-4}) \simeq 285$ times larger than the Schwarzschild prediction in order for it to cause deviations detectable with current instruments. We will consider this as a heuristic upper bound on possible deviations at the second post-Newtonian order imposed by the precession of the S0-2 orbit. Had the metric of Sgr A * deviated from Schwarzschild by, e.g., a factor of 250 at the second post-Newtonian order, this would have still been undetectable by the S0-2-precession test but would have led to a shadow size as large as $\sim 85\,GM/c^2 \simeq 425$ μas (see Equation (33) of Psaltis et al. 2021). This predicted size would have been at least a factor of 8 larger than the measured size we report in Section 3.5 and, more importantly, would have been almost two orders of magnitude larger than any potential uncertainty introduced by systematics due to plasma physics (as captured by the $\alpha_1 - 1$ factor) or due to our measurement methods (as captured by the $\alpha_2 - 1$ factor).
In other words, the horizon-scale images of Sgr A * provide substantial constraints to potential deviations of the black hole spacetime from the GR predictions that could have evaded all prior bounds, beyond any astrophysical uncertainties.
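The heuristic argument above can be retraced with a few lines of arithmetic. The following is a minimal sketch that uses only numbers quoted in the text (the ∼20% precession accuracy, the 7 × 10⁻⁴ ratio, the 85 GM/c² and 425 μas shadow size, and the 47-50 μas shadow size from the Summary); the variable names are illustrative and not part of the original analysis.

```python
# Numbers quoted in the text (not fitted or re-derived here).
PRECESSION_ACCURACY = 0.2    # ~20% accuracy of the measured 1PN precession
RATIO_2PN_TO_1PN = 7e-4      # quoted 2PN/1PN precession ratio for S0-2
SHADOW_SIZE_RG = 85.0        # hypothetical shadow size in units of GM/c^2
SHADOW_SIZE_UAS = 425.0      # the same hypothetical size in microarcseconds
MEASURED_UAS = (47.0, 50.0)  # shadow size range quoted in the Summary

# Largest 2PN enhancement that stays below the precession accuracy.
max_factor = PRECESSION_ACCURACY / RATIO_2PN_TO_1PN

# Angular size of one gravitational radius implied by 85 GM/c^2 = 425 uas.
theta_g_uas = SHADOW_SIZE_UAS / SHADOW_SIZE_RG

# How many times larger the hypothetical shadow is than the measured one.
excess = [SHADOW_SIZE_UAS / m for m in MEASURED_UAS]

print(f"max undetected 2PN enhancement: {max_factor:.1f}")
print(f"angular gravitational radius: {theta_g_uas:.1f} uas")
print(f"hypothetical / measured shadow size: {min(excess):.1f} to {max(excess):.1f}")
```

The last line illustrates why the imaging test has leverage: a deviation that the orbit cannot see would change the image size by a factor of several.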

M87 Imaging Tests
The black hole at the center of the M87 galaxy has a mass that is approximately 1500 times larger than the one in Sgr A * . As a result, observations of horizon-scale images from the M87 black hole probe similar potentials but curvatures that are 6 orders of magnitude smaller than those of Sgr A * . In Event Horizon Telescope Collaboration et al. (2019c) we used the 2017 EHT images of the M87 black hole to derive constraints on possible deviations of the inferred size of the black hole shadow from the Schwarzschild prediction, and in Psaltis et al. (2020a) and Kocherlakota et al. (2021) we used these measurements to place constraints on possible deviations of metric parameters from Kerr. Contrary to the case of Sgr A * that we report here, there were two independent and distinct priors on the mass-to-distance ratio for the black hole in M87, based on either stellar-dynamic (Gebhardt et al. 2011) or gas-dynamic measurements (Walsh et al. 2013). Adopting the former resulted in an upper bound on deviations from the Kerr predictions that was consistent with zero, within ∼17%. We opted to assign negligible prior likelihood to the latter prior, as it would have led us to conclude that there is significant tension between the Kerr predictions and the observations, and instead used the measurements as a null hypothesis test, i.e., concluded that the EHT images were not inconsistent with the Kerr predictions. Figure 24 compares the posteriors on the deviation parameter δ obtained here for Sgr A * to those reported earlier for the M87 black hole. In the case of the image in Sgr A * , we have a precise measurement of the mass-to-distance ratio for the black hole based on the detection of relativistic effects in the orbit of the S0-2 star, as discussed in Section 2. This removes any ambiguity in our calculation of the Kerr predictions.
Moreover, the uncertainties in the mass-to-distance priors for Sgr A * are negligible compared to those in M87, even if we only adopt the stellar-dynamic measurement for the latter. This results in uncertainties on the bounds of the deviation parameter δ that are almost a factor of 2 smaller in Sgr A * compared to the M87 black hole.
In both the Sgr A * and M87 cases, the inferred sizes of the black hole shadows are consistent with the Kerr predictions, even though the black holes span 3 orders of magnitude in mass and 6 orders in curvature scale. This serves as a confirmation of the general relativistic prediction that the spacetime properties of black holes scale with their mass and can be further reinforced by leveraging tests that involve stellar-mass black holes, as we discuss below.
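The statement that a factor of ∼1500 in mass corresponds to 6 orders of magnitude in curvature follows from a simple scaling. The sketch below assumes that the curvature is characterized by ξ = GM/(r³c²) evaluated at a fixed number of gravitational radii, so that ξ ∝ M⁻²; this choice of curvature measure and the function name are assumptions made for illustration.

```python
import math

def horizon_curvature_ratio(mass_ratio):
    """Ratio of horizon-scale curvatures for two black holes.

    Assumes xi = GM / (r^3 c^2) evaluated at r proportional to GM/c^2,
    which gives xi proportional to M^-2.
    """
    return mass_ratio ** -2

M87_OVER_SGRA = 1500.0  # mass ratio quoted in the text

curvature_ratio = horizon_curvature_ratio(M87_OVER_SGRA)
mass_orders = math.log10(M87_OVER_SGRA)
curvature_orders = -math.log10(curvature_ratio)

print(f"mass differs by ~{mass_orders:.1f} orders of magnitude")
print(f"curvature differs by ~{curvature_orders:.1f} orders of magnitude")
```

The potential GM/(rc²) at a fixed number of gravitational radii does not depend on the mass, which is why the two images probe similar potentials.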

Gravitational-wave Tests
Observations of gravitational waves from coalescing black hole binaries with LIGO/Virgo provide strong constraints on potential near-horizon modifications of the predictions of the theory of gravity for black holes (Abbott et al. 2016, 2019, 2021). Because of the frequency range of these ground-based gravitational-wave detectors, the black hole masses they are sensitive to are in the 10-100 $M_\odot$ range. As a result, compared to the tests with the EHT black hole images, existing gravitational-wave observations probe similar potentials but curvatures that are different by 8-16 orders of magnitude. A second important difference arises from the fact that, fundamentally, gravitational-wave observations measure the propagating gravitational modes of the theory, whereas black hole images measure electromagnetic modes propagating on the black hole spacetimes.

Figure 24. Comparison of the posterior distributions for the fractional deviation δ from the Schwarzschild predictions, as inferred by the EHT measurement of the size of the black hole shadows in Sgr A * and M87. The purple shaded area shows the ∼8% range predicted for the Kerr metric, depending on the black hole spin and observer inclination. The red shaded area shows the small range of posteriors for Sgr A * , inferred with different imaging and calibration algorithms (see Figure 12). The solid and dashed lines show the posteriors for the M87 black hole, when the stellar-dynamic and gas-dynamic measurements of the mass-to-distance ratio have been used, respectively. The negligible uncertainties in the mass measurement of Sgr A * , which is the result of the detection of relativistic effects in the orbit of the S0-2 star, remove any ambiguity in the comparison with the Kerr predictions.
It is possible that the number and polarization of the propagating modes of the fundamental theory of gravity are the same as those in general relativity but the stationary metrics are not; in fact, it is possible that the fundamental theory of gravity is general relativity but the stationary metrics of the supermassive compact objects in the centers of galaxies are not described by the Kerr metric (see, e.g., Gair et al. 2008). Alternatively, it is possible that the propagating modes of the theory are very different from those in general relativity but the stationary spacetimes remain Kerr (Psaltis et al. 2008;Barausse & Sotiriou 2008). In this way, the gravitational-wave and imaging observations of black holes provide complementary probes of potential modifications to general relativity.
Because of this fundamental difference, however, in order to compare directly gravitational-wave constraints to those of black hole imaging, one needs to make specific assumptions. Our main goal in this section is to leverage the gravitational-wave tests in order to assess whether the black hole metric properties scale with mass, as predicted by general relativity. For this reason, we will focus on the inspiral phases of the observed gravitational waves, as these are the ones that are mostly sensitive to modifications in the metrics of the coalescing black holes (see footnote 161; see also Völkel & Barausse 2020). Moreover, following Psaltis et al. (2021), we will assume here that the propagating modes of the theory are indistinguishable from those in general relativity and assign any room for potential deviations to changes in the underlying metrics of the black holes. Unless the fundamental theory of gravity is finely tuned such that the modifications in the radiative sector exactly cancel those in the metrics, for the masses of the LIGO/Virgo black holes, our constraints will represent broad-brush upper limits on potential metric deviations.
Under the assumptions outlined above, the LIGO/Virgo measurements of the inspiral phases of coalescing black hole binaries depend entirely on the tt-components of the metrics, as expressed in areal coordinates (Carson & Yagi 2020; Cárdenas-Avendaño et al. 2020). This is a consequence of the fact that the waveforms of the gravitational waves during the decay of quasi-circular orbits are determined by the binding energies of the orbits and their angular frequencies (see, e.g., Equation (9) of Carson & Yagi 2020), both of which are determined by the tt-components of the metric (Ryan 1995). This is the same component of the metric that determines the size of the black hole shadows measured with the EHT (Psaltis et al. 2020a). Remarkably, because of a coincidence related to the masses of the coalescing black holes, the degeneracies between the constraints from inspiral measurements on the various parameters of metrics that deviate from Kerr are nearly parallel to those of the constraints imposed from the EHT imaging observations. In other words, this coincidence allows us to use the LIGO/Virgo constraints and make a prediction on the fractional deviation δ in the shadow size one would have calculated, if the gravitational-wave sources had the same metrics as those of the supermassive black hole observed with the EHT. Figure 25 compares the results on the deviation parameter δ for the two most constraining gravitational-wave events, GW170608 and GW190924, to those obtained in Section 3.5 for Sgr A * , as well as those for the M87 black hole derived earlier (Event Horizon Telescope Collaboration et al. 2019c). As discussed above, all observations are consistent with the predictions of general relativity, even though they utilize black holes with masses that are different by 8 orders of magnitude.
This lends support not only to the Kerr nature of the black hole spacetimes but also to the fact that the fundamental theory of gravity does not have a scale between those probed by stellar-mass and supermassive black holes.

Pulsar Timing Tests
The potentials and curvatures probed by pulsar timing tests may depend on the underlying theory of gravity, as discussed above. In principle, in theories without a characteristic scale (such as screening) between the orbital separation of the binary and the size of the neutron star, pulsar tests probe the coupling of matter to the gravitational field, which takes place in the strong-field regime of the neutron star interior, i.e., at a potential of order unity and curvature of order $10^{-10}$ cm$^{-2}$. For such theories, the horizon-scale images of Sgr A * probe a similar potential but a curvature that is different by 15 orders of magnitude (see Figure 23).

Figure 25. Comparison of the posterior distributions for the fractional deviation δ from the Schwarzschild predictions, as inferred by the EHT measurement of the size of the black hole shadow in Sgr A * (red curve for the fiducial priors) and M87 (green curve for the stellar-dynamics mass) and by the LIGO/Virgo measurements of the inspiral phases of GW170608 (blue) and GW190924 (orange). The posteriors corresponding to the last two reflect the prediction on the fractional deviation δ in the shadow size one would have calculated based on the constraints imposed by the gravitational-wave measurements, if the coalescing, stellar-mass sources had the same form of metrics as those of the supermassive black holes observed with the EHT. The gray shaded area shows the ∼8% range predicted for the Kerr metric, depending on the black hole spin and observer inclination. Even though tests with gravitational waves and black hole images span black hole masses that are different by 8 orders of magnitude, they are all consistent with the GR predictions that all black holes are described by the same metric, independent of their mass.

161 There are a multitude of other tests of gravity that are possible with gravitational-wave observations, such as those that place bounds on the mass of the graviton (Abbott et al. 2016; Baker et al. 2017). Albeit extremely important, these tests are not directly comparable to those we report here, as imaging tests are sensitive only to the stationary black hole metrics and not to other aspects of the theory.
In contrast, theories with a characteristic scale between the orbital separation of the binary and the size of the neutron star (or other similar effects as in Yagi et al. 2016) are only probed at the potential of the periapsis distance for tests involving the orbital period derivative and orbital precession (for the double pulsar this is about $6\times10^{4}\,GM/c^2$) or at the distance of minimum approach for the Shapiro delay (for the double pulsar, this distance is about $1500\,GM/c^2$). In this case, the horizon-scale images of Sgr A * probe similar curvatures but a potential that is larger by 5 orders of magnitude.
Following these two approaches, we are going to discuss the constraints imposed on the various deviation parameters in two complementary ways. First is in a theory-agnostic way, in terms of the effective post-Newtonian parameters of the metrics of the compact objects. Second is in terms of the scalar-tensor gravity theory of Damour & Esposito-Farese (1993), with second-order couplings between the scalar field and the Einstein tensor defined by the parameters $\alpha_a \equiv \partial \ln m_a/\partial\phi_0$ and $\beta_a \equiv \partial\alpha_a/\partial\phi_0$, where the subscript a corresponds to either the pulsar "p" or the companion "c." In such a theory, the effective 1PN parameters can be expressed in terms of the theory parameters $\alpha_a$ and $\beta_a$ as

$$\hat{\gamma}_{ab} = 1 - \frac{2\alpha_a\alpha_b}{1+\alpha_a\alpha_b}, \tag{29}$$

$$\hat{\beta}^{a}_{bc} = 1 + \frac{\beta_a\alpha_b\alpha_c}{2(1+\alpha_a\alpha_b)(1+\alpha_a\alpha_c)}, \tag{30}$$

where the "hats" and subscripts emphasize the fact that the strong-field equivalents to the PPN parameters are not universal constants.
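To make the body dependence of the effective parameters concrete, here is a minimal sketch that evaluates the commonly quoted Damour & Esposito-Farèse expressions for the strong-field equivalents of γ and β. The specific formulas coded below, the function names, and the sample coupling values are assumptions made for illustration; they are not taken from the analysis in this paper.

```python
def gamma_hat(alpha_a, alpha_b):
    """Effective gamma for the pair (a, b): 1 - 2 a_a a_b / (1 + a_a a_b)."""
    prod = alpha_a * alpha_b
    return 1.0 - 2.0 * prod / (1.0 + prod)

def beta_hat(beta_a, alpha_a, alpha_b, alpha_c):
    """Effective beta^a_bc:
    1 + beta_a a_b a_c / (2 (1 + a_a a_b) (1 + a_a a_c))."""
    denom = 2.0 * (1.0 + alpha_a * alpha_b) * (1.0 + alpha_a * alpha_c)
    return 1.0 + beta_a * alpha_b * alpha_c / denom

# General relativity is recovered when the scalar couplings vanish.
gr_gamma = gamma_hat(0.0, 0.0)
gr_beta = beta_hat(-4.5, 0.0, 0.0, 0.0)

# Hypothetical couplings: the same theory gives different effective
# parameters for different pairs of bodies.
weak_pair = gamma_hat(1e-3, 1e-3)
strong_pair = gamma_hat(1e-3, 0.1)

print(gr_gamma, gr_beta)
print(weak_pair, strong_pair)
```

Because the couplings enter as products over the two bodies involved, a pair that includes a strongly self-gravitating object can deviate from the GR value far more than a pair of weakly gravitating bodies in the same theory.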
In principle, there are at least five theory-independent post-Keplerian parameters in a binary system that can be measured from pulsar timing (Damour & Taylor 1992). Together with the normal Keplerian parameters, they constitute a set of inferred quantities that is larger than the free parameters in the system (see Wex & Kramer 2020 for a recent review). The combination of any two post-Keplerian parameters is used to determine the masses of the two objects in the binary. Any additional measurement can then be used for testing GR and a very broad class of alternative (boost-invariant) theories (Damour & Taylor 1992;Will 2014De Laurentis et al. 2018). We will focus below on the constraints imposed by the measurement of Shapiro delay, of the precession of periapsis of the binary, and of the orbital period evolution caused by the emission of gravitational waves.
Shapiro delay.-Some of the main constraints on deviations from general relativity come from the measurement of a Shapiro delay in binary pulsar systems. Like the imaging of black holes, such experiments provide a rare opportunity to study the light propagation near strongly self-gravitating objects. Two parameters can be measured, the "shape" s and the "range" r. The Shapiro shape, s, can quite generally be identified with the sine of the orbital inclination (i.e., $s = \sin i$; e.g., Kramer et al. 2021; see footnote 162). In comparison, the Shapiro range r relates to the companion mass, e.g., in general relativity one finds $r = Gm_c/c^3$. For a theory-agnostic constraint, one can relate r to the 1PN parameter γ via

$$r = \frac{\hat{G}_{0c}\,m_c}{c^3}\,\frac{1+\hat{\gamma}_{0c}}{2}, \tag{31}$$

where $m_c$ is the mass of the companion star and the subscript on the PN parameter makes explicit that this is the value for the coupling between a test particle and the pulsar companion, i.e., they are not universal constants but object specific. The subscript in the gravitational constant G also denotes that this is the effective gravitational constant felt by a test particle in the field of the companion.

Precession of periapsis.-In GR, within the PPN framework, the rate of precession of the periapsis $\dot{\omega}$ depends on the combination $2 + 2\hat{\gamma} - \hat{\beta}$ (in the limit when the mass ratio is zero). Incorporating the effects due to the presence of two orbiting objects for which the strong-field coupling of matter to gravity cannot be neglected, $\dot{\omega}$ becomes proportional to

$$\dot{\omega} \propto \frac{1}{1-e^2}\left[2 + 2\hat{\gamma}_{pc} - \frac{m_p\hat{\beta}^{c}_{pp} + m_c\hat{\beta}^{p}_{cc}}{M}\right]. \tag{32}$$

Here $M \equiv m_c + m_p$ is the sum of the companion and pulsar masses, $m_c$ and $m_p$, respectively, and e is the eccentricity of the orbit.
Orbital period derivative.-The orbital period of a binary system may change because of a number of effects. If only gravity plays a role, a decay in the orbital period results from the emission of gravitational waves. The measurement of an orbital period derivative can then be used to confront a given theory with its predictions. In general relativity, to leading order, the orbital period derivative is given by the quadrupole formula (Peters 1964). In contrast, many alternative theories of gravity violate the strong equivalence principle (SEP), resulting in the emission of gravitational dipolar radiation. Observations of binary pulsars can provide strict limits on the existence of dipolar radiation (Shao & Wex 2016; Wex & Kramer 2020). Indeed, the double-pulsar system currently provides the most precise test of the general relativistic quadrupolar description of gravitational waves, validating the prediction at a level of $1.3\times10^{-4}$ (95% CL; Kramer et al. 2021).
Under the assumption that the radiative sector is negligibly different from general relativity, one can put, in a theory-agnostic way, constraints on the post-Newtonian parameters of the waveform, obtaining at 1PN order where $\Delta\phi_2$ is the 1PN order correction to the phase term of the gravitational waveform. Following this framework and emphasizing the mentioned assumptions, we can use the other precise measurements in the double-pulsar system for r and $\dot{\omega}$ to derive constraints on the corresponding post-Newtonian parameters at the first and second orders. At the 68% confidence limit, particular combinations of these parameters that give rise to the Shapiro range, the periapsis advance, and the orbital decay have been found to be consistent with the general relativistic predictions at a precision of $3.4\times10^{-3}$, $2.6\times10^{-4}$, and $6.3\times10^{-5}$, respectively (Kramer et al. 2021).
Universality of freefall.-One can use pulsars also to test the SEP, as first shown by Damour & Schäfer (1991). The discovery of a pulsar in a triple system (Ransom et al. 2014) allows a variation of the Damour-Schäfer experiment in a very constraining way (Freire et al. 2012). In the triple system an inner pulsar-white dwarf system is orbited by a second white dwarf in an outer orbit. By tracking the orbital motion of the neutron star via pulsar timing, one can study how the inner two objects with significantly different gravitational self-energy are falling in the gravitational field of the third object. Archibald et al. (2018) presented a limit on the strong-field equivalent of the Nordtvedt parameter of $\eta < 3\times10^{-5}$. Following Damour & Schäfer, it where η and η′ are strong-field equivalents of the Nordtvedt parameter and $\epsilon_{\rm grav} = E_{\rm grav}/mc^2$ is the normalized Newtonian gravitational binding energy. At the first post-Newtonian order, the Nordtvedt parameter is related to the post-Newtonian coefficients by |η| = |4β − γ − 3|. Ignoring higher terms and using the results by Archibald et al. (2018) leads to $|4\beta - \gamma - 3| < 1.0\times10^{-5}$ (95% CL). At the second post-Newtonian order, we make use of the Messenger limit obtained in the solar system (Genova et al. 2018) to constrain $\eta' < 1.2\times10^{-3}$ (95% CL).

Leveraging Gravitational Tests across Different Scales
Figures 26 and 27 compare the constraints on the various metric deviation parameters at the first and second post-Newtonian orders that are imposed by the gravity tests discussed above. For visual clarity reasons alone, the figures show only a particular cross section of the multidimensional parameter space of plausible deviations: in constructing these figures, we have assumed that all deviation parameters other than those plotted are negligible. Barring any fine-tuning of the fundamental theory that would lead to fortuitous cancellations, the bounds plotted can be regarded as rough upper limits. It is important to emphasize here that since each test probes a different combination of the various post-Newtonian parameters, which are indicated on the figure for those at the first post-Newtonian order, it is mathematically impossible for any non-Kerr metric to evade simultaneously all constraints purely by fortuitous cancellations.

Figure 26. Comparison of the current limits on various potential metric deviation parameters at the first post-Newtonian order obtained from different tests of gravity with astrophysical objects, as a function of the gravitational potential (top panel) and curvature (right panel) probed by each test. For visual clarity, each upper limit is calculated assuming that the deviations at all other post-Newtonian orders are negligible; this plot, therefore, represents only one cross section in the multidimensional parameter space of plausible deviations. Every test probes a different combination of post-Newtonian parameters, as shown in the top panel. In most modifications to general relativity, these post-Newtonian parameters are not universal constants but depend on the nature of the central object, its mass, its composition, its scale, etc.
The bounds on the post-Newtonian parameters that are imposed by the imaging observations of Sgr A * and M87, the detection of periapsis precession in the S2 orbit, the gravitational-wave observations, and the double pulsar have been discussed in Section 5.3, Event Horizon Telescope Collaboration et al. (2019c), Section 6.3, Psaltis et al. (2021), and Section 6.6, respectively. The figures also incorporate the limits on the first- and second-order post-Newtonian parameters obtained by solar system tests, which we briefly discuss below.
The measurement of the solar Shapiro delay with the Cassini spacecraft constrained the $\gamma_\odot - 1$ parameter at the first post-Newtonian order with an accuracy of $\sim 2\times10^{-5}$ (Bertotti et al. 2003). At the second post-Newtonian order, the best constraint comes from the deflection of light measurements that are consistent with the Schwarzschild predictions to an accuracy of $\sim 10^{-4}$. The second-order post-Newtonian correction for light deflection at the solar surface is of order $\sim 3\times10^{-6}$ (Bodenner & Will 2003). Therefore, any second-order post-Newtonian corrections to the solar metric cannot be larger than $\sim 10^{-4}/(3\times10^{-6}) \simeq 35$ times the Schwarzschild predictions.
The periastron precession of Mercury constrains the combination $2\gamma_\odot - \beta_\odot - 1$ with an accuracy of $\sim 2\times10^{-5}$. Given the strict limit on $\gamma_\odot$ for the solar metric from Cassini, it can be translated into an upper bound on β alone. Effects due to the second post-Newtonian order are subdominant compared to relativistic effects involving other solar system bodies and are of order $\sim 7\times10^{-8}$ smaller than the first-order effects. As a result, we can conclude that any second-order post-Newtonian corrections to the solar metric cannot be larger than $\sim 2\times10^{-5}/(7\times10^{-8}) \simeq 300$ times the Schwarzschild predictions.
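The two solar system bounds above follow the same heuristic: divide the accuracy of the measurement by the relative size of the second-order correction. A minimal sketch, using only the numbers quoted in the text (the function and dictionary names are illustrative):

```python
def max_2pn_enhancement(accuracy, relative_2pn_size):
    """Largest factor by which a 2PN term could exceed its Schwarzschild
    value while staying hidden below the quoted measurement accuracy."""
    return accuracy / relative_2pn_size

# (measurement accuracy, relative size of the 2PN correction), as quoted.
solar_tests = {
    "solar light deflection": (1e-4, 3e-6),
    "Mercury perihelion": (2e-5, 7e-8),
}

bounds = {name: max_2pn_enhancement(*vals) for name, vals in solar_tests.items()}

for name, bound in bounds.items():
    print(f"{name}: 2PN term below ~{bound:.0f} x Schwarzschild")
```

The values quoted in the text (35 and 300) are these ratios rounded as order-of-magnitude estimates.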
Examining the combined constraints shown in Figures 26 and 27 allows us to draw some general conclusions on modifications to the equilibrium metrics of massive, isolated objects from the predictions of general relativity that can be accommodated within current observational bounds, barring any fortuitous cancellations: (i) They may appear at the first post-Newtonian order and attain up to order-unity magnitudes at the highest gravitational potentials of compact objects but only in theories with coupling to matter that evades the theory-specific bounds imposed by the double pulsar. (ii) They may appear primarily at the second or higher post-Newtonian orders but with magnitudes that are constrained only to within order-unity deviations from the general relativistic predictions.

Summary
The Galactic center black hole, Sgr A * , is an ideal and natural laboratory for testing the strong-field predictions of general relativity. Monitoring of the orbits of tens of stars within its radius of influence and the recent detection of two relativistic effects in one of them has led to a precise determination of the black hole mass and distance from Earth (Gravity Collaboration et al. 2018a; Do et al. 2019). Compared to any other black hole on the sky observed in the electromagnetic spectrum, these measurements lead to precise predictions for the magnitudes of gravitational effects, which can then be tested against other observations.
In this series of papers, we are reporting the first horizon-scale images of Sgr A * , obtained with the EHT at a wavelength of 1.3 mm (Paper II). The images are characterized by a bright ring of emission surrounded by a deep brightness depression (Paper III). This image structure is stable and remains present for at least the ∼32 hr span of observations on 2017 April 5-6, which corresponds to ∼60-500 dynamical timescales at the radius of the innermost stable circular orbit, depending on black hole spin. Using a variety of image reconstruction and visibility-domain modeling tools, we measured the diameter of the ring-like structure.
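The quoted range of ∼60-500 dynamical timescales can be checked with a short calculation. The sketch below assumes that "dynamical timescale" means the Keplerian orbital period at the prograde innermost stable circular orbit (ISCO) of a Kerr black hole, uses the standard Bardeen et al. (1972) ISCO formula, and adopts the ∼4 × 10⁶ solar-mass value quoted in this paper; the function names are illustrative.

```python
import math

GM_SUN_OVER_C3 = 4.925491e-6  # G * M_sun / c^3 in seconds

def isco_radius(spin):
    """Prograde Kerr ISCO radius in units of GM/c^2, for 0 <= spin <= 1."""
    z1 = 1 + (1 - spin**2) ** (1 / 3) * (
        (1 + spin) ** (1 / 3) + (1 - spin) ** (1 / 3)
    )
    z2 = math.sqrt(3 * spin**2 + z1**2)
    return 3 + z2 - math.sqrt((3 - z1) * (3 + z1 + 2 * z2))

def isco_period_s(mass_msun, spin):
    """Orbital period at the prograde ISCO, in seconds."""
    r = isco_radius(spin)
    return 2 * math.pi * (r**1.5 + spin) * GM_SUN_OVER_C3 * mass_msun

def n_timescales(span_hr, mass_msun, spin):
    """Number of ISCO orbital periods contained in an observing span."""
    return span_hr * 3600.0 / isco_period_s(mass_msun, spin)

SGRA_MASS_MSUN = 4e6  # approximate mass quoted in the text
n_nonspinning = n_timescales(32.0, SGRA_MASS_MSUN, 0.0)
n_maximal = n_timescales(32.0, SGRA_MASS_MSUN, 1.0)

print(f"spin 0: ~{n_nonspinning:.0f} periods; spin 1: ~{n_maximal:.0f} periods")
```

Under these assumptions the nonspinning and maximally spinning cases bracket the range given in the text.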
The structure, size, and persistent nature of the black hole image lead us to identify the central brightness depression with the shadow that the black hole is expected to cast on the emission from the accreting plasma. Using the depth and size of this brightness depression, as well as the observed broadband spectrum of the source, we rule out the possibility either that Sgr A * has a surface that fully absorbs and reemits thermally the incoming energy flux or that it has a reflecting surface at 1.3 mm with significant albedo.
Using an extensive suite of images and synthetic data based on time-dependent and semianalytic plasma models in a variety of spacetimes, we calibrated the difference between the size of the observed emission ring and that of the shadow, as well as potential systematic effects introduced by the sparse interferometric coverage of the EHT array and our analysis methods. We found the magnitudes of these effects to be of order ∼10%. This is subdominant compared to the significantly larger effect caused by changing the spacetime of Sgr A * , which can be as large as an order of magnitude while remaining consistent with the bounds imposed by the observation of the orbit of the S0-2 star in the spacetime of the same black hole. We derived a shadow size of 47-50 μas.
Because in the Kerr metric the size of the shadow depends primarily on the mass-to-distance ratio of the black hole, which is well determined by monitoring of stellar orbits, we can compare directly the predicted Kerr size to the observations, without any free parameters. We find that the Kerr predictions are consistent with observations at the ∼10% level. We then use this strong-field inference to place bounds of order unity on the parameters of metrics that deviate from Kerr.
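As a rough illustration of this comparison, the sketch below computes the Schwarzschild shadow diameter, $2\sqrt{27}\,GM/(c^2 D)$, and the fractional deviation δ of the quoted 47-50 μas shadow size from it. The angular gravitational radius is taken from the conversion 85 GM/c² ≃ 425 μas used earlier in the text; this is a simplified check under those assumptions, not the posterior analysis of Section 3.5, and the names are illustrative.

```python
import math

# Angular size of GM/(c^2 D), from "85 GM/c^2 = 425 uas" in the text.
THETA_G_UAS = 425.0 / 85.0

def schwarzschild_shadow_diameter_uas(theta_g_uas):
    """Shadow diameter 2 * sqrt(27) * GM / (c^2 D), in microarcseconds."""
    return 2.0 * math.sqrt(27.0) * theta_g_uas

def fractional_deviation(measured_uas, predicted_uas):
    """delta = measured / predicted - 1."""
    return measured_uas / predicted_uas - 1.0

predicted = schwarzschild_shadow_diameter_uas(THETA_G_UAS)
deltas = [fractional_deviation(d, predicted) for d in (47.0, 50.0)]

print(f"Schwarzschild shadow diameter: {predicted:.1f} uas")
print("delta range: " + ", ".join(f"{d:+.3f}" for d in deltas))
```

Both ends of the quoted range lie within about 10% of the Schwarzschild value, in line with the statement above.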
The $\sim 4\times10^{6}\,M_\odot$ mass of Sgr A * places these tests of the Kerr metric in a region of the parameter space of gravitational objects that has never been probed before in the strong-field regime. This mass is approximately 5 orders of magnitude larger than the masses of the objects probed by LIGO/Virgo via the detection of gravitational waves and a factor of 1500 smaller than the mass of the black hole in M87. Leveraging the fact that similar bounds on the strong-field predictions of general relativity have been placed by gravitational-wave and imaging observations across this 8-order-of-magnitude range in mass, we conclude that it is unlikely for the fundamental theory of gravity to possess a scale in this range.
These conclusions are based predominantly on the identification of the central brightness depression with the shadow of the black hole and, at the ∼10% level, on the calibration of the relative size of the bright ring of emission, which we measure, to that of the shadow, which we infer. In the case of the tests involving the image of the M87 black hole that we reported earlier (Event Horizon Telescope Collaboration et al. 2019c; Psaltis et al. 2020a; Kocherlakota et al. 2021), two issues related to that black hole left open the possibility that the observed brightness depression might not have been related to the black hole shadow. First is the factor of ∼2 difference in the prior measurements of the black hole mass based on stellar dynamics and gas dynamics (Gebhardt et al. 2011; Walsh et al. 2013). It could have been possible, in principle, (i) for the brightness depression to not be related to the black hole shadow but be generated, for example, at a larger distance from the black hole and (ii) for the black hole mass to be smaller than that inferred from stellar dynamics in such a way that, by pure coincidence, the size of the brightness depression is equal to the size inferred observationally. This is the approach taken, e.g., by Gralla et al. (2019; see also Gralla et al. 2019, 2020). Second is the fact that the time spread of the M87 observations was comparable to the dynamical timescale at its innermost stable circular orbit, allowing for the possibility that the image structure observed was transient and did not correspond to the persistent image of the bright ring surrounding the black hole shadow.
In the case of the Sgr A* images, neither of these considerations provides a reasonable alternative to our interpretation. Indeed, the mass of the black hole in Sgr A* is known to such a degree of precision that it leaves very little room for uncertainties in the predicted diameter of the black hole shadow. At the same time, the relatively small value of this mass, compared to the M87 black hole, allows us to observe the image over tens to hundreds of dynamical timescales and conclude that this structure is persistent and not transient. With these potential sources of uncertainty under control, we measured the size of the black hole shadow in Sgr A* and found it to be in agreement with the Kerr prediction, as we did for the case of the M87 black hole. In order to argue that the observed brightness depression is not related to the black hole shadow, we would have to not only assign this consistency to coincidence but also require that the same coincidence works for both M87 and Sgr A*. Given that the two black holes have masses that are different by a factor of ∼1500, are accreting at widely different rates with one showing a prominent jet that is missing from the other, and are probably observed at different inclinations, we consider this alternative to be highly unlikely.
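To make the two arguments above concrete, the following sketch estimates the predicted shadow diameter and the orbital period at the innermost stable circular orbit (ISCO). It is a rough illustration under stated assumptions, not the calibration used in this paper: it adopts the non-spinning (Schwarzschild) limit, in which the shadow diameter is 2√27 GM/(c²D) and the ISCO period is 2π 6^(3/2) GM/c³, and it assumes a distance of 8.2 kpc to Sgr A*. The function names are hypothetical.

```python
import math

G = 6.674e-11       # gravitational constant, m^3 kg^-1 s^-2
C = 2.998e8         # speed of light, m s^-1
M_SUN = 1.989e30    # solar mass, kg
KPC = 3.0857e19     # kiloparsec, m
RAD_TO_MUAS = math.degrees(1.0) * 3600.0 * 1.0e6  # radians -> microarcseconds

M_SGRA = 4.0e6          # Sgr A* mass in solar masses (from the text)
D_SGRA_KPC = 8.2        # ASSUMED distance to Sgr A* in kpc
M_M87 = 1500 * M_SGRA   # mass ratio of ~1500 quoted in the text


def shadow_diameter_muas(mass_msun, dist_kpc):
    """Schwarzschild shadow diameter, 2*sqrt(27)*GM/(c^2 D), in microarcseconds."""
    theta_g = G * mass_msun * M_SUN / (C**2 * dist_kpc * KPC)
    return 2.0 * math.sqrt(27.0) * theta_g * RAD_TO_MUAS


def isco_period_s(mass_msun):
    """Orbital period at the Schwarzschild ISCO (r = 6 GM/c^2), in seconds."""
    t_g = G * mass_msun * M_SUN / C**3
    return 2.0 * math.pi * 6.0**1.5 * t_g


shadow_sgra = shadow_diameter_muas(M_SGRA, D_SGRA_KPC)
t_sgra = isco_period_s(M_SGRA)
t_m87 = isco_period_s(M_M87)

print(f"Sgr A* shadow diameter: {shadow_sgra:.1f} microarcseconds")
print(f"Sgr A* ISCO period:     {t_sgra / 60.0:.0f} minutes")
print(f"M87 ISCO period:        {t_m87 / 86400.0:.0f} days")
```

Under these assumptions the shadow diameter comes out near 50 μas, and the ISCO period is about half an hour for Sgr A* compared with roughly a month for M87. Both timescales scale linearly with mass, which is why a single night of observation covers many dynamical times for Sgr A* but not for M87. Spin shortens the ISCO period, so these values should be read as order-of-magnitude guides only.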
Alexander Graham Bell Canada Graduate Scholarships-Doctoral Program); the National Youth Thousand Talents
We thank the staff at the participating observatories, correlation centers, and institutions for their enthusiastic support. This paper makes use of the following ALMA data: ADS/JAO.ALMA#2016.1.01154.V. ALMA is a partnership of the European Southern Observatory (ESO; Europe, representing its member states), NSF, and National Institutes of Natural Sciences of Japan, together with National Research Council (Canada), Ministry of Science and Technology (MOST; Taiwan), Academia Sinica Institute of Astronomy and Astrophysics (ASIAA; Taiwan), and Korea Astronomy and Space Science Institute (KASI; Republic of Korea), in cooperation with the Republic of Chile. The Joint ALMA Observatory is operated by ESO, Associated Universities, Inc. (AUI)/NRAO, and the National Astronomical Observatory of Japan (NAOJ). The NRAO is a facility of the NSF operated under cooperative agreement by AUI. This research used resources of the Oak Ridge Leadership Computing Facility at the Oak Ridge National Laboratory, which is supported by the Office of Science of the U.S. Department of Energy under Contract No. DE-AC05-00OR22725. We also thank the Center for Computational Astrophysics, National Astronomical Observatory of Japan. The computing cluster of Shanghai VLBI correlator supported by the Special Fund for Astronomy from the Ministry of Finance in China is acknowledged.
APEX is a collaboration between the Max-Planck-Institut für Radioastronomie (Germany), ESO, and the Onsala Space Observatory (Sweden). The SMA is a joint project between the SAO and ASIAA and is funded by the Smithsonian Institution and the Academia Sinica. The JCMT is operated by the East Asian Observatory on behalf of the NAOJ, ASIAA, and KASI, as well as the Ministry of Finance of China, Chinese Academy of Sciences, and the National Key Research and Development Program (No. 2017YFA0402700) of China and Natural Science Foundation of China grant 11873028. Additional funding support for the JCMT is provided by the Science and Technology Facilities Council (UK) and participating universities in the UK and Canada. The LMT is a project operated by the Instituto Nacional de Astrofísica, Óptica, y Electrónica (Mexico) and the University of Massachusetts at Amherst (USA). The IRAM 30-m telescope on Pico Veleta, Spain is operated by IRAM and supported by CNRS (Centre National de la Recherche Scientifique, France), MPG (Max-Planck-Gesellschaft, Germany) and IGN (Instituto Geográfico Nacional, Spain). The SMT is operated by the Arizona Radio Observatory, a part of the Steward Observatory of the University of Arizona, with financial support of operations from the State of Arizona and financial support for instrumentation development from the NSF. Support for SPT participation in the EHT is provided by the National Science Foundation through award OPP-1852617 to the University of Chicago. Partial support is also provided by the Kavli Institute of Cosmological Physics at the University of Chicago. The SPT hydrogen maser was provided on loan from the GLT, courtesy of ASIAA.
This work used the Extreme Science and Engineering Discovery Environment (XSEDE), supported by NSF grant ACI-1548562, and CyVerse, supported by NSF grants DBI-0735191, DBI-1265383, and DBI-1743442. XSEDE Stampede2 resource at TACC was allocated through TG-AST170024 and TG-AST080026N. XSEDE JetStream resource at PTI and TACC was allocated through AST170028. This research is part of the Frontera computing project at the Texas Advanced Computing Center through the Frontera Large-Scale Community Partnerships allocation AST20023. Frontera is made possible by National Science Foundation award OAC-1818253. This research was carried out using resources provided by the Open Science Grid, which is supported by the National Science Foundation and the U.S. Department of Energy Office of Science.
Additional work used ABACUS2.0, which is part of the eScience center at Southern Denmark University. Simulations were also performed on the SuperMUC cluster at the LRZ in Garching, on the LOEWE cluster in CSC in Frankfurt, on the HazelHen cluster at the HLRS in Stuttgart, and on the Pi2.0 and Siyuan Mark-I at Shanghai Jiao Tong University. The computer resources of the Finnish IT Center for Science (CSC) and the Finnish Computing Competence Infrastructure (FCCI) project are acknowledged. This research was enabled in part by support provided by Compute Ontario (http://computeontario.ca), Calcul Quebec (http://www.calculquebec.ca) and Compute Canada (http://www.computecanada.ca).
The EHTC has received generous donations of FPGA chips from Xilinx Inc., under the Xilinx University Program. The EHTC has benefited from technology shared under open-source