Interferometric Image Reconstruction using Closure Invariants and Machine Learning

Interferometric closure invariants encode calibration-independent details of an object's morphology. Excepting simple cases, a direct backward transformation from closure invariants to morphologies is not well established. We demonstrate using simple Machine Learning models that closure invariants can aid in morphological classification and parameter estimation. We consider six phenomenologically parametrised morphologies: point-like, uniform circular disc, crescent, dual disc, crescent with elliptical accretion disc, and crescent with double jet lobes. Using logistic regression (LR), multi-layer perceptron (MLP), and random forest models on closure invariants obtained from a sparsely covered aperture, we find that all methods except LR can classify morphologies with $\gtrsim$80% accuracy, which improves with greater aperture coverage. Separately from the classification problem, given an independently confirmed class, we estimate parameters of uniform circular disc, crescent, and dual disc morphologies using simple MLP models, and parametrically reconstruct images. The estimated parameters and images correspond well with inputs, but the accuracy worsens when degeneracies between parameters are present. This independent approach to interferometric imaging under challenging observing conditions, such as those faced by the Event Horizon Telescope and Very Long Baseline Interferometry in general, can complement other methods in robustly constraining an object's morphology.


INTRODUCTION
Image synthesis using radio interferometric measurements requires an array of receiver elements sampling spatial correlations of the radiation incident on the aperture to infer the spatial intensity distribution on the sky within the telescope's field of view (Thompson et al. 2017; Taylor et al. 1999). As the spatial resolution of the image scales inversely with the largest spacing between the interferometer array elements, obtaining very fine details in the image requires a technique called Very Long Baseline Interferometry (VLBI), in which array elements are widely separated from each other, with typical separations spanning continental or even planet-sized scales. Due to the sparseness of measurements and the demanding requirements of accurate signal calibration in maintaining a high degree of spatial coherence while combining signals from such large separations, VLBI imaging is extremely challenging in general (Thompson et al. 2017; Walker 1999).
There are certain invariants in interferometry like closure phases (Jennison 1958) and closure amplitudes (Twiss et al. 1960) that are immune to propagation and instrumental effects that are associable with individual array elements, and are thus independent of calibration and errors therein. Being true observables of the observed object's morphology, they implicitly or explicitly serve as useful anchors for inferring an object's morphology. A few examples of classic VLBI successes that used closure quantities include the discovery of the double-lobed structures of Cygnus A (Jennison 1957; Jennison & Latham 1959) and Centaurus A (Twiss et al. 1960), determination of the core-jet morphology of quasar 3C 147 (Wilkinson et al. 1977), and providing the first direct evidence for superluminal expansion of the relativistic jet in quasar 3C 273 (Pearson et al. 1981). A recent example is the event horizon-scale imaging, using Event Horizon Telescope (EHT) data, of the supermassive black holes at the centres of M87 (Event Horizon Telescope Collaboration et al. 2019a,b,c,d,e,f) and Sgr A* (Event Horizon Telescope Collaboration et al. 2021) by the EHT collaboration (EHTC).
★ E-mail: Nithyanandan.Thyagarajan@csiro.au
The sparseness of data, low signal-to-noise conditions, choices of initial models in deconvolution methods, even small miscalibrations, and other unconscious biases in the analysis can lead to artefacts and diverging image morphologies, and thus misinterpretations of the results. For instance, Carilli & Thyagarajan (2022) detailed the effects of the choice of initial models in the iterative imaging process. The choice of introducing unconstrained "boxes" in the CLEAN algorithm (Högbom 1974; Schwab 1984) can lead to divergent results (Cornwell et al. 1999; Miyoshi et al. 2022). The EHTC have acknowledged these challenges and followed a very detailed verification process to mitigate these risks. Subsequent analyses by other groups using different methods have found consistent results (Sun & Bouman 2021; Carilli & Thyagarajan 2022; Broderick et al. 2022; Arras et al. 2022; Medeiros et al. 2023). Nevertheless, confirmation by independent methods is paramount in such important scientific studies.
Closure invariants, being independent of element-based calibration, have the potential to be largely immune to the aforementioned risks present in traditional interferometric image synthesis. While closure invariants are mathematically well-defined and have been extensively used for decades, their physical interpretation has been unclear. A first geometric understanding of closure phase was presented in Thyagarajan & Carilli (2022). Closure invariants are adept at distinguishing point-like and centrosymmetric objects (unit closure amplitudes and vanishing closure phases) from non-symmetric morphologies, and corruptions that are element-based rather than spatially correlated (Thompson et al. 2017). Similarly, they can be used to infer the presence of polarisation inherent to the object (Broderick & Pesce 2020; Samuel et al. 2022).
Interferometric imaging based on closure invariants has been studied previously. For example, Chael et al. (2018) employ a minimisation scheme over closure quantities supplemented by some a priori information and regularisers that favour certain image features. Probabilistic sampling of the posterior distribution and forward-modelling of the sampled morphological class parameters have been used to determine the best fit to the measured closure invariants (Event Horizon Telescope Collaboration et al. 2019f; Broderick et al. 2020; Tiede et al. 2022; Saurabh & Nampalliwar 2023). Variational deep probabilistic imaging (DPI) approaches have been proposed (Sun & Bouman 2021; Sun et al. 2022) that, without training data, optimise the weights of a neural network to learn the posterior distribution of an unobserved image, from which image samples are generated to fit a particular measurement dataset such as closure invariants. However, it is still notable that, unlike the analytical relationship between visibilities and image intensities, a direct backward (or inverse) transformation to the object's morphology from a given set of closure invariants has not been clearly established.
In this work, our primary motivation is to demonstrate a proof of concept that closure invariants can be used to distinguish between different image morphologies and provide an independent pathway for parametric image reconstruction using machine learning (ML) methods, wherein propagating the closure invariants through the weights of the layers of the ML model could provide a direct backward transform to the image morphology. The paper has two independent objectives, namely, to use calibration-independent closure invariants and simple ML models to (1) provide morphological classifications of images, and (2) parametrically reconstruct images. Here, we take a broad theoretical view and approach the classification and parametrisation as independent problems, both of which separately are relevant in interferometric inference of image morphologies. The latter is particularly relevant in situations where parameters have to be estimated when the classification is determined a priori, such as the presence of an FR-II core-jet quasar morphology, a relativistic jet expanding from an active galactic nucleus, a faint planet orbiting a star, asymmetry on a stellar surface, etc.
The paper is organised as follows. We introduce interferometric closure invariants in §2. We present a proof of concept for morphological classification using simple ML methods in §3, where we examine the accuracy of our simple ML classifiers across varying degrees of aperture coverage, noise, and morphological complexity, as well as their performance on classifying data from classes not included in training. A proof of concept for estimating the morphological parameters and reconstructing images using ML methods for some chosen morphologies is presented in §4, where we explore the sensitivity of closure invariants to the morphological parameters and degeneracies between them. §5 presents a discussion and summary. Appendix A contains supporting material on the ML models used.

CLOSURE INVARIANTS IN INTERFEROMETRY
The concepts of closure phases (Jennison 1958) and closure amplitudes (Twiss et al. 1960) have been in use in radio astronomy for many decades. They have been integral to advances in calibration and interferometric synthesis imaging (Cornwell & Fomalont 1999; Thompson et al. 2017; Carilli et al. 2022). They were extended to polarimetric measurements through the formalism of closure traces (Broderick & Pesce 2020). Recently, a unified and general theory of interferometric closure invariants for co-polar and polarimetric measurements has emerged from Thyagarajan et al. (2022) and Samuel et al. (2022), respectively. We will primarily use the former because this work pertains to co-polar measurements. Below is a brief outline of the theory of closure invariants in co-polar interferometry relevant for this work.
In radio interferometry, the basic measurement units are spatial correlations, known as visibilities, in the aperture plane corresponding to different array element spacings. In an interferometer array with $N$ elements labelled by the indices $a = 0, \ldots, N-1$, each element, $a$, measures the amplitude and phase of the stochastic electric field, $E_a$ (represented by a complex number), incident on it. The true spatial correlation (visibility) between pairs of array elements is $V_{ab} = \langle E_a E_b^\dagger \rangle$, where $\dagger$ denotes complex conjugation. The visibilities are related to the spatial distribution of intensities on the sky or image plane (van Cittert 1934; Zernike 1938). Under certain reasonable assumptions (Thompson et al. 2017; Clark 1999), these visibilities denote the Fourier components (in the aperture plane) of the images that we are aiming to classify and reconstruct,

$$V_{ab}(\mathbf{u}_{ab}) = \int_S I(\hat{\mathbf{s}})\, e^{-i 2\pi\, \mathbf{u}_{ab} \cdot \hat{\mathbf{s}}}\, \mathrm{d}\Omega , \qquad (1)$$

where $\hat{\mathbf{s}}$ denotes a unit vector covering the sky surface, $S$. $\mathbf{u}_{ab} \equiv (u, v)$ denotes the spatial frequency and is related to the spacing between elements, $\mathbf{x}_{ab}$, as $\mathbf{u}_{ab} = \mathbf{x}_{ab}/\lambda$, where $u$ and $v$ refer to the $x$- and $y$-components of $\mathbf{u}_{ab}$ in the aperture (Fourier) plane, and $\lambda$ is the wavelength. $I(\hat{\mathbf{s}})$ is the spatial distribution of intensities on $S$.
In real-world measurements, corruptions caused by propagation and instrumental effects affect the array element's measurement as $E'_a = G_a E_a$, where $G_a$ is a complex corruption factor. Therefore, the measured visibility is also corrupted as $V'_{ab} = G_a V_{ab} G_b^\dagger$. This necessitates calibration and elimination of these multiplicative corrupting factors. There are several situations when accurate calibration can be challenging and can lead to artefacts in the images (Cornwell & Fomalont 1999).
Invariants like closure phases and amplitudes are special interferometric quantities constructed in specific ways that eliminate the corrupting effects caused by the $G_a$ terms partially or altogether (Thompson et al. 2017; Cornwell & Fomalont 1999). Because of this property, even when they are constructed with corrupted data, they remain true observables of the physical system under study, independent of local corrupting factors.
In this work, we adopt the Abelian gauge theory formalism to obtain a complete and independent set of closure invariants in co-polar interferometric measurements (see Thyagarajan et al. 2022, for details). We start by identifying a specific antenna location as the reference vertex (indexed as element 0) and constructing all unique closed triangular loops starting and ending with this vertex (Thompson et al. 2017). For an $N$-element array, this gives us $(N-1)(N-2)/2$ independent triangular loops. On each triangular loop, we define an advariant as

$$\mathcal{A}_{0ab} = V'_{0a}\, \widehat{V'}_{ab}\, V'_{b0} , \qquad (2)$$

where $\widehat{Z} = (Z^\dagger)^{-1}$ for a non-zero complex number, $Z$. There are $(N-1)(N-2)/2$ complex advariants (one per independent triangle), each with a pair of real and imaginary parts. This amounts to $(N-1)(N-2)$ real numbers, all with the same unknown scaling factor, $|G_0|^2$, that is associated with the reference vertex. In the final step, the sought closure invariants are obtained from the advariants by eliminating the unknown scaling factor, dividing all of them by a non-zero quantity constructed from them, such as any one of them or their mean, maximum, root-mean-square, etc. This effectively yields $N^2 - 3N + 1$ real-valued invariants after the loss of one degree of freedom in eliminating the unknown scale factor. For the EHT, if $N = 7$ telescopes are operational, there are 29 closure invariants in an instantaneous snapshot.
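As a minimal sketch of the construction above (not the authors' implementation), the following NumPy snippet builds the triangle advariants for a 7-element array and checks that the normalised values are unchanged when element-based gains corrupt the visibilities. The function name `closure_invariants`, the random test visibilities and gains, and the choice of normalising by the maximum absolute value are assumptions made purely for illustration.

```python
import numpy as np


def closure_invariants(vis):
    """Real-valued closure invariants from an N x N matrix of visibilities.

    vis[a, b] holds the (possibly corrupted) visibility V_ab, and element 0
    is the reference vertex. Normalising by the maximum absolute value is
    one of several possible choices and is an assumption of this sketch.
    """
    n = vis.shape[0]
    advariants = []
    for a in range(1, n):
        for b in range(a + 1, n):
            # A_0ab = V_0a * (V_ab^dagger)^-1 * V_b0 for complex scalars
            advariants.append(vis[0, a] / np.conj(vis[a, b]) * vis[b, 0])
    advariants = np.array(advariants)
    reals = np.concatenate([advariants.real, advariants.imag])
    # Dividing by a common scale removes the unknown factor |G_0|^2
    return reals / np.max(np.abs(reals))


rng = np.random.default_rng(0)
n_elem = 7

# Hypothetical "true" visibilities: a random Hermitian matrix, V_ba = conj(V_ab)
z = rng.normal(size=(n_elem, n_elem)) + 1j * rng.normal(size=(n_elem, n_elem))
vis_true = z + z.conj().T

# Element-based complex gains: V'_ab = G_a V_ab conj(G_b)
gains = rng.normal(size=n_elem) + 1j * rng.normal(size=n_elem)
vis_corrupt = gains[:, None] * vis_true * np.conj(gains)[None, :]

ci_true = closure_invariants(vis_true)
ci_corrupt = closure_invariants(vis_corrupt)
print(ci_true.size, np.allclose(ci_true, ci_corrupt))
```

With 7 elements there are 15 triangles and hence 30 real numbers; after normalisation one of them carries no information, leaving 29 independent invariants.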

MORPHOLOGICAL CLASSIFICATION
The first goal of this paper is to explore the possibilities of classifying image morphologies using simple ML methods entirely from interferometric closure invariants rather than visibilities because the latter are not immune to calibration errors. We consider six different fiducial morphological classes (0-5) along with the parametrisations described below. Closure invariants are insensitive to some parameters such as absolute location and absolute intensity scale. The parameters that can be inferred for image reconstruction after the morphological classification are italicised:
Note that these parametrisations are purely phenomenological and not physical. Fig. 1 shows examples of these morphological classes obtained with random instances of the respective parameters. The intensities are specified in arbitrary units, while linear dimensions are in pixels. The image sizes considered are 64 × 64, with an angular resolution of 3.52 μas per pixel.

Input measurements
For sampling the fiducial Fourier (aperture) plane, we use frequencies of $f_1 = 230$ GHz and $f_2 = 345$ GHz. Fig. 2a shows the aperture sampled in this study. The red and blue symbols denote 230 GHz and 345 GHz, respectively, while the + and × symbols denote the two times at which the aperture was sampled, thereby yielding four possible combinations. We employ three sampling configurations in this paper to study the effect of varying the level of aperture plane sampling on the morphological classification accuracy. These aperture configurations correspond to time $t_1$ at frequency $f_1$, times $t_1$ and $t_2$ at frequency $f_1$, and times $t_1$ and $t_2$ at frequencies $f_1$ and $f_2$. They are denoted by 1Fx1T, 1Fx2T, and 2Fx2T, respectively. The seven elements of the EHT array yield 21 snapshot visibilities at any given time and frequency. Therefore, the three sampling configurations consist of 21, 42, and 84 visibilities, respectively. We employ a sparser aperture sampling than in typical real observations to test the performance against sparse measurements.
We generate 10,000 instances of 64 × 64 images for each morphological class using random realisations of their respective parameters (see Fig. 1). The pixel resolution in angular units is 3.51 μas. Assuming all the array elements are identical and have uniform power sensitivity across the entire image, we simulate the visibilities using a direct Fourier transform at the antenna spacings for the various realisations of the images under each of the morphological classes using equation (1). At this point, we do not add noise to the simulated visibilities. We consider the impact of noise in §3.4.
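The direct Fourier transform step can be sketched as follows. This is an illustration under stated assumptions rather than the authors' code: the function name `dft_visibilities`, the small-angle pixel coordinates, the sign convention of the exponent, the illustrative pixel size of about 3.5 μas, and the random (u, v) points are all hypothetical choices.

```python
import numpy as np


def dft_visibilities(image, pixel_rad, uv):
    """Direct Fourier transform of an image at the given (u, v) points.

    image     : 2-D array of intensities (arbitrary units)
    pixel_rad : pixel size in radians
    uv        : (n_vis, 2) array of spatial frequencies in wavelengths
    """
    ny, nx = image.shape
    # Small-angle sky offsets of the pixel centres relative to the image centre
    l = (np.arange(nx) - nx // 2) * pixel_rad
    m = (np.arange(ny) - ny // 2) * pixel_rad
    ll, mm = np.meshgrid(l, m)
    phase = -2j * np.pi * (uv[:, 0, None, None] * ll + uv[:, 1, None, None] * mm)
    return np.sum(image * np.exp(phase), axis=(1, 2))


# Illustrative pixel size of ~3.5 micro-arcseconds, converted to radians
pix = np.deg2rad(3.5e-6 / 3600.0)

# Hypothetical baselines of a few giga-wavelengths
rng = np.random.default_rng(1)
uv = rng.uniform(-5e9, 5e9, size=(5, 2))

# A unit point source at the image centre has unit visibility on every baseline
point = np.zeros((64, 64))
point[32, 32] = 1.0
vis_point = dft_visibilities(point, pix, uv)

# A uniform disc of radius 10 pixels, as an example of an extended morphology
y, x = np.indices((64, 64)) - 32
disc = (x**2 + y**2 <= 10**2).astype(float)
vis_disc = dft_visibilities(disc, pix, uv)
```

Because the image is real-valued, the visibility at (-u, -v) is the complex conjugate of that at (u, v), which provides a quick sanity check on any such implementation.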
The closure invariants are constructed from equation (2) using the Abelian gauge theory formalism for co-polar invariants (Thyagarajan et al. 2022). In this formalism, a 7-element array yields 15 independent triads and a total of 30 real numbers (from 15 complex-valued advariants) with one unknown but common scale factor shared between them. Dividing them all by any one of them (which is non-zero) eliminates this common scale factor and provides 29 closure invariants, with the thirtieth number being the trivial unity as a result of this division, as detailed in Thyagarajan et al. (2022). In this study, besides the 29 invariants, we also keep the additional trivial number (unity) purely for bookkeeping convenience even though it contributes no valuable information. We note that for a point-like morphology, all the complex-valued advariants have real and equal values. Therefore, the closure invariants consist of equal numbers of ones and zeros. This is equivalent to having zero closure phases and unit closure amplitudes independent of the location or the absolute intensity, as expected from the centrosymmetry of a point-like object (Thompson et al. 2017; Thyagarajan et al. 2022). Similarly, the symmetry of a uniform circular disc is reflected in the vanishing of all the imaginary parts. More complicated morphological classes have correspondingly non-trivial combinations of real and imaginary parts.
The input for the training of our ML classifiers and parameter estimators comprises these closure invariants, together with the morphological class labels and class parameters they were generated from. The ML model so generated then takes an unknown set of closure invariants as input and predicts the morphological class and the respective parameters. Note that neither the images nor the visibilities themselves are used as inputs.

Machine Learning Models
We consider three simple ML methods for classifying the image morphologies, whose details and performance are characterised below. We use PyTorch for implementing most of the models except for the Random Forest classifier, for which we use scikit-learn. The ML models are visually illustrated in Fig. A1 in Appendix A. In each of these methods, we split the data randomly with 80% used for training and 20% for validation. We randomise the composition of the training and validation sets several times, maintaining the same proportional split, and find no significant change in the results. For each method, we stop training at the smallest number of epochs after which the training loss flattens to the extent of showing no significant change (≲ 0.1%), to avoid over-fitting to the training data and to preserve the method's generalisation capabilities for unseen test data.

Logistic Regression
Logistic regression (LR) estimates the probability of an event occurring, such as the morphological class, based on a given data set of independent variables, namely, the closure invariants. LR is considered a discriminative model, which means that it attempts to distinguish between classes. Since the dependent variable (morphological class) has six possible outcomes and no specified order, we employ multinomial LR, which is a simple network consisting of two layers: input (corresponding to the number of closure invariants) and output (number of morphological classes).
The training is performed using Sigmoid activation, Adam optimiser (learning rate $10^{-4}$), and 10 epochs. Fig. 3a shows the confusion matrix$^2$ between true and predicted classes using LR for the 2Fx2T aperture model. The presence of significant off-diagonal values indicates that LR's classification was not very accurate even though it is notably better than a random classification. Even point-like morphologies are accurately classified in only 63% of the instances. The classification performance on other morphologies is only ∼ 33-38%. LR is one of the simplest classification schemes and is not expected to perform with high accuracy on non-binary classes.
$^2$ In statistical classification, a confusion matrix is an $n \times n$ matrix consisting of statistics of ground truths and predictions for $n$ classes. The diagonal elements represent statistics of accurate classification while off-diagonals denote those of misclassifications.
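For readers unfamiliar with confusion matrices, the short NumPy sketch below tallies one from true and predicted labels and row-normalises it to obtain the per-class fraction of correct classifications. The labels are a made-up toy example with three classes, and the helper name `confusion_matrix` is an assumption of this sketch, not a reference to the code used in this work.

```python
import numpy as np


def confusion_matrix(y_true, y_pred, n_classes):
    """Counts with rows indexed by true class and columns by predicted class."""
    cm = np.zeros((n_classes, n_classes), dtype=int)
    # np.add.at accumulates correctly even when index pairs repeat
    np.add.at(cm, (y_true, y_pred), 1)
    return cm


# Toy labels for three hypothetical classes
y_true = np.array([0, 0, 1, 1, 2, 2])
y_pred = np.array([0, 1, 1, 1, 2, 0])

cm = confusion_matrix(y_true, y_pred, 3)
# Diagonal of the row-normalised matrix: fraction of each true class recovered
per_class = np.diag(cm) / cm.sum(axis=1)
print(cm)
print(per_class)
```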

Multi-layer Perceptron
Multi-layer Perceptron (MLP), a supervised learning algorithm, is a fully connected feed-forward artificial neural network with at least three layers (input, output, and at least one hidden layer) that can learn to classify non-linearly separable patterns.

Random Forest
A random forest classifier (RF) is an estimator that fits a number of decision tree classifiers on various sub-samples of the data set and uses averaging to improve the predictive accuracy and control overfitting. We use 200 trees in the forest. Fig. 3c shows the confusion matrix obtained with training on the 2Fx2T aperture model with an RF classifier. The point-like and uniform circular disc morphologies are classified perfectly. The double disc and crescent morphologies are classified at ≳ 90% accuracy, and the crescent with accretion disc and jet lobes are classified with ≳ 76% accuracy. Overall, for the specific sets of trainable parameters chosen in this work, the RF classifier appears to perform better than the rest.

Effect of aperture coverage on classifiers' performance
We train the aforementioned models for the three aperture models with different levels of aperture coverage with a balanced input of morphological classes. To test their performance on unknown test data, we use the $F_1$ score (Rijsbergen 1979), the harmonic mean of precision and recall, as a measure of the test's accuracy. During testing, to avoid skewing the results based on the level of balance of input data, we use 1000 iterations, each containing 100 randomly drawn test inputs (excluding the training set) with no regard for the balance of classes, to determine the $F_1$ score statistics even including the worst cases of imbalance across classes. Fig. 4 shows the distribution of $F_1$ scores for randomly drawn test inputs. LR and RF appear to have the worst and best $F_1$ scores, respectively. MLP has intermediate $F_1$ scores. In all cases, the $F_1$ scores show marked improvement as the aperture coverage improves. This is because the morphological information sampled by the successively improving aperture coverage helps build better predictive ML classifiers.
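The resampling procedure for the $F_1$ statistics can be sketched as below. The averaging of per-class scores (macro-averaging) is an assumption of this sketch, since the averaging scheme is not specified in the text, and the label pool is synthetic: hypothetical predictions that agree with the truth most of the time stand in for a trained classifier's output.

```python
import numpy as np


def macro_f1(y_true, y_pred, n_classes):
    """Unweighted mean of per-class F1 = 2TP / (2TP + FP + FN)."""
    scores = []
    for c in range(n_classes):
        tp = np.sum((y_true == c) & (y_pred == c))
        fp = np.sum((y_true != c) & (y_pred == c))
        fn = np.sum((y_true == c) & (y_pred != c))
        if tp + fp + fn == 0:
            continue  # class absent from both truth and prediction in this draw
        scores.append(2 * tp / (2 * tp + fp + fn))
    return float(np.mean(scores))


rng = np.random.default_rng(0)
n_classes = 6

# Synthetic held-out pool (hypothetical): ~15% of predictions are re-drawn at random
pool_true = rng.integers(0, n_classes, size=12000)
redraw = rng.random(pool_true.size) > 0.85
pool_pred = np.where(redraw, rng.integers(0, n_classes, size=pool_true.size), pool_true)

# 1000 iterations of 100 random test inputs, with no class balancing
f1_draws = []
for _ in range(1000):
    idx = rng.choice(pool_true.size, size=100, replace=False)
    f1_draws.append(macro_f1(pool_true[idx], pool_pred[idx], n_classes))
f1_draws = np.array(f1_draws)
print(np.percentile(f1_draws, [2.5, 50, 97.5]))
```

The spread of `f1_draws` reflects both the classifier's error rate and the class imbalance that arises by chance in small test draws.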
Even with limited aperture coverage (Fig. 2a), we find that the accuracy in a multi-class classification can be > 80% using simple ML models. Increasing the aperture coverage is expected to improve the classification accuracy further. Refinements to these models for improved classification accuracy are left to future work.

Effect of noise
We examine the effect of noise in the data by injecting various levels of Gaussian noise into the images. We quantify the noise through the inverse of signal-to-noise ratio (SNR) at levels SNR$^{-1}$ = 0 (noiseless), 1/30, 1/10, and 1/3, with the noise standard deviation defined relative to the maximum intensity in the images. The images are subsequently converted to visibilities, and eventually to closure invariants that are used for training and validation. Noisy test closure invariants data are then passed to the trained models to determine the classification accuracy, $F_1$. Note that this noise in the closure invariants is non-Gaussian due to the non-linear transformations.
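A minimal sketch of the image-domain noise injection is given below, assuming zero-mean Gaussian noise added independently to every pixel with standard deviation equal to SNR$^{-1}$ times the peak intensity. The helper name `add_image_noise` and the uniform-disc test image are hypothetical.

```python
import numpy as np


def add_image_noise(image, inv_snr, rng):
    """Add zero-mean Gaussian noise with sigma = inv_snr * max(image)."""
    sigma = inv_snr * image.max()
    return image + rng.normal(loc=0.0, scale=sigma, size=image.shape)


rng = np.random.default_rng(42)

# Hypothetical 64 x 64 test image: a uniform disc of radius 10 pixels
y, x = np.indices((64, 64)) - 32
image = (x**2 + y**2 <= 10**2).astype(float)

# The four noise levels considered: SNR^-1 = 0, 1/30, 1/10, and 1/3
noisy = {inv_snr: add_image_noise(image, inv_snr, rng)
         for inv_snr in (0.0, 1 / 30, 1 / 10, 1 / 3)}

for inv_snr, img in noisy.items():
    print(inv_snr, np.std(img - image))
```

The noisy images would then be passed through the same Fourier transform and closure-invariant steps as the noiseless ones, which is where the noise statistics become non-Gaussian.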
Fig. 5 shows the impact of injected noise levels on $F_1$ scores. Overall, the MLP and RF models perform similarly, with the latter achieving marginally better accuracy. The LR model performs at roughly half the accuracy of the other two. The performance of the MLP and RF models at SNR = 30 is comparable to that in the noiseless case, with accuracies of ∼ 80-90%, which drop to ∼ 50% when SNR = 3.
The MLP model trained on both noisy and noiseless (infinite SNR) data and applied to noiseless test data (SNR$^{-1}$ = 0 in Fig. 5) performs slightly better (by a few percent) than that trained only on noiseless data (2Fx2T model in Fig. 4), due to the additional diversity of information provided through the varying SNR in the training and validation process. A worst-case scenario of the ML models' performance when trained on noiseless data but tested on noisy data is shown in Fig. B1.
The EHT has non-uniform noise levels across its baselines because the system equivalent flux densities (SEFDs) of the participating telescopes vary significantly (Event Horizon Telescope Collaboration et al. 2019b). In this work, as noise is injected through the images, the noise levels in the visibilities are approximately uniform across element separations. However, the wide range of SNR covered and the corresponding accuracy range recorded here can be indicative of the accuracy achievable for non-uniform noise levels such as in the EHT. More complex scenarios like baseline-dependent noise levels applicable to specific cases like the EHT are being explored separately.

Performance on untrained class data
We conduct a limited study of applying our trained models on test data drawn from two untrained classes, namely, $m$-ring models (Johnson et al. 2020; Roelofs et al. 2023), and radiatively inefficient accretion flow (RIAF) models of Sgr A* (Broderick et al. 2011).
We explored $m$-ring models of various orders, but present results here only for order $m = 1$ with the $\beta$ parameter, $\beta_k = 0$, for $|k| > 1$. Fig. 6a illustrates the classification performance of our MLP model for first order $m$-ring models drawn from 1000 randomly sampled parameters. They are classified primarily as crescents and jets in ≈ 54% and ≈ 43% of the instances, respectively. A few examples of the input test images classified as crescents and jets are also shown.
It is noteworthy that the $m$-ring data classified as jets clearly have no lobe-like features as in Fig. 1f. We attribute this confusion in classification to the fact that our jet class definition includes a crescent in addition to the jet lobes, which will become indistinguishable from the crescent class when the jet lobes are significantly faint. Therefore, the two are not orthogonal but overlapping classes, which are known to suffer from ambiguity.
Notably, the images classified as jets have systematically smaller diameters than those classified as crescents. This is because the jet lobes in the training have to fit inside the image's fixed bounding box, which restricts the crescents to smaller sizes (see parameter ranges in §3). Thus, it is apparent that our classification model assigns the crescent's dimensions a greater degree of importance over the presence of jet lobes in the data.
Fig. 6b shows the MLP's classification performance on 9090 instances of test data drawn from the RIAF Sgr A* images. The majority (≈ 40%) are classified as crescents, but a significant number of cases (≈ 22%) are also classified as accretion and jets. Notably, these classes also include a crescent.
In either case, the confusion in classification arises primarily due to the overlap between morphological classes.The confusion between these classes is also evident from the confusion matrices in Fig. 3.
Tackling overlapping classes requires more careful consideration of class definitions, and other advanced techniques like feature engineering that involves enhancing emphasis on discriminative features, ensemble learning that trains different classification algorithms and combines their predictions, and active learning requiring the user to provide inputs on ambiguous data points. These avenues will be explored in future work.

MORPHOLOGICAL PARAMETRISATION
Here, we investigate the ability of ML regression to estimate parameters that are retrievable using only closure invariants and implement a parametric image reconstruction. Absolute position and intensity information is lost while using closure invariants, and we do not attempt to retrieve such information. We approach this parametrisation not in sequence to the classification but as an independent subject. The morphological class is assumed to have been specified independently. Our approach is to parametrically reconstruct an image once that classification is specified by independent means. Here, we consider the 2Fx2T aperture model.
Point-like morphologies can only be detected and classified, but cannot be reconstructed because they are parametrised by a single location and single intensity, whose absolute values are not accessible by closure invariants. We study the parametric image reconstruction of the following morphological classes: uniform circular disc (1), dual disc (2), and crescent (3), noting that we do not estimate an absolute location or an absolute intensity scale. We employ MLP to determine the morphological parameters. The MLP models used for the different morphologies are visually illustrated in Fig. A2 in Appendix A. The specific MLP model is briefly described below for each morphology.

Uniform circular disc
The only parameter of the uniform circular disc that we can estimate with closure invariants is its radius. We use an MLP model with ReLU activation and an Adam optimiser (learning rate $10^{-3}$). After 10 epochs, the prediction accuracy asymptotically approaches 100%. Fig. 7 shows a histogram of the relative error between the prediction and the true value in percent. The prediction is accurate to within 0.04% with 95% confidence.

Crescent
In the morphologies we studied, the crescent model offers the next level of complexity relative to the uniform circular disc. The parameters of the crescent model we estimate are the radii of the outer and inner discs, $r_1$ and $r_2$, respectively, and the $x$- and $y$-offsets of the centre of the inner disc from the outer one, $\Delta x$ and $\Delta y$, respectively.
Figure 7. Histogram (in percent) of percentage error in the prediction of uniform circular disc radius using MLP (see Fig. A2a). The box-whisker plot shows the median, lower and upper quartiles, and the 95% confidence interval (≃ ±0.04%) of the prediction error.
We use an MLP model with ReLU activation, 20% dropout of neurons between layers, and an Adam optimiser (learning rate $10^{-3}$). We perform a 10-epoch training with 8-fold cross-validation. Figs. 8a-8d show the predicted parameters against the true values as "violin" plots. The predictions of all four parameters are consistent with their respective input values to within 95% confidence over the range of values considered.
As an example, Fig. 8e shows an input image of a crescent.With the predicted parameters, Fig. 8f shows a parametrically reconstructed image.The difference between the input and the parametrically reconstructed image is shown in Fig. 8g.While the reconstructed image resembles the input image reasonably, the prediction accuracy is not as high as the uniform circular disc case.This is because the same amount of input data is used to constrain a larger number of parameters, thereby leading to a slight degradation in prediction accuracy.
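To make the parametric reconstruction step concrete, the sketch below renders a crescent from four parameters and forms the difference between an input image and an image rebuilt from slightly different, hypothetical "predicted" parameters. It assumes a crescent defined as a uniform outer disc with an offset inner disc removed; the exact phenomenological definition used in this work may differ, and the function name and parameter values are illustrative only.

```python
import numpy as np


def crescent_image(r_outer, r_inner, dx, dy, npix=64):
    """Uniform outer disc with an inner disc, offset by (dx, dy), removed.

    All lengths are in pixels; the outer disc is centred on the image.
    """
    y, x = np.indices((npix, npix)) - npix // 2
    outer = x**2 + y**2 <= r_outer**2
    inner = (x - dx) ** 2 + (y - dy) ** 2 <= r_inner**2
    return (outer & ~inner).astype(float)


# Hypothetical true parameters and hypothetical model predictions
truth = crescent_image(r_outer=20, r_inner=12, dx=4, dy=-3)
recon = crescent_image(r_outer=20.5, r_inner=11.5, dx=4.2, dy=-2.8)

# Difference image between the input and the parametric reconstruction
residual = truth - recon
print(np.abs(residual).sum() / truth.sum())
```

Small parameter errors show up in the difference image as thin rings along the inner and outer edges of the crescent, which is a useful visual diagnostic of which parameter is mis-estimated.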

Dual disc
A double disc is more complex than a crescent because, in addition to the radii of the two discs ($r_1$ and $r_2$) and the displacement of the centre of one relative to the other ($\Delta x$ and $\Delta y$), the ratio of the two intensities ($I_1/I_2$) is an extra parameter. Again, we use an MLP model with ReLU activation, 20% dropout of neurons between layers, and an Adam optimiser (learning rate $10^{-3}$). We perform a 10-epoch training with 8-fold cross-validation.
Figs. 9a-9e show the input against the predicted parameters as violin plots. The ideal prediction is enveloped by the central 95% of the distribution of predicted parameters in all cases except $r_2$. The model suffers from an over-prediction of $r_2$ at smaller values and a significant under-prediction at larger values. The prediction accuracy appears to have worsened relative to the crescent case. This is again attributed to an increase in the number of morphological parameters for the same number of input closure invariants, which results in inaccurate constraints on $r_2$. Figs. 9f-9h show an input, the parametric reconstruction, and the difference image for the dual disc model.
To understand the origin of this systematic misestimation, we examine the Jacobian matrix, $\mathbf{J} \equiv \partial C_i / \partial p_j$, that describes the sensitivity of closure invariants, $C_i$, to changes in morphological parameters, $p_j$, to explore degeneracies in the system. In the case of the 2Fx2T aperture and dual disc models, we have 120 closure invariants (only 116 are independent and the rest are kept for bookkeeping convenience) and 5 dual disc morphology parameters. Thus, $\mathbf{J}$ is a 120 × 5 matrix. The rank of $\mathbf{J}$ indicates the degree of independence of the parameters in determining the closure invariants. A reduction from a full rank (5 in this case) corresponds to a commensurate number of degeneracies between the parameters, and a full rank implies that the closure invariants are sensitive to all 5 parameters with no degeneracies between them. We construct the Jacobian, $\mathbf{J}$, by introducing small perturbations to the chosen parameters and recording the corresponding changes in the closure invariants. We compute the rank of $\mathbf{J}$ by estimating the number of non-negligible singular values through Singular Value Decomposition (SVD). We repeat the rank determination over 1000 random realisations of the morphological parameters sampling various regions of the parameter space.
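The rank test described above can be sketched generically as follows. The finite-difference step, the relative threshold for deciding which singular values are "non-negligible", and the toy forward models are all assumptions of this sketch; in particular, `toy_degenerate` is a hypothetical stand-in for the map from morphological parameters to closure invariants, chosen so that two of its parameters are exactly degenerate.

```python
import numpy as np


def numerical_jacobian(forward, params, eps=1e-6):
    """Central finite-difference Jacobian d(forward_i) / d(params_j)."""
    params = np.asarray(params, dtype=float)
    f0 = np.asarray(forward(params))
    jac = np.empty((f0.size, params.size))
    for j in range(params.size):
        step = np.zeros_like(params)
        step[j] = eps
        jac[:, j] = (forward(params + step) - forward(params - step)) / (2 * eps)
    return jac


def jacobian_rank(jac, rtol=1e-6):
    """Number of singular values above rtol times the largest one."""
    s = np.linalg.svd(jac, compute_uv=False)
    return int(np.sum(s > rtol * s[0]))


def toy_degenerate(p):
    # Outputs depend on p[0] and p[1] only through their sum: one degeneracy
    s = p[0] + p[1]
    return np.array([np.sin(s), s**2, p[2], s * p[2], np.cos(p[2])])


def toy_full(p):
    # All three parameters affect the outputs independently
    return np.array([p[0], p[1], p[2], p[0] * p[1]])


p0 = np.array([0.3, 0.5, 1.2])
rank_degenerate = jacobian_rank(numerical_jacobian(toy_degenerate, p0))
rank_full = jacobian_rank(numerical_jacobian(toy_full, p0))
print(rank_degenerate, rank_full)
```

In the toy case the rank drops from 3 to 2 because two parameters enter only through their sum. Repeating such a test over many random parameter draws, as done in the text, indicates how often and where in parameter space degeneracies occur.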
Fig. 10 shows the distribution of the Jacobian's ranks (top right) from the 1000 realisations sampling the dual disc morphology. In a majority of instances (≈ 75%), the Jacobian, $J$, is found to be short of full rank, indicating a significant degree of degeneracy, where certain combinations of parameters are inconsequential in affecting the closure invariants. The corner plot illustrates the ranks as a function of the sampled parameters shown pairwise, with the marginal distributions shown on the diagonal. We are unable to identify clear trends in the behaviour of the Jacobian rank as a function of the input parameters. This confirms that complex degeneracies are present in the data considered here. Another plausible origin of the degeneracy is the rather sparse aperture coverage used here. Using more data by enhancing the aperture coverage could potentially resolve these degeneracies. Refinement of the ML models, in conjunction with larger data sizes, to mitigate the degeneracies and improve the prediction accuracy will be the subject of future work.

DISCUSSION AND SUMMARY
Extreme VLBI conditions such as the requirement of high-precision calibration, sparse data, and low signal-to-noise pose a big challenge to interferometric imaging. In such cases, closure invariants, which are immune to calibration errors, critically provide a reliable information anchor for estimating the image morphology and preventing divergent solutions and artefacts, and have indeed been successfully employed for decades. Despite their extensive use, a direct backward transformation from closure invariants to morphological parameters has not been straightforward. This, and the prospect that closure invariants can provide independent avenues for constraining image morphology, are the primary motivators of this study.
We demonstrate that Machine Learning methods applied to the closure invariants can be used to classify the image morphology as well as independently determine the morphological parameters if the morphological class is known a priori. The latter can aid in parametric image reconstruction. We study the ability of simple methods like logistic regression, multi-layer perceptron, and random forest classifier to classify six different image morphologies: point-like, uniform circular disc, dual disc, crescent, crescent with elliptical accretion disc, and crescent with double jet lobes. Among them, all except logistic regression yield $F_1$ scores ≳ 0.8 even with relatively sparse aperture coverage. The classification accuracy of all ML models notably improves with increasing aperture coverage and degrades with increasing noise. Our classifiers exhibit ambiguity between the crescent, jet, and accretion classes on untrained test inputs from ring and Sgr A* models. We attribute this to the fact that the class definitions used herein are not orthogonal but overlapping.
As a separate study, we employ simple multi-layer perceptron models to estimate morphological parameters such as radii, relative offsets, and ratio of intensities for the uniform disc, crescent, and dual disc morphologies. With these estimates, we perform parametric image reconstruction without information about absolute position or intensity scale. The estimated parameters and reconstructed images are largely consistent and significantly correlated with the corresponding true values. Increasing the morphological complexity while using a relatively sparse aperture coverage results in a corresponding loss of accuracy in the estimated parameters. This is evidently due to the insensitivity of closure invariants to certain combinations of parameters owing to inherent degeneracies between them. Increasing the aperture coverage and the sophistication of the ML models is expected to resolve these degeneracies and improve the prediction accuracy, which will be the subject of a future study.
Our proof-of-concept method demonstrates, using a backward-mapping approach, that both constraints on object morphologies and reconstruction of images are possible using interferometric closure invariants. It offers an independent approach that may be particularly suited for sparsely-sampled observations with challenging calibration requirements such as the EHT. Machine learning methods can be useful in mitigating the degeneracies imposed by local minima and sparsity of information that may be encountered in existing methods when mapping from closure invariants to image morphologies. The potential of this method demonstrated herein warrants further exploration, the next steps of which will be to characterise its performance relative to other imaging methods. This approach could be used to complement, rather than replace, traditional methods since closure invariants do not carry any inherent information about absolute positions or intensities, and potentially lack other information owing to the degrees of freedom lost relative to reliably calibrated visibilities.

Figure 1. Example images of six morphological classes that are classified using closure invariants. Each pixel corresponds to 3.52 $\mu$as. (a) Point-like objects are simulated at random locations with random intensities. (b) Uniform circular disc morphology is parametrised by its radius. (c) Double disc morphology is parametrised by the radii of the two discs, the $x$- and $y$-offsets between their centres, and the ratio of their intensities. (d) A crescent morphology is parametrised by the inner and outer radii, and the $x$- and $y$-offsets of the inner hollow region from the centre. (e) A crescent with accretion disc morphology is parametrised by the crescent parameters in (d), an elliptical accretion disc parametrised by semi-major axis with an ellipticity of 0.2 and a position angle, and an intensity ratio between the crescent and the accretion disc. (f) A crescent with double jet morphology consists of a crescent as parametrised above, in addition to two diametrically opposite jets with $x$- and $y$-offsets from the centre, and two intensities relative to the crescent structure.

Fig. 2b shows the closure invariants computed for various realisations of each morphology. The classes are labelled on the left with horizontal lines denoting class boundaries, each of which includes 10,000 realisations. The closure invariants are numbered along the $x$-axis. Alternating bands of 15 closure invariants correspond to the real and imaginary parts of the advariants (Thyagarajan et al. 2022), respectively. The three aperture sampling configurations, 1Fx1T, 1Fx2T, and 2Fx2T, correspond to the first 30, 60, and 120 closure invariants, respectively. The colour scale represents the value of the closure invariants. We note that for a point-like morphology, all the complex-valued advariants have real and equal values. Therefore, the closure invariants consist of equal numbers of ones and zeros. This is equivalent to having zero closure phases and unit closure amplitudes independent of the location or the absolute intensity, as expected from the centrosymmetry of a point-like object (Thompson et al. 2017; Thyagarajan et al. 2022). Similarly, the symmetry of a uniform circular disc is reflected in the vanishing of all the imaginary parts. More complicated morphological classes have correspondingly non-trivial combinations of real and imaginary parts.
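The point-source property noted above can be checked numerically with the traditional closure phase and closure amplitude, rather than the advariant formalism used in this work. The sketch below is illustrative only: the station positions, source offset, intensity, and complex gains are arbitrary assumed values, and `visibility` is a hypothetical helper that applies the standard point-source visibility together with station-based gain corruption.

```python
import numpy as np

rng = np.random.default_rng(1)
n_stations = 4

# Assumed station positions in units of wavelengths (arbitrary values).
positions = rng.uniform(-1e9, 1e9, size=(n_stations, 2))
# Assumed point source: offset from the phase centre (radians) and intensity.
source_offset = np.array([5e-11, -8e-11])
intensity = 2.7
# Unknown station-based complex gains, i.e. calibration errors.
gains = rng.uniform(0.5, 1.5, n_stations) * np.exp(1j * rng.uniform(-np.pi, np.pi, n_stations))


def visibility(a, b):
    """Gain-corrupted visibility of the point source on baseline a -> b."""
    baseline = positions[b] - positions[a]
    true_vis = intensity * np.exp(-2j * np.pi * (baseline @ source_offset))
    return gains[a] * np.conj(gains[b]) * true_vis


# Closure phase: phase of the visibility product around the triangle (0, 1, 2).
closure_phase = np.angle(visibility(0, 1) * visibility(1, 2) * visibility(2, 0))

# Closure amplitude on the quadrilateral (0, 1, 2, 3).
closure_amplitude = (abs(visibility(0, 1)) * abs(visibility(2, 3))
                     / (abs(visibility(0, 2)) * abs(visibility(1, 3))))
```

Because the baselines around a triangle sum to zero and each gain appears once with its conjugate, `closure_phase` vanishes and `closure_amplitude` equals unity to numerical precision, regardless of the assumed gains, source position, or intensity.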
(a) Sampled Fourier components in the aperture plane. (b) Closure invariants for various aperture models.

Figure 2. (a) Aperture (Fourier) plane sampling (in units of wavelengths) obtained using locations of seven telescopes in the EHT array at $f_1 = 230$ GHz (red) and $f_2 = 345$ GHz (blue) at times $t_1$ = 2017-04-05 04:46:05 UTC (+ symbols) and $t_2$ = 2017-04-05 05:34:05 UTC (× symbols). The inset shows a zoomed view of the shortest element spacings. The three aperture sampling configurations in this paper correspond to $f_1$ at $t_1$ (red +, denoted by 1Fx1T), $f_1$ at $t_1$ and $t_2$ (red + and ×, denoted by 1Fx2T), and $f_1$ and $f_2$ at $t_1$ and $t_2$ (red and blue + and ×, denoted by 2Fx2T). (b) Closure invariants for 10,000 random realisations of parameters under each of the six morphological classes as labelled on the $y$-axis. The closure invariants are indexed on the $x$-axis. The aperture models 1Fx1T, 1Fx2T, and 2Fx2T correspond to the first 30, 60, and 120 indices, respectively. The values of closure invariants are indicated by the colour scale. Alternating bands of 15 indices correspond to the real and imaginary values of the complex advariants from which the real-valued co-polar closure invariants are derived using the Abelian gauge theory formalism (Thyagarajan et al. 2022), in which all point-like morphologies have zero imaginary values and unit real values regardless of their location or absolute intensities.

Figure 3. Confusion matrices (in percentage) of (a) Logistic Regression, (b) Multi-layer Perceptron, and (c) Random Forest from training for image morphology classification. Diagonals denote accurate classifications while off-diagonals denote misclassifications.

Figure 5. $F_1$ score variation as a function of injected noise in test inputs, denoted by SNR$^{-1}$, for logistic regression (blue dashed line connecting filled circles), multi-layer perceptron (black dotted line connecting filled triangles), and random forest (red solid line connecting crosses) models that were trained on noisy data using the 2Fx2T aperture coverage model. The error bars denote 95% confidence intervals. The $x$-axis uses a linear-logarithmic scaling. The $F_1$ scores of our ML models trained on noiseless data but tested on noisy inputs are presented in Fig. B1.
Figure 6. (a) MLP model classification of $m$-ring data (untrained class). The $m$-ring test data (order $m = 1$) are predominantly classified as belonging to the crescent and jet classes. A few examples of $m$-ring data classified as crescents and jets are shown in the middle and right panels, respectively. (b) MLP model classification of Sgr A* images from radiatively inefficient accretion flow (RIAF) models (another untrained class). The RIAF Sgr A* test data are predominantly classified as belonging to the crescent, accretion, and jet classes. A few examples of the RIAF data classified under crescent and accretion categories are shown in the middle and right panels, respectively. Note that our jet and accretion classes include a crescent as well (see Fig. 1).
Figure 7. Histogram (in percent) of percentage error in the prediction of uniform circular disc radius using MLP (see Fig. A2a). The box-whisker plot shows the median, lower and upper quartiles, and the 95% confidence interval (≃ ±0.04%) of the prediction error.

Figure 8. Estimated parameters of crescent using MLP (see Fig. A2b) and parametric image reconstruction example. Predicted parameters against input parameters are shown for (a) outer disc radius, $r_1$, (b) inner disc radius, $r_2$, (c) $x$-offset, and (d) $y$-offset between the two discs. The diagonal lines denote perfect prediction. The mean and central 95th percentile are marked on the "violin" plots. The shade of the "violin" indicates the relative number of input data in that bin. (e) Input crescent image example. (f) Parametrically reconstructed crescent image from estimated parameters. (g) Difference between input and reconstructed image.

Figure 9. Estimated parameters of dual disc using MLP (see Fig. A2c) and parametric image reconstruction example. Predicted parameters against input parameters are shown for (a) larger disc radius, $r_1$, (b) inner disc radius, $r_2$, (c) ratio of two disc intensities, $I_1/I_2$, (d) $x$-offset, and (e) $y$-offset between the two discs. The diagonal lines and "violin" plots have the same meaning as in Fig. 8. Panels (f), (g), and (h) show input, reconstructed, and difference images, respectively, of a dual disc example.

Figure A1. Machine learning models used for classification for the 2Fx2T aperture model.
The eht-imaging software, which contains the exact locations of the EHT array elements (Event Horizon Telescope Collaboration et al. 2019b), provides our fiducial array layout. Two fiducial times, $t_1$ = 2017-04-05 04:46:05 UTC and $t_2$ = 2017-04-05 05:34:05 UTC, are chosen to have seven locations of telescopes of the EHT array for which an object at RA = 12h 30m 49.42s, Dec = +12° 23′ 28.04″ (J2000) would be visible simultaneously. The seven locations correspond to those of the Atacama Large Millimeter/submillimeter Array (ALMA; Matthews et al. 2018) and the Atacama Pathfinder Experiment telescope (APEX; Wagner et al. 2015) in Chile, the Large Millimeter Telescope Alfonso Serrano (LMT; Ortiz-León et al. 2016) in Mexico, the IRAM 30 m telescope on Pico Veleta (PV; Greve et al. 1995) in Spain, the Submillimeter Telescope Observatory (SMT; Baars et al. 1999) in Arizona, and the James Clerk Maxwell Telescope (JCMT) and the Submillimeter Array (SMA; Young et al. 2016) in Hawai'i. For this demonstrative study, we treat all telescopes as identical even though they exhibit substantial differences in their actual characteristics (Event Horizon Telescope Collaboration et al. 2019b).