Structure Effects for 3417 Celestial Reference Frame Radio Sources

Geodetic/astrometric very long baseline interferometry (VLBI) has been routinely observing using various global networks for 40 yr, and it has produced more than 10 million baseline group delay, phase, and amplitude observables. These group delay observables are analyzed worldwide for geodetic and astrometric applications, for instance, to create the International Celestial Reference Frame (ICRF). The phase and amplitude observables are used in this paper, by means of closure analysis, to study intrinsic source structures and their evolution over time. The closure amplitude rms, CARMS, indicating how far away a source is from being compact in terms of morphology, is calculated for each individual source. The overall structure-effect magnitudes for 3417 ICRF radio sources are quantified. CARMS values larger than 0.3 suggest significant source structures and those larger than 0.4 indicate very extended source structures. The 30 most frequently observed sources, which constitute 40% of current geodetic VLBI observables, are studied in detail. The quality of ICRF sources for astrometry is evaluated by examining the CARMS values. It is confirmed that sources with CARMS values larger than 0.30 can contribute residual errors of about 15 ps to geodetic VLBI data analysis and those with the CARMS values larger than 0.4 generally can contribute more than 20 ps. We recommend CARMS values as an indicator of the astrometric quality for the ICRF sources and the continuous monitoring of the ICRF sources to update CARMS values with new VLBI observations as they become available.


Introduction
Extragalactic radio sources have been routinely observed by geodetic/astrometric very long baseline interferometry (VLBI) since 1979 and have been used, for example, to create the International Celestial Reference Frame (ICRF; e.g., Johnston et al. 1995; Ma et al. 1998) adopted by the International Astronomical Union as the fundamental celestial reference frame. The newly released third realization of the ICRF (ICRF3; Charlot et al. 2018) contains 4536 radio sources on the sky. The majority of radio sources in ICRF3 have formal position uncertainties smaller than 1 mas. Since ICRF radio sources are too distant to exhibit any detectable proper motion, except apparent proper motions due to the acceleration of the solar system barycenter (see Titov et al. 2011; Xu et al. 2012; Titov & Lambert 2013), the ICRFs are considered to be global and quasi-inertial celestial reference frames accurate at the submilliarcsecond level. For instance, the second realization of the ICRF (ICRF2; Fey et al. 2015) claims that its axis stability is at the level of 10 μas.
As pointed out when the first realization of the ICRF (ICRF1) was created, however, the underlying kinematic physics of ICRF radio sources was not as well understood as that of stars (Ma et al. 1998), not yet enough to promise such a high stability of the ICRF for a long term. The radio emissions of the ICRF sources generally exhibit spatially extended structure on milliarcsecond scales. More importantly, the intrinsic structures are variable in time, and it is clear that many radio sources have changed their reference positions by far larger than their uncertainties in ICRF because of the changes in their structures.
Source structure has three negative impacts on geodetic/astrometric VLBI. First, it leads to variations in the reference position of a source, which can be caused either by a change in observing geometry or by a change in intrinsic structure. Second, as shown, for instance, in Charlot (1990), it gives rise to structure delay in the group delay observable up to a few nanoseconds. The first impact, the absolute structure effects, can be absorbed by estimating absolute source position, but the second impact, the relative structure effects, introduces error contributions to residuals and can bias estimates of other geodetic parameters. Third, as studied in Anderson & Xu (2018a), it reduces the actual signal-to-noise ratio $R_{\mathrm{sn}}$ and adds additional thermal noise to VLBI measurements. These three impacts have made source structure a critical issue in geodetic/astrometric VLBI.
Because geodetic/astrometric VLBI assumes that the ICRF sources are ideally compact and source-structure effects are not modeled in data analysis, source structure leads to errors in geodetic parameter estimates and causes instabilities of the ICRF. Source structure and its variability should be invaluable indicators of source quality for both the creation and the applications of the ICRFs. For example, Fey et al. (1996) and Fey & Charlot (1997, 2000) derived images of 389 ICRF radio sources from dual-frequency observations by the Very Long Baseline Array (VLBA), and based on those images, they developed so-called structure indices, which are used to categorize ICRF2 sources and exclude "bad" sources from routine observations in scheduling. In 2018 April, the Gaia data release 2 was published (Lindegren et al. 2018). Gaia will observe for another several years and will thereby improve the accuracy of the positions of radio sources at optical wavelengths, which do not have to be identical to the positions at radio wavelengths at the level of the uncertainties of Gaia and VLBI (e.g., Petrov & Kovalev 2017; Gaia Collaboration et al. 2018). Investigating the source structure of each individual ICRF source over the whole time span of geodetic/astrometric VLBI observations is therefore critical both for selecting radio sources with high astrometric qualities to link optical and radio frames (e.g., Bourda et al. 2008, 2011) and for understanding the position offset between Gaia and VLBI for individual sources (e.g., Kovalev et al. 2017; Petrov & Kovalev 2017; Petrov et al. 2018).
On the other hand, for geodetic and geophysical applications, geodetic VLBI has devised its next-generation system, the so-called VLBI Global Observing System (VGOS), to meet the very stringent science requirements for the terrestrial reference frame (TRF), an accuracy of 1 mm for global scales and 0.1 mm yr$^{-1}$ for long-term stability (Niell et al. 2006; Petrachenko et al. 2009). Prototype VGOS systems have been implemented and obtained broadband observations with a measurement noise of only a few picoseconds (Niell et al. 2018). Based on these prototype VGOS systems, three of the four proposed strategies that were conceived to achieve its goals are addressed fully or partly: to reduce the average source-switching interval, to decrease the random thermal noise in the delay measurement, and to minimize the susceptibility to radio frequency interference (RFI). These can be addressed mainly by technical improvements within the VGOS antenna system, for example, by installing ultra-broadband receivers and significantly increasing the slewing rates of telescopes. As concluded by Niell et al. (2018), however, the reduction of systematic errors including source structure, which is the fourth strategy component of the VGOS system, is not yet addressed and remains the main challenge. The initial analysis of CONT17 observations from a global VGOS network has confirmed that several sources have extremely large postfit delay residuals, such as 3C 418, 3C 371, 0229+131, 0552+398, and 2229+695, and the observations of these sources have to be deselected in geodetic VLBI data analysis (P. Elosegui 2018, private communication). The broadband VGOS systems, the observing frequency of which ranges from 2 to 14 GHz, will bring several benefits for geodesy, but they certainly will make the impacts of structure effects much worse than the traditional S/X systems. Unlike the first three strategies of the VGOS system, in order to reduce the systematic errors in VLBI observables, a significant investigation of VLBI observations and a different method of data analysis are required.
The goal of this paper is to demonstrate the source-structure effects, to identify the variations in their intrinsic structures over time, and finally to quantify the overall structure-effect magnitudes for as many ICRF3 sources as possible. The entire data set of VLBI observations used for the creation of the ICRF3 and geodetic VLBI applications is analyzed, but with different types of observables, i.e., phase and amplitude, to provide independent insights into the quality of ICRF3 sources. This paper is structured as follows. The geodetic/astrometric VLBI observations are described in Section 2. The methodology is given in Section 3. The results are presented in Section 4, with discussions of closure phases in Section 4.1, closure amplitudes in Section 4.2, and the overall structure-effect magnitudes in Section 4.3. The usages of these overall structure-effect magnitudes are discussed in Section 5, while the summary and future work follow in Section 6. The closure phase and closure amplitude plots for the sources among the 30 most frequently observed sources that are not addressed in the main body of this paper are presented in the supplemental information at [doi:10.11570/19.0007].

Data
We analyzed almost 40 yr of dual-frequency, S/X band, geodetic/astrometric VLBI observations driven by various global geodetic VLBI observing campaigns (e.g., the NASA Crustal Dynamics Project (Coates et al. 1985; Smith & Baltuck 1993) and the International Radio Interferometric Survey program (Carter & Robertson 1986)) since 1979, and coordinated by the International VLBI Service for Geodesy and Astrometry (IVS; Schuh & Behrend 2012; Nothnagel et al. 2017; please also refer to the IVS website) since its foundation in 1999. Geodetic/astrometric VLBI sessions in general have a duration of either 24 hr or 1 hr. The majority of these 24 hr sessions are Earth Orientation Parameter (EOP) sessions (twice per week, R1 and R4), TRF sessions (once every two months), and CRF sessions (at least once every two months). With other dedicated types of sessions and regional sessions filled in, the 24 hr sessions take place three to four times per week on average. The 1 hr sessions, called intensives, are operated on a daily basis to monitor Earth's phase of rotation. These intensive sessions are observed by a single baseline seven times per week and by three or four stations once per week. Refer to the aforementioned IVS website for a more detailed description of IVS operations. A complete list of these sessions every year since 1979 is also publicly available. All 24 hr sessions and the 1 hr sessions with an observing network of at least four stations available on 2018 February 26 were used in our analysis. A total of 14,759,830 observations of 5228 radio sources were obtained in 6533 24 hr sessions and 150 intensive sessions by 191 stations with about 40 yr of observing history, as listed in Table 1. Even though phase delays are well known to be more precise than group delays and have long been used by astronomers to study various celestial radio sources, they have been rarely used for geodetic VLBI, due to unresolved ambiguity.
The observed amplitudes are important for astrophysical imaging, but in geodesy they are only used empirically to monitor the total fluxes of radio sources, which are subsequently used for scheduling observations (Le Bail et al. 2016). This is the first attempt ever to extensively exploit both phase and amplitude observables in regular geodetic/astrometric VLBI sessions. Those observables were extracted from the vgosDB or Goddard databases by searching for the predefined keywords listed in the supplemental information available at [doi:10.11570/19.0007]. Because about 40% of S-band observations were missing all or parts of the necessary keywords in the archived databases, the results reported here are restricted to X-band observations only.
Geodetic/astrometric VLBI schedules observations with an expected $R_{\mathrm{sn}}$ of at least 20 at the X band and 15 at the S band for all baselines in each scan. Due to strong local RFI and resolved source structures, a significant reduction in the $R_{\mathrm{sn}}$ for actual measurements can happen. In this study, initial outlier flagging was done by setting the minimum $R_{\mathrm{sn}}$ to be 6. After the initial flagging, the mean and median values of $R_{\mathrm{sn}}$ for the X band are 112 and 47, respectively; about 23% of observables have an actual $R_{\mathrm{sn}}$ smaller than 25 at the X band. For the cases of $R_{\mathrm{sn}} > 6$, the probability functions for the amplitude and phase observables are approximately Gaussian distributions defined by the independent zero-mean thermal noises; therefore, the uncertainty of the phase observable due to the thermal noise, $\sigma_\phi$, can be obtained from Thompson et al. (2017, Equation (9.67)),
$$\sigma_\phi = \frac{1}{R_{\mathrm{sn}}}, \qquad (1)$$
and the uncertainty of the amplitude observable, $\sigma_\nu$, can also be determined from Thompson et al. (2017, Equation (9.66)),
$$\sigma_\nu = \frac{\nu}{R_{\mathrm{sn}}}, \qquad (2)$$
where $\nu$ is the correlation amplitude. Thus, the phase and amplitude observables have their uncertainties calculated based on these two equations, $\nu$, and $R_{\mathrm{sn}}$. Another important quantity for our study is the reference frequency for phase observables, which for these X-band data ranges from 7741.99 to 8794.99 MHz. The three reference frequencies that are related to about 90% of observations are 8212.99 MHz (55%), 8210.99 MHz (25%), and 8409.99 MHz (10%).
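As a rough illustration of how these uncertainties follow from the signal-to-noise ratio, the sketch below assumes the high-signal-to-noise approximations sigma_phi ≈ 1/R_sn (in radians) and sigma_nu ≈ nu/R_sn, together with the minimum R_sn of 6 used for the initial flagging. The function names are hypothetical and are not taken from any VLBI software package; treating the threshold as inclusive is also an assumption.

```python
import math

MIN_SNR = 6.0  # minimum signal-to-noise ratio kept by the initial flagging


def phase_sigma_deg(snr: float) -> float:
    """Thermal phase uncertainty in degrees, assuming sigma_phi = 1 / R_sn rad."""
    if snr <= 0.0:
        raise ValueError("signal-to-noise ratio must be positive")
    return math.degrees(1.0 / snr)


def amplitude_sigma(amplitude: float, snr: float) -> float:
    """Thermal amplitude uncertainty, assuming sigma_nu = nu / R_sn."""
    if snr <= 0.0:
        raise ValueError("signal-to-noise ratio must be positive")
    return amplitude / snr


def keep_observable(snr: float) -> bool:
    """Initial flagging: keep observables with R_sn of at least 6 (assumed inclusive)."""
    return snr >= MIN_SNR


if __name__ == "__main__":
    # Flagging limit, X-band scheduling target, median, and mean R_sn from the text.
    for snr in (6.0, 20.0, 47.0, 112.0):
        print(f"R_sn={snr:6.1f}  sigma_phi={phase_sigma_deg(snr):5.2f} deg  "
              f"keep={keep_observable(snr)}")
```

Under this approximation, an observable at the X-band scheduling target of 20 has a thermal phase uncertainty just under 3°.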

Closure Analysis of Geodetic/Astrometric VLBI Data
The method of closure analysis was applied for this study. Closure analysis for group delay observables was introduced and conducted by Xu et al. (2016), who found that source 0642+449, one of the ICRF2 defining sources, had two compact cores separated by about 500 μas in the R.A. direction in 2014 May. The principle of that method is that group delays on shorter baselines were used to cancel out the geometric delays, clocks, and other station-based effects for observables on long baselines (>7100 km), and the latter were used to measure source-structure effects. The model of closure delay for geodetic VLBI was given there. Closure analysis was further applied to geodetic VLBI for phase and amplitude observables (Xu et al. 2017), where direct model fitting of closure phases obtained the structure of 3C 371 with a model of three components. The results were proven to agree with the structure obtained from the traditional imaging process of visibility data by comparing the resultant closure phases based on the two methods. The model of closure phase for geodetic VLBI observations was developed. A more comprehensive study of closure analysis was recently done by Anderson & Xu (2018a). Instead of a specific case study for a few sources, they studied all 73 sources in CONT14 in detail both via closure analysis and imaging. The conclusion was that source structure is a major contributor to errors in geodetic VLBI. The closure plots for group delay, phase, and amplitude for all possible combinations of three or four stations for all sources in CONT14 during 15 days for both the S and X bands can be found in Anderson & Xu (2018b). The comparison of closures from observables and from imaging results can be obtained from those plots or from the tables in the supporting information of Anderson & Xu (2018a).
Recalling the idea of closure quantities, for a wavefront of a radio source, the closure phase and delay are the sum of the phase and delay observables around a closed triangle of three stations, while for four stations, a, b, c, and d, the closure amplitude $\nu^{\mathrm{clr}}$ is defined in the study here as
$$\nu^{\mathrm{clr}} = \ln\frac{\nu_{ab}\,\nu_{cd}}{\nu_{ac}\,\nu_{bd}}, \qquad (3)$$
where, for instance, $\nu_{ab}$ is the amplitude observable on baseline ab. The sequence of the four stations appearing in the previous equation matters in forming the closure amplitude. For these four stations, another two closure amplitudes with different station orders, $\nu_{abdc}$ and $\nu_{acdb}$, can be formed to give different absolute values of the closure amplitudes defined by Equation (3), while only two out of the three closure amplitudes are independent. For clarity in this context, one closure phase or one closure amplitude refers to an individual closure value, while a triangle or a quadrangle refers to a certain combination of three or four stations in a certain sequence. One must note that calculating the closure phase and delay needs to meet the requirement of a closed triangle with moving stations for geodetic VLBI, where the observables in one scan are not necessarily referring to the same wavefront. The models and explanations were addressed in our previous studies (see Xu et al. 2016; Anderson & Xu 2018a). Based on the assumption of independent thermal noise, the closure phase uncertainty $\sigma_{\phi}^{\mathrm{clr}}$ was derived from the uncertainties of the phase observables of the three baselines, calculated from Equation (1), and the closure amplitude uncertainty $\sigma_{\nu}^{\mathrm{clr}}$ from
$$\sigma_{\nu}^{\mathrm{clr}} = \sqrt{\left(\frac{\sigma_{\nu_{ab}}}{\nu_{ab}}\right)^2 + \left(\frac{\sigma_{\nu_{cd}}}{\nu_{cd}}\right)^2 + \left(\frac{\sigma_{\nu_{ac}}}{\nu_{ac}}\right)^2 + \left(\frac{\sigma_{\nu_{bd}}}{\nu_{bd}}\right)^2}, \qquad (4)$$
where, for instance, $\sigma_{\nu_{ab}}$ is the uncertainty of the amplitude observable on baseline ab, which is calculated based on Equation (2).
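The closure quantities described above can be sketched in a few lines. The example assumes a logarithmic closure amplitude, ln[(ν_ab ν_cd)/(ν_ac ν_bd)], which has an expected value of zero for a point source, and quadrature propagation of independent thermal noise. It ignores the scan-timing subtlety of moving stations mentioned in the text, and all function names are hypothetical.

```python
import math


def wrap_deg(angle: float) -> float:
    """Wrap an angle in degrees into the interval [-180, 180)."""
    return (angle + 180.0) % 360.0 - 180.0


def closure_phase_deg(phi_ab: float, phi_bc: float, phi_ca: float) -> float:
    """Sum of the baseline phases (degrees) around the closed triangle a-b-c."""
    return wrap_deg(phi_ab + phi_bc + phi_ca)


def closure_phase_sigma_deg(sig_ab: float, sig_bc: float, sig_ca: float) -> float:
    """Quadrature sum of the three independent phase uncertainties."""
    return math.sqrt(sig_ab**2 + sig_bc**2 + sig_ca**2)


def log_closure_amplitude(amp_ab: float, amp_cd: float,
                          amp_ac: float, amp_bd: float) -> float:
    """Assumed definition: ln[(amp_ab * amp_cd) / (amp_ac * amp_bd)]."""
    return math.log((amp_ab * amp_cd) / (amp_ac * amp_bd))


def log_closure_amplitude_sigma(amps, sigmas) -> float:
    """Quadrature sum of the relative amplitude uncertainties of four baselines."""
    return math.sqrt(sum((s / a) ** 2 for a, s in zip(amps, sigmas)))


if __name__ == "__main__":
    # Station-based phase errors (degrees) cancel around the triangle.
    err = {"a": 12.0, "b": -7.5, "c": 30.0}
    phases = (err["a"] - err["b"], err["b"] - err["c"], err["c"] - err["a"])
    print("closure phase of a point source:", closure_phase_deg(*phases))

    # Station-based gain errors cancel in the closure amplitude.
    g = {"a": 1.2, "b": 0.8, "c": 1.5, "d": 0.5}
    value = log_closure_amplitude(g["a"] * g["b"], g["c"] * g["d"],
                                  g["a"] * g["c"], g["b"] * g["d"])
    print("log closure amplitude of a point source:", value)
```

The demonstration shows why no calibration is needed: station-based phase and gain errors drop out, so any remaining departure from zero beyond thermal noise is attributable to source structure.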
A rich variety of stations and networks was used by geodetic/astrometric VLBI; the VLBI networks used in individual sessions had as few as two stations and as many as more than 30 stations, and the antenna sizes ranged from 3 up to 100 m. The IVS currently organizes the VLBI observing activities utilizing worldwide available resources in order to routinely carry out 24 hr sessions three to four times per week and 1 hr sessions eight times per week, and to keep the rapid turnaround sessions with their final results to be delivered to the public within a reasonably short time (the latency of EOP sessions is about two weeks, and it is about one day for intensives). Therefore, the 40 yr history of geodetic VLBI observations involves changes in station networks, observing strategy, frequency setup, sampling rate, hardware for recording system, correlators, and so on. Compared to the CONT14 data for our previous studies, the study in this paper requires a more comprehensive processing of all historical data. An important and unavoidable process is outlier flagging in closure analysis.
Two key points for outlier flagging are (1) the closure phase, closure delay, and closure amplitude defined by Equation (3) all have an expected value of zero with a thermal noise (Gaussian) distribution for point-like sources, and (2) for extended sources, source-structure effects are generally symmetric and continuous (except for breaks in closure phases due to ambiguity) as a function of 24 hr of Greenwich Mean Sidereal Time (GMST) for each individual baseline. At the same GMST from different dates, the geometry of a baseline with respect to a source is identical (by ignoring the nutation effects on the R.A., polar motion, and station movements). If the source structure is stable during a period of time, it allows us to combine the closures from adjacent observations as a function of wrapped GMST, which helps a great deal for flagging.
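The idea of combining closures from different dates as a function of wrapped GMST can be illustrated with a simple binning scheme. This is only a sketch under stated assumptions: the 1 hr bin width, the 5σ rejection limit, and the function names are hypothetical choices and not the procedure or thresholds actually used for the flagging.

```python
import statistics
from collections import defaultdict


def gmst_bin(gmst_hours: float, bin_width_hr: float = 1.0) -> int:
    """Index of the wrapped-GMST bin (0 to 24 hr) that an epoch falls into."""
    return int((gmst_hours % 24.0) // bin_width_hr)


def flag_against_bin_median(gmst_hours, closures, sigmas,
                            k: float = 5.0, bin_width_hr: float = 1.0):
    """Flag closures deviating from the median of their GMST bin by more than k sigma.

    Closures from different dates share a bin when their GMST is similar, which
    is meaningful only while the source structure is stable.
    """
    bins = defaultdict(list)
    for t, c in zip(gmst_hours, closures):
        bins[gmst_bin(t, bin_width_hr)].append(c)
    medians = {b: statistics.median(values) for b, values in bins.items()}
    return [abs(c - medians[gmst_bin(t, bin_width_hr)]) > k * s
            for t, c, s in zip(gmst_hours, closures, sigmas)]


if __name__ == "__main__":
    # Four closures near GMST 1 hr taken on different dates, one near 13 hr.
    gmst = [1.2, 1.3, 1.7, 1.1, 13.0]
    closure_deg = [10.0, 11.0, 9.0, 60.0, -5.0]
    sigma_deg = [2.0] * 5
    print(flag_against_bin_median(gmst, closure_deg, sigma_deg))
```

A bin holding a single closure cannot be checked this way, which is one reason why combining many sessions helps the flagging.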
Outlier flagging was done session-wise based on the behaviors of closure quantities over all sources in one session. Three main types of phase outliers were detected for these 6533 sessions:
1. Noise-like: the standard deviations of closure phases with respect to zero and to their mean for the whole session, a specific station, or a baseline are larger than the threshold;
2. Constant offset: the ratio of the mean closure phase to its standard deviation for a station or a baseline is larger than the threshold, by sorting out the closures involving the station or the baseline;
3. 180° jump: closure phases of a station or a baseline are grouped at 0° and 180°, by sorting out the closures involving the station or the baseline.
The constant offset outlier detection was also applied to closure amplitude. Three very short baselines were completely excluded both for phase and amplitude: WETTZ13N-WETTZELL, YEBES40M-REAGYEB, and HOBART26-HOBART12.
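The three types of phase outliers can be expressed as simple tests on the closure phases that involve one station or baseline. The sketch below is illustrative only: the thresholds (60°, a ratio of 1, a 10° tolerance, and the fractions) are hypothetical placeholders, since the actual threshold values are not stated here, and the mean is computed linearly rather than circularly.

```python
import math
import statistics


def wrap_deg(angle: float) -> float:
    """Wrap an angle in degrees into the interval [-180, 180)."""
    return (angle + 180.0) % 360.0 - 180.0


def is_noise_like(closures_deg, threshold_deg: float = 60.0) -> bool:
    """Scatter about zero and about the mean both exceed the threshold."""
    rms_zero = math.sqrt(sum(c * c for c in closures_deg) / len(closures_deg))
    return rms_zero > threshold_deg and statistics.pstdev(closures_deg) > threshold_deg


def has_constant_offset(closures_deg, ratio_threshold: float = 1.0) -> bool:
    """Ratio of the absolute mean to the standard deviation exceeds the threshold."""
    spread = statistics.pstdev(closures_deg)
    if spread == 0.0:
        return closures_deg[0] != 0.0
    return abs(statistics.mean(closures_deg)) / spread > ratio_threshold


def has_180_jump(closures_deg, tol_deg: float = 10.0,
                 min_fraction: float = 0.2) -> bool:
    """Closure phases cluster in two groups, one near 0 deg and one near 180 deg."""
    wrapped = [abs(wrap_deg(c)) for c in closures_deg]
    near_0 = sum(w <= tol_deg for w in wrapped) / len(wrapped)
    near_180 = sum(w >= 180.0 - tol_deg for w in wrapped) / len(wrapped)
    return (near_0 >= min_fraction and near_180 >= min_fraction
            and near_0 + near_180 >= 0.8)


if __name__ == "__main__":
    samples = {
        "clean": [1.0, -2.0, 0.5, -1.5, 2.0, -0.5],
        "offset": [20.0, 22.0, 19.0, 21.0, 18.0, 20.0],
        "jump": [1.0, -2.0, 179.0, -178.0, 0.5, 180.0, -1.5, 177.5],
        "noisy": [-170.0, 95.0, -60.0, 140.0, 10.0, -120.0, 175.0, -30.0],
    }
    for name, data in samples.items():
        print(name, is_noise_like(data), has_constant_offset(data), has_180_jump(data))
```

In practice such tests would be run per session on the subsets of closures that involve a given station or baseline, so that a misbehaving station can be isolated from genuine source-structure signal.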
Statistics were then calculated based on the resultant closure quantities for each source. Because the expected value of closures for a compact point source is zero, the departures of closure phase and closure amplitude with respect to zero in principle indicate how far the structure deviates from an ideal point in terms of its effects on phase and amplitude observables. The statistical quantity of phase, the closure phase rms (CPRMS), is defined as
$$\mathrm{CPRMS} = \sqrt{\frac{\sum_i w_i \left(\phi^{\mathrm{clr}}_i\right)^2}{\sum_i w_i}}, \qquad (5)$$
where $\phi^{\mathrm{clr}}_i$ is the $i$th closure phase of a source and $w_i$ is its weight. Accordingly, the statistical quantity of amplitude, the closure amplitude rms (CARMS), is defined as
$$\mathrm{CARMS} = \sqrt{\frac{\sum_i w_i \left(\nu^{\mathrm{clr}}_i\right)^2}{\sum_i w_i}}, \qquad (6)$$
where $\nu^{\mathrm{clr}}_i$ is the $i$th closure amplitude of a source and $w_i$ is its weight.
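Both statistics are weighted rms values of closure quantities about zero, which can be sketched as follows. The normalization by the sum of the weights is an assumption of this example, and the function name is hypothetical.

```python
import math


def weighted_rms(values, weights=None) -> float:
    """Weighted rms about zero: sqrt(sum(w * x**2) / sum(w)).

    With closure phases as input this corresponds to CPRMS, and with closure
    amplitudes to CARMS. Uniform weights are used when none are given.
    """
    if weights is None:
        weights = [1.0] * len(values)
    if not values or len(values) != len(weights):
        raise ValueError("values and weights must be non-empty and of equal length")
    total = sum(weights)
    if total <= 0.0:
        raise ValueError("the sum of the weights must be positive")
    return math.sqrt(sum(w * x * x for x, w in zip(values, weights)) / total)


if __name__ == "__main__":
    closure_phases_deg = [3.0, -4.0, 12.0, -0.5]  # illustrative values only
    print("uniform:", weighted_rms(closure_phases_deg))
    print("weighted:", weighted_rms(closure_phases_deg, [4.0, 4.0, 0.25, 1.0]))
```

Because the reference value is zero rather than the sample mean, a constant non-zero closure quantity contributes fully to the statistic.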
Different weighting schemes were applied. The first scheme, natural weighting, uses the uncertainties of closure phases and closure amplitudes as $w_i = 1/\sigma_i^2$, where $\sigma_i$ is the uncertainty of the $i$th closure phase or closure amplitude. The second scheme adds a basic noise to these uncertainties to prevent the extremely high weights caused by high $R_{\mathrm{sn}}$; a basic noise of 1.5° was applied to the closure phase and one of 0.1 to the closure amplitude. The basic noise for the phase is determined by assuming that the three observables of a triangle all have an $R_{\mathrm{sn}}$ of 60, while a much higher basic noise is used for the amplitude by assuming that the four observables of a quadrangle all have an $R_{\mathrm{sn}}$ of 20, the minimum threshold for scheduling. As we reported in the previous section, the mean value of $R_{\mathrm{sn}}$ for the entire data set is more than twice as large as the median value (112 versus 47), even though we included the observables with actual $R_{\mathrm{sn}}$ lower than 20. This gives a hint that a significant number of observables have much higher $R_{\mathrm{sn}}$. These large differences in $R_{\mathrm{sn}}$ lead to differences of several orders of magnitude in the relative weighting from uncertainties, which have no realistic meaning in the relative accuracies of closure measurements. The basic-noise weighting can down-weight those observables with high $R_{\mathrm{sn}}$ significantly by empirically adding a noise floor to the uncertainties from Equations (1) and (2). The third scheme is a uniform weighting of 1 for all closures. Uniform weighting has the advantage of taking the source-structure effects equally into account, whereas the natural weighting reduces significantly the contributions of closures with strong source-structure effects, because the uncertainties of closures are derived from $R_{\mathrm{sn}}$, which is systematically related to source-structure effects: it always decreases when there are strong structure effects.
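To make the effect of the three weighting schemes concrete, the sketch below compares the weights of two closure phases formed from observables with very different signal-to-noise ratios. It assumes that the noise floor is added in quadrature to the thermal uncertainty, which is one plausible reading of "adding a noise floor" rather than a confirmed detail, and it assumes sigma_phi = 1/R_sn (radians) and sigma_nu/nu = 1/R_sn per baseline. All names are hypothetical.

```python
import math


def natural_weight(sigma: float) -> float:
    """Natural weighting: inverse variance of the closure quantity."""
    return 1.0 / sigma**2


def basic_noise_weight(sigma: float, floor: float) -> float:
    """Basic-noise weighting; ASSUMPTION: the floor is added in quadrature."""
    return 1.0 / (sigma**2 + floor**2)


def uniform_weight(sigma: float) -> float:
    """Uniform weighting: every closure counts equally."""
    return 1.0


def closure_phase_sigma_deg(snr: float, n_baselines: int = 3) -> float:
    """Closure phase uncertainty (deg) when all baselines share the same R_sn."""
    return math.degrees(math.sqrt(n_baselines) / snr)


def log_closure_amplitude_sigma(snr: float, n_baselines: int = 4) -> float:
    """Closure amplitude uncertainty when all baselines share the same R_sn."""
    return math.sqrt(n_baselines) / snr


if __name__ == "__main__":
    print("phase floor estimate (R_sn = 60):", round(closure_phase_sigma_deg(60.0), 2), "deg")
    print("amplitude floor estimate (R_sn = 20):", log_closure_amplitude_sigma(20.0))

    strong = closure_phase_sigma_deg(2000.0)  # very high R_sn
    weak = closure_phase_sigma_deg(20.0)      # scheduling threshold
    print("natural weight ratio:", natural_weight(strong) / natural_weight(weak))
    print("basic-noise weight ratio:",
          basic_noise_weight(strong, 1.5) / basic_noise_weight(weak, 1.5))
```

Under these assumptions, four observables with an R_sn of 20 give a closure amplitude uncertainty of exactly 0.1, and three observables with an R_sn of 60 give a closure phase uncertainty of about 1.65°, of the same order as the adopted 1.5°. The natural weights of the two example closures differ by a factor of 10^4, whereas with the 1.5° floor the ratio shrinks to roughly 12.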
The statistics used for our study, CARMS and CPRMS, are based on the rms rather than on other statistics, such as the mean or median, because of the nature of structure effects. For an extended structure, structure effects in phase and amplitude are not necessarily large at all times, and structure effects on closure quantities can be reduced or even canceled out. However, structure effects from extended sources always produce large variations in closure quantities among observations at different GMST epochs. For this reason, the rms is the more suitable statistic.
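A short numerical example shows why the mean and median are poor indicators here. It uses a purely hypothetical closure phase pattern, a sine wave with a 30° peak sampled once per hour over 24 hr of GMST, and is not based on real data.

```python
import math
import statistics

PEAK_DEG = 30.0  # hypothetical peak of the closure phase pattern

# One sample per GMST hour over a full 24 hr cycle.
pattern = [PEAK_DEG * math.sin(2.0 * math.pi * hour / 24.0) for hour in range(24)]

mean_value = statistics.mean(pattern)
median_value = statistics.median(pattern)
rms_value = math.sqrt(sum(x * x for x in pattern) / len(pattern))

if __name__ == "__main__":
    print(f"mean   = {mean_value:8.3f} deg")
    print(f"median = {median_value:8.3f} deg")
    print(f"rms    = {rms_value:8.3f} deg")
```

The positive and negative halves of the wave cancel in the mean and median, while the rms recovers the peak divided by the square root of two, about 21°, and so still reflects the strength of the structure effects.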

Observational Results
Source structures are often represented as images of brightness distributions derived from phase and amplitude observables based on various imaging techniques (e.g., hybrid-mapping algorithms, see Readhead & Wilkinson 1978, Cornwell & Wilkinson 1981; and the maximum entropy method, see Narayan & Nityananda 1986, Shevgaonkar 1986). With the advent of the VLBA, snapshot observations of ICRF sources to make images with sensitivities obtained previously only with full-synthesis observations are possible. The largest effort to image ICRF sources is a series of VLBA astrometric experiments called Research and Development VLBI sessions (Fey et al. 1996; Fey & Charlot 1997, 2000) and the VLBA Calibrator Surveys, a series of six campaigns run on the VLBA from 1994 to 2007 (VCS1-VCS6; Beasley et al. 2002; Fomalont et al. 2003; Petrov et al. 2005, 2006, 2008; Kovalev et al. 2007). As a continuation, the second epoch VLBA Calibrator Survey campaign (VCS-II) was undertaken to improve the position estimates of 2400 VCS sources. For ICRF sources in the southern hemisphere, a joint program of observations with the Australia Telescope National Facility (ATNF) using geodetic stations and ATNF's Long Baseline Array was carried out in order to make images (Fey et al. 2004a, 2004b; Ojha et al. 2004). Images based on these VLBA observations are publicly available through websites.
Source structures can also be addressed simply by demonstrating their effects on phase and amplitude observables by closures, without calibrations being needed. According to their definitions, closure delay and closure phase measure the summation of source-structure effects on the three baselines of a triangle, and closure amplitude measures a ratio of structure effects on the amplitudes of four baselines of a quadrangle.
Even though closure quantities for a specific triangle or quadrangle provide significantly less knowledge of brightness distribution compared to images, they do have several advantages. First, because a change in the pattern (see its definition in the next paragraph) of structure effects in a given triangle or quadrangle necessarily indicates a change in the intrinsic structure, closure quantities have a particular application in monitoring radio sources for changes in intrinsic structure. These changes in structure can be quantified and compared by analyzing closure quantities. Second, closure quantities directly tell the magnitudes of structure effects on VLBI observables. Changes in structure effects of a radio source are caused by variations in the observing geometry, which depends on the GMST epoch of the observable for a given baseline, and changes in its intrinsic structure over time. Because the timescale of the observing geometry variation is constant (24 hr of GMST) and the timescales of structure changing are expected to be significantly larger, these two factors for structure effect changing can be distinguished by using two time systems for the observing time, the fraction of day by GMST and the date, for our closure analysis. In this paper, we refer to the change in structure effects over GMST as pattern and to the change of the pattern over date as evolution.
Structure effects have different patterns for different triangles or quadrangles. These patterns also vary both from source to source and from time to time for a given source. A radio source can have thousands of triangles or quadrangles for the global VLBI networks. Because studying the unique instrumental, source, and triangle/quadrangle dependencies of such a large number of subarrays in IVS networks is difficult, efforts have been made to identify and select just a few triangles or quadrangles out of those many for the best identification for a specific source. The triangles or quadrangles used for each source must be determined separately. However, only examining closure quantities of too few triangles or quadrangles can lead to biases in understanding source structures. Therefore, the results are organized in two ways: (1) closure quantities of several triangles or quadrangles as representatives are shown in plots together with their statistics, and (2) the statistics for all available closures for each individual source are calculated and addressed. The CARMS and CPRMS values from all available closure phases and amplitudes for each individual source in the entire observed history are labeled global in Sections 4.1 and 4.2. The CARMS and CPRMS values based on a single triangle or quadrangle are, on the other hand, shown in the closure plots and will be addressed in these two sections as well.
The closure phases will be addressed in Section 4.1 for four selected sources, and the closure amplitudes will be addressed in Section 4.2 for seven selected sources. Note that the closure phase and closure amplitude have different responses to source structure, but a significant change identified in the pattern of one type of closure quantity will appear in that of the other because of the change in intrinsic structure. Source 2229+695 will be discussed with both its closure phases and its closure amplitudes to demonstrate that effect. The global CARMS and CPRMS values will be addressed in Section 4.3. The 30 most frequently observed sources will be discussed there to explore these statistical values.
Several remarks that help in understanding the results in the next sections are summarized here. (1) For a given source and for a specific triangle or quadrangle, the patterns of closure phases and closure amplitudes within one session have in general the feature of continuity, and these patterns remain the same for observations from other sessions if the structure did not change. (2) For the same triangle, the patterns of closure quantities of different sources have no correlation at all, due to differences in their intrinsic structures. (3) Those patterns depend on the changes of the uv coordinates of the baselines (the baseline vectors projected onto the sky plane) involved in the closure quantities. Therefore, the patterns in general should have a wave shape. (4) The GMST ranges over which radio sources are visible differ from source to source; for instance, sources with high declination can have closures at all 24 hr of GMST while sources close to the equator tend to have closures only in limited GMST ranges. (5) Both larger magnitudes of the peaks and more rapid changes with respect to GMST in those patterns indicate a more extended structure. (6) Whenever the pattern of closure quantities, for a specific triangle/quadrangle and a given source, evolves, a change in the intrinsic structure has happened. (7) In general, ICRF sources have a core-jet morphology. Variations in the intrinsic structure result from changes in the brightness of the core or jet components, the ejection of a new jet component, and the motion of jet components along the jet, which cause evolution of the closure patterns.
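Remarks (3), (5), and (7) can be illustrated with the simplest core-jet model: a unit-flux core at the phase center plus one point-like jet component. The sketch below evaluates the closure phase of such a model on a triangle of baselines. The visibility sign convention, the chosen uv coordinates, and the function names are assumptions made for illustration and do not describe any particular source.

```python
import cmath
import math

MAS_TO_RAD = math.pi / (180.0 * 3600.0 * 1000.0)


def two_component_visibility(u: float, v: float, flux_ratio: float,
                             dx_mas: float, dy_mas: float) -> complex:
    """Visibility of a unit core plus a jet component of relative flux `flux_ratio`.

    (u, v) are in wavelengths and (dx_mas, dy_mas) is the jet offset from the core.
    """
    arg = -2.0 * math.pi * (u * dx_mas + v * dy_mas) * MAS_TO_RAD
    return 1.0 + flux_ratio * cmath.exp(1j * arg)


def model_closure_phase_deg(uv_ab, uv_bc, flux_ratio: float,
                            dx_mas: float, dy_mas: float) -> float:
    """Closure phase (deg) on triangle a-b-c; the third baseline closes the triangle."""
    uv_ca = (-(uv_ab[0] + uv_bc[0]), -(uv_ab[1] + uv_bc[1]))
    bispectrum = 1.0 + 0.0j
    for u, v in (uv_ab, uv_bc, uv_ca):
        bispectrum *= two_component_visibility(u, v, flux_ratio, dx_mas, dy_mas)
    return math.degrees(cmath.phase(bispectrum))


if __name__ == "__main__":
    # Hypothetical geometry: a jet 1 mas from the core along x, and two baselines
    # whose u coordinates each give a 60 deg phase rotation for that offset.
    u = (1.0 / 6.0) / MAS_TO_RAD
    for ratio in (0.0, 0.1, 0.3):
        cp = model_closure_phase_deg((u, 0.0), (u, 0.0), ratio, 1.0, 0.0)
        print(f"jet/core flux ratio {ratio:.1f}: closure phase {cp:6.2f} deg")
```

For this geometry the closure phase is zero for a point source and grows in magnitude as the jet-to-core flux ratio increases, which is the kind of behavior invoked when peak magnitudes of a stable pattern change over time. As the uv coordinates rotate with GMST, the same model produces the wave-shaped patterns described in remark (3).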

Closure Phases
Source 1357+769, which shows only thermal noise in the patterns of closure phases for all available triangles, will be introduced first. It is also used to demonstrate the station performances for several antennas in the geodetic VLBI networks. Three other sources are chosen as representatives of sources with structure to be discussed in detail, in order to present a sample of sources with different amounts of structure effects, different variabilities in intrinsic structures, and different categories in ICRF2 and ICRF3. In terms of categorization, ICRF2 has three categories: defining sources, which define the reference frame; special-handling sources, which were claimed to have the largest position variations over time; and other sources. ICRF3 only maintains the defining category and treats all the remaining sources as other sources.

Source 1357+769
Source 1357+769 is close to the north pole and is visible for most of the northern stations at all times. Based on the three weighting schemes, its global CPRMS values are 9.0°, 10.4°, and 13.9° over 339,062 available closure phases. The upper two plots in Figure 1 show the closure phases of two triangles, GILCREEK-WESTFORD-WETTZELL and TSUKUB32-WESTFORD-WETTZELL. The baselines between stations TSUKUB32, WESTFORD, and WETTZELL are among the longest baselines of the global VLBI networks and were frequently scheduled in 24 hr sessions and intensives. The two plots suggest that 1357+769 is relatively compact. The closure phases for a smaller triangle, MEDICINA-SVETLOE-WETTZELL, are shown in the bottom of Figure 1. The CPRMS for this small triangle is 5.5°, significantly lower than the values of 9.2° and 7.4° for the larger triangles, which suggests that the source may be slightly resolved on those rather long baselines. The closure phase patterns for source 1357+769, which are flat around 0° with small CPRMS values, less than 10°, for individual triangles, can be used as a reference for identifying a source with a compact structure.

Source 0133+476
In total, 0133+476 has 281,201 closure phases with global CPRMS values of 8°.7, 11°.1, and 13°.4, based on the three weighting schemes. The entire data set of its closure phases of the triangle NYALES20-WESTFORD-WETTZELL is shown in eight separate subplots of Figure 2 to better show the evolution of structure effects over time. These three stations were frequently observing 0133+476 for 15 yr from 1999 to 2014. From its global CPRMS, it is inferred that source 0133+476 was relatively compact or only slightly resolved; however, the closure phases still show clear patterns of structure effects. For instance, during the period of 2005-06 to 2006-11 as shown in plot c of Figure 2, the wave shape with a magnitude of 30° was repeated in 55 sessions with 444 closure points. Structure effects became weaker during the period from 2007 to 2009, and gradually increased again through 2010-2012. Finally, 0133+476 was extremely quiet. By inspecting the plots, several remarks can be made: (1) structure effects of source 0133+476 on this large triangle have seven stable patterns during 2001-10 to 2014-08, (2) the magnitudes of the peaks in those patterns were changing from time to time, and (3) the GMST epochs of those peaks were rather stable. The most likely explanation is that the flux density ratio of its jet to the core changes from time to time, which causes the decreases and the increases of the peak magnitudes shown in the figure. For instance, if that ratio increases, the peak magnitude of closure phases becomes larger and vice versa.
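The dependence of the closure-phase peak on the jet-to-core flux density ratio can be illustrated with a toy model. The sketch below is not the analysis used in this paper; it assumes a unit-flux core at the phase center plus a single point-like jet component, and the function names and the phase arguments (2π times the projected baseline dotted with the jet offset, in radians) are hypothetical. It also shows that station-based phase errors cancel in the closure sum.

```python
import numpy as np

def two_component_visibility(flux_ratio, phase_arg):
    """Visibility of a unit core at the origin plus a point-like jet.

    flux_ratio : jet flux density divided by core flux density (assumed model)
    phase_arg  : 2*pi*(u.x) in radians for this baseline (hypothetical input)
    """
    return 1.0 + flux_ratio * np.exp(-1j * phase_arg)

def closure_phase_deg(flux_ratio, arg_ab, arg_bc, station_phase_err=(0.0, 0.0, 0.0)):
    """Closure phase phi_ab + phi_bc - phi_ac of the triangle a-b-c, in degrees."""
    arg_ac = arg_ab + arg_bc  # the baseline vectors of a triangle close
    ea, eb, ec = station_phase_err
    # Station-based phase errors enter each baseline as a difference.
    v_ab = two_component_visibility(flux_ratio, arg_ab) * np.exp(1j * (ea - eb))
    v_bc = two_component_visibility(flux_ratio, arg_bc) * np.exp(1j * (eb - ec))
    v_ac = two_component_visibility(flux_ratio, arg_ac) * np.exp(1j * (ea - ec))
    bispectrum = v_ab * v_bc * np.conj(v_ac)
    return float(np.degrees(np.angle(bispectrum)))

if __name__ == "__main__":
    for ratio in (0.0, 0.1, 0.3):
        print(ratio, closure_phase_deg(ratio, 1.0, 1.5))
```

In this toy model, a point source (ratio 0) gives zero closure phase, and for a fixed geometry the magnitude grows as the jet brightens relative to the core, which is the behavior invoked above to explain the changing peak magnitudes at stable GMST epochs.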

Source 0552+398
In total, 0552+398 has 447,292 closure phases with global CPRMS values of 10°.1, 15°.5, and 19°.0 based on the three weighting schemes. Its closure phases of two triangles GILCREEK-NYALES20-WETTZELL and FORTLEZA-NYALES20-WETTZELL are shown in Figure 3. The closure phase patterns of both triangles are relatively stable but with an increasing peak magnitude over more than 20 yr. During 10 yr of 332 sessions from 1994 to 2005, the structure effects on the triangle GILCREEK-NYALES20-WETTZELL gradually increased with peaks in closure phase from 30° to 50° as shown in the upper plot of Figure 3. A slow increase after 2005 is also visible in the lower plot of Figure 3. 0552+398 is an example of a source with an extended but relatively stable structure over a long timescale.

Source 2229+695
In total, source 2229+695 has 125,608 closure phases with global CPRMS values of 14°.2, 17°.9, and 26°.5 based on the three weighting schemes. The closure phases of two triangles, KOKEE-NYALES20-WETTZELL and KOKEE-TSUKUB32-WETTZELL, are displayed in Figure 4. These two plots demonstrate that 2229+695 was already resolved in 2008, and its structure continuously and significantly increased after that. Finally, in early 2018, it has structure effects on the closure phases of the first triangle with magnitudes larger than π radians.

Closure Amplitude
The study of amplitude observables for geodetic VLBI is as important as that of phase observables, from which group delay observables are derived. Amplitude observables are sensitive to source structure that causes structure delays, and the measurement noises in delay observables are directly correlated with observed amplitudes. For instance, if the observed amplitude is only 10% of the flux density used for scheduling, due to resolved structure, the R sn of actual measurements will be one order of magnitude smaller than the R sn expected from the schedule. The contribution of thermal noise to the measurement uncertainties will significantly increase, sometimes even leading to a failure in detection.
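The effect of a tenfold drop in R sn on the delay noise can be put in numbers with the commonly used thermal-noise approximation for a group delay, σ = 1/(2π · R sn · B_rms), where B_rms is the rms spanned bandwidth. The following is a minimal sketch under that assumption; the function name and the 280 MHz bandwidth are hypothetical illustration values, not quantities taken from this paper.

```python
import math

def group_delay_thermal_noise_ps(snr, rms_bandwidth_hz):
    """Thermal-noise group delay uncertainty in picoseconds.

    Uses the standard approximation sigma = 1 / (2*pi*SNR*B_rms).
    """
    return 1e12 / (2.0 * math.pi * snr * rms_bandwidth_hz)

if __name__ == "__main__":
    b_rms = 280e6  # hypothetical rms spanned bandwidth in Hz (assumption)
    scheduled = group_delay_thermal_noise_ps(200.0, b_rms)
    observed = group_delay_thermal_noise_ps(20.0, b_rms)  # amplitude at 10%
    print(f"scheduled: {scheduled:.2f} ps, observed: {observed:.2f} ps")
```

Because the uncertainty scales inversely with R sn, an observed amplitude at 10% of the scheduled flux density inflates the thermal-noise term by a factor of 10, whatever bandwidth is assumed.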
Because a quadrangle involves four stations, one more than a triangle, and is sensitive to the station orders, an individual source generally has many more different quadrangles than triangles. However, compared to the closure phase, the closure amplitude is much more vulnerable to the loss of individual baseline observables in correlation, and because the list of stations participating in IVS geodetic sessions frequently changes from session to session, specific station quadrangles are formed less frequently than specific station triangles. Thus, for a given source, we get many more closure amplitudes than closure phases, but for each individual triangle or quadrangle, the number of closure amplitudes is much less than that of closure phases.
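The counting argument can be sketched as follows. The sketch assumes that each set of four stations yields three distinct closure-amplitude ratios (the three ways of pairing the six baselines into numerator and denominator, counted up to inversion); the exact counting convention behind the numbers reported in this paper may differ, and the function name is hypothetical.

```python
from math import comb

def count_closures(n_stations):
    """Return (triangles, quadrangles) that can be formed from n_stations.

    Triangles: one per unordered station triple.
    Quadrangles: three distinct baseline orderings per station quadruple
    (assumed convention, counted up to inversion of the ratio).
    """
    triangles = comb(n_stations, 3)
    quadrangles = 3 * comb(n_stations, 4)
    return triangles, quadrangles

if __name__ == "__main__":
    for n in (4, 6, 8, 10):
        print(n, count_closures(n))
```

Under this convention the number of quadrangles exceeds the number of triangles for networks of five or more stations, and the gap widens quickly with network size.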
Source 0454−810, which is located close to the south pole, will be discussed first to demonstrate both the behavior of a compact source and the performances of several typical southern geodetic stations. Six other sources are addressed as examples to investigate their structure effects and the variabilities in intrinsic structure.

Source 0454−810
In total, source 0454−810 has 14,551 closure amplitudes with global CARMS values of 0.13, 0.14, and 0.15 based on the three weighting schemes. Closure amplitudes of two quadrangles, HART15M-HOBART12-KATH12M-YARRA12M and HART15M-KATH12M-YARRA12M-HOBART12, are displayed in Figure 5. The radio source has been heavily observed by southern stations since 2013. During its observing history, it showed minimal structure as demonstrated by the plots in Figure 5 and the global CARMS values. This case can be considered as an investigation of the performance of these small antennas in the southern hemisphere concerning amplitude observables. In the case of 0454−810, we should notice that there is no significant difference in the CARMS values from the three weighting schemes.
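Why a compact source gives closure amplitudes near zero regardless of antenna performance can be sketched with a logarithmic closure amplitude. The baseline ordering used here, ln[(A_ab · A_cd)/(A_ac · A_bd)], is an assumption for illustration and may not match the sequence defined in Equation (3); the station gain factors are made-up values.

```python
import math

def log_closure_amplitude(a_ab, a_cd, a_ac, a_bd):
    """Natural log of the closure amplitude for stations a, b, c, d.

    Assumed ordering: (A_ab * A_cd) / (A_ac * A_bd).
    """
    return math.log((a_ab * a_cd) / (a_ac * a_bd))

if __name__ == "__main__":
    # Hypothetical station gain errors; each baseline amplitude is scaled
    # by the product of the gains of its two stations.
    gain = {"a": 0.8, "b": 1.3, "c": 1.1, "d": 0.6}
    true_amp = 1.0  # point source: same amplitude on every baseline

    def observed(s1, s2):
        return gain[s1] * gain[s2] * true_amp

    value = log_closure_amplitude(
        observed("a", "b"), observed("c", "d"),
        observed("a", "c"), observed("b", "d"),
    )
    print(value)  # the gains cancel, leaving only source structure
```

Each station gain appears once in the numerator and once in the denominator, so it cancels; what remains reflects source structure and thermal noise only.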

Source 0016+731
In total, source 0016+731 has 619,226 closure amplitudes with global CARMS values of 0.19, 0.29, and 0.34 based on the three weighting schemes. Closure amplitudes of three quadrangles, GILCREEK-KOKEE-WETTZELL-NRAO85_3, KOKEE-TSUKUB32-WETTZELL-YEBES40M, and ISHIOKA-KOKEE-YEBES40M-WETTZELL, are displayed in Figure 6. Its strongly resolved structure before 2000, which is identified by phase observables shown in the supplemental information at [doi:10.11570/19.0007], can be partly confirmed by amplitude observables in plot a. Note that the two quadrangles KOKEE-TSUKUB32-WETTZELL-YEBES40M and ISHIOKA-KOKEE-YEBES40M-WETTZELL have nearly the same geometry and have the same baseline sequence in Equation (3) because ISHIOKA and TSUKUB32 are two stations at one site. Therefore, their closure amplitudes can be combined together and compared directly. We thus can learn from plots b and c that the structure effects in amplitude decreased from 2012 to 2014 and significantly increased in 2017. The magnitude of the peak in 2018 increases to more than 2. The likely reason for these evolutions in structure-effect patterns is that the brightness of its core or of its jet is very variable on timescales of several months.
Compared to the global CARMS values of 0014+813, the global CARMS values of 0016+731 are relatively small. This indicates that 0016+731 has less structure than 0014+813, which can be equally addressed by comparing plots b and c in Figure 6 to the closure plots of 0014+813 for the same quadrangle in the supplemental information available at [doi:10.11570/19.0007]. The structure of 0016+731 needs to be studied carefully for three reasons. First, its structure is changing at short timescales. Second, it had very extended structure in the past. Third, it develops to have extended structure again in 2018. These evolutions should cause changes in its estimated positions from geodetic VLBI.

Figure 1 (caption fragment). [...] shown to guide the reading of the variation magnitudes of closure phases. These three plots demonstrate well the minimal structure of source 1357+769 over 20 yr. In particular, plot c implies the good case of the thermal noise level in geodetic baseline phase observables, which is about 3°.1.

Source 0059+581
In total, source 0059+581 has 2,909,197 closure amplitudes with global CARMS values of 0.18, 0.26, and 0.29 based on the three weighting schemes. Closure amplitudes of two quadrangles, NYALES20-TSUKUB32-WETTZELL-WESTFORD and KATH12M-KOKEE-NYALES20-WETTZELL, are displayed in Figure 7. Plot a demonstrates that its structure is changing from year to year, indicated, for instance, by the closure amplitudes for observations made in 2004 shown by green points, 2007 by yellow points, and 2014 by red points. Much larger structure effects in amplitude observables were found in the second quadrangle KATH12M-KOKEE-NYALES20-WETTZELL, which involves a very long south to north baseline, KATH12M-NYALES20. The structure-effect evolutions at the timescale of 1 yr are clearly visible for 0059+581.

Source 0642+449
In total, source 0642+449 has 1,422,456 closure amplitudes with global CARMS values of 0.31, 0.46, and 0.53 based on the three weighting schemes. Closure amplitudes of two quadrangles, KOKEE-NYALES20-TSUKUB32-WETTZELL and KOKEE-NYALES20-TSUKUB32-WESTFORD, are displayed in Figure 8. As indicated by the evolution of structure effects shown in these two plots, the structure of 0642+449 has been extended since 1999, increased in 2006, and changed several times around 2008. The structure then increased significantly in 2014 and again changed after that. According to our imaging results based on the CONT14 observations, it had an extended structure along the R. A. direction. The major jet component was located about 0.5 mas away from the core with flux density almost equal to the core.
Le Bail et al. (2014) studied in detail the time series of 0642+449 throughout its observing history. They found that the time series of its position had a flicker noise, and that significant jumps in its R.A. coordinates happened in 2005-2008 and 2010-2014. Plots a and b in Figure 8 provide astrophysical explanations for those jumps, that is, the evolution of structure.

Figure 2. Plots of closure phases at the X band for source 0133+476 as a function of GMST for triangle NYALES20-WESTFORD-WETTZELL. The 15 yr history of closure phase measurements for this triangle is divided into eight segments according to the different well-identified structure-effect patterns. They are therefore shown in eight separate plots in order to keep the patterns with small magnitudes from overlapping with each other. See Figure 1 for a description of the plot design except the color coding, which is not used in this figure. The time evolution of the structure effects for this triangle can be easily seen, which suggests changes in the source structure of 0133+476 although it was relatively compact. As we can see from these plots, the patterns changed in terms of the magnitudes of the peaks but not the GMST epochs of the peaks in these eight plots. The peak magnitudes changed, both decreasing and increasing, which suggests that the evolution of the closure patterns is caused by the changes in the flux density ratio of its jet to the core. The scatter of the closure phases along the pattern in the time period of 2010-12 to 2013-02, shown in plot g, is significantly larger than in other periods of time. One can infer that its core was more strongly resolved in 2010-12 to 2013-02.
Source 0642+449 was one of the ICRF2 defining sources, which were selected based on observations before 2009, and has been excluded from that category in the recently released ICRF3. It demonstrates that a compact source can evolve to have an extended structure in a time frame of a few years and then keep that extended structure afterwards. Closure analysis is able to give an intermediate, reliable warning for this kind of structure evolution.

Source 0851+202 (OJ 287)
In total, source 0851+202 (OJ 287) has 1,291,087 closure amplitudes with global CARMS values of 0.25, 0.41, and 0.46 based on the three weighting schemes. Closure amplitudes of two quadrangles, KATH12M-NYALES20-YARRA12M-WETTZELL and KATH12M-WETTZELL-YARRA12M-MATERA, are displayed in Figure 9. Plots a and b both demonstrate that during the last 6 yr, the structure effects on closure amplitudes for these two quadrangles evolved in magnitude from 0.5 to 4 twice and from 0.5 to 2 several times. The likely reason is that this equatorial source has an extended structure along the decl. axis, and the flux density of its jet component or the core is very variable, on timescales of months. For cases like OJ 287, the highly variable structure will make correcting structure effects from geodetic VLBI observables both critical and challenging.

Source 1807+698 (3C 371)
In total, source 1807+698 (3C 371) has 578,885 closure amplitudes with global CARMS values of 0.54, 0.65, and 0.68 based on the three weighting schemes. Closure amplitudes of two quadrangles, NYALES20-TSUKUB32-WETTZELL-WESTFORD and KOKEE-NYALES20-TSUKUB32-WETTZELL, are displayed in Figure 10. Plot a shows rapid changes in the structure-effect pattern with respect to GMST, and plot b shows that the peak magnitude is up to 2. Both of them, however, demonstrate that its structure is reasonably stable over a long term. All other quadrangles of 1807+698 also show stable structure-effect patterns. This actually is one of the very few cases of ICRF3 sources, along with 0552+398, that have extended structure but remain stable over the entire history of observations.

Figure 3. Plots of closure phases at the X band for source 0552+398 as a function of GMST for two triangles, (a) GILCREEK-NYALES20-WETTZELL and (b) FORTLEZA-NYALES20-WETTZELL. See Figure 1 for a description of the plot design. The patterns were stable and were only slightly evolved over more than 20 yr in 332 sessions for the first triangle and in 463 sessions for the second.

Source 2229+695
In total, source 2229+695 has 486,119 closure amplitudes with global CARMS values of 0.33, 0.43, and 0.49 based on the three weighting schemes. Closure amplitudes of two quadrangles, KOKEE-NYALES20-WETTZELL-TSUKUB32 and BADARY-KOKEE-YEBES40M-WETTZELL, are displayed in Figure 11. Plot a demonstrates well the evolution of its structure during 2008-2016, while plot b demonstrates that it is a very extended source recently. Although we only present plots of both closure phases and closure amplitudes for sources 2229+695 and 0607−157 (in the next section) in the main body of this paper to show the reader an example of the crosscheck between these two kinds of closure quantities, the extended structure detected by one type of closure is clearly visible for the other closure type for virtually all sources.

Statistics of Closure Quantities
General information about the source structure is obtained by calculating the statistics CARMS and CPRMS over all available closure quantities for each individual source. Some sources have already been reported in the previous two sections, and a full list of 3417 ICRF3 sources that have at least a certain number of closures (the threshold of that number will be discussed in the next section) is available in the machine-readable version of Table 2. These statistics are then used to classify these sources as "good" or "bad" in terms of the astrometric quality.

The 30 sources listed in Table 2, in order of decreasing number of observations used for the creation of the ICRF3, are discussed here in detail by exploring their statistics. This is important and interesting because the observations of these 30 sources make up about 42% of the entire data set of S/X geodetic VLBI observations. It allows the estimation of how much structure effects have been embedded in VLBI observations and the exploration of how useful these statistics can be in terms of comparing the overall magnitudes of individual source CARMS and CPRMS values to other sources.
Source 0059+581, having the most observables in geodetic VLBI, has a CARMS value of 0.27 and a CPRMS value of 10°.8 based on the basic-noise weighting. It is resolved and undergoes changes in intrinsic structure as demonstrated in Figure 7. Its statistics, nevertheless, imply that over its observing history, the impacts of its structure effects on geodetic VLBI observations are limited. Its variable structure needs to be monitored in order to guarantee its role as a fiducial mark of the celestial reference frame.
Out of the 30 most frequently observed sources, two sources, 1357+769 and 0727−115, have minimal structure during the several decades covered by geodetic VLBI observations. This is demonstrated by the fact that 1357+769 has the smallest CARMS values and 0727−115 has the smallest CPRMS values. No significant structure effects have been found based on closure phases or closure amplitudes from any available triangles or quadrangles for these two sources; the plots show a pattern of thermal noise or minimal structure as illustrated for 1357+769 by Figure 1. They are considered to be compact.

Figure 6 (caption fragment). See Figure 5 for a description of the plot design. The structure-effect patterns in plots b and c can simply be combined because ISHIOKA and TSUKUB32 are two stations at the same site. The evolution of its structure effects is seen from the three plots.
Seven sources, 0133+476, 0955+476, 1749+096, 0454−234, 1606+106, 1300+580, and 0804+499, have the same level of source structure as 0059+581. They all have CARMS values smaller than that of 0059+581 but larger CPRMS values, which potentially means that their cores are more compact than that of 0059+581 but with significant jet components. The changes in the intrinsic structure of 0133+476 have been demonstrated by a series of plots in Figure 2. The closure plots demonstrating slightly resolved structures for the other five sources are shown in the supplemental information of this paper at [doi:10.11570/19.0007].
These are the only 10 sources with CARMS values smaller than or equal to 0.30 out of the 30 most frequently observed sources. Obvious and variable structures are still identified for several sources among them, for instance, 0955+476 and 0059+581. These 10 sources were selected as defining sources both for ICRF2 and for ICRF3, which is reasonable because sources with minimal or limited structure effects in turn yield stable positions estimated from VLBI.
Three sources, 1803+784, 0552+398, and 1807+698 (3C 371), are discovered to have more extended but stable structure over the last decades. 1803+784 is resolved and its structure effects yield constant patterns. The structure of 0552+398 has increased slowly during the last 20 yr as shown in Figure 3. Of these three extended but stable sources, the structure effects of 3C 371, as shown in Figure 10, are the largest and change the most rapidly. 0552+398 apparently has larger structure effects in amplitude than 1803+784, which suggests that 0552+398 has a stronger jet component relative to its core. Structure effects of 1803+784 change more rapidly with respect to GMST than those of 0552+398, which therefore suggests that the jet of 1803+784 is farther away from its core. 3C 371 has never been a defining source. 1803+784 was excluded from that category in ICRF3, and 0552+398 remains a defining source in ICRF3. Source 0552+398, despite such high CARMS values, was likely classified as a defining source because of its stable structure and the need for a uniform geometric distribution of defining sources.

Figure 7. Plots of closure amplitudes at the X band for source 0059+581 as a function of GMST for two quadrangles, (a) NYALES20-TSUKUB32-WETTZELL-WESTFORD and (b) KATH12M-KOKEE-NYALES20-WETTZELL. See Figure 5 for a description of the plot design. The structure effects shown in the two plots reveal that 0059+581 was temporarily extended, for instance, in 2007, 2013, and 2015.

Six sources, 1741−038, 1739+522, 1638+398 (NRAO512), 0528+134, 0016+731, and 1334−127, have CARMS values ranging from 0.3 to 0.4 based on basic-noise weighting. As an example of these, 0016+731 is discussed in Section 4.2.2 to demonstrate its variable structure. The evolution in the structures of the other five sources is addressed in the supplemental information. Of these six sources, 1741−038, 1739+522, and 0528+134 are ICRF2 special handling sources primarily due to significant variations in the time series of their estimated positions. 1638+398 was excluded from the defining category in ICRF3, while 0016+731 and 1334−127 are still defining sources.
The remaining 11 sources have very large structure effects, and their structures are highly variable according to our study. All of their CARMS values based on basic-noise weighting and uniform weighting are larger than or equal to 0.40. Some of them have been discussed in detail in the previous two sections. For example, the structure evolution of 0642+449 is discussed in Section 4.2.4 and 2229+695 in Sections 4.1.4 and 4.2.7. Of these 11 sources, there are some with structures varying significantly on timescales of months, such as 0851+202 (OJ 287) and 0923+392 (4C 39.25), some with structures varying from year to year, such as 1611+343, 1044+719, and 1308+326, and some which were developing to have a very extended structure in recent years, such as 2037+511 (3C 418), 0642+449, 0229+131, and 2229+695. There are five sources already classified as special handling sources in ICRF2, 0923+392, 1611+343, 1044+719, 1308+326, and 2145+067.

Figure 8. Plots of closure amplitudes at the X band for source 0642+449 as a function of GMST for two quadrangles, (a) KOKEE-NYALES20-TSUKUB32-WETTZELL and (b) KOKEE-NYALES20-TSUKUB32-WESTFORD. See Figure 5 for a description of the plot design. Its structure has dramatically evolved several times. The largest magnitude of closure amplitude in plot a is −4, so that the ratio of the observed amplitudes of the four baselines was less than e^−4 (∼0.02). The closure amplitude patterns change very rapidly with respect to GMST in plot b. A very extended structure is detected.
Two sources, 0851+202 and 0642+449, were excluded from the group of defining sources in ICRF3, but 2229+695 is still one of the ICRF3 defining sources.
Another example regarding the categorization of radio sources in a reverse way is 0607−157, which was one of the ICRF2 special handling sources but was selected as one of the defining sources for ICRF3. It has been heavily observed by southern stations since 2013. In total, source 0607−157 has 156,121 closure amplitudes with CARMS values of 0.23, 0.51, and 0.57 based on the three weighting schemes. Its closure phases of the triangle HOBART12-WARK12M-YARRA12M and closure amplitudes of the quadrangle HOBART12-KATH12M-WARK12M-YARRA12M are displayed in Figure 12. Plot a demonstrates that the structure of this equatorial source was evolving during the time period from 2013 to 2017. This is confirmed by the amplitude observables shown in plot b. 0607−157 has a very extended and variable structure. There are three other ICRF2 special handling sources, 0235+164, 0208−512, and 1448+762, that are selected for ICRF3 to be defining. The adequacy of being defining is questioned by this study for 0208−512, 0607−157, and 1448+762, which have CARMS values of 0.47, 0.51, and 0.66, respectively, using basic-noise weighting. The fourth source, 0235+164, with a CARMS value of 0.18 based on basic-noise weighting, can be confirmed to have minimal structure and thus should be classified as defining. These cases prove the importance of information that is independent of closure phases and closure amplitudes.

Figure 9 (caption fragment). See Figure 5 for a description of the plot design. The magnitudes of the structure effect on closure amplitudes evolved within 6 yr from 0.5 to 4 two times and from 0.5 to 2 several times. The timescale of the evolution is less than 1 yr. Very extended structure is detected.
From the study of these 30 most frequently observed sources, we have learned that the statistics of closure quantities are highly correlated with the qualities of sources' astrometric behaviors. This is expected because the CARMS and CPRMS values tell the overall magnitudes of source-structure effects on VLBI observations for the creation of ICRF3. They are independently derived from different types of observables, that is, phase and amplitude, but they still provide information about the amount of structure effects on delay observables.
It is not surprising that 20 out of these 30 sources show extended structures during the time period covered by geodetic VLBI observations. Those structures have contributed a significant amount of errors to geodetic VLBI data analysis. Among these 20 extended sources, most are variable in intrinsic structure, and thus, their apparent positions change on the sky.

Figure 10. Plots of closure amplitudes at the X band for source 1807+698 (3C 371) as a function of GMST for two quadrangles, (a) NYALES20-TSUKUB32-WETTZELL-WESTFORD and (b) KOKEE-NYALES20-TSUKUB32-WETTZELL. See Figure 5 for a description of the plot design. The structure-effect pattern of the first quadrangle repeated for 1454 closure measurements in 132 sessions over 11 yr, with only minor differences. A stable pattern is also shown in plot b. These two plots demonstrate that its structure remains constant over a long time.
Figure 11. Plots of closure amplitudes at the X band for source 2229+695 as a function of GMST for two quadrangles, (a) KOKEE-NYALES20-WETTZELL-TSUKUB32 and (b) BADARY-KOKEE-YEBES40M-WETTZELL. See Figure 5 for a description of the plot design. The magnitudes of the patterns in both plots increased, and the changes of the patterns with respect to GMST became more and more rapid.

The CARMS value of 0.3 based on basic-noise weighting and uniform weighting can be referred to as the threshold for differentiating between sources being reasonably compact or extended. The sources with CARMS values smaller than 0.3 are considered to be reasonably compact or have limited structure effects. CARMS values larger than 0.4, however, indicate that the sources are extremely extended. To determine how many closure quantities must be measured in order to get a reliable determination of CARMS and CPRMS, we will examine a few sources as test cases. There are two ICRF3 defining sources, 1619−680 and 1925−610, that have large CARMS and CPRMS values but only hundreds of closure quantities. In total, source 1619−680 has 1225 closure amplitudes with CARMS values of 0.78, 0.81, and 0.84, and 713 closure phases with CPRMS values of 50°.9, 55°.4, and 61°.9, using the three weighting schemes. Closure amplitudes of the quadrangle HOBART12-KATH12M-YARRA12M-WARK12M are displayed in plot a of Figure 13. Even though the number of closure amplitudes available for this specific quadrangle is only 53, the structure-effect pattern in closure amplitude can be easily seen in this plot. There is only one point available from 2015 and three points from 2016; however, they all agree with the clearer pattern determined from observations in 2013 and 2014. This also shows that closure quantities can be used to test for evolution in intrinsic source structure using only a few observations provided that a clear pattern in structure effects has been detected in advance. The closure amplitudes of 1925−610 for the quadrangle HARTRAO-WARK12M-YARRA12M-HOBART26 are shown in plot b of Figure 13, where its structure effects are convincingly demonstrated. In total, 1925−610 has only 622 closure amplitudes with CARMS values of 0.93, 0.99, and 1.07, and 316 closure phases with CPRMS values of 39°.3, 40°.4, and 50°.3, based on the three weighting schemes. Both the CARMS and CPRMS values, as a cross-check for each other, are extremely high for these two sources. For radio sources with extended structures that have only a few closures, these statistics still work well.
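The CARMS thresholds of 0.3 and 0.4 can be turned into a simple screening rule. The sketch below is only an illustration of that rule: the function name, the category labels, and the treatment of values exactly on a boundary are choices made here, and the default minimum of 100 closure quantities is the value adopted later in this paper.

```python
def classify_by_carms(carms, n_closures, min_closures=100):
    """Label a source from its basic-noise-weighted CARMS value.

    carms        : closure amplitude rms of the source
    n_closures   : number of closure amplitudes behind that value
    min_closures : below this count the statistic is treated as unreliable
    """
    if n_closures < min_closures:
        return "insufficient data"
    if carms > 0.4:
        return "very extended"
    if carms > 0.3:
        return "significant structure"
    return "reasonably compact"

if __name__ == "__main__":
    # (CARMS with basic-noise weighting, number of closure amplitudes)
    examples = {
        "0454-810": (0.14, 14551),
        "0642+449": (0.46, 1422456),
        "1619-680": (0.81, 1225),
    }
    for name, (carms, n) in examples.items():
        print(name, classify_by_carms(carms, n))
```

With the values quoted in the text, 0454−810 falls in the compact class, while 0642+449 and 1619−680 are flagged as very extended even though the latter has only about a thousand closure amplitudes.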

Indicator for Astrometric Use
The impact of the three different weighting schemes on the statistics can be examined in Figure 14. As shown in plot a, the basic-noise weighting generally gives larger CARMS values than natural weighting, by up to 60%. As we already discussed in Section 3, closure uncertainties do not always give realistic relative weighting, for instance, a measurement with an R sn of 200 is not 100 times more accurate than that with an R sn of 20, and closures with large structure effects often get down-weighted using natural weighting. A natural weighting scheme significantly underestimates the structure effects on geodetic VLBI observables. Plot b shows that uniform weighting gives a slightly larger CARMS value than basic-noise weighting, but they agree well with each other. Similar behaviors of these three weighting schemes are found for CPRMS values. Because the basic-noise weighting takes more realistic measurement noises into account and reduces the impacts of unrealistic relative weighting based on R sn, basic-noise weighting is superior to the other two weighting schemes presented in this article, and we use it for our further analysis.
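How the three weighting schemes can pull the rms apart is illustrated by the sketch below. It is not a reproduction of Equations (7) and (8): natural weighting is taken as 1/σ², basic-noise weighting is approximated here by adding a hypothetical noise floor in quadrature, 1/(σ² + σ0²), and the rms is taken about zero, the value expected for a compact source. The data values and the floor of 0.1 are made up for illustration.

```python
import numpy as np

def weighted_rms(values, weights):
    """Weighted rms about zero: sqrt(sum(w * x^2) / sum(w))."""
    values = np.asarray(values, dtype=float)
    weights = np.asarray(weights, dtype=float)
    return float(np.sqrt(np.sum(weights * values**2) / np.sum(weights)))

def rms_three_schemes(values, sigmas, noise_floor=0.1):
    """Return (natural, basic_noise, uniform) rms values.

    noise_floor is a hypothetical stand-in for the basic-noise term.
    """
    sigmas = np.asarray(sigmas, dtype=float)
    natural = weighted_rms(values, 1.0 / sigmas**2)
    basic_noise = weighted_rms(values, 1.0 / (sigmas**2 + noise_floor**2))
    uniform = weighted_rms(values, np.ones_like(sigmas))
    return natural, basic_noise, uniform

if __name__ == "__main__":
    # Closures with large structure effects tend to have low amplitudes,
    # hence low R sn and large formal uncertainties.
    closures = [0.05, -0.04, 0.06, 0.9, -1.1]
    sigmas = [0.01, 0.01, 0.01, 0.2, 0.25]
    print(rms_three_schemes(closures, sigmas))
```

In this made-up example the natural-weighting rms is dominated by the small, high-R sn closures and comes out far below the other two, which mirrors the underestimation of structure effects described above.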
Table 2 note. Sources are listed in the order of the total number of their observables for the creation of the ICRF3. Columns 2, 3, and 4 are CARMS based on natural weighting (Equation (7)), basic-noise weighting (Equation (8)) and uniform weighting, respectively, while columns 5, 6, and 7 are CPRMS values. Columns 11 and 12 are the source categories in ICRF2 and ICRF3 with "D" indicating defining sources, "S" for special handling sources, and "O" for others. This table is available in its entirety in machine-readable format for 3417 ICRF3 sources; only 30 sources are sorted out and listed here.

Figure 12 (caption fragment). For plot a, see Figure 1 for a description of the plot design, and for plot b, see Figure 5. A very extended and variable structure is detected, as shown in the two plots. It was one of the ICRF2 special handling sources but was selected as one of the defining sources for ICRF3.

Both closure amplitudes and closure phases detect structure effects but utilize independent measurements. We estimate the correlation between the CARMS and CPRMS values in order to test how well each method performs for identifying the magnitude of the source structure, as demonstrated in Figure 15. All sources with at least 100 closure phases and at least 100 closure amplitudes are displayed as red dots, whereas the remaining 848 sources are shown as blue dots. The red dots generally follow the pattern where the CPRMS values increase with the increased CARMS values; the blue dots, however, spread out like noise. This means that with a sufficient number of closure quantities the CARMS and CPRMS values agree with each other, but with very few closure quantities available they are subject to outliers in geodetic observables. The two-dimensional Kolmogorov-Smirnov test (Press et al. 1992, Chapter 14.7) for these two distributions is done. The conclusion of this test is that the probability of these two sample sets being drawn from the same parent distribution is 10^−58; a lower value of the probability indicates a smaller correlation between the two tested distributions, and a probability value of 1 implies they are statistically identical. The test detects the most significant difference in the two distributions when the data are divided into quadrants at the point with a CARMS value of 0.53 and a CPRMS value of 17°.9. As shown at the four corners of Figure 15, the largest frequency of occurrence of red dots happens in the lower-left quadrant with a fraction of 71% while that for blue dots in the upper-left quadrant with a fraction of 40%. A denser distribution of the blue dots on the upper-left quadrant suggests that CPRMS values are more vulnerable to outliers than CARMS values. The main reason is due to the fact that phase observables are more easily corrupted than amplitude observables in geodetic VLBI. This will be further investigated in the future using the imaging process. Based on these investigations, we suggest the CARMS values from the basic-noise weighting for the astrometric use, and the CPRMS values and other weighting schemes for the cross-check. To maximize the use and to keep the reliability of these statistics, the minimum numbers of closure quantities for an individual source are set to be 100.
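The quadrant comparison underlying the two-dimensional Kolmogorov-Smirnov test of the red-dot and blue-dot distributions can be sketched as below. This is a minimal illustration in the spirit of the procedure described by Press et al. (1992), not the code used for this paper: every data point is tried as the origin of four quadrants, and the largest difference between the two samples' quadrant fractions is recorded. The function names are hypothetical, and the significance (probability) calculation is deliberately left out.

```python
import numpy as np

def quadrant_fractions(x0, y0, x, y):
    """Fractions of the points (x, y) in the four quadrants around (x0, y0).

    Order: upper-right, upper-left, lower-left, lower-right.
    """
    right = x > x0
    upper = y > y0
    counts = np.array([
        np.sum(right & upper),
        np.sum(~right & upper),
        np.sum(~right & ~upper),
        np.sum(right & ~upper),
    ])
    return counts / len(x)

def ks2d_statistic(x1, y1, x2, y2):
    """Return (D, origin) for two 2D samples.

    D is the average, over the two samples, of the maximum quadrant-fraction
    difference found when using that sample's points as origins; origin is
    the point giving the single largest difference (None if there is none).
    """
    x1, y1, x2, y2 = (np.asarray(a, dtype=float) for a in (x1, y1, x2, y2))
    best_diff, best_origin = 0.0, None
    per_sample_max = []
    for xs, ys in ((x1, y1), (x2, y2)):
        d_max = 0.0
        for x0, y0 in zip(xs, ys):
            f1 = quadrant_fractions(x0, y0, x1, y1)
            f2 = quadrant_fractions(x0, y0, x2, y2)
            diff = float(np.abs(f1 - f2).max())
            d_max = max(d_max, diff)
            if diff > best_diff:
                best_diff, best_origin = diff, (float(x0), float(y0))
        per_sample_max.append(d_max)
    return 0.5 * sum(per_sample_max), best_origin

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    a = rng.uniform(0.0, 1.0, size=(50, 2))
    b = rng.uniform(0.5, 1.5, size=(50, 2))
    print(ks2d_statistic(a[:, 0], a[:, 1], b[:, 0], b[:, 1]))
```

Applied to (CARMS, CPRMS) pairs, the returned origin plays the role of the split point at which the two populations differ most, and the quadrant fractions at that point correspond to the percentages quoted for the four corners of Figure 15.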

Connection with Structure Effects on Group Delay Observables
There are several practical reasons why group delay observables were not used to investigate source structure in this study. First, our previous studies (Xu et al. 2016, 2017; Anderson & Xu 2018a) found that closure phases and closure amplitudes always show clearer structure-effect patterns than closure delays, mainly because of the large uncertainties of delay observables. Second, nonclosing delay offsets exist in most geodetic VLBI sessions; their impact can be mitigated by estimating baseline clock offsets in geodetic data analysis, but they make closure-delay analysis unreliable and bias the quantification of structure effects. Third, the outlier flagging for group delay observables is also unreliable, owing to their large, variable uncertainties in different types of IVS sessions. Fourth, various other technical issues make closure-delay analysis relatively difficult; for instance, the group delay ambiguity introduced by the bandwidth synthesis process is often not properly resolved when a good X-band delay is paired with a bad S-band observable. In this section, therefore, we aim to connect CARMS values with the structure effects on group delay observables, based on our recent study (Anderson & Xu 2018a), in which we derived images and performed statistical closure analysis for CONT14 observations. The magnitudes of structure effects on group delay observables for sources with various CARMS values are determined as test cases, to give insights into the potential impacts of source structure on geodetic VLBI data analysis.

The difference between these two distributions is investigated by using the Kolmogorov-Smirnov test, which confirms that they are statistically different from each other. The test identifies the point with a CARMS value of 0.53 and a CPRMS value of 17°.9, which defines the four natural quadrants, represented by the two gray dashed lines, to give the largest difference between the distributions of red dots and blue dots. The frequencies of occurrence of the red dots and blue dots in each of the four quadrants are displayed in the four corners with their corresponding colors. The CARMS values are recommended for astrometric use, and the CPRMS values for the cross-check.
As shown by Charlot (1990), the structure effects on VLBI group delay measurements can be modeled based on the spatial brightness distribution of the observed source, the source image, and the geometry of the baseline vector projected onto the plane of the sky. We have derived high-dynamic source images from the CONT14 observations. The contributions of intrinsic structures to group delays of two typical geodetic baselines, TSUKUB32-WETTZELL and NYALES20-TSUKUB32, are calculated and shown in Figure 16 for five sources with CARMS values from 0.18 to 0.91. These CARMS values are recalculated based on the observations in CONT14 only and, thus, represent their structure-effect magnitudes at that time. 0016+731 had a very compact structure in 2014 May, with a recalculated CARMS value of 0.19, and had a maximum structure delay of a few picoseconds on these two baselines. However, the other four sources, with CARMS values larger than 0.3, had structure delays of up to 40 ps on the shorter baseline NYALES20-TSUKUB32. This magnitude can be reduced by removing a sinusoidal signal with a period of 24 hr, which arises from the chosen reference source position in the image, which is not necessarily the same as the position estimated from geodetic data analysis. However, the variations will still remain at a level above 10 ps. The structure delays on the longer baseline TSUKUB32-WETTZELL, the length of which is also very common for the IVS observations, increase to several tens of picoseconds and change more rapidly with respect to GMST for sources with CARMS values larger than 0.5. They even reach a magnitude of 400 ps for the very extended source 0642+449 in 2014 May. The structure-effect magnitudes of these five sources for baseline TSUKUB32-WETTZELL are listed in Table 3. The overall contributions of their source structures to the actual CONT14 observations, also listed there, are taken from Table C4 of Anderson & Xu (2018a). The overall contribution for a source with a CARMS value of 0.38 is already at the level of 15 ps, and those for extended sources with CARMS values larger than 0.5 are at the level of 20 ps.
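To illustrate how a brightness distribution and a projected baseline translate into a structure delay, the following Python sketch may help. It is a simplified stand-in, not the procedure used for Figure 16: the source is approximated by a few point components rather than an image, the group delay is taken as a finite-difference derivative of the structure phase with respect to frequency rather than the result of bandwidth synthesis, and the sign convention of the visibility phase is an assumption. All function and parameter names are hypothetical.

```python
import numpy as np

C = 299_792_458.0                        # speed of light, m/s
MAS = np.pi / (180.0 * 3600.0 * 1000.0)  # radians per milliarcsecond

def visibility(freq_hz, baseline_m, fluxes, offsets_mas):
    """Complex visibility of a set of point components.

    baseline_m:  projected baseline (east, north) on the sky plane, in meters.
    offsets_mas: (n, 2) component offsets (east, north) from the reference
                 point of the model, in mas.
    Assumed sign convention: phase = -2*pi*(u*x + v*y).
    """
    uv = np.asarray(baseline_m, dtype=float) * freq_hz / C  # in wavelengths
    xy = np.asarray(offsets_mas, dtype=float) * MAS         # in radians
    phase = -2.0 * np.pi * (xy @ uv)
    return np.sum(np.asarray(fluxes, dtype=float) * np.exp(1j * phase))

def structure_group_delay(freq_hz, baseline_m, fluxes, offsets_mas,
                          dfreq_hz=1.0e6):
    """Structure group delay in seconds: (1 / 2*pi) * d(phase)/d(frequency).

    The derivative is approximated by a central finite difference; the phase
    difference is taken from the product v_hi * conj(v_lo) to avoid wrapping.
    """
    v_hi = visibility(freq_hz + dfreq_hz, baseline_m, fluxes, offsets_mas)
    v_lo = visibility(freq_hz - dfreq_hz, baseline_m, fluxes, offsets_mas)
    dphi = np.angle(v_hi * np.conj(v_lo))
    return dphi / (2.0 * np.pi * 2.0 * dfreq_hz)
```

In this sketch, a single point component displaced from the reference point produces a delay of −B·x/c, which is a pure position offset and varies sinusoidally with a 24 hr period as the projected baseline rotates. Genuine structure delays arise only when several components interfere, which is why they grow and vary more rapidly on longer baselines.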

CARMS Distribution in the ICRF Catalogs
The categorizations of ICRF2 and the newly released ICRF3 are verified by examining the CARMS and CPRMS distributions over different ICRF categories. Figure 17 shows the CARMS values, as a function of decl., for ICRF2 defining sources, ICRF2 special handling sources, and ICRF3 defining sources. As sources below decl. −45° are not visible to the VLBA, structure indices are not available for these sources, and therefore structure index criteria cannot be applied to select defining sources at lower decl. Unfortunately, some extremely extended sources have been included as defining sources in both ICRF2 and ICRF3, as shown on the upper-left corner of the figure with blue stars and red circles. The 39 special handling sources selected in ICRF2 based on their astrometric qualities all have significantly large CARMS values except 0235+164, which is listed in the defining group for ICRF3. Compared to ICRF2, ICRF3 excludes some sources with large CARMS values, shown as blue stars in the upper part of the figure, and includes tens of sources with small CARMS values, shown as red circles at the bottom. The CARMS values versus the CPRMS values for the sources in these three categories are shown in Figure 18, where a good agreement between these two types of statistics and different features in the CARMS and CPRMS distributions for these three categories are addressed. The statistics of CARMS values for different ICRF groups are listed in Table 4. A big improvement in selecting defining sources happened from ICRF1 to ICRF2, mainly because structure index criteria were not used for the creation of ICRF1. A small improvement also happened in the creation of ICRF3.

Note. CONT14 CARMS values in column (2) are determined from CONT14 closure measurements only. The median and mean values in columns (4) and (5) are the median and mean of the absolute structure effects on baseline TSUKUB32-WETTZELL, while the median and mean values in columns (6) and (7), taken from Anderson & Xu (2018a), are the variances of structure-effect contributions for all CONT14 observables of individual sources, using the median and mean statistics to determine the contributions of structure effects for each observable.

Figure 17. Plot of CARMS values vs. declinations for 295 ICRF2 defining sources (blue crosses), 39 ICRF2 special handling sources (black triangles), and 296 ICRF3 defining sources (red circles). There are seven ICRF3 defining sources missing because they have fewer than 10 closure amplitudes available. Extremely extended sources have unfortunately been included as defining sources in both ICRF2 and ICRF3 due to lack of structure indices derived from VLBA observations and the need to fill defining sources in the southern sky for a uniform distribution, as shown with red circles and blue crosses in the upper-left corner. About 21% of ICRF3 defining sources have CARMS values larger than 0.4.
The median and mean of CARMS for all sources are 0.25 and 0.32, respectively. More than half of the sources in ICRF3 are so-called VCS sources that were observed exclusively by VLBA, whereas only approximately 1000 sources are routinely observed by the IVS global network. The ICRF3 sources are thus divided into two groups: VCS sources that have observations only by VLBA stations, and the remaining sources, named non-VCS sources. The median and mean of the CARMS values for non-VCS sources are 0.31 and 0.36, respectively, which are larger than those for ICRF3 defining sources and suggest that the sources routinely used for geodesy typically exhibit extended structures and have significant structure effects on geodetic VLBI observations. The median and mean of CARMS for VCS sources are only 0.23 and 0.30, respectively. Observations from IVS global networks with higher resolution are subject to structure effects much worse than those from the VLBA array. This is due to two reasons: (1) sources become more resolved on longer baselines and show more evidence of structure, and (2) longer baselines result in larger structure-effect magnitudes. The difference between the IVS global observations and the VLBA observations is investigated by separating closure amplitudes in the following way. A closure amplitude with all four stations from the VLBA array is classified as one from VLBA observations, which then produce VLBA CARMS values, while the others, with at least one non-VLBA station, are classified as ones from non-VLBA observations, giving non-VLBA CARMS values. This classification is relevant because routine geodetic VLBI observations are mostly made by the non-VLBA stations. Then, CARMS is recalculated based on these two sets of closure amplitudes. VLBA CARMS values are determined for 3788 sources, and non-VLBA CARMS values are determined for 976 sources. The median and mean of the non-VLBA CARMS values are 0.37 and 0.43, and those for the VLBA CARMS values are 0.23 and 0.29, respectively. Histograms of these CARMS values are displayed in Figure 19. The overall structure-effect magnitudes of the geodetic sources, the majority of which are those non-VLBA sources, therefore are more critical. There are 863 sources that have both the VLBA and non-VLBA CARMS values. These CARMS values are displayed in Figure 20, which demonstrates that VLBA observations nearly always give smaller CARMS values than the non-VLBA observations. Therefore, a significant number of radio sources that have been observed exclusively by VLBA, currently having small CARMS values, are expected to show extended structures if they are observed by the global IVS networks in the future.
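The separation rule described above can be sketched in a few lines of Python. This is only an illustration of the classification step: the record layout (a tuple of four station names plus a closure amplitude value) and the function names are assumptions, and the station list uses the names under which the ten VLBA antennas commonly appear in geodetic sessions, which should be checked against the station catalog actually in use.

```python
# Assumed station names of the ten VLBA antennas in geodetic session files.
VLBA_STATIONS = frozenset({
    "BR-VLBA", "FD-VLBA", "HN-VLBA", "KP-VLBA", "LA-VLBA",
    "MK-VLBA", "NL-VLBA", "OV-VLBA", "PIETOWN", "SC-VLBA",
})

def classify_quadrangle(stations):
    """Return "VLBA" if all four stations are VLBA antennas, else "non-VLBA"."""
    unique = set(stations)
    if len(unique) != 4:
        raise ValueError("a closure amplitude needs four distinct stations")
    return "VLBA" if unique <= VLBA_STATIONS else "non-VLBA"

def split_closure_amplitudes(records):
    """Group closure amplitude values by class.

    records: iterable of (stations, value) pairs, where stations is a
             sequence of four station names (hypothetical layout).
    """
    groups = {"VLBA": [], "non-VLBA": []}
    for stations, value in records:
        groups[classify_quadrangle(stations)].append(value)
    return groups
```

A quadrangle with even a single non-VLBA station falls into the non-VLBA group, so mixed VLBA and IVS quadrangles contribute to the non-VLBA CARMS values.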
It is well known that VLBA antennas have higher sensitivity and better stability than IVS stations. However, instrumental phase and amplitude offsets should cancel out exactly in closure quantities, so the VLBA instrumental stability does not account for the lower CARMS values for VLBA-only observations. By inspecting closure statistics for sources with little structure, we found that the contributions of measurement noise to CARMS values are well below 0.2 for IVS networks and below 0.1 for VLBA. There is a significant difference, but these contributions are still far below the median CARMS value of 0.37 for IVS networks and the median value of 0.23 for VLBA, reported above for the complete sample of sources. The difference in the contributions of measurement noise between IVS networks and VLBA therefore does not account for the discrepancy between VLBA-only and IVS CARMS values either.
On the other hand, one possible reason for that discrepancy, in particular for the sources that were observed by VLBA only, might be that VLBA has been used in special sessions to observe much weaker sources (up to ∼10 times fainter) than the sources observed in regular IVS sessions. Weaker sources should have smaller angular sizes, either because they are more distant or because they are less active. Such weaker sources are expected to have smaller structure effects in observations. However, we did not find a significant difference in the median values of VLBA-only CARMS between the sources observed by VLBA only and those observed by both VLBA and IVS networks. Therefore, this should not be a major reason.

Summary and Future Works
We have analyzed phase and amplitude measurements from the 40 yr history of geodetic/astrometric VLBI observations. The concept of CARMS based on closure amplitudes is defined and used to quantify the overall structure-effect magnitude in the geodetic/astrometric VLBI data set for each individual source. CARMS values are available for the 3417 ICRF3 radio sources with at least 100 closure measurements. Almost the entire ICRF3 catalog is evaluated in terms of source structure.
The main conclusion of the closure analysis in this paper is that CARMS values smaller than 0.3 suggest relatively compact structures and limited structure effects in geodetic VLBI observations and that CARMS values larger than 0.4 indicate very extended source structures. Based on the detailed investigation of the 30 most frequently observed sources, 28 sources are demonstrated to be resolved at different levels at the resolutions of the global VLBI networks. Very extended source structures have been identified for 19 of these 30 sources, and most of them have structures that are highly variable on timescales from several weeks to years, except for three stable sources, 3C 371, 0552+398, and 1803+784. These highly variable structures make the corrections of structure effects in geodetic VLBI data analysis both important and challenging.
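The thresholds quoted in this paper can be collected into a small helper, sketched here in Python. The category labels, the function name, and the handling of values exactly equal to 0.3 or 0.4 are assumptions made for illustration; the minimum of 100 closure quantities follows the reliability criterion adopted for the statistics.

```python
def structure_category(carms, n_closures, min_closures=100):
    """Map a CARMS value to a qualitative structure category.

    Assumptions: the boundary values 0.3 and 0.4 themselves are assigned
    to the intermediate category, and the labels are illustrative only.
    """
    if n_closures < min_closures:
        return "insufficient data"
    if carms < 0.3:
        return "relatively compact"
    if carms <= 0.4:
        return "significant structure"
    return "very extended"
```

For example, a source with a CARMS value of 0.38 from several hundred closure amplitudes would be reported as having significant structure, consistent with the roughly 15 ps contribution found for such a source in CONT14.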
The CARMS values differ dramatically among the source categories in the ICRF catalogs. For instance, the median and mean of the CARMS values for ICRF2 special handling sources are 0.60 and 0.65, while those for ICRF3 defining sources are 0.25 and 0.29, respectively. However, there are still a number of radio sources in the ICRF3 defining group with CARMS values larger than 0.4. We recommend that CARMS values be used as astrometric quality indicators, for instance, for the categorization of future ICRFs and in the weighting of radio sources used for the linking of Gaia and VLBI catalogs.
In the future, structures of ICRF sources need to be continuously monitored by updating their CARMS values with new geodetic VLBI observations as they become available. As already initially discussed in this paper, the variability of intrinsic structure with time can be obtained by closure analysis as well. Changes in intrinsic structure should inevitably lead to changes in derived source positions, which will degrade the stability of the radio celestial reference frame. Together with the overall structure-effect magnitude indicated by CARMS, variability of intrinsic structure should therefore serve as additional guidance for the use of the ICRF3 and for the categorization of radio sources, in particular the defining ones. Identification of the epochs and the magnitudes of structure evolution will be discussed in another paper. The closure phases and closure amplitudes will be further used to derive a time series of images with techniques such as those demonstrated by Chael et al. (2018). Based on those images, structure effects can be corrected in geodetic VLBI data analysis. This study sets a basis for that effort by flagging outliers and examining the structure-effect pattern and evolution for each individual source, which is necessary and important for combining observations from adjacent sessions to make images.