A search for hypercompact HII regions in the Galactic Plane

We have carried out the largest and most unbiased search for hypercompact (HC) HII regions. Our method combines four interferometric radio continuum surveys (THOR, CORNISH, MAGPIS and White2005) with far-infrared and sub-mm Galactic Plane surveys to identify embedded HII regions with positive spectral indices. 120 positive spectrum HII regions have been identified from a total sample of 534 positive spectral index radio sources. None of these HII regions, including the known HCHII regions recovered in our search, fulfills the canonical definition of an HCHII region at 5 GHz. We suggest that the current canonical definition of HCHII regions is not accurate and should be revised to include a hierarchical structure of ionized gas that results in an extended morphology at 5 GHz. Correlating our search with known ultracompact (UC) HII region surveys, we find that roughly half of detected UCHII regions have positive spectral indices, instead of more commonly assumed flat and optically thin spectra. This implies a mix of optically thin and thick emission and has important implications for previous analyses which have so far assumed optically thin emission for these objects. Positive spectrum HII regions are statistically more luminous and possess higher Lyman continuum fluxes than HII regions with flat or negative indices. Positive spectrum HII regions are thus more likely to be associated with more luminous and massive stars. No differences are found in clump mass, linear diameter or luminosity-to-mass ratio between positive spectrum and non-positive spectrum HII regions.


INTRODUCTION
Newly formed massive stars are deeply embedded within molecular clouds, but they produce powerful Lyman continuum emission that is sufficiently energetic to ionize their surroundings and create observable ionized regions, known as Hii regions. Hii regions are over-pressured with respect to the surrounding interstellar medium and so expand over time, driving ionization shocks into the ambient medium (Tenorio-Tagle 1979;Dyson et al. 1995). The smallest Hii regions are thus likely to be the youngest, which means that the most dense and compact Hii regions can shed light on the early development of massive stars (Peters et al. 2010).
However, many details of these early stages are currently unclear (Hoare et al. 2007; Zinnecker & Yorke 2007), and different theoretical models predict very different outcomes for the earliest stages of Hii regions. For example, as massive stars reach the main sequence while still accreting material, the  and Peters et al. (2010) turbulent core and ionization feedback models both envisage the youngest Hii regions expanding into outflow-driven cavities away from the main accretion flows (e.g. Tanaka et al. 2016). This implies the early development of Hii regions within the star formation process. However, the models of Hosokawa & Omukai (2009) and Hosokawa et al. (2010) suggest that the high accretion rates of material onto the massive star cause its outer layers to swell, reducing the star's effective temperature. This delays the initial development of Hii regions until after the accretion phase has finished.
At first sight, the difference in the onset time for Hii regions from different models implies that the relative incidence of the smallest and earliest Hii regions may allow different models to be discriminated. However, subsequent differences in the expansion rate of the Hii regions may complicate matters. For example, the rapid expansion of Hii regions into outflow-cleared cavities (e.g. Tanaka et al. 2016) in the  and Peters et al. (2010) models may result in a dearth of small Hii regions despite the early onset of the ionization. Nevertheless, it is clear that the main differences in these theoretical models arise at the earliest onset of the Hii region phase and that studying the physical properties of the youngest and smallest Hii regions will help to improve our theoretical understanding of massive star formation.
We summarize the observational properties of known HC Hii regions from the literature in Table 1. As can be seen, the RRL line widths of HC Hii regions, their emission measures (EM) and electron densities (n_e) are in general larger than those of UC Hii regions, although there is a considerable spread of values. The radio spectral indices of HC Hii regions are positive with a typical value ∼1 (Beuther et al. 2007), i.e., between the purely optically thick and optically thin values discussed by Kurtz (2005), which may, in turn, indicate line-of-sight density gradients (Keto 2003).
Another property of HC Hii regions is their rarity, with only 16 confirmed HC Hii regions in the literature compared to ∼600 UC Hii regions (Lumsden et al. 2013; Cesaroni et al. 2015). However, it is not clear if this rarity is intrinsic or caused by observational biases. As most surveys for young Hii regions have been carried out between 1.4 and 5 GHz, they suffer from an observational bias towards the discovery of objects with flat or falling spectral indices. HC Hii regions with strongly positive spectra have 1.4-5 GHz fluxes orders of magnitude lower than those of UC Hii regions and are thus likely to have been missed in low-frequency surveys. The majority of the HC Hii regions listed in Table 1 were discovered serendipitously by high-frequency observations of known UC Hii regions.
Our current limited understanding of the number and global properties of HC Hii regions makes it difficult to constrain theoretical models of massive star formation, as these models predict very different evolutionary scenarios for Hii regions. It is also difficult to place HC Hii regions into context with the better-known UC Hii regions.
For example, the defining characteristics of size, electron density, and emission measure proposed by Kurtz (2005) fall on a continuous spectrum and so it is difficult to identify discrete "types". As we can see in Table 1 the sizes, emission measures, and electron densities of many of the identified "HC Hii regions" fall within ranges that are more appropriate to UC Hii regions. With only a handful of identified HC Hii regions, it is difficult to identify a clear dividing line between the two types of Hii regions. Understanding the evolution of the youngest and densest Hii regions requires more confirmed examples to be found.
We have undertaken the largest and most systematic search yet carried out for HC Hii regions. As Table 1 shows, the easiest defining observational characteristic of HC Hii regions is their positive radio spectral index. While broad recombination line widths are a key observational feature of HC Hii regions, routine spectroscopy of large numbers of faint Galactic radio sources will not be possible until the advent of the SKA (e.g. Thompson et al. 2015). We have used current radio interferometer surveys of the Galactic Plane, namely CORNISH (5 GHz, Hoare et al. 2012), MAGPIS (1.4 GHz, Helfand et al. 2006; White et al. 2005), and THOR (1-2 GHz, Bihr et al. 2016; Beuther et al. 2016), to identify objects with positive spectral indices between 1.4 and 5 GHz. We then cross-match these positive spectrum radio sources with mid/far-infrared and submillimetre survey data to identify objects embedded within molecular clumps, as would be expected for the initial stages of massive star formation. This is very similar to the strategy that was used to classify sources in CORNISH. Of course, the sample that we have assembled here is also subject to the observational bias towards negative and flat spectral indices that we have previously mentioned. However, we can nevertheless identify bright HC Hii regions within the CORNISH catalogue and obtain a much wider understanding of young Hii regions.

This paper is organized as follows. Section 2 describes the Galactic Plane surveys we used to identify positive spectrum radio objects and confirm those that are embedded within molecular cloud clumps. We describe the procedure used to determine spectral indices and identify a sample of objects with rising radio spectra in Section 3. In Section 4, we characterize the properties of the sample at far-infrared (FIR) and submillimetre (sub-mm) wavelengths, select a subsample of young and compact Hii regions and discuss their properties. In particular, we compare the properties of Hii regions with rising (positive) radio spectra to those with flat or negative spectra and dwell upon the implications of our survey for the frequency of HC Hii regions in the Galaxy. In Section 5 we present a summary of our conclusions.

Table notes: A detailed description of our selection process is given in Sections 3.1 and 3.2. MAGPIS has a published catalog only within the region l = 5°-32°, |b| < 0.8° (Helfand et al. 2006). For the remaining regions within the MAGPIS survey we use the AEGEAN photometry package (Hancock et al. 2012) to construct a source catalogue from image cutouts obtained from the MAGPIS website. For THOR, only a catalog within l = 14.0°-37.9°, l = 47.1°-51.2°, |b| ≤ 1° has been published (Bihr et al. 2016). For White et al. (2005), only the region overlapping with CORNISH is presented; the total sky region of that survey is l = −20° to 120° and b = ±0.8° to ±2.7°.
Our starting point was CORNISH (the Coordinated Radio "N" Infrared Survey for High-mass star formation), which we used to form a base 5 GHz radio source catalog. CORNISH was a sensitive (∼ 0.4 mJy beam⁻¹) and high-resolution (∼ 1.5″) 5 GHz survey of a section of the Galactic Plane (10° < l < 65° and |b| < 1°), using the JVLA in B and BnA configurations. CORNISH detected 3062 continuum sources above 7σ (Hoare et al. 2012; Purcell et al. 2013). Together with CORNISH, we used three 1.4 GHz radio surveys to determine 1.4-5 GHz spectral indices (or lower limits to the spectral index where only 1.4 GHz upper limits were available): MAGPIS, THOR and the White et al. (2005) VLA Galactic Plane survey.
The Multi-Array Galactic Plane Imaging Survey (MAGPIS) has the highest sensitivity and resolution of the 1.4 GHz surveys that we have used, with a resolution of ∼ 6″ and a noise level of ∼ 0.3 mJy beam⁻¹ (Helfand et al. 2006). A catalog has been published for the survey region 5° < l < 32° and |b| < 0.8°. MAGPIS images are also available between l = 32°-48.5° and |b| < 0.8°, but no catalog of objects detected within these images has yet been published. We used image cutouts available from the MAGPIS website to identify 1.4 GHz counterparts to CORNISH sources in the uncatalogued region of the Galactic Plane. Further details of this process are described in Section 3.2.
To cover the remainder of the CORNISH survey region, we have used 1-2 GHz catalogs and data from The HI, OH, Recombination line survey of the Milky Way (THOR, Bihr et al. 2016; Beuther et al. 2016) and the 1.4 GHz VLA survey of White et al. (2005), hereafter White2005. Details of all four surveys are summarised in Table 2.

Once the 1.4-5 GHz radio spectral index of the objects has been derived, we select those objects with a positive spectral index and confirm that they are embedded within molecular cloud clumps by examining cutout images from the ATLASGAL and Hi-GAL surveys. The APEX Telescope Large Area Survey of the Galaxy (ATLASGAL) has a resolution of 19″ and a typical noise level of 50 to 70 mJy beam⁻¹ (Schuller et al. 2009), providing the largest unbiased database for detailed studies of large numbers of massive star-forming clumps in their early stages in the Galaxy. The Herschel infrared Galactic Plane Survey (Hi-GAL) imaged the Galactic Plane in five far-infrared and sub-millimetre bands, with the principal aim of detecting the earliest phases of the formation of molecular clouds and high-mass stars (Molinari et al. 2010, 2016). We used the most sensitive band of Hi-GAL, i.e. the 250 µm band, which has a resolution of 18″ (similar to the ATLASGAL resolution of 19″) and a 1σ flux density sensitivity of 12.8 mJy beam⁻¹. The 250 µm band lies close to the peak of the SED of cold dust and is thus an excellent tracer of molecular clumps surrounding embedded Hii regions.

DETERMINING THE RADIO SPECTRAL INDEX
The spectral index α of a radio source is defined by a power-law relationship between its flux density S_ν and frequency ν (S_ν ∝ ν^{+α}). In practice the spectral index is determined between two flux densities S_{ν_1} and S_{ν_2} measured at two specific frequencies ν_1 and ν_2 using the following equation:

$$\alpha = \frac{\ln\left(S_{\nu_1}/S_{\nu_2}\right)}{\ln\left(\nu_1/\nu_2\right)} \qquad (1)$$

The true value of the spectral index α can only be determined when both fluxes in Equation 1 are available. Nevertheless, it is possible to determine a lower limit to the spectral index, α_min, for sources that are detected only at the higher frequency (5 GHz in our case) but not at the lower (1.4 GHz), by using an upper limit for the flux density at the lower frequency, i.e.

$$\alpha_{\min} = \frac{\ln\left[\left(S_{5\,\mathrm{GHz}} - dS_{5\,\mathrm{GHz}}\right)/\left(5\,dS_{1.4\,\mathrm{GHz}}\right)\right]}{\ln\left(5/1.4\right)} \qquad (2)$$

where dS_ν represents the 1σ r.m.s. error in the flux density at frequency ν. Note that this equation assumes a 5σ detection threshold to determine the upper limit and also subtracts the 1σ error from the 5 GHz measurement to determine a reliable lower limit to the spectral index. In the following subsections we present our method of determining α and α_min from the CORNISH, MAGPIS, THOR and White2005 surveys. The individual steps in the process are summarized in Table 3.
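The two definitions above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function names are hypothetical and the flux densities are made-up values in mJy, chosen only to exercise the formulas.

```python
import math

def spectral_index(s_lo, s_hi, nu_lo=1.4, nu_hi=5.0):
    """Two-point spectral index alpha, with S_nu proportional to nu**alpha."""
    return math.log(s_hi / s_lo) / math.log(nu_hi / nu_lo)

def spectral_index_lower_limit(s_hi, ds_hi, rms_lo,
                               nu_lo=1.4, nu_hi=5.0, nsigma=5.0):
    """Lower limit alpha_min for a source undetected at the lower frequency.

    The low-frequency flux is replaced by an nsigma upper limit, and the
    1-sigma error is subtracted from the high-frequency flux.
    """
    return math.log((s_hi - ds_hi) / (nsigma * rms_lo)) / math.log(nu_hi / nu_lo)

# Illustrative (made-up) values in mJy.
# A source whose flux scales exactly as nu**2 between 1.4 and 5 GHz:
alpha = spectral_index(s_lo=10.0, s_hi=10.0 * (5.0 / 1.4) ** 2)

# A 10 +/- 1 mJy source at 5 GHz with a 1.4 GHz rms noise of 0.3 mJy/beam:
alpha_min = spectral_index_lower_limit(s_hi=10.0, ds_hi=1.0, rms_lo=0.3)

print(f"alpha = {alpha:.2f}, alpha_min = {alpha_min:.2f}")
```

The lower limit is conservative by construction: a deeper 1.4 GHz image (smaller rms) raises α_min, while a noisier 5 GHz measurement lowers it.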

CORNISH sources with 1.4 GHz counterparts
Using the CORNISH and MAGPIS/White2005/THOR source catalogs, we first identify CORNISH catalog sources that are positionally associated with corresponding MAGPIS/White2005/THOR catalog sources. As the MAGPIS and White2005 1.4 GHz surveys have resolutions of 6″, we use a circular matching threshold of 6″; however, as THOR has a lower angular resolution of 10-25″, we relax the matching threshold to 20″, as used in THOR (Bihr et al. 2016). Using these matching criteria we obtain 700, 963, and 1060 matches to CORNISH sources from MAGPIS, White2005, and THOR respectively.
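The positional matching described here can be sketched as follows. This is only an illustration under stated assumptions: the dictionary keys `l` and `b` (Galactic coordinates in degrees) and the function names are hypothetical, and a brute-force loop stands in for whatever catalogue tool was actually used. The matching radii are the ones quoted in the text.

```python
import math

ARCSEC_PER_DEG = 3600.0

# Matching radii in arcsec, as quoted in the text
MATCH_RADIUS = {"MAGPIS": 6.0, "White2005": 6.0, "THOR": 20.0}

def separation_arcsec(l1, b1, l2, b2):
    """Angular separation (arcsec) between two positions given in degrees."""
    l1, b1, l2, b2 = map(math.radians, (l1, b1, l2, b2))
    # Haversine formula, numerically stable for small separations
    h = (math.sin((b2 - b1) / 2) ** 2
         + math.cos(b1) * math.cos(b2) * math.sin((l2 - l1) / 2) ** 2)
    return math.degrees(2 * math.asin(math.sqrt(h))) * ARCSEC_PER_DEG

def nearest_match(src, catalog, survey):
    """Return the closest catalog entry within the survey's radius, or None."""
    best, best_sep = None, MATCH_RADIUS[survey]
    for cand in catalog:
        sep = separation_arcsec(src["l"], src["b"], cand["l"], cand["b"])
        if sep <= best_sep:
            best, best_sep = cand, sep
    return best

# Hypothetical example: one 5 GHz source and two 1.4 GHz candidates
src = {"l": 30.0, "b": 0.0}
near = {"l": 30.0 + 4.0 / 3600.0, "b": 0.0}    # 4 arcsec away
far = {"l": 30.0 + 15.0 / 3600.0, "b": 0.0}    # 15 arcsec away
print(nearest_match(src, [far, near], "MAGPIS") is near)
```

With real catalogues of thousands of sources, a spatial index (for example a k-d tree) would replace the inner loop, but the selection logic is the same.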
Next, we remove duplicates caused by overlapping survey regions and create a unique list of matching sources by merging the MAGPIS, White2005 and THOR matches. Duplicates are removed by choosing the counterpart from the survey with the highest sensitivity at the CORNISH source position. In general, MAGPIS has the highest sensitivity, followed by THOR and finally by White2005. We must also take into account the fact that a MAGPIS catalog has not been published for the region l = 32°-48.5°, although images are available from the MAGPIS project website. So, for all CORNISH sources lying in this region of the Plane, we obtained and inspected MAGPIS cutout images.
For those CORNISH sources with 1.4 GHz counterparts, we measured their 1.4 GHz peak and integrated fluxes using the AEGEAN package (Hancock et al. 2012). In total, 53 CORNISH sources in this region were found to have MAGPIS counterparts. Overall, the matching and deduplication process resulted in 700 MAGPIS matches, 556 THOR matches and 397 White2005 matches.
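The deduplication rule (keep the counterpart from the most sensitive survey at each CORNISH position) amounts to a group-by with a minimum over the local noise. The sketch below assumes hypothetical record fields `cornish_name`, `survey` and `rms_mjy`; the source names and noise values are invented for illustration.

```python
def deduplicate(matches):
    """For each CORNISH source keep the counterpart with the lowest local rms.

    Lowest rms noise at the source position is used here as the measure of
    'highest sensitivity'.
    """
    best = {}
    for m in matches:
        name = m["cornish_name"]
        if name not in best or m["rms_mjy"] < best[name]["rms_mjy"]:
            best[name] = m
    return best

# Hypothetical matches: source A is covered by all three surveys
matches = [
    {"cornish_name": "A", "survey": "White2005", "rms_mjy": 0.9},
    {"cornish_name": "A", "survey": "MAGPIS", "rms_mjy": 0.3},
    {"cornish_name": "A", "survey": "THOR", "rms_mjy": 0.5},
    {"cornish_name": "B", "survey": "THOR", "rms_mjy": 0.4},
]
unique = deduplicate(matches)
print({name: m["survey"] for name, m in unique.items()})
```

Comparing the local rms rather than applying a fixed survey ranking follows the wording "at the CORNISH source position", since the noise of each survey varies across the Plane.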
Finally, as we are principally interested in positive spectrum sources in this study, we calculate the 1.4-5 GHz spectral index of each source using Equation 1 and retain only those sources whose spectral index is positive within its uncertainty dα (i.e. α − dα > 0). This results in a final positive spectrum catalog of 410 sources: 108 from THOR matches, 151 from MAGPIS matches, and 151 from White2005 matches.
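A sketch of the α − dα > 0 selection is given below. The text does not state how dα was computed, so the uncertainty here is an assumption: standard first-order error propagation for independent Gaussian flux errors. The function name and the flux values (mJy) are hypothetical.

```python
import math

def alpha_with_error(s_lo, ds_lo, s_hi, ds_hi, nu_lo=1.4, nu_hi=5.0):
    """Spectral index and its 1-sigma error.

    Assumption: independent Gaussian flux errors, propagated to first order.
    """
    ln_ratio = math.log(nu_hi / nu_lo)
    alpha = math.log(s_hi / s_lo) / ln_ratio
    d_alpha = math.hypot(ds_lo / s_lo, ds_hi / s_hi) / ln_ratio
    return alpha, d_alpha

def is_positive_spectrum(s_lo, ds_lo, s_hi, ds_hi):
    """Selection used in the text: alpha - d_alpha > 0."""
    alpha, d_alpha = alpha_with_error(s_lo, ds_lo, s_hi, ds_hi)
    return alpha - d_alpha > 0

# Made-up examples: a mildly rising source with 10 per cent flux errors passes,
# a marginally rising source with ~20 per cent errors does not.
keep = is_positive_spectrum(s_lo=10.0, ds_lo=1.0, s_hi=12.0, ds_hi=1.2)
drop = is_positive_spectrum(s_lo=10.0, ds_lo=2.0, s_hi=11.0, ds_hi=2.0)
print(keep, drop)
```

Because the 1.4-5 GHz lever arm is short (ln(5/1.4) ≈ 1.27), flux errors of 10 per cent at each frequency already translate into dα ≈ 0.11 under this assumption.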
As this sample includes both point sources and extended sources, we need to bear in mind that the radio surveys are interferometric and were carried out at different frequencies and with different VLA configurations. Each survey has a different coverage of the uv plane, and thus we are only able to derive reliable spectral indices for point sources. We use the same criterion to distinguish between point and extended sources as the THOR survey (Bihr et al. 2016), where extended sources are identified as those having an integrated 1.4 GHz flux density more than 1.2 times their peak 1.4 GHz flux density (S_int/S_peak > 1.2). Because lower frequency surveys are more sensitive to extended emission than higher frequency surveys, the 1.4 GHz flux densities of extended objects could contain more extended emission than the 5 GHz CORNISH flux densities. One way to address this would be to combine multiple VLA configurations with single-dish data to improve the uv coverage and include zero-spacing flux (e.g. Tian & Leahy 2005). However, this is not feasible for the archival survey data that we use here.

Theoretically, the extremely small physical size of HC Hii regions (≤ 0.03 pc) makes them highly likely to be point sources even in these interferometric radio surveys. However, most HC Hii regions are located in complex star formation regions and/or surrounded by more diffuse ionized gas (Sewilo et al. 2004; Sewiło et al. 2011), which tends to result in more extended emission, particularly at lower frequencies. Thus, the 1.4 GHz flux density of HC Hii regions may be contaminated by their environment and exhibit more extended emission than at 5 GHz. We give two examples of this in Figures 1 and 2, which show the morphology of two known HC Hii regions at 1.4 GHz (white contours) and 5 GHz (lime contours). The lower frequency emission traced by MAGPIS is clearly extended in both sources: by a neighbouring UC Hii region in HC Hii G45.07+0.13 NE (Figure 1) and by extended emission near HC Hii W49A G (Figure 2). Therefore, we consider that the spectral indices of extended sources in the sample should be strictly considered as lower limits, and we have identified them as such in our results. In total, we obtain 256 point sources with reliable spectral indices and 154 sources that are extended at 1.4 GHz with lower limits to their spectral index.
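The point/extended criterion and its consequence for the spectral index can be written compactly. The sketch below uses the 1.2 threshold and the S_int/S_peak ratios quoted in the text for W49A G (1.5) and G45.07+0.13 NE (1.74); the function names and the remaining flux values are hypothetical.

```python
EXTENDED_THRESHOLD = 1.2  # S_int / S_peak criterion adopted from THOR

def is_extended(s_int, s_peak, threshold=EXTENDED_THRESHOLD):
    """True if the 1.4 GHz integrated flux exceeds threshold x peak flux."""
    return s_int / s_peak > threshold

def index_kind(s_int, s_peak):
    """Label how the 1.4-5 GHz spectral index should be interpreted."""
    return "lower limit" if is_extended(s_int, s_peak) else "measured"

# Ratios quoted in the text, expressed with an arbitrary peak flux of 1.0
print(index_kind(1.5, 1.0))    # W49A G
print(index_kind(1.74, 1.0))   # G45.07+0.13 NE
print(index_kind(1.05, 1.0))   # hypothetical compact source
```

Treating the index of an extended source as a lower limit reflects the fact that excess 1.4 GHz flux from the surroundings can only flatten the measured spectrum, not steepen it.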

1.4 GHz dropouts in CORNISH: spectral index lower limits
As well as deriving the spectral index from measurements at both 1.4 and 5 GHz, it is also possible to determine a lower limit to the spectral index as described in Equation 2. In order to derive α_min we need to identify sources in the CORNISH catalog that do not possess counterparts in any of the 1.4 GHz surveys (i.e. 1.4 GHz "dropouts"). We do this by searching for sources in the CORNISH catalog that do not lie within 30″ of any source within the THOR catalog and 10″ of any source within the MAGPIS and White2005 catalogs (the differing search radii result from the different beam sizes of the three 1.4 GHz surveys). That is to say, for each CORNISH source, we search for a THOR source inside a circle of 30″ radius and a MAGPIS/White2005 source within 10″ radius. If no 1.4 GHz sources lie inside this circle, the CORNISH source is preliminarily considered to be detected only at 5 GHz by CORNISH. An initial list of 173, 102, and 1323 dropout sources is obtained from THOR, MAGPIS, and White2005 respectively by this step. As a secondary constraint, to avoid associations with large, extended THOR/MAGPIS/White2005 sources (which may extend over several arcminutes), we also require each CORNISH source found in the first step to have a separation from each THOR/MAGPIS/White2005 source greater than 1.2 times the THOR/MAGPIS/White2005 source angular diameter. With this step, we remove the very large angular diameter THOR/MAGPIS/White2005 sources that may appear "over-resolved" by the higher resolution CORNISH survey. This constraint reduces our α_min sample to 123, 101, and 1245 sources for THOR, MAGPIS, and White2005 respectively.
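The two-step dropout test can be sketched as a single predicate. The search radii and the 1.2 × diameter factor are taken from the text; the field names (`l`, `b`, `diameter_arcsec`), the function names and the example positions are hypothetical, and a flat-sky separation is used as a simplification that is adequate for arcsecond-scale offsets near the Plane.

```python
import math

# Search radii in arcsec, as quoted in the text
SEARCH_RADIUS = {"THOR": 30.0, "MAGPIS": 10.0, "White2005": 10.0}

def sep_arcsec(l1, b1, l2, b2):
    """Small-angle separation (arcsec) for positions given in degrees."""
    dl = (l1 - l2) * math.cos(math.radians(0.5 * (b1 + b2)))
    return math.hypot(dl, b1 - b2) * 3600.0

def is_dropout(src, catalog, survey, size_factor=1.2):
    """True if no 1.4 GHz source lies within the search radius, and the 5 GHz
    source is farther than size_factor x diameter from every 1.4 GHz source."""
    for cand in catalog:
        sep = sep_arcsec(src["l"], src["b"], cand["l"], cand["b"])
        if sep <= SEARCH_RADIUS[survey]:
            return False      # a counterpart exists
        if sep <= size_factor * cand["diameter_arcsec"]:
            return False      # lies on a large extended source
    return True

# Hypothetical example: a 1.4 GHz source 45 arcsec from the 5 GHz position
src = {"l": 30.0, "b": 0.0}
compact = {"l": 30.0 + 45.0 / 3600.0, "b": 0.0, "diameter_arcsec": 10.0}
large = {"l": 30.0 + 45.0 / 3600.0, "b": 0.0, "diameter_arcsec": 60.0}
print(is_dropout(src, [compact], "THOR"), is_dropout(src, [large], "THOR"))
```

The second condition is what removes 5 GHz sources sitting on top of large, over-resolved 1.4 GHz structures, for which a non-detection in the catalogue would not be a meaningful upper limit.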
Next, we create a unique catalog by merging the individual THOR, MAGPIS and White2005 dropout samples and removing the duplicates caused by the overlapping survey regions. Where measurements were taken of a CORNISH source by more than one of THOR, MAGPIS or White2005, we remove the measurements with the highest noise values and keep the most sensitive measurement (generally this belongs to the THOR or MAGPIS surveys). Again we must also take into account the fact that only a portion of the MAGPIS images have a published catalogue, as described in Section 3.1. For each qualifying CORNISH source located in the region l = 32°-48.5°, we accordingly inspected corresponding image cutouts from MAGPIS to determine whether a 1.4 GHz counterpart had been detected by MAGPIS. If no 1.4 GHz counterpart had been detected, we measured the RMS noise at the position of the CORNISH source to determine an upper limit to the 1.4 GHz flux. Overall, a total of 188 CORNISH sources were found not to have identifiable 1.4 GHz counterparts in the THOR, MAGPIS or White2005 surveys (118 THOR dropouts, 6 MAGPIS dropouts, and 64 White2005 dropouts).
Finally, we calculated a lower limit to the spectral index (α min ) for each source using the CORNISH flux density and a 5σ upper limit to the 1.4 GHz flux at the CORNISH position obtained from the noise maps of the THOR, MAGPIS and White2005 surveys. Again as we are principally interested in positive spectrum sources, we filter out those sources with α min < 0 and list the remaining 124 positive spectrum objects in Appendix Table A2.

Selection of Hii regions from the sample
Using the methods outlined in the previous section, we have identified a total of 256 radio sources whose spectral index α − dα > 0 and a further 278 positive spectrum objects with a minimum spectral index α_min > 0, obtaining an initial sample of 534 positive spectrum objects. We present this sample in Appendix Table A2. However, radio sources with positive spectra are not just limited to HC Hii regions, and this sample is likely to contain radio galaxies and planetary nebulae as well as Hii regions. We must also consider that the radio surveys used here were not observed simultaneously and so radio sources whose flux has varied between the 1.4 GHz and 5 GHz measurements may result in the derivation of an incorrect spectral index. Caution must thus be applied to the interpretation of the resulting sample.

[Figure caption, beginning lost: ... (6″) are white circles presented in the lower left corner of each image. Overlaid lime and white contours on each image show 5 GHz emission from CORNISH and 1.4 GHz emission from MAGPIS, respectively. This is an example of an HC Hii region showing extended 1.4 GHz emission (S_int/S_peak = 1.5) in our sample and we regard its spectral index as a lower limit (α_min).]
In order to select Hii regions from our sample of positive spectrum objects, we use the Hi-GAL and ATLASGAL surveys to identify those radio sources that are also associated with far-infrared and sub-mm emission. Our approach mirrors that of two recent studies that identified young and ultracompact Hii regions (Cesaroni et al. 2015; Urquhart et al. 2013). The appearance of each positive spectrum radio source was inspected in ATLASGAL 870 µm and Hi-GAL 250 µm cutout images, and radio sources that are positionally associated with compact 870 and 250 µm emission were identified. These radio sources are likely to be located within molecular cloud clumps, and so represent young Hii regions still in their embedded phase. We show two examples of known HC Hii regions and their far-infrared and sub-mm morphologies in Figure 3 to illustrate their association with compact FIR and sub-mm emission. It is possible that a small number of these sources are chance alignments of background radio galaxies with foreground ATLASGAL and/or Hi-GAL sources. However, we expect the number of these alignments to be small, as the total number of chance alignments for CORNISH was estimated by Urquhart et al. (2013) to be 14±4, and the majority of radio galaxies exhibit a negative spectral index. Hence, the number of chance alignments in our sample is expected to be statistically insignificant.
Of the 534 objects in our positive spectrum sample, 26 sources lie outside of the ATLASGAL and Hi-GAL survey regions and so are excluded from further analysis. A total of 120 objects from the remaining sample of 508 are found to be positionally associated with 870 µm and 250 µm emission and are thus highly likely to be young Hii regions. We list this sample of 120 young positive spectrum Hii regions in Appendix Table A1. We compare our sample of young positive spectrum Hii regions to the young Hii region and UC Hii region samples of Cesaroni et al. (2015) and Urquhart et al. (2013). Unsurprisingly, due to the similar selection process for embedded sources, we find an almost one-to-one match: out of 120 Hii regions in our positive spectrum sample, 113 objects correspond to Cesaroni et al. (2015) young Hii regions or Urquhart et al. (2013) UC Hii regions. The only disparity between the numbers of objects in common arises because Cesaroni et al. (2015) merge individual radio sources in their study that are closer together than 11.5″ into a single artificial source. In summary, by combining ATLASGAL and Hi-GAL surveys with our positive spectrum radio source sample, we have identified 120 embedded Hii regions with positive radio spectra.
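As a consistency check, the sample counts quoted in this section can be tallied directly. All numbers below are taken from the text; only the variable names are introduced here, and the derived fractions are simple ratios rather than results stated by the authors.

```python
# Counts quoted in the text
point_sources = 256      # unresolved at 1.4 GHz, alpha - d_alpha > 0
extended_sources = 154   # extended at 1.4 GHz, alpha treated as a lower limit
dropouts = 124           # undetected at 1.4 GHz, alpha_min > 0
outside_coverage = 26    # outside the ATLASGAL and Hi-GAL survey regions
embedded = 120           # associated with compact 870 and 250 micron emission
previously_known = 113   # in the Cesaroni et al. or Urquhart et al. samples

lower_limits = extended_sources + dropouts
total = point_sources + lower_limits
inspected = total - outside_coverage

embedded_fraction = embedded / inspected
known_fraction = previously_known / embedded

print(f"lower limits: {lower_limits}, total: {total}, inspected: {inspected}")
print(f"embedded fraction: {embedded_fraction:.1%}")
print(f"fraction in earlier samples: {known_fraction:.1%}")
```

The tally reproduces the 278 lower-limit objects, the 534 positive spectrum sources and the 508 sources with infrared coverage, and shows that roughly a quarter of the inspected sources are embedded.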

Recovery of known HC Hii regions
As a critical test of our method, we examine the recovery of the known HC Hii regions presented in Table 1. Of these 16 known HC Hii regions, 11 lie within the CORNISH survey region. With two exceptions (M17-UC1 and W51e2), 9 HC Hii regions are recovered by the CORNISH survey. However, when applying our method to identify positive spectrum radio sources, we find that 5 of these 9 are not recovered.
We do not recover these sources because they are all located within large complex regions with extended emission at 1.4 GHz, and their 1.4 GHz flux is seriously affected by the emission from their surroundings. Three HC Hii regions (G34.26+0.15B, G24.78+0.08A1, G28.2−0.04 N) display strong and extended counterparts at 1.4 GHz which dominate the 5 GHz emission, resulting in an overall negative spectrum. Two HC Hii regions (W49A A and W49A B) have no separately identified 1.4 GHz counterparts.
Our method is successful at recovering known HC Hii regions that are not located in complex environments, and as such, complements the existing discovery space where all the known examples of HC Hii regions have been serendipitously discovered within larger complexes of Hii regions. Nevertheless, we must bear in mind that our search for potential HC Hii regions is limited within complex regions and our sample is almost certainly a lower limit to the true number of HC Hii regions in the Milky Way. Below we briefly discuss the known HC Hii regions that were recovered by CORNISH and our search process.
G010.9584+00.0221: This source is the known HC Hii region G10.96+0.01 W (i.e., G010.9583+00.0223 in Table 1) identified by Sewilo et al. (2004). In our study, G10.96+0.01 W has a derived spectral index of α_min = 1.1, which is consistent with α_{1.4}^{5} ∼ 1.2 (Sewilo et al. 2004). It has an angular size of 2.2″ at 5 GHz with a distance of 14 kpc (Sewilo et al. 2004), so its 5 GHz linear diameter is ∼ 0.15 pc, which is slightly larger than the 1.3 cm size of ∼ 0.121 pc measured by Sewiło et al. (2011).

G035.5781−00.0305: This HC Hii region is shown in the right panel of Figure 3. G035.5781−00.0305 is closely associated with compact FIR and sub-mm emission, and also shows emission at 1.1 mm, MIR and NIR wavelengths from BGPS, GLIMPSE and UKIDSS. This source is resolved into two extremely close sources by the JVLA at 2 cm and 3.6 cm (Kurtz et al. 1994): a Western source G35.578−0.030 and an Eastern source G35.578−0.031. The Western source is the known HC Hii region G35.58−0.03 discussed by Zhang et al. (2014). The lower resolution observations of CORNISH and MAGPIS do not resolve these individual sources, instead revealing a single source with the catalogue identifier G035.5781−00.0305 surrounded by extended continuum emission. The extended emission is most likely caused by surrounding diffuse ionized gas, as in G010.9584+00.0221 (e.g. Sewilo et al. 2004; Sewiło et al. 2011). Correspondingly, the angular size of this region is larger at lower frequency (2.5″ at 5 GHz, implying a linear diameter of ∼ 0.1 pc). We derive a lower limit to the spectral index of α_min = 1.3.
G043.1665+00.0106: This is the HC Hii region W49A G (G043.1666+00.0110), found in the W49A complex with two other nearby HC Hii regions, W49A A and W49A B (De Pree et al. 1997; Sewilo et al. 2004). All of these HC Hii regions are detected by CORNISH, but only W49A G is recovered in MAGPIS and White2005, with a derived spectral index of α_min = 1.4. Note that this spectral index should be strictly considered as a lower limit due to a moderately resolved 1.4 GHz MAGPIS counterpart (S_int/S_peak = 1.5), see Figure 2. The 1.4-5 GHz spectral index that we derive is consistent with a previously determined value of α ∼ 2 between 22 GHz and 43 GHz (De Pree et al. 1997). We measure an angular size of 3.68″ at 5 GHz with a distance of 11.4 kpc, which corresponds to a 5 GHz linear diameter of ∼ 0.2 pc. This size is larger than the size determined at 3.6 cm (∼ 0.061 pc at a resolution of 0.8″ × 0.8″) by De Pree et al. (1997). Again, this region shows extended emission at 5 GHz from surrounding ionized gas.
G045.0712+00.1321: This source is the HC Hii region G45.07+0.13 NE (G045.0712+00.1322) (Keto et al. 2008; Sewiło et al. 2011), which is found near a fainter UC Hii region, G045.0694+00.1323 (offset by ∼ 6″), detected by CORNISH. Within MAGPIS these two objects are indistinguishable from each other. We derive a spectral index of α_min = 0.68, which should be considered to be a lower limit due to the source blending and possible extended nature at 1.4 GHz (S_int/S_peak = 1.74). We measure an angular size of 1.89″ at 5 GHz with a distance of 6 kpc, resulting in a 5 GHz linear diameter of ∼ 0.05 pc, which is slightly larger than its [...]

Overall, we find that all of the recovered HC Hii regions display a larger angular size as measured in CORNISH compared to their discovery images at higher frequency (e.g. Sewilo et al. 2004; Sewiło et al. 2011). Moreover, each of the known HC Hii regions is seen to be extended at 1.4 GHz, with an integrated to peak flux ratio greater than 1.2. This suggests that HC Hii regions are surrounded by lower density ionized gas and implies that searches based on size alone may not recover new HC Hii regions. This is analogous to the well-known extended emission that is found around UC Hii regions (Kim & Koo 2001; Ellingsen et al. 2005), in which snapshot and limited uv coverage observations of UC Hii regions filtered out surrounding extended emission. Kim & Koo (2001) suggested that these Hii regions in fact comprise a hierarchical structure where UC components remain largely embedded within molecular clumps but a much wider expansion of the Hii region has occurred along the density gradient of the clump. This results in a hierarchy of scales for the ionized gas from ultracompact to extended. We may be seeing a similar phenomenon in our HC Hii sample.

Observed Properties of the Sample
In this section, we examine the overall properties of the embedded positive spectrum Hii regions that we have identified. We plot histograms of their spectral indices and linear diameters in Figure 4. Linear diameters have been derived using the 5 GHz angular diameters measured by CORNISH (Purcell et al. 2013) and the distances to the embedding molecular clumps determined by Cesaroni et al. (2015) and Urquhart et al. (2013).
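The conversion from angular to linear diameter is the small-angle relation d = θ D. The sketch below applies it to the 5 GHz angular sizes and distances quoted earlier for three of the recovered HC Hii regions; the function name is hypothetical.

```python
import math

ARCSEC_TO_RAD = math.pi / (180.0 * 3600.0)

def linear_diameter_pc(theta_arcsec, distance_kpc):
    """Linear diameter in pc from an angular diameter (arcsec) and distance (kpc)."""
    return theta_arcsec * ARCSEC_TO_RAD * distance_kpc * 1.0e3

# (angular size in arcsec, distance in kpc) as quoted in the text
sources = {
    "G010.9584+00.0221": (2.2, 14.0),
    "G043.1665+00.0106": (3.68, 11.4),
    "G045.0712+00.1321": (1.89, 6.0),
}
for name, (theta, dist) in sources.items():
    print(f"{name}: {linear_diameter_pc(theta, dist):.3f} pc")
```

These evaluate to roughly 0.15, 0.20 and 0.05 pc, in agreement with the approximate diameters given for the individual sources.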
We see that there are very few objects with purely optically thick spectral indices (i.e., α ∼ 2), even taking into account those with only strict lower limits. The maximum true spectral index in our sample is α ∼ 1.5, and, when lower limits to the spectral index (α_min) are considered, the maximum value of α_min is ∼ 1.9. The mean value of α is 0.6 ± 0.4 and that of α_min is 0.5 ± 0.4. The majority of sources have spectral indices between 0 and 0.5. The linear diameters of the positive spectrum Hii regions range from 0.02 to 1.2 pc, with a mean of 0.1 ± 0.03 pc for objects with true values of α and 0.3 ± 0.2 pc for objects with lower limits to their spectral index.
These two distributions are combined in Figure 5, where the spectral index is plotted against the linear diameter of the embedded positive spectrum Hii regions. The canonical diameter for HC Hii regions is shown by a horizontal dashed line at 0.03 pc. It is immediately obvious from Figure 5 that there are no positive spectrum Hii regions that fulfil the Kurtz (2005) definition of diameter ≤ 0.03 pc and spectral index ∼ 2. Note that the exact Kurtz definition is based on emission measure, but the proposed emission measure for HC Hii regions of 10¹⁰ pc cm⁻⁶ would result in α ∼ 2 (assuming a constant density structure, which is almost certainly not the case, as seen in Section 3.4). We do see objects with spectral indices greater than 1, but most of these have linear diameters ≥ 0.05 pc, which is more consistent with the literature definition of an ultracompact Hii region. The two small diameter sources (d ≤ 0.03 pc) in our sample both have lower limits to their spectral index that are below one.
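The link between emission measure and spectral index can be illustrated with a uniform, isothermal slab. The sketch below is not part of the original analysis. It assumes an electron temperature of 10⁴ K and the commonly used power-law approximation to the free-free optical depth, τ ≈ 8.235 × 10⁻² (T_e/K)⁻¹·³⁵ (ν/GHz)⁻²·¹ (EM/pc cm⁻⁶), with the order-unity correction factor set to one.

```python
import math


def free_free_tau(em_pc_cm6, nu_ghz, te_k=1.0e4):
    """Approximate free-free optical depth (power-law approximation)."""
    return 8.235e-2 * te_k ** -1.35 * nu_ghz ** -2.1 * em_pc_cm6


def slab_spectral_index(em_pc_cm6, nu_low_ghz=1.4, nu_high_ghz=5.0, te_k=1.0e4):
    """Two-point spectral index of a uniform slab, S ~ nu^2 (1 - exp(-tau))."""

    def flux(nu_ghz):
        tau = free_free_tau(em_pc_cm6, nu_ghz, te_k)
        # -expm1(-tau) equals 1 - exp(-tau) but stays accurate for small tau.
        return nu_ghz ** 2 * (-math.expm1(-tau))

    ratio = flux(nu_high_ghz) / flux(nu_low_ghz)
    return math.log(ratio) / math.log(nu_high_ghz / nu_low_ghz)


# Assumed emission measures in pc cm^-6: optically thin, intermediate, HC-like.
for em in (1.0e5, 1.0e8, 1.0e10):
    tau5 = free_free_tau(em, 5.0)
    print(f"EM = {em:.0e}: tau(5 GHz) = {tau5:.3g}, alpha = {slab_spectral_index(em):.2f}")
```

Under these assumptions an emission measure of 10¹⁰ pc cm⁻⁶ is optically thick at both frequencies, so the slab recovers α ≈ 2, while low emission measures recover the optically thin value of about −0.1. Intermediate values of α require either intermediate emission measures or, as argued in the text, a mixture of components along the line of sight.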
Interestingly, we also see larger ∼pc diameter Hii regions that fit the definition of compact to classical Hii regions, but with positive radio spectra rather than the −0.1 spectral index expected from optically thin emission. This suggests that these larger Hii regions have density gradients indicating a mix of optically thin and thick components along the line of sight. The lack of positive spectrum Hii regions that fulfill the canonical picture for HC Hii regions is puzzling. On the one hand, this may indicate that these regions are indeed rare. But on the other, we note that the known HC Hii regions recovered in our search all show 5 GHz linear diameters larger than that expected for HC Hii regions (the filled blue circles in Figure 5). Thus, the fact that we do not find any such regions fitting the expected definition may merely be the result of the definition being incorrect! If HC Hii regions are indeed likely to be surrounded by lower density ionized gas in the same hierarchical structure, then their observed linear diameter will be a complex function of observing frequency and uv coverage. Thus, a number of the positive spectrum objects that we have identified may well be extremely young HC Hii analogs with extended halos. Further high-resolution multi-frequency and multi-configuration observations are required to study the morphology and physical properties of these objects over a range of size scales, so that we may determine whether our sample does indeed contain very young Hii regions.

Comparison with Herschel and ATLASGAL selected Hii regions
As mentioned in Section 3.3, we have compared our sample of positive spectrum Hii regions to those presented in Cesaroni et al. (2015) and Urquhart et al. (2013). These samples were all selected in a similar way, i.e. by combining CORNISH, ATLASGAL and Hi-GAL, which facilitates their cross-comparison. The added feature of our work is that we have determined the spectral index for our sample and so can split the Cesaroni et al. (2015) and Urquhart et al. (2013) samples by their spectral index to explore differences between populations. As both the Cesaroni and Urquhart samples have well-determined physical properties (e.g. clump mass, luminosity, and Lyman continuum flux), we can examine trends in these quantities with spectral index. We have combined the Cesaroni and Urquhart catalogs into one sample, eliminating duplicates between the two catalogs, which results in a final sample containing 251 young embedded Hii regions drawn from CORNISH. We then cross-matched this against the CORNISH sources for which we determined a spectral index (or lower limit) in Section 3, and against the CORNISH sources associated with known HC Hii regions in Table 1. This allows us to split the combined Cesaroni and Urquhart sample into two subsamples: embedded Hii regions with positive (or rising) spectra, and Hii regions with flat or negative (i.e., not-rising) spectra. We find 118 Hii regions with rising spectra and 127 Hii regions with not-rising spectra. The remaining 6 Hii regions could not have their spectral indices determined and are excluded from further analysis. Cesaroni et al. (2015) performed a similar although more limited analysis (see their Figure 1) using CORNISH and MAGPIS, in order to confirm the thermal nature of the emission from their Hii region candidates. We find qualitatively similar results to Cesaroni et al. (2015), in that roughly half of the sample show evidence for rising spectra, with the remainder not-rising.
The individual differences between our results and those of Cesaroni et al. (2015) are due to the differing matching methods used (Cesaroni et al. 2015 use a simple 20″ matching radius and do not consider the confusing effect of large diameter sources in MAGPIS). We examine the differences in the physical properties between the rising and not-rising spectrum subsamples in Figures 6 to 8.
In Figure 6 (left-hand panel) we show the distribution of bolometric luminosity of the two subsamples. Although the means of the two subsamples agree within their scatter, with a mean Log(L/L⊙) of 4.4 ± 1.6 for rising spectrum Hii regions compared to 3.6 ± 2.0 for not-rising Hii regions, we do see a shift towards higher luminosities for rising spectrum Hii regions. A Kolmogorov-Smirnov (KS) test comparing the luminosities of the two subsamples yields a small p-value of 0.0011, and so we are able to reject the null hypothesis that the two subsamples are drawn from the same parent population.
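For readers unfamiliar with the test, a two-sample KS comparison can be sketched as follows. This is an illustrative implementation only, using the standard asymptotic approximation for the p-value. The two samples are synthetic random draws with the same sizes as our subsamples (118 and 127) and are not the measured luminosities, so the printed numbers bear no relation to the results quoted above.

```python
import bisect
import math
import random


def ks_two_sample(a, b):
    """Two-sample KS statistic D and asymptotic p-value."""
    a, b = sorted(a), sorted(b)
    n, m = len(a), len(b)
    d = 0.0
    # The largest gap between the two empirical CDFs occurs at a data point.
    for x in a + b:
        cdf_a = bisect.bisect_right(a, x) / n
        cdf_b = bisect.bisect_right(b, x) / m
        d = max(d, abs(cdf_a - cdf_b))
    n_eff = n * m / (n + m)
    lam = (math.sqrt(n_eff) + 0.12 + 0.11 / math.sqrt(n_eff)) * d
    if lam < 0.2:
        # The series below converges slowly here; the p-value is ~1.
        return d, 1.0
    p = 2.0 * sum((-1) ** (k - 1) * math.exp(-2.0 * k * k * lam * lam)
                  for k in range(1, 101))
    return d, min(max(p, 0.0), 1.0)


# Synthetic, purely illustrative samples (NOT the catalogue values).
random.seed(0)
sample_rising = [random.gauss(1.0, 1.0) for _ in range(118)]
sample_not_rising = [random.gauss(0.0, 1.0) for _ in range(127)]

d_stat, p_value = ks_two_sample(sample_rising, sample_not_rising)
print(f"D = {d_stat:.3f}, p = {p_value:.2e}")
```

A small p-value indicates that the two empirical cumulative distributions differ by more than would be expected if both samples were drawn from the same parent population.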
We see a similar effect when we compare the Lyman continuum fluxes of rising and not-rising spectrum Hii regions, shown in Figure 6 (right-hand panel). We take the values for Lyman continuum flux from Cesaroni et al. (2015) and Urquhart et al. (2013). Further details of the derivation can be found in these papers, but we note in passing that both studies assume optically thin emission at 5 GHz, which may significantly underestimate the Lyman flux of optically thick emission. Comparing the means of the two subsamples, we find that the mean N_Ly for the rising spectrum Hii regions (Log(N_Ly (s⁻¹)) = 48.0 ± 0.8) is moderately larger than that for the not-rising spectrum Hii regions (Log(N_Ly (s⁻¹)) = 47.5 ± 0.7). A KS test of the Lyman fluxes of the two subsamples returns a p-value of 2 × 10⁻⁵, and thus we are able to reject with high significance the null hypothesis that the two subsamples are drawn from the same population. However, we must draw attention to the possibility of systematic bias in the rising spectrum sample due to optical depth effects. Because the Lyman fluxes for the rising spectrum Hii regions may be underestimated, the true disparity between the subsamples may be greater than we have indicated. Kim et al. (2017) compared independent radio continuum and millimeter-wave recombination line analyses for a sample of Hii regions, with the result that optical depth effects appeared not to affect the results unduly. However, further investigation of our rising spectrum sample is needed to confirm this hypothesis.
Figure 5. Spectral index (α) or the lower limit of the spectral index (α_min) versus the linear diameter at 5 GHz of our sample. The filled blue circles show the four known HC Hii regions recovered by this method. Characteristic errors for the linear diameters are shown in blue in the lower right of the plot. Rightward pointing arrows represent a lower limit of the spectral index, α_min. No obvious relation between spectral index and linear diameter is seen for either the α sample or the α_min sample; however, it is not possible to obtain real trends for the α_min sample.
In Figures 7 and 8, we show the distributions of linear diameter, clump mass, and luminosity-to-mass ratio (L/M, in units of L⊙/M⊙) for both the rising spectrum and not-rising spectrum subsamples. All of these distributions are essentially indistinguishable for rising spectrum Hii regions and not-rising spectrum Hii regions. For the linear diameter, the respective means for the rising and not-rising subsamples are identical at 0.2 ± 0.2 pc, and a KS test is unable to reject the null hypothesis, with a p-value of 0.7. For clump mass we find respective means of log(M_clump/M⊙) = 3.5 ± 0.6 and log(M_clump/M⊙) = 3.4 ± 0.7 for the rising and not-rising samples, and a KS test is again unable to reject the null hypothesis that the clump masses of each subsample are drawn from the same population. Finally, for the luminosity-to-mass ratio (L_bol/M_clump) we find identical mean values for both subsamples (mean log[L/M (L⊙/M⊙)] = 1.4 ± 0.4), and a KS test is unable to reject the null hypothesis that the L_bol/M_clump values are drawn from the same population.
On balance, we find that the subsample of rising spectrum Hii regions tends to have higher bolometric luminosities and Lyman continuum fluxes, but these regions are not of significantly different linear diameter, nor are they found in clumps of different mass or luminosity-to-mass ratio, compared with the not-rising spectrum Hii regions. This suggests that rising spectrum Hii regions may result from higher luminosity (and hence higher mass) stars with larger Lyman continuum fluxes. However, their similar linear diameters and luminosity-to-mass ratios imply an evolutionary status that may be much the same between the subsamples. The peak luminosities of ∼ 10⁵ L⊙ for the two subsamples of Hii regions (Figure 6) are consistent with the result of Davies et al. (2011, see their Figure 8), who discussed the relative number of Hii regions as a function of luminosity based on data from both simulation and observation. In Figure 5, we see little evidence that the rising spectrum (i.e. potentially young, dense and optically thick) Hii regions are consistent with the canonical HC Hii description, as was also observed by Urquhart et al. (2013).
We also investigate the millimeter-wave recombination line properties of our sample of positive spectrum Hii regions by cross-matching against the aforementioned study of Kim et al. (2017). Their sample of Hii regions is selected from ATLASGAL clumps observed in millimeter-wave recombination lines, and we identify common objects in our two samples by matching common ATLASGAL clumps. In total, we find 35 Hii regions in the positive spectrum sample that are associated with millimeter-wave recombination line detections from Kim et al. (2017), after removing one clump that is associated with two positive spectrum Hii regions. We plot the recombination line widths against our derived spectral indices (α and α_min) in the right-hand panel of Figure 8. Figure 8 shows a generally positive trend between the recombination line width and spectral index, in that Hii regions with broader line widths have larger spectral indices. However, it is difficult to confirm this as a genuine relationship between line width and spectral index, as many of the spectral indices plotted here are strict lower limits rather than true values. We also indicate in Figure 8 the commonly chosen dividing line between HC Hii regions and UC Hii regions at a line width of 40 km s⁻¹. While objects with line widths in excess of 40 km s⁻¹ do display larger spectral indices, there is no clear distinction between the two.

Implications for the frequency of HC Hii regions and the formation of massive stars
The overriding feature of our results is that HC Hii regions following the canonical definition of Kurtz (2005) are not common in our sample. This was also noted by Kim et al. (2017), who found no hypercompact or high emission measure Hii regions in their 5 GHz selected sample. Kim et al. (2017) attribute the lack of HC Hii regions in their sample to the observational bias that we discussed in Section 1. However, given that we have carried out a wide-area survey, and have demonstrated both that CORNISH is able to recover approximately half of the known HC Hii regions in Table 1 and that the recovered 5 GHz linear diameters of these objects are larger than the canonical definition of HC Hii regions, we do not feel that this is the most likely explanation. The true picture of HC Hii regions is that they are likely to be composed of a hierarchical structure similar to UC Hii regions (e.g. Kim & Koo 2001), with highly compact, dense, high emission measure "cores" surrounded by lower density, lower emission measure "halos". The distribution of spectral indices in our sample is indicative of this structure, with the majority of indices falling between 0 and 1, which implies mixed optically thin and thick emission along the line of sight. Thus the perhaps simplistic definition of Hii regions based on size may complicate matters and should be revised to take account of the complex structure of these objects.
Nevertheless, in Figure 6, the rising spectrum Hii regions (i.e., potentially young and dense) are more likely to be of higher luminosity and have higher Lyman continuum fluxes than the not-rising spectrum Hii regions. Urquhart et al. (2013) also found that the most highly luminous Hii regions are amongst the largest. Hii regions show an excess of Lyman continuum flux, in which the measured values exceed the theoretical predictions, and this cannot be easily explained (Lumsden et al. 2013; Urquhart et al. 2013). Further research is needed to understand these differences in luminosity and Lyman continuum flux between the two subsamples of Hii regions.
It is difficult to assess how complete our observations are of the potential hypercompact population of Hii regions. We find that CORNISH is able to recover roughly half of the known sample of HC Hii regions, and so one might naively assume that there are perhaps a factor of 2 more HC Hii regions to be discovered in the CORNISH survey region. However, it is clear that Hii regions with positive spectra are in fact common, as roughly half of the Cesaroni et al. (2015) and Urquhart et al. (2013) catalogs of compact and ultracompact Hii regions have positive spectral indices. This may have implications for earlier work, as the ionized gas properties that have been derived in many studies of Hii regions to date have assumed that the continuum emission is optically thin, rather than the mixture of optically thick and thin components that our distribution of spectral indices implies. Detailed multi-frequency and multi-configuration observations that can reveal the ionized structure of these regions on a range of scales are required to further investigate their nature and examine their relationship to the early phases of massive star formation.

SUMMARY AND CONCLUSIONS
We have carried out the largest and most unbiased search for hypercompact Hii regions to date by combining radio surveys at 1.4 and 5 GHz (THOR, CORNISH, MAGPIS and White2005) with far-infrared and sub-mm Galactic Plane surveys (Hi-GAL and ATLASGAL). We obtain a sample of 534 objects with a 1.4 to 5 GHz spectral index greater than zero, listed in Appendix Table A2. Of these, 256 objects were detected at 5 GHz and as point sources at 1.4 GHz, which means that we could determine true values of their spectral index, whereas the remaining 278 objects have upper limits at 1.4 GHz or were found to be moderately extended at 1.4 GHz and thus have strict lower limits to their spectral index. We identified Hii regions in this sample using the ATLASGAL and Hi-GAL surveys in a similar manner to the recent studies of young and ultracompact Hii regions by Cesaroni et al. (2015) and Urquhart et al. (2013). We found a total of 120 Hii regions with positive radio spectral indices, shown in Appendix Table A1. Among the 120 positive spectrum Hii regions, 35 have archival Radio Recombination Line (RRL) observations (see the right-hand panel of Figure 8). Twelve of the 35 Hii regions show broad RRL line widths of ΔV > 40 km s⁻¹, listed in Table 4. Four of the 12 sources are known HC Hii regions in Table 1, and follow-up JVLA observations of the remaining Hii regions have been carried out to determine their nature. The physical properties of the 120 rising spectrum Hii regions were examined and compared to the Cesaroni et al. (2015), Urquhart et al. (2013) and Kim et al. (2017) samples of Hii regions. We draw the following conclusions:
(i) We find no objects in our resulting positive spectrum Hii region sample that match the canonical definition of HC Hii regions given in Kurtz (2005), i.e. with linear diameter ≤ 0.03 pc and with a spectral index ∼ 2. The majority of our positive spectrum Hii regions have diameters larger than 0.03 pc and spectral indices between 0.5 and 1.0.
(ii) We recover roughly half of the known HC Hii regions in the CORNISH survey. However, these objects are generally resolved at 5 GHz, with larger diameters than seen in their higher frequency discovery observations. Combined with spectral indices that indicate mixed optically thick and thin components along the line of sight, this leads us to suggest that HC Hii regions have a hierarchical structure analogous to UC Hii regions. Multi-frequency, multi-resolution radio observations are required to confirm this hypothesis. The canonical definition of Hii regions based on linear diameter may need to be updated to reflect this structure.
(iii) We see a general trend between spectral index and the line width of mm-wave recombination lines observed by Kim et al. (2017), in that objects with higher spectral indices tend to show larger line widths. However, this trend is still inconclusive, as many of the spectral indices for the sources in the Kim et al. (2017) sample are lower limits. Further higher frequency radio continuum observations are required to confirm this trend.
(iv) In a combined sample drawn from Cesaroni et al. (2015) and Urquhart et al. (2013) we find that roughly half of these Hii regions have positive spectral indices. Hii regions with a positive (i.e. rising) spectrum are found to be statistically more luminous and to have higher Lyman fluxes than Hii regions with negative or flat (i.e., not-rising) spectral indices. This suggests that rising spectrum Hii regions are associated with more luminous and massive stars. We find no evidence for differences in the linear diameter of rising and not-rising spectrum Hii regions, nor in the mass of their embedding clumps or their luminosity-to-mass ratios.
Table A2. Information on the total of 534 positive spectrum radio objects. The columns contain the name and Galactic coordinates of each source, the flux density and angular diameter of each source at 5 GHz from CORNISH, the flux densities at 1.4 GHz from THOR, MAGPIS and White2005, and the spectral indices and their errors. The symbol † marks objects that are detected at both 5 GHz and 1.4 GHz but have only lower limits to their spectral indices because they are extended at 1.4 GHz. Flux densities at 1.4 GHz given without errors refer to the 1.4 GHz noise level at the source position, indicating that these sources are detected only at 5 GHz and so have lower limits to their spectral indices.