A global atmospheric electricity monitoring network for climate and geophysical research

The Global atmospheric Electric Circuit (GEC) is a fundamental coupling network of the climate system connecting electrically disturbed weather regions with fair weather regions across the planet. The GEC sustains the fair weather electric field (or potential gradient, PG) which is present globally and can be measured routinely at the surface using durable instrumentation such as modern electric field mills, which are now widely deployed internationally. In contrast to lightning or magnetic fields, fair weather PG cannot be measured remotely. Despite the existence of many PG datasets (both contemporary and historical), few attempts have been made to coordinate and integrate these fragmented surface measurements within a global framework. Such a synthesis is important in order to fully study major influences on the GEC such as climate variations and space weather effects, as well as more local atmospheric electrical processes such as cloud electrification, lightning initiation, and dust and aerosol charging. The GloCAEM (Global Coordination of Atmospheric Electricity Measurements) project has brought together experts in atmospheric electricity to make the first steps towards an effective global network for atmospheric electricity monitoring, which will provide data in near real time. Data from all sites are available in identically formatted files, at both 1 s and 1 min temporal resolution, along with meteorological data (wherever available) for ease of interpretation of electrical measurements. This work describes the details of the GloCAEM database and presents what is likely to be the largest single analysis of PG data performed from multiple datasets at geographically distinct locations. Analysis of the diurnal variation in PG from all 17 GloCAEM sites demonstrates that the majority of sites show two daily maxima, characteristic of local influences on the PG, such as the sunrise effect. 
Data analysis methods to minimise such effects are presented and recommendations provided on the most suitable GloCAEM sites for the study of various scientific phenomena. The use of the dataset for further understanding of the GEC is also demonstrated, in particular for more detailed characterization of day-to-day global circuit variability. Such coordinated effort enables deeper insight into PG phenomenology which goes beyond single-location PG measurements, providing a simple measurement of global thunderstorm variability on a day-to-day timescale. The creation of the GloCAEM database is likely to enable much more effective study of atmospheric electricity variables than has ever been possible before, which will improve our understanding of the role of atmospheric electricity in the complex processes underlying weather and climate.


Introduction
Earth's electrical environment has been studied since the 1750s, but its more recently appreciated connections to clouds (Tinsley et al., 2007; Nicoll and Harrison, 2016) and climate (Price, 1993; Rycroft et al., 2000; Williams, 1992, 2005) have highlighted some incompleteness in understanding of atmospheric electricity in the climate system. It is well established that Earth has a "Global atmospheric Electric Circuit" (GEC), through which charge separation in thunderstorms sustains large scale current flow around the planet (Wilson, 1921; Williams, 2009). The GEC sustains the fair weather (FW) electric field (or potential gradient, PG, as it is also known), which is present globally in regions which are not strongly electrically disturbed by weather or aerosol. In such conditions, the PG can be related to the local electrical conductivity of air, σ, through Ohm's Law: $PG = J_c / \sigma$, where $J_c$ is the air-Earth conduction current which flows vertically from the ionosphere to Earth's surface. Provided no local charge separation processes are active, $J_c$ can be considered constant, hence the PG is inversely proportional to σ, and any phenomena (such as meteorological processes like fog or aerosol pollution) which perturb σ will also affect the PG. PG can be measured routinely using well-established electric field mill instrumentation (e.g. Nicoll, 2012). Measurements of PG can contribute to our understanding of how thunderstorms and the global atmospheric electrical system may be varying within our changing climate; such variations are difficult to assess using global lightning networks because these networks are not stable with time. PG measurements are also useful in understanding some of the fundamental processes occurring inside thunderstorms which are only just starting to be understood, such as high energy particle emissions related to thunderstorm ground enhancements (TGEs) and terrestrial gamma ray flashes (TGFs) (e.g. Chilingarian et al., 2015; Chilingarian, 2018). 
However, in order that truly global signals are considered in understanding the processes within the global circuit, many validating measurements must be made simultaneously at different locations around the world. Beyond thunderstorms, another area of current research in atmospheric electricity is the role that atmospheric electricity plays in modulating cloud properties and therefore its indirect effects through clouds on the Earth's radiative balance. Recent evidence demonstrates that all persistent extensive layer clouds are electrically charged at their upper and lower boundaries, which theory indicates can influence cloud microphysical processes (Nicoll and Harrison, 2016). Since layer clouds are common globally, electrical effects on cloud microphysics may therefore always be contributing some of the underlying variability in cloud properties. One of the most uncertain elements is the effect of space weather influences on atmospheric electricity, through lower atmosphere changes in cosmic ray ionisation from solar flares and energetic particle events. Recent work (e.g. Michnowski, 1998;Harrison et al., 2013;Smirnov, 2014;Nicoll and Harrison, 2014) has reported effects of space weather influences on the PG at individual sites, but in order to identify and understand global effects, simultaneous measurements are required at multiple locations.
Despite the central role of lightning as a weather hazard and the potentially widespread importance of charge for many atmospheric processes involving particles and droplets, research is hampered by the fragmented nature of surface atmospheric electricity measurements, making anything other than local studies in fortuitous FW conditions difficult. In contrast to detection of global lightning using satellite-carried instruments and ground-based radio networks, fair weather PG cannot be measured by remote sensing and no similar extensive measurement networks exist for its study. This has been a major limitation on research into FW atmospheric electricity. Some valuable regional PG monitoring networks have however been established, such as at NASA Kennedy Space Centre (e.g. Krider, 1989; Lucas et al., 2017), in the Russian Federation (Popov et al., 2008) and in South America (AFINSA; Tacza et al., 2014), but these cover only a small part of Earth's surface. Archiving of historical atmospheric electrical data has also been achieved by the ATMEL2007A database (Tammet, 2009), which compiled a large number of hourly datasets from Russia and Europe. These valuable datasets are now historical, as data are available only up to 2006. There is now an opportunity to widen the geographical coverage of available PG measurements, as many researchers worldwide currently make high temporal resolution measurements of the FW PG routinely, but these measurements are neither coordinated nor exploited. The UK-led GloCAEM (Global Coordination of Atmospheric Electricity Measurements) project has brought these experts together to make the first steps towards an effective global network for FW atmospheric electricity monitoring, with publicly available data and in near real time. Another novel aspect of the GloCAEM dataset is the availability of meteorological data alongside the PG measurements. 
The meteorological information is, firstly, central in independently determining the existence of fair weather conditions (due to the substantial influence of non-fair weather meteorological phenomena on the PG), which are required to study global and space weather influences on the PG, and, secondly useful in allowing study of the effect of variations in local meteorology on the electrical environment. The use of identical format files for each GloCAEM measurement site makes data analysis, in particular comparison of data from all the sites, very straightforward, which is a key aspect of the project in driving research in atmospheric electricity forward.
This paper describes the properties of the GloCAEM database and presents a summary of some of the initial analysis performed with the dataset. The analysis focuses on the application of the data to Global Electric Circuit research, but also provides advice on the choice of the best GloCAEM sites at which to study a variety of different atmospheric and geophysical processes related to atmospheric electricity.
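As a numerical illustration of the Ohm's Law relation given in the Introduction, the short sketch below evaluates PG = J_c/σ for two conductivity values. The current density and conductivities used here are illustrative assumptions of typical fair weather magnitudes, not GloCAEM measurements, and the function name is hypothetical.

```python
def potential_gradient(j_c, sigma):
    """Return PG (V/m) from Ohm's Law, PG = J_c / sigma.

    j_c   : air-Earth conduction current density (A m^-2)
    sigma : electrical conductivity of air (S m^-1)
    """
    if sigma <= 0:
        raise ValueError("conductivity must be positive")
    return j_c / sigma


# Illustrative values only (assumptions, not taken from the dataset).
J_C = 2.0e-12             # A m^-2
SIGMA_CLEAN = 2.0e-14     # S m^-1, cleaner air
SIGMA_POLLUTED = 1.0e-14  # S m^-1, conductivity halved by aerosol

pg_clean = potential_gradient(J_C, SIGMA_CLEAN)
pg_polluted = potential_gradient(J_C, SIGMA_POLLUTED)

print(f"clean air PG:    {pg_clean:.0f} V/m")
print(f"polluted air PG: {pg_polluted:.0f} V/m")
```

With J_c held constant, halving σ doubles the PG, which is the inverse proportionality described in the text.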

Overview
The GloCAEM dataset so far contains PG and meteorological data from 12 different international institutions and 17 different locations worldwide. It is stored at the UK Centre for Environmental Data Analysis (CEDA), which is a Data Repository funded by several of the UK research councils (http://www.ceda.ac.uk/). CEDA provides secure and long term storage of datasets for atmospheric research for the academic community. The GloCAEM dataset will therefore become publicly available (with users registering for a username and password through CEDA) for research use when the period of data checking is completed, with the functional launch expected to be from summer 2019. Some key details of the GloCAEM dataset functionality are listed below:
- Data values are provided in as close to real time as possible (at sites where internet and FTP are available).
- Data files are provided as daily files, at two different time resolutions: 1 s and 1 min averages.
- Data files of 1 s resolution contain only PG data; 1 min files contain PG and meteorological data at the same temporal resolution (where available).
- Downloadable site information and instrument information files are provided for each location at the project website (https://glocaem.wordpress.com/introduction/project-partners-and-measurement-sites/).
The use of daily data files allows the user to choose whether to download only a few files if analysing one specific event, or the entire dataset. The GloCAEM sites are essentially a virtual network, in that the network has not specifically been created to provide new sites for PG measurements; rather, it collates data from existing measurement sites and converts them into a common format which is accessible to the wider research community. Presently the GloCAEM dataset focuses on recent data, and many of the measurement sites are relatively new with only a few years of data so far; however, data are available back to 2005 for some sites.

Parameters measured
In terms of atmospheric electricity parameters, the GloCAEM dataset focuses principally on measurements of PG as this is the most commonly measured quantity due to the relatively large number of commercially available sensors. The PG is present globally and is typically ∼100 V/m in clear air during fair weather conditions at sea level, with larger values in polluted and non-fair weather conditions. PG is influenced by many factors including local meteorological influences, dust and aerosol concentrations (e.g. Yair et al., 2016), global thunderstorm activity through the GEC (e.g. Rycroft et al., 2000), space weather events (e.g. Märcz, 1997; Harrison et al., 2013; De et al., 2013) and changes in the ionisation rate from radon (e.g. Lopes et al., 2015), cosmic ray changes (Mateev and Vellinov, 1992), and artificial ionisation sources (e.g. Takeda et al., 2011; Fews et al., 2002; Matthews et al., 2010), and changes in local site characteristics such as variable electrical shielding effects of surrounding trees or buildings. Biotic factors, such as vegetation and animal activity, have thus far not been considered, yet have been recognised to also affect local PG measurements. At any site the dominant changes in the PG are likely to arise from local meteorological changes; therefore, it is important to understand how various phenomena such as the local wind regime, rainfall, cloud, fog and aerosol influence the measurement of PG (e.g. Deshpande and Kamra, 2001; Minamoto and Kadokura, 2011; Bin et al., 2012; Harrison and Nicoll, 2018; Gurmani et al., 2018). Consequently the GloCAEM dataset provides meteorological measurements in the form of: pressure, temperature, relative humidity (RH), wind direction, wind speed, rainfall, global solar irradiance, diffuse solar irradiance, visibility, sunshine duration and cloud base height. Although these measurements are not available at every site, they are provided wherever and whenever possible. 
The inclusion of a number of solar radiation and cloud measurements is to enable different criteria to be explored in identifying fair weather conditions, since this aspect of data selection (Harrison and Nicoll, 2018) is key for GEC studies. The GloCAEM dataset also allows the study of different types of phenomena which cause perturbations in PG (such as precipitation effects), which, for the purpose of GEC studies, can be averaged out; regular systematic variations, however, must be dealt with in other ways. Aerosol-related effects on PG can be both transient (e.g. short timescale effects from space weather), and systematic (e.g. the sunrise effect, differences between weekdays and weekends, seasonal effects). These are discussed more fully in section 3.2.

Instrumentation
The PG has historically been measured by a number of different methods. These have included potential probes, in which the potential on a conductor equalises with the potential of the surrounding air and any subsequent changes in potential on the probe follow variations in PG (e.g. Chalmers, 1967). Burning fuses, water droppers or radioactive probes have been implemented to increase the conductivity of the air surrounding the probe to allow faster equalisation rates (Israël, 1970). This technique is still employed at several of the GloCAEM measurement stations, including Swider, Poland and Nagycenk, Hungary (Märcz et al., 2001), where long time series of measurements are available. One of the main limitations of the potential probe method is its slow time response, which is typically on the order of tens of seconds depending on the method of equalisation employed. An alternative method is to use an electric field mill, which allows much faster measurements (up to around 100 Hz), and versions which are robust to all meteorological conditions are available. An electric field mill typically consists of a horizontal electrode, which is alternately exposed to and shielded from the atmospheric electric field. As the electrode is exposed to the electric field, a charge is induced on the electrode, the magnitude of which is proportional to the field (e.g. Chubb, 2010, chapter 6 in MacGorman and Rust, 1998). This is measured with an electrometer and phase sensitive detection. Fig. 1 shows some of the electric field mill sensors used at the various GloCAEM sites. Technical details of the sensors are described fully in separate metafiles on the GloCAEM project website (https://glocaem.wordpress.com/introduction/project-partners-and-measurement-sites/). These metafiles also include information on the height of the field mill sensors above ground (which typically varies between 1 m and 3 m between sites). 
A full list of the various field mill types used at the GloCAEM sites is given in Table 1. It should be noted that even though potential probe measurements are in use at some of the GloCAEM sites, only the digitised PG measurements from field mills are currently included in the database. PG measurements are recorded at a variety of sampling rates at the different GloCAEM sites (from 2 Hz to 25 Hz), therefore to ensure consistency between sites, GloCAEM data have been processed to report data at 1 s and 1 min averages, in different data files. The provision of data with different temporal resolution is intended to allow easier analysis of phenomena which occur on a variety of timescales, without always having to download vast amounts of data. It is important to point out that the absolute value of PG measured by a field mill is affected both by calibration of the sensor and the physical environment surrounding the sensor. Metal masts or guy lines distort the electric field, modifying the PG which is measured.
Thus, for PG measurements from different sites to be comparable, they must be standardised to an open situation (such as flush with the ground surface, or compared with measurements from a horizontal passive wire antenna), to remove the distorting effects. See e.g. the Appendix in Harrison and Nicoll (2018) for more details. Field mill calibrations and site correction factors (to account for the distortion of the electric field around the field mill mounting mast or nearby buildings) are applied to data from some of the GloCAEM sites. Details of correction factors and calibrations against other sensors are provided in the metafiles on the GloCAEM project website (https://glocaem.wordpress.com/introduction/project-partners-and-measurement-sites/). Since not all PG measurements in the GloCAEM database have been corrected for site distortion factors, it is generally not meaningful to compare absolute PG measurements between all the different sites. PG values with respect to the mean value at each site are therefore often discussed throughout this paper to address this issue.
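The reduction of raw samples (recorded at 2 Hz to 25 Hz) to 1 s and 1 min averages described above can be sketched as simple block averaging. This is a minimal illustration under stated assumptions, not the GloCAEM processing code: the function name, the use of seconds since the start of the day as the time coordinate, and the use of NaN to flag missing samples are all hypothetical choices.

```python
import math
from collections import defaultdict


def block_average(times_s, values, block_s):
    """Average samples into consecutive blocks of `block_s` seconds.

    times_s : sample times in seconds since the start of the day
    values  : PG samples (V/m); NaN marks a missing sample (assumption)
    Returns a dict {block start time (s): mean of the valid samples}.
    Blocks containing no valid samples are omitted.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for t, v in zip(times_s, values):
        if math.isnan(v):
            continue  # skip missing samples
        start = int(t // block_s) * block_s
        sums[start] += v
        counts[start] += 1
    return {start: sums[start] / counts[start] for start in sorted(sums)}


# Four samples at 2 Hz, the last one missing.
raw_times = [0.0, 0.5, 1.0, 1.5]
raw_pg = [100.0, 102.0, 104.0, float("nan")]

one_second = block_average(raw_times, raw_pg, 1)
one_minute = block_average(list(one_second), list(one_second.values()), 60)
print(one_second)
print(one_minute)
```

Here the 1 min value is formed from the 1 s means; when samples are missing this differs slightly from averaging the raw samples directly, so the choice would need to match whatever convention a given site uses.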

Measurement sites
At present the GloCAEM dataset comprises PG data from seventeen different locations ranging from Poland in the north to Antarctica in the south. Fig. 2 shows a map of the various measurement sites, which include ten different countries and four continents. Table 1 provides a detailed description of the measurement sites in the GloCAEM dataset, and as with the instrumentation information, the specifics of each site are included as a separate metafile on the GloCAEM project website (https://glocaem.wordpress.com/introduction/project-partners-and-measurement-sites/). The sites include flat terrain in rural locations, mountainous regions, ice shelves, deserts and rooftop locations in city centres. Whilst traditionally many of these site locations would be avoided for atmospheric electricity research, the aim of GloCAEM is to provide access to data for a wide range of related research purposes. For example, although the PG may be enhanced as a result of the distortion caused by mountains, measurements in such locations can provide information about boundary/exchange layer transitions (as well as in-cloud measurements) as these sites often move in and out of such layers as their altitude varies (e.g. Israël, 1957; Kamogawa et al., 2015; Yaniv et al., 2017). PG measurements in city centre locations can provide information on aerosol and pollution transport (e.g. Silva et al., 2016) and desert measurements can provide valuable insights into dust electrification processes (e.g. Yair et al., 2016; Yaniv et al., 2016; Esposito et al., 2016; Katz et al., 2018). Although the type of surface and surrounding orography influences the PG measurement, so too does its geographical location in terms of the typical meteorological conditions that the site experiences. Figs. 3 and 4 show an example of differences in site climatology between Graciosa, Azores (an island location in the North Atlantic Ocean) and Panská Ves (a continental location in the Czech Republic). 
Although the median values of PG at both locations for the year of 2016 are similar (80 V/m at Graciosa, 49 V/m at Panská Ves), the difference in variability and in the range of PG values (from Fig. 4) is obvious. This is mostly due to climatological differences (in particular rainfall) between the two mid-latitude sites. This is particularly true in the summer months, when Panská Ves experiences a relatively large number of convective events compared to Graciosa, which causes large variability in the PG. It therefore follows that Panská Ves is a better site for the study of convective activity, but Graciosa is likely to be more suited to fair weather measurements, which are required for study of the GEC. Greater variability is expected in the Panská Ves data due to aerosol/conductivity variations that are inherent at inland continental stations compared to the relatively clean oceanic air at the island location of Graciosa. Thus the inclusion of different types of measurement locations in the GloCAEM dataset will allow the study of many different types of phenomena of both local and global origins.

Diurnal variations at GloCAEM sites
One of the key parameters in global atmospheric electricity research is the diurnal variation in PG on fair weather (FW) days. This is due to the fact that, in the absence of local influences, the diurnal variation in PG is known to follow closely the diurnal variation in global thunderstorm and shower cloud area, which together are understood to drive the global circuit. This result was first established by Mauchly (1921, 1923) and Whipple (1929) using the pioneering measurements of the Carnegie research ship (e.g. Harrison, 2013). The characteristic shape of this variation found by the Carnegie scientists, with a principal minimum around 03 UT and principal maximum around 19 UT, is known as the Carnegie curve. Measurement sites which exhibit a daily PG variation which is very similar to the Carnegie curve are often said to be globally representative and hence, in principle, can provide a method of monitoring the global variation of the GEC from a single site measurement. Analysis of the diurnal variation in PG has been performed for all of the GloCAEM sites. Since meteorological data are so far available at only some of the GloCAEM sites, and hence true FW conditions cannot be explicitly identified, the PG is selected for non-disturbed conditions on the basis of the PG values only. This is based on the fact that non-fair weather conditions (such as rainfall and high winds) tend to produce large (as well as negative) values of PG (e.g. Bennett and Harrison, 2007). This approach may also remove situations in which the conductivity is low (e.g. during high aerosol concentration events), which will produce abnormally large PG values, and which would not be detected by selection of meteorological conditions alone. Thus, what are considered non-disturbed values of PG are selected individually for each site by only considering positive PG values in the inner 80% of the distribution of PG values. 
This approach ensures that any outliers in the PG distribution are removed from further analysis. Fig. 5 shows the percentage of non-disturbed periods (in black) for 9 of the GloCAEM sites for each month of 2016. As expected, there is a large range between sites in the proportion of non-disturbed values. For example, the maximum percentage of non-disturbed PG values in any one month is 92% for Evora (EVO) (Oct 2016), whilst the minimum occurs for Xanthi (XAN) at only 14% (March 2016). For the 5 sites with no missing data in 2016, Graciosa (GRA) has the highest proportion (78%) of non-disturbed periods during the year, with Studenec (STU) (33%) the least. There is also a seasonal effect evident at some sites, for example at Reading (RDG) and Swider (SWI), which are both mid-latitude sites and subject to more non-fair weather conditions during winter months than during summer. Such information can be used to assess the most suitable GloCAEM sites for GEC studies, as well as what time of year (if any) analysis should be focused on to increase the proportion of non-disturbed data available.
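The selection of non-disturbed values described above (positive PG values within the inner 80% of a site's PG distribution) might be sketched as follows. The exact percentile method and the order of the two conditions are not specified in the text, so this sketch assumes that the 10th and 90th percentiles are computed over all values at a site using linear interpolation, and that positivity is applied as an additional condition. Function names are hypothetical.

```python
def percentile(sorted_vals, q):
    """Linear-interpolation percentile (q in 0..100) of a sorted list."""
    if not sorted_vals:
        raise ValueError("no data")
    pos = (len(sorted_vals) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(sorted_vals) - 1)
    frac = pos - lo
    return sorted_vals[lo] * (1.0 - frac) + sorted_vals[hi] * frac


def select_non_disturbed(pg_values, lower_pct=10, upper_pct=90):
    """Keep positive PG values lying between two percentiles.

    The default 10th-90th percentile window corresponds to the inner
    80% of the distribution (an assumed reading of the text).
    """
    ordered = sorted(pg_values)
    low = percentile(ordered, lower_pct)
    high = percentile(ordered, upper_pct)
    return [v for v in pg_values if v > 0 and low <= v <= high]


# Synthetic PG values (V/m): negative and very large values mimic
# disturbed weather; they are not real GloCAEM data.
pg = [-500, -20, 60, 80, 90, 100, 110, 120, 150, 400, 2000]
print(select_non_disturbed(pg))
```

Because the thresholds are derived from each site's own distribution, the same function can be applied to sites whose absolute PG calibration differs.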
The average diurnal variation for non-disturbed values of PG during 2016 is shown for 15 of the GloCAEM sites in Fig. 6, alongside the Carnegie curve (red dashed line). Sites are combined according to whether their diurnal variation in PG contains a single maximum, similar to the Carnegie Curve, or two maxima (as assessed by eye). Because many of the sites are not absolutely calibrated, comparison of absolute PG values cannot be undertaken: instead, the sites' data are given as the percentage of the PG with respect to the median value for each site. Median PG values before and after non-disturbed selection are shown in Table 2. Five of the GloCAEM sites are shown to have a single maximum; however, the remaining ten sites demonstrate evidence of substantial local effects (particularly following sunrise), with a double peak in PG evident. This is to be expected as the majority of sites are continental and relatively close to major population centres where aerosol pollution is abundant and responds to atmospheric mixing. This so called "sunrise effect" has been observed at a variety of locations (e.g. Marshall et al., 1999) and is generally thought to be related to mixing of the near-surface electrode layer (which is an accumulation of positive charge next to the negatively charged Earth's surface). At the majority of sites with two daily maxima, the first maximum is proportionally smaller than the second, with the exception of Xanthi, Greece, where the morning maximum is 30% larger than the evening one (Kastelis and Kourtidis, 2016). Such local influences complicate any potential GEC analysis, but PG measurements from such sites will provide valuable information on pollution and aerosol content of air and methods of minimising the effects of such aerosols are discussed in section 3.2.
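Expressing each site's diurnal variation as a percentage of its own median, as done for Fig. 6, can be illustrated with the short sketch below. It assumes hourly UT bins and a mean within each bin; the binning and averaging choices, the function name and the sample values are assumptions for illustration only.

```python
from statistics import mean, median


def diurnal_curve_percent(hours_ut, pg_values):
    """Mean PG in each UT hour, as a percentage of the overall median.

    hours_ut  : UT hour of each sample (0-23, fractional hours allowed)
    pg_values : non-disturbed PG samples (V/m) for one site
    Returns a dict {UT hour: percent of the site median}.
    """
    site_median = median(pg_values)
    by_hour = {}
    for h, v in zip(hours_ut, pg_values):
        by_hour.setdefault(int(h) % 24, []).append(v)
    return {h: 100.0 * mean(vs) / site_median
            for h, vs in sorted(by_hour.items())}


# Synthetic samples with low values near 03 UT and high values near 19 UT.
hours = [3, 3, 19, 19, 12]
pg = [80, 90, 120, 130, 100]
curve = diurnal_curve_percent(hours, pg)
print(curve)
```

Normalising in this way removes the dependence on absolute calibration, so curves from uncorrected sites can be overlaid on the Carnegie curve.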
Of the sites which exhibit a single maximum, three are in mountainous locations with altitudes above 2000 m. Although such sites may be subject to Austausch effects around sunrise which can cause anomalously high values of PG due to turbulent and convective mixing (e.g. Israël, 1957; Marshall et al., 1999; Yaniv et al., 2017), their high altitudes mean that they are often above the polluted boundary layer and so can more readily detect GEC signals, although this is not always the case. The presence of a Carnegie-like oscillation at Graciosa (Azores) and Halley (Antarctica) is also not surprising as these are both relatively "clean" sites: Graciosa is located on a small island in the middle of the Atlantic Ocean, and Halley is on the Brunt Ice Shelf. One explanation for the differences in the timings of the peak in the curve in Fig. 6(a) may be related to the latitudinal and longitudinal distribution of the various GloCAEM sites, where proximity to the major thunderstorm regions of the Americas, and the African and Asian continents may influence the shape of the diurnal variation (e.g. Kamra et al., 1994). It should, however, be noted that there is disagreement in the literature as to whether or not this phenomenon occurs (particularly in regard to theoretical model results), and if so on what time scale effects are evident.

Fig. 5. Strip chart of percentage of non-disturbed PG values for each month for GloCAEM sites with more than 10 months of data in 2016 (using 10 min mean values of PG). Black shows percentage of non-disturbed PG, grey disturbed PG, and red denotes no data available during that time period. Percentages are given with respect to total data available for that month at a specific site. Site abbreviation codes are given in Table 1.

Although there is considerable spread in the times of the maxima of the diurnal curves, the minima in Fig. 6 (regardless of whether the sites have single or double maxima) are generally consistent at around 03-05 UT, similar to the Carnegie curve. Fig. 7 demonstrates this by showing the difference between the PG and Carnegie curve as a function of universal time (UT) for sites with similar longitudes (and therefore similar times of day). The GloCAEM dataset therefore supports the idea that the early morning UT hours are well suited to detecting global circuit signals (as, e.g., suggested by Märcz, 1997), and even sites which demonstrate considerable local influences during the day, such as Xanthi, should therefore not be discounted from such analysis. In general, the time period from 21 to 06 local time (LT) is when local sources of variability are less dominant and is therefore when the most globally representative times will occur at the different GloCAEM sites.
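Selecting the 21 to 06 LT window at sites spread across different longitudes requires converting UT to local time. The sketch below uses local mean solar time (UT plus 1 h per 15° of east longitude), which is an assumption; civil time zones or true solar time could equally be intended. The function names and example longitudes are hypothetical and do not correspond to particular GloCAEM sites.

```python
def local_mean_time(ut_hours, longitude_deg_east):
    """Local mean solar time in hours (0-24) for a given UT and longitude."""
    return (ut_hours + longitude_deg_east / 15.0) % 24.0


def in_quiet_window(ut_hours, longitude_deg_east, start_lt=21.0, end_lt=6.0):
    """True if the local time lies in the window [start_lt, end_lt).

    The default window (21 to 06 LT) wraps past midnight.
    """
    lt = local_mean_time(ut_hours, longitude_deg_east)
    if start_lt <= end_lt:
        return start_lt <= lt < end_lt
    return lt >= start_lt or lt < end_lt


# Hypothetical site longitudes (degrees east).
print(in_quiet_window(3.0, 0.0))     # 03 UT at 0 deg   -> 03 LT
print(in_quiet_window(12.0, 0.0))    # 12 UT at 0 deg   -> 12 LT
print(in_quiet_window(19.5, 30.0))   # 19.5 UT at 30 E  -> 21.5 LT
print(in_quiet_window(8.5, -30.0))   # 8.5 UT at 30 W   -> 06.5 LT
```

A mask built from such a function could be combined with the non-disturbed selection so that only night-time, non-disturbed PG values are compared with the Carnegie curve.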

Influence of pollution
The double-maximum behaviour in the diurnal variation of PG in Fig. 6(b) is indicative of local influences on PG and is typically related to anthropogenic pollution and seasonal effects. Sources of particulate pollution can include local traffic, domestic heating, cooking or industry. Even in Antarctica, which is typically a "clean" environment, diesel generators used to power scientific bases can be a source of significant aerosol. The increased aerosol typically acts to reduce the electrical conductivity, causing an associated increase in PG as expected from Ohm's Law when J_c is constant. The additional peak in the diurnal PG curve often appears around sunrise, when the planetary boundary layer (PBL) is shallow (typically less than 1 km) and pollution sources from traffic and domestic heating and cooking are substantial. Previous work (e.g. Sheftel et al., 1994; Israelsson and Tammet, 2001; Silva et al., 2014) has noted a difference in PG between weekdays and weekends, when traffic levels and industrial pollution sources are often decreased. Such analysis can be applied to the GloCAEM sites to investigate the effects of anthropogenic pollution at some of the sites which display a double diurnal variation. Fig. 8 shows boxplots of non-disturbed PG on weekdays (Monday to Friday) and weekends (Saturday and Sunday) for Reading, Evora, Tripura and Xanthi (which all have double maxima in their diurnal curves) using data from 2016. Typically, the weekday PG is larger than that at weekends, for example, by up to 9% at Tripura, but only 2% at Evora. All differences between weekday and weekend PG are statistically significant at the 5% level, with the exception of Evora. The 5% increase in Reading PG during weekdays can be attributed to increased pollution since Latha and . 
They also noted an increase in PM10 through Monday to Thursday, and decrease from Friday to Sunday, which demonstrates that pollution dispersal timescales will also play an important role in controlling PG. Although such a difference between weekday and weekend PG supports the concept that a site may be affected by anthropogenic pollution, further investigation is required to properly characterize this contribution. Air pollution often exhibits an annual cycle with maxima in winter and minima in summer due to the annual variation in emissions (e.g. more use of domestic heating in winter and, in urban areas, less traffic during holiday periods). The variation in convection and PBL height throughout the seasons (which controls the distribution of aerosol particles near the surface) also determines the magnitude of the effect of pollution on PG. Fig. 9(a) shows the monthly mean values of PG for Reading for weekends and weekdays. There is a clear seasonal cycle with increased PG in winter (months 12 to 2) versus summer (months 5 to 8), which is most pronounced in the weekday PG. This is likely to be related to traffic and regional industrial production being at its greatest during weekdays, and coinciding with shallow boundary layer depths during the winter months. The maximum of PG in the winter months at Reading follows a similar winter maximum to that reported by Everett (1868) using instrumentation installed at Kew, London, by Kelvin (1860). Contrasting this with PG data from Graciosa (Fig. 9(b)), which is a primarily clean air site (Lopes et al., 2017), both weekday and weekend PG values maximize during the summer months. 
Measurements of aerosol optical depth at Graciosa (Logan et al., 2014, not shown here) show a maximum aerosol number concentration (for 3-10 μm particles) during winter and spring, which is generally related to high wind speeds which generate sea spray (this is the only substantial source of aerosol particles at Graciosa during these months (Logan et al., 2014)). This seasonal dependence in aerosol is opposite to the seasonal variation in PG, suggesting a negligible effect of the sea spray on the PG. This, and the fact that the seasonal variability in the GEC predicts a NH summer maximum in PG, as is evident at Graciosa, supports the interpretation of Graciosa as a clean air site and more suitable for GEC measurements than urban locations.
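The weekday versus weekend comparison described above can be illustrated with a minimal Python sketch. It is not the procedure used to produce Fig. 8: the function names are hypothetical, the use of medians is an assumption, and the permutation test is only one possible way of assessing significance at the 5% level, since the text does not state which test was applied.

```python
import random
import statistics
from datetime import date


def split_weekday_weekend(samples):
    """Split (date, pg) pairs into weekday (Mon-Fri) and weekend (Sat-Sun) PG lists."""
    weekday, weekend = [], []
    for day, pg in samples:
        # date.weekday() returns 0 for Monday through 6 for Sunday
        (weekend if day.weekday() >= 5 else weekday).append(pg)
    return weekday, weekend


def percent_difference(weekday, weekend):
    """Weekday median PG relative to the weekend median, in percent."""
    wk = statistics.median(weekday)
    we = statistics.median(weekend)
    return 100.0 * (wk - we) / we


def permutation_p_value(a, b, n_perm=2000, seed=0):
    """Two-sided permutation test on the difference in medians (assumed test)."""
    rng = random.Random(seed)
    observed = abs(statistics.median(a) - statistics.median(b))
    pooled = list(a) + list(b)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        diff = abs(statistics.median(pooled[:len(a)])
                   - statistics.median(pooled[len(a):]))
        if diff >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero
    return (hits + 1) / (n_perm + 1)
```

A site would be flagged as showing a significant weekday enhancement when `percent_difference` is positive and `permutation_p_value` falls below 0.05. Input values are assumed to be non-disturbed PG already selected for fair weather.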
Despite a number of the GloCAEM sites being influenced by local pollution, it is possible to select periods of the year when local influences are minimized. Fig. 10 shows the diurnal variation in PG at Reading for both winter and summer months, compared with the Carnegie diurnal variation in PG. During the winter months there is a single maximum (∼15 UT), whereas during the summer a double maximum is present (at 06 and 19 UT), which is characteristic of the mean diurnal variation at Reading when the average is taken over the whole year. As is also demonstrated in Fig. 9(a), the magnitude of the PG during the winter months is larger than in the summer, consistent with Everett (1868), most likely due to the increase in aerosol emissions from domestic heating and the trapping of this aerosol by a shallow PBL. The reduced variability in PBL height during winter (due to diminished convection) leads to more quiescent meteorological conditions, which result in a more stable diurnal variation in PG and the disappearance of the morning maximum. It follows that many of the GloCAEM sites which are affected by local sources of variability will have periods (such as the winter months) which are less dominated by local effects and are therefore more globally representative. This was demonstrated by Harrison et al. (2011) who, using data from December months when more quiescent conditions prevailed, detected a GEC response to the El Niño-Southern Oscillation (ENSO) in PG data from Shetland, UK.
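As a hedged illustration of the winter versus summer comparison in Fig. 10, the sketch below builds a mean diurnal curve from a chosen set of months and lists the hours at which local maxima occur. The function names and the `(month, hour, pg)` input layout are assumptions made for this example, not part of the GloCAEM file format. The peak finder assumes a complete 24-value curve with no missing hours.

```python
from collections import defaultdict


def mean_diurnal_curve(samples, months):
    """Mean PG per UT hour (0-23) using only samples whose month is in `months`.

    `samples` is an iterable of (month, hour, pg) tuples.
    Hours without any data are returned as None.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for month, hour, pg in samples:
        if month in months:
            sums[hour] += pg
            counts[hour] += 1
    return [sums[h] / counts[h] if counts[h] else None for h in range(24)]


def local_maxima_hours(curve):
    """Hours where the curve exceeds both neighbours, treating the day as circular."""
    n = len(curve)
    return [h for h in range(n)
            if curve[h] > curve[(h - 1) % n] and curve[h] > curve[(h + 1) % n]]
```

With real data the hourly means would normally be smoothed before peak detection, since small fluctuations would otherwise register as spurious maxima.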

Average diurnal behaviour across multiple sites
There is considerable interest in whether a global PG dataset can provide information on daily variability within the global circuit (and therefore a proxy for global electrified clouds), which is generally not possible using PG from one site due to interference from local sources of variability. Such a dataset has been suggested as a simple means of monitoring global temperatures, owing to the dependence of the GEC on surface temperature and global thunderstorm activity (e.g. Price, 1993).
To investigate this, Fig. 11 shows (in black) the variation in PG on two individual days averaged into hourly values across six of the GloCAEM sites, compared with the Carnegie curve (in red). Only sites with more than 10 months of data for 2016, and which exhibit Carnegie-like curves on at least one individual day, are considered for this analysis. Even so, the averaging approach should act to minimise any local influences on PG such as fluctuations in aerosol concentration. There is clear similarity between the heavily averaged Carnegie curve and the daily averages across the GloCAEM sites, particularly in the timings of the maxima. The differences between the GloCAEM curves in Fig. 11(a) and (b) warrant further investigation, both into the source of the much lower PG values in Fig. 11(b) (possibly a decrease in global thunderstorm activity that day) and into the secondary peak at 09 UT, which is normally attributed to thunderstorm activity in Asia (potentially reflecting increased activity in this area on this day). Curves possessing a similar shape to the Carnegie curve, with a minimum in the early morning hours and a single maximum around 19 UT (assessed by eye), were observable on ∼25% of days in 2016 using this averaging method. This is a substantial increase in the number of days compared with any single site in the GloCAEM database, and demonstrates that averaging across multiple sites may well improve the statistics of observing Carnegie-like signals on a day-to-day basis.
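The multi-site averaging can be sketched as follows, under stated assumptions. Each site's 24 hourly values are first expressed as a percentage of that site's daily mean, so that sites with different absolute PG contribute equally. This normalisation is an assumption of the example and may differ from the processing behind Fig. 11. The Pearson correlation against a reference curve is offered only as one possible objective alternative to the by-eye assessment used in the text. All function names are hypothetical, and the reference curve must be supplied by the user.

```python
import math


def percent_of_mean(hourly):
    """Express hourly PG values as a percentage of their own mean."""
    mean = sum(hourly) / len(hourly)
    return [100.0 * v / mean for v in hourly]


def multi_site_average(site_curves):
    """Average the normalised hourly curves across sites, hour by hour."""
    normalised = [percent_of_mean(curve) for curve in site_curves]
    return [sum(values) / len(values) for values in zip(*normalised)]


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)
```

A day could then be classed as Carnegie-like when the correlation between `multi_site_average(...)` and the reference curve exceeds a chosen threshold. That threshold would need to be calibrated against the visual classification.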

Seasonal variations in PG at GloCAEM sites
Establishing the exact nature of the seasonality in the GEC has not been a simple task due to interference from local influences on PG, which themselves have their own seasonal variations (e.g. boundary layer heights and aerosol concentrations) (Adlerman and Williams, 1996; Williams, 2009). The small number of fair weather PG measurements made by the Carnegie during the northern hemisphere (NH) summer months on its main cruise (e.g. Harrison, 2013), together with the geographically varying nature of the Carnegie PG measurements, has added to the complexity, since proximity to major thunderstorm regions (e.g. Kamra et al., 1994) and their seasonal variation may be an additional factor. It has therefore required measurements from clean air sites such as Antarctica (e.g. Burns et al., 1995, 2005, 2012) to confirm that the GEC has a NH summer maximum, in agreement with the summer maximum in global lightning activity.
PG measurements from Amundsen-Scott South Pole station (Reddell et al., 2004), Vostok (Burns et al., 2005), and Concordia, all in Antarctica, have proved invaluable in establishing the seasonal variation in the GEC, but maintaining PG instrumentation in such harsh polar environments makes long term measurements difficult. To investigate the suitability of current GloCAEM sites for seasonal GEC monitoring, Fig. 12 shows seasonally averaged values of the diurnal variation in non-disturbed PG at three of the GloCAEM sites which exhibit single peaks in their diurnal PG curves: Graciosa, Halley and CAS2. Although typically only two years of data are included for each site, which is not ideal for seasonal studies, differences can be seen between seasons in the shape of the curves and in the times of the maxima and minima. A small peak around 06-09 UT is evident at Halley (and varies seasonally), but not at Graciosa or CAS2, which exhibit pronounced minima at this time. The minimum is particularly pronounced at CAS2, and may be attributed to the lack of influence of the Asian thunderstorm generator (due to the large distance between Asia and Argentina), which typically maximises around 06-09 UT (Tacza et al., 2014). It should also be noted that Halley and CAS2 are southern hemisphere sites. CAS2 in particular is likely to be influenced by the nearby South American thunderstorm generator region, which has a maximum thunderstorm output in DJF (southern hemisphere summer), and this may lead to the highest maximum in PG (128% with respect to the mean) being observed in this season. The high latitude of Halley also suggests that the PG there may be subject to additional variations on a diurnal scale, caused by the interaction of the ionosphere with the solar wind (known as cross-polar cap variations (e.g. Weimer, 1996)).
This additional diurnal variation would be superposed with the GEC variation, and is known to lead to differences in the timing of minima and maxima at certain Antarctic sites (e.g. Vostok and South Pole station (Burns et al., 2012)). The westward Antarctic location of Halley, on the Brunt Ice Shelf, means that such effects are likely to be small, and only likely an issue during disturbed solar periods, but a full analysis of the Halley data is required to remove such effects. Fig. 13 investigates the variations in the timing of the maxima and minima in the diurnal variations of PG for each of the three GloCAEM sites as a function of season and compares them with the timings of the PG maxima observed by the Carnegie (data from Harrison, 2013) and at Vostok (data from Burns et al., 2005). As is seen, for the timing of the maxima, the Carnegie and Vostok timings show an increase from NH winter through to summer, with maximum in the summer months (JJA), which results from a summer maximum in global thunderstorm activity in the NH. The three GloCAEM sites show a similar increasing trend from NH winter, but display a maximum in spring (MAM). There may be several explanations for this including a lack of data (typically less than 2 years of measurements for each site) and the proximity of sites to major regions of thunderstorms and electrified shower clouds, which may dominate over global influences at certain times of the year. In terms of the timing of the minimum, all sites (with the exception of CAS2, which does not have a dominant minimum in DJF) show the same trend with a spring maximum. The better agreement between sites in terms of the trend in the time of the minimum is likely related to the fact that there is less influence of local variability on the PG during the early morning hours (as demonstrated in Fig. 7). 
It is evident therefore that although some of the GloCAEM sites show promise for GEC monitoring, more data are required to fully assess their suitability.
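The seasonal timing analysis of Fig. 13 can be approximated by the sketch below, which groups samples into three-month seasons and reports the UT hour of the maximum and minimum of each seasonal mean diurnal curve. It assumes an input of `(month, ut_hour, pg)` tuples containing non-disturbed PG only. The names are hypothetical and no smoothing is applied, so this is an outline of the idea and not the method used to produce the figure.

```python
from collections import defaultdict

# Three-month meteorological seasons, as used for the GloCAEM sites in Fig. 13
SEASON_OF_MONTH = {12: "DJF", 1: "DJF", 2: "DJF",
                   3: "MAM", 4: "MAM", 5: "MAM",
                   6: "JJA", 7: "JJA", 8: "JJA",
                   9: "SON", 10: "SON", 11: "SON"}


def seasonal_extrema_hours(samples):
    """Map each season to (hour of maximum, hour of minimum) of its mean diurnal curve."""
    sums = defaultdict(lambda: defaultdict(float))
    counts = defaultdict(lambda: defaultdict(int))
    for month, hour, pg in samples:
        season = SEASON_OF_MONTH[month]
        sums[season][hour] += pg
        counts[season][hour] += 1
    result = {}
    for season, hour_sums in sums.items():
        # Mean PG for every hour that has data in this season
        curve = {h: hour_sums[h] / counts[season][h] for h in hour_sums}
        result[season] = (max(curve, key=curve.get), min(curve, key=curve.get))
    return result
```

Seasons with no samples are simply absent from the returned dictionary, which makes gaps in a short record (such as the roughly two years available per site) explicit.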

Discussion and future directions
The GloCAEM network aims to archive PG data generated from a variety of measurement site types and locations around the globe, which can be utilized to study different scientific phenomena related to atmospheric electricity. Table 3 summarises some of these phenomena and provides initial recommendations of the GloCAEM sites most suited for such analysis. The most widely studied topic in fair weather atmospheric electricity has historically been that of the GEC, which typically requires long time series of PG measurements in fair weather conditions, in unpolluted locations. Although several of the GloCAEM sites promise to be suitable for such studies, detailed analysis is yet to be undertaken; such work would provide further insight into the data available, their analysis and their future reporting. Measurement of the GEC from surface PG measurements is difficult because of local influences such as aerosol variations. Despite the continental nature of most of the GloCAEM sites, this paper has demonstrated that even sites which are subject to variable local influences (e.g. local sources of aerosol) can show some global representativeness with careful selection of data. This includes restricting data seasonally (e.g. to winter periods, which show less atmospheric mixing), using weekend data only (fewer sources of anthropogenic pollution), and focusing on periods of the day which are subject to fewer local influences, such as 2100-0600 LT. The ability to monitor the daily GEC variability through the diurnal variation in PG on individual days is highly desirable, because of its relationship with global temperature changes (e.g. Williams, 1992, 1994). The GloCAEM database thus provides the opportunity to fully test whether averaging many simultaneous PG measurements from around the globe (as in Fig. 11) can provide a more robust determination of the GEC on daily timescales than from any one site, which will only encounter fair weather conditions intermittently.
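The data selection strategies listed above (seasonal restriction, weekend-only data and a 2100-0600 LT window) can be combined in a simple filter such as the sketch below. The function name and flags are hypothetical. The default winter months (December to February) are an assumption appropriate only to northern hemisphere sites. Applying all three criteria together is a choice made for illustration, not a recommendation drawn from the analysis.

```python
from datetime import datetime


def select_samples(samples, winter_only=True, weekend_only=True,
                   night_only=True, winter_months=(12, 1, 2)):
    """Keep (local_time, pg) pairs expected to carry less local influence.

    `local_time` is a datetime in local time (LT) at the site.
    """
    kept = []
    for local_time, pg in samples:
        if winter_only and local_time.month not in winter_months:
            continue
        # weekday() is 5 for Saturday and 6 for Sunday
        if weekend_only and local_time.weekday() < 5:
            continue
        # The 2100-0600 LT window wraps around midnight
        if night_only and not (local_time.hour >= 21 or local_time.hour < 6):
            continue
        kept.append((local_time, pg))
    return kept
```

Each flag can be switched off independently, which makes it straightforward to test how much each criterion contributes to recovering a Carnegie-like diurnal curve at a given site.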
Although at the moment GloCAEM archives only PG data, the format of the data files has been designed so that other atmospheric electricity variables, such as air-Earth conduction current (Jc) and conductivity (σ), can be included in the future. In order to more completely represent the GEC, a similar network of σ and Jc sensors will be required, but the difficulties associated with automating robust measurement of these parameters have prevented this so far. Scope also exists for the inclusion of historical datasets, and the global coverage of sites is expected to be extended in 2019 with the inclusion of further PG datasets.

Summary
This paper summarises the features of a new dataset for global PG measurements, GloCAEM, encompassing four continents and 17 different measurement sites. The work presented is very likely to be one of the largest single analyses of global PG data using multiple datasets simultaneously, which demonstrates the usefulness of a dataset with identical data formatting for each site. The variety of site locations and characteristics contained within the GloCAEM database means that a number of scientific problems related to atmospheric electricity can now be more easily investigated. These include GEC studies, ENSO and climate effects, space weather influences on atmospheric electricity, charging of dust, snow, fog, cloud and aerosols, interactions between PG and biological activity, and turbulent transport of space charge, to name a few. Recommendations are given here on the sites most suitable for such analysis. From the preliminary GEC analysis performed, the GloCAEM dataset is demonstrated to contain several sites which show promise for the study of the GEC (primarily Graciosa (Azores), Halley (Antarctica) and CAS (Argentina)). The averaging of PG during non-disturbed conditions from a number of GloCAEM sites on a daily basis is also demonstrated to produce globally representative signals, potentially leading to the ability to study day to day variations in the GEC, which has so far proved difficult from a single site. The creation of the GloCAEM database therefore represents a major step forward in the synthesis of PG measurements, the lack of which has previously limited atmospheric electricity research, and provides access to central elements of the climate system.

Fig. 13. Time of the (a) maximum and (b) minimum in diurnal variation in non-disturbed PG as a function of season from various sites including the Carnegie; Vostok (Antarctica); and the GloCAEM sites of Graciosa (data from 2015 to Sept 2017), Halley (data from 2015 to 2017) and CAS2 (Argentina) (data from 2016 to 2018). Carnegie data are obtained from Harrison (2013) and Vostok data from Burns et al. (2005). Vostok has more data points than the other sites as seasonal PG averages are reported over 2 month periods (Burns et al., 2005), unlike the other sites where averages are calculated over 3 month periods.

Table 3
Recommendations of GloCAEM sites most suited to the study of a variety of scientific phenomena in atmospheric electricity research.