The automatic weather stations NOANN network of the National Observatory of Athens: operation and database

During the last 10 years, the Institute for Environmental Research and Sustainable Development of the National Observatory of Athens has developed and now operates a network of automated weather stations across Greece. The motivation behind the network development is the monitoring of weather conditions in Greece, with the aim of supporting not only research needs (weather monitoring and analysis, weather forecast skill evaluation) but also the needs of various communities of the production sector (agriculture, construction, leisure and tourism, etc.). By the end of 2016, 335 weather stations were in operation, providing real-time data at 10-min intervals. This paper provides information about the logistics of this network, including real-time applications of the collected data, as well as information on the quality control protocols, the construction of the station data and metadata repository, and the means through which the data are made available to users.


Introduction
Weather monitoring is among the highest priorities of agencies and research organizations that are involved in operational and/or research activities related to weather and climate. Moreover, since weather observations are of paramount importance for many socioeconomic activities, they have to be provided in a timely manner to potential end-users, through web-based platforms that permit access not only to the current weather conditions but also to past weather data that are quality controlled and archived.
Observations provided by dense surface networks of meteorological stations, apart from their use in numerical weather prediction, are essential for model validation and verification (e.g. Kotroni and Lagouvardos, 2004; Akylas et al., 2007), for targeted studies within urban areas (Kotroni et al., 2011; Siu and Hart, 2013), for climatological studies relating meteorology to air quality (Perez-Martinez and Miranda, 2015), etc.
Over the last few decades, the number of automated weather stations (AWSs) and networks has increased considerably worldwide, including not only those operated by the national weather services, but also the ones established and operated by universities, research institutes, power grid operators or private companies. Non-national weather service networks address the need for a denser distribution of weather observations and contribute to the expansion and dissemination of high-quality weather-related data needed in the research on weather hazards and the associated environmental and socioeconomic impacts (Meyer and Hubbard, 1992; Shafer et al., 2000; Papagiannaki et al., 2013, 2015). In the United States and Canada, the number of non-federal networks increased particularly during the 1980s, mostly to meet needs related to agriculture, and the expansion continued as new research topics emerged to address climatic and environmental issues (Shafer et al., 2000). Among the existing AWS networks that provide detailed documentation on their quality control protocols and technical information is the Oklahoma Mesonetwork (Mesonet), which currently comprises 120 stations. The network has been operated by two Oklahoma universities since 1995, and has had a significant positive impact on the research conducted in the region (Brock et al., 1995; Shafer et al., 2000). The West Texas Mesonet and the Helsinki Testbed networks have also been acknowledged for the high quality and reliability of the produced datasets by Muller et al. (2013), who provide a review of urban meteorological networks, with the aim of addressing the need for standardized metadata protocols. The authors stress the importance of documenting and making public the networks' protocols in order to establish common guidelines and ensure high-quality services for applications and urban research.
The Institute of Environmental Research and Sustainable Development of the National Observatory of Athens (IERSD/NOA) took the initiative in 2006 to build a relatively low-cost surface weather observation network with the aim of gradually covering the entire Greek territory (including the numerous Greek islands) with automated online stations that continuously transmit their measurements through the web. This network is hereinafter referred to as NOAAN (NOA Automatic Network). The network expansion in terms of the number of deployed stations is shown in Figure 1. In December 2016, the NOAAN comprised 335 stations. Currently, NOAAN is the densest network of automated stations over Greece, complementing the Hellenic National Meteorological Service network, which counts ~90 stations. NOAAN has been built to address the needs not only of the scientific community but also of the various sectors of the economy for real-time accessible weather observations. Indeed, the data from NOANN, both real-time and archived, are used for the study of the urban environment, the understanding of severe weather events and their impacts on society, and model validation and verification, among others. Moreover, NOANN data are used in the agriculture sector, not only for the definition of the local microclimate where the station is installed but also for the evaluation of the impacts of weather on crops, in the construction business (definition of degree days, maximum rainfall in a region, etc.), by insurance companies (evaluation of damages), etc.
The next section is devoted to the general description of the network, including site selection, geographic distribution, etc. Then the construction of the corresponding database, the applied quality control procedures and the description of the multiple ways a user can access and display the data, are discussed. Finally, the last section presents the concluding remarks as well as the prospects for the sustainability and future expansion of the network and the associated services.

General description
Figure 2(a) shows the geographical distribution of the 335 stations of NOAAN at the end of 2016. The network is designed to cover the whole of the Greek territory with a more or less homogeneous geographic distribution; however, in some places the network is noticeably denser due to local research needs. The network is especially dense over the greater Athens area, where 40% of the Greek population lives (Figure 2(b)), with the aim of meeting the monitoring needs of this highly urbanized area.
The stations are installed at buildings or land parcels owned by local authorities, schools, universities, and monasteries, as well as on privately owned land and/or premises. From the stations shown in Figure 2
The weather station type used is the Davis Vantage Pro 2, which measures ambient air temperature, relative humidity, wind speed and direction, rainfall, atmospheric pressure, and solar and UV radiation (Davis Instruments, 2010a). It also measures indoor air temperature and humidity at the location of the station's console/datalogger (Davis Instruments, 2012) and calculates a suite of bioclimatic indices and derived meteorological parameters (Davis Instruments, 2006). Table 1 provides the technical characteristics of the various sensors incorporated within each station (Davis Instruments, 2010b). It is beyond the scope of the current work to analyse the measurement differences between Davis instrumentation and that of other manufacturers of surface stations. For such a comparison, readers are referred to Bell et al. (2013, 2015).
In order to assure the sustainability of the network, three prerequisites are necessary for the installation of a new station:
• To comply, as much as possible, with the guidelines for installing automatic weather stations provided by WMO (2006, 2008) (as explained in detail in the 'Site selection' section).
• To ensure the availability of an uninterrupted Internet connection that permits the real-time transmission and display of the collected data. For the vast majority of the stations, the network connection is provided free of charge by the local authorities so as to minimize the operational costs of the network.
• To ensure collaboration with local weather enthusiasts and/or local authority employees, who can voluntarily support the station operation by intervening on site in case of a sensor and/or telecommunication failure.

Site selection
The selection of an appropriate site for the installation of a weather station is critical for obtaining representative meteorological data. For this reason, there are two main criteria defined for the site selection of NOAAN stations.
First, the site is chosen to be as exposed as possible, and away from obstructions such as buildings and trees. The anemometer of the rural meteorological stations is mounted over open, level terrain at a height of 5 m above the ground, while for the roof-mounted weather stations in urban sites, the anemometer is mounted at a height of 3 m. The standard anemometer height is 10 m, but very often, especially for surface stations deployed by National Weather Services at airports, the anemometers are installed at 5 or 6 m height (Kotroni et al., 2014). The anemometer height is recorded in the metadata, which allows users to make adjustments if needed. The temperature and humidity sensors are located inside a fan-aspirated radiation shield, usually at 1.8-2.0 m above bare soil or short grass, avoiding excessive vegetation that could smother the sensors; in urban areas, the roof-mounted stations are located at the edge of the roof in order to avoid radiative heating, especially during the summer months, from the underlying cement or roof tiles. Furthermore, special attention is given to avoiding locations near heat sources usually installed on the roofs of buildings (e.g. air-conditioning outdoor units, external heating units, chimneys, etc.). The rain gauge is placed above the fan-aspirated shield of the temperature and humidity sensors, with the collection area of the gauge bucket in a horizontal plane, open to the sky and approximately 2.0-2.3 m above the ground. At this point it should be noted that the Davis Vantage Pro 2 rain gauge collector opening has a diameter of 165 mm, which is 15% smaller than that of typical gauges (~200 mm). Since rain gauge measurements become less accurate with increasing wind speed, this loss of accuracy might be more pronounced for a gauge with a smaller collector diameter.
Installation of the gauge on a rooftop might also affect the rain measurement; for this reason, the siting information is included in the metadata. As regards solid precipitation, a rain collector heater is installed at those stations where, climatologically, snowfall is expected on at least 4-5 days per year. When snowfall occurs over a station where no rain collector heater is installed, the rain reports due to melting snow are deleted from the database, and a comment about the loss of solid precipitation is added to the metadata report. Moreover, regarding stations that are equipped with solar and ultraviolet radiation sensors, special care is taken to avoid shadowing from adjacent obstacles and to assure, as much as possible, an unobstructed horizon. This last feature can be easily verified at each location by inspection of the daily variation in total solar radiation during a cloudless day.
The second criterion is to make the 10-min meteorological observations available in real time to a wide variety of Internet users, which primarily requires a reliable communication system to relay the data back to the central server of NOAAN. The largest part of the NOAAN data, collected by stations at public buildings and schools, is transmitted through the public Internet network connection 'Syzefxis', while for stations installed on privately owned land, data transmission is ensured by private Internet connections. Syzefxis is a project that provides large-scale telecommunication and data-transmission services to major bodies of the Greek public sector (hospitals, libraries, town halls, port authorities, etc.). Schools are chosen for hosting meteorological stations, as they are relatively secure and provide a stable Internet connection. An additional benefit is that schools can use the station data for educational activities, familiarizing students with scientific research. In some areas, such as ski centres or the most distant and isolated villages, an Internet connection is not feasible and the data could not otherwise be transmitted. In these cases, the mobile telephony networks provide an alternative solution for data transmission, using a USB data transmitter card with a relatively low monthly cost, which is funded by the local authorities or the owners of the ski centres.
Overall, approximately two thirds of the NOAAN stations are installed on public buildings or public land, and the rest on privately owned premises. In both cases, the stations are hosted free of charge. Each landowner also grants NOA permission to access the station under any weather conditions and at any time of the day. Among the 335 stations of the NOAAN, 200 are installed on the ground, while the remaining 135 are installed on rooftops. At this point, it should be noted that it was a real challenge to find appropriate locations (with Internet access and ensured security) that strictly meet the WMO guidelines on the installation of stations over ground surfaces, mainly in urban areas. Figure 3 shows a collection of representative station setups installed both on the ground and on rooftops.

Network maintenance
Regular maintenance of the NOAAN stations includes a site visit to each station every 24 months. During this visit, the station is cleaned, the sensors are checked, the surroundings are checked for any nearby installations/obstacles that might not have been reported, the rain gauge is calibrated, and the anemometer is checked for its functionality and replaced if necessary. The temperature-humidity sensors are replaced with new ones after 6-7 years of operation (or earlier if a problem is identified by the quality control, as described later in the text). This replacement interval was chosen because the calibration of the 20 oldest sensors in a certified laboratory showed that all of them were still performing within the range of error specified by the manufacturer (Table 1).

Real-time data display
At almost all of the stations, data are recorded at 10-min intervals and transferred to a server at NOA premises, dedicated to data archiving and quality control (as explained in detail in the next section). For each weather station, a dedicated webpage provides the most recent measurements, diagrams of the measured parameters during the last 24 h, as well as some basic statistical information for the current and the previous month. Apart from the direct measurements of the main meteorological parameters, additional information is provided, including the heat index and the daily minima and maxima of all parameters. These webpages are bilingual (Greek/English) for easy access by both Greek and foreign users. Figure 4 provides an example of a NOAAN station webpage.

Quality control
Each station of the NOAAN generates a data message every 10 min that consists of measurements of the various meteorological variables (including the 10-min averages, minimum and maximum values, etc.). The NOAAN produces ~44 000 such data messages daily. In order to acquire and provide accurate data records from such a large number of observations, a concise quality control procedure is required. The quality control of high-frequency meteorological data records is known to be cumbersome and there is no established standard procedure to apply (Von Arx et al., 2013). The quality control of the NOAAN high-frequency meteorological data is performed in two steps.
In the first step, a primary quality control is applied twice daily by scanning the records of each meteorological variable during the elapsed 12 h in order to detect and flag data of questionable quality. The automated algorithms are able to detect erroneous data and send error messages to the NOAAN administrators. These algorithms include checks for (1) conformance to the operational range of each sensor, (2) exceedance of predefined extreme values for each variable, and (3) suspiciously persistent values (e.g. null wind for many hours, successive and persistent precipitation values of 0.2 mm). In this step, defective instruments, communication problems, or other issues affecting data quality and availability are identified, and the appropriate maintenance procedures are initiated. Furthermore, a manual (non-automated) procedure identifies suspicious data by comparison with observations from the closest weather stations. At this point it should be mentioned that automation of the procedure that examines the spatial consistency of the dataset is underway. However, the very complex topography of the country makes the calibration of such an automated procedure difficult. Although the first step is essential to identify erroneous data, all flagged data are submitted to a second stage of quality control.
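The three automated checks listed above can be illustrated with a minimal Python sketch. It is not the NOAAN implementation: the names `Limits`, `range_flags` and `persistence_runs`, the flag labels, and the temperature limits are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Limits:
    sensor_min: float   # operational range of the sensor
    sensor_max: float
    extreme_min: float  # predefined plausible extremes for the variable
    extreme_max: float


# Hypothetical limits for air temperature in degrees C (not the NOAAN values).
TEMP_LIMITS = Limits(-40.0, 65.0, -30.0, 50.0)


def range_flags(values, limits):
    """Checks (1) and (2): flag each value against sensor range and extremes."""
    flags = []
    for v in values:
        if v is None:
            flags.append("missing")
        elif not (limits.sensor_min <= v <= limits.sensor_max):
            flags.append("out_of_sensor_range")
        elif not (limits.extreme_min <= v <= limits.extreme_max):
            flags.append("beyond_extreme")
        else:
            flags.append("ok")
    return flags


def persistence_runs(values, min_run):
    """Check (3): return (start, end) index pairs, end exclusive, of runs in
    which the same non-missing value repeats at least min_run times."""
    runs = []
    start = 0
    for i in range(1, len(values) + 1):
        if i == len(values) or values[i] != values[start]:
            if values[start] is not None and i - start >= min_run:
                runs.append((start, i))
            start = i
    return runs
```

Which variables and run lengths count as suspicious is left to the caller: a long run of zero rainfall is normal, whereas a long run of zero wind or of 0.2 mm rain reports is not.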
The second step could also be characterized as the 'final decision' step. All data that are candidates for inclusion in the database are re-examined for spatial and/or temporal inconsistencies, both internally and in comparison with neighbouring stations, and those found to be of unacceptable quality are excluded. This step is carried out before incorporating any new data in the database. Once the second step is completed, the remaining good-quality data are archived and made available to the general public, as explained in the following subsections. It should be noted that the erroneous data are not included in the final database, and users can find the reason for these missing data in the station metadata section. Furthermore, as explained later on, users can obtain statistics on the data availability for each requested period.

Database structure
As described in the previous sections, the huge amount of data collected from NOAAN on a daily basis created the need for an advanced database system to archive them. Since the beginning of 2014, new servers have been up and running, hosting a new database and a front-end interface to ease public access to the meteorological data. Numerous scripts and queries were developed, and are regularly expanded to cover the needs that arise, to evaluate the data, populate and extract data from the database, and finally calculate climatological data.
After extensive research into the technologies available to cover the needs of such a project, and taking into account the WMO guidelines and the existing trends in research and development, it was concluded that open-source products should be used. Linux servers were set up and every technology used is either open source or developed in house from scratch. The back-end is written in PHP and uses a MySQL database, while the front-end uses several open-source technologies and standards such as PHP, JavaScript, HTML5, and SVG. The only commercial product used so far is the Google Maps JavaScript API, mainly because of the need for both maps and satellite imagery as a base on which to project and display the archived data.
The general idea behind the creation of a database is the grouping and standardization of data in the least redundant form, without the loss of information. In a relational database, such as the one developed in the frame of this work, the standardization is achieved with the creation of separate tables depending on the type of the variables stored and their relation with other data.
Each station is considered unique, as is each station's 10-min entry. The variables are distributed into seven groups as shown in Figure 5. Those groups hold all the unique data received from the stations, with unique 'keys' generated by the combination of the entry's time stamp, the station id, and the entry's interval (most of the stations use a 10-min interval, though there are a few exceptions where connectivity limitations dictated 15-, 30-, 60-, or 120-min recording intervals). Since no automated implementation is error-free and errors may always be generated at random intervals, the entries retrieved from the stations may suffer data loss or fragmentation for numerous reasons, such as sensor failure, Internet connection loss, connectivity problems, or interference in the wireless communication between the sensors and their interface (console/computer). The above coding structure therefore allows faster access to the data, reduces the size of the database, and ensures the validity of the calculated statistics, since entries in the tables are generated only if data are available.
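The keying scheme described above can be sketched as follows. This is only an illustration under stated assumptions: it uses Python's built-in `sqlite3` module rather than the MySQL back-end of the actual system, it shows two hypothetical variable groups (`temperature`, `rain`) instead of the seven groups of Figure 5, and all table, column and function names are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway in-memory database
conn.executescript("""
CREATE TABLE station (
    station_id INTEGER PRIMARY KEY,
    name       TEXT NOT NULL
);
-- One table per variable group; the composite key combines the
-- station id, the entry time stamp and the recording interval.
CREATE TABLE temperature (
    station_id   INTEGER NOT NULL REFERENCES station(station_id),
    ts           TEXT    NOT NULL,
    interval_min INTEGER NOT NULL,
    temp_c       REAL    NOT NULL,
    PRIMARY KEY (station_id, ts, interval_min)
);
CREATE TABLE rain (
    station_id   INTEGER NOT NULL REFERENCES station(station_id),
    ts           TEXT    NOT NULL,
    interval_min INTEGER NOT NULL,
    rain_mm      REAL    NOT NULL,
    PRIMARY KEY (station_id, ts, interval_min)
);
""")


def store_entry(conn, station_id, ts, interval_min, temp_c=None, rain_mm=None):
    """Store one station report; a row is written to a group table only
    when that group's value is actually available."""
    if temp_c is not None:
        conn.execute("INSERT INTO temperature VALUES (?, ?, ?, ?)",
                     (station_id, ts, interval_min, temp_c))
    if rain_mm is not None:
        conn.execute("INSERT INTO rain VALUES (?, ?, ?, ?)",
                     (station_id, ts, interval_min, rain_mm))
```

Because missing values produce no rows, statistics computed over a group table only ever see real measurements, and a duplicate report for the same station, time stamp and interval is rejected by the primary key.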
This method of data standardization allows the expandability of the system and the network itself. As already mentioned, the system consists of 335 stations, and even though the sensors implemented so far are of a meteorological nature, its structure allows the addition of other data groups, such as agricultural (soil moisture, leaf wetness, etc.), hydrological (lake or river levels, river flow, etc.), air quality (concentration of particulate matter, ozone, etc.), or other forms, facilitating research on the correlation between them.
The relational database created allows an additional form of quality control of the available data. Even though there is a bundle of scripts running and processing the real-time data to highlight failures (as described in the previous subsection), these are not able to detect long-term sensor irregularities or problems. For example, a failing aspiration fan (which circulates the air in the temperature/relative humidity sensor enclosure) would result in a considerable change in the recorded temperature during daytime. The database and the associated scripts permit the investigation of autocorrelation patterns within the data of each station as well as cross-correlation patterns between the data from neighbouring stations, allowing faster responses when dealing with the problems that arise, and keeping the data quality to high standards.
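As a rough illustration of how a comparison between neighbouring stations could reveal such a slow sensor problem, the sketch below computes a Pearson correlation and a simple change in the mean station-minus-neighbour difference. It is an assumed approach, not the scripts used by NOAAN; the function names and the idea of splitting the series at a fixed index are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equally sampled series, skipping gaps."""
    pairs = [(a, b) for a, b in zip(x, y) if a is not None and b is not None]
    n = len(pairs)
    if n < 3:
        return None
    mx = sum(a for a, _ in pairs) / n
    my = sum(b for _, b in pairs) / n
    sxy = sum((a - mx) * (b - my) for a, b in pairs)
    sxx = sum((a - mx) ** 2 for a, _ in pairs)
    syy = sum((b - my) ** 2 for _, b in pairs)
    if sxx == 0 or syy == 0:
        return None  # a constant series has no defined correlation
    return sxy / sqrt(sxx * syy)


def mean_difference(a, b):
    """Mean of a - b over the samples where both series have data."""
    diffs = [x - y for x, y in zip(a, b) if x is not None and y is not None]
    return sum(diffs) / len(diffs) if diffs else None


def drift(station, neighbour, split):
    """Change in the mean station-minus-neighbour difference between the
    samples before index `split` and the samples from `split` onwards."""
    before = mean_difference(station[:split], neighbour[:split])
    after = mean_difference(station[split:], neighbour[split:])
    if before is None or after is None:
        return None
    return after - before
```

Applied to daytime temperatures, a station whose aspiration fan has failed would be expected to show a positive drift relative to its neighbours while remaining well correlated with them; the threshold at which a drift triggers a site visit would have to be tuned per station pair.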

Visualization and dissemination of historical data
The collection of meteorological data described above creates the need for their processing and analysis to turn the raw data into information useful to the public. The first step in converting the data to information is to visualize them. For this purpose, a web-GIS tool, based mostly on open-source software, was created that is able to provide complete analytical geospatial data from the NOAAN database. Through a user-friendly interface the user is able to easily access, view, navigate, and download all forms of meteorological and climatological data generated by the station network and the processing routines of the database. The safety of the database is secured by keeping weekly backups on a RAID 5 data storage system hosted at NOA.
The idea behind this web-GIS tool is the creation of a product that overcomes the cross-platform problems of using geospatial data and provides the data in a simple and uniform way, making them accessible even to people who do not possess dedicated commercial products. For this purpose, a simple graphical user interface (GUI) was designed that visually resembles any other window-based application with toolbars and a workspace. This GUI is accessed through http://stratus.meteo.noa.gr/front. The application separates the stations' data into two large categories, 'Live Data' and 'Database Data', but visualizes both in three view types: 'Map', 'List', and 'Station'. The user can easily search through the stations, filter the results, and switch between live and archived data for a specific date/time or period.
The 'Map' view visualizes the distribution of NOAAN stations over the Greek territory as points on the map (markers, Figure 6(a)). The markers are placed, based on the geographic location of the stations, on an interactive map layer that can be switched between map and satellite view and can be zoomed in and out. These markers hold all the data collected live from the network and are refreshed at regular 10-min intervals. The user can not only switch between the various meteorological variables but also be informed about each station's activity status. If a station is offline (the latest entry is older than 60 min before present) or generates corrupted or false data, it is rendered on the map with a different marker in each case. In addition, the user can switch the marker type/colouring to three more types if needed: by each station's status, division, or prefecture. The 'List' view presents the same data as the 'Map' view, but in a tabular form (Figure 6(b)).
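The activity status behind the markers can be expressed in a few lines. The following Python sketch assumes only what is stated above (a station counts as offline when its latest entry is more than 60 min old); the function name, the `data_valid` argument and the status labels are hypothetical.

```python
from datetime import datetime, timedelta

OFFLINE_AFTER = timedelta(minutes=60)  # threshold stated in the text


def station_status(latest_entry, now, data_valid=True):
    """Classify a station for map rendering.

    latest_entry: datetime of the most recent record, or None if none exists.
    data_valid:   hypothetical result of the quality checks on that record.
    """
    if latest_entry is None or now - latest_entry > OFFLINE_AFTER:
        return "offline"
    if not data_valid:
        return "suspect"
    return "online"
```

A front-end would then map each of the three labels to its own marker style.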
Through the 'List' view, the user can search quickly for the station of interest (by name or internal code) or filter the available stations by combinations of filters provided in a simple-to-use menu. The user can specify filtering methods based on the stations' geographical attributes (division, prefecture, point or within a radius, altitude) and the stations' activity period. With each search or filtering, the 'Map' and the 'List' views are automatically updated to show only the stations that fulfil the selected criteria (Figure 7). Additionally, the 'List' view offers a downloadable form of the filtered stations with their locations and the selected values.
As mentioned earlier, the application separates the 'Live Data' from the 'Database Data', and this is done because the data need to be processed before being published. Through a menu, the user can switch from 'Live Data' to a specific date/time or a period in the past. When a specific date/time is selected, the markers on the map are updated with the available station data for that time, whereas when filtering by a time period, the markers are updated with the corresponding mean, minimum, maximum, or total, depending on the variable that is being viewed. In this way, the user is able to visualize and download specific or climatological data. At this point it is worth mentioning that the yearly average percentage of missing values in the database is of the order of 3.5%.
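The choice of statistic by variable type can be sketched as below. This is an assumption-laden illustration: the set `CUMULATIVE`, the function name and the returned keys are hypothetical, and the real system may assign statistics to variables differently.

```python
# Hypothetical set of variables that are summed over a period rather than averaged.
CUMULATIVE = {"rain"}


def summarize_period(variable, values):
    """Summarize one station's values over a period; None marks a missing record."""
    present = [v for v in values if v is not None]
    if not present:
        return None
    summary = {"availability": len(present) / len(values)}
    if variable in CUMULATIVE:
        summary["total"] = sum(present)
    else:
        summary["mean"] = sum(present) / len(present)
        summary["min"] = min(present)
        summary["max"] = max(present)
    return summary
```

Returning the availability fraction alongside the statistic lets a caller decide whether the period has enough data to be shown at all.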
The third view type is the 'Station' view, which can be accessed by clicking any station's marker or by selecting a specific station from the 'List'. Through this view, the user can access the station's general, activity, operational, and geographical information. Additionally, when in 'Live Data' mode, all of the station's sensor data are gathered there, whereas when in 'Database Data' mode, all the data are presented as charts, separately for daily, monthly, and annual data, which are also available for download (Figure 8). No data are provided when the availability is lower than 25%. When downloading data, the users receive, along with the measurements, information about the data availability, following a four-level rating that ranges from the lowest (in the range 25-50%) to the highest (more than 95%) availability. It should be mentioned that users cannot download the raw data through automated scripts by default; such access is granted only after approval by NOANN.
Finally, an attempt was made to incorporate other methods of visualization, such as gridding and contouring, in addition to the point measurement presentation. Numerous algorithms exist for both of these visualization methods; however, mainly due to the excessive computational requirements of the more sophisticated ones and the need to run the selected algorithm on almost any end-user system, it was decided to implement a rather simple one. Among all the methods for gridding, the nearest neighbour-weighted interpolation was selected as the one that balances computational needs and quality of the resulting grid, and its product is used as input to a bicubic interpolation algorithm (Press et al., 1992) for the contouring (Figure 9).
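The text does not spell out the exact weighting used for the gridding step, so the sketch below shows only one plausible reading: inverse-distance weighting restricted to the `k` nearest stations. The function names, the default `k` and `power`, and the use of planar coordinates are all assumptions made for illustration; the bicubic contouring stage is not shown.

```python
from math import hypot


def idw_nearest(stations, x, y, k=4, power=2.0):
    """Estimate a value at (x, y) from the k nearest stations.

    stations: list of (sx, sy, value) tuples in planar coordinates.
    Weights are 1 / distance**power (an assumed weighting scheme).
    """
    ranked = sorted(
        ((hypot(sx - x, sy - y), v) for sx, sy, v in stations),
        key=lambda pair: pair[0],
    )[:k]
    if not ranked:
        return None
    if ranked[0][0] == 0.0:
        return ranked[0][1]  # grid node coincides with a station
    weights = [1.0 / d ** power for d, _ in ranked]
    return sum(w * v for w, (_, v) in zip(weights, ranked)) / sum(weights)


def make_grid(stations, xs, ys, k=4, power=2.0):
    """Evaluate idw_nearest on a regular grid; rows follow ys, columns follow xs."""
    return [[idw_nearest(stations, x, y, k, power) for x in xs] for y in ys]
```

A scheme of this kind is cheap enough to run in a browser-side port, and `k` and `power` are the sort of parameters a user could adjust to make the field smoother or more local.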
After selecting a meteorological variable, from live or past data, the user can alter the parameterization of the algorithms through a simple menu and change the result to meet his or her needs. Once the desired result is achieved, the map can be downloaded in an open vector format (SVG) and used freely. The authors plan to produce, in the near future, gridded data in which the elevation is also used as a predictor in the regression analysis. This feature will be incorporated in the next release of the database.
At this point it is worth mentioning how useful the NOAAN network data have been so far for both the research community and the various sectors of the economy.
First, in the frame of the scientific work performed at the National Observatory of Athens, these data have been crucial for the following studies: (1) societal analyses of high-impact weather events, including the compilation of a database of such events over Greece (Papagiannaki et al., 2013), and the analysis of flash flood occurrence in relation to rainfall hazard over the Greater Athens Area (Papagiannaki et al., 2015); (2) numerical weather prediction validation and verification of temperature (Kotroni et al., 2011), precipitation (Sindosi et al., 2012; Giannaros et al., 2015, 2016; Feidas et al., 2016), wind (Banks et al., 2016), and solar radiation (Kosmopoulos et al., 2015). Furthermore, other national bodies or agencies systematically use this database. Namely, the Hellenic National Meteorological Service uses these data for the reports on high-impact weather delivered to WMO. The Hellenic Agricultural Insurance Agency uses the NOAAN database for the authentication of the weather events that impacted crops. Up to the end of 2016, the Power Grid Distribution Agency operationally received total solar radiation data from the NOAAN stations for the assessment of the energy produced by photovoltaic plants.
An additional measure of the usefulness of the NOAAN can be deduced from the number of data requests that have been satisfied up to now. Since 2011, more than 260 data requests have been addressed and about 1200 station-years of data have been delivered. Among these requests, 10% have come from entities outside Greece.

Metadata
The existence of metadata information is considered of paramount importance. All changes in the location of the instruments, periodic maintenance, calibration of sensors, etc., have to be meticulously noted in order to be able to provide the end-users with all the necessary information about the operation of the stations. These metadata include:
• Detailed description of each site, with photos.
• Detailed description of the dates of regular maintenance, problems encountered, and the calibration of sensors.
• Detailed description of temporal coverage of data, with list of periods of lost data (partial or total loss).
• Reporting of date of station decommissioning, if applicable.
• Reporting of dates of sensor replacements, due to failure, vandalism, and/or ageing.
The part of the metadata that is useful to the data users (such as description of the site, temporal coverage of data, loss of data, etc.) is available at the online interface.

Concluding remarks
This paper is devoted to the presentation of the network of the automated meteorological stations developed, maintained, and operated by the Institute of Environmental Research and Sustainable Development of the National Observatory of Athens. The network started in 2006 and by the end of 2016 reached 335 stations over the Greek territory.
The network was built on the following principles: (1) to cover the research needs of scientists and the socioeconomic needs of both the general public and end-users from various sectors (construction, leisure, etc.), (2) to provide the measured data in real time to the end-users, and (3) to build and populate a database with quality-controlled data that can be visualized and downloaded in a user-friendly way. All relevant data and information are available at http://stratus.meteo.noa.gr/front.
Finally, the network was built on the principle of assuring the sustainability of its operation. For that reason the selected equipment is relatively low cost, the site selection ensures the transmission of data through the Internet from existing infrastructure, and, last but not least, the network is supported locally by a large number of weather enthusiasts who voluntarily intervene when necessary. It is worth mentioning that the network has been financed with internal funds, donations, and sponsorships, while during the last 3 years it also received support from the THESPIA and ARISTOTELIS/SOLAR national projects. The network expansion will slow down considerably relative to the growth since its start in 2006 (shown in Figure 1), because the aim is to keep the number of stations at a level that can be maintained to a high operational standard, taking into account the available resources in both people and budget. Only some gaps are expected to be filled, and the plan is not to exceed 400 stations.