SPATIAL DATA PROCESSING TOOLS AND APPLICATIONS FOR BLACK SEA CATCHMENT REGION

Abstract: The enviroGRIDS project has developed, and provides through the BSC-OS portal, a set of tools, applications and platforms for processing very large spatial data sets for the Black Sea catchment region. The paper highlights the main interoperability issues between Geospatial and Grid infrastructures, and between the different platforms supporting the Earth Science oriented tools and applications. The BSC-OS portal provides end user applications for spatial data management, hydrological model calibration, environmental scenario development and execution, workflow based satellite image processing, data reporting and scenario visualization, and development of Earth Science oriented training materials.


INTRODUCTION
Ecologically unsustainable development and inadequate resource management in the Black Sea catchment region, in the context of climate, land cover and population changes, are the main concern of the enviroGRIDS (Black Sea Catchment Observation and Assessment System supporting Sustainable Development) FP7 project [1]. The quantity and quality of water are extremely important, as is understanding how the complex environmental systems will evolve over the coming decades. The enviroGRIDS project aims to develop, calibrate, and make available the hydrological model of the Black Sea catchment region through four main achievements:
• Collection of large transnational data sets;
• Adequate management and sharing of the environmental data through a dedicated Spatial Data Infrastructure (SDI);
• Distributed computing, in order to run a high-resolution hydrological model;
• A set of tools and applications that let specialists and citizens access data processing and visualization, and analyze environmental scenarios.
One main challenge of the project is to experiment with and implement interoperability between different technologies, platforms, and applications. One such case is the interoperability between the Geospatial and Grid infrastructures, which extends the features provided by both technologies. The Geospatial technologies offer very specialized functionality for Earth Science oriented applications, while the Grid technology supports high performance computation through scalability and through distributed and parallel processing.
The resources of the enviroGRIDS system are accessible to a large community of users through the BSC-OS (Black Sea Catchment Observation System) portal. Using single sign-on authentication, the portal provides Web applications for data management, hydrological model calibration and execution, satellite image processing, report generation and visualization, and a virtual training center.
This presentation focuses on the BSC-OS portal architecture and on the main challenges and issues in developing environmental tools and applications for the Black Sea catchment.
The paper is structured as follows. Section 2 presents the works and achievements related to the enviroGRIDS project. Section 3 sketches the portal architecture and the set of tool and application categories. Each of the next six sections describes one tool and application category: data management, SWAT model calibration and scenario execution, satellite image processing, spatial data visualization and reporting, two demonstrator applications oriented to citizens and decision makers, and the virtual training center. The last section concludes on the portal development and future work.
International Journal of Computing, ISSN 1727-6209, www.computingonline.net, computing@computingonline.net

RELATED WORKS
The enviroGRIDS project develops the SWAT model as a high-resolution (i.e., sub-catchment spatial and daily temporal resolution) water balance model for the entire Black Sea catchment region. The model is calibrated and validated using river discharge data, river water quality data, and crop yield data, as in [2]. The hydrological model of the Black Sea watershed is very complex due to highly interconnected and continuously evolving interactions at many spatial and temporal scales, and it requires gathering and integrating different sets of environmental data (e.g. physical, chemical, biological) [3]. Other European projects address environment related subjects [4]. The IS-ENES project develops the European Network for Earth System Modeling (ENES), which brings together the European climate/Earth system modeling community to work on understanding and predicting future climate change. The ENSEMBLES project was a joint effort to develop an ensemble prediction system for climate change based on the principal state-of-the-art, high resolution, global and regional Earth System models developed in Europe. The METAFOR project addresses the fragmentation and gaps in the availability of metadata for climate data found in existing repositories. The goal of the DRIHMS project is to systematically build a bridge between the HMR (Hydro-Meteorological Research) and ICT (Information and Computing Technology) communities, to identify the requirements of HMR users, and to match them to the capabilities of the newly developed ICT infrastructure. The GENESI-DEC project aims to provide guaranteed, reliable, easy, and effective access to a variety of data, facilities, tools and services for an ever increasing number of Digital Earth users from all disciplines.
The EGEE, SEEGRID-SCI, and C3Grid projects provide solutions for sharing complex spatial and environmental data sets, together with Grid based processing tools and applications. The aim of the C3Grid project, for instance, is to create a Grid based working environment for Earth system research.
Many other European projects, such as SAW-GEO, CYCLOPS, GDI-Grid, GEO-Grid, DEEGREE, DORII, and GENESI-DR, address the management of spatial data and environmental tools and applications.
EU projects such as OBSERVE, EGIDA, BalkanGEONet, enviroGRIDS, and GEONetCab contribute significantly to the development of the environmental network and observation capacity in South East Europe.
The enviroGRIDS project draws on solutions and experience from many of the projects mentioned above in order to address the particularities of the Black Sea catchment region in terms of SDI, platform interoperability (i.e. Geospatial and Grid, and software platforms such as URM, gSWAT, ESIP, GreenLand, gProcess, eGLE, etc.), high resolution models, processing scalability, usability of user interaction, and processing efficiency.

BSC-OS PORTAL
The BSC-OS portal consists of a set of Web applications through which the users access the system resources such as spatial data, hydrologic models, environmental scenarios, data processing tools, visualization facilities, environmental reports, and training materials (Figure 1).
There are five categories of users: data providers, Earth Science specialists, decision makers, citizens, and system administrators. A user may access the features of an individual application through local authentication, or all published applications of the portal through single sign-on authentication.
The main user tool and application categories provided by the portal are [xx]:
• Data management - provides the user with spatial data management and operations. The user may enter data and metadata, and visualize, modify, update, and remove spatial data from repositories;
• Hydrologic model management - provides the Earth Science specialists with hydrologic model configuration, scenario and model development, model calibration, and scenario running. One of the water quality models used in the enviroGRIDS project is SWAT (Soil and Water Assessment Tool) [6]. It is a model designed to estimate the impacts of land management practices on water quantity and quality in complex watersheds. The SWAT model requires specific information about the weather, soil properties, topography, vegetation, and land management practices of the watershed;
• Satellite data processing - the specialist may process satellite data and images in order to search for relevant information (e.g. land cover, vegetation, water, land use, soil composition, etc.);
• Data visualization and report - the specialists visualize various spatial data in different formats and views, and compose environmental reports for decision makers and citizens;
• Decision maker and citizen applications - provide the decision makers with interactive and graphical tools to access the private environmental reports. The user may visualize data that make statistical analysis and predictions possible;
• Virtual training center - supports the specialists in developing Earth Science oriented training materials and the users in executing the lessons.
The regular users visualize the reports generated by the specialists as results of executing environmental scenarios. The input data for the reports are built up by the specialists by running hydrological models of the Black Sea catchment area and by processing related satellite data.
All data sets required for building up the hydrological models, environmental scenarios, and spatial models are provided and entered into the system by the data providers.

DATA MANAGEMENT
The URM (Uniform Resource Management System) platform [7] allows users to search and share spatial and non-spatial information, and to establish a network that encourages a broader community to adopt and support the GEOSS concept of data sharing for a more sustainable environment. The URM Geoportal is not a single integrated solution but a set of modules and services that communicate through interoperable services defined by the OGC (Open Geospatial Consortium) and the W3C (World Wide Web Consortium). The URM Geoportal consists of four basic blocks interconnected through metadata:
1. Metadata management, supported by the MicKa toolset for editing and managing metadata for spatial information, Web services and other sources;
2. Data management, provided by the DataMan application. It supports the import, export, and management of spatial data in files or databases, for both raster (TIFF/GeoTIFF, JPEG, GIF, PNG, BMP, ECW) and vector (ESRI Shapefile, DGN, DWG, GML) data types;
3. Data visualization, provided by the MapMan software tool. It supports the publication of spatial compositions that combine locally stored data with external WMS (Web Map Service) and WFS (Web Feature Service) data services;
4. Content management for publishing in context and for connections with social networks, supported by the SimpleCMS toolset.
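As an illustration of how a client might address such an OGC service, the following minimal sketch builds a WMS 1.3.0 GetMap request URL. The endpoint, the layer name, and the bounding box are hypothetical placeholders and are not taken from the URM Geoportal; only the request parameter names follow the WMS standard.

```python
from urllib.parse import urlencode, urlparse, parse_qs


def build_getmap_url(base_url, layer, bbox, width=800, height=600,
                     crs="CRS:84", image_format="image/png"):
    """Build an OGC WMS 1.3.0 GetMap request URL.

    bbox is (min_lon, min_lat, max_lon, max_lat); CRS:84 is used so that
    the axis order is longitude first, latitude second.
    """
    params = {
        "SERVICE": "WMS",
        "VERSION": "1.3.0",
        "REQUEST": "GetMap",
        "LAYERS": layer,
        "STYLES": "",  # empty value requests the server default style
        "CRS": crs,
        "BBOX": ",".join(str(value) for value in bbox),
        "WIDTH": width,
        "HEIGHT": height,
        "FORMAT": image_format,
    }
    return base_url + "?" + urlencode(params)


# Hypothetical endpoint and layer; the extent is only a rough illustrative box.
url = build_getmap_url("https://example.org/wms", "landcover",
                       (8.0, 38.0, 46.0, 57.0))
print(url)
```

The returned URL can be opened in a browser or fetched by a map client; a WFS or WCS request could be assembled in the same way with the parameters defined by those standards.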

SWAT MODEL CALIBRATION AND SCENARIO EXECUTION
The SWAT model supports specialists in predicting, with reasonable accuracy, the effects of management decisions on water, sediment, nutrient and pesticide yields in large, ungauged river basins [8]. The data package of the model can be quite large (up to 20 thousands), and running it requires large storage capacity and high performance computation resources.

A. gSWAT Application
The gSWAT application has been developed in the enviroGRIDS project and is available through the BSC-OS Portal in order to support the development, calibration and execution of the SWAT model [9]. A Grid based computation infrastructure is the basic solution for parallel and distributed processing of the hydrological model in the gSWAT application.
It is developed as a Web application that hides the complexity of the Grid infrastructure from the user (Figure 2). The application provides support for scalable models in terms of geographical area, modeling resolution, and number of users. Multicore architecture and GPU cluster based solutions are explored as well in order to speed up and optimize the hydrological model processing [10].
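To make the idea of calibration through batches of independent simulations more concrete, the following is a minimal sketch and not the gSWAT algorithm. It assumes a toy linear rainfall-runoff function in place of a SWAT run, a simple random search that narrows the parameter ranges after each iteration, the Nash-Sutcliffe efficiency as the goodness-of-fit measure, and a local thread pool standing in for Grid worker nodes. All function and parameter names are hypothetical.

```python
import random
from concurrent.futures import ThreadPoolExecutor


def nash_sutcliffe(observed, simulated):
    """Nash-Sutcliffe efficiency: 1.0 is a perfect fit."""
    mean_obs = sum(observed) / len(observed)
    residual = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    variance = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - residual / variance


def toy_model(params, rainfall):
    """Stand-in for one model run (assumption: linear rainfall-runoff)."""
    return [params["runoff_coeff"] * r + params["baseflow"] for r in rainfall]


def calibrate(rainfall, observed, ranges, iterations=5, simulations=50,
              seed=0, workers=4):
    """Random-search calibration; each iteration runs a batch of simulations."""
    rng = random.Random(seed)
    best_params, best_score = None, float("-inf")
    for _ in range(iterations):
        # Sample one parameter set per simulation within the current ranges.
        candidates = [
            {name: rng.uniform(lo, hi) for name, (lo, hi) in ranges.items()}
            for _ in range(simulations)
        ]
        # The simulations are independent, so they can be dispatched together.
        # Threads only illustrate the pattern; a Grid would use worker nodes.
        with ThreadPoolExecutor(max_workers=workers) as pool:
            outputs = list(pool.map(lambda p: toy_model(p, rainfall), candidates))
        for params, simulated in zip(candidates, outputs):
            score = nash_sutcliffe(observed, simulated)
            if score > best_score:
                best_params, best_score = params, score
        # Halve each range around the best set found so far.
        ranges = {
            name: (max(lo, best_params[name] - (hi - lo) / 4),
                   min(hi, best_params[name] + (hi - lo) / 4))
            for name, (lo, hi) in ranges.items()
        }
    return best_params, best_score


rainfall = [0.0, 5.0, 12.0, 3.0, 0.0, 8.0, 20.0, 1.0]
observed = toy_model({"runoff_coeff": 0.6, "baseflow": 2.0}, rainfall)
params, score = calibrate(rainfall, observed,
                          {"runoff_coeff": (0.0, 1.0), "baseflow": (0.0, 5.0)})
print(params, score)
```

The number of iterations and the number of simulations per iteration are the two knobs that trade computation for calibration quality, which is why the independent runs within one iteration are a natural unit for distributed execution.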
B. SWAT Oriented Services
gSWATSim is a server side extension of the gSWAT platform, exposed as a collection of REST Web Services that let the user create new projects (i.e. new scenarios), modify project information (e.g. project name, description, etc.), run environmental scenarios, upload results to the visualization module (i.e. BASHYT), and view the execution status of scenarios.

C. SWAT Model Development and Running
The hydrological model can be developed, calibrated and run through several approaches based on gSWAT, the gSWATSim services, and the BASHYT platform. The specialist can use one of the following solutions:
1. gSWAT application - The specialist develops the SWAT model using the ArcSWAT and ArcView tools on the desktop. Using the gSWAT application, the user uploads the model to the gSWAT server and calibrates it interactively [9]. The user controls the convergence toward the optimal calibration (i.e. parameters, simulations, and iterations) through interactive techniques provided by the Web graphical user interface (Figure 2). Finally, the user may download the resulting calibrated model.
2. gSWAT and BASHYT tools - The applications "work together" in separate working sessions that are connected only at the data level. The main advantage of this solution is the independence of the tools. The user develops the SWAT model in BASHYT and then downloads the archived SWAT files and metadata. The calibration follows in gSWAT, as in the first solution. Finally, the user uploads the results into BASHYT and visualizes the environmental information.

Fig. 2 - Detailed visualization of the gSWAT calibration results
3. gSWATSim services - The applications work together through a common Storage Element and dedicated Web Services. The working session takes place in BASHYT, where the user develops the model and defines the scenario. The user exports the data for gSWATSim onto the Storage Element; the dedicated Web Service then customizes the execution environment and executes the scenario. The progress of the scenario execution can be monitored directly in BASHYT. After the execution, the results are automatically available in BASHYT for visualization. In this solution the user does not need to switch between the applications: BASHYT accesses the functionality made available through the gSWATSim services, which allows the execution and monitoring of running scenarios.
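The interaction in the third solution (create a project, start a scenario, poll its status) can be sketched as a small REST client. This is only an illustration: the base URL, the resource paths, and the JSON field names below are hypothetical and do not describe the actual gSWATSim interface. The requests are only constructed, not sent.

```python
import json
from urllib.request import Request

# Hypothetical service root; the real gSWATSim address is not given here.
BASE_URL = "https://gswatsim.example.org/api"


def create_project_request(name, description):
    """Request that would create a new project (i.e. a new scenario)."""
    body = json.dumps({"name": name, "description": description}).encode("utf-8")
    return Request(f"{BASE_URL}/projects", data=body, method="POST",
                   headers={"Content-Type": "application/json"})


def run_scenario_request(project_id):
    """Request that would start the execution of a scenario."""
    return Request(f"{BASE_URL}/projects/{project_id}/run", data=b"",
                   method="POST")


def status_request(project_id):
    """Request that would return the execution status of a scenario."""
    return Request(f"{BASE_URL}/projects/{project_id}/status", method="GET")


request = create_project_request("danube_baseline", "Baseline scenario")
print(request.get_method(), request.full_url)
```

Passing one of these objects to `urllib.request.urlopen` would send it; a client such as BASHYT could call the status request periodically to display the progress of a running scenario.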

SATELLITE IMAGE PROCESSING
Satellite images can reveal information on land cover, precipitation, geographic areas, pollution, and natural phenomena. Spatial and environment related data can be acquired by imagery classification, which is in effect a form of data mining across the multispectral bands. It is a multivariable process that takes into account the satellite image type (e.g. MODIS, Landsat), the particular geographic area, soil composition, vegetation cover, and the general context (e.g. clouds, snow, and season). All these specific and variable conditions require flexible tools and applications to support an optimal search for the appropriate solutions.
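As one simple illustration of deriving land cover information from multispectral bands, the following sketch computes the normalized difference vegetation index, NDVI = (NIR - Red) / (NIR + Red), for each pixel and assigns a coarse class by thresholding. The thresholds and class names are illustrative assumptions, not values from the project; a real classification would be tuned to the sensor, the area, and the season.

```python
def ndvi(red, nir):
    """Normalized difference vegetation index for one pixel."""
    total = nir + red
    if total == 0:
        return 0.0  # avoid division by zero on empty pixels
    return (nir - red) / total


def classify_pixel(red, nir, water_max=0.0, vegetation_min=0.3):
    """Assign a coarse class from NDVI (thresholds are assumptions)."""
    value = ndvi(red, nir)
    if value < water_max:
        return "water"
    if value >= vegetation_min:
        return "vegetation"
    return "bare_or_sparse"


def classify_image(red_band, nir_band):
    """Classify two equally sized bands given as lists of rows."""
    return [
        [classify_pixel(red, nir) for red, nir in zip(red_row, nir_row)]
        for red_row, nir_row in zip(red_band, nir_band)
    ]


# Tiny synthetic 2x2 example with made-up reflectance values.
red_band = [[0.10, 0.30], [0.20, 0.05]]
nir_band = [[0.50, 0.10], [0.25, 0.45]]
print(classify_image(red_band, nir_band))
```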
The basic platforms supporting the development of Grid oriented applications for satellite image processing are ESIP (Environment oriented Satellite Data Processing Platform) and gProcess [11]. ESIP supports a flexible, workflow based description of complex satellite image processing over the Grid. ESIP also includes the gridified GRASS functionality [12]. The gProcess platform supports the management and execution of workflows (i.e. task distribution, management of parallel and sequential tasks) over the Grid infrastructure.
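The distinction between parallel and sequential tasks in a workflow can be illustrated with a minimal sketch that groups the tasks of a dependency graph into stages: tasks in the same stage have no unmet dependencies and could be distributed at once, while the stages themselves run in sequence. This is not the gProcess scheduler; the task names are hypothetical, and the sketch relies on the standard `graphlib` module (Python 3.9 or later).

```python
from graphlib import TopologicalSorter


def parallel_stages(dependencies):
    """Group workflow tasks into stages that can run in parallel.

    dependencies maps each task to the set of tasks it depends on.
    A cyclic workflow raises graphlib.CycleError.
    """
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()
    stages = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # tasks whose inputs are complete
        stages.append(ready)
        sorter.done(*ready)  # mark the whole stage as finished
    return stages


# Hypothetical image processing workflow.
workflow = {
    "load_red": set(),
    "load_nir": set(),
    "compute_ndvi": {"load_red", "load_nir"},
    "classify": {"compute_ndvi"},
    "assess_accuracy": {"classify"},
}
for number, stage in enumerate(parallel_stages(workflow), start=1):
    print(number, stage)
```

In this example the two load tasks form the first stage and could run on different nodes, while the remaining tasks form a sequential chain.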
The ESIP based applications have been developed according to the methodology reported in [13]. The BSC-OS portal publishes the GreenLand end user application, which is accessible through Web browsers (Figure 3). The GreenLand application is layered on the gProcess and ESIP platforms and extends the satellite image processing functionality:
• Scalability in terms of number of users, number of projects, and number of workflows;
• Use of OGC Web services to search, visualize, fetch, and store the satellite images;
• Interoperability between GreenLand and URM, supported by standard OGC services (e.g. WMS, WCS, and WFS);
• Publication of satellite data through OGC services provided by GeoServer and registered on the URM server;
• Publication of the GreenLand functionality and operators as WPS services (e.g. NDVI, EVI, and Accuracy Assessment);
• Two editors that support the development of Basic Operators and Workflows. The first editor includes into the GreenLand platform the

DATA VISUALIZATION AND REPORTING
BASHYT (The Basin Scale Hydrological Tool) [14] is a Web based interface to SWAT that works together with ArcSWAT and AvSWAT [15]. It can be used to manage many watersheds/scenarios at once and exposes on the Web a template to produce environment oriented applications. The applications can be edited directly through the browser. BASHYT implements the Driving forces-Pressures-States-Impacts-Responses paradigm and is able to produce reports on environmental states that can be visualized in different ways.
In BASHYT the SWAT models are stored in a relational database. A preprocessing step is required to import raw data (vector, raster and tabular data) into the system. After the SWAT models are imported, BASHYT can offer tables, charts, and maps to the end users in a transparent way.
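The step from model output stored in a relational database to a table for a report can be sketched as follows. The schema, the scenario names, and the values are invented for illustration and do not reflect the BASHYT database; SQLite is used only because it ships with Python.

```python
import sqlite3


def build_demo_database():
    """Create an in-memory database with a hypothetical output table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE subbasin_output ("
        "scenario TEXT, subbasin INTEGER, day TEXT, flow_m3s REAL)"
    )
    rows = [  # made-up daily flows, not model results
        ("baseline", 1, "2010-01-01", 10.0),
        ("baseline", 1, "2010-01-02", 14.0),
        ("baseline", 2, "2010-01-01", 3.0),
        ("baseline", 2, "2010-01-02", 5.0),
        ("land_use_change", 1, "2010-01-01", 12.0),
    ]
    conn.executemany("INSERT INTO subbasin_output VALUES (?, ?, ?, ?)", rows)
    conn.commit()
    return conn


def mean_flow_by_subbasin(conn, scenario):
    """Return (subbasin, mean flow) rows for one scenario, ready for a table."""
    cursor = conn.execute(
        "SELECT subbasin, AVG(flow_m3s) FROM subbasin_output "
        "WHERE scenario = ? GROUP BY subbasin ORDER BY subbasin",
        (scenario,),
    )
    return cursor.fetchall()


conn = build_demo_database()
for subbasin, mean_flow in mean_flow_by_subbasin(conn, "baseline"):
    print(subbasin, mean_flow)
```

The same aggregated rows could feed a chart or be joined to sub-basin geometries for a map, which is the kind of presentation layer the end user sees without touching the stored model.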

CITIZENS ORIENTED APPLICATIONS
Two demonstrator Web applications for citizens have been developed within the enviroGRIDS project and are available through the BSC-OS Portal. The first application concerns near real time dissemination of environmental data to citizens: a flood forecasting demonstrator applied to the Somes Mare catchment in northern Romania. The second application concerns long term planning in river basins: a demonstrator for planning remediation strategies for flooding, sediment and ecosystem problems along the Danube River section between the towns of Braila and Tulcea.
The first application is supported by the HEC-HMS [16] hydrological model executed over the Grid infrastructure, and the second one is supported by the SOBEK 1D/2D [17] hydrodynamic model of flow and sediment transport. Geospatial data is available through the enviroGRIDS URM Portal by standard OGC services, while for water-related time series data the emerging WaterML standard is used.
On the client side, the main interfaces of both applications are map based (e.g. OpenLayers, Google Maps, and Google Earth platforms). Additional data are overlaid on the map as spatially distributed data or as point data containing time series of modeled results.

TRAINING MATERIALS
The BSC-OS Portal provides a virtual training center based on eGLE (GiSHEO eLearning Environment), developed initially through the GiSHEO project [18]. The generic users of the training system are the teacher and the student. The teacher is the Earth Science specialist who authors teaching materials and coordinates the training sessions. The student is the trainee who accesses the teaching objects, organized in lessons, in order to view presentations, experiment with algorithms on spatial data, process satellite images, execute environmental scenarios, and visualize reports already prepared by the specialists.
The teaching material is built as lessons in terms of templates, patterns, and tools. The Earth Science related content of the lessons may be fixed or dynamically fetched from data repositories through standard OGC services such as WMS and WCS (Figure 4).
The teacher may use the Grid based execution to process satellite images, to execute specific algorithms through workflow descriptions, or to visualize previously created teaching resources (i.e. already processed satellite images, geographical maps, diagrams, algorithm workflow descriptions, etc.). The students can only execute the lessons according to the constraints established by the teacher. Depending on the specified interaction level, they may also be allowed to describe and experiment with new workflows (i.e. algorithms, scenarios) or to choose different input data (e.g. satellite images, parameters) for existing ones.

CONCLUSIONS
The development of the BSC-OS portal, and more generally the research in the enviroGRIDS project, has revealed many challenges: gathering data into a dedicated SDI; interoperability between Geospatial and Grid infrastructures; connections through standard OGC services; interoperability between platforms developed by different partners (e.g. URM, gSWAT, ESIP, gProcess, GreenLand, gLite, BASHYT, and eGLE); the huge spatial data sets involved in the development of hydrological models and environmental scenarios (e.g. Danube, Mosaic, Black Sea Catchment, Istanbul, and Rioni River in Georgia); security and access management across different platforms; and application development in distributed and heterogeneous systems.
Another issue the portal development has to face is compatibility with new technologies and functional requirements. One main concern is compatibility with the new European Middleware Initiative (EMI), which aims to improve and standardize the dominant existing middleware in order to produce one simplified and interoperable middleware [19]. EMI attempts to unify several Grid platforms such as ARC, gLite, UNICORE and dCache. The EMI and Globus platforms will provide the EGI (European Grid Infrastructure) with more stable, usable and manageable software. The main future work aims to develop extended and high resolution models and scenarios, to improve the functionality of the tools and applications, and to improve the techniques for user interaction with spatial data models.
Service oriented architecture, multicore and GPGPU based systems, and Cloud processing are other technologies being explored in order to extend the scalability, interoperability, standard connectivity, functionality, and usability of end user applications, and to improve system efficiency and data processing performance.