The role of spatial cyberinfrastructure in geoinformation systems

This article considers the potential future of spatial cyberinfrastructure, which combines advanced computing, information and communication technologies to expand the possibilities of computational and information science practice and to improve the synthesis and analysis of scientific data through collaboration and sharing. Spatial cyberinfrastructure represents a paradigm shift in scientific research: it has given researchers easy access to computational utilities and has eased collaboration across distances and disciplines, allowing scientific breakthroughs to be achieved faster and more efficiently. It seeks to address the long-standing problems of processing and analyzing massive, heterogeneous spatial datasets, as well as the need for, and benefits of, flexible and secure spatial data exchange. The focus is on spatial analysis that uses distributed and high-performance computing, sensor networks and other advanced information technology capabilities to transform massive spatial datasets into information and knowledge. For large-scale and multiscale problems, and despite the complexity of geographic space, spatial cyberinfrastructure makes it possible to turn spatial analysis into powerful and affordable computing utilities that support broad scientific breakthroughs, including the creation of the well-known Digital Earth project. Critical analysis and recommendations are provided on the handling of conflicting data that researchers do not want to exclude from various data models and geographic scales. A case study shows the value of spatial cyberinfrastructure in addressing the challenges that arise from spatial, multilingual, biographical and temporal ambiguities in various databases; these solutions would not be possible without spatial cyberinfrastructure.


Introduction
The term cyberinfrastructure (CI) was coined by the National Science Foundation Blue-Ribbon Committee [1] to reflect how traditional ways of doing scientific research (e.g. experiments in the lab, observation in the field, processing and analysis on a single computer, and even calculations on the back of a sheet of paper) are being improved and even revolutionized by the integrated capabilities of high-performance computers, storage and visualization facilities for very large datasets, digitally supported sensors and instruments in the environment, virtual organizations for collaborative problem solving, and interoperable suites of software services and tools [2]. The world of scientific publishing is also being transformed as part of the evolution of CI [3].
Examples of discoveries enabled by CI include abrupt transitions in Earth's climate and ecosystem dynamics, previously unknown properties of minerals at extreme temperatures and pressures deep within the Earth, simulations of the development of the early universe, improved ocean models, a better understanding of individual and group behavior and its relationship to social, economic and political structures, and the creation of a genetic map of connections between people [2,4,5]. Computation, along with theory and experiment, has become the "third pillar" of science and engineering [6]. In addition, scientific discovery requires the computational ability to synthesize and analyze very large datasets that span the biological, physical and social sciences and engineering, as well as the interface between science and technology, where data-intensive science has been defined as the "fourth paradigm" [5]. CI has evolved to encompass not only hardware and software but also an emerging research area of its own in knowledge-intensive computing and digital libraries [5][6][7][8][9], with many countries investing in CI research and development [10][11]. Although high-performance computing has become affordable and simulation tools have advanced significantly, many challenges remain in efficiently integrating field stations that contain thousands of instruments, millions of users and petabytes of data with analysis capability [5].
Spatial CI is a new term in the literature [14][15][16]. It is defined as a specific type of CI that synergistically integrates the capabilities of CI, geographic information systems (GIS) [17,18] and spatial analysis [19,20] to support geospatial problem solving and decision making. By spatial or space, we mean both real, physical space (i.e. on the surface of the Earth, in the atmosphere or in the ocean) and virtual space (i.e. digital worlds, or an understanding of how and where computers are connected around the world). Almost all knowledge about the world can be classified in terms of space (location, area, distance or spatial interaction) as well as time. However, whereas time is divided into globally understood units (seconds, hours, years, and so on), spatial units and their relationships are much more complex: they are multidimensional (for example, x, y and z), exist at multiple scales and resolutions, are often heterogeneous (even within the representation of a single variable) and always change over time. Without a clear understanding of space, any related models, structures and hypotheses can be misleading, especially those concerning relationships between variables. In particular, the complexity of geographic space creates significant computational and intellectual challenges in distributed access to spatial data, in the sharing and analysis of government-funded spatial data infrastructures [21] and in the geospatial semantic web [22] (i.e. locating and integrating information without human intervention, including enabling geographic information to be searched within web pages), all of which are part of spatial CI. Many of these problems are already well known to those working with spatial data, and a number of non-spatial CI approaches have emerged to address them.
Spatial CI goes beyond these existing approaches by anchoring solutions in richer representations that combine mathematical and statistical models [23][24][25][26] and by forging closer collaboration between computer science, the information sciences and domain disciplines such as geography, geology and geophysics, oceanography, ecology, engineering and the environmental sciences, as well as the social sciences [5,8,27,28].
This interdisciplinary collaboration enables new knowledge systems that are finally leading to the partial realization of the Digital Earth [29], embodied in products such as Google Earth, Microsoft Bing Maps and the National Aeronautics and Space Administration (NASA) WorldWind. The flood of spatial data that will be collected at an accelerating rate for the foreseeable future from networks of sensors, satellites and even cell phones is driven by the vast needs of the aforementioned areas, and it cannot be well used or well understood unless it is properly managed, analyzed and shared through spatial CI. The dynamic nature of Earth systems (such as waves, tides, atmospheric turbulence and movements in the Earth's crust) further complicates efforts to measure them accurately. Massive datasets are also common in the spatial analysis of human systems, including demographic and transportation systems, risk assessment, disease vectors, human mobility, and more. Spatial analysis itself (in a broad sense that includes spatial modeling) traditionally covers various approaches, including, but not limited to, spatial statistics [30,31], heuristics and optimization [32,33], and modeling for spatial problem solving and decision making [34,35]. These methods have been widely applied in many areas [36][37][38][39], but they have been difficult to implement for large and multiscale problems that are computationally intensive and require collaborative input. This limitation persists despite the advances already made in dealing with the complexity of geographic space discussed earlier. Spatial CI promises to remove this limitation and thus transform spatial analysis into powerful and accessible computational utilities that enable broad scientific breakthroughs.
Spatial CI has also proven invaluable in the analysis of errors that propagate from measurements, and it facilitates the development of better models for error representation, propagation and control in large distributed computing networks [40]. Linking CI with spatial thinking and geographic approaches offers a promising avenue for solving scientific problems and improving decision-making practices that have a significant impact on society (for example, assessing the impact of global climate change, understanding the complexity of coupled human and natural systems, maintaining ecosystem services, preserving and providing access to digital resources in the humanities and social sciences, and managing transport infrastructure). This article does not cover all aspects and current interests of spatial CI, because the field is growing rapidly. It is, however, representative of current research on the long-standing challenges posed by the complexity of spatial datasets and spatial analysis, and on the need for, and benefits of, flexible and secure spatial data exchange. It highlights some of the discoveries and insights that have been made and that could not easily have been achieved without spatial CI.

Materials and research methods
The first example concerns technical processing. Spatial principles govern the interaction of various parameters and phenomena in physical geographic studies (for example, of the Earth's lithosphere, hydrosphere and atmosphere, and in global models of flora and fauna) [41]. The main contribution here is the development of an architecture and algorithms for distributed processing of geographic information within spatial CI, aimed at improving the understanding of ecosystem dynamics and the prediction of the onset and extent of dust storms in the southwestern United States. As a result of these experiments, scientists were able to predict the onset of dust storms at higher resolution (3×3 km) and over longer time periods (5-10 days).
The second example is a set of methods and software tools for integrating multiscale, multisource and multidisciplinary oceanographic data, which began to evolve after several recent research expeditions to the Antarctic [40]. The initial goal was to explore several scientific hypotheses about sea ice movement and to track iceberg meltwater plumes, but an important parallel effort was to create a near-real-time geospatial framework for decision making. Building a spatial CI to support this framework led to a sampling design optimized to cover smaller areas in relation to the larger scale of the study area. This sampling strategy overcame the limitations of traditional sampling methods (i.e. using a research ship as a static platform for sampling a single parameter across stations) and allowed faster characterization of the ocean surface using multiple data streams from the sea and from space, simultaneously on multiple spatial and temporal scales. Without spatial CI, it would not have been possible to directly observe and characterize meltwater plumes from individual icebergs, and these individual results could have been integrated with regional and global data far less effectively. The results provide insight into the impact of iceberg meltwater on the carbon flux from the ocean surface to ocean floor sediments, as well as into the role of icebergs in controlling biological productivity in the Weddell Sea. They also illustrate the importance of spatial CI for science in general and highlight key architectural considerations in the development of current and future Earth observation systems, especially as oceanographers and other Earth scientists move into the era of petascale computing.
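The idea of distributed processing of gridded geographic data can be illustrated with a minimal Python sketch. It is not the architecture or the algorithm of the cited work: the function names, the tile size and the per-tile computation (a simple mean) are assumptions made only for illustration, and a local thread pool stands in for the distributed computing nodes of a real spatial CI. The sketch splits a raster grid into tiles, analyzes each tile independently and collects the results.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, Iterator, List, Tuple

# A tile is a half-open index window: (row_start, row_end, col_start, col_end).
Tile = Tuple[int, int, int, int]


def make_tiles(n_rows: int, n_cols: int, tile_size: int) -> Iterator[Tile]:
    """Yield index windows that together cover an n_rows x n_cols grid."""
    for r in range(0, n_rows, tile_size):
        for c in range(0, n_cols, tile_size):
            yield (r, min(r + tile_size, n_rows), c, min(c + tile_size, n_cols))


def tile_mean(grid: List[List[float]], tile: Tile) -> Tuple[Tile, float]:
    """Hypothetical per-tile analysis: the mean of the cells in the tile."""
    r0, r1, c0, c1 = tile
    values = [grid[r][c] for r in range(r0, r1) for c in range(c0, c1)]
    return tile, sum(values) / len(values)


def process_grid(
    grid: List[List[float]], tile_size: int, max_workers: int = 4
) -> Dict[Tile, float]:
    """Analyze every tile independently and gather the results by tile."""
    tiles = list(make_tiles(len(grid), len(grid[0]), tile_size))
    # Threads are a stand-in here; in a real deployment each tile would be
    # dispatched to a separate compute node.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return dict(pool.map(lambda t: tile_mean(grid, t), tiles))


if __name__ == "__main__":
    # Small synthetic 4 x 6 grid; real input would be a model or sensor raster.
    demo = [[float(r * 6 + c) for c in range(6)] for r in range(4)]
    for tile, mean in process_grid(demo, tile_size=3).items():
        print(tile, mean)
```

Because each tile is processed independently, the same decomposition could in principle be spread over many machines; analyses that need neighboring cells would additionally require overlapping tile borders, which this sketch omits.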

Results and discussion
The aim here is to show that spatial CI is not only about using hardware and software or enabling the physical sciences, but also about providing knowledge that serves the needs of the social sciences and humanities, as well as of multiple stakeholders and decision makers among citizens from various social, economic and political backgrounds. The creation of a CI is both a social and a scientific undertaking. One example is spatial CI that brings together the Chinese Biographical Database (the largest in the world), the Chinese Historical Geographic Information System (part of China's original Electronic Cultural Atlas Initiative) and the McGill-Harvard-Yenching Library Ming Qing Women's Writings database [42]. The research focuses in general on CI for humanities data, and in particular on spatial CI that aids research into Chinese women writers, their kinship ties, their publishing locations, and their literary and social communities. Critical analysis and recommendations are provided on the related issue of conflicting data that researchers do not want to exclude from different data models and geographic scales. This case study shows the value of spatial CI in addressing the difficulties that arise from spatial, multilingual, biographical and temporal ambiguities in these databases; these solutions would not be possible without spatial CI. Big science will continue to need small independent researchers to be the "engine of innovative research", and widespread adoption of CI will allow the two approaches to be reconciled. Because geographers, sociologists and geoinformatics scientists are actively involved in spatial CI as users, there is a great opportunity to make spatial CI a truly user-centric enterprise [43]. Spatial CI should make room not only for scientists who will use cyber tools to collaborate at a distance, but also for educators who will teach with CI.
This also applies to scientists who will provide data and information to CI projects on some of the most important scientific issues of the day, such as global climate change.

Conclusion
Citizens, along with professional scientists, can now increasingly take part in ubiquitous cloud computing, which uses a wide-ranging architecture to manage the lifecycle of virtual machines and data archives for everything from a personal address book to the largest multidimensional, multidisciplinary scientific simulation systems. However, while public computing centers are establishing computer networks, commercial cloud providers (Microsoft, Amazon and Google) instead focus on delivering infrastructure as a service, software as a service, and so on. Huge commercial investments in clouds will make them dominant in large-scale computing hardware and software over the next decade [44][45][46][47]. Spatial CI is an important subset of the more general CI, covering both computationally intensive and interdisciplinary usage requirements such as service hosting, virtual computing environments and virtual datasets. The specific requirements of spatial CI are well suited to many common cloud capabilities, which warrants further fundamental and empirical exploration. Indeed, the concept of space, which is treated as special in CI, introduces several interesting research problems for physical and social scientists alike. Many geographic applications are interdisciplinary and involve multiple stakeholders and decision makers from diverse social, economic and political backgrounds, which makes collaboration very important but challenging.
Spatial CI will follow the path of GIS and eventually become as transparent as GIS is becoming in the world of Google Maps and Google Earth. As the size of spatial datasets and the complexity of spatial analysis and modeling continue to increase, and as the demand for virtual collaboration in scientific research grows, scalable spatial CI becomes even more important and timely. The value of spatial CI stems from the complexity of the problems, the danger of errors that can propagate, the deep need to develop solutions that will benefit many societal domains, the ongoing vision of access to a complete Digital Earth, and the next generation of GIS, known as CyberGIS, with integrated high-performance, distributed and collaborative capabilities.

Acknowledgements
This study was supported by the Russian Federation State Task № FNNN-2021-0005.