Abstract
At CIDR 2009, we presented a collection of requirements for SciDB, a DBMS that would meet the needs of scientific users. These included a nested-array data model, science-specific operations such as regrid, and support for uncertainty, lineage, and named versions. In this paper, we present an overview of SciDB's key features and outline a demonstration of the first version of SciDB on data and operations from one of our lighthouse users, the Large Synoptic Survey Telescope (LSST).
Index Terms
- A demonstration of SciDB: a science-oriented DBMS