YoctoDB: A Partitioned Immutable Embedded Database

Authors:
Vadim Tsesko (Yandex.Classifieds), Svyatoslav Demidov (Yandex.Classifieds)

CEE-SECR '16: Proceedings of the 12th Central and Eastern European Software Engineering Conference in Russia, October 2016, Article No.: 3, Pages 1–10, https://doi.org/10.1145/3022211.3022214

Published: 28 October 2016

ABSTRACT

YoctoDB is a small embedded engine for extremely fast partitioned immutable-after-construction databases. Several high-load services at Yandex.Classifieds implement pipelined partitioned data reindexing. The result of the reindexing process is an immutable index that is delivered to many search machines, reopened as part of the composite index, and queried when serving user requests. Read performance, memory consumption, fast reopening, and reproducible latencies are paramount for the database engine. YoctoDB has successfully provided a solution for all of these services. We describe the role of YoctoDB in the architecture of the indexing and search components, its simple data model, client API, design, implementation, and use cases. We conclude the paper with the limitations of the approach and directions for future development.

Published in

CEE-SECR '16: Proceedings of the 12th Central and Eastern European Software Engineering Conference in Russia
October 2016
102 pages
ISBN:9781450348843
DOI:10.1145/3022211

Copyright © 2016 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
Database
embedded
high load
horizontal partitioning
immutable
low latency
parametric search
read-only
real-time
sharding
Qualifiers
- research-article
- Research
- Refereed limited