Elsevier

Future Generation Computer Systems

Volume 74, September 2017, Pages 232-240
Research and implementation of a distributed transaction processing middleware

https://doi.org/10.1016/j.future.2016.01.021

Highlights

  • A middleware-level distributed system is implemented to improve the performance of transaction processing.

  • By extending Berkeley DB with data partitioning, this paper overcomes its lack of support for parallel writes across multiple nodes.

  • The middleware monitors the nodes of the distributed database system to ensure correct execution and migration of transactions.

Abstract

A growing volume of transactional requests calls for high-performance transaction processing systems. A distributed transaction processing system can outperform a traditional single-node system, but because it spans multiple nodes it must pay more attention to availability. For example, Berkeley DB performs well on a single node, but it does not support parallel writes across multiple nodes, which limits its availability and scalability in a distributed environment. This paper designs and implements a middleware-level distributed transaction processing system called POST. POST comprises POSTBOX, a distributed database system based on Berkeley DB and data partitioning, and POSTMAN, a distributed transaction processing middleware. POSTBOX inherits the availability of highly available Berkeley DB and extends it with data partitioning. Through the Partition Replication Body (PRB), POSTBOX overcomes the native weakness of highly available Berkeley DB, namely its lack of support for parallel writes across multiple nodes. POSTMAN is a middleware adapted to the PRB. It monitors POSTBOX in real time via the Partition Replication Body State Array (PRBSA) and ensures correct transaction processing and transaction migration when a node fails. Test results show that POST is highly available and clearly improves write performance over highly available Berkeley DB.

Introduction

Historically, OnLine Transaction Processing (OLTP) [1] refers to submitting traditional transactions, such as ordering goods or transferring payments, to an OLTP system built on a Relational DataBase Management System (RDBMS). With the rapid development of the Internet and Internet applications, transactions have changed, and one of the most significant changes is the explosive growth of transaction throughput [2]. For instance, a popular web-based multi-user game can produce a large number of interactions within one second, and the growing use of smartphones and other mobile terminals has given rise to mobile transactions. These Internet applications produce more transaction requests than a traditional OLTP system can handle, and it is difficult for an RDBMS to deal with highly concurrent transaction requests. In addition, an RDBMS does not support online expansion or distribution very well. For example, an SQL query [3] on a table with a massive number of records in an RDBMS takes a long time. Although this can be mitigated by splitting data and tables, doing so complicates programming, data backup, database expansion, and other tasks. The most direct way to improve system performance is to purchase a more powerful machine, but the higher cost is often prohibitive for most enterprises.

Distributing data and load across multiple nodes with a distributed database system [4], [5] is an effective way to improve the performance of a transaction processing system. Distributed systems are being deployed more and more widely, especially with the expansion of cloud computing and big data [6], [7]. Cloud computing [8], [9] uses low-cost servers instead of expensive machines as its hardware infrastructure platform, and obtains high availability and scalability through redundancy between nodes. Berkeley DB [10] is a powerful key/value database engine. Its high availability version (referred to as highly available Berkeley DB) provides a distributed database solution based on master–slave replication, with high availability and better read scalability. Berkeley DB provides full ACID [11] transactional guarantees. As a result, highly available Berkeley DB can be applied not only where data consistency requirements are relaxed (for example, state updates of social network users need not be synchronized immediately across the entire application), but also where they are strict, such as in financial systems or order processing systems, which cannot tolerate giving up transactions or data consistency [12].

Although highly available Berkeley DB supports high availability and read scalability, it lacks write scalability; that is, it does not support parallel writes across multiple nodes. In addition, building a truly distributed database system is considerably harder than building a centralized or client/server system, because distributed database systems may suffer multi-node failures and data storage security problems [13], [14], and inter-node communication is relatively complex. Middleware is an effective way to address fault tolerance, communication difficulties, and other problems of distributed database systems. A well-designed distributed transaction processing middleware [15], [16], [17], [18] can manage a distributed database system effectively and reduce programming difficulty.

In response to these problems, this paper designs and implements a distributed transaction processing middleware system called POST, which consists of POSTBOX, a distributed database system based on Berkeley DB and data partitioning, and POSTMAN, a distributed transaction processing middleware. POSTBOX extends highly available Berkeley DB with partitioning and, via the Partition Replication Body (PRB), overcomes the problem that Berkeley DB does not support parallel writes across multiple nodes. POSTMAN is deployed on top of POSTBOX and fully adapted to the PRB of POSTBOX; it provides an access interface between the application and POSTBOX. POSTMAN monitors the status of each POSTBOX node through the Partition Replication Body State Array (PRBSA) and, through an efficient scheduling mechanism, ensures correct execution and migration of transactions when a node fails.

The rest of this paper is organized as follows. Section 2 describes the high availability of Berkeley DB. Section 3 describes the system architecture of POST and introduces POSTBOX, the distributed database system based on Berkeley DB and data partitioning, and POSTMAN, the distributed transaction processing middleware. Section 4 analyzes the availability and performance of the POST system. Section 5 presents experimental results and analysis. Section 6 reviews related work. Finally, Section 7 concludes the paper.

High availability of Berkeley DB

Berkeley DB achieves high availability through the replication group. A replication group is a collection of Berkeley DB environments distributed across different physical nodes. A node in a replication group is in one of the following three states:

  • (1) Master: The “Master node” is chosen by a simple majority of electable nodes. It can process both read and write transactions.

  • (2) Replica: A “Replica node” communicates with a Master node via a replication stream, which is used to keep track of changes made at the Master
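
The "simple majority of electable nodes" rule for choosing a Master can be illustrated with a small vote count. This is only a sketch of the majority test: the function names and the vote map are hypothetical, and the actual Berkeley DB election protocol is more involved and is not reproduced here.

```python
from typing import Dict, Optional


def has_majority(votes_for: int, electable_nodes: int) -> bool:
    """True when votes_for is a simple majority of the electable nodes."""
    return votes_for > electable_nodes // 2


def elect_master(votes: Dict[str, str], electable_nodes: int) -> Optional[str]:
    """Return the candidate backed by a simple majority, or None.

    votes maps a voter's node name to the candidate it voted for.
    Both names are hypothetical labels used only for this illustration.
    """
    tally: Dict[str, int] = {}
    for candidate in votes.values():
        tally[candidate] = tally.get(candidate, 0) + 1
    for candidate, count in tally.items():
        if has_majority(count, electable_nodes):
            return candidate
    return None
```

With three electable nodes, two votes for the same candidate are enough to elect it; with four electable nodes, two votes are not, so a group split in half cannot elect a Master.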

POST

As shown in Fig. 3, POST is composed of POSTBOX, a distributed database system based on highly available Berkeley DB and data partitioning, and POSTMAN, a distributed transaction processing middleware. POSTMAN provides an application programming interface to the application program, and POSTBOX provides the underlying call interface to POSTMAN.
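
To make the division of labor concrete, the following sketch models several partitions, each with one writable master, and a router that tracks node states and sends each write to the master of the key's partition. It is an illustration under stated assumptions, not the paper's implementation: the class names, the CRC32 hash partitioning, and the "promote the first live replica" rule are all hypothetical stand-ins, and the per-partition state snapshot only approximates the role the text assigns to the PRBSA.

```python
import zlib
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, List


class NodeState(Enum):
    MASTER = "master"
    REPLICA = "replica"
    DOWN = "down"


@dataclass
class PRB:
    """One partition: a replication group with a single writable master."""
    nodes: Dict[str, NodeState]
    store: Dict[str, str] = field(default_factory=dict)

    def master(self) -> str:
        for name, state in self.nodes.items():
            if state is NodeState.MASTER:
                return name
        raise RuntimeError("no master available in this partition")

    def fail(self, name: str) -> None:
        """Mark a node as down; if it was the master, promote a replica.

        Promotion of the first live replica is an assumption standing in
        for a real election.
        """
        was_master = self.nodes[name] is NodeState.MASTER
        self.nodes[name] = NodeState.DOWN
        if was_master:
            for other, state in self.nodes.items():
                if state is NodeState.REPLICA:
                    self.nodes[other] = NodeState.MASTER
                    break


class Router:
    """Middleware stand-in: routes each write to the right partition."""

    def __init__(self, prbs: List[PRB]) -> None:
        self.prbs = prbs

    def state_array(self) -> List[Dict[str, NodeState]]:
        """Snapshot of node states, one entry per partition."""
        return [dict(prb.nodes) for prb in self.prbs]

    def partition_of(self, key: str) -> int:
        # Hash partitioning is assumed here for illustration only.
        return zlib.crc32(key.encode("utf-8")) % len(self.prbs)

    def write(self, key: str, value: str) -> str:
        """Apply the write and return the name of the node that served it."""
        prb = self.prbs[self.partition_of(key)]
        node = prb.master()
        prb.store[key] = value
        return node
```

Because each partition has its own master, writes to keys in different partitions can be served by different nodes, which is the source of write scalability in this model. When a master is marked as failed, later writes to that partition are served by the promoted replica.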

Analysis of POST

The CAP theorem [19], [20] was proposed by Brewer. As shown in Fig. 6, it states that no distributed system can satisfy Availability, Consistency, and Partition Tolerance at the same time; when any two of them are guaranteed, the third must be sacrificed. POST follows the CAP theorem: it provides high availability and partition tolerance, and supports eventual consistency. This consistency model does not guarantee that all Replica nodes are consistent at the moment the Master node commits.
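
Eventual consistency can be seen in a toy model where the master commits locally and replicas apply the change later. This is a minimal sketch with hypothetical class names and an in-memory queue standing in for the replication stream; it is not the replication mechanism of Berkeley DB or POST.

```python
from collections import deque
from typing import Deque, Dict, List, Tuple


class Replica:
    def __init__(self) -> None:
        self.data: Dict[str, str] = {}
        # Changes received from the master but not yet applied.
        self.pending: Deque[Tuple[str, str]] = deque()

    def apply_pending(self) -> None:
        """Apply queued changes in the order the master committed them."""
        while self.pending:
            key, value = self.pending.popleft()
            self.data[key] = value


class Master:
    def __init__(self, replicas: List[Replica]) -> None:
        self.data: Dict[str, str] = {}
        self.replicas = replicas

    def commit(self, key: str, value: str) -> None:
        """Commit locally, then queue the change for each replica."""
        self.data[key] = value
        for replica in self.replicas:
            replica.pending.append((key, value))
```

Right after `commit`, a read on a replica may still return the old value; once the replica has applied its pending changes, it agrees with the master. A client that needs the latest value immediately has to read from the master in this model.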

Evaluation

The experiment compares the write performance of POST (where POSTBOX contains a PRB) with that of highly available Berkeley DB.

Related work

With the rapid development of cost-effective computers and rising demands on information processing capability, distributed database and transaction systems are attracting more attention and are being widely used. They are becoming increasingly important with the growth of big data and cloud computing, as well as the emergence of other data-intensive applications.

NoSQL (Not Only SQL) has become the first alternative to relational databases with higher flexibility, availability and

Conclusion

To address the shortcomings of highly available Berkeley DB, this paper designs and implements a distributed transaction processing middleware system called POST, which includes POSTBOX, a distributed database system based on Berkeley DB and data partitioning, and POSTMAN, a distributed transaction processing middleware. POSTBOX inherits the high availability of Berkeley DB and extends highly available Berkeley DB with partitioning, so it overcomes the problem that

Acknowledgments

This work was supported in part by the National High Technology Research and Development Program of China (No. 2015AA01A303), Beijing Key Subject Development Project (XK10080537), NSF grants CNS 149860, CNS 1461932, CNS 1460971, CNS 1439672, CNS 1301774, ECCS 1231461, ECCS 1128209, and CNS 1138963.

Jianjiang Li is currently an associate professor at University of Science and Technology Beijing, China. He received his Ph.D. degree in computer science from Tsinghua University in 2005. He was a visiting scholar at Temple University from Jan. 2014 to Jan. 2015. His current research interests include parallel computing, cloud computing and parallel compilation.

References (34)

  • B. Nemade et al.

    Cloud computing: Windows AZURE platform

  • Overview of oracle berkeley db, 2010....
  • A.A. Farrag et al.

    Using semantic knowledge of transactions to increase concurrency

    ACM Trans. Database Syst. (TODS)

    (1989)
  • J. Krueger et al.

    Fast updates on read-optimized databases using multi-core CPUs

    Proc. VLDB Endow.

    (2011)
  • W. Xue et al.

    Corslet: A shared storage system keeping your data private

    Sci. China Inf. Sci.

    (2011)
  • L. Millet et al.

    Facing peak loads in a P2P transaction system

  • A. Pavlo et al.

    On predictive modeling for optimizing transaction execution in parallel OLTP systems

    Proc. VLDB Endow.

    (2011)
    Qian Ge is currently a master’s student at University of Science and Technology Beijing. Her research interests include distributed system technology and database fault tolerance. She received her bachelor’s degree in 2012 from China Women’s University.

    Jie Wu is the chair and a Laura H. Carnell Professor in the Department of Computer and Information Sciences at Temple University. Prior to joining Temple University, USA, he was a program director at the National Science Foundation and a distinguished professor at Florida Atlantic University. He received his Ph.D. degree from Florida Atlantic University in 1989. His current research interests include mobile computing and wireless networks, routing protocols, cloud and green computing, network trust and security, and social network applications. Dr. Wu regularly publishes in scholarly journals, conference proceedings, and books. He serves on several editorial boards, including IEEE Transactions on Computers, IEEE Transactions on Services Computing, and Journal of Parallel and Distributed Computing. Dr. Wu was general co-chair/chair for IEEE MASS 2006 and IEEE IPDPS 2008 and program co-chair for IEEE INFOCOM 2011. Currently, he is serving as general chair for IEEE ICDCS 2013 and ACM MobiHoc 2014, and program chair for CCF CNCC 2013. He was an IEEE Computer Society Distinguished Visitor, ACM Distinguished Speaker, and chair for the IEEE Technical Committee on Distributed Processing (TCDP). Dr. Wu is a CCF Distinguished Speaker and a Fellow of the IEEE. He is the recipient of the 2011 China Computer Federation (CCF) Overseas Outstanding Achievement Award.

    Yue Li graduated from University of Science and Technology Beijing with a master’s degree in 2014. He received his bachelor’s degree in 2011 from Tianjin College, University of Science and Technology Beijing. His research interests include cloud computing and distributed system technology.

    Xiaolei Yang is currently a master’s student at University of Science and Technology Beijing. He received his bachelor’s degree in 2013 from University of Science and Technology Beijing. His research interests include cloud computing and recommender systems.

    Zhanning Ma is currently a master’s student at University of Science and Technology Beijing. His research interests include distributed storage technology and data deduplication. He received his bachelor’s degree in 2010 from Taishan Medical University.