Maintaining Query Performance through Table Rebuilding & Archiving

Despite optimized system configurations, the rapid accumulation and frequent updating of transactional data have made it difficult to maintain query speed on relational databases. Past improvements in configuration and query tuning notwithstanding, the volumetric growth of transaction tables, driven by continuous inserts and constant updates to existing rows, has degraded query performance. Current database engines can also overlook issues such as block fragmentation that harm performance. Because data is now a business's main asset, maintaining optimal query performance is essential to support business activities and decision-making. For database administrators, choosing the right method to ensure both data performance and data integrity is therefore crucial.


INTRODUCTION
A Relational Database Management System (RDBMS) has become one of the basic needs of companies. Nowadays data is a valuable asset and a source of information for customers, and having access to it is absolutely necessary. A reliable RDBMS is therefore expected to support organizations by providing information accurately and in a timely manner [1].
Before a database is used in a production system, it must be properly configured and tuned for the business process, including table indexing and performance testing [2]. Indexing is a technique for increasing the query performance of a table [3]. The bitmap index is a common technique for improving database performance [4].
The paper Archiving ERP data to enhance operational effectiveness: the case of Dolphin explains that transaction-intensive, customer-facing applications and the most critical high-volume ERP and CRM applications collect and store huge amounts of data, resulting in an exponential increase in database size, and it highlights why data archiving is crucial for enterprises running SAP solutions to optimize business performance [5].
In All About Oracle Database Fragmentation, Craig A. Shallahamer of OraPub, Inc. explains that there are many types of Oracle database fragmentation, some harmful and some not. He guides the Database Administrator (DBA) through proactive application design and space-management steps, addressing tablespace, segment, data block, index leaf block, and row fragmentation [6].
In a B-tree (row store) index, fragmentation exists when the index has pages in which the logical ordering, based on the index key values, does not match the physical order of the index pages. The Database Engine automatically modifies the index whenever an insert, update, or delete operation is performed on the underlying data. For example, adding a row to a table can cause existing pages in the row store index to split, making room for the insertion of new rows. Over time, these modifications can cause the data in the index to be spread across the database (fragmented). For a query that reads many pages using a full index or range scan, a highly fragmented index can reduce performance because additional I/O may be required to read the data the query needs. Instead of a small number of large I/O requests, the query would require many small I/O requests to read the same amount of data [7].
Hariprasath Rajaram explains that performance in an Online Transaction Processing (OLTP) database environment sometimes degrades because of table plan changes and row-chaining and row-migration issues. Maintenance activities such as table reorganization may be required, depending on the transactions against the table; after a table reorg, performance was observed to improve significantly [8].
In the current study, conducted at PT XYZ, the rebuild-table and data-archiving methods are combined to optimize query performance. While earlier research may have explored these techniques individually, their integration in this context stands out. Specifically, the study works with a heavily updated transaction table that grows by 45 million rows every month, a scale potentially unexplored in prior investigations. Notably, observations revealed challenges such as large datasets overheating the Database Management System (DBMS), a finding not addressed in earlier works. The study also draws on several noteworthy papers, such as those on archiving ERP data and Oracle database fragmentation; comparing their findings with the current research highlights distinct facets. Central to this research's distinctiveness is its grounding in the day-to-day operations of PT XYZ, a hands-on application of these specific methods in a real-world environment that previous scholarly work may have overlooked.

Infrastructure preparation
In addition to setting up the server hardware and the pre-configured Oracle version 11 database, the database is categorized into two types based on the nature of its transactions: Online Transaction Processing (OLTP) and Online Analytical Processing (OLAP) [9]. The transaction schema is designed to house the transaction tables, and the history schema is dedicated to data archiving. Archived data refers to information that no longer changes yet remains essential for reporting purposes [10].

Data Tablespace
A tablespace serves as a logical storage unit within a database and may be composed of one or more physical data files. Within the transaction schema, the transaction table and its corresponding indexes have distinct tablespaces and datafiles. Similarly, in the history schema, the index and data are allocated to separate tablespaces.
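The separation above can be sketched in miniature. SQLite has no tablespaces, so in this illustrative example each attached database file stands in for a tablespace backed by its own datafile; the file names (`trx.db`, `hist.db`) and table names are assumptions, not the paper's actual objects.

```python
import os
import sqlite3
import tempfile

# Each attached database file plays the role of a tablespace/datafile:
# the transaction table and the history (archive) table end up in
# physically separate files, mirroring the schema separation described.
workdir = tempfile.mkdtemp()
trx_path = os.path.join(workdir, "trx.db")
hist_path = os.path.join(workdir, "hist.db")

con = sqlite3.connect(trx_path)                        # "transaction tablespace"
con.execute(f"ATTACH DATABASE '{hist_path}' AS hist")  # "history tablespace"

# Transaction table lives in the main (transaction) file ...
con.execute("CREATE TABLE a (id INTEGER PRIMARY KEY, payload TEXT)")
# ... while the archive table lives in the separately stored history file.
con.execute("CREATE TABLE hist.a_archive (id INTEGER PRIMARY KEY, payload TEXT)")
con.commit()

# Each logical storage unit now maps to its own physical file on disk.
print(os.path.exists(trx_path), os.path.exists(hist_path))
```

In Oracle the same effect comes from `CREATE TABLESPACE ... DATAFILE ...` and placing each table or index in its tablespace; the point here is only that logical units map to distinct physical files, which can then sit on separate drives.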

Figure 2 Logic Table Schema in Tablespace
Within the transaction schema, the data table is designed with 47 columns and nine indexes, encompassing a primary key index, combined indexes, and individual column indexes to speed up queries on the specific columns users search. Before initiating the table rebuild and data archiving process, ensure that the table designated to hold the data being transferred to the history schema (for archiving) is set up according to the following design.

Figure 3 OLTP and OLAP table schemas
Before proceeding with the table rebuild, transactions must be halted to ensure the data remains consistent. The rebuild typically takes place during designated maintenance windows or when transactions are paused. A crucial preliminary step is drafting a DDL script detailing the creation of the tables, indexes, and constraints; meticulous attention to detail is required when preparing this script for object regeneration. With Oracle's Toad editor, a table reconstruction script can be generated, designed to minimize potential mishaps. Keep a copy of this rebuild script, as it will be instrumental when re-creating the database objects, specifically the tables, indexes, and constraints for table A. Following these steps, the transaction table named 'A' is renamed to 'A_X'. During the actual implementation phase, it is crucial to pinpoint tables with high transaction volumes, covering both data insertions and updates, alongside those accessed frequently. Specific measures are vital at this point to ensure the smooth execution of the rebuild and archiving operations while minimizing potential hazards.
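The two preparation steps, capturing the object-creation DDL and renaming the live table, can be sketched as follows. SQLite stands in for Oracle here, and the table and index names are invented for illustration; in Oracle the DDL would come from a tool such as Toad (as the paper describes) or from `DBMS_METADATA`.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE a (id INTEGER PRIMARY KEY, trx_date TEXT, amount REAL)")
con.execute("CREATE INDEX idx_a_date ON a (trx_date)")

# Step 1: save the creation scripts for the table and its index BEFORE
# touching anything, so every object can be regenerated later.
ddl = [row[0] for row in con.execute(
    "SELECT sql FROM sqlite_master WHERE tbl_name = 'a' AND sql IS NOT NULL")]

# Step 2: rename the live table so the name 'a' is free for the rebuilt
# copy. The rename keeps the existing data and indexes intact.
con.execute("ALTER TABLE a RENAME TO a_x")

tables = {r[0] for r in con.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table'")}
print(tables)  # the data now lives under the name a_x
```

The order matters: the DDL snapshot is taken first, because once objects are dropped or renamed their definitions are harder to recover, which is exactly why the paper advises keeping a copy of the rebuild script.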

Implementation
During the table rebuilding and data archiving process, several phases are undertaken, as described below: a. Pinpoint tables with significant transaction volumes and frequently executed queries. b. Within the database schema, numerous entities exist, encompassing tables, views, constraints, and stored procedures. Tables with minimal data and few transactions, such as user information and master data, are not the focal point of this research, as their impact on the observations is minor. Conversely, thorough preparation is required for tables with heavy transactions, which includes understanding their interrelations and associated entities, such as the indexes and constraints that will be needed during the rebuild. c. Halt all transactions, making sure that no active tasks are in progress and no fresh sessions access the application. d. Temporarily restrict access to the database by locking out users. This precaution ensures data integrity throughout the rebuilding and archiving stages and mitigates potential transactional errors.
Every month, data accumulates at a rate of 45 million rows, with each row undergoing updates within that same month. In this investigation, we tracked the evolution of query performance through seven months of data accumulation. Before any updates, the average data row is 153 bytes, for a total of 45,010,790 rows in 1,066,120 blocks. Post-update, the average row size swells to 273 bytes and the total row count increases slightly to 45,027,430, occupying 1,751,015 blocks. The growth in block allocation is attributed to data updates requiring additional space: the preset 10% free space within a block proves inadequate, forcing the formation of new blocks to house the revised data. Each new block paves the way for space fragmentation, potentially leaving a single data row spanning two distinct blocks. Such circumstances contribute to sluggish response times during data queries, especially as data volume grows and updates are frequent.
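A back-of-envelope check makes the block growth plausible. The sketch below assumes an 8 KiB Oracle block and the 10% free space (PCTFREE) mentioned above; the paper does not state the block size, so that figure is an assumption, and the estimate ignores per-block and per-row overhead.

```python
# Idealized estimate of blocks needed for N rows of a given average
# length, with 10% of each 8 KiB block reserved as free space.
BLOCK = 8192
PCTFREE = 0.10
usable = BLOCK * (1 - PCTFREE)  # bytes available for row data per block

def blocks_needed(rows: int, avg_row_len: int) -> int:
    rows_per_block = int(usable // avg_row_len)
    return -(-rows // rows_per_block)  # ceiling division

before = blocks_needed(45_010_790, 153)  # pre-update: 153-byte rows
after = blocks_needed(45_027_430, 273)   # post-update: 273-byte rows

print(before, after, round(after / before, 2))
```

Under these assumptions the estimate comes out near the measured figures (1,066,120 blocks before and 1,751,015 after, a roughly 1.6x growth); the measured counts run higher than the idealized ones, consistent with block overhead and partially filled blocks left behind by row migration.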

Rebuild table
Reconstructing a table involves re-creating a specific table (TABLE A) within the database, especially after it has undergone numerous Data Manipulation Language (DML) activities, such as insertions and modifications. The operation requires a pause in application activity, typically scheduled during the early morning when transactional demand is minimal. To prevent naming conflicts that could cause the creation to fail, the current table (TABLE A) is first renamed to another name (TABLE A_X). Renaming does not impact the existing data or its indexes; however, following the rename, any stored procedures or functions that reference TABLE A become invalid, as they can no longer locate it. Table A is then re-created, mirroring its original state, including its tablespace placement. During this phase, only the basic table structure is set up, excluding indexes and constraints. The newly established TABLE A is then populated with data covering the predetermined transaction-history period that users will access; note that the retention period for accessible data is subject to each organization's guidelines and may differ. In this case, the volume of data designated for reloading into the new TABLE A spans the last five months, specifically from the third to the seventh month, amounting to 225,000,000 rows (45,000,000 rows multiplied by five). At this point the transaction table, TABLE A, has been rebuilt with fresh data and index blocks; in contrast to its state after the DML transactions, there is negligible fragmentation across the rows. The process culminates in re-instituting the database attributes, specifically the constraints, and recompiling the PL/SQL stored procedures and functions that serve as the database's business logic, ensuring the smooth operation of all business processes involving TABLE A.
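The rebuild sequence just described (rename, re-create, reload the retention window, re-create indexes, refresh statistics) can be sketched end to end. SQLite again stands in for Oracle, a few dozen rows stand in for the 45-million-rows-per-month table, and integer month numbers replace real transaction dates; all names are illustrative.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE a (id INTEGER PRIMARY KEY, month INTEGER, amount REAL)")
con.executemany("INSERT INTO a VALUES (?, ?, ?)",
                [(i, (i % 7) + 1, i * 1.0) for i in range(70)])  # months 1..7

# 1. Rename, so the old data stays reachable under a_x.
con.execute("ALTER TABLE a RENAME TO a_x")

# 2. Re-create the bare table structure (secondary indexes and
#    constraints are deliberately left for later).
con.execute("CREATE TABLE a (id INTEGER PRIMARY KEY, month INTEGER, amount REAL)")

# 3. Reload only the retention window: months 3..7, the last five months.
con.execute("INSERT INTO a SELECT * FROM a_x WHERE month BETWEEN 3 AND 7")

# 4. Re-create indexes, then refresh the optimizer statistics
#    (ANALYZE here; DBMS_STATS would do this job in Oracle).
con.execute("CREATE INDEX idx_a_month ON a (month)")
con.execute("ANALYZE")
con.commit()

kept = con.execute("SELECT COUNT(*) FROM a").fetchone()[0]
print(kept)  # 50 of the 70 rows fall inside the five-month window
```

Loading the data before building the secondary indexes matches the paper's ordering and is generally cheaper than maintaining the indexes row by row during the bulk insert.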
The final step is table analysis, which refreshes the database's statistical information and improves query optimization. The transaction table is then ready to store the data that customers will access. This regimen is executed periodically as part of the database's upkeep, with an emphasis on transaction tables and data sustainability.

Data Archiving
At this point, the data subject to DML (updates) has been retained in the transaction table for the previous five months plus the current month. This allows customers to view data statuses for the most recent six months.
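The archiving step itself, moving rows that fall outside the visibility window from the transaction table into the history schema, can be sketched as a copy followed by a delete. The attached in-memory database stands in for the history schema; table names and the cutoff value are illustrative, not the paper's actual objects.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("ATTACH DATABASE ':memory:' AS hist")  # stand-in for the history schema
con.execute("CREATE TABLE a (id INTEGER PRIMARY KEY, month INTEGER)")
con.execute("CREATE TABLE hist.a_archive (id INTEGER PRIMARY KEY, month INTEGER)")
con.executemany("INSERT INTO a VALUES (?, ?)",
                [(i, (i % 8) + 1) for i in range(80)])  # months 1..8

CUTOFF = 3  # keep months >= 3 in the transaction table; archive the rest

# Copy the out-of-window rows to the history table, then remove them
# from the transaction table, all in one transaction for consistency.
con.execute("INSERT INTO hist.a_archive SELECT * FROM a WHERE month < ?", (CUTOFF,))
con.execute("DELETE FROM a WHERE month < ?", (CUTOFF,))
con.commit()

live = con.execute("SELECT COUNT(*) FROM a").fetchone()[0]
archived = con.execute("SELECT COUNT(*) FROM hist.a_archive").fetchone()[0]
print(live, archived)
```

Keeping copy and delete inside one transaction matters: a failure between the two statements must not leave a row in both tables or in neither.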

Outcome Measurement
The following are the results of the query comparison before and after the rebuild was carried out. The effectiveness of the rebuild is also measured by the disk capacity used.
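The disk-capacity side of the measurement can be demonstrated in miniature. In this sketch SQLite's `VACUUM` plays the role of the table rebuild, page counts stand in for Oracle block counts, and the table contents are invented: after heavy deletes the file keeps its fragmented size until it is rebuilt.

```python
import os
import sqlite3
import tempfile

path = os.path.join(tempfile.mkdtemp(), "trx.db")
con = sqlite3.connect(path)
con.execute("CREATE TABLE a (id INTEGER PRIMARY KEY, payload TEXT)")
con.executemany("INSERT INTO a VALUES (?, ?)",
                [(i, "x" * 200) for i in range(20_000)])
con.execute("DELETE FROM a WHERE id % 2 = 0")  # leave half-empty pages behind
con.commit()

pages_before = con.execute("PRAGMA page_count").fetchone()[0]
con.execute("VACUUM")  # rebuild: repack rows into contiguous pages
pages_after = con.execute("PRAGMA page_count").fetchone()[0]

print(pages_after < pages_before)  # the rebuilt file needs fewer pages
```

Comparing page (block) counts before and after is the same kind of evidence the paper's Table 7 reports: a rebuilt table packs the same rows into fewer blocks, so every scan touches less disk.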

CONCLUSIONS
From the research conducted, it was found that updated rows can increase block fragmentation in the transaction table by more than 4.9 times, impacting query speed depending on the table's characteristics. Rebuilding the table proved effective, reducing table and index block fragmentation and cutting disk consumption by as much as 36 times. Furthermore, after new data was added, queries ran up to 4 times faster thanks to the rebuild process. Additionally, after the sixth month, archiving data successfully maintained query speed by limiting the amount of data accessed. In the future, more advanced methods for the rebuild and archiving process (reducing the manual steps) may emerge.

Figure 4 Rebuild and Archiving

Pre-Implementation
After setting up the infrastructure and software, the prerequisites for pre-implementation are: a. Establishment of both the transaction and transaction-index tablespaces for the logical placement of transaction data. b. Creation of the history tablespace and history-index tablespace for relocating updated data; it is imperative that the history tablespace and history index reside on separate physical drives. c. Creation of a transaction user granted rights to store data in the transaction and transaction-index tablespaces, as well as a history user granted access to the history tablespace and history index. d. During the actual implementation phase, it is crucial to pinpoint tables with high transaction volumes, covering both data insertions and updates, alongside those accessed frequently. Specific measures are vital at this point to ensure the smooth execution of the rebuild and archiving operations while minimizing potential hazards.

ISSN (print): 1978-1520, ISSN (online): 2460-7258, IJCCS Vol. 18, No. 1, January 2024: 73-82

Table 2
Rename the transaction table

Table 3
Insert Data Into New Table

After the insert is complete, the indexes are re-created on the new TABLE A. This is necessary to refresh the indexes and serve the queries the application requires.

Table 4
Insert Data Into Table

Table 6
Comparison table of query speed before and after the update

Table 7
Comparison of block sizes before and after rebuild