Method Article

A novel data storage logic in the cloud

[version 1; peer review: 1 approved, 1 not approved]
PUBLISHED 21 Jan 2016

Abstract

Databases which store and manage long-term scientific information related to life science are used to store huge amounts of quantitative attributes. Introducing a new entity attribute requires modification of the existing data tables and of the programs that use these data tables. The solution is to increase the number of virtual data tables while the number of screens remains the same. The main objective of the present study was to introduce a logic called Joker Tao (JT), which provides universal data storage for cloud-based databases. This means that all types of input data can be interpreted as an entity and an attribute at the same time, in the same data table.

Keywords

Joker Tao, NoSQL, Cloud, Database, Life science, Physical data table, Virtual data table, RDBMS

Introduction

Databases which store and manage long-term scientific information related to life science are used to store huge amounts of quantitative attributes. This is especially true for medical databases1,2. One major downside of these data is that information on multiple occurrences of an illness in the same individual cannot be connected1,3,4. Modern database management systems fall into two broad classes: Relational Database Management Systems (RDBMS) and Not Only Structured Query Language (NoSQL) systems5,6. The primary goal of this paper is to introduce a novel database model which makes it possible to store and manage all input data in one (physical) data table while the data storage concept remains structured. JT can be defined as a NoSQL engine on an SQL platform that can serve data from different data storage concepts without several conversions.

Methods

The technical environment is the cloud-based Oracle Application Express (Apex) 5.0. The workstation requires only an operating system (which one is irrelevant) and an internet browser (Chrome was used). The Joker Tao logic (www.jokertao.com) can be applied in any RDBMS (e.g. www.taodb.hu). The physical data table has four columns: ID (num), the identifier of the entity, which identifies the entity across all data tables (not only within the given data table); ATTRIBUTE (num), the identifier of the attribute; SEQUENCE (num), which is used for vector attributes; and VALUE (VARCHAR2), which stores the values of the attributes. The codes stored in the ATTRIBUTE column are also defined, sooner or later, in the ID column; at that point the attribute becomes an entity. In every case, the depth of the entity-attribute definition in the physical data table is a subjective design decision. We first show a traditional (relational) data table structure (Table 1).

Table 1. Traditional data storage structure.

Name | Attribute 1 | Attribute 2 | Attribute 3
Item 1 | Value 1 | Value 2 | Value 3
Item 2 | Value 4 | Value 2 | Value 5
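
The four-column physical structure specified above (ID, ATTRIBUTE, SEQUENCE, VALUE) can be pictured as a single row type. The following is a minimal sketch only: the class name JtRow, the Java types and the describe method are assumptions made for illustration and are not taken from the authors' implementation. The sample codes (10001, 1019) are the ones used in the tables of this article.

```java
// Hypothetical sketch of one row of the JT physical data table.
// Field names mirror the columns named in the text; the Java types are assumptions.
final class JtRow {
    final long id;         // ID: identifies the entity across all virtual data tables
    final long attribute;  // ATTRIBUTE: numeric code of the attribute
    final int sequence;    // SEQUENCE: position within a vector attribute (1 in the article's examples)
    final String value;    // VALUE: the stored value, always text (VARCHAR2 in the article)

    JtRow(long id, long attribute, int sequence, String value) {
        this.id = id;
        this.attribute = attribute;
        this.sequence = sequence;
        this.value = value;
    }

    // Renders the row as "(ID, ATTRIBUTE, SEQUENCE) -> VALUE".
    String describe() {
        return "(" + id + ", " + attribute + ", " + sequence + ") -> " + value;
    }

    public static void main(String[] args) {
        // "Item 1" stored as the Name attribute (code 1019) of entity 10001.
        JtRow name = new JtRow(10001L, 1019L, 1, "Item 1");
        System.out.println(name.describe());
    }
}
```

Because every value is stored as text in a single column, the row type stays the same no matter how many attributes an entity has.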

Following this, the presented data table is modified step by step. At the end of these steps, the JT data storage structure is obtained. The first step is the storage of technical data. Table 2 adds a technical column which records the virtual data table that each record of the physical data table belongs to.

Table 2. Belonging to the virtual data tables.

Tables | Name | Attr.1 | Attr.2 | Attr.3
Table 1 | Item 1 | Value 1 | Value 2 | Value 3
Table 1 | Item 2 | Value 4 | Value 2 | Value 5

In the second step, the identifiers assigned to the attributes are displayed (Table 3).

Table 3. Identifiers assigned to the attributes.

Tables (1010) | Name (1019) | Attr.1 (1027) | Attr.2 (1028) | Attr.3 (1029)
Table 1 | Item 1 | Value 1 | Value 2 | Value 3
Table 1 | Item 2 | Value 4 | Value 2 | Value 5

In the third step, identifiers assigned to the entities are also displayed (Table 4). The identifier of an entity is assigned to each cell of that entity. These identifiers are chosen by the developer and can be any natural number that has not already been used in the ID column.

Table 4. Identifiers assigned to the entities.

Name (1019) | Attribute 1 (1027) | Attribute 2 (1028) | Attribute 3 (1029)
Item 1 (10001) | Value 1 (10001) | Value 2 (10001) | Value 3 (10001)
Item 2 (10002) | Value 4 (10002) | Value 2 (10002) | Value 5 (10002)

In the fourth step, the attribute identifiers are also assigned to each cell (Table 5). These identifiers are determined by the developer. The values of these identifiers can be any natural number that has not already been used in the Attribute column.

Table 5. Identifiers of records and columns.

Table (1010) | Name (1019) | Attrib. 1 (1027) | Attrib. 2 (1028) | Attrib. 3 (1029)
Table 1 (10001, 1010) | Item 1 (10001, 1019) | Value 1 (10001, 1027) | Value 2 (10001, 1028) | Value 3 (10001, 1029)
Table 1 (10001, 1010) | Item 2 (10002, 1019) | Value 4 (10003, 1027) | Value 2 (10004, 1028) | Value 5 (10005, 1029)

In the fifth step, the initial value of the cell is inserted as the Value of the JT structure (Table 6). From this stage, the developer uses identifiers (which were defined in the previous steps) instead of attribute names.

Table 6. Data table representation in record ID, column ID and value structure.

1010 (Tab 1) | 1019 (Name) | Attrib. 1 (1027) | Attrib. 2 (1028) | Attrib. 3 (1029)
10001, 1010, 1: 1086 | 10001, 1019, 1: Item 1 | 10001, 1027, 1: Value 1 | 10001, 1028, 1: Value 2 | 10001, 1029, 1: Value 3
10001, 1010, 1: 1086 | 10002, 1019, 1: Item 2 | 10003, 1027, 1: Value 4 | 10004, 1028, 1: Value 2 | 10005, 1029, 1: Value 5
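
The steps above amount to turning every cell of the traditional table into its own row. The following is a minimal sketch of that conversion, not the authors' code: the class JtUnpivot and the method toJtRows are hypothetical names, and the sketch assumes, as in Table 4, that all cells of one record share that record's ID and that every attribute is scalar (sequence 1). The codes 1010, 1019, 1027 to 1029 and the table code 1086 are taken from the tables above.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: converts a traditional table into JT rows, one row per cell.
final class JtUnpivot {

    // Builds one "ID, ATTRIBUTE, SEQUENCE, VALUE" line per cell.
    // entityIds[r] is the ID chosen for record r; attributeIds[c] is the code of column c.
    static List<String> toJtRows(long[] entityIds, long[] attributeIds, String[][] cells) {
        List<String> rows = new ArrayList<>();
        for (int r = 0; r < cells.length; r++) {
            for (int c = 0; c < attributeIds.length; c++) {
                // Sequence is fixed at 1 because every attribute here is assumed to be scalar.
                rows.add(entityIds[r] + ", " + attributeIds[c] + ", 1, " + cells[r][c]);
            }
        }
        return rows;
    }

    public static void main(String[] args) {
        long[] entityIds = {10001L, 10002L};
        // 1010 = belonging to the virtual data table, 1019 = Name, 1027..1029 = Attributes 1..3
        long[] attributeIds = {1010L, 1019L, 1027L, 1028L, 1029L};
        String[][] cells = {
            {"1086", "Item 1", "Value 1", "Value 2", "Value 3"},
            {"1086", "Item 2", "Value 4", "Value 2", "Value 5"}
        };
        for (String row : toJtRows(entityIds, attributeIds, cells)) {
            System.out.println(row);
        }
    }
}
```

Each printed line corresponds to one row of the physical data table, so a table with two records and five columns becomes ten rows.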

The final step is to rotate the traditional data table structure by 90 degrees, so that every virtual data table is defined in one physical data table. With these steps the developer can design a single data table that stores every entity, attribute and formula in a database. The method described above can be applied manually. For automatic conversion we wrote the Java code below7:

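import java.io.BufferedReader;
import java.io.FileReader;
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.Types;

// The class name is illustrative; the original listing shows only the methods.
public class JokerLoader {

    // "broker" is the connection pool used in the original listing. Its class is
    // not shown in the article and must be supplied by the application.

    // INSERT statement with six parameters for the joker table. The original
    // listing uses it without showing how it is prepared.
    private static PreparedStatement pstmt;

    // Counts the records in the joker table and prints the total.
    public static String getEntityName() throws Exception {
        Connection conn = broker.getConnection();
        try (PreparedStatement query = conn.prepareStatement("select * from joker");
             ResultSet rs = query.executeQuery()) {
            int i = 0;
            while (rs.next()) {
                i++;
            }
            System.out.println("number of records: " + i);
        } finally {
            // Return the connection to the pool even if the query fails.
            broker.freeConnection(conn);
        }
        return "";
    }

    // Binds one row to the prepared INSERT and executes it.
    // A null argument is stored as SQL NULL of the matching column type.
    public static void insertJokerRow(Integer GROUP_ID, Integer UNIQ_ID,
            Integer FIELD_ID, Integer ARRAY_INDEX,
            String SEEK_VALUE, String FIELD_VALUE) throws Exception {
        if (GROUP_ID == null) pstmt.setNull(1, Types.NUMERIC);
        else pstmt.setInt(1, GROUP_ID.intValue());
        if (UNIQ_ID == null) pstmt.setNull(2, Types.NUMERIC);
        else pstmt.setInt(2, UNIQ_ID.intValue());
        if (FIELD_ID == null) pstmt.setNull(3, Types.NUMERIC);
        else pstmt.setInt(3, FIELD_ID.intValue());
        if (ARRAY_INDEX == null) pstmt.setNull(4, Types.NUMERIC);
        else pstmt.setInt(4, ARRAY_INDEX.intValue());
        if (SEEK_VALUE == null) pstmt.setNull(5, Types.VARCHAR);
        else pstmt.setString(5, SEEK_VALUE);
        if (FIELD_VALUE == null) pstmt.setNull(6, Types.VARCHAR);
        else pstmt.setString(6, FIELD_VALUE);
        pstmt.execute();
    }

    // Reads fixed-width lines from data.txt and inserts one row per line.
    // Layout: GROUP_ID in columns 0-9, UNIQ_ID in 11-20, ARRAY_INDEX in 22-31,
    // FIELD_VALUE from column 33 to the end of the line.
    public static void readFile() throws Exception {
        try (BufferedReader br = new BufferedReader(new FileReader("data.txt"))) {
            String line;
            while ((line = br.readLine()) != null) {
                // trim() tolerates space-padded numeric fields.
                int GROUP_ID = Integer.parseInt(line.substring(0, 10).trim());
                int UNIQ_ID = Integer.parseInt(line.substring(11, 21).trim());
                int ARRAY_INDEX = Integer.parseInt(line.substring(22, 32).trim());
                String FIELD_VALUE = line.length() > 32 ? line.substring(33) : " ";
                insertJokerRow(Integer.valueOf(GROUP_ID),
                        Integer.valueOf(UNIQ_ID), null,
                        Integer.valueOf(ARRAY_INDEX), null, FIELD_VALUE);
            }
        }
    }
}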
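
The readFile method above reads fixed-width lines, but the article does not show a sample of data.txt. The following sketch illustrates one layout that is consistent with the offsets used in readFile. It is an assumption, not the authors' file format: the class JokerLineFormat, the single-space separators and the right-aligned, space-padded numbers are all hypothetical, and the sample values are arbitrary.

```java
import java.util.Locale;

// Hypothetical sketch of a fixed-width line layout matching the offsets in readFile.
final class JokerLineFormat {

    // Formats one line: three numbers right-aligned in 10-character fields,
    // each followed by one separator character, then the free-text value.
    static String format(int groupId, int uniqId, int arrayIndex, String fieldValue) {
        return String.format(Locale.ROOT, "%10d %10d %10d %s",
                groupId, uniqId, arrayIndex, fieldValue);
    }

    // Splits a line using the same offsets as readFile (0-10, 11-21, 22-32, 33 onwards).
    static String[] parse(String line) {
        String groupId = line.substring(0, 10).trim();
        String uniqId = line.substring(11, 21).trim();
        String arrayIndex = line.substring(22, 32).trim();
        String fieldValue = line.length() > 32 ? line.substring(33) : " ";
        return new String[] {groupId, uniqId, arrayIndex, fieldValue};
    }

    public static void main(String[] args) {
        String line = format(7, 42, 1, "Item 1");
        System.out.println("[" + line + "]");
        for (String part : parse(line)) {
            System.out.println(part);
        }
    }
}
```

The numeric fields are trimmed before use because Integer.parseInt rejects padding spaces; a file with zero-padded numbers would parse without trimming.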

Results

The resulting table structure is called JT structure (Table 7).

Table 7. JT data storage structure.

ID (Record) | Attribute | Sequence | Value
10001 | 1010 | 1 | 1086
10001 | 1019 | 1 | Item 1
10001 | 1027 | 1 | Value 1
10001 | 1028 | 1 | Value 2
10001 | 1029 | 1 | Value 3
10002 | 1010 | 1 | 1086
10002 | 1019 | 1 | Item 2
10002 | 1027 | 1 | Value 1
10003 | 1028 | 1 | Value 2
10004 | 1029 | 1 | Value 3

From the JT physical data table, the following definitions can be read out:

  • A virtual record is the set of rows of the physical data table which have the same ID value.

  • A virtual data table is the set of virtual records which have the same value of the "belonging to the virtual data table" attribute (code 1010).
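
The two definitions above can be read as a grouping step followed by a filter. The following is a minimal in-memory sketch under stated assumptions: the class JtVirtualViews, its nested Row type and its method names are hypothetical, vector attributes are ignored, and the entity 20001 with table code 2000 is made up to show a second virtual data table. In a database the same result would come from queries on the physical data table.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: reading virtual records and virtual data tables out of JT rows.
final class JtVirtualViews {

    // Attribute code for "belonging to the virtual data table", as used in the article.
    static final long TABLE_ATTRIBUTE = 1010L;

    static final class Row {
        final long id;
        final long attribute;
        final int sequence;
        final String value;

        Row(long id, long attribute, int sequence, String value) {
            this.id = id;
            this.attribute = attribute;
            this.sequence = sequence;
            this.value = value;
        }
    }

    // Virtual records: rows grouped by ID, each held as attribute code -> value.
    // Rows of vector attributes (sequence other than 1) are skipped for brevity.
    static Map<Long, Map<Long, String>> virtualRecords(List<Row> rows) {
        Map<Long, Map<Long, String>> records = new LinkedHashMap<>();
        for (Row row : rows) {
            if (row.sequence == 1) {
                records.computeIfAbsent(row.id, k -> new LinkedHashMap<>())
                       .put(row.attribute, row.value);
            }
        }
        return records;
    }

    // Virtual data table: IDs of the virtual records whose 1010 attribute equals tableCode.
    static List<Long> virtualTable(List<Row> rows, String tableCode) {
        List<Long> ids = new ArrayList<>();
        for (Map.Entry<Long, Map<Long, String>> entry : virtualRecords(rows).entrySet()) {
            if (tableCode.equals(entry.getValue().get(TABLE_ATTRIBUTE))) {
                ids.add(entry.getKey());
            }
        }
        return ids;
    }

    static List<Row> sample() {
        return Arrays.asList(
            new Row(10001L, 1010L, 1, "1086"),
            new Row(10001L, 1019L, 1, "Item 1"),
            new Row(10002L, 1010L, 1, "1086"),
            new Row(10002L, 1019L, 1, "Item 2"),
            // 20001 and table code 2000 are made up for this example.
            new Row(20001L, 1010L, 1, "2000"),
            new Row(20001L, 1019L, 1, "Other item"));
    }

    public static void main(String[] args) {
        System.out.println(virtualRecords(sample()).get(10001L));
        System.out.println(virtualTable(sample(), "1086"));
    }
}
```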

Thesis: In the JT structure, each attribute needs only one index for indexing in the database.

Proof using mathematical induction: The statement is clearly true when only one record is stored in a data table (in contrast to the RDBMS structure, where developers use several indexes to index several attributes). In this case the data table appears as shown in Figure 1.

Figure 1. Indexing a record.

                          Index = attribute (num) + value (varchar 2)

For entities, a numerical ID index is also used in JT logic-based systems. This ID does not depend on any attribute (there is no transitive dependency). Thus, the entities of the virtual data tables meet the criteria of the third normal form (Figure 2).

Figure 2. ID usage.
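
The single index over attribute and value (Figure 1) can be imitated in memory to show why one lookup structure serves every attribute. The sketch below is only an illustration and says nothing about how Oracle builds or stores its indexes: the class JtIndexSketch, its key format and its method names are assumptions.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical in-memory stand-in for one composite index on (attribute, value).
final class JtIndexSketch {

    // Key: attribute code and value joined by '|'. Entry: IDs of the entities holding that value.
    private final Map<String, Set<Long>> byAttributeAndValue = new HashMap<>();

    // Registers one JT row in the index.
    void add(long id, long attribute, String value) {
        byAttributeAndValue
            .computeIfAbsent(key(attribute, value), k -> new TreeSet<>())
            .add(id);
    }

    // Returns the IDs of all entities whose given attribute has the given value.
    Set<Long> find(long attribute, String value) {
        return byAttributeAndValue.getOrDefault(
            key(attribute, value), Collections.<Long>emptySet());
    }

    private static String key(long attribute, String value) {
        // The attribute code is numeric, so the first '|' always ends the code.
        return attribute + "|" + value;
    }

    public static void main(String[] args) {
        JtIndexSketch index = new JtIndexSketch();
        index.add(10001L, 1027L, "Value 1");
        index.add(10001L, 1028L, "Value 2");
        index.add(10002L, 1028L, "Value 2");
        // The same structure answers lookups on any attribute.
        System.out.println(index.find(1028L, "Value 2"));
        System.out.println(index.find(1027L, "Value 1"));
    }
}
```

In a relational layout each searchable column would need its own index; here a new attribute is just a new key prefix in the same structure.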

A data table can be expanded in three ways: by adding a new entity (Figure 3), by adding a new attribute (Figure 4), or by adding a new virtual data table (Figure 5).

Figure 3. New entity.

Figure 4. New attribute.

Figure 5. New virtual data table.
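
All three modes of expansion reduce to inserting rows into the same four-column table, with no change to the table definition. The sketch below illustrates this under stated assumptions: the class JtExpansion is hypothetical, the codes 10003, 1030, 20001 and 1087 are made up, and the row that defines attribute 1030 as an entity with a Name is only one assumed form of the rule that attribute codes are eventually defined in the ID column.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: every kind of expansion is a plain row insert.
final class JtExpansion {

    // The physical table: each element is {ID, ATTRIBUTE, SEQUENCE, VALUE} held as text.
    private final List<String[]> rows = new ArrayList<>();

    void insert(long id, long attribute, int sequence, String value) {
        rows.add(new String[] {
            Long.toString(id), Long.toString(attribute), Integer.toString(sequence), value});
    }

    int size() {
        return rows.size();
    }

    static JtExpansion demo() {
        JtExpansion table = new JtExpansion();

        // New entity: rows with a fresh ID (10003) in the existing virtual data table 1086.
        table.insert(10003L, 1010L, 1, "1086");
        table.insert(10003L, 1019L, 1, "Item 3");

        // New attribute: a row with a fresh attribute code (1030); no column is added.
        table.insert(10001L, 1030L, 1, "Value 6");
        // Assumed form of defining that attribute as an entity: its code appears as an ID.
        table.insert(1030L, 1019L, 1, "Attribute 4");

        // New virtual data table: a row whose 1010 value is a fresh table code (1087).
        table.insert(20001L, 1010L, 1, "1087");

        return table;
    }

    public static void main(String[] args) {
        System.out.println(demo().size() + " rows inserted, still 4 columns");
    }
}
```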

The indexing also remains correct when the data table is expanded to n+1 records. With JT logic the user needs only one physical data table to define each virtual data table in a database. Therefore, since only one index is required to index each attribute, the statement of the thesis is true for every JT logic-based data table, by the principle of mathematical induction. The principle is illustrated below on the following statement, which is to be proved for every natural number n:

                          1 + 2 + .. + n = n * (n + 1)/2

Substituting n = 1 into the equation we get:

                                    1 = 1 * (1 + 1)/2

The result of the operation is 1 = 1, that is, the induction base is true.

For the induction step, assume that the statement is true for n = k, where k is an arbitrary but fixed natural number. That is, we assume that the following equation holds:

                          1 + 2 + .. + k = k * (k + 1)/2

We must show that the statement is then also true for n = k + 1, that is:

                  1 + 2 + .. + k + (k + 1) = (k + 1) * (k + 2)/2

By the induction assumption, the left-hand side can be written as:

                  1 + 2 + .. + k + (k + 1) = k * (k + 1)/2 + (k + 1)

Carrying out the operations on the right-hand side of this expression, we obtain the following:

                  1 + 2 + .. + k + (k + 1) = (k * (k + 1) + 2 * (k + 1))/2 =

                        (k * k + k + 2k + 2)/2 = (k * k + 3k + 2)/2

Carrying out the operations on the other side, we obtain the same result:

                  (k + 1) * (k + 2)/2 = (k * k + 2k + k + 2)/2 = (k * k + 3k + 2)/2

Thus, the induction step is true. Given that both the induction base and the induction step are true, the original statement is therefore true. In the present study, we explained the JT data storage logic. In our other study we focused on the query tests. Our previous results7 show that from 10000 records upwards the relational model produces slow queries (longer than 1 second) in a cloud-based environment, while JT remains within the one-second time frame.

Discussion and conclusions

Using the developed database management logic, each attribute needs only one index for indexing in the database. JT allows any data, whether entity, attribute, data connection or formula, to be stored and managed in a single physical data table. Thanks to this flexibility, a formula which is stored in a database can be used for problem solving in another field, regardless of the data storage method used in the present environment. In the JT data model, the entity and the attribute are used interchangeably, so users can expand the database with new attributes during or after the development process. With JT logic, a NoSQL engine is provided in SQL database systems for the storage and management of long-term scientific information.

how to cite this article
Mátyás B, Szarka M, Járvás G et al. A novel data storage logic in the cloud [version 1; peer review: 1 approved, 1 not approved] F1000Research 2016, 5:93 (https://doi.org/10.12688/f1000research.7727.1)

Open Peer Review

Key to Reviewer Statuses
Approved: The paper is scientifically sound in its current form and only minor, if any, improvements are suggested.
Approved with reservations: A number of small changes, sometimes more significant revisions, are required to address specific details and improve the paper's academic merit.
Not approved: Fundamental flaws in the paper seriously undermine the findings and conclusions.
Reviewer Report 25 Feb 2016
Kavita Sunil Oza, Department of Computer Science, Shivaji University, Kolhapur, Maharashtra, India 
Approved
Work demonstrated in the paper is good and well explained. Complexity of work is not mentioned (algorithmic complexity) but this is not necessary as we already have high speed processors and time complexity may not matter much. Some more references ...
Oza KS. Reviewer Report For: A novel data storage logic in the cloud [version 1; peer review: 1 approved, 1 not approved]. F1000Research 2016, 5:93 (https://doi.org/10.5256/f1000research.8321.r12373)
Reviewer Report 15 Feb 2016
Jan Lindström, MariaDB Corporation, Espoo, Finland 
Not Approved
In this paper authors introduce a new logic called Joker Tao (JT) which provides universal data storage for cloud-based databases. However, the paper is very poorly written. Firstly, the proposed logic is not presented detailed enough for the reader to ...
Lindström J. Reviewer Report For: A novel data storage logic in the cloud [version 1; peer review: 1 approved, 1 not approved]. F1000Research 2016, 5:93 (https://doi.org/10.5256/f1000research.8321.r12375)