Improve HR Decision-Making Based On Data Mart and OLAP

The organizations always need to manage their operations, process the data electronically, and find a platform that help them to support their strategic decisions. The success of the human resource department can reflect overall organizational success. The human resource department professionals try to ensure finding the right person at the right time for the job that fits the person according to skills and qualifications. This task needs a platform that supports making the right decision based on historical managerial information. The data mart is a departmental based decision support system that used departmental data to help decision-makers to support short term decisions. Human resource (HR) data mart is the base stone for building an enterprise data warehouse. The paper presents the implementation process of HR data mart starting from implementing data mart schema to online analytical processing (OLAP) reports. The data mart is implemented on retired employees’ data of Basra Oil Company for over 15 years. A human resource data mart can provide a base platform to perform a different analysis operation to support the right decisions. Different OLAP reports are implemented to help analysts and decision-makers to get the answers for their questions as OLAP queries. Two categories of reports are implemented offline reports using Microsoft Excel Pivot Table 2010 and web OLAP reports using SQL Server Reporting Service 2014 (SSRS). The tools used to implement data mart vary from SQL Server Management Services (SSMS) 2014, SQL Server Integration Service 2014 (SSIS), SQL Server Analytical Service 2014 (SSAS), SQL Server Reporting Service 2014 (SSRS), SQL Server Data Tools 2013 (SSDT), and Microsoft Excel Pivot Table 2010.


Introduction
The organization's success can be measured by the success of their decisions.The decisions should be made based on a solid base of historical experiments.One of the most important basis is to depend on the HR systems and using historical data to make the right decisions.The decision of the HR manager depends on many factors from judgment and human experience to knowledge preference which may cause unforeseen, inaccurate, inconsistent decisions.Human Resource Management (HRM) is a set of management activities and tasks to maintain and develop the HR workflow.The HRM goals are basically enhancing the quality and productivity, improving the development and growth of the individual, simplifying the organizational competitiveness, and committing the social and legal obligations [1] [2].
HR system helps the organizational managers by performing the operations of the organization, performing the tasks of HR individuals, and support the right decisions [3].The applications of HR with artificial intelligence (AI) and data mining techniques can resolve the problems of indistinct and unstructured decision making.The inaccurate, inconsistent, unpredicted and inequality decisions can solve by the applications of HR with AI techniques.The data in HR databases can be considered as a rich resource to be used by and produce a knowledge discovery database (KDD).This HR KDD can be used for analyzing and exploring large data, finding rules, and discovering rich patterns [4] [5].
DWs become very important asset of all business enterprises that hold classical applications for critical analysis and business planning.DW component basically is a data store built for analytical processing with main objective "analyzing historical data and supporting decisions".The data mart which is a part form DW which is constructed to solve problems and to enable simplification of patterns usage.Each data mart dedicated to specific problem.The data organize in the tables of data mart in a shape of star, snowflake schema, with fact and dimension tables around fact table.Star and snowflake schemas are the simple shape of DWs where the complex architecture is a constellation schema where each dimension table may be the fact table of other dimension tables [6][7][8] [9].The operational data stores are differ from DW in data handling, data types, data size, users, and architecture.The users of DWs are data analysis and decisionmakers in the first place where the users of operational systems operate the entire organization and perform all data entry operations such as registration of new customers, taking orders, complaints handling, and monitoring operational activities' status.These operational systems are handling one transaction record at a time [10].There are many applications based on DW and data mart to support decisions such as spatial personalization DW [11], construction management DW [12], Online recruiting [13], clinical [14], [15], construction management [12], educational [16], crime DW [17], invoice DW [18].DW and data mart are the best solutions for managers and decision makers seeking for platform that present information for decision supporting process.
In this paper, the HR data mart implementation process will be explained to find how HR data mart can support the HR managers' decisions.OLAP queries can answer complex multidimensional queries and present the answers in a tabular or chart form.Many valuable questions to decision-makers will be answered as OLAP queries such as: what are the most required employee's skills after retirement?Which month has the most number of retired employees?Which gender, in which month, and in which job title has the most number of retired employees?Are there any privacy concerns related to data mart implementation?Besides that, the reason behind using data mart rather than DW will be explained and how OLAP with data mart can make a decision support system.The best approach to implement data mart will be discussed.
The rest of the paper is organized as follows: section two lists and criticizes the related works to HR data mart implementation, section three explains the HR data mart and OLAP reports implementation process.The fourth section conducts the concluded points after implementing the data mart and lists future works.

Related Works
Mohammed Abdulameer Mohammed and Mohammed Morad Anad [19] proposed a framework to adopt all Iraqi universities databases of human resources related to students in a DW.The proposed DW architecture provides the authorized access for decision makers to access all students' information of a private and public Iraqi universities.The researchers found that the framework can be used to increase the income of the Iraqi ministry of higher education and scientific research by finding the right staff for private and public organizations by arranging the annual payment.However, the paper presents a framework to adopt all databases of students of different universities which can be considered very difficult since the databases are vary in structure, data types, constraints, and domains.Moreover, the difficulties in this approach is difficulties of integration of all databases of universities (which hold many colleges and departments) since they are not available of very hard to get.
Alecos M. Kelemenis and Dimitrios Th.Askounis [3] proposed a framework to implement a decision support system for human recourse.The researchers explained the main components of human resource system, the relationships among components, and the tools of decisions that should be used for analysis.However, the researchers listed and explained theoretically the components, rules, relationship among components that used to implement the human resource decision support system.
ZHANG Dan-Ping [20] proposed a model based on data warehouse for human resource of a dataset of a university.The main aim is to evaluate the scientific research of teachers' abilities.The researcher combined the DW with the practical work and used a snowflake as an architecture for the proposed HRDW.
However, there is no reports implemented to show the OLAP functionalities and view the multidimensional data cube of HRDW.
Hokey Min [21] presented an empirical analysis to find the variable that affect the employee turnover DW.The researcher used the data obtained from a questionnaire intended for many industries such as wholesalers, logistics providers, retailers, and manufacturing that performed DW operations.The researcher found after analyzing the data using regression that the variables that affect DW employee turnover are years of experience, skills, firm size, pay scale and job security.The paper conducted also the model that link the variables of job satisfaction and job alternatives.However, the paper did not present how to implement DW model and how to use the functionalities of DW to find the most effective variable.

Model
The dataset used to implement the HR data mart consists of 484 rows of retired employees of Basra Oil Company, Basra, Iraq.The dataset holds detailed information (birth, gender, years of service, job, employee title, skills, retirement date, qualification, work department, and salary stage).The HR dataset is imported as a comma-separated value (CSV) to SQL Server DBMS where the processes of selection, cleaning, consolidating, and integrating are performed on the staging area.The employee's information privacy is one of the major concerns in all the organization's policies.Due to that, the implementation process of HR data mart should handle this point and reserve it.The data mart is used here rather than DW since the data is a departmental level and small compared with the intended data to build DW.This data mart can be used later for building the enterprise HR DW.
Designing DW and data mart is the responsibility of information technology (IT) professionals where the DW storage and performance must be optimal where the HR system should ensure that each member in the department has enough information to fulfill his/her job.HR and IT should collaborate together to find a better long-term solution to produce a useful and optimized system [22].The approach used to implement HR data mart is a bottom-up approach since the data mart parts will be implemented to produce the final shape of HR data mart [23].
DW is an information repository that been collected from different heterogeneous data sources and stored under one schema in a single data store.Many pre and post processing operations are required to build DW such as data integration, data cleaning, data transformation, and data refresh.The four characteristics of DW is "subject-oriented, integrated, time-variant, and non-volatile" set of data that used to support decision-making process [24].Subject-oriented define the general nature of data collected to build and analyze the subject in the DW.Integrated refers to the data integration process for all data sources while time-variant refers to the historical nature of the stored data in DW.Finally, the non-volatile means that the stored data will not be altered and changed [6].The data mart is a part of DW which is selected to make fast analyzing and list simple reports.The nature of data sources determines the type of data mart into depended and independent [25].The dependent data mart takes data from central DW which is already implemented while independent data mart is implemented based on data sources or standalone systems.The process of implementing DW and get the data out of data sources to DW tables is extract-transformload (ETL).ETL involves many other processes to organize the data coming from different operational systems and filtering these data and loading the data into DW schema.The dependent data mart construction process is simple than independent data mart since the source of dependent data mart is central DW which is already implemented [26], [27].
The main difference between DW and data mart is the goal of implementation, DW is an enterprisedriven approach while data mart is a departmental-driven approach.Data mart may contains data, programs, software, and hardware of specific department.Data mart of single company are different from each other but can be coordinate to support the decision-making of entire organization.Data warehousing in the other side, represents the processing operations and DW implementation steps starting from extracting data and ending with implementing reports for analyzing [28][29] [30].The model implementation framework is presented in the figure (1) to build independent HR data mart.The basic steps of HR data mart implementation process are data preprocessing, implementing ETL, building HR cube, and implementing OLAP reports.

Data Preprocessing
The data selection process is performed to determine the data types in the data source and the appropriate strategy to get the data from a CSV data source.The staging area table in the server will hold all selected data in order to process them and load them in the tables of the HR data mart.The staging area represented by a table that holds all the selected data with its data type and acts as an intermediate area between data source and data mart storage area, the data mart schema should be built and prepared to load the processed data from staging area.Figure (2) shows the data mart proposed schema.consists of (Year, Month, Day, Week, and Quarter) which can have used to provide different results from different perspectives.Star schema is the base structure in DW development where the dimensional structure of databases is used to implement reports for BI needs.Star schema is used as a model schema for many reasons that started from the simplicity of implementation and ending with the efficiency of the query.The multidimensional cube result from using snowflake/star schema allows performing data visualization and complex OLAP queries.The cube allows getting the views of summarized data with efficient queries [31][32].

Implementing ETL
ETL process of dependent data mart consists basically of determining the right data subset from central DW relevant to the data mart subject and transferring the summarized data into data mart.In the other hand, the independent data mart deal with all ETL processes since there are many different data sources.The motivation behind independent data mart implementation is providing a solution in a short time, while improving performance, better control, lower cost of telecommunication, and availability are the motivations behind implementing dependent data mart [33].
To perform ETL effectively, the staging area needs to be prepared and the staging table should be loaded with HR data from the data source.One of the significant parts of DW is the data staging area where it is a place of arranging and organizing data sources before loading them in the DW dimension tables [34].The ETL processes are performed using SSIS 2014 based on SSDT 2013.The dimension tables are loaded using a fixed type of slowly changing dimension (SCD).The reason behind the loading table with this type there is no need for changing the data in the dimension tables.All dimension tables are loaded parallel.The dimension tables are loaded first with type 0 SCD than the fact table are loaded with only surrogate keys of dimension tables (Info, Date, Salary, and Job) and measurement key (count).The measurement key represents what will be measured in the data mart [35].In HR data mart the measurement represents the count of retired employees which should be calculated over all dimensions' columns.The calculation will be performed as a multidimensional query over HR cube to find the number of employees that match the query arguments.

Building HR Cube
The HR data mart should be built as a cube.The cube should be formed based on dimension tables constructed before in order to perform different calculations on the detailed data reside in HR data mart.In this stage, the data is transformed from the normalized form of tables into multidimensional cubic representation.Since the SCD type in the loading process is type 0, so the new data will be loaded as soon as the ETL package is executed.SSAS package is used to build an HR cube based on all dimension tables keys which allow the analyst to perform all OLAP operations (slice, dice, roll up, and drill down).The advantages of using cube for DSS can be listed as [29]: x The fast response of the OLAP query makes it better choice for HR DSS.
x If the query is set and defined before, the OLAP query will be very fast and accurate.
x Slice and dice can perform quickly.As soon as the data loaded into the cube, it is indexed by using techniques and formats which are designed for dimensional representation.The OLAP engine manages the performance aggregation and summary tables to make very fast response results due to indexing strategies, pre-calculations, and optimization techniques.The analysts and decision-makers can perform drilling down and rolling up by removing or adding dimension keys without building new queries.OLAP cube also provides analytical functions that exceed SQL functions which offer you the price of load performance of large data.OLAP server offers you a graphical view and a multidimensional spreadsheet tabular view for better structure of data [6][36].

Implementing OLAP Reports
The OLAP server is used to implement two categories of OLAP reports.OLAP server can answer the "what-if" question without manipulating the cube on the server [37] [38].Many tools can handle the cube in the OLAP server to present the query result and shape these results as a chart.OLAP servers' goals are storing and managing information in multi-dimensional view using different front-end, data mining, reports, and query tools [39].The data in DW is stored and managed by OLAP servers which used to present the information multi-dimensionally.The needs of decision-makers and managers are varying from simple queries to complex queries.The view method also varies from locally to web view of charts to analyze the result charts.Some HR managers are concerned with listing the number of retired employees according to month and classified by gender.In this case, an OLAP report based on MS excel pivot table should be sufficient, while other managers and decision-makers need a sophisticated OLAP query to list all the retired employees according to job title grouped by gender and classified by quarter and grouped by month.In this case, the reports based on SSRS is required for these needs.The managers and decisionmakers' needs should be determined in order to build matching reports based on a proper schema table.These needs may determine the shape of DW architecture [40] [41].
The main obstacle in DW and data mart adoption in HR is the consideration of decision-makers to users' fears to lose control and the inability to express their knowledge.The adoption of a data mart is the proper use of supporting the decision-making process.The other use of DW and data mart besides supporting decision making is controlling and monitoring activities.DW and data mart can prevent the conflicts which could appear from different languages, cultures, and practices of stakeholders and DW and data mart present " single version of the truth" [42].The dimensional cube browsing can help the analysts to measure the model implementation success and to perform OLAP queries which used to implement web OLAP reports.The first way to present the result of OLAP queries is to drag and drop the dimension table column(s) or hierarchy followed by measurement to perform the query and present the results.Figure (3) lists the number of employees according to quarter and gender.Two dimensions are used to list the results (Date, and Info).The result showed that the number of employees according to four quarters and classified by gender (M, and F).Male employees in the fourth quarter take the maximum number with 221 while the overall female employees took the minimum number overall quarters.The second way to perform OLAP queries and view cube results is to select the dimension hierarchy with operator to filter expression.In this window, the analyst can select more than one dimension and hierarchy with different operators and filter expressions.A. Offline OLAP Reports.
To implement the two categories of OLAP reports, there is a need for a query language that makes calculations and retrieves information based on fact table measurement.The structured query language (SQL) is not suitable to perform this kind of complex OLAP calculations.So, OLAP tools use a multidimensional expression (MDX) to perform the calculations.The standardized language MDX is developed by Microsoft and being adopted by other OLAP vendors.MDX is similar in some respects to SQL but it has special features and clauses that make the querying and multidimensional view of underlying data very easy [43].The OLAP report using the MS Excel Pivot table is one of the effective tools to presents OLAP queries as a customized report.This tool permits to select different dimension columns with filter window to minimize the results and get accurate results.The figure (5) presented the number of retired employees according to job title and classified by quarter.Cooling and heating, and electricians took the maximum  The second report presented in figure (6).This figure shows the retired employees according to month and classified by gender.It is obvious that the retired male (M) employees took the maximum number of overall months.The retired male employees reached the maximum number in November and the minimum number in April month.The number of retired female (F) employees, in general, is very less compared with male employees.B. Web OLAP The Web OLAP report in figure (7) shows the number of retired employees according to quarter classified by gender and grouped by job title.The figure shows as mentioned before, the maximum number of retired employees fall in the fourth quarter.The male cooling and heating employees take the maximum number (68) followed by electricians (64) while female employees (communication technologies, electricians, engineers, and mechanical) took the minimum number of overall retired employees.The next web OLAP report presented in figure (8).This figure lists the number of retired employees according to skills where there are three main categories (university graduated, not studied, and read and write) according to quarter.The retired employees with (university graduated) skill took the maximum number of overall retired employees and this category fall in the fourth quarter.

Conclusion, Discussion, and Future Work
This paper presented a framework of HR data mart with OLAP implementation based on a dataset of retired employees' information of Basra Oil Company.The aim of this work is to find how can this information used to make decisions related to putting the right person in the right place based on skills and qualifications.Since the data used to build data mart is little compared with data intended to implement DW, so data mart is the best choice to build departmental HR data mart.In general, the departmental characteristic is a salient property of data mart.The HR data mart is independent data mart which can be used later to construct full dependent enterprise DW.An HR data mart can assist the organization managers in improving the standards, cost management, innovation, and service and quality improvement.HR data mart helps the analysts with finding the answers to their questions by using OLAP queries and present the information in different charts.The major concern related to building HR data mart is privacy which is reserved during the implementation process.The approach used to implement HR data mart is a bottom-up approach since it started with implementing small parts of the model to reach the final shape.Two categories of OLAP queries are implemented based on a multidimensional cube of HR data and present the results in different ways.Offline OLAP reports using MS Excel Pivot table after importing all dimension tables data from an analytical server to perform multidimensional query OLAP and present the result in charts.These kinds of charts can be accessed locally from a desktop client PC.The field of HR data mart needs more researches and applications to make in order to show the functionalities and effectiveness of using DW and data mart with HRM.Using HR data mart with OLAP can be considered as a promising method and valuable DSS to help managers and decision-makers in supporting their decisions.Based on these concluded points, HR enterprise DW is suggested as future work to build an enterprise DW based on multiple data mart.Key performance indicators (KPI) can be used to give an indication when some positions reach a critical number of employees.Data mining techniques such as decision tree, association rules, and clustering can be performed to make a prediction or a classification on HR data mart.

Figure 2 :
Figure 2: HR Data Mart Schema.The data mart schema is constructed based on star schema with four dimension tables and fact table.The dimensions are (Job, Info, Date, and Salary) tables with fact table that connect them and hold a measurement (count) which is the base of answering all cube queries.The historical dimension (Date)

Figure 3 :
Figure 3: According to Quarter and Gender.
Figure (4) presents the results after selecting the dimension (Info) with hierarchy (age) in the range (64-70) years old with dimension (Date) and hierarchy (Quarter) equal to (4) for all (Emp_Title) employee title.Senior technical, managers and mechanical staff took the maximum number of retired employees based on the filter fields.

Figure 5 :
Figure 5: Quarterly View of Retired According to Job.
employees while electrical protection and water and services took the lowest number of retired employees.

Figure 6 :
Figure 6: Number of Retired according to Gender.

Figure 7 :
Figure 7: Retired Employees According to Quarter Classified by Job.

Figure 8 :
Figure 8: Retired Employees According to Skills.