BUILDING AN INFORMATION ANALYSIS SYSTEM WITHIN A CORPORATE INFORMATION SYSTEM FOR COMBINING AND STRUCTURING ORGANIZATION DATA (ON THE EXAMPLE OF A UNIVERSITY)

for prompt decision-making. The object of the study is data stored in the corporate information system of the organization and methods of their analysis for making management decisions. The subject of the study is the automation of work with data within the corporate analytical system, the identification of data analysis patterns, as well as the design of an information analysis system of a university. The presented information analysis system will solve the problem of consolidating disparate data of corporate information systems, as well as operational data of the organization. This is ensured by the creation of a metadatabase and the formation of an information analysis system add-on using PowerBI technologies. The generally accepted design scheme of the information system was modernized, demonstrating the place of the metadatabase within the corporate information system of the university. A model of data analysis based on the formation of production rules for building a decision tree on the example of human resources analysis is presented. The results of this study can be useful to analysts, executives and senior managers of large organizations in creating an analysis system for the organization's performance.


Introduction
Modern organizations are forced to use various information storage and processing systems. For example, 1C Enterprise is used to generate financial statements, while systems such as PerCO can be used to record and monitor working hours. Universities often use their information systems to store student data, assessment results, scores and attendance. All these data are difficult to combine into a single structure, and it is even more difficult to obtain operational data on the university's activities in order to make decisions. This is possible only through the introduction of modern information technologies in the process of university management and their continuous improvement.
To solve this problem, it is necessary to structure and summarize the data stored in various information systems of the corporate information system into a common metadatabase. With these data, an information analysis system is created, which is an add-on to all corporate information systems and allows using metadatabase data for intelligent analysis.
The methodology for building an information analysis system of a university is described in [1]. The basic principles of organizing a data strategy in the information space of a university presented in the form of dynamic dashboards and analysis of key decision-making issues are discussed in [2]. The data structure model and the place of the information analysis system within the corporate information system of the university are described in detail in [3]. The development of comprehensive decision support tools based on expert evaluation using the analytic hierarchy process is considered in [4]. An example of the practical application of the information environment of an educational institution to form an individual educational path is presented in [5].
The use of business intelligence programs allows you to describe an organization as an ecosystem. The operating BI used consists of complex components that cover the general functions of storage, processing, monitoring, visualization and delivery of information to form an ecosystem [6]. A relevant example is using a data platform for Portuguese universities [7]. The system architecture of business intelligence for AUN-QA Framework for a higher education institution is presented on the example of the ASEAN university network in Turkey [8]. In operations with big data, special data storage and use models are applied [9]. Providing researchers with data handling tools is an important task in the field of research management. It is relevant to study the principles and mechanisms for implementing data management systems, including the principles of FAIR Data [10].
The implementation of data analytics in an enterprise has its advantages, and the effect of implementation is expressed in the growth of enterprise competitiveness [11]. The introduction of data analytics is relevant for educational institutions, for example, colleges [12]. The introduction of data analytics in public institutions has a positive effect [13].
The global trend of commercialization of educational institutions on the one hand allows organizations to make decisions independently; on the other hand, the organization takes full responsibility for management decisions. To make strategically important decisions, the board of directors and the head of the organization must have reliable, up-to-date and complete data. By introducing an information analysis system into corporate information systems, which allows you to consolidate all the organization data and supplement it with operational information, senior managers get a powerful tool for analyzing the current situation. Selecting up-to-date data and creating a metadatabase allows you to generate interactive reports that are updated at a given frequency. Therefore, a study on the development of an information analysis system within the corporate information system is relevant.

Literature review and problem statement
Strategic management of the educational organization under constantly changing external and internal indicators makes the decision support process difficult in terms of data analysis. If we also consider that data are often stored in disparate information systems of a corporate information system (CIS), the process of collection and analysis often takes longer than necessary to make decisions. Therefore, senior managers have to rely not on real data, but on the opinions and experience of employees of relevant departments. Thus, we get a discrepancy: on the one hand, we have a lot of data within the corporate information system, on the other hand, we depend on the professionalism of employees in terms of data analysis. In fact, decisions are made not on the basis of data, but on the basis of the qualifications and experience of the manager. This problem arises from the presence of a large amount of poorly structured data within the corporate information system of the organization.
The problem is also relevant for both the healthcare system [14] and production enterprises [15]. However, the authors of these works focus on changing the organization's data structure to speed up the analysis and processing process. For educational organizations, this process can be very costly in terms of time resources, given the constant flow of reporting and current documentation. Programmers often do not have enough time to create new reporting forms according to the requirements of ministries and departments.
There is a whole area of data analysis using neural networks. A number of authors suggest using video to recognize employees [16]. It is proposed to use ultra-precise neural networks to determine the weights of multimedia data [17], as well as to build context-sensitive diffuse networks for detecting visual dependencies [18]. This area is highly specialized, aimed at solving only problems related to multimedia processing and assumes the presence of a powerful server for building a neural network, training it and processing video data. Often, educational organizations cannot afford a powerful additional server with a neural network deployed on it. Undoubtedly, the proposed analytical system can include such a direction of data analysis, if the organization's hardware allows.
Another area in the development of the organization's analytics is the creation of a specialized system that allows consolidating corporate information system data into a single information analysis system. Prospects for the development of corporate information systems are expressed in the creation of an integrated information analysis system [19]. Such a solution will make it possible to create cloud-based data processing and visualization systems by integrating data within the corporate information system [20]. Moreover, this approach allows expanding the capabilities of data analysis by accessing public data from analytical services [21]. A natural problem an organization may face is securing data in cloud storage [22]. In this regard, research in the field of building effective data queries is relevant [23]. It is necessary to ensure the confidentiality of stored data, and one of the approaches is Random Space Perturbation (RASP) [24].
Summarizing most of the publications related to the implementation of BI systems in various types of organizations [25], difficulties faced by any organization are identified. These are:
- lack of or weak integration of data among the systems of the corporate information system;
- lack of additional servers for publishing analytical data and ensuring their security;
- dynamically changing data.
Thus, many organizations face the problem of disparate data stored in corporate information systems.

The aim and objectives of the study
The aim of the study is to improve the corporate information system for building an information analysis system on the example of a university. This will make it possible to create an information analysis system within the corporate information system, which allows combining and structuring the organization's data.
To achieve the aim, the following objectives were set:
- to identify the place of the information analysis system in the corporate information system of the university and determine the design stages of the information analysis system;
- to give an example of data analysis using production rules and an example of building a decision tree for a decision support system;
- to justify the design of the information analysis system based on the structure of the corporate information system of the organization;
- to give an example of the development and deployment of the information analysis system.

Materials and methods
The object of the study is data stored in the corporate information system of the organization, methods of their analysis for making management decisions.
The hypothesis of the study is that the use of cloud-based business intelligence systems will allow senior managers of an educational institution to make decisions based on real data from integrated corporate information systems.
Since educational institutions differ in the number of information systems that make up their corporate information system, the following assumptions and simplifications were made:
- corporate information systems were classified by the type of their use (accounting and production systems, personnel management systems, financial and marketing systems, analytical systems);
- the names and technical characteristics of corporate information systems were abstracted away as much as possible;
- the system of indicators is formulated very conditionally, since it is created to solve specific management problems and is given in the paper as an example to demonstrate the operation of production rules.
As a tool for creating, structuring and presenting analytical data, the PowerBI cloud data storage and analysis system from the Microsoft Corporation was chosen.
The proposed block diagram of the design stages is a step-by-step algorithm for the development and implementation of an information analysis system within the corporate information system. It can be abstracted from the education system as a whole and implemented within an arbitrary enterprise of the production cycle.

1. The place of the information analysis system in the general structure of the corporate information system
Modern software products and tools make it possible to create various information analysis systems for monitoring and regulating key indicators for making management decisions.
If we present existing software products in a hierarchical form (Fig. 1), it is obvious that accounting and production systems underlie this pyramid. This is because they process, structure and store information directly related to materials and means of production.

Fig. 1. Hierarchy of software products in terms of information analysis and processing (author's generalized vision)
Since the successful functioning of human resources, finance and marketing promotion systems requires information stored in accounting and production systems, it is advisable to place them at the next level.
Information analysis systems are an add-on to the previous two levels of information systems: they combine data from all systems in the corporate system of the university. Their main task is to make information as accessible as possible to decision-makers, regardless of their location and the amount of information processed.
The creation of an analytical service for companies, enterprises and large organizations in the financial and social sector is becoming an urgent direction. As practice has shown, not all data stored within the corporate information system are well structured [7]. One of the conditions for successful business development is the ability to work with data flows. Experience in data analysis shows that two-thirds of employees have access to information they are not authorized to see. On average, only half of structured and 1 % of unstructured information is used in decision-making, and analysts spend 80 % of their time on data discovery and preparation. The main tasks facing the analytical service of any enterprise or organization (including universities) are:
- analysis and structuring of data flows;
- defining the roles of system users;
- optimization of the data collection, processing, storage and transmission process;
- determination of algorithms for synchronizing disparate data arrays;
- selection of consolidation means for data processing to generate analytical reports.
Let's consider the process of designing an information analysis system based on a classical representation. The design of an information analysis system can be divided into 6 stages (Fig. 2). The difference from the classical representation in this case is the addition of the items "Analysis of metadata repository (7)", "Design of metadata store (10)" and "Development of metadata store (14)".
By metadata we mean data stored in various corporate information systems, combined through data integration.
The introduction of these items into the classical design process will allow you to create an additional metadata store, which is a virtual database. Such a database consists of stored procedures, queries, and a number of additional tables. It does not duplicate the data of corporate information systems, but only accesses them, if necessary, in the "read only" mode. Additional tables of such a database consist of data obtained from additional sources, such as Google Forms or open data from third-party organizations.
Additional tables can also be created within the organization to meet their needs for storing data presented in tabular form. An example of such data is information stored in Excel and generated from periodic reports. The creation of an additional information system for storing it is often not promising, however, data stored in this way can be useful for an information analysis system.
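The idea of a metadata store that only reads from the source systems and combines them with its own additional tables can be illustrated with a minimal sketch. It uses Python's built-in sqlite3 module; the attached databases `hr` and `edu`, the table and column names, and the sample rows are all hypothetical stand-ins for real corporate systems. The sketch does not enforce read-only access; it simply issues only SELECT statements against the sources once the demo data is in place.

```python
import sqlite3

# The main connection plays the role of the metadata store.
conn = sqlite3.connect(":memory:")
# Hypothetical source systems, attached under their own schema names.
conn.execute("ATTACH DATABASE ':memory:' AS hr")
conn.execute("ATTACH DATABASE ':memory:' AS edu")

conn.execute("CREATE TABLE hr.staff (staff_id INTEGER PRIMARY KEY, name TEXT, rate REAL)")
conn.execute("CREATE TABLE edu.courses (course_id INTEGER PRIMARY KEY, staff_id INTEGER, title TEXT)")
# Additional table owned by the metadata store (e.g. loaded from a survey form).
conn.execute("CREATE TABLE survey_scores (course_id INTEGER, score REAL)")

# Demo rows (assumed data, for illustration only).
conn.executemany("INSERT INTO hr.staff VALUES (?, ?, ?)",
                 [(1, "A. Teacher", 1.0), (2, "B. Teacher", 0.5)])
conn.executemany("INSERT INTO edu.courses VALUES (?, ?, ?)",
                 [(10, 1, "Databases"), (11, 2, "Statistics")])
conn.executemany("INSERT INTO survey_scores VALUES (?, ?)",
                 [(10, 4.5), (10, 3.5), (11, 5.0)])

# A stored query of the metadata store: it joins the sources in place
# instead of copying their rows.
STAFF_SURVEY_QUERY = """
SELECT s.name, s.rate, c.title, AVG(v.score) AS avg_score
FROM hr.staff AS s
JOIN edu.courses AS c ON c.staff_id = s.staff_id
JOIN survey_scores AS v ON v.course_id = c.course_id
GROUP BY s.staff_id, c.course_id
ORDER BY s.staff_id
"""

def staff_survey_report(connection):
    """Run the stored query and return one row per teacher and course."""
    return connection.execute(STAFF_SURVEY_QUERY).fetchall()

report = staff_survey_report(conn)
```

In this arrangement only `survey_scores` belongs to the metadata store; the staff and course rows stay in their source systems and are reached through the stored query.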
At the first stage, a feasibility study is carried out in order to transform or introduce a new information system. Project goals and objectives, project relationship with the strategic goal, business strategies, critical success factors and performance indicators of the company are indicated.
The planning stage is divided into two parts. The main output of this stage is the project plan. Assessment of the enterprise infrastructure (2) implies the selection of a method, general methods for assessing the effectiveness of economic systems, special methods adapted for the enterprise infrastructure. Project planning (3) is a continuous process aimed at determining and agreeing on the best course of action to achieve the project goals, taking into account all factors of its implementation.
The analysis stage is divided into two levels. The main task of this stage is to determine what data should be used for the effective implementation of the project, and in what form the data will be presented after processing by the information analysis system. Requirements for the project (4) are determined by the need to identify the main performance indicators of the project. Data analysis (5) involves the definition of the main information flows and processing methods. Application prototyping (6) involves the development of the interface part of the project, as well as the formation of the main modules demonstrating the project performance. Analysis of the metadata repository (7) allows you to design an algorithm for collecting metadata to generate analytical reports.
Design is an important stage in the development of an information analysis system, which gives a clear idea of its work and appearance. Database design (8) represents the process of developing a database schema, analyzing data relationships, extracting information tables, and defining integrity constraints. At the ETL (Extract, Transform, Load) design stage (9), the concept of combining data stored in disparate systems and defining user interactions with them is determined. Designing a metadata store (10) consists of the following steps:
- definition of data store objects and their attributes;
- description of the semantics of data sources and their attributes;
- description of algorithms for combining data into a single system and their transformation;
- description of data access and analysis algorithms.
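The ETL concept described above can be sketched in a few lines. This is only an illustration under stated assumptions: the two source exports, their field names (`EmpID`, `FullName`, `Rate`, `teacher_id`, `courses`) and the unified schema are hypothetical, and a dictionary stands in for the target store. In the system described in this paper, these steps are carried out with PowerBI tools rather than custom code.

```python
def extract():
    """Return raw rows from two hypothetical systems with different key names."""
    hr_rows = [{"EmpID": "001", "FullName": " Ivanov I. ", "Rate": "0,5"}]
    lms_rows = [{"teacher_id": 1, "courses": 3}]
    return hr_rows, lms_rows

def transform(hr_rows, lms_rows):
    """Bring both sources to one schema keyed by an integer staff_id."""
    courses = {row["teacher_id"]: row["courses"] for row in lms_rows}
    unified = []
    for row in hr_rows:
        staff_id = int(row["EmpID"])
        unified.append({
            "staff_id": staff_id,
            "name": row["FullName"].strip(),
            # Decimal comma in the source export is normalized to a float.
            "rate": float(row["Rate"].replace(",", ".")),
            "courses": courses.get(staff_id, 0),
        })
    return unified

def load(rows, store):
    """Write the unified rows into the target store (a dict in this sketch)."""
    for row in rows:
        store[row["staff_id"]] = row
    return store

store = load(transform(*extract()), {})
```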
At the development stage, the design solutions described in the previous stages are implemented. The information analysis system is filled with functionality.
Release (16) and deployment management is responsible for providing and testing the service delivery capabilities identified during the design phase. The main objectives of release and deployment management are:
- formation and approval of release plans;
- ensuring that each release package consists of a set of related and compatible components;
- managing the release and its components within the implementation processes;
- providing access to information for customers and investors so that they can effectively use the new or modified service;
- providing access to information for operational personnel so that they can provide, maintain and manage the service.

Fig. 2. Block diagram of the design stages of an information analysis system
Since some of the above steps are typical and do not present any difficulties in design, only those whose formation methodology differs from standard solutions should be presented in detail.

2. Data analysis using production rules
As an example, the problem of analyzing the quality of students' training by levels of education is considered. This problem was divided into tasks: 1) to study human resources; 2) to study the relevance of disciplines taught; 3) to study the quality of educational material; 4) to identify additional indicators for training specialists. Each of the tasks involved the allocation of indicators. To solve the first task, the main indicators affecting the quality of the university infrastructure were identified. The database of production rules for data analysis was formed by experts. The most relevant data analysis indicators were identified using the analytic hierarchy process [4]. Not only the priority indicator was taken, but also a number of indicators that have a fairly large weight. Metadata is stored in additional tables, so they can be adjusted, added or removed without significantly affecting the corporate information system. This is convenient, since experts do not have to single out or formulate them each time, depending on the study.
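The selection of indicators by weight can be illustrated with a minimal sketch of the analytic hierarchy process. It approximates the priority weights with the row geometric-mean method, which is one common way to compute them and is not necessarily the procedure used in [4]. The indicator names, the pairwise comparison matrix and the threshold of 0.25 are assumptions made for the example.

```python
import math

def ahp_weights(matrix):
    """Approximate AHP priority weights by the row geometric-mean method."""
    n = len(matrix)
    geo_means = [math.prod(row) ** (1.0 / n) for row in matrix]
    total = sum(geo_means)
    return [g / total for g in geo_means]

def select_indicators(names, weights, threshold):
    """Keep every indicator whose weight reaches the threshold."""
    return [name for name, w in zip(names, weights) if w >= threshold]

# Hypothetical expert judgments: entry [i][j] says how much more important
# indicator i is than indicator j.
names = ["labor rate", "academic degree", "publications"]
pairwise = [
    [1.0, 1 / 2, 1 / 4],
    [2.0, 1.0, 1 / 2],
    [4.0, 2.0, 1.0],
]

weights = ahp_weights(pairwise)
selected = select_indicators(names, weights, threshold=0.25)
```

With a threshold below the top weight, more than one indicator is retained, which mirrors the choice described above of keeping several heavily weighted indicators rather than only the leading one.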
On the example of human resources, it is considered what indicators may be relevant for analysis (Fig. 3). Let some context K = {S, P, R} be given, where S is the set of statuses (parameters) of the research object; P is the set of features that the object of study shows in this context; R is the set of decisions that should be made if an object with certain parameters exhibits certain features, i.e. R ⊆ S × P.
Statuses are the set S = {S1, S2, …, Sm}, where m is the number of parameters characterizing the object.
The features are described by the following sequence P = {P11, P12, …, P1n1; P21, P22, …, P2n2; …; Pm1, Pm2, …, Pmnm}, where n = {n1, n2, …, nm} and ni is the number of features of parameter Si.
Fig. 3 shows a decision tree for analyzing the human resources of the department of a typical university. A general analysis of the department, faculty and university staffing can consist of the following parameters and features.
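One possible in-memory representation of the context K = {S, P, R} is sketched below. The class name `Context`, its field names and the short descriptions are illustrative assumptions; only the codes S1, S3, P11 to P13, P31 to P35 and R6 come from the text of this section.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Context:
    statuses: dict   # S: status code -> description
    features: dict   # P: status code -> {feature code -> description}
    decisions: dict  # R: decision code -> description

    def feature_counts(self):
        """Return n_i, the number of features defined for each status S_i."""
        return {s: len(self.features.get(s, {})) for s in self.statuses}

context = Context(
    statuses={"S1": "labor rate", "S3": "amount of part-time job"},
    features={
        "S1": {"P11": "regular rate",
               "P12": "part-time employee",
               "P13": "internal part-time employee"},
        "S3": {"P31": "labor rate of 1",
               "P32": "labor rate less than 0.25",
               "P33": "labor rate from 0.25 to 0.5",
               "P34": "labor rate from 0.5 to 0.75",
               "P35": "labor rate from 0.75 to 1"},
    },
    decisions={"R6": "pay attention to the aging of the academic staff"},
)
```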
Labor rate (S1) has the following features: the rate (P11), when a regular employee works a certain number of hours; part-time employee (P12), an employee who is not regular at a given university but performs a certain load; internal part-time employee (P13), when a regular employee works a certain number of hours in addition to the basic rate.
An additional parameter is the amount of part-time job (S3). Given the number of hours worked by an employee, this parameter makes it possible to judge the volume of his or her employment. Features for parameter S3: P31 - labor rate of 1; P32 - labor rate less than 0.25; P33 - labor rate from 0.25 to 0.5; P34 - labor rate from 0.5 to 0.75; P35 - labor rate from 0.75 to 1.
An additional parameter S5 is the number (in %) of employees with certain features. Since there are 4 main age divisions, as well as 4 divisions by degree, one set of features characterizing the quantitative indicators (in %) for S5 will be enough:
- P51 - the number of employees is less than 25 %;
- P52 - the number of employees is more than 25 %, but less than 50 %;
- P53 - the number of employees is more than 50 %, but less than 75 %;
- P54 - the number of employees is more than 75 %.
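The mapping from a measured value to a feature code can be sketched as follows. The text does not say which feature applies at exactly 25 %, 50 % and 75 %, or at a rate of exactly 0.25, 0.5 and 0.75, so this sketch assumes that a boundary value falls into the higher interval; the function names are hypothetical.

```python
def share_feature(percent):
    """Map a share of employees (0-100 %) to one of the codes P51..P54."""
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    if percent < 25:
        return "P51"
    if percent < 50:
        return "P52"   # assumption: exactly 25 % counts as P52
    if percent < 75:
        return "P53"
    return "P54"

def rate_feature(rate):
    """Map a labor rate in (0, 1] to one of the codes P31..P35."""
    if not 0 < rate <= 1:
        raise ValueError("rate must be in the interval (0, 1]")
    if rate == 1:
        return "P31"
    if rate < 0.25:
        return "P32"
    if rate < 0.5:
        return "P33"   # assumption: exactly 0.25 counts as P33
    if rate < 0.75:
        return "P34"
    return "P35"
```

For example, `share_feature(80)` yields `"P54"` and `rate_feature(0.5)` yields `"P34"` under the boundary assumption stated above.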

Fig. 3. Indicators for the analysis of the human resources of the department
This percentage is not accidental. Firstly, the requirements for the number of employees with an academic degree are typical for state and national universities [2]. It is also the well-known Interquartile Range (IQR) ratio, adopted to study the indicators of variability and at the same time eliminate the calculation error.
The presence of public assignments (S6) is a parameter that can show the amount of social work that the department employee performs. The features of such an indicator are P61 -the number of employees performing less than 5 % of the total amount of public work; P62 -the number of employees performing more than 5 %, but less than 10 %; etc.
Availability of publications (S7) is an indicator used when analyzing the activities of employees. The features describing this indicator are P71 -the number of articles in the rating indexed in databases that have a non-zero impact factor; P72 -the number of articles in scientific journals recommended by the Ministry of Science and Higher Education of the Republic of Kazakhstan; P73 -the number of articles published in international conferences.
Participation in grant projects (S8) and quantitative indicators of the training of undergraduates and doctoral students (S9) were taken into account in the overall analysis, but were not considered in this study.
Based on the task, the elements of the decision base of production rules in the analysis of the department staff were presented as follows:
- R1 - to recommend an employee to enter a PhD program;
- R2 - to train an employee for a management position;
- R3 - the head of the department should pay attention to the employee with such features, since he either teaches a small subject or gives only lectures having no feedback from students;
- R4 - to monitor the quality of disciplines taught;
- R5 - to recommend that an employee join the staff;
- R6 - to pay attention to the aging of the academic staff;
- R7 - to monitor the quality of students' training in the specialties of the department;
- R8 - to apply to the Ministry of Science and Higher Education for additional places in doctoral studies;
- R9 - the social work of the department is unevenly distributed.
The algorithms for each of the decisions are as follows: Similarly, any analysis and decision-making processes both at a university and at an enterprise can be described.
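How a base of production rules of the form "if these features are observed, then make this decision" can be evaluated is sketched below. The two rules are purely illustrative: their conditions are assumptions chosen for the example and do not reproduce the rule base formed by the experts in this study.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Rule:
    conditions: frozenset  # feature codes that must all be observed
    decision: str          # decision code from the list R1..R9

# Illustrative rules only; the conditions are NOT the authors' rule base.
RULES = [
    Rule(frozenset({"P43", "P54"}), "R6"),  # assumed link to staff aging
    Rule(frozenset({"P12", "P32"}), "R3"),  # assumed link to small workloads
]

def infer(observed, rules=RULES):
    """Return the decisions of all rules whose conditions are satisfied."""
    observed = set(observed)
    return [rule.decision for rule in rules if rule.conditions <= observed]

decisions_for_department = infer({"P11", "P43", "P54"})
```

Keeping the rules as data rather than as code is in line with the earlier remark that indicators stored in additional tables can be adjusted, added or removed without affecting the corporate information system.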
Based on the allocated production rules, an analysis was carried out in two universities of Kazakhstan. The results were provided to the heads of organizations. Although the criteria used in both studies were the same, the problems identified in this study were quite different. At the Al-Farabi Kazakh National University (Almaty), the problem of severe aging of the academic staff was revealed. Many professors teaching master's and PhD programs are retired (P43). At the North Kazakhstan University named after Manash Kozybaev (Petropavlovsk), the problem of scientific activity was identified (S7). An interesting fact was revealed: if we take individual indicators, considering the labor rate, then the indicators P71, P72 and P73 are even higher than the norms required by internal regulatory documentation. Further research has shown that this discrepancy is because not all regular teachers (P11) work full-time. Given the labor rate of employees, the percentage of scientific work performed shows a significant excess of the declared indicators. However, in fact, these indicators do not pass even the minimum requirements of the strategic plan.
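The effect described for the second university can be reproduced with a small worked example. All numbers are hypothetical: a department of four regular teachers, most of them on fractional rates, and an assumed norm of two publications per teacher.

```python
def publications_per_head(staff):
    """Average number of publications per employee."""
    return sum(pubs for _, pubs in staff) / len(staff)

def publications_per_rate(staff):
    """Number of publications per full labor rate."""
    return sum(pubs for _, pubs in staff) / sum(rate for rate, _ in staff)

# Hypothetical department: (labor rate, publications) for each regular teacher.
staff = [(0.25, 1), (0.25, 1), (0.5, 1), (1.0, 2)]
required_per_teacher = 2.0  # assumed norm, for illustration only

per_head = publications_per_head(staff)   # 5 publications / 4 people
per_rate = publications_per_rate(staff)   # 5 publications / 2.0 rates
```

Here the indicator per labor rate (2.5) exceeds the assumed norm, while the indicator per employee (1.25) falls short of it, which is the kind of discrepancy reported above.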
The analysis found that the same indicators affecting the quality of the university's infrastructure can reveal diametrically opposed problems. This will result in different management decisions.

3. Design of the information analysis system
When forming the interface of the information analysis system, it is necessary to determine from which data reports will be generated and in what form the data will be stored. At the design, development and deployment stage, they proceed from what information systems are already available in the enterprise, what opportunities for their interaction exist. The environment in which reports will be generated, published and stored is determined. When generating reports, the target audience and user groups are defined.
For example, the Al-Farabi Kazakh National University (KazNU) established the Center for Situational Management (CSM). The tasks of the CSM: 1) information and analytical support for the university's activities; 2) monitoring of the educational process; 3) prompt response to emergency situations. Fig. 4 shows the model of the analytical service of the Center for Situational Management, which closely interacts with the working groups of departments. Directions of the department's work: educational process, research work, upbringing process, financial, economic and production activities, administrative and organizational management, international activities and strategic management, quality management systems (QMS), IT services. The analytical service should be guided by the normative and methodological documents of the university (regulations of departments, QMS procedures, instructions, regulatory documentation of the departmental ministry, strategic documents of the university). Analytics engineers of the CSM analytical service have access to databases of key subsystems of corporate information systems of the university.
In Fig. 4, the following abbreviations are used: -academic department (AD); -science and innovation department (SID); -languages and external development department (LEDD); -financial and economic department (FED); -pre-university training (PT); -administrative department (AD); -information technology and innovative development institute (IT and ID Institute).
To organize the work of the analytical service, a performance reporting methodology and an analytical reporting methodology have been developed. The PowerBI platform was chosen as the implementation environment, which made it possible to bring together all the data stored in disparate information systems of the university. That is, if we return to the diagram of IAS design stages (Fig. 2), the design and development of ETL, presented in items 9 and 11, are carried out in this corporate information system using PowerBI Desktop.
Dashboards are a single real-time data center available on all devices that gives business users a complete view of the most important metrics:
1. All organization data in one dashboard (important data about the entire organization and from all applications in one system).
2. Creation of interactive reports consisting of tabular presentation and visualization.
3. Ability to share reports within the organization.
4. Consistent analysis of the entire organization (robust, reusable data models for consistency in the organization's reports and analytics).
5. Convenient embedding of analytics directly into the application (the ability to embed on site pages, PowerPoint presentations).
6. Visualization and analysis of data within a single reporting system.
7. Universal data access (connect to hundreds of data sources regardless of location and type).
The deployment in this case is a cloud storage provided by the Microsoft company for the organization. Access to data on this platform is available only to the organization's users.
Classes of users of the information analysis system:
1. IT department whose primary function is infrastructure management (preparing data sources in the form of tables for data viewing and online reporting).
2. Data owner whose role in this system is to determine access rights depending on the level of users (often a QMS employee).
3. Analytics engineer who creates reports, places them on dashboards and provides data access to users.
4. User who analyzes in detail the data obtained from the reports and formulates requests and business decisions.
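The four user classes can be modeled as roles with sets of permitted actions. The sketch below is an assumption-laden illustration: the action names are invented for the example, and in practice access is configured in the PowerBI service rather than in application code.

```python
from enum import Enum

class Role(Enum):
    IT_DEPARTMENT = "it_department"
    DATA_OWNER = "data_owner"
    ANALYTICS_ENGINEER = "analytics_engineer"
    USER = "user"

# Hypothetical action names derived from the role descriptions above.
PERMISSIONS = {
    Role.IT_DEPARTMENT: {"prepare_sources"},
    Role.DATA_OWNER: {"grant_access"},
    Role.ANALYTICS_ENGINEER: {"create_report", "publish_dashboard", "share_report"},
    Role.USER: {"view_report"},
}

def can(role, action):
    """Check whether a role is allowed to perform an action."""
    return action in PERMISSIONS.get(role, set())
```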

4. Development and deployment of the information analysis system
To generate dashboards containing reports, it is advisable to use business intelligence systems for a number of good reasons. Firstly, they allow you to integrate, transform and store data from various information systems of the organization. Secondly, they provide quick access to the necessary information. Thirdly, they allow you to design multidimensional storage and conduct an in-depth analysis of large amounts of information. Fourthly, they allow building information reports of varying complexity (Fig. 4), modeling and forecasting key indicators and visualizing data.
When choosing a prototype for implementation, it is necessary to take into account the fact that many of the business analysis systems on the market offer data storage on their servers. An important advantage of business intelligence systems is that dashboard data can be viewed remotely by means of connection through the personal account of the head of the organization or structural divisions. The contents of dashboards for different user levels can also be different. On the one hand, such an approach to obtaining analytical data can be beneficial for regional universities that do not have their own servers. On the other hand, the policy of some organizations (for example, military institutions) prohibits storing information on other servers.
As an example of deployment of the information analysis system, Fig. 5, 6 show the appearance of the educational process dashboard (Fig. 5) and an example of the contingent growth analytical report (Fig. 6).
The peculiarity of this system is that it is built on the basis of data stored in various databases of the university's corporate information system based on ETL technology. The PowerBI business intelligence toolkit was used as the implementation tool.
The main purpose of the information analysis system is to provide a multidimensional analysis of data, trends and forecasting the results of various management decisions at all levels of the management vertical, including corporate reporting, financial and economic planning and strategic planning.
The data system in the form of analytical indicators can exist in various forms of reporting. The main task of the information analysis system is to combine disparate indicators to build strategic maps. On the one hand, they demonstrate various directions of the organization's activities, on the other hand, they show the real state of affairs within the framework of one dashboard. This greatly helps the decision-maker, since data are presented in a convenient visual form and do not require the generation of reports by the relevant departments of the organization.
The main features of the IAS:
- maintenance and display of regulatory reference information on controlled technological processes;
- drawing up schedules, determining deadlines for individual operations and critical path of the process;
- monitoring progress and adjusting the schedule in case of unforeseen delays;
- establishment of cause-and-effect relationships in the implementation of activities based on available data;
- provision of data on the deployment of personnel, equipment and rescue equipment;
- creation of standardized and customizable reports providing an arbitrary sample of information stored in the IAS;
- the possibility of expanding the composition of processed information without involving the system developer;
- the possibility of adjusting the rules for analyzing information stored in the IAS by the user;
- the possibility of integration with customer information systems.

Discussion of the results of designing the information analysis system
The layout of the information analysis system presented in Fig. 5 is a web application implemented in the PowerBI system account. Reports can be multi-page, consisting of table data, graphical data views and selection lists. They are dynamic data that change depending on the filters applied.
The data are displayed directly by communicating with the database server, or updated as required. Reports can be embedded in corporate web pages. Access to data can be regulated by providing a login and password to the corresponding user roles.
The proposed production rules are often drawn up by experts in the field of management decision-making, or by senior managers. Reporting forms given as an example are created in accordance with the reporting documentation of the organization. Reporting forms are a dynamic structure developed and published by analytics engineers (Fig. 4) in accordance with the requirements of the head of the organization based on the data of the corporate information system of the university.
The advantage of this approach to the design of an information analysis system is that a fundamentally new information system is not created. Data are taken from existing database tables of corporate information systems. Integration takes place within the PowerBI system, the key fields