Construction and Exploration of an Intelligent Evaluation System for Educational APP through Artificial Intelligence Technology

To improve the evaluation accuracy of educational applications (APPs), evaluation methods for educational APPs based on artificial intelligence (AI) technology are explored. First, evaluation indexes for educational APPs are established according to the principles of index establishment. Second, an index evaluation system for educational APPs is constructed, and weights are assigned to the established indexes with the aid of the analytic hierarchy process (AHP). Finally, the availability and effectiveness of the established evaluation system are investigated through empirical analysis. The results show the following. Five first-level indexes and 20 second-level indexes are established according to existing index establishment principles, and a framework for the intelligent evaluation of educational APPs is constructed. AHP is used to calculate the weight of each index. Among the first-level indexes, the educational and scientific indexes carry the largest weights, with a combined proportion exceeding 60%. Among the second-level indexes, the educational objective, educational principle, and knowledge systematization account for the highest proportions. The empirical analysis reveals that the scores given by the intelligent evaluation system are highly consistent with the actual user scores, which indicates that the proposed intelligent evaluation system is feasible and effective. The proposed system can serve as a basis for the design of educational APPs to improve their value.
Keywords: Artificial intelligence; educational APP; intelligent evaluation system

through computers. The fields of AI include natural language processing, expert system construction, recognition of images, language, and other information, and robot research [1]. With the continuous development of information technology, the combination of education and information technology has led to the development and continuous updating of educational applications (APPs). The emergence of educational APPs is also a critical research result under AI and has brought convenience to learning. At present, educational APPs can be divided into teaching-assistance, management-assistance, and learning-assistance categories [2]. However, their quality is uneven, and it is difficult for learners to find high-quality and suitable learning software among them. Therefore, the intelligent evaluation of educational APPs is of great significance [3]. At present, there is neither a unified concept of the quality of educational APPs nor a unified method of judging it.
Different scholars define the evaluation of educational APPs differently, because experts and scholars are often subjective and biased when formulating such a concept. Drawing on the research of authoritative scholars and on the definition of general evaluation, the evaluation of educational APPs can be defined as follows: according to a given goal, the value of educational APPs is judged through feasible operation techniques and means, which provides a basis for the improvement of educational APP projects [4]. Currently, there are three types of evaluation methods for educational APPs. The first type focuses on software product functions; it evaluates educational APPs with the evaluation methods of software products, from perspectives such as software usability, interface design, and human-computer interaction. The second type focuses on the teaching effect of the educational APPs; it requires long-term follow-up observation of the users of the APPs and is easily affected by interference factors, which makes it difficult to carry out experimentally. The third type simplifies the evaluation method and evaluates educational APPs from the perspectives of usability, content, entertainment, and social interaction, which is more in line with the results of scientific evaluation and the expected evaluation standard [5].
Based on the previous research results, first, the evaluation indexes of educational APPs are established according to the principle of evaluation index establishment. Second, the weights are assigned to the established evaluation indexes with the aid of analytic hierarchy process (AHP). Finally, the availability and effectiveness of the established evaluation system are investigated through empirical analysis.

Educational APPs
With the continuous development of information technology, APPs that can be used on mobile devices have emerged in large numbers. Mobile software is referred to as APPs, which enable mobile devices to carry information and provide interactive functions; in addition, human behavior has changed correspondingly with the appearance of various mobile APPs [6]. The classification and functions of APPs are shown in Figure 1. An educational APP is an application program for intelligent mobile terminals that provides educational learning resources for various learners and mainly carries educational information. The emergence of educational APPs makes learning more convenient and flexible, with a certain degree of interactivity, personalization, and entertainment, which meets the learning needs of contemporary people to a great extent [7]. Educational APPs have been popularized in all stages of education, including early childhood education, elementary education, higher education, and adult education. According to their functional nature, educational APPs can be divided into categories such as early childhood education enlightenment, primary and secondary education assistance, language learning, vocational education and exam counseling, educational administration, and educational games; each category has its specific service objects and contents [8].

The construction principles of intelligent evaluation indexes for educational APPs
While constructing intelligent evaluation indexes for educational APPs, the principles of scientificity, comprehensiveness, independence, essentiality, operability, and qualitative-quantitative combination should be considered [9].
Principle of scientificity: The contents of the upper and lower indexes of the index system should be consistent, and the indexes should be logically related. The evaluation indexes and weight coefficients of the index system should be quantified according to scientific methods and steps.
Principle of comprehensiveness: The evaluation indexes should fully indicate all aspects of educational APPs, evaluating their various attributes from multiple perspectives and different points of view.
Principle of independence: The content of each evaluation index should be independent and should not overlap with or repeat that of other indexes, thereby avoiding redundancy; to some extent, this principle also helps reduce the complexity of computer algorithms.
Principle of essentiality: The established indexes should indicate the most basic aspects of educational APPs, express them concisely and clearly, and show the status of educational APPs from the perspective of their most fundamental attributes.
Principle of operability: The established indexes should be operable in the evaluation, and the evaluation method should be feasible and practical.
Principle of qualitative-quantitative combination: The qualitative indexes are more in line with the ideal evaluation, while the quantitative indexes are more scientific, more accurate, and more reasonable. The combination of the two makes the evaluation results more realistic and comprehensive. While establishing the evaluation indexes for educational APPs, this principle allows the evaluation result to conform to the actual situation and to be scientifically rational at the same time [10].

Determination of intelligent evaluation indexes for educational APPs
Previous research results and the Delphi method are utilized to determine the factors influencing the quality of educational APPs. A review of the literature shows that most current evaluation index systems for educational APPs are composed of first-level indexes and second-level indexes. Therefore, a two-level system is also adopted here to construct the evaluation index system for educational APPs [11]. The current evaluation standards for educational APPs found in the literature are shown in Figure 2 below.

Fig. 2. Evaluation standard for educational APPs
The first-level evaluation indexes of educational APPs are classified into five types: education, functionality, scientificity, artistry, and practicality [12]. The second-level indexes are determined from the content of these five indexes and the principles of index establishment; with the help of the Delphi method, 20 second-level indexes are selected. The design of the first-level indexes is as follows.
Education: The application degree and educational value of the educational APPs in the teaching and learning process of students are measured according to the theories and principles of instructional design. Its second-level indexes are designed mainly from the perspectives of educational subject and correlation. The subject refers to the core elements included in the teaching design and teaching process, and the correlation refers to the elements correlated with the teaching design or teaching process.
Functionality: Functionality refers to the stability, ease of use, and usability of the APP system. Whether the functional services of the APPs can meet the needs of users in learning is evaluated. Its secondary indexes are determined from the perspectives of inherent attributes and extended attributes of the APPs. The inherent attributes refer to the technical requirements and indexes of APPs during development and design; the extended attributes refer to the extended functions and services while utilizing the APPs.
Scientificity: The scientificity of educational APPs is reflected in the content information contained in the APPs. It is mainly evaluated from the perspective of teaching content organization and utilization. Its second-level indexes are designed mainly from the internal and external perspectives. The internal perspective refers to the inherent characteristics of the content, while the external perspective refers to features and characteristics associated with the content of the subject and the learning environment.
Artistry: The artistry of APPs concerns the feelings that the APPs bring to users during utilization, which mainly include the interface design of the APPs and whether the visual and auditory effects give users a comfortable and pleasant experience. Therefore, its second-level indexes are designed from the perspectives of artistry and sense of experience. Artistry refers to the aesthetics and comfort of APP design. Experience mainly refers to the experience of using the APPs.
Practicality: Practicality refers to the benefits brought to users. These benefits refer to the degree of knowledge acquisition, followed by the convenience of the APPs to users. Its secondary indexes include whether the APPs are easy to operate, and whether the relevant advertisements and paid contents meet the needs and consumption of users.

AHP
After obtaining the evaluation index system of educational APPs, the weights of these indexes need to be calculated. The weight is the quantification of the importance of each index. It converts the importance of each evaluation index into a value and elaborates the specific contribution or importance of each index to the entire evaluation index system through a numerical form [13]. Only after the weight of each index is determined can the evaluation object be evaluated correctly and objectively. Therefore, the weight of the evaluation index needs to be calculated and determined by a scientific and reasonable method, and each index is given a weight value of corresponding importance. Here, the AHP method is utilized to determine the weight of the index; the weight values of the first-level indexes and the second-level indexes in the evaluation index system are obtained through the AHP software [14]. The architecture of index AHP method is shown in Figure 3 below.

Fig. 3. Architecture of the index AHP method (layers from top to bottom: target layer, criterion layer, index layer, application layer)
The target layer in the index AHP architecture is the target of this AHP; the criterion layer is the process of implementing the intermediate links of evaluation according to various measures; the index layer is utilized to select the research means, techniques, and methods to be applied; the application layer is the bottom layer of the AHP, which mainly contains various methods for evaluation and research using this index system [15].
The weight of each index in the index system is determined by the AHP method. The specific process includes the construction of AHP model, the construction of the judgment matrix, the single hierarchical arrangement and consistency test, and the total hierarchical arrangement and consistency test. However, in the actual application process, the final total hierarchical arrangement and consistency test step can be omitted [16].  [17]. The first-level indexes, education, scientificity, functionality, artistry, and practicality, are represented by A1, A2, A3, A4, and A5, respectively. The constructed judgment matrix model for the first-level indexes is shown in Table 1 below. Since the decision-makers have certain differences in the weighting ratio of each index, here, the 9-scale method is utilized to determine each calculation result [18].
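The judgment matrix $A = (a_{ij})_{n \times n}$ is expressed as follows:
$$A = (a_{ij})_{n \times n} = \begin{pmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nn} \end{pmatrix}$$
where $i$ and $j$ index the evaluation indexes and $n$ is the number of indexes, that is, the order of the judgment matrix. Besides, $a_{ij} > 0$, $a_{ji} = 1/a_{ij}$, and $a_{ii} = 1$; that is, the judgment matrix is a positive reciprocal matrix.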
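The reciprocal structure of the judgment matrix can be illustrated with a minimal Python sketch. It assumes that only the judgments above the diagonal are elicited on the 9-scale and that the rest of the matrix follows from $a_{ji} = 1/a_{ij}$ and $a_{ii} = 1$. The function name `build_judgment_matrix` and the numerical judgments are hypothetical and are not the values of Table 1.

```python
from fractions import Fraction


def build_judgment_matrix(n, upper):
    """Build an n x n positive reciprocal judgment matrix.

    `upper` maps (i, j) with i < j to the judgment a_ij on the 9-scale
    (1/9 .. 9). The diagonal is set to 1 and a_ji to 1 / a_ij.
    """
    expected_pairs = n * (n - 1) // 2
    if len(upper) != expected_pairs:
        raise ValueError(f"expected {expected_pairs} judgments, got {len(upper)}")
    matrix = [[Fraction(1) for _ in range(n)] for _ in range(n)]
    for (i, j), value in upper.items():
        if not 0 <= i < j < n:
            raise ValueError(f"invalid index pair {(i, j)}")
        a_ij = Fraction(value)
        if not Fraction(1, 9) <= a_ij <= 9:
            raise ValueError(f"judgment {value} is outside the 9-scale")
        matrix[i][j] = a_ij
        matrix[j][i] = 1 / a_ij  # reciprocal entry below the diagonal
    return matrix


# Hypothetical judgments for A1..A5 (indices 0..4); NOT the paper's Table 1.
example_upper = {
    (0, 1): 2, (0, 2): 3, (0, 3): 5, (0, 4): 4,
    (1, 2): 2, (1, 3): 4, (1, 4): 3,
    (2, 3): 3, (2, 4): 2,
    (3, 4): Fraction(1, 2),
}
A = build_judgment_matrix(5, example_upper)
for row in A:
    print([str(x) for x in row])
```

Exact fractions are used here only so that the reciprocal property can be checked without rounding error; floating-point values would serve equally well for the later weight calculation.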
Single hierarchical arrangement: The weight of each first-level index and the largest characteristic root of the judgment matrix are calculated [19]. First, each column of the judgment matrix $A$ is normalized:
$$\bar{a}_{ij} = \frac{a_{ij}}{\sum_{k=1}^{n} a_{kj}}, \quad i, j = 1, 2, \ldots, n$$
Next, the column-normalized matrix is summed by rows to give the vector $\bar{W} = (\bar{w}_1, \bar{w}_2, \ldots, \bar{w}_n)^T$:
$$\bar{w}_i = \sum_{j=1}^{n} \bar{a}_{ij}, \quad i = 1, 2, \ldots, n$$
This vector is then normalized to obtain the feature vector $W = (w_1, w_2, \ldots, w_n)^T$:
$$w_i = \frac{\bar{w}_i}{\sum_{j=1}^{n} \bar{w}_j}, \quad i = 1, 2, \ldots, n$$
The feature vector $W$ is also the weight vector of the indexes, from which the weight value of each index is obtained.
Finally, the maximum characteristic root $\lambda_{\max}$ of the judgment matrix is calculated:
$$\lambda_{\max} = \frac{1}{n} \sum_{i=1}^{n} \frac{(AW)_i}{w_i}$$
where $(AW)_i$ represents the $i$-th component of the vector $AW$. If the judgment matrix has complete consistency, the maximum characteristic root is equal to $n$.
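The single hierarchical arrangement can be sketched in a few lines of Python, assuming the common sum-product form of AHP (column normalization, row sums, normalization of the resulting vector), followed by the consistency index and consistency ratio used in the consistency test. The function names and the example weights are hypothetical; the RI value of 1.12 for a matrix of order 5 is the one quoted in this paper.

```python
def ahp_weights(matrix):
    """Return (weights, lambda_max) for a positive reciprocal matrix."""
    n = len(matrix)
    # Step 1: normalize each column so that it sums to 1.
    col_sums = [sum(matrix[i][j] for i in range(n)) for j in range(n)]
    normalized = [[matrix[i][j] / col_sums[j] for j in range(n)] for i in range(n)]
    # Step 2: sum the normalized matrix by rows.
    row_sums = [sum(row) for row in normalized]
    # Step 3: normalize the row sums to obtain the weight vector W.
    total = sum(row_sums)
    weights = [r / total for r in row_sums]
    # Step 4: estimate lambda_max as the mean of (AW)_i / w_i.
    aw = [sum(matrix[i][j] * weights[j] for j in range(n)) for i in range(n)]
    lambda_max = sum(aw[i] / weights[i] for i in range(n)) / n
    return weights, lambda_max


def consistency_ratio(lambda_max, n, ri):
    """Return (CI, CR) with CI = (lambda_max - n) / (n - 1) and CR = CI / RI."""
    ci = (lambda_max - n) / (n - 1)
    return ci, ci / ri


# Hypothetical, perfectly consistent example: a_ij = w_i / w_j for known weights.
known = [0.35, 0.28, 0.17, 0.08, 0.12]  # illustrative only, not Table 2
consistent = [[wi / wj for wj in known] for wi in known]

weights, lambda_max = ahp_weights(consistent)
ci, cr = consistency_ratio(lambda_max, n=5, ri=1.12)
print([round(w, 4) for w in weights], round(lambda_max, 6), round(cr, 6))
```

For a perfectly consistent matrix such as this one, the procedure recovers the weights it was built from, the estimate of the maximum characteristic root equals the order of the matrix, and both CI and CR are zero up to rounding. Judgments elicited from experts are rarely this consistent, which is why the ratio is checked.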
Consistency test: During single hierarchical arrangement, the judgments should be consistent with one another. Therefore, it is necessary to perform a consistency test on the judgment matrix. The consistency index (CI) of the judgment matrix $A$ is calculated as follows:
$$CI = \frac{\lambda_{\max} - n}{n - 1}$$
If the value of CI is 0, the judgment matrix has complete consistency; the larger the value of CI is, the worse the consistency of the judgment matrix is. Whether the matrix passes the consistency test is determined by the ratio of CI to the random consistency index (RI). The consistency ratio (CR) of the judgment matrix is calculated as follows:
$$CR = \frac{CI}{RI} \qquad (8)$$
Conventionally, a judgment matrix with CR below 0.1 is regarded as acceptably consistent. Here, the order of the judgment matrix is 5; the value of RI obtained from the average random consistency index table is therefore 1.12 [20].

Results and Analysis

Index weight determination results
The AHP method is utilized to obtain the weights of the first-level indexes and the second-level indexes. The calculation results of the weights of evaluation indexes for educational APPs are shown in Table 2, Table 3, and Figure 5 below. As shown in Table 2, Table 3, and Figure 5, among the first-level indexes, the educational and scientific indexes carry the largest weights, with a combined proportion exceeding 60%. Therefore, while designing educational APPs, it is necessary to value their educational and scientific nature. Among the second-level indexes, the educational objective, educational principle, and knowledge systematization account for the highest proportions; therefore, APP designers can put more emphasis on these three aspects to improve the value and selection probability of educational APPs.

Performance analysis of intelligent evaluation system
To verify the application effectiveness and evaluation reliability of the proposed intelligent evaluation system, several common educational APPs are selected to compare the evaluation results of the proposed intelligent evaluation system and the actual evaluation of the users (questionnaire survey). The selected educational APPs are represented by 1, 2, 3, 4, and 5, respectively, and the scores obtained are shown in Table 4 below.
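The text does not spell out how the index weights are turned into a single system score. A common choice, assumed here purely for illustration, is a weighted sum of per-index scores. In the sketch below, the function `weighted_score`, the first-level weights, and the per-index scores are all hypothetical; they are not the values of Table 2 or Table 4.

```python
def weighted_score(weights, scores):
    """Weighted sum of index scores; weights must sum to 1 over the same keys."""
    if set(weights) != set(scores):
        raise ValueError("weights and scores must cover the same indexes")
    if abs(sum(weights.values()) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    return sum(weights[name] * scores[name] for name in weights)


# Hypothetical first-level weights (education + scientificity > 60%, as reported).
HYPOTHETICAL_WEIGHTS = {
    "education": 0.35,
    "scientificity": 0.28,
    "functionality": 0.17,
    "artistry": 0.08,
    "practicality": 0.12,
}

# Hypothetical per-index scores for one APP on an assumed 1-4 scale.
example_scores = {
    "education": 4,
    "scientificity": 3,
    "functionality": 3,
    "artistry": 2,
    "practicality": 3,
}

print(round(weighted_score(HYPOTHETICAL_WEIGHTS, example_scores), 2))
```

With a two-level system, the same function could be applied to the second-level indexes, provided each second-level weight is first multiplied by the weight of its parent first-level index; that step is likewise an assumption rather than something stated in the text.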
iJET -Vol. 16
The scores and feedback from 20 users are obtained through a questionnaire survey. The scoring grades are divided into four categories: good, fair, qualified, and poor, and their corresponding scores are 4 points, 3 points, 2 points, and 1 point, respectively. The scores obtained are shown in Table 5 below. The comparative analysis of the scores of educational APPs by the two methods is shown in Figure 6 below. According to Tables 4 and 5 above, the ranking order of the included educational APPs is the same under the two evaluation methods, namely 1, 2, 4, 3, 5. This indicates that the two evaluation methods are consistent, that the results obtained by the intelligent evaluation system are reliable, and that adjustment of the index system is unnecessary. Figure 6 also shows that the scores of the two evaluation methods are close, indicating that the proposed intelligent evaluation system has high reliability and practicality.
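The comparison described above can be sketched as follows. The grade-to-point mapping is the one given in the text; the function names and all scores are hypothetical, chosen only so that both rankings reproduce the reported order 1, 2, 4, 3, 5. They are not the values of Table 4 or Table 5.

```python
GRADE_POINTS = {"good": 4, "fair": 3, "qualified": 2, "poor": 1}


def mean_user_score(grades):
    """Average questionnaire score for one APP from a list of grade labels."""
    return sum(GRADE_POINTS[g] for g in grades) / len(grades)


def ranking(scores):
    """APP identifiers ordered from the highest score to the lowest."""
    return sorted(scores, key=lambda app: scores[app], reverse=True)


# Hypothetical scores per APP (keys are the APP numbers 1..5).
system_scores = {1: 3.6, 2: 3.4, 3: 2.9, 4: 3.1, 5: 2.5}
user_scores = {1: 3.7, 2: 3.3, 3: 2.8, 4: 3.0, 5: 2.6}

same_order = ranking(system_scores) == ranking(user_scores)
largest_gap = max(abs(system_scores[a] - user_scores[a]) for a in system_scores)
print(ranking(system_scores), same_order, round(largest_gap, 2))

# Example: 20 hypothetical questionnaire answers for one APP.
print(mean_user_score(["good"] * 10 + ["fair"] * 10))
```

Comparing the two orderings checks only rank agreement; the largest per-APP gap gives a rough indication of how close the scores themselves are, which corresponds to the visual comparison in Figure 6.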

Conclusion
The evaluation methods of educational APPs under AI technology are explored. First, based on the principles of evaluation indexes, the evaluation indexes for educational APPs are established. Five first-level indexes and 20 second-level indexes are established according to the existing index establishment principles, and a framework for intelligent evaluation of educational APPs has been successfully constructed. Second, the weights are assigned to the established evaluation indexes of educational APPs with the aid of AHP method. The AHP software is utilized to calculate the weights of first-level and second-level indexes. The calculation results show that among the first-level indexes, the weight ratios of the educational and scientific indexes of the educational APPs are larger, whose proportion exceeds 60%; among the second-level indexes, the educational objective, educational principle, and knowledge systematization account for the highest proportion. Finally, the availability and effectiveness of the established evaluation system are investigated through empirical analysis. The empirical analysis has found that the ranking and evaluation results by the users and the proposed intelligent evaluation system are consistent; besides, the scores obtained have a high consistency, indicating that the proposed intelligent evaluation system is feasible and effective. During the actual development process of educational APPs, the proposed intelligent evaluation system can be used as the basis for the design of educational APPs to improve the values and selection probability of educational APPs.
This study has some limitations. When the weight of each index was calculated through the AHP method, personnel constraints meant that most of the judgments were drawn from previous research results rather than from actual scoring by relevant professionals. In addition, because of time limitations, the empirical comparison relied on the experience and scores of only 20 users and therefore lacks the support of large-scale data. In follow-up work, these deficiencies should be addressed by involving professionals in the scoring and by collecting more data to support the results. With the continuous development of science and technology, it is also hoped that more intelligent means can be used to evaluate such APPs, improving scientificity while reducing manual work.