AI different approaches and ANFIS data mining: A novel approach to predicting early employment readiness in Middle Eastern nations

Data mining is increasingly used to predict students' early employment readiness as the volume of data produced across industries grows. This study addresses the employability issue in Middle Eastern nations using an Adaptive Neuro-Fuzzy Inference System (ANFIS) data mining technique. The experimental investigation used data from tracer studies conducted by three Jordanian universities, comprising 22 attributes. Results showed that, although ANFIS achieved an accuracy of 94% on the graduate dataset, it exhibited high complexity due to the large number of attributes used. The study has implications for selecting relevant variables and investigating multiple aspects of employability. Data mining has various applications, including classification, clustering, regression, association rule development, and outlier analysis. As data production continues to expand, this study provides insights into the potential use of ANFIS in predicting the early employment readiness of students.


Introduction
To make sound decisions in the education system, a variety of methods and approaches are used to examine different datasets. Data mining (DM) is used not just in education but in every aspect of our lives. Identifying learners' behavior is one of the main goals of data mining in education (Dawson & Dawson, 2019; El Nokiti et al., 2022; Ravikumar et al., 2022; Shwedeh et al., 2023; Md Razak et al., 2013). The use of electronic tools in education has also increased significantly in recent years. Electronic technologies are widely used to support and enhance the quality of education, from kindergarten classes at the preschool level through postgraduate courses at universities. While computer networks are a key component of online training, face-to-face institutions are also making substantial use of network-connected devices such as laptops, iPads, and cellular phones, which has an impact on graduates' employability in all disciplines and majors (Ramanathan et al., 2015). Practitioners and university administrations might, however, benefit from the comprehensive set of techniques and tools used by their peers in the commercial and service industries, typically referred to as data analytics or data mining. The education industry generates vast amounts of data, which can be managed effectively by applying data mining and data analytics techniques. Educational data mining (EDM) and learning analytics (LA) are two such approaches that have attracted the attention of academics and researchers in the field of education (Bhaskaran et al., 2016; Kesavaraj & Sukumaran, 2013; Martínez-Cerdá et al., 2018; Peña-Ayala, 2014).
Given the above, employability can be seen as a two-sided equation, and many people need various forms of support to get past the psychological and physical barriers that stand in the way of their growth and learning. Moreover, employability is not just about having academic and vocational abilities; people should also have relevant and useful information about the workforce to help them make the best possible decisions about their available employment options. Additionally, they need help in understanding when such information can be crucial, interpreting that information, and turning it into intelligence (Schnell & Rodríguez, 2019; Abdallah et al., 2022; Salameh et al., 2022).

Statement of the problem
- The education sector produces enormous volumes of data each year, and academics and scientists use this data as an indicator of many academic outcomes, including students' failure or distinction in specific courses (Mishra et al., 2019).
- The aim of this research is to use a variety of data mining classification techniques, including a fusion of neural network and fuzzy set approaches, to extract latent insights from the gathered data. Although using this information to optimize both the learning system and participants' futures is a major challenge, the study aims to provide insights that can benefit parents and decision-makers in the educational field. By integrating both approaches, the study aims to improve the accuracy and efficiency of classification in education. The objective is to investigate the effectiveness of the combined approach for classification in educational settings and to identify opportunities for using this information to improve both individual and systemic outcomes. The ultimate goal is to inform policy and practice in the educational field, leading to improved outcomes for students and educators alike (Satyanarayana et al., 2014).
- This research combines the fuzzy approach and the neural network (NN) approach into what is known as the neuro-fuzzy method. Artificial neural networks mimic the human brain by connecting separate nodes into layers, with the nodes in the hidden layers producing intermediate outputs known as node values. By integrating a fuzzy set technique, this study aims to address issues of ambiguity, subjectivity, and profitability. The objective is to investigate the effectiveness of the neuro-fuzzy method for educational purposes and its potential to improve the accuracy and efficiency of classification in the field of education.

Research questions
These are the research questions for the current study:
1. Can the use of a neuro-fuzzy approach predict the future employability of graduates?
2. What are the appropriate methods for assessing and testing the effectiveness of the neuro-fuzzy approach in predicting future employability?
3. Which element has the greatest impact on the future employability of graduates, as determined by the neuro-fuzzy approach?

Methodologies
As the literature shows, numerous studies in various nations have investigated employability-related concerns. The majority of these studies were conducted in nations with significant unemployment rates, such as Vietnam (Tran, 2015).

Statistical study
In conducting research, it is important to gather relevant information and data that can help inform the study (Shwedeh et al., 2021; Shwedeh et al., 2022; Aburayya et al., 2023; Salloum et al., 2023). In this case, the researchers contacted the Jordanian Ministry of Digital Economy and Entrepreneurship (MDEE) and the Ministry of Labour (ML) to gather information on the computer science (CS) industry in Jordan, as this information is pertinent to predicting the future employability of CS graduates using a neuro-fuzzy approach.
By examining statistical data provided by these ministries, the researchers gained insights into the characteristics of the CS industry in Jordan that may affect the employment prospects of CS graduates. This data can then be used to inform the research and help identify factors that may be important in predicting future employability.
Overall, gathering data from relevant organizations is an important part of conducting research, as it helps ensure that the study is grounded in relevant information and can lead to more accurate and insightful findings. In all, 4180 graduates with IT specialties were awarded degrees in 2018. Male graduates made up 51% of the total, while female graduates made up 49%. By academic degree, 90% of graduates hold a bachelor's degree, 6% a master's degree, and 4% a diploma. Fig. 1 shows the proportion of CS graduates. In 2018, 1,512 graduates found employment in the IT sector, accounting for 40% of all graduates, with 64% men and 36% women. The employability rate for 2017 was the same (64%); however, fewer male graduates than female graduates found jobs (39% vs. 61% in 2018). The private sector employs most IT graduates (92%), and only 8% of all graduates work for the government. According to Fig. 2, 28% of all 2018 graduates found employment in the field of computer science, whereas 1% found employment in the information communication networks sector. These figures indicate that there is greater demand for male graduates than for female graduates in Jordan's IT industry. Furthermore, the analysis suggests that some specialties dominate the IT area. The university from which students graduated also affected their employability. According to prior data, the following characteristics may have an impact on graduates' employment in Jordan's IT market:

Data collection
The study used data from tracer studies conducted by the career guidance departments of three Jordanian universities, specifically their IT colleges. For this research on the employability of IT majors, information on 1095 IT graduates from three majors (CS, CIS, and software engineering) was collected and organized into Table 2. The data was obtained from Balqa Applied University, Philadelphia, and Alzaytoneh, with 560, 221, and 314 graduates participating in the study, respectively. Notably, Philadelphia University had no graduates in CIS. The data was collected between 2015 and 2019. Twenty-two attributes were chosen for this research, based on the various studies that have sought to identify the key components of employability. The variables of the dataset are divided into:
• Demographic characteristics: these include age, gender, province, social class, and the number of applications.
• Soft-skill characteristics: these include interpersonal skills, teamwork abilities, and talent.
• Technical/hard-skill characteristics: these include knowledge of programming, mathematics, and English, and a variety of technical certifications.
• Educational characteristics: these include the name of the university, the major, the degree, the high school grade, the GPA, the program duration, the method of study, and the expertise.

Data preprocessing
Because the experiment uses ANFIS, a method that combines fuzzy logic (FL) and neural network (NN) architecture, the dataset requires extra preparation to adapt it to ANFIS. This involves cleaning and converting the data as part of the data preprocessing phase, to make it suitable for the data mining algorithm.

Refining Data
Data cleaning is a fundamental process in data preprocessing that is essential to improve the quality of the data. Its primary goal is to detect and correct any errors, inconsistencies, or inaccuracies that may exist in the dataset. In practice, data can have numerous incorrect or incomplete entries due to a variety of sources such as human error, system malfunction, or data transmission issues.

Conversion of Data
Data conversion puts the data into formats suitable for data mining. It includes the following methods:
1. Normalization: This method rescales attribute values into a common range (typically -1.0 to 1.0 or 0.0 to 1.0) so that no attribute dominates the model merely because of its scale.
2. Attribute Construction: This method derives a new attribute from the existing set of attributes to improve the classification process. For example, the "social strata" attribute was derived from the "family income" attribute provided by the tracer study.
3. Discretization: This method replaces numerical attributes with intervals, degrees, or conceptual levels. For instance, the continuous GPA value was replaced with a grade range in this study. Additionally, continuous attributes were converted into nominal attributes to prepare the data for classification.
4. Generalization: This method maps attributes from lower levels of a concept hierarchy to higher levels. For example, in this study, attributes like "street name" were replaced with more general attributes like "city" to improve the analysis.
The source data was organized into Excel sheets so that it could be loaded into WEKA and MATLAB, the data mining tools used to develop the model.
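The normalization and discretization steps can be illustrated with a minimal Python sketch. It is not the study's preprocessing code: the function names, the GPA cut points, and the grade labels are hypothetical and chosen only for illustration, since the study does not report the grade ranges it used.

```python
from bisect import bisect_right


def min_max_normalize(values, low=0.0, high=1.0):
    """Linearly rescale a list of numbers into the range [low, high]."""
    v_min, v_max = min(values), max(values)
    if v_max == v_min:
        # A constant attribute carries no information; map it to the lower bound.
        return [low for _ in values]
    scale = (high - low) / (v_max - v_min)
    return [low + (v - v_min) * scale for v in values]


# Hypothetical cut points on a 4.0 GPA scale (assumption, not from the study).
GPA_CUTS = [2.0, 2.5, 3.0, 3.65]
GPA_LABELS = ["fail", "acceptable", "good", "very good", "excellent"]


def discretize_gpa(gpa):
    """Replace a continuous GPA with a nominal grade band."""
    return GPA_LABELS[bisect_right(GPA_CUTS, gpa)]


if __name__ == "__main__":
    print(min_max_normalize([2.0, 3.0, 4.0]))   # rescaled into [0, 1]
    print(discretize_gpa(3.2))                  # nominal grade band
```

Passing `low=-1.0, high=1.0` gives the alternative range mentioned above; the choice of range mainly matters for where the membership functions are initially placed.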

ANFIS Implementation
The combination of neural networks and fuzzy logic yields two main types of system: (1) the Neuro-Fuzzy System (NFS) and (2) the Fuzzy Neural Network (FNN). Among researchers, NFS is the most widely used technique. In this research, the Adaptive Neuro-Fuzzy Inference System (ANFIS) is used as a classification technique. The ANFIS algorithm, introduced by Jang (1993) and built on the Takagi-Sugeno fuzzy model, serves as the foundation for predicting student employment outcomes. ANFIS is a graphical network that represents Sugeno-type fuzzy systems with neural learning capabilities. The network is composed of five layers of nodes, in which ANFIS realizes IF-THEN rules as a network. ANFIS addresses parameter estimation by using linguistic data and a hybrid learning rule that combines the least-squares method and the back-propagation learning algorithm. ANFIS works with multiple input variables, such as A1 and A2, and one output variable, O, where the output is always crisp and the input variables are represented by linguistic values such as high, mid, and low.
A1, A2 → ANFIS → O

Fig. 4. ANFIS concept demonstration
Fig. 4 illustrates how Gaussian, triangular, or trapezoidal membership functions are used to express the input signals. ANFIS uses Takagi and Sugeno's method, which differs from the Mamdani approach in that the output O is not represented by a membership function. Instead, ANFIS expresses O as a function of the input variables A1 and A2, which can be linear or nonlinear. A linear function of the inputs is used most frequently, in which case the output of rule i is $O_i = p_i A_1 + q_i A_2 + r_i$ (Equation 4.2).
In this context, $A_1$ and $A_2$ are the input parameters, while $p_i$, $q_i$, and $r_i$ are the coefficients of the equation. These coefficients are adjusted after each ANFIS iteration using techniques such as the least-squares error method with backpropagation, or nature-inspired approaches such as the Genetic Algorithm, to obtain the optimal values. The index i identifies the rule and ranges over the rules produced by combining the linguistic values of the input variables.
The five layers of the ANFIS algorithm are depicted in Figure (7). The training dataset is used to assess the rules. The layers are:
• In layer 1, each input is evaluated against each of its linguistic values, and the resulting membership degree is the output.
• In layer 2, the membership degrees belonging to a specific rule are taken as input, and a t-norm (such as product or min) is applied to them to produce the rule's firing strength w.
• In layer 3, the inputs are all the w values from the previous layer, and the output is the normalized w value for a particular rule.
• Layer 4 takes the input parameters (A1 and A2) and the normalized firing strengths as inputs, and generates outputs of the form $\bar{w}_1 O_1, \bar{w}_2 O_2, \ldots, \bar{w}_n O_n$, with $O_i = p_i A_1 + q_i A_2 + r_i$.
• The output of layer 5 is the sum of the values obtained from the preceding layer.
Although there are more than 20 design variables in this research project, the ANFIS architecture is illustrated here for only two of them to keep things simple.
Building the model on the entire dataset with all of its attributes took a very long time. Classification was therefore approached incrementally: ANFIS was given three input variables, trained on a training dataset, and tested for accuracy on a testing dataset. Further input variables were then added, and a collection of classifiers was developed using different sets of inputs. The computation time and reliability of each classifier were recorded and compared against reference classification algorithms. The ultimate objective was to construct a classifier with the best possible set of input attributes while also taking into account the computation time ANFIS needs to build it.
The network in Fig. 5 has two inputs, x and y, each with two linguistic values: A1 and A2 for x, and B1 and B2 for y.
Layer 1: Every node computes the membership degree of its input in one linguistic value:

$$O_{1,i} = \mu_{A_i}(x), \quad i = 1, 2; \qquad O_{1,i} = \mu_{B_{i-2}}(y), \quad i = 3, 4$$

Layer 2: Using any t-norm, such as the min or product operator, every node determines the firing strength of one rule. The product operation was used in this research:

$$O_{2,i} = w_i = \mu_{A_i}(x)\,\mu_{B_i}(y), \quad i = 1, 2$$

The remaining firing strengths $w_2, w_3, \ldots, w_n$ are computed in the same way.
Layer 3: The nodes compute the normalized firing strength of each rule:

$$\bar{w}_1 = \frac{w_1}{w_1 + w_2}$$

Likewise, $\bar{w}_2$ is computed, so the output $O_{3,1}$ is the normalized firing strength $\bar{w}_1$:

$$O_{3,1} = \bar{w}_1$$

Layer 4: The output of each node is the normalized firing strength of the rule multiplied by the rule's consequent function:

$$O_{4,i} = \bar{w}_i f_i, \qquad f_i = p_i x + q_i y + r_i$$

Layer 5: A single node gathers the results of all the nodes in Layer 4, and the resulting output is:

$$O_{5,1} = \sum_i \bar{w}_i f_i$$
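The layer-by-layer computation can be traced with a minimal Python sketch of a single forward pass for two inputs and two rules. It is an illustration under stated assumptions, not the MATLAB model used in the study: Gaussian membership functions are assumed (the text also allows triangular or trapezoidal ones), the product t-norm is used as in the study, and the function names and parameter layout are hypothetical.

```python
import math


def gaussian_mf(value, center, sigma):
    """Gaussian membership degree of `value` (assumed membership shape)."""
    return math.exp(-0.5 * ((value - center) / sigma) ** 2)


def anfis_forward(x, y, premise, consequent):
    """One forward pass through a first-order Sugeno ANFIS.

    premise:    one ((center_A, sigma_A), (center_B, sigma_B)) pair per rule
    consequent: one (p, q, r) triple per rule, with f = p*x + q*y + r
    Returns the normalized firing strengths and the crisp output.
    """
    # Layer 1: membership degrees of x and y for each rule's linguistic values.
    mu = [(gaussian_mf(x, ca, sa), gaussian_mf(y, cb, sb))
          for (ca, sa), (cb, sb) in premise]
    # Layer 2: firing strength of each rule (product t-norm).
    w = [mu_a * mu_b for mu_a, mu_b in mu]
    # Layer 3: normalized firing strengths.
    total = sum(w)
    if total == 0.0:
        raise ValueError("no rule fired for this input")
    w_bar = [w_i / total for w_i in w]
    # Layer 4: each normalized strength times its linear consequent.
    layer4 = [wb * (p * x + q * y + r)
              for wb, (p, q, r) in zip(w_bar, consequent)]
    # Layer 5: a single node sums the Layer 4 outputs.
    return w_bar, sum(layer4)


if __name__ == "__main__":
    # Hypothetical parameters: rule 1 is "x low and y low", rule 2 is "x high and y high".
    premise = [((0.0, 1.0), (0.0, 1.0)), ((1.0, 1.0), (1.0, 1.0))]
    consequent = [(1.0, 1.0, 0.0), (0.0, 0.0, 3.0)]
    print(anfis_forward(0.5, 0.5, premise, consequent))
```

Training adjusts the premise parameters (here the centers and sigmas) and the consequent coefficients p, q, and r; the sketch only evaluates a network whose parameters are already given.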
The coefficients p, q, and r of the consequent functions, together with the membership functions employed, are the major determinants of how effectively ANFIS performs.
To tune or train the ANFIS algorithm, the coefficients of the function can be optimized using any optimization method, such as an evolutionary algorithm or a backpropagation algorithm. For tuning in this research study, we apply the Bb algorithm.

Phase of Testing
This section explains the evaluation procedure, which includes the training and testing phases using the dataset obtained from tracer studies. The goal of this study is to develop a graduate employment model that can predict the employment status (employed, unemployed, or other) of graduates based on the collected dataset. The classification task is divided into two phases. The training phase was explained earlier, while the testing phase involves selecting the testing dataset and computing the predicted accuracy. Although there are various testing techniques, the four most frequently used in WEKA and MATLAB are:
• Using the entire training set as the testing dataset, or randomly drawing a testing dataset from the training dataset. This method is not commonly used because of the risk of overfitting and inaccurate results.
• Using a separate test set that has already been provided. Here the training and testing datasets are identified independently, which gives more accurate and meaningful results about the classifier's efficiency.
• Cross-validation, which divides the dataset into K equal-sized partitions for K-fold cross-validation. One partition is used for testing and the remaining K-1 partitions are used to train the model. This process is repeated K times, with each iteration using a different partition for testing and the rest for training and validation. The outputs of the K testing runs are combined to compute the final estimate. 10-fold cross-validation is the most widespread choice and is generally regarded as giving reliable error estimates.
• Percentage split, in which the dataset is split into two parts for training and testing. The percentage of data used for each can vary depending on the dataset and application, with some researchers using 70% for training and 30% for testing, and others using 50% and 50%.
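The K-fold procedure can be sketched in a few lines of Python. This is a minimal illustration, not the WEKA or MATLAB routine used in the study: the function names, the shuffling seed, and the `evaluate` callback are hypothetical, and the folds differ in size by at most one instance when the dataset size is not a multiple of K.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train_indices, test_indices) for each of the k folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices into k folds of nearly equal size.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


def cross_validate(n_samples, evaluate, k=10, seed=0):
    """Average the score returned by evaluate(train, test) over the k folds."""
    scores = [evaluate(train, test)
              for train, test in k_fold_indices(n_samples, k, seed)]
    return sum(scores) / len(scores)


if __name__ == "__main__":
    # 1095 is the number of graduates in the dataset described earlier.
    for train, test in k_fold_indices(1095, k=10):
        print(len(train), len(test))
```

In practice `evaluate` would train a classifier on the `train` indices and return its accuracy on the `test` indices, so that every instance is used for testing exactly once.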
In this experiment, the model developed with ANFIS was assessed using 10-fold cross-validation. The neural network architecture was trained 50 times with the ANFIS algorithm. During training, 9/10 of the dataset instances were used along with a varying number of input attributes, starting with 3 attributes and 0.9 of the examples.
The remaining 1/10 of the whole dataset was used for testing, and the accuracy of the resulting classifier was compared with that of other data mining classifiers. The computational efficiency of ANFIS was also compared with that of the other methods for each generated classifier. The assessment procedure was repeated 10 times, once per fold of the ten-fold cross-validation.
The number of attributes used in ANFIS was then increased step by step, and the testing process was carried out for each classifier developed.
Additionally, a classification model was constructed using attributes chosen by specialists and by established methods. The data analysis part of this study examined attribute selection measures, including but not limited to Information Gain (IG) and the Gini index.
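As a minimal sketch of how the two measures named above rank attributes, the following Python code computes entropy-based Information Gain and the Gini index for nominal attributes. The function names and the tiny example data are hypothetical; the study's own rankings were produced with its data mining tools, not with this code.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def information_gain(values, labels):
    """Reduction in class entropy obtained by splitting on a nominal attribute."""
    n = len(labels)
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(value, []).append(label)
    remainder = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


def rank_attributes(columns, labels):
    """Return attribute names ordered from highest to lowest information gain."""
    return sorted(columns,
                  key=lambda name: information_gain(columns[name], labels),
                  reverse=True)


if __name__ == "__main__":
    # Hypothetical toy data: four graduates, two candidate attributes.
    labels = ["employed", "employed", "not-employed", "not-employed"]
    columns = {
        "gender": ["m", "f", "m", "f"],
        "gpa_band": ["good", "good", "fail", "fail"],
    }
    print(rank_attributes(columns, labels))
```

An incremental experiment of the kind described here would take the first three attributes from such a ranking, then the first four, and so on.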

Results and discussion
The study's classification work is composed of two main phases: training and testing. The training phase was described earlier. The testing phase is carried out by selecting the testing dataset and estimating the predicted accuracy. Although various testing techniques are available, the most frequently used ones in WEKA and MATLAB are:
• Training set: The entire training dataset can be used as the testing dataset, or a random subset can be selected. This method is not commonly used, as it often leads to overfitting and unreliable accuracy scores.
• Provided test set: This technique identifies the training and testing datasets independently. Since the testing data is different from the training data, this method usually gives a more accurate assessment of the classifier's efficiency.
• Cross-validation: This method divides the dataset into K equal-sized partitions for K-fold cross-validation. In each round, one of the partitions is used for testing and the remaining K-1 for training. The output from each fold is gathered to produce the final estimate. 10-fold cross-validation is widely used to obtain reliable error estimates with various classification algorithms.
• Percentage split: The dataset is split into two parts, one used for training and the other for testing. The split varies depending on the dataset and purpose, with some using 70% for training and 30% for testing, while others use a 50-50 split.
The accuracy of the model developed with the neuro-fuzzy inference method (ANFIS) was evaluated using 10-fold cross-validation in this investigation. The neural network architecture was trained 50 times with the ANFIS algorithm to create an accurate model. To assess the effectiveness of the classifiers, a group of classifiers was trained on the same dataset and compared with our classifier. We used 10-fold cross-validation to train and test all classifiers and evaluated their performance per class label. Various metrics, including recall, precision, TP-rate, FP-rate, the confusion matrix, RMSE, and the kappa measure, were used to assess performance. Additionally, we used an incremental attribute strategy to evaluate the computational expense and efficiency of our classification model. We conducted multiple trials with an increasing number of attributes and divided the performance testing into phases.

Applying Three Attributes
As discussed earlier, we reduced the number of investigated variables because their abundance would have a negative impact on the classifier's effectiveness and computing cost. In this stage, we chose only three attributes after using an attribute selection technique, such as the J48 decision tree, to order the attributes by importance. These attributes were used in our classifier as well as in several other classifiers, including decision trees, support vector machines (SVM), naive Bayes, and the multilayer perceptron (MLP). We first created a confusion matrix for each of the three-attribute classifiers before analyzing them. The comparison is displayed in the tables below, where each classifier's confusion matrix is shown.

Table 3
The confusion matrix of the ANFIS classifier consisting of three attributes
The accuracy of the ANFIS classifier is 69%, obtained by dividing 485 (the number of correct classifications) by 700 (the total number of classifications). The accuracy of the Decision Tree classifier is 67%, obtained by dividing 475 (the number of correct classifications) by 700. The accuracy of the SVM classifier is 65%, obtained by dividing 457 (the number of correct classifications) by 700.

Table 6
The confusion matrix for the Naive Bayes classifier consisting of three attributes
The accuracy of the Naive Bayes classifier is 61%, obtained by dividing 432 (the number of correct classifications) by 700 (the total number of classifications).

Table 7
The confusion matrix for the MLP classifier consisting of three attributes
The accuracy of the MLP classifier is 68%, obtained by dividing 480 (the number of correct classifications) by 700 (the total number of classifications).
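The accuracy figures for the three-attribute classifiers can be checked with a short Python sketch that uses only the counts stated above. The helper name is hypothetical. Note that the reported percentages are reproduced when the ratio is truncated to a whole percent rather than rounded (for example, 475/700 is about 67.9%, reported as 67%).

```python
TOTAL = 700  # total number of classified instances, as reported

# Correct classifications per three-attribute classifier, as reported.
CORRECT = {
    "ANFIS": 485,
    "Decision tree": 475,
    "SVM": 457,
    "Naive Bayes": 432,
    "MLP": 480,
}


def truncated_percent(correct, total):
    """Accuracy as a whole percent, truncated (not rounded)."""
    return (100 * correct) // total


# Classifiers ordered from most to least accurate.
ranking = sorted(CORRECT, key=CORRECT.get, reverse=True)

if __name__ == "__main__":
    for name in ranking:
        print(name, truncated_percent(CORRECT[name], TOTAL))
```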
The performance of five classifiers (ANFIS, MLP, SVM, decision tree, and Naive Bayes) was compared. The classifiers were evaluated using confusion matrices, and the accuracy rates were recorded. The results indicate that the ANFIS classifier has the highest accuracy at 69%, followed by MLP with 68%, decision tree with 67%, SVM with 65%, and Naive Bayes with 61%. This suggests that the classifiers based on neural network methods are generally superior to the other classifiers, and that combining a neural network with a fuzzy approach can enhance the effectiveness of the model. The per-class F-measure values for all classifiers are shown in Fig. 7; they are higher for the employed group than for the not-employed group, which suggests that all classifiers predict employment better than unemployment. As seen in Fig. 8, the employed class's FP-rate values for each classifier are lower than those for the unemployed class, indicating a less-than-stellar prediction ratio for the employed class when using the three attributes. To conclude the examination of the classifiers, we assessed each classifier's efficiency using the RMSE and the kappa statistic. The RMSE is a well-known error measure that should be as low as possible. The kappa statistic measures how far the agreement between predicted and actual classes exceeds the agreement expected by chance, and is used here to judge the reliability of the classification. Table 10 shows that the construction time for the classifiers is reasonable, since only three attributes were used. According to the table, ANFIS had the shortest execution time at 0.93 s, followed by the decision tree at 0.95 s, SVM at 0.96 s, and MLP at 0.97 s. Naive Bayes was the slowest at 0.99 s.
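The per-class metrics discussed above (recall or TP-rate, FP-rate, precision, F-measure, and kappa) can all be derived from a two-class confusion matrix. The Python sketch below shows the standard definitions. The function name and the example counts are hypothetical and are not taken from the study's confusion matrices.

```python
def binary_metrics(tp, fn, fp, tn):
    """Standard metrics for the positive class of a 2x2 confusion matrix.

    tp: positives predicted positive    fn: positives predicted negative
    fp: negatives predicted positive    tn: negatives predicted negative
    """
    total = tp + fn + fp + tn
    accuracy = (tp + tn) / total
    recall = tp / (tp + fn)          # also called the TP-rate
    fp_rate = fp / (fp + tn)
    precision = tp / (tp + fp)
    f_measure = 2 * precision * recall / (precision + recall)
    # Cohen's kappa: observed agreement corrected for chance agreement.
    p_pos = ((tp + fn) / total) * ((tp + fp) / total)
    p_neg = ((fp + tn) / total) * ((fn + tn) / total)
    expected = p_pos + p_neg
    kappa = (accuracy - expected) / (1 - expected)
    return {
        "accuracy": accuracy,
        "recall": recall,
        "fp_rate": fp_rate,
        "precision": precision,
        "f_measure": f_measure,
        "kappa": kappa,
    }


if __name__ == "__main__":
    # Hypothetical counts with "employed" as the positive class.
    for name, value in binary_metrics(tp=40, fn=10, fp=20, tn=30).items():
        print(f"{name}: {value:.3f}")
```

A low FP-rate for a class means that few instances of the other class are mistaken for it, while a kappa near zero means the classifier agrees with the true labels little more often than chance would.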

Using Four Attributes
In this phase, the attribute count was increased to four, using an information gain technique to identify the most significant attributes. After implementing the classification models with the four chosen attributes, a confusion matrix was created for each classifier. Table 11 through Table 15 display the confusion matrices for each classifier. The accuracy of the ANFIS classifier is 71%, with 502 out of 700 instances classified correctly.

Table 12
The confusion matrix for the Decision Tree four-attribute classifier
The accuracy of the Decision Tree classifier is 69%, as shown in its confusion matrix. The accuracy of the SVM classifier with four attributes was 65%, as shown in its confusion matrix. The accuracy of the Naive Bayes classifier is 63%, based on its confusion matrix for the four attributes.

Table 15
The confusion matrix for the MLP classifier with four attributes
The accuracy of the MLP classifier with four attributes is 70%, with 493 out of 700 instances classified correctly.
The confusion matrices above show that the ANFIS classifier continues to exhibit the highest accuracy, reaching 71% when the number of applied attributes was increased. The MLP classifier came second with an accuracy of 70%, followed by the decision tree in third place with 69%. The SVM classifier's accuracy remained the same as with three attributes, placing it fourth at 65%. Once again, Naive Bayes had the lowest accuracy, at 63%. These findings suggest that classifiers based on neural network techniques are superior, and that increasing the number of attributes improved accuracy for every classifier except SVM. The efficiency comparison of the classifiers is presented in Fig. 14. The ANFIS classifier is highly effective in predicting the "Employed" class and performs reasonably well in predicting the "Not-employed" class. According to the results, the ANFIS classifier obtained the highest values for Recall and F-Measure while also recording the highest value for FP-Rate. These findings further support the effectiveness of the ANFIS classifier, especially when using an increased number of attributes.
Fig. 11 shows the F-Measure values of the "Employed" and "Not-employed" classes when four attributes are applied. The F-Measure values for the "Employed" class are almost equal to those of the "Not-employed" class for all classifiers, with a slight advantage for the "Employed" class. These results suggest that all classifiers improved their predictions when using four attributes, giving a more balanced and consistent performance across both classes. Fig. 12 shows that the FP-Rate values for the "Employed" class are consistently lower than those for the "Not-employed" class, regardless of the classifier used. This indicates that using four attributes improves the classifiers' ability to predict the "Employed" class, resulting in a more favorable prediction ratio for this class than for the "Not-employed" class.
Table 18 reports the execution time taken by each classifier when building classification models with four attributes. ANFIS and the decision tree classifier are tied as the fastest, each with an execution time of 0.95 seconds. The SVM classifier comes next at 0.96 seconds, followed by MLP at 0.98 seconds. Naive Bayes has the highest execution time at 1.03 seconds. The execution time of several classifiers increased when four attributes were applied, which is to be expected, since more attributes usually mean more computation. The final outcomes of building classifiers with four attributes reinforce the earlier finding: in general, the neural network approach was superior in the classification models, with the ANFIS classifier performing better than the other models. The results also indicate that an increase in the number of attributes is associated with an increase in accuracy. Additionally, the execution time increased moderately when applying four attributes rather than three, which is a reasonable consequence of the larger number of attributes.
The researchers continued with the incremental approach to developing classification models, gradually increasing the number of attributes used. The experiment was divided into five phases, starting with three attributes and adding one attribute in each subsequent phase until reaching seven attributes in the final phase. Table 19 shows that the best classifier was obtained with seven attributes, a significant improvement. However, building the ANFIS classifier required a considerable amount of time. By applying the information gain technique to the 22 attributes and gradually applying ANFIS to build the classifier, it was found that the maximum number of attributes that can be used to build the ANFIS classifier is seven. Initially, all 22 attributes were used, but this led to a complexity problem in which the computation did not terminate, effectively entering an infinite loop, because of the large number of attributes.
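The complexity problem can be made concrete with a short Python sketch of how the rule base grows with the number of inputs. It rests on assumptions the text does not confirm: that the rules are generated by grid partitioning, that every input has three membership functions (matching the linguistic values high, mid, and low mentioned earlier), and that each rule has a first-order Sugeno consequent. The function names are hypothetical.

```python
def grid_rule_count(n_inputs, mfs_per_input=3):
    """Number of rules under grid partitioning: one per combination of linguistic values."""
    return mfs_per_input ** n_inputs


def consequent_parameter_count(n_inputs, mfs_per_input=3):
    """Each first-order Sugeno rule has one coefficient per input plus a constant."""
    return grid_rule_count(n_inputs, mfs_per_input) * (n_inputs + 1)


if __name__ == "__main__":
    for n in (3, 4, 5, 6, 7, 22):
        print(n, grid_rule_count(n), consequent_parameter_count(n))
```

Under these assumptions, three inputs give 27 rules and seven inputs give 2,187, whereas 22 inputs would give 3^22 = 31,381,059,609 rules, far more than the dataset could support or a typical machine could train. This is consistent with the observation that training on all 22 attributes never finished.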

Conclusion
The purpose of our study was to evaluate the effectiveness of the ANFIS classifier in predicting employment outcomes. To achieve this, we conducted a series of experiments to compare the performance of the ANFIS classifier against other commonly used classifiers such as decision tree, SVM, Naïve Bayes, and MLP. In order to gradually increase the complexity of our models, we adopted an incremental approach by increasing the number of attributes used in each experiment (Shwedeh et al., 2020; Ravikumar et al., 2023; Shwedeh et al., 2022). Our initial experiment involved using only three attributes, and we gradually increased the number of attributes in each subsequent experiment until we reached a maximum of seven attributes in our final experiment. We evaluated the performance of our models using various metrics such as Kappa, RMSE, and accuracy, while also assessing the efficiency of each model by measuring the time it took to build the classifiers.
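For reference, the three performance metrics and the build-time measurement can be sketched as follows. This is an illustrative sketch under stated assumptions, not the evaluation code used in the study: the confusion-matrix counts and predicted scores are hypothetical, and RMSE is assumed here to be computed between 0/1 class targets and predicted scores.

```python
import math
import time

def accuracy_and_kappa(matrix):
    """matrix[i][j] = number of instances of actual class i predicted as class j."""
    total = sum(sum(row) for row in matrix)
    size = len(matrix)
    observed = sum(matrix[i][i] for i in range(size)) / total
    # Agreement expected by chance, from row (actual) and column (predicted) totals.
    expected = sum(
        sum(matrix[i]) * sum(row[i] for row in matrix) for i in range(size)
    ) / (total * total)
    kappa = (observed - expected) / (1 - expected)
    return observed, kappa

def rmse(targets, predictions):
    """Root mean squared error between targets and predicted scores."""
    squared = [(t - p) ** 2 for t, p in zip(targets, predictions)]
    return math.sqrt(sum(squared) / len(squared))

def timed(build, *args):
    """Run a model-building function and return (model, elapsed seconds)."""
    start = time.perf_counter()
    model = build(*args)
    return model, time.perf_counter() - start

# Hypothetical counts: rows are actual Employed / Not-employed.
acc, kappa = accuracy_and_kappa([[40, 10], [5, 45]])
error = rmse([1, 0, 1, 0], [0.9, 0.2, 0.6, 0.1])
```

Kappa corrects the observed accuracy for the agreement expected by chance, which is why it is reported alongside accuracy when the classes are unevenly distributed.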
The results indicate that incrementally increasing the number of attributes used to build the classifiers has a positive impact on the accuracy and Kappa statistic values. Conversely, an inverse relationship was observed between the number of attributes and the RMSE values: as more attributes were added, the prediction error decreased. Therefore, it can be concluded that incorporating more attributes in the classification models can potentially enhance their performance. However, it is noteworthy that the ANFIS classifier encountered a sudden increase in execution time when more than five attributes were applied. This behavior implies that the ANFIS classifier may encounter a complexity problem when dealing with a larger number of attributes, and this aspect should be carefully considered when selecting the appropriate classifier for the problem at hand.
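One plausible explanation for this complexity problem is the growth of the rule base. The sketch below assumes a grid-partitioned, first-order Sugeno ANFIS with Gaussian membership functions and two membership functions per input; the study does not state its partitioning scheme or membership-function count, so these settings and the helper `anfis_size` are illustrative assumptions only.

```python
def anfis_size(n_inputs, mfs_per_input=2):
    """Rule and parameter counts for a grid-partitioned first-order Sugeno ANFIS."""
    # One rule per combination of membership functions across all inputs.
    rules = mfs_per_input ** n_inputs
    # Each Gaussian membership function has two premise parameters (center, width).
    premise = 2 * mfs_per_input * n_inputs
    # Each rule has a linear consequent with one coefficient per input plus a bias.
    consequent = rules * (n_inputs + 1)
    return rules, premise + consequent

for n in (3, 4, 5, 7, 22):
    rules, parameters = anfis_size(n)
    print(f"{n:>2} attributes: {rules} rules, {parameters} parameters")
```

Under these assumptions the rule count doubles with every added attribute, so seven attributes give 128 rules while 22 attributes would require more than four million, which is consistent with training becoming impractical on the full attribute set.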

Fig. 1. The proportion of CS graduates in 2018 by specialization, according to bachelor's degrees
With 31% of all alumni in the IT area in 2018, the computer science specialization had the highest number of graduates, while the information network systems specialization had the lowest. With 2869 alumni in the field of information technology in 2017, computer science once again had the largest percentage (29%).

Fig. 2. During the year 2018, employment rates by IT specialization, and university. Those variables, along with other chosen attributes, are part of the training sample data used in this study.

Fig. 3. 2018 employment figures for graduates categorized by their areas of specialization

Fig. 5. Architecture of ANFIS
The ANFIS layers are now described in more detail, together with their customization for the present study.
Layer 1: Fuzzification is applied at this layer, where each node generates the membership grade of a linguistic label. The input to node $i$ is the value $x$, and its output $O_{1,i}$ is the membership grade of $x$ in the linguistic label $A_i$:
$$O_{1,i} = \mu_{A_i}(x)$$
Any parametric membership function (such as triangular, sigmoid, or trapezoidal) may be used for the linguistic label $A_i$. We use the Gaussian function in this research study to improve the accuracy of the findings. The membership function $\mu_{A_i}(x)$:
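As an illustration of Layer 1, the following minimal sketch evaluates a Gaussian membership function in its common center/width form, exp(-(x - c)^2 / (2 sigma^2)). The exact parameterization used in the study is not shown in this section, so this form, the label names, and the parameter values are assumptions.

```python
import math

def gaussian_mf(x, center, sigma):
    """Gaussian membership grade of x for a label with the given center and width."""
    return math.exp(-((x - center) ** 2) / (2 * sigma ** 2))

def fuzzify(x, labels):
    """Layer 1 outputs: the membership grade of x for every linguistic label."""
    return {name: gaussian_mf(x, c, s) for name, (c, s) in labels.items()}

# Hypothetical labels for one input scaled to [0, 1]: (center, sigma) pairs.
labels = {"low": (0.0, 0.25), "high": (1.0, 0.25)}
grades = fuzzify(0.0, labels)
```

An input at the center of a label receives grade 1 for that label, and the grade decays smoothly toward 0 as the input moves away; in ANFIS the centers and widths are the premise parameters tuned during training.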

Fig. 6. A comparison of the efficiency of classifiers using three attributes is presented
Fig. 9 provides a visual comparison of classifier performance based on these metrics. It is evident from the results that ANFIS outperforms the other classifiers in terms of accuracy and efficiency, as demonstrated by its low RMSE and high Kappa statistic values. Three attributes' results are represented in Table

Fig. 9. An analysis of efficiency comparison among classifiers based on RMSE and Kappa statistic values, using three attributes is presented

Fig. 11. The results of F-measure values for the "Employed" and "Not-employed"

Fig. 13. The efficiency of classifiers based on their RMSE and Kappa statistics
Fig. 13 compares the efficiency of classifiers based on their RMSE and Kappa statistic values when using four attributes.

Table 1 List of dataset attributes
Attribute | Values | Description
 | {weak, good, v. good, excel- | The proficiency level of the graduate in the English language.
Team work _ skills | {bad, low, mid, good, v. good} | Team work skills of graduate
Status | {employed, unemployed, | The work status of the graduate
Experience | Number years |
Talent | {Painting, game…} | If the graduate possesses any particular abilities, such as drawing or any other talent
No. of applications | {0-6,5-11,11-16,>16} | The proportion of proposals from graduates submitted to industry employers.

Table 2
Data gathered from three universities in Jordan

Table 4
A tabular representation of the confusion matrix for the decision tree classifier with three attributes is presented

Table 5
A tabulated version of the confusion matrix is provided for the SVM classifier with three attributes

Table 9
The RMSE and Kappa statistics values for every classifier employing three attributes
Table 9 displays the performance evaluation of the classifiers based on RMSE and Kappa statistic values. ANFIS ranks first, with a lower RMSE value (0.3490) and a higher Kappa statistic value (0.7376) than the other classifiers.

Table 10
The computation time required for building classification models using three attributes for each classifier

Table 11
Confusion matrix for ANFIS classifier with four attributes

Table 14
Confusion matrix for the Naive Bayes classifier with four attributes

Table 16
Accuracy by class for classifiers with four attributes
Table 16 presents information regarding the performance of the ANFIS classifier in predicting both the "Employed" and "Not-employed" classes. The table shows that the ANFIS classifier achieved the highest values for TP-Rate, Precision, Recall, and F-Measure, indicating its superior performance compared to other classifiers. The ANFIS classifier also had the lowest FP-Rate, which further supports its accuracy in making predictions. However, when predicting the "Not-employed" class exclusively, the ANFIS classifier had lower values for TP-Rate, Precision, and Recall, indicating a relatively weaker performance in this specific prediction task. Overall, the results suggest that

Table 17
The RMSE and Kappa statistic values for each classifier when applying four attributes

Table 17 compares the performance efficiency of each classifier based on their RMSE and Kappa statistic values when utilizing four attributes. ANFIS emerges as the top-performing classifier, with the lowest RMSE value of 0.3025 and the highest Kappa statistic value of 0.7635 compared with the other classifiers. [...] placing it in fourth place, while Naïve Bayes has the highest RMSE value of 0.5439 and the lowest Kappa statistic value of 0.6535, making it the least efficient classifier. Fig. 17 illustrates a comparison of classifier efficiency based on RMSE and Kappa statistic values when applying four attributes.

Table 18
Execution time of four attribute classifiers

Table 19
Accuracy (%), RMSE, Kappa, and Execution Time (Secs) for Various Number of Attributes