Brain Storm Optimization Based Clustering for Learning Behavior Analysis

Recently, online learning platforms have proven to help people gain knowledge more conveniently. Since the outbreak of COVID-19 in 2020, online learning has become a mainstream mode, as many schools have adopted this format. The platforms capture substantial data on students' learning activities, which can be analyzed to determine relationships between learning behaviors and study habits. An intelligent analysis method is therefore needed to process this high volume of information efficiently. Clustering is an effective data mining method that discovers the data distribution and hidden characteristics in uncharacterized online learning data. This study proposes a clustering algorithm based on brain storm optimization (CBSO) to categorize students according to their learning behaviors and determine their characteristics. Teaching can then be tailored to take those results into account, thereby improving education quality over time. Specifically, each individual in CBSO represents a distribution of students over clusters, and the optimal one is found through the operations of convergence and divergence. Experiments are performed on the online learning data of 104 students, and the results show that CBSO is feasible and efficient.


Introduction
Online learning is becoming more popular because courses can be taken easily and repeatedly, from anywhere and at any time. The number of students is virtually unlimited, as even individuals who are not enrolled in a learning institution can sign up. With the outbreak of COVID-19 in 2020, many educational institutions were compelled to embrace online learning. Because these learning platforms capture substantial data on the ongoing learning activities of each student, a considerable amount of information regarding learning behavior can be gathered. For example, analysis of the data can reveal relationships between learning behaviors and performance scores, so that lectures can be tailored accordingly, improving educational quality and experience [1,2].
Students differ in learning manner, habits, basic knowledge, and interests. Manual analysis methods struggle to process such complex information, so intelligent methods are needed. This paper designs a clustering algorithm based on brain storm optimization (CBSO) to analyze online learning behaviors [3,4].
BSO combines swarm intelligence techniques with data mining and data analysis. The distribution of generated solutions can be changed according to the properties of a problem. However, BSO converges slowly and easily falls into local optima. As such, many BSO variants have been proposed. For example, Zhu proposed a BSO algorithm that replaced k-means with k-medians [18] and used the median to calculate the center point, which is the point that is closest to the other points in the cluster. Yang proposed a BSO algorithm that replaced Gaussian variation with differential variation [19], thereby greatly improving the running speed and search efficiency. Shi introduced the idea of a target space [20], which reduced the complexity of clustering by dividing solutions into elite and ordinary solutions. Zhang replaced k-means with SGM clustering [21], and Gaussian variation with differential variation, to improve the operational efficiency and optimization accuracy. Wu used differential mutation and clustering to improve the search speed and accuracy [22]. Yang introduced a new method that used a discussion mechanism to generate new individuals and avoid falling into a local optimum [23]. Wu proposed an adaptive inertial selection strategy to solve the problem of low accuracy and slow convergence [24]. Xue used BSO to solve multi-objective problems, and employed non-dominated sorting to update the archive set composed of non-inferior solutions [25]. Wu proposed a multi-dimensional BSO algorithm with high objectives to improve convergence and diversity, enhancing the efficiency in solving multi-objective problems [26].
The BSO algorithm is robust in solving clustering problems, whereas current clustering methods have some defects. For example, k-means depends on the selection of initial centers and is sensitive to noise and outliers. Density-based spatial clustering of applications with noise (DBSCAN) does not perform well on problems that are high-dimensional or have large variations in density.
Clustering is an important data mining technique for discovering data distribution and hidden patterns. Through clustering, the relationship between learning behavior and performance can be found from uncharacterized online learning behavior data, and the characteristics of students can thus be obtained.
In this paper, CBSO is applied to cluster and analyze students' online learning data. Our objectives are as follows: 1) to construct a clustering optimization model based on BSO; 2) to improve the clustering accuracy; 3) to apply the proposed model to data on students' online learning behaviors; 4) to provide suggestions to students for better education quality.
The remainder of this paper is organized as follows. Section 2 provides background information. Section 3 describes the CBSO algorithm. Section 4 describes the experiment design. Section 5 presents results and discussion, and Section 6 presents our conclusions.

Related Work
Human brainstorming is a method proposed by Osborn in 1953 to enable a group of people to come up with as many ideas as possible. Everyone can present ideas freely and find inspiration by listening to others. After all ideas are evaluated, the best is selected.
BSO simulates this process and has three corresponding steps: initializing individuals, generating new ones, and selecting the best one [27]. The steps are as follows:
Step 1: N individuals are initialized and grouped into k clusters, and the best individual in each cluster is selected as the cluster center.
Step 2: Updating strategies are selected according to probability functions. These strategies and random disturbances are used to generate new individuals.
Step 3: The fitness function is used to evaluate individuals and retain the better ones.
To strengthen the search ability of BSO, Gaussian variation is used to update individuals in Step 2. The updating method is shown in Eqs. (1) and (2).
where $\zeta$ is the step length, $max_{it}$ is the maximum number of iterations, $n_{it}$ is the current number of iterations, $K$ adjusts the convergence rate of the algorithm, $rand$ is a random number drawn from (0,1), logsig is the logarithmic sigmoid function, $X_{new}^{i}$ is the $i$th dimension of the generated individual, $X_{select}^{i}$ is the $i$th dimension of the old individual, and $n(\mu, \sigma^{2})$ is a normally distributed random number.
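Eqs. (1) and (2) themselves are not reproduced in this text. A form consistent with the symbol definitions above, and with the update rule commonly given for BSO in the literature, would be the following. The factor 0.5, the order of the two equations, and the exact notation are assumptions rather than a transcription of the original.

```latex
\zeta = \operatorname{logsig}\!\left(\frac{0.5 \times max_{it} - n_{it}}{K}\right) \times rand

X_{new}^{i} = X_{select}^{i} + \zeta \times n(\mu, \sigma^{2})
```

Under this form the step length is large in early iterations and shrinks toward zero as $n_{it}$ approaches $max_{it}$, so the search moves from exploration to exploitation.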

Main Idea
Clustering divides samples so that those with high similarity fall into the same cluster. Hence, samples in the same cluster are highly similar, and samples in different clusters are highly heterogeneous. The Euclidean distance is used as the fitness value to judge the similarity of samples. Because a division of samples can be represented well by a solution of a swarm intelligence algorithm, and BSO is a good global optimization method, CBSO is used to find the optimal clustering of the samples [28].

Initialization and Representation of Solutions
Algorithm 1 describes the initialization and representation of solutions for clustering problems. The value of each dimension of an individual is generated randomly in (0,1). This range is divided evenly into intervals, one per cluster. A discrete value is obtained for each dimension by determining which interval the corresponding continuous value falls in. Each $x_i$ in an individual $x = \{x_1, x_2, x_3, \ldots, x_n\}$ is a cluster label.
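The encoding described above can be illustrated with a minimal sketch. It assumes that an individual has one dimension per sample and that the intervals have equal width 1/k; the function names `init_population` and `decode_labels` are hypothetical and are not taken from Algorithm 1.

```python
import numpy as np

def init_population(n_individuals, n_samples, rng):
    """Create individuals whose dimensions are continuous values in [0, 1)."""
    return rng.random((n_individuals, n_samples))

def decode_labels(individual, k):
    """Map each continuous value to the index of the equal-width interval it falls in."""
    labels = np.floor(np.asarray(individual, dtype=float) * k).astype(int)
    # Guard the boundary case value == 1.0, which would otherwise map to label k.
    return np.minimum(labels, k - 1)

rng = np.random.default_rng(0)
population = init_population(n_individuals=5, n_samples=8, rng=rng)
labels = decode_labels(population[0], k=4)  # one cluster label in {0, 1, 2, 3} per sample
```

With k = 4, for instance, a value of 0.24 decodes to label 0 and a value of 0.5 decodes to label 2.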

Objective Function
The Euclidean distance is used as the objective function to gather similar samples into the same cluster. The smaller the distance, the more similar the samples are. The objective function is given in Eq. (3).
where $k$ is the number of clusters, $c_j$ is the center of the $j$th cluster, $x_i$ is the value of each sample in the $j$th cluster, $m_j$ is the number of members in the $j$th cluster, and $F$ is the fitness value.
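Eq. (3) is not reproduced in this text. Reading the definitions above, a plausible form is $F = \sum_{j=1}^{k} \sum_{i=1}^{m_j} \lVert x_i - c_j \rVert$, the sum of Euclidean distances between each sample and the center of its cluster. The sketch below evaluates that assumed form; taking the center as the mean of the cluster members and skipping empty clusters are further assumptions.

```python
import numpy as np

def clustering_fitness(samples, labels, k):
    """Sum of Euclidean distances from each sample to the center of its cluster."""
    samples = np.asarray(samples, dtype=float)
    labels = np.asarray(labels)
    total = 0.0
    for j in range(k):
        members = samples[labels == j]
        if len(members) == 0:
            continue  # assumption: an empty cluster contributes nothing
        center = members.mean(axis=0)  # assumption: center is the member mean
        total += np.linalg.norm(members - center, axis=1).sum()
    return total

toy_samples = [[0, 0], [2, 0], [10, 0], [10, 4]]
toy_fitness = clustering_fitness(toy_samples, [0, 0, 1, 1], k=2)
```

A smaller value of F indicates tighter clusters, so the search minimizes it.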

Clustering Method for Individuals in the Population of BSO
To reduce the search space and accelerate convergence, individuals in BSO are clustered by k-means [29][30][31]. Then, the central individual of each cluster is obtained by comparing fitness values. In addition, a random probability p is used to judge whether to replace the current central individual with a new one, to avoid premature convergence and falling into a local optimum. The steps are shown in Algorithm 2.
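The grouping step can be sketched as follows. Algorithm 2 is not reproduced in this text, so this is one reading of the description: a plain Lloyd's k-means groups the individuals, the member with the lowest fitness becomes the center of its group, and with probability `p_replace` one center is overwritten by a freshly generated random individual. All function and parameter names here are hypothetical.

```python
import numpy as np

def kmeans_labels(points, k, rng, n_iter=20):
    """Plain Lloyd's k-means; returns a group index for every row of points."""
    centers = points[rng.choice(len(points), size=k, replace=False)]
    for _ in range(n_iter):
        dists = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
        groups = dists.argmin(axis=1)
        for g in range(k):
            if np.any(groups == g):
                centers[g] = points[groups == g].mean(axis=0)
    return groups

def group_population(population, fitness, k, p_replace, rng):
    """Group individuals and take the fittest (lowest F) member of each group as center."""
    population = population.copy()
    groups = kmeans_labels(population, k, rng)
    center_idx = {}
    for g in np.unique(groups):
        members = np.flatnonzero(groups == g)
        center_idx[int(g)] = int(members[np.argmin(fitness[members])])
    # Assumed reading of the replacement rule: occasionally swap one center
    # for a random individual to keep diversity in the population.
    if rng.random() < p_replace:
        g = int(rng.choice(list(center_idx)))
        population[center_idx[g]] = rng.random(population.shape[1])
    return population, groups, center_idx

rng = np.random.default_rng(1)
pop = rng.random((12, 6))
fit = rng.random(12)  # placeholder fitness values, for illustration only
new_pop, groups, centers = group_population(pop, fit, k=3, p_replace=0.0, rng=rng)
```

If a center is replaced, its fitness must be re-evaluated before the next selection step; the sketch leaves that to the caller.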

The Operations for Generating New Solutions
New individuals are generated according to Eqs. (1), (2), and (4). The probability parameters p 1 , p 2 , and p 3 control how the parent individuals are selected and help the search avoid falling into a local optimum. These parameters enhance the search capability over the iterations so that the global optimal solution can be found [32]. Probability parameter p 1 determines whether one or two clusters are selected. If the generated probability value is less than p 1 , one cluster is selected; otherwise, two clusters are selected. Probability parameter p 2 is used to select either the central or an ordinary individual in a single cluster to generate new individuals, and probability parameter p 3 is used to select either the central or an ordinary individual from each of the two clusters, which are then combined to generate new individuals. The steps are shown in Algorithm 3.
where $X_{select}^{i}$ is the $i$th dimension of the selected individual, $rand$ is a random number drawn from (0,1), $X_{class1}^{i}$ is the $i$th dimension of the selected class 1, and $X_{class2}^{i}$ is the $i$th dimension of the selected class 2.
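The selection and update rules can be sketched together. Algorithm 3 and Eq. (4) are not reproduced in this text; the sketch assumes that Eq. (4) is the convex combination $X_{select}^{i} = rand \times X_{class1}^{i} + (1 - rand) \times X_{class2}^{i}$, that the disturbance follows the usual BSO step with a standard normal variate, and that new values are clipped to [0, 1] so they can still be decoded into labels. The name `generate_individual` and the parameter values in the usage lines are placeholders, not the settings of Tab. 1.

```python
import numpy as np

def logsig(x):
    """Logarithmic sigmoid function."""
    return 1.0 / (1.0 + np.exp(-x))

def generate_individual(population, groups, center_idx, n_it, max_it,
                        p1, p2, p3, K, rng):
    group_ids = list(center_idx)

    def pick(g, use_center):
        # Either the center of group g or a random member of it
        # (which may happen to be the center itself).
        if use_center:
            return population[center_idx[g]]
        members = np.flatnonzero(groups == g)
        return population[rng.choice(members)]

    if rng.random() < p1 or len(group_ids) < 2:
        # One cluster: use its center with probability p2, else an ordinary member.
        g = group_ids[rng.integers(len(group_ids))]
        selected = pick(g, rng.random() < p2)
    else:
        # Two clusters: combine two centers (probability p3) or two ordinary members.
        i1, i2 = rng.choice(len(group_ids), size=2, replace=False)
        use_center = rng.random() < p3
        r = rng.random()
        selected = (r * pick(group_ids[i1], use_center)
                    + (1.0 - r) * pick(group_ids[i2], use_center))

    # Random disturbance: step length times a normally distributed number.
    step = logsig((0.5 * max_it - n_it) / K) * rng.random()
    candidate = selected + step * rng.normal(0.0, 1.0, size=selected.shape)
    return np.clip(candidate, 0.0, 1.0)  # assumption: keep values decodable

rng = np.random.default_rng(2)
population = rng.random((6, 5))
groups = np.array([0, 0, 0, 1, 1, 1])
center_idx = {0: 0, 1: 3}
child = generate_individual(population, groups, center_idx, n_it=10, max_it=100,
                            p1=0.5, p2=0.5, p3=0.5, K=20, rng=rng)
```

In a full loop, the new individual would replace the old one only if its fitness is better, as in Step 3 of BSO.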

Datasets
We collected the online learning data of students in the spring of 2020, including student number, student name, course number, days and times logged in to the web, days and times logged in to a course, number of posts, number of replies, post points, and total points. To avoid the impact of abnormal and useless information, some abnormal samples were removed, and 104 samples, without student number and name, were retained for subsequent analysis.
Since the values in the collected dataset differed greatly, we standardized the data with a Z-score to reduce their impact on CBSO:
$$y_i = \frac{x_i - \mu}{\sigma}$$
where $y_i$ is the normalized value, $x_i$ is the original value, and $\mu$ and $\sigma$ are the mean and standard deviation, respectively, of all samples.
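The standardization can be sketched in a few lines. The sketch assumes that the mean and standard deviation are computed per feature (column) over all samples and that the population standard deviation is used; the function name `z_score` is hypothetical.

```python
import numpy as np

def z_score(data):
    """Standardize each feature column to zero mean and unit standard deviation."""
    data = np.asarray(data, dtype=float)
    mean = data.mean(axis=0)
    std = data.std(axis=0)
    std[std == 0] = 1.0  # constant columns stay at zero instead of dividing by zero
    return (data - mean) / std

toy = [[1, 10], [2, 20], [3, 30]]  # toy values, not the study's data
standardized = z_score(toy)
```

After this step every feature contributes on a comparable scale to the Euclidean distance used as the fitness value.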
Since the original data had nine features, we used principal component analysis (PCA) to reduce the dimensions to two when plotting the clustering results.
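A minimal sketch of the dimensional reduction is given below, computing PCA through the singular value decomposition of the centered data. The paper does not state which PCA implementation was used, so this is only one possible realization; `pca_2d` is a hypothetical name and the input is synthetic data with the same shape as the dataset (104 samples, nine features).

```python
import numpy as np

def pca_2d(data):
    """Project the data onto its first two principal components."""
    data = np.asarray(data, dtype=float)
    centered = data - data.mean(axis=0)
    # Rows of vt are the principal directions, ordered by decreasing variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:2].T

rng = np.random.default_rng(3)
synthetic = rng.normal(size=(104, 9))  # synthetic stand-in, not the study's data
projected = pca_2d(synthetic)          # shape (104, 2), ready for a scatter plot
```

The projection is used only for visualization; the clustering itself runs on all nine standardized features.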

Parameter Settings
In the CBSO algorithm, the maximum number of iterations was 12000, and the algorithm was run 30 times to ensure the stability and accuracy of the experiment. The probability parameters of the four updating strategies and the initial number are shown in Tab. 1.

Experiment Design
Students were clustered into four categories, and the average value of all student data in each category was calculated. The characteristics of each category can then be identified by analyzing these values, so that lectures can be tailored accordingly for better educational quality and higher enthusiasm in learning.
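The per-category averages can be computed as in the following sketch, which takes the standardized feature matrix and the cluster labels of the best individual. The function name `cluster_profiles` and the toy inputs are hypothetical.

```python
import numpy as np

def cluster_profiles(data, labels, k):
    """Return the size of each cluster and the mean of every feature within it."""
    data = np.asarray(data, dtype=float)
    labels = np.asarray(labels)
    sizes = np.bincount(labels, minlength=k)
    means = np.full((k, data.shape[1]), np.nan)  # NaN marks an empty cluster
    for j in range(k):
        if sizes[j] > 0:
            means[j] = data[labels == j].mean(axis=0)
    return sizes, means

toy_data = [[1, 2], [3, 4], [10, 10]]  # toy values, not the study's data
sizes, means = cluster_profiles(toy_data, [0, 0, 1], k=2)
```

Each row of `means` is the profile of one category and can be compared across categories feature by feature.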

Computational Results and Analysis
The clustering results are shown in Tab. 2, in which the value in each row is an average. The first through fourth categories had 23, 25, 37, and 19 students, respectively. In Tab. 2, the days logged in to the web and to courses, the times logged in to the web and to courses, and the numbers of posts and replies reflect the enthusiasm of students in learning. The post points and total points reflect students' performance. Thus, four characteristics of students were defined according to enthusiasm and performance. For example, the values of the first category were -0.3178, 0.4574, 0.9209, 0.7142, 0.6771, -0.0177, 1.1724, 1.1665, and 1.4339, respectively. Compared with the other three categories, these students have the best learning enthusiasm and performance. Therefore, such students are defined as those with good performance and high enthusiasm.
The same method is used to define the remaining three characteristics of students as follows: poor performance and high enthusiasm, good performance and low enthusiasm, and poor performance and low enthusiasm. Tab. 3 describes the four characteristics of students. Fig. 1 shows the clustering of the 104 students after dimensional reduction, which indicates that the CBSO algorithm clusters samples with similar characteristics together.

Teaching Advice
Lectures can be tailored to help students achieve better learning results. For example, first-category students are highly motivated to learn and have strong learning capabilities, so they should be provided with more learning resources for further improvement. Although the students in the second category are motivated to learn, they do not achieve good results. They need learning guidance to avoid studying blindly. Conversely, students in the third category have low learning enthusiasm, but their grades are acceptable. We should enhance teaching supervision and cultivate their interest in learning to raise their potential. Finally, students in the fourth category have neither learning enthusiasm nor good grades. Thus, we should strengthen teaching supervision, cultivate their interest in learning, and change the manner of teaching to improve their performance.

Conclusions
The BSO algorithm takes advantage of swarm intelligence and data analysis, and can be used to solve clustering problems efficiently. Thus, this paper proposed a clustering algorithm based on BSO. Individuals were discretized to represent the distribution of samples, and the Euclidean distance was used to measure the similarity of samples. Experiments showed that CBSO is feasible and efficient for solving clustering problems. CBSO was therefore applied to analyze the relationship between learning data and performance, so as to provide tailored guidance for each category of student. However, BSO still falls easily into local optima. We look forward to optimizing the model to obtain more accurate clustering results.
Conflicts of Interest: The authors declare that they have no conflicts of interest to report regarding the present study.