Analysis of Influencing Factors of Social Mental Health Based on Big Data

Big data refers to large-scale, rapidly growing collections of information whose size and complexity exceed what conventional data processing tools can easily store or process. Big data research methods have been widely used in many disciplines, as research methods based on massive data analysis have aroused great interest in scientific methodology. In this paper, we propose a deep computational model to analyze the factors that affect social mental health. The proposed model utilizes a large manually annotated microblog dataset. This dataset is divided according to six main factors that affect social mental health: economic market correlation, political democracy, the management legal system, cultural trends, the expansion of the information level, and the fast rhythm of life. The proposed model compares the review data of the different influencing factors to obtain the degree of correlation between social mental health and these factors.


Introduction
Big data mainly refers to the relatively macroscopic network data generated by Internet platforms and has become an important part of big data research; it differs from the microscopic big data generated in genetics and brain science. Social science research based on big data analysis technology is of great significance for mastering political, economic, and social psychological and behavioral laws [1]. Network big data analysis has several advantages. Sample size and representativeness: relying on the reach of network platforms, it can achieve measurement covering large-scale groups; such samples are close to the overall population characteristics, which helps to solve the representativeness problems of traditional research methods. Timeliness: network big data makes it possible to track large-scale group measurements regularly, even in real time; the tracking granularity can be yearly, monthly, daily, hourly, or even minute by minute. Objectivity: network big data is based on users' objective behavioral data, such as search engine queries and clicks, social network likes, forwards, and published content, and therefore provides objective evidence. Cost-effectiveness: traditional research methods are often limited by research costs such as human and financial resources, making regular, real-time measurement of large-scale groups impossible; network big data analysis, supported by technologies such as web crawling and text analytics, makes it possible to obtain large amounts of data at relatively low cost [2].
Research methods based on massive big data analysis have prompted new thinking about scientific methodology. Research no longer requires direct contact with the research subject; new findings can be obtained by directly analyzing and mining massive data, which may have spawned a new research model [3]. To this end, Turing Award winner Gray distinguished data-intensive science from computational science and described the "fourth paradigm" based on data-intensive scientific research [4]. However, some researchers have doubts about big data analysis technology and believe that big data cannot replace the original research methods. It can be seen that there is still controversy over how to treat network big data analysis technology.
In this paper, to overcome this problem, we propose a new framework model based on psychology, which is one of the important areas where big data and social science combine [5]. Our main contribution is to present big data analysis methods that have been widely used to address emotional psychology [6]; behavioral economics, personality psychology, and health psychology [7]; political psychology; and many other important psychological issues. In combination with other empirical methods of psychology, and for the effective feature set in this study, we treat psychological research as based on network big data analysis technology. The proposed model compares the review data of different influencing factors to obtain the degree of correlation between social mental health and these factors. Unfortunately, systematic thinking about the methodology of online big data psychology is still scarce [8]. The rest of the paper is organized as follows: In Section 2, the methods are discussed in detail. Section 3 covers big data key technologies, while in Section 4 the neural network algorithm is discussed. Section 5 presents the experimental results and discussion. Finally, we present the conclusion and future work.

Methods
In this section, we present the two different research methods that are used in this literature. These research techniques include cognitive neuroscience technology and traditional research techniques.

Cognitive Neuroscience Technology.
Cognitive neuroscience techniques, which took shape in the 1980s and 1990s, are formed by the combination of cognitive science and neurology. Cognitive neuroscience research has attracted widespread attention from psychologists and prompted reflection on the development of psychology [9]. Such research mainly uses modern cognitive neuroscience techniques, such as electroencephalography (EEG), functional magnetic resonance imaging (fMRI), magnetoencephalography (MEG), and transcranial magnetic stimulation (TMS), to reveal the laws of the occurrence and development of psychology and behavior. Moreover, cognitive neuroscience has become one of the hottest cross-disciplinary fields in recent years. Its main areas include research on the neural mechanisms of cognitive behavior, neuroscience research on cognitive behavioral psychology theory, and research on theoretical models of cognitive behavioral psychological mechanisms based on brain neural stimulation [10]. For example, it explores the neural mechanisms of individual psychological activities and behaviors such as perception, learning and memory, attention, speech, executive control, thinking, and emotion [11].

Traditional Research Techniques.
The traditional research techniques are divided into two main methods: the questionnaire method and the behavioral experiment method. The questionnaire method is based on respondents' self-reports on a series of questionnaires and uses these as data evidence to study people's psychological and behavioral patterns [12]. The main advantages of this method are that it is targeted and provides quick access to data from larger populations. By designing structured, standardized, or open-ended questions directly related to the purpose of the research, first-hand data on the problem are collected from respondents in a targeted manner. The questionnaire method can obtain data from large-scale populations relatively quickly through standardized operating procedures. Compared with the high research cost of cognitive neuroscience technology, the cost of a questionnaire survey is relatively low and the sample coverage is relatively large. However, the questionnaire survey method has certain defects in subjective bias, sample size, and timeliness [13], whereas network big data analysis technology has relatively large advantages in these respects. Its objectivity is poor: because the questionnaire survey method relies on self-reports, there is a strong subjective bias in the research results, of which the social approval response is the most typical [14]. Similarly, the behavioral experiment method mainly involves the experimenter designing different experimental conditions, observing the differences in the subjects' behavioral results under the different conditions, and testing whether the experimental conditions significantly affect the outcome [15]. The behavioral experiment method can thus provide data for exploring causality.
Since the experimental method controls the influence of unrelated interference factors, it can judge more reliably whether differences in behavioral results are caused by the experimental conditions (i.e., the independent variables), thereby enabling causal inference. In view of the significance of causality in research, experimental methods occupy an important position in scientific methodology [16]. In a behavioral experiment, however, the experimenter's artificial conditional intervention and the laboratory-specific environment may cause the experimental situation to differ from the subject's actual environment, thereby interfering with the experimental results. Therefore, behavioral experiment methods generally face the challenge of low ecological validity. Although natural experiments can improve the authenticity of the environment, they present challenges in operational feasibility and disturbance-variable control. Moreover, behavioral experiments typically have small sample sizes and limited representativeness [17]. Network big data, by contrast, is built on users' real network situations and collects objective data without intervening in user behavior, so it has high ecological validity.

Big Data Features.
The term big data was coined in 1997 and usually denotes a massive and compound collection of data. Big data has four outstanding features: massive volume, sparse value, multisource heterogeneity, and exponential growth. Massive volume means the data comprise full-sample, ultrahigh-dimensional complex data rather than a small amount of sampled data. Sparse value means the value of a single piece of data is extremely low and the correlation between data items is weak. Multisource heterogeneity means the data sources are complex and the channels wide, and most of the data are unstructured data mixed with structured data, which are difficult to sort and organize. Exponential growth means the amount of data grows exponentially over time, and the data volume can reach the terabyte scale [18].

Big Data Core Technology.
To overcome the information threshold of big data and discover its information value, its key technology is divided into three layers [19]: the data platform, the analysis platform, and the display platform. The data platform is responsible for the collection, classification, and storage management of big data. The gathered data must be filtered and marked; marked data are continuously cleaned and updated, and they contain most of the research value of the massive data [9]. The analysis platform is responsible for big data computation and analysis and is the main route from data to value; the transformation process requires strong computing platform support. Commonly used distributed big data computing frameworks include MapReduce, Parameter Server, and so forth, and the commonly used analysis methods are manual modeling and neural network analysis [20, 21]. The display platform presents big data products, including big data research rules or value research models. The complete analysis, marking, and extraction process of big data is shown in Figure 1.
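As a minimal single-process sketch of the map/reduce computation style named above (real frameworks such as MapReduce shard both phases across machines; the word-count task here is the customary illustrative example, not one from this paper):

```python
from collections import Counter
from functools import reduce

def map_phase(doc):
    # Map: emit per-document word counts (word -> 1 for each occurrence).
    return Counter(doc.split())

def reduce_phase(c1, c2):
    # Reduce: merge two partial count tables by adding counts.
    return c1 + c2

docs = ["big data analysis", "big data platform"]
totals = reduce(reduce_phase, map(map_phase, docs))
print(totals["big"])  # 2
```

In a real distributed framework the map calls run on separate workers and the reduce step merges their partial results; the single-process version above only shows the data flow.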

Neural Network Algorithm
The NN algorithm is an automatic computation technique that simulates the brain's learning mechanism. Research in this area covers NN organization simulation, learning algorithms, memory models, and network communication models. At present, the neural network models realized by research mainly include the feed-forward neural network, the recurrent neural network, and time-series memory neural networks. Neural networks are highly efficient network models whose outstanding feature-learning capability is broadly used in image and speech recognition. The feed-forward neural network is divided into multiple layers, each consisting of multiple sets of neurons, and information flows one way, layer by layer, from the input. Based on this feature, the feed-forward neural network can effectively extract the spatial structure characteristics of data. Its most famous implementations, such as the perceptron and the deep autoencoder, have made outstanding contributions to the development of artificial intelligence and computer vision.

Restricted Boltzmann Machine.
In 1986, Hinton and Sejnowski proposed the RBM, a generative stochastic neural network. The network consists of visible units and hidden units; both the visible variables and the hidden variables are binary, taking values in {0, 1}. The whole network forms a bipartite graph: edges exist only between visible units and hidden units and, as seen in Figure 2, there are no connections within the visible layer or within the hidden layer. RBM is an energy-based model whose energy over the visible variables v and hidden variables h is defined as

E(v, h; θ) = −Σ_{i,j} W_{ij} v_i h_j − Σ_i a_i v_i − Σ_j b_j h_j,   (1)

where θ = {W, a, b} are the parameters of the RBM, W is the weight matrix on the edges between the visible and hidden units, and a and b are the biases of the visible and hidden units. Based on the joint configuration of the visible variable v and the hidden variable h, we can obtain the joint probability of h and v:

P(v, h; θ) = exp(−E(v, h; θ)) / Z(θ),   (2)

where Z(θ) = Σ_{v,h} exp(−E(v, h; θ)) is the normalization factor, called the partition function. According to equation (1), equation (2) can be marginalized over h to give the likelihood of the observed data:

P(v; θ) = Σ_h exp(−E(v, h; θ)) / Z(θ).   (3)

We hope to maximize the likelihood P(v) of the observed data; maximizing P(v) is equivalent to maximizing log P(v) = L(θ).
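The energy and joint-probability formulas above can be checked numerically on a toy model. The following sketch (our own illustration, not the paper's code) builds a tiny 2-visible / 2-hidden binary RBM and verifies that the joint probabilities sum to 1:

```python
import numpy as np

def rbm_energy(v, h, W, a, b):
    """Energy E(v, h; theta) = -v^T W h - a^T v - b^T h."""
    return -(v @ W @ h + a @ v + b @ h)

def rbm_joint_prob(v, h, W, a, b, all_v, all_h):
    """Joint probability P(v, h) = exp(-E) / Z, with the partition
    function Z summed over every configuration (tractable only for
    tiny illustrative models like this one)."""
    Z = sum(np.exp(-rbm_energy(vv, hh, W, a, b))
            for vv in all_v for hh in all_h)
    return np.exp(-rbm_energy(v, h, W, a, b)) / Z

# Tiny RBM with 2 binary visible units and 2 binary hidden units.
rng = np.random.default_rng(0)
W = rng.normal(scale=0.1, size=(2, 2))
a = np.zeros(2)
b = np.zeros(2)
configs = [np.array(c, dtype=float) for c in [(0, 0), (0, 1), (1, 0), (1, 1)]]
total = sum(rbm_joint_prob(v, h, W, a, b, configs, configs)
            for v in configs for h in configs)
print(round(total, 6))  # probabilities over all 16 configurations sum to 1
```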

Then, according to stochastic gradient descent, we can obtain the update rules for the parameters. The training procedure (contrastive divergence) is as follows:
(1) Take a data sample and set the states of the visible variables to this sample; randomly initialize W.
(2) Update the states of the hidden variables according to the first formula of equation (6); that is, h_j is set to state 1 with probability P(h_j = 1 | v) and otherwise set to 0. Then, for each edge v_i h_j, calculate P_data(v_i h_j) = v_i · h_j (note that the states of v_i and h_j are both in {0, 1}).
(3) Reconstruct v_1 according to the state of h and the second formula of equation (6), obtain h_1 from v_1 and the first formula of equation (6), and calculate the reconstruction statistic P_recon(v_i h_j) in the same way.
(4) Update the weight of each edge in proportion to the difference P_data(v_i h_j) − P_recon(v_i h_j).
(5) Take the next data sample and repeat steps 1-4. The above process is iterated K times.
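The sampling procedure above corresponds to one step of contrastive divergence (CD-1). A minimal NumPy sketch, with function and variable names of our own choosing, might look like this:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def cd1_update(v0, W, a, b, lr=0.1, rng=np.random.default_rng(0)):
    """One contrastive-divergence (CD-1) step for a binary RBM.
    W has shape (n_visible, n_hidden); a and b are the visible and
    hidden biases. W, a, b are updated in place and returned."""
    # Step 2: sample hidden states from P(h_j = 1 | v0).
    ph0 = sigmoid(v0 @ W + b)
    h0 = (rng.random(ph0.shape) < ph0).astype(float)
    # Step 3: reconstruct visible units from P(v_i = 1 | h0) ...
    pv1 = sigmoid(h0 @ W.T + a)
    v1 = (rng.random(pv1.shape) < pv1).astype(float)
    # ... and re-infer hidden probabilities for the reconstruction.
    ph1 = sigmoid(v1 @ W + b)
    # Step 4: data statistics minus reconstruction statistics.
    W += lr * (np.outer(v0, h0) - np.outer(v1, ph1))
    a += lr * (v0 - v1)
    b += lr * (h0 - ph1)
    return W, a, b
```

Using the hidden probabilities rather than sampled states in the negative phase (as done for `ph1` here) is a common variance-reduction choice; the text's description uses sampled states throughout.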

Deep Belief Network.
The deep belief network (DBN) consists of a backpropagation (BP) network and a multilayer restricted Boltzmann machine (RBM) network. Figure 3 shows the overall model. In the deep belief network, the learned output of each RBM layer is used as the input of the next layer, so that each layer abstracts the features of the layer below and the data features are extracted layer by layer. The top-level BP network uses the features extracted by the RBM network as input for classification or prediction. The RBM consists of a visible layer v and a hidden layer h, as shown in Figure 4. The visible layer is used to input feature data, and the hidden layer serves as a feature detector. The nodes within the visible layer and within the hidden layer have no connections with each other; that is, each node takes its value independently of the others. Each node of the hidden layer can only randomly take the value 0 or 1, and the full probability distribution P(v, h) satisfies the Boltzmann distribution. The full probability distribution can be computed from the conditional distributions P(v | h) and P(h | v). When v is input, the hidden layer h can be obtained from P(h | v), and given the hidden layer h, the visible layer can be obtained from P(v | h). By adjusting the parameters so that the visible layer v′ obtained from the hidden layer is the same as the original visible layer v, the hidden layer becomes another representation of the visible layer. Therefore, the hidden layer can be used as a feature representation of the visible layer input data. The joint distribution of the RBM under given model parameters θ is

P(v, h; θ) = exp(−E(v, h; θ)) / Z,

where Z = Σ_{v,h} exp(−E(v, h; θ)) is the normalization factor, and the energy function is as follows:

E(v, h; θ) = −Σ_{i,j} W_{ij} v_i h_j − Σ_i b_i v_i − Σ_j a_j h_j,

where i and j index the nodes; W_{ij} is the connection weight between visible layer unit i and hidden layer unit j; and b_i and a_j are the offsets. The BP neural network consists of three layers of neurons: the input layer, the hidden layer, and the output layer. The structure of the BP network is shown in Figure 5.
BP network in a DBN can be understood as a classifier with supervised learning.
In the BP network, the output of hidden layer node j is O_j = f(Σ_i W_ij x_i − a_j), where a_j is the neuron threshold and f is the activation function, generally taken to be the sigmoid function. The output of output node k is y_k = f(Σ_j T_jk O_j − b_k), where b_k is the neuron threshold and T_jk is the connection strength between hidden layer node j and output layer node k.
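A direct transcription of these two formulas (our own sketch; `W`, `a`, `T`, `b` follow the symbols above):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def bp_forward(x, W, a, T, b):
    """Forward pass of the BP network described above:
    hidden output O_j = f(sum_i W_ij x_i - a_j),
    final output  y_k = f(sum_j T_jk O_j - b_k)."""
    O = sigmoid(x @ W - a)   # hidden layer outputs
    y = sigmoid(O @ T - b)   # output layer outputs
    return y

# Example: 3 inputs, 4 hidden nodes, 2 outputs.
rng = np.random.default_rng(0)
x = np.array([0.5, 0.2, 0.8])
W = rng.normal(size=(3, 4))   # input-to-hidden weights
a = np.zeros(4)               # hidden thresholds
T = rng.normal(size=(4, 2))   # hidden-to-output weights
b = np.zeros(2)               # output thresholds
y = bp_forward(x, W, a, T, b)
print(y.shape)  # (2,)
```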

Experimental Results and Discussion
In this section, we analyze and discuss the results achieved by our proposed model. The microblog data of 2017 and 2018, from January to September, are used as the experimental data.

Evaluation Method.
In this section, we describe how the performance of the proposed model is measured. The confusion matrix is one of the most widely used techniques for assessing classification performance; in this paper, however, the correlation coefficient (Corr) and the mean absolute error (MAE) are used as the evaluation indicators. The correlation coefficient is calculated as follows:

Corr = Σ_{i=1}^{n} (R_i − R̄)(P_i − P̄) / sqrt(Σ_{i=1}^{n} (R_i − R̄)² · Σ_{i=1}^{n} (P_i − P̄)²),

where n is the number of predicted sample points; R_i and P_i are the actual and predicted mental health levels of test sample i, respectively; and R̄ and P̄ are the means of R_i and P_i, respectively. The mean absolute error is calculated with the following equation:

MAE = (1/n) Σ_{i=1}^{n} |R_i − P_i|.
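Both indicators are straightforward to implement; the following sketch mirrors the two formulas above:

```python
import numpy as np

def corr(R, P):
    """Pearson correlation coefficient between actual R and predicted P."""
    R, P = np.asarray(R, float), np.asarray(P, float)
    num = np.sum((R - R.mean()) * (P - P.mean()))
    den = np.sqrt(np.sum((R - R.mean()) ** 2) * np.sum((P - P.mean()) ** 2))
    return num / den

def mae(R, P):
    """Mean absolute error between actual R and predicted P."""
    R, P = np.asarray(R, float), np.asarray(P, float)
    return np.mean(np.abs(R - P))

actual = [1, 2, 3, 4]
predicted = [1.1, 1.9, 3.2, 3.8]
print(corr(actual, predicted))  # close to 1
print(mae(actual, predicted))   # ~0.15
```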

Model Parameters.
The DBN structure is determined by the network depth and the node counts: the number of inputs, the number of outputs, and the number of hidden layer nodes in each layer. The number of visible nodes in the first RBM layer is determined by the number of input sample features. In this study, social psychology involves many influencing factors; we grade each microblog comment on six of them: (1) economic market correlation degree; (2) political democracy correlation degree; (3) management legal system correlation degree; (4) cultural thought diversity degree; (5) degree of information expansion; and (6) rapid life rhythm correlation degree. These six main influencing factors serve as the features for network learning. All data are manually marked, and each influencing factor is graded hierarchically so that it can be digitized as the input tensor. Each influencing factor has its own rating according to the criteria.
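The grading-to-tensor step might be sketched as follows (the factor key names and the normalization of the 1-5 grades to [0, 1] are our assumptions for illustration):

```python
import numpy as np

# Hypothetical encoding: each microblog comment is manually graded
# 1-5 on the six influencing factors named above; grades are
# normalized to [0, 1] to form the DBN input vector.
FACTORS = ["economic_market", "political_democracy", "management_law",
           "cultural_trend", "information_expansion", "life_rhythm"]

def encode_comment(grades):
    """Map a dict of factor grades (1-5) to a length-6 input tensor."""
    return np.array([(grades[f] - 1) / 4.0 for f in FACTORS])

x = encode_comment({"economic_market": 3, "political_democracy": 1,
                    "management_law": 5, "cultural_trend": 2,
                    "information_expansion": 4, "life_rhythm": 3})
print(x.shape)  # (6,)
```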
The depth of the DBN network has a major impact on model performance. Research shows that as the number of RBM layers increases, the DBN's modeling capability is enhanced: the higher hidden layers provide more abstract feature representations and improve the network's prediction performance. However, too many layers may lead to overfitting. The number of hidden nodes also influences the performance of the DBN model: if the number of nodes is too small, the model cannot mine the data well, and if there are too many nodes, the model can easily overfit.

Model Training.
The training of the DBN model is divided into two steps: pretraining and fine-tuning.
Step 1: train each layer of the RBM network separately and unsupervised, obtaining the weights of the generative model through unsupervised greedy layer-by-layer pretraining and ensuring that as much feature information as possible is retained when the feature vectors are mapped to different feature spaces. The RBM training process actually determines, through the weights, the probability distribution that best produces the training samples. That is to say, a distribution is obtained such that the probability of the training samples under this distribution is the greatest.
Step 2: the BP network in the last layer of the DBN receives the output vector of the top RBM as its input feature vector and trains the classifier in a supervised manner. Each layer of the RBM network adjusts the weights within its own layer, which ensures that the feature vector mapping of that layer is optimal but not that the feature vector mapping of the entire DBN is optimal; therefore, the BP network propagates the error information from top to bottom to each RBM layer, fine-tuning the whole DBN network.
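The greedy layer-wise loop of Step 1 can be summarized structurally; the sketch below uses a placeholder `train_rbm` callback standing in for real contrastive-divergence training:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def pretrain_dbn(data, layer_sizes, train_rbm):
    """Greedy layer-wise pretraining (Step 1): each RBM is trained on
    the hidden activations of the layer below, and the stacked weights
    later initialize the supervised BP fine-tuning stage (Step 2)."""
    weights, acts = [], data
    for n_hidden in layer_sizes:
        W, b = train_rbm(acts, n_hidden)  # unsupervised training of one RBM
        acts = sigmoid(acts @ W + b)      # propagate features to the next layer
        weights.append((W, b))
    return weights

# Placeholder trainer: random initialization stands in for real CD updates.
def dummy_rbm(acts, n_hidden, rng=np.random.default_rng(0)):
    return rng.normal(scale=0.1, size=(acts.shape[1], n_hidden)), np.zeros(n_hidden)

data = np.random.default_rng(1).random((10, 6))  # 10 samples, 6 features
ws = pretrain_dbn(data, [100, 100, 100], dummy_rbm)
print(len(ws))  # 3
```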

Training Results.
In this paper, the microblog data of 2017 and 2018 from January to September are used as the experimental data. All the data are crawled from the Internet and manually labeled; the emotional tendency is graded, and the related aspects are also graded. The grade is divided into 5 levels: 1 is the least relevant and 5 is the most relevant. Among them, the data from January to July of 2017 and 2018 are used as training data for the DBN model, the data for August are used as feasibility verification data, and the data for September are used as forecast test data.
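The month-based split described above might be coded as follows (the record layout with a `month` field is a hypothetical assumption):

```python
def split_by_month(records):
    """Split labeled comment records into train/validation/test sets
    using the month-based scheme described above: January-July for
    training, August for verification, September for testing.
    Each record is assumed to be a dict with a 'month' field (1-12)."""
    train = [r for r in records if 1 <= r["month"] <= 7]
    val = [r for r in records if r["month"] == 8]
    test = [r for r in records if r["month"] == 9]
    return train, val, test

records = [{"month": m, "grade": 3} for m in range(1, 10)]
train, val, test = split_by_month(records)
print(len(train), len(val), len(test))  # 7 1 1
```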
In order to reasonably set the DBN's network depth, we study the impact of the number of DBN layers {2, 3, 4} on the model's prediction performance, setting the number of nodes in each hidden layer to 100. The mean absolute error (MAE) is used as the evaluation index, and the results are shown in Figure 6. It can be seen from Figure 6 that the DBN network depth has little effect on the accuracy of psychological prediction, and the three-layer structure has the best overall forecast performance. In this study, the DBN network depth has little effect on forecast performance mainly because the large amount of training data provides sufficient information, so that even fewer RBM layers can deeply mine the data features.
Based on the above research results, we use the three-layer DBN model to further study the influence of the number of hidden layer nodes on the prediction performance of the model. The number of hidden layer nodes is set to 50, 100, and 200, respectively, with MAE again used as the evaluation index; the results are shown in Figure 7. It can be seen from Figure 7 that when the number of hidden layer nodes is 100, the overall prediction performance of the model is optimal. Therefore, this study finally adopts a three-layer DBN model with each hidden layer set to 100 nodes.

Figure 5: BP network structure (input layer, hidden layer, output layer).
In order to verify the accuracy of the proposed method, the DBN prediction model is compared with classical machine learning prediction models: linear regression (LR), neural network (NN), support vector machine (SVM), random forest (RF), and the autoregressive integrated moving average model (ARIMA). The results are shown in Tables 1 and 2. It can be seen from Tables 1 and 2 that the prediction performance of the DBN-based forecasting model is significantly better than that of the other classical prediction models under the two evaluation indexes of correlation coefficient and mean absolute error.
This indicates that, compared with other classical forecasting methods, the deep learning-based forecasting model can deeply mine the input sample features, extract the main factors affecting the mental health level, and reduce the influence of noise in the samples, thus achieving higher forecast accuracy.
Considering that Weibo hot topics differ across time periods, in order to further verify the performance of the deep learning prediction model in different environments, this paper uses the 2017 data as training data and tests the prediction results on the February and July 2018 comments, as shown in Tables 3-6.

Conclusion
In this paper, we proposed a new framework for applying big data processing technology in social psychology. The proposed method is based on a deep belief network that takes six major influencing factors derived from microblog commentary data: the degree of economic market relevance, political democracy correlation, the management legal system, cultural ideological diversification, information expansion, and the rapid correlation of the rhythm of life. The model is trained on the big data of the comments, fully exploring the semantic features in the big data and realizing the mining of emotions and social psychology based on the commentary big data. By comparison with classical machine learning methods under the correlation coefficient and mean absolute error evaluation indexes, the validity of the DBN model in mining social psychological impact is verified. The research shows that a deep learning prediction approach can better overcome the weaknesses of conventional approaches, particularly in the case of big data, and can further explore the importance of big data in mental health research and increase the practical impact of big data comment analysis. Future big data research needs to deal with the relationship between data and theory. On the one hand, data-driven evidence can validate or correct existing theories, and as data-driven evidence continues to accumulate, it is expected to further refine and innovate theories. On the other hand, new theory can in turn guide subsequent empirical research. The combination of data-driven and theory-driven approaches is conducive to the mutual promotion of theory and data.

Data Availability
Data sharing is not applicable to this article as no datasets are generated or analyzed during the current study.