Fusion Analysis of Economic Data of the Medical and Health Industry Based on Blockchain Technology and Two-Way Spectral Cluster Analysis

Because of its great potential in gene expression analysis, which supports disease diagnosis, new drug development, and life science research, the two-way clustering algorithm was proposed and has been widely used in gene expression data research. To understand the economic data of the medical and health industry, this paper analyzes such data for different regions of China based on blockchain technology and two-way spectral cluster analysis and compiles statistics for the eastern, central, and western regions. This paper studies the development status of China's medical and health industry and the factors affecting the agglomeration of the medical and health service industry, and analyzes them using blockchain technology and the two-way spectral cluster analysis method. The results show that the overall development trend of China's medicine and health is a shift from a government-led model to one shared by government, society, and individuals. After the introduction of blockchain technology and two-way spectral cluster analysis, the output value of the pharmaceutical industry increased by about 10%.


Introduction
Traditional clustering analysis algorithms mainly deal with static data. Because real-time data streams are high-speed, real-time, and continuous, traditional clustering analysis algorithms cannot be used on them. Blockchain technology is essentially a distributed witness technology. "Distributed" means that the data are not concentrated in a single data server center but are stored in the nodes of the network. The network members themselves are the storage carriers of the data and directly share, copy, store, and synchronize the data. "Witness" means confirming and notarizing the information uploaded to this distributed network. Once information has been uploaded and verified successfully, it cannot be tampered with, which achieves the purpose of "witness." Because the data are stored in the blockchain instead of on a centralized server, they are protected from tampering, which makes them more credible and reliable. In addition, the permanent preservation of data prevents repudiation. Therefore, blockchain technology fundamentally addresses many problems that current centralized systems face because of their reliance on third parties. Unlike general clustering methods, two-way clustering must cluster not only the genes but also take the changes in experimental conditions into account at the same time. A cluster composed of a subset of objects and a subset of attributes identifies gene combinations with consistent expression patterns under a specific subset of conditions; this is two-way clustering. The cluster changes found in the dynamic analysis of data flows play an important role in the fusion analysis of economic data of the medical and health industry.
Experts at home and abroad have conducted many studies related to the fusion analysis of economic data in the medical and health industry based on blockchain technology and two-way spectral clustering analysis. Yokoya presented the scientific results of the 2017 Data Fusion Contest organized by the Image Analysis and Data Fusion Technical Committee of the IEEE Geoscience and Remote Sensing Society. The contest aimed at models that are accurate (evaluated by accuracy metrics against the undisclosed reference data of the test cities) and computationally feasible (evaluated in a time-limited test phase) [1]. Paola argues that multisensor data fusion is essential for effectively merging and analyzing the heterogeneous data collected by the multiple sensors that are generally deployed in smart environments. A context-aware, self-optimizing, and self-adaptive sensor data fusion system based on a three-tier architecture is proposed.
The results show that the proposed solution is superior to static context-aware multisensor fusion and achieves substantial energy savings while maintaining a high degree of inference accuracy [2]. Liu argues that there is an urgent need for a data fusion method that can integrate data from multiple sensors to better characterize the randomness of the degradation process. The article develops a method that constructs a health index by fusing data from multiple degradation-based sensors [3]. Ghamisi proposed a new framework that uses extinction profiles (EP) and deep learning to fuse hyperspectral data with rasterized data derived from light detection and ranging (LiDAR). Compared with other methods, the proposed method achieves accurate classification results. The article applies deep learning to the integration of LiDAR and hyperspectral features for the first time, which opens a new opportunity for further research [4]. Chen proposed a deep learning framework based on a convolutional neural network (CNN) and a naive Bayes data fusion scheme, called NB-CNN, which analyzes individual video frames for crack detection. A new data fusion scheme aggregates the information extracted from each video frame to improve the overall performance and robustness of the system. A CNN detects crack patches in each video frame, the data fusion scheme maintains the temporal and spatial consistency of the cracks in the video, and the naive Bayes decision effectively discards false positives [5]. To create a positioning system that provides high-availability attitude estimation, Tao also integrates dead reckoning sensors. The data fusion problem is then expressed as sequential filtering. A reduced-order state-space model of the observation problem is proposed to provide an easy-to-implement real-time system.
Experimental results show that, in terms of accuracy and consistency, this tightly coupled method performs better than the loosely coupled method that uses GNSS positioning points as input [6]. Beyca integrates multiple in situ sensor signals to detect incipient anomalies in the ultraprecision machining (UPM) process. Through a new supervised learning method, DP model state estimation is combined with evidence-theory sensor data fusion to make a cohesive decision about the UPM process conditions. Anomalies are detected and classified with 90% accuracy [7]. These studies provide a reference for this paper, but because of certain problems in the related algorithms and insufficient data samples, their results are not consistent.
This paper makes two contributions. First, it proposes a method that searches for various biclusters using different bicluster quality evaluation indicators. Second, it compares the experimental results of the proposed algorithm with those of other commonly used biclustering integration methods on expression data. The comparison shows that the method proposed in this paper, based on blockchain technology and two-way spectral clustering analysis, performs better than the other methods both on the evaluation indicators and in running time.

Blockchain Technology.
Blockchain originated as the underlying technology of the Bitcoin network. It is a decentralized public ledger open to the whole world [8]. The block header contains the version number, timestamp, random number (nonce), difficulty value, hash value of the previous block, and root hash value of the Merkle tree, as shown in Figure 1 [9]. The blockchain is built on the entire network, and the network is easy to extend. Any place with Internet access can connect to the blockchain, which makes it possible to carry out transactions across borders and regulatory regimes, reduce supervision costs, and improve convenience.
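The header fields listed above can be illustrated with a minimal Python sketch. It is not the Bitcoin wire format: the JSON serialization, the single SHA-256 pass, and the names `BlockHeader` and `compute_merkle_root` are assumptions made only for illustration. The sketch shows how the previous-block hash and the Merkle root tie each header to the chain history and to the transactions in the block.

```python
import hashlib
import json
from dataclasses import dataclass, asdict


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of data as a hex string."""
    return hashlib.sha256(data).hexdigest()


def compute_merkle_root(transactions: list[str]) -> str:
    """Hash transactions pairwise, level by level, until one root remains."""
    if not transactions:
        return sha256_hex(b"")
    level = [sha256_hex(tx.encode()) for tx in transactions]
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])  # duplicate the last digest on odd levels
        level = [
            sha256_hex((level[i] + level[i + 1]).encode())
            for i in range(0, len(level), 2)
        ]
    return level[0]


@dataclass
class BlockHeader:
    """Illustrative header with the six fields named in the text."""
    version: int
    timestamp: int
    nonce: int
    difficulty: int
    prev_hash: str
    merkle_root: str

    def hash(self) -> str:
        # Illustrative serialization; real systems use a fixed binary layout.
        return sha256_hex(json.dumps(asdict(self), sort_keys=True).encode())


genesis = BlockHeader(1, 0, 0, 1, "0" * 64, compute_merkle_root(["a", "b", "c"]))
second = BlockHeader(1, 1, 0, 1, genesis.hash(), compute_merkle_root(["d"]))

# Changing one transaction changes the Merkle root, hence the block hash,
# hence the prev_hash that every later block would have to match.
tampered = BlockHeader(1, 0, 0, 1, "0" * 64, compute_merkle_root(["a", "X", "c"]))
```

Because `second.prev_hash` stores `genesis.hash()`, replacing the genesis block with `tampered` would break the link, which is the tamper-evidence property described in the text.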
Blockchain is a storage form of blockchain technology. The blockchain is composed of "blocks" connected in chronological order, and the corresponding information is recorded in each "block." Blockchains can be divided into three types: public chains (represented by Bitcoin and Ethereum), alliance chains (represented by the R3 alliance and the BCOS platform), and private chains. The public chain is an open platform for the whole world. Any individual or organization can freely access and use the services of the public chain and can also withdraw freely [10]. As an underlying chain, the public chain allows decentralized applications to be developed on top of it for specific businesses. Nodes are rewarded with digital tokens for their contributions, and participating nodes around the world jointly maintain the chain. The public chain achieves complete decentralization, but it lacks effective supervision and its transaction throughput is relatively low, so at present it cannot fully support commercial applications with large business volumes. The alliance chain is a blockchain jointly maintained by several organizations. It serves mainly as a new way of cooperation between organizations that reduces the cost of business collaboration between alliance members and improves the efficiency of business operations [11,12]. The alliance chain can operate without a token mechanism. Its nodes are provided by alliance members, the generation of each block is jointly determined by preselected nodes, and other nodes can participate in verification and transactions. The alliance chain provides a supervision interface and even allows supervision nodes to be set up, which yields a kind of semidecentralization. A private chain is a blockchain system managed by an individual or a single organization, which holds all read and write permissions of the blockchain. Its transaction throughput is much higher than that of the public chain.
It is generally used for the internal business of exchanges and financial institutions. With the help of blockchain, the platform improves business efficiency within the organization at low cost [13]. In essence, a smart contract is a collection of data (state) and code (business functions) stored at a specific address on the Ethereum blockchain. Smart contracts can be triggered by transactions on the blockchain, and their code can read data from and write data to the blockchain [14]. Ethereum uses smart contracts to extend the functionality of the blockchain and to support developers in building decentralized applications. At present, thousands of decentralized applications are being developed and deployed on Ethereum, and hundreds of them have been running stably on the Ethereum blockchain network [15].
Traditional centralized systems face problems such as high cost, low business operation efficiency, and insecure data storage. From the nature of blockchain, it can be seen that blockchain can provide good solutions [16]. In the blockchain system, there is no need for a trusted third party to do credit endorsement, and the nodes in the network can still carry out normal transactions and business operations in an environment that does not need to trust each other. Data does not need a centralized server for storage and management but is secured by cryptography technology, distributed consensus algorithm, and so on, so that the data cannot be tampered with and can be traced [17].

Two-Way Spectral Cluster Analysis Method.
With the rapid development of science and technology, human activity has produced a large amount of data. How to use these data quickly and fully and find useful information in them is a major challenge [18,19]. Data mining analyzes the original data from a large number of data sources to obtain effective knowledge that can guide decision-making. Under normal circumstances, the process of data mining consists mainly of the steps shown in Figure 2.
The two-way clustering algorithm is fundamentally different from the traditional clustering algorithm. The traditional algorithm clusters in one direction only, over rows or over columns, whereas the two-way clustering algorithm considers the whole matrix; that is, it clusters rows and columns simultaneously to detect local information in the matrix [20]. Moreover, in two-way clustering of gene expression data, a gene or a sample can belong to several "clusters" at the same time, or it can belong to no "cluster" at all and lie between "clusters." The overlapping part is shown in Figure 3. Rows represent genes, and columns represent the edges between two adjacent conditions, that is, the direction of change of a gene's expression level between two adjacent conditions. The algorithm can exclude extra rows and columns from the two-way clustering results, masking the rows and columns contained in previous results so that continued iteration produces different results. Two-way cluster analysis plays an important role in the analysis of gene expression profile data, mainly in two respects. In drug research, the results of two-way cluster analysis of gene expression data are useful for studying drug mechanisms, developing drugs, judging drug efficacy, and detecting drug targets. In disease diagnosis, cancer heterogeneity is the biggest difficulty facing current cancer diagnosis and treatment. Two-way cluster analysis of gene chip data can be used to identify cancer subtypes and thereby to develop personalized treatment approaches. It can also be used to detect new tumor markers for early diagnosis and corresponding treatment.
Most biclustering algorithms are currently based on either greedy or metaheuristic optimization methods, so they require quality evaluation indicators to score candidate solutions and to steer the direction of the search [21]. In fact, research on biclustering has largely been a process of proposing a large number of biclustering indicators. The quality of the evaluation indicator directly determines the efficiency and effectiveness of a biclustering algorithm.
In each iteration of two-way clustering, the two-way clustering set with the smallest average mean square residual is determined and saved as the optimal two-way clustering set of the current generation. Otherwise, the iteration is terminated, and the current optimal two-way clustering set is output as the result. The quality of a bicluster B(I, J) is measured by its mean square residual and by a related correlation index R_Ij. Here, R_Ij is the correlation index of the j-th column in the bicluster, σ²_Ij is the local variance of all elements of the j-th column within the bicluster B, and σ²_j is the global variance of all elements of the j-th column in the entire gene expression data A.
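The two indicators named above can be sketched in Python as follows. The sketch assumes the widely used Cheng and Church form of the mean square residual, in which each element is compared with its row mean, its column mean, and the overall mean of the bicluster, and it assumes that the correlation index takes the form R_Ij = 1 − σ²_Ij / σ²_j. Both forms, and the function names, are assumptions for illustration and may differ in detail from the definitions used in this paper.

```python
def mean_square_residual(A, rows, cols):
    """Mean square residual of the bicluster of A given by rows and cols.

    Assumed form: average over the bicluster of
    (a_ij - row_mean_i - col_mean_j + overall_mean) ** 2.
    """
    sub = [[A[i][j] for j in cols] for i in rows]
    k, l = len(rows), len(cols)
    row_mean = [sum(r) / l for r in sub]
    col_mean = [sum(sub[i][j] for i in range(k)) / k for j in range(l)]
    overall = sum(row_mean) / k
    total = sum(
        (sub[i][j] - row_mean[i] - col_mean[j] + overall) ** 2
        for i in range(k)
        for j in range(l)
    )
    return total / (k * l)


def variance(values):
    """Population variance of a list of numbers."""
    m = sum(values) / len(values)
    return sum((x - m) ** 2 for x in values) / len(values)


def correlation_index(A, rows, j):
    """Assumed form R_Ij = 1 - local variance / global variance of column j."""
    global_var = variance([row[j] for row in A])
    if global_var == 0:
        return 0.0  # constant column: the index carries no information
    local_var = variance([A[i][j] for i in rows])
    return 1.0 - local_var / global_var


# A purely additive pattern (row effect + column effect) has zero residual.
additive = [[1, 2, 3], [2, 3, 4], [4, 5, 6]]
msr_additive = mean_square_residual(additive, [0, 1, 2], [0, 1, 2])

# A checkerboard pattern is not additive and has a positive residual.
checker = [[1, 0], [0, 1]]
msr_checker = mean_square_residual(checker, [0, 1], [0, 1])
```

A residual close to zero indicates that the selected genes vary coherently across the selected conditions, and a correlation index close to 1 indicates that a column varies much less inside the bicluster than over the whole data set.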
For a bicluster B(I, J) of size k × l, a transformation is applied to obtain a matrix M of size k × (l − 1), each element b′_ij of which is defined from the original entries. Then, the number of similarities N between any two genes in the bicluster B(I, J) is defined on this matrix.

Mobile Information Systems
Here, δ(x) = 1 when x is true; otherwise, δ(x) = 0. On this basis, the maximum number of similarities max N of gene i in the bicluster B(I, J) is defined, and the same quantity is defined for any three genes of the bicluster B(I, J). Each data point has neighbor points.
This transformation should be reversible: Z_j is the result of mapping X_ij, and X_ij can be recovered from Z_j by the inverse transformation.
This is the inverse transformation. In practice, because of noisy data or differences between transformation methods, there is an error between the reconstruction X′_ij and X_ij. B and d are therefore obtained by optimizing an objective in which ω_j is the weight of the error ε_j. The value of B that minimizes the weighted mean square error is obtained by eigendecomposition, where S is the weighted covariance matrix of the neighboring points.
Conventional modeling of data presupposes the distribution that the data follow and then carries out training and analysis according to that hypothetical distribution model. Learning the distribution of the feature data with an energy model can therefore solve the problems above. In the energy function, θ denotes the model parameters, a_i is the bias of the visible layer unit, b_j is the bias of the hidden layer unit, and W_ij is the connection weight between the visible layer and the hidden layer. The joint probability distribution is obtained from the energy function, where Z(θ) is the normalization factor used in calculating the joint probability. The likelihood function P(v|θ) is then computed, and the visible layer units are obtained in reverse from the states of the hidden layer units. P(v|θ) is solved with the contrastive divergence algorithm, after which the translation vector d that minimizes the mean square error is calculated. The result depends on the weights ω. The weight of a sample point reflects the possibility that the point is noisy data. If the error is large, the point is likely to be noise; otherwise, it is less likely to be noise. The weight is therefore a function of the error. The cost function is then calculated.
If the cost function ε falls below a given threshold, or its change between two successive iterations falls below a given threshold, the algorithm stops. Otherwise, the membership matrix U is updated and the algorithm returns to the previous step. The algorithm outputs the membership matrix without any human intervention during its execution. To avoid possible misjudgment by this method, the cosine of the angle between a point and the cluster center is used, on the basis of cosine similarity, to weight the Euclidean distance. Here, t = |v_j|, where |v_j| is the number of samples in the cluster whose center is v_j, counted over all sample points in the cluster where that center is located.
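The energy-model step in this section, with visible biases a_i, hidden biases b_j, and connection weights W_ij, matches the usual description of a restricted Boltzmann machine. The following Python sketch assumes the standard binary form, E(v, h | θ) = −Σ a_i v_i − Σ b_j h_j − Σ v_i W_ij h_j, with joint probability exp(−E) / Z(θ); this form and all function names are assumptions for illustration. The normalization factor is computed by brute force, which is feasible only for very small models and is the reason training relies on contrastive divergence in practice.

```python
import itertools
import math


def energy(v, h, a, b, W):
    """Assumed binary RBM energy for visible state v and hidden state h."""
    visible_term = sum(a[i] * v[i] for i in range(len(v)))
    hidden_term = sum(b[j] * h[j] for j in range(len(h)))
    pair_term = sum(
        v[i] * W[i][j] * h[j] for i in range(len(v)) for j in range(len(h))
    )
    return -visible_term - hidden_term - pair_term


def joint_probability(v, h, a, b, W):
    """P(v, h) = exp(-E(v, h)) / Z, with Z summed over every binary state."""
    z = sum(
        math.exp(-energy(vv, hh, a, b, W))
        for vv in itertools.product((0, 1), repeat=len(a))
        for hh in itertools.product((0, 1), repeat=len(b))
    )
    return math.exp(-energy(v, h, a, b, W)) / z


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def p_hidden_given_visible(v, b, W):
    """P(h_j = 1 | v) for each hidden unit j."""
    return [
        sigmoid(b[j] + sum(v[i] * W[i][j] for i in range(len(v))))
        for j in range(len(b))
    ]


def p_visible_given_hidden(h, a, W):
    """P(v_i = 1 | h) for each visible unit i (the reverse pass)."""
    return [
        sigmoid(a[i] + sum(W[i][j] * h[j] for j in range(len(h))))
        for i in range(len(a))
    ]


# Tiny illustrative model: two visible units, one hidden unit.
a = [0.5, -0.2]
b = [0.1]
W = [[0.3], [0.4]]
e = energy([1, 0], [1], a, b, W)
```

The two conditional functions correspond to obtaining the hidden units from the visible layer and the visible units "in reverse" from the hidden layer, which are the two passes alternated by contrastive divergence.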

Data Fusion.
The data integration process includes information retrieval, data processing, data integration, and result analysis [22]. Because the data are variable, multisensor data must be integrated systematically, and data integration is divided into two levels according to function. The first level is all-round data connection, whose functions are data preview, location recognition, and tracking. The second level is high-resolution data integration, which is important for analyzing trends and errors in order to obtain the overall integration result [23]. Data fusion plays an important processing and coordination role in systems with multiple information sources, platforms, and users, ensuring connectivity and timely communication between the units of the data processing system and the collection center.
We use Figure 4 to illustrate the data-level fusion method. Data-level fusion operates directly on the raw data collected by each sensor; that is, the data are compiled and analyzed before the raw data from each sensor undergo any other processing [24]. Data-level fusion retains as much of the effective information in the original data as possible, but its disadvantage is that when the volume of sensor data is too large, statistical accuracy suffers and the original data cannot be completely verified.
The biggest advantage of data-level fusion is that the original information is rich. Because the processed object is the original data set without any preprocessing, the loss of information is negligible, a large amount of detailed original information is available, and the accuracy of the fusion result is high. The disadvantage is that the amount of data to be processed is extremely large, which places high demands on computer capacity and performance. The entire fusion process also takes a long time, which directly affects the real-time performance of the system. In addition, the original data are easily disturbed by external data, so the system must have good fault tolerance. Commonly used methods include the weighted average algorithm, the wavelet transform, and other algorithms [25].
To address these shortcomings of data fusion, this paper targets the ambiguity present in the data set and uses fuzzy logic to identify and classify the detected data. Fuzzy set theory is essentially a kind of multivalued logic. During fusion, each data item is assigned a number between 0 and 1 that expresses its credibility, and a multivalued logical reasoning method is then used to merge the data and realize data fusion [26,27].
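The idea of assigning each data item a credibility between 0 and 1 and then merging can be sketched in Python as a credibility-weighted average. The triangular membership function around the median, the `width` parameter, and the function names are assumptions chosen for illustration; the text does not specify which membership function or reasoning rule is used.

```python
from statistics import median


def credibility(value, center, width):
    """Triangular membership: 1 at the center, falling linearly to 0 at width."""
    return max(0.0, 1.0 - abs(value - center) / width)


def fuse(readings, credibilities):
    """Credibility-weighted average of the readings."""
    if len(readings) != len(credibilities):
        raise ValueError("readings and credibilities must have equal length")
    if any(c < 0.0 or c > 1.0 for c in credibilities):
        raise ValueError("each credibility must lie in [0, 1]")
    total = sum(credibilities)
    if total == 0:
        raise ValueError("at least one reading must have nonzero credibility")
    return sum(r * c for r, c in zip(readings, credibilities)) / total


def fuse_with_fuzzy_credibility(readings, width):
    """Score each reading against the median, then fuse by credibility."""
    center = median(readings)
    creds = [credibility(r, center, width) for r in readings]
    return fuse(readings, creds), creds


# Three consistent sensor values and one outlier (illustrative numbers).
readings = [10.0, 10.2, 9.8, 15.0]
fused, creds = fuse_with_fuzzy_credibility(readings, width=2.0)
```

In this sketch the outlier receives a credibility of zero and does not influence the fused value, whereas a plain average of the four readings would be pulled toward it.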

Economic Status of the Medical and Health Industry.
The analysis is based on data describing the concentration of Chinese medical institutions and health centers. In the eastern part of China, the agglomeration level of the medical service industry in Beijing, Tianjin, Hainan, and Shanghai is higher than 1; in Hebei and Shandong, the agglomeration strength has been higher than 1 in the most recent three years. The agglomeration index of Jiangsu, Zhejiang, Fujian, and Guangdong is below 1, as shown in Figure 5.
It can be seen that the four eastern provincial-level regions of Beijing, Tianjin, Shanghai, and Hainan are densely populated and well developed, which stimulates strong demand for medical services, so they have relative agglomeration advantages from the perspective of demand. From the perspective of supply, the average number of health personnel per medical institution in the three cities above is higher than the average in the eastern region, while the average in the four provinces that include Jiangsu is lower than the overall level of the eastern region. Jiangsu, Zhejiang, Fujian, and Guangdong are eastern coastal provinces, and their large population base is the key reason why their agglomeration levels are lower than those of the other eastern regions.
The statistics on the average number of personnel per medical institution in the eastern region are shown in Table 1.
Statistics on the medical industry in the western region show that the concentration of the medical service industry in Ningxia, Inner Mongolia, and Xinjiang is above 1 throughout. Qinghai and Shaanxi are basically above 1. In one other case the agglomeration level is below 1 and exceeds 1 in only one year. The agglomeration levels in Guangxi, Chongqing, Sichuan, Guizhou, Yunnan, and Gansu are all below 1. The result is shown in Figure 6.
It can be seen that the average number of health personnel per medical institution in Ningxia, Inner Mongolia, and Xinjiang is above the western average. In Guangxi, Chongqing, Sichuan, Guizhou, Yunnan, and Gansu, it is mostly below or close to the western average, as shown in Table 2.
For the central region, the study found that the concentration levels of the medical service industry in the four provinces of Anhui, Jiangxi, Henan, and Hunan were below 1, while those of the five provinces of Hubei, Shanxi, Liaoning, Jilin, and Heilongjiang were all above 1. From the perspective of supply, the average number of health personnel per medical institution does not reflect the concentration of the medical service industry in this region. The results are shown in Figure 7.
For the central region, analysis of the Theil index and local health expenditures shows that expenditures within the region are relatively consistent, meet the needs of the local population, and are fully integrated into the government's health expenditures. There are significant differences among different provinces in China. The average ranking of the central region is 9. The overall assessment of local health expenditure in the central region is made in comparison with the eastern and western regions. The average number of personnel per institution in the central region is shown in Table 3.
The average number of health staff per medical institution in the central region is slightly lower than in the eastern region but higher than in the western region. The values of Anhui, Hubei, Jilin, and Heilongjiang are all above the average. Henan has tended to fall below the average in recent years, and Jiangxi, Hunan, Shanxi, and Liaoning are all below the average. These values do not line up with the agglomeration level of the medical service industry. The reason is that the number of medical institutions in Shanxi and Liaoning is higher than that in Jilin and Heilongjiang. We have also compiled statistics on the number of for-profit and nonprofit medical institutions in the various regions, and the results are shown in Table 4.
Table 4 shows that the number of nonprofit organizations in the eastern region has increased year by year since 2015, while the number of for-profit organizations has been basically stable except in a few years. In the central region, the number of nonprofit organizations has increased year by year, and the number of for-profit organizations has been basically stable. In the western region, likewise, nonprofit organizations have increased year by year, and for-profit organizations have remained stable.

Data Fusion Changes.
We have made statistics on the expenditures of the medical and health industry as a percentage of GDP and structure, and the results are shown in Figure 8.
It can be seen that the overall development trend of China's medicine and health is a transformation from a government-led model into one shared by the government, society, and individuals. The share of government expenditure declines gradually, and personal and social expenditures increase year by year, until a balance is finally reached.
We made statistics on the changes in the medical and health industry under the blockchain technology and the two-way spectral clustering analysis method, and the results are shown in Figure 9.
Figure 9 shows that, after the introduction of blockchain technology and two-way spectral clustering analysis, the medical and healthcare industry improved considerably. The output value of the pharmaceutical industry increased by about 10%, costs fell, and an increase in population together with a reduction in bed rest time led to a significant improvement of the medical and healthcare industry. We use examples to count different variables and analyze the fusion results. The results are shown in Figure 10.

Discussion
Algorithm Discussion.
As a new technique in the field of data mining, the bidirectional clustering algorithm overcomes the shortcomings of traditional clustering algorithms. It can cluster in the gene direction and the condition direction at the same time. That is, while retaining the global information, it can still mine the local information of the gene expression matrix. Traditional clustering algorithms and data stream clustering algorithms were studied and analyzed, with the main focus on the data stream clustering algorithm based on a density grid, which was analyzed and summarized and for which improvements were put forward. The density-grid data stream clustering algorithm offers fast processing and strong real-time characteristics, but it handles cluster boundaries poorly and divides the space into uniform single-mode grids. To address these problems, the DSG-stream algorithm is proposed. The grid is divided at coarse and fine granularities. The concepts of boundary grid and internal grid are introduced, and the grid influence factor is combined with them for clustering. The algorithm is based on a two-stage processing framework: in the online stage, grid feature vectors are maintained and the internal grids are processed dynamically to form microclusters; in the offline stage, the boundary grids are clustered at fine granularity to obtain the clustering results.
In the algorithm, the grid clusters and the density threshold are adjusted dynamically to reflect real-time changes in the data, and isolated grids are detected and processed, which improves the efficiency of the algorithm. A localized algorithm for distributed environments is also proposed, whose local node-global node processing model further improves the processing speed. Experiments and comparisons verify the clustering accuracy and operating efficiency of the algorithm, as well as its processing efficiency in a distributed environment.

Pharmaceutical Industry.
With the continuous reform of the medical system and the continuous expansion of the market for medical services, the entry of for-profit private medical companies into China's medical service market first changed the behavior of the country's general medical institutions. In areas where for-profit hospitals are relatively dense, nonprofit hospitals are affected by the profit-making behavior of those hospitals, which is changing the market value and service quality of China's medical service products. China's medicine sector, especially traditional medicine, has major problems such as a large number of small-scale facilities, poorly controlled equipment, information asymmetry, low efficiency, high cost, and disorder. For a long time, pharmaceutical companies have lacked a foundation of their own, and modern medical statistics are likewise relevant to cost reduction and efficiency improvement in the pharmaceutical industry. Integrating social medicine logistics and building an integrated modern medicine distribution system is the primary task in reducing the complexity of the pharmaceutical industry.
In the past, banks were unwilling to extend credit to primary hospitals because of their weak financial strength, while large hospitals had strong financial strength and did not need bank credit. All grassroots hospitals and community hospitals are now required to implement "two lines of revenue and expenditure"; that is, all hospital expenditures are included in financial budget management, and all revenues are turned over to special financial accounts.

Conclusions
This paper considers the degree of difference within two-way clusters and the degree of fusion between clusters and finds that the optimal number of clusters determined by two-way clustering is better than that determined by other algorithms. We compared the accuracy of the two-way clustering algorithm with that of other algorithms. For high-density genetic data, the two-way clustering algorithm extracts local information better while retaining the global information. It can be seen that the two-way clustering algorithm is better than the other clustering algorithms. There are, of course, some problems in this research. Compared with the clustering ensemble algorithm, the biclustering ensemble algorithm has an extra step of reconstructing the bicluster, and this step is usually solved with heuristic algorithms that easily fall into local optima. At present, there is no literature on how to reconstruct the biclustering to obtain the global optimum; the question remains unexplored and points to a clear direction for the next step of research. How to obtain useful information from such data and solve new problems is a focus of current research in the big data industry. Extending the ideas of biclustering analysis of gene expression data to other kinds of data therefore opens a new direction for discovering and solving new problems.

Data Availability
No data were used to support this study.

Conflicts of Interest
The authors state that this paper has no conflicts of interest.