Measuring Patent Similarity Based on Text Mining and Image Recognition

Abstract: Patent application is one of the important ways to protect innovation achievements that have great commercial value for enterprises; it is the initial step for enterprises in setting their business development track, as well as a powerful means of protecting their core competitiveness. The emergence of a large amount of patent data makes effective patent detection difficult, and patent infringement cases occur frequently. Manual measurement in patent detection is slow, costly, and subjective, and can only play an auxiliary role in measuring the validity of patents. How to protect the inventive achievements of patent holders and realize more accurate and effective patent detection has therefore been explored by academics. There are five main methods for measuring patent similarity: clustering-based, vector space model (VSM)-based, subject–action–object (SAO) structure-based, deep learning-based, and patent structure-based methods. To address this problem, this paper proposes a calculation method that fuses the similarity of patent text and patent images. First, SAO structure extraction is applied to the patent text to obtain its effective content, and the SAO structures are compared for similarity; second, the patent image information is extracted and compared; finally, the patent similarity is obtained by fusing the two kinds of information. The feasibility and effectiveness of the scheme are demonstrated on a large number of patent similarity cases in the field of mechanical structures.


Introduction
In the current era of economic globalization, every country is constantly innovating and developing. The patent is the effective carrier of advanced technology in every country and contains rich technical information. It is estimated that 70-90% of patent information is not disclosed elsewhere, so patents have a higher technological content than other technology carriers [1]. At the early stage of national development, domestic attention to patents was insufficient: some private enterprises knew little about patents, some understood patents but paid no attention to them, and only a few recognized their importance. However, with the development of economic globalization and increasing competition in various industries, countries pay more and more attention to patents and have established relevant institutions, the state increasingly combats piracy and protects patents, and domestic enterprises also pay attention to patents [2,3]. In recent years, the number of patent applications has been increasing, and China in particular has become the leading country in patent applications. Since the beginning of the 21st century, the number of patent applications has continued to increase; according to the figures for 2022 released by the World Intellectual Property Organization (WIPO), the number of patent applications in 2022 was as high as 278,100, with Asia accounting for 54.7% of the total. China continues to be the largest source of Patent Cooperation Treaty applications with 70,015 applications, and

Patent Similarity
The clustering-based PIDM puts all detected patents together to generate one or more clusters, and those clustered with the target patents are more likely to be infringing patents. In particular, Jeong [8] extracted problem solved concept (PSC) terms and constructed a PSC-based map, clustering and evaluating them to explore opportunities for new patent creation. Zhu [9] combined a self-organizing neural network (SOM) with fuzzy C-means (FCM) clustering to obtain a SOM-based FCM algorithm, which improved the quality of clustering, automatically identified patents similar to the patents under investigation and designed a patent infringement detection system. Lee et al. [10] utilized a principal component analysis (PCA) algorithm to cluster and visualize keyword space vectors. Yoon et al. [11] converted each patent document into a vector by extracting keywords, used PCA to reduce dimensionality, and finally performed SOFM with the vector as input to create patent maps for clustering purposes. Lai et al. [12] proposed a method, called the bibliometrics-based patent co-citation approach, by analyzing the co-citations of target patents using clustering methods for cited patents and creating a patent classification system. However, the clustering-based approach can only cluster several different classes, and there will be a large number of patents in the same class as the target patent, which cannot effectively reduce the examination work.
VSM-based PIDM converts text into spatial vectors and measures patent similarity by comparing the similarity of these vectors. Magerman et al. [13] measured patent similarity using VSM and latent semantic analysis. Yoon et al. [14] used the Doc2Vec model [15] to measure patent similarity and predict the future direction of technology development from the constructed patent network. The Doc2Vec model was developed from the Word2Vec model [16] by replacing the original word-level vector with a paragraph-level vector. SAO2Vec [17,18] is an improved spatial vector model based on Doc2Vec. The vector space model is easy to construct, but the dimensionality of the vectors grows with the size of the corpus, and the vectors constructed from a large-scale corpus are high-dimensional and sparse, which makes the computation more complicated.
SAO-based PIDM analyzes information such as the part of speech of the words in a sentence and obtains the desired structure using natural language processing (NLP) techniques. Park et al. [19] used the WordNet-based SAO structure to measure patent similarity and used multidimensional scaling to map patent relationships to a two-dimensional space and group patents that could infringe. Li et al. [20] used the SAO structure to measure patent similarity and extended it with the Sorensen-Dice index [21,22], which has good flexibility and robustness. Yoon [23,24] also used the SAO structure to measure patent similarity and then used similarity to analyze potential competitors and partners. Park et al. [25] proposed a patent infringement map based on SAO semantic similarity to identify patent infringement. The calculation of patent similarity based on the SAO structure depends heavily on the extracted SAO structures, and obtaining higher-quality SAO structures requires manual annotation.
Deep learning has developed rapidly in recent years, with significant achievements in text, image, and audio processing, and many researchers have applied deep learning techniques to the field of patents. Lu et al. [26] proposed a patent citation classification model based on deep learning by selecting convolutional neural networks (CNN) at the document encoding level and introducing a multilayer perceptron to gradually compress and extract the most relevant features and adjust the nonlinear relationships. Ma et al. [27] constructed a patent model tree, compared the advantages and disadvantages of CNN, RNN, LSTM, and Siamese LSTM, and established that Siamese LSTM [28,29] has obvious advantages among them. Deep learning-based PIDM uses neural network models for vectorized representation; although the accuracy is high, the models are poorly interpretable, data for specialized fields are difficult to obtain, and a large amount of manual involvement is required at the initial stage.
A patent comprises several structural components, such as inventor, application number, filing date, IPC classification number, abstract, and claims. The last category, patent structure-based PIDM, considers these structures. Zhang et al. [30] used the IPC classification model and a semantic model to evaluate patent similarity by organizing patent terms into different layers of trees, each layer having its own weight value, and measuring patent similarity by calculating tree similarity. Fujii et al. [31] used punctuation to segment the claims and Okapi BM25 [32] to obtain paragraph similarity, and then accumulated these values to obtain the overall patent text similarity. Among citation methods [33], Lee et al. [34] proposed a stochastic patent citation analysis method, and Rodriguez et al. [35] proposed a similarity measure in citation networks that exploits direct and indirect co-citation links between patents. Klavans and Boyack [36] compared the accuracy of direct citation, bibliographic coupling, and co-citation in representing knowledge classification; in general, the classification based on direct citation was better than the others. Wu et al. [37] also proposed a method for evaluating patent similarity by considering direct and indirect citations. Cheng et al. [38] used USPC and IPC construction techniques and functional class matrices to measure patent similarity. Similarity based on the patent structure is more relevant, but for patent infringement each part has a different weight, and manual weighting is resource-intensive and less feasible.

SAO Semantic Analysis
The SAO structure is a construction in which the subject (S) and object (O) of a sentence are related through an action (A). An SAO structure concisely reflects the content of a sentence, giving a complete picture of how two things are related or affect each other. For example, in the sentence "The shower sprays water", "shower" is the subject, "sprays" is the action, and "water" is the object. Similar to the SAO structure, the subject-predicate-object (SPO) structure, which consists of subject elements, object elements, and the relationships between them, can be considered a semantic network and is widely used for knowledge discovery in biomedical literature [39], while SAO is commonly used for text mining in patent documents [40].
The SAO structure is a technical tool for NLP that is favored by scholars and has received wide attention, and SAO structure extraction has become more powerful with the continuous improvement of machine learning algorithms. For example, Kim et al. [41] analyzed the "for" and "to" phrases and verbal forms of object elements to explore the purpose and effect of a technique in depth. Miao et al. [42] used the purpose relationship between the SAO structure and the technology-relationship-technology structure to mine technology solutions and functional information. He et al. [43] proposed a potential technology requirement identification model based on semantic analysis of the SAO structure; they realized the layout and visualization of requirements based on the technology life cycle to guide the direction of technology development and optimize resource allocation. Li et al. [44] used the Unified Medical Language System to evaluate the similarity between SAO structures in the field of medical patents. Yoon [23,24] also used SAO structures to measure patent similarity and then used similarity to analyze potential competitors and partners. Using NLP techniques, SAO structures can be mined rapidly from text.
The SAO triple is extracted from the text; usually the subject and object are nouns, representing the performer of the action and its target, respectively. The predicate serves as the action that links the subject and object [45]. A single sentence may contain one SAO structure or several. The SAO structures found in patent text are summarized in Table 1. The similarity between patents can thus be translated into the similarity of their SAO sets, as shown in Figure 1.

Table 1. SAO structures in patent text.

| No. | Sentence | SAO structure |
| --- | --- | --- |
| 1 | Wherein the conductive mechanism further comprises a control circuit module; | 1. conductive mechanism-comprises-control circuit module |
| 2 | The first conductive portion and the second conductive portion are electrically coupled with the control circuit module through conductive line; | 1. the first conductive portion-coupled-module 2. the second conductive portion-coupled-module |
| 3 | A direction-adjustable showerhead fixing structure includes a showerhead main body and a connecting seat; | 1. showerhead-includes-body 2. showerhead-includes-connecting seat |
| 4 | The invention reduces and eliminates the above disadvantages; | 1. invention-reduces-disadvantages 2. invention-eliminates-disadvantages |
| 5 | The control circuit module is disposed on the mount and is located in the first chamber; | 1. module-disposed-mount 2. module-located-the first chamber |

The key point of using the SAO structure applied in the patent field is the quality of the SAO structure, so manual extraction is the most accurate method, but this method is not possible in the presence of a large number of patents, which requires a lot of effort and is very inefficient. However, with the development of NLP, it became possible to extract SAO structures using NLP tools.

Contour Detection
Contour detection refers to the process of extracting the target contour while ignoring texture and noise interference within the image [46]. Traditional contour detection methods are broadly classified into three types: pixel-based, edge-based, and region-based. The pixel-based approach is concerned with discontinuities at image boundaries: sharp changes in the pixels around a contour indicate a change of region. This approach relies on linear filtering [47-49], such as the Prewitt, Sobel, and Canny operators. Later, many scholars proposed using higher-level features such as luminance, color, and texture gradients [50], and combining these features improved robustness. The edge-based approach considers the overall image information and divides contour extraction into edge detection and edge grouping [51]. Individual edge points are collected and linked into continuous curves, irrelevant data are eliminated, and the remaining data are rearranged, with each group corresponding to a specific object [52]. Early estimates of the likelihood that edge elements belong to the same contour were based on empirical statistics; Elder [53] later added Bayesian inference, while Mahamud [54] introduced the concept of contour saliency to identify smooth closed contours. Finally, among region-based approaches, Arbelaez et al. [55,56] proposed ultrametric contour maps, in which local contrast and regional contribution determine the dissimilarity of adjacent regions; the key to their method lies in the definition of the ultrametric distance. The region-based method is more robust to noise and can adapt to relatively uneven contours.
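To make the pixel-based idea concrete, the following is a minimal sketch of Sobel filtering in plain Python. It is not the contour extraction pipeline used in this paper; the function name `sobel_magnitude`, the list-of-lists image representation, and the synthetic test image are assumptions made for illustration only.

```python
import math

# Standard 3 x 3 Sobel kernels for horizontal and vertical intensity changes.
SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]


def sobel_magnitude(image):
    """Return the gradient magnitude of a grayscale image (list of rows).

    Border pixels are left at 0 because the 3 x 3 window does not fit there.
    """
    rows, cols = len(image), len(image[0])
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            gx = gy = 0.0
            for dr in range(3):
                for dc in range(3):
                    pixel = image[r + dr - 1][c + dc - 1]
                    gx += SOBEL_X[dr][dc] * pixel
                    gy += SOBEL_Y[dr][dc] * pixel
            out[r][c] = math.hypot(gx, gy)
    return out


# Synthetic image (hypothetical data): dark left half, bright right half.
image = [[0, 0, 0, 255, 255, 255] for _ in range(5)]
edges = sobel_magnitude(image)
```

In this toy image the response is large only in the two columns adjacent to the dark-to-bright transition and zero in the flat regions, which is the sharp pixel change that the pixel-based approach interprets as a contour.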

Data Collection
Showerheads are widely used in daily life. With the continuous development of society, people's demands for showerhead products have also increased: they no longer have only the single function of spraying water, but have added functions such as disinfection, spraying bath products, and even massage. As a product in the traditional mechanical field, the showerhead is characterized by a variety of functions, a mature market, and sufficient patent applications. For this reason, the product was chosen as the research object for the experiment. In this paper, the patent database of the United States Patent and Trademark Office was searched with "showerhead" as the keyword, and the handheld shower patents from the past ten years were downloaded for testing; the total number of patents was 131. Some of the patents are listed in Table 2.

TF-IDF
The term frequency-inverse document frequency (TF-IDF) model is a statistical method that can evaluate the importance of words in a text in the corpus and is a common model for calculating text similarity. The calculation process is shown in Figure 2.
2. Calculate the inverse document frequency: A corpus is a collection of all articles that simulates the language environment. The more documents a word appears in, the larger the denominator becomes and the closer the inverse document frequency is to zero. One is added to the denominator to prevent it from being 0 (i.e., when no document contains the word); log denotes the logarithm of the resulting quotient.

$$\text{Inverse document frequency} = \log\left(\frac{\text{Total number of documents in the corpus}}{\text{Number of documents containing the word} + 1}\right)$$

3. Calculate the TF-IDF: TF-IDF is proportional to the number of occurrences of a word in the document and inversely proportional to the number of occurrences of that word in the entire corpus. The algorithm for automatic keyword extraction is therefore clear: the TF-IDF value is calculated for each word in the document, and the top 100 words are then taken in descending order. For visualization, words are sorted by TF-IDF value and the top 50 words are captured. Figure 3 shows a heat map of the TF-IDF values of these words in some patents.

4. Build a word frequency list: Build a word frequency matrix; the length of the matrix is the number of texts, the width of the matrix is the number of words, and each vector represents the frequency of the words contained in one text.

5. Calculate the cosine similarity: Given two attribute vectors, A and B, the cosine similarity is given by the dot product and the vector lengths, as shown in Equation (4).

$$\cos(\theta) = \frac{A \cdot B}{\|A\|\,\|B\|} = \frac{\sum_{i=1}^{n} A_i B_i}{\sqrt{\sum_{i=1}^{n} A_i^2}\,\sqrt{\sum_{i=1}^{n} B_i^2}} \qquad (4)$$
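The TF-IDF steps can be sketched in a few lines of Python. This is an illustrative sketch rather than the implementation used in this paper: the term frequency is assumed to be the word count divided by the document length, the logarithm is assumed to be base 10, and the function names and the four toy documents are hypothetical.

```python
import math
from collections import Counter


def tf_idf_vectors(documents):
    """Return (vocabulary, TF-IDF vectors) for a list of tokenized documents."""
    vocabulary = sorted({word for doc in documents for word in doc})
    n_docs = len(documents)
    # Number of documents that contain each word.
    doc_freq = Counter(word for doc in documents for word in set(doc))
    # IDF as described in the text: log(N / (df + 1)); base 10 is an assumption.
    idf = {w: math.log10(n_docs / (doc_freq[w] + 1)) for w in vocabulary}
    vectors = []
    for doc in documents:
        counts = Counter(doc)
        # TF is assumed to be the word count divided by the document length.
        vectors.append([counts[w] / len(doc) * idf[w] for w in vocabulary])
    return vocabulary, vectors


def cosine_similarity(a, b):
    """Dot product divided by the product of the vector lengths."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Toy corpus (hypothetical data), already split into words.
docs = [
    ["shower", "spray", "water"],
    ["shower", "head", "water"],
    ["valve", "control", "flow"],
    ["pipe", "connect", "seat"],
]
vocab, vecs = tf_idf_vectors(docs)
sim_01 = cosine_similarity(vecs[0], vecs[1])  # documents sharing two words
sim_02 = cosine_similarity(vecs[0], vecs[2])  # documents sharing no words
```

Because of the "+ 1" in the denominator, a word that occurs in every document of a small corpus receives a slightly negative weight; on a corpus of realistic size the effect is negligible.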

SAO Structure Extraction and Cleaning
In order to better extract the SAO structure, this paper uses a method based on dependency parsing to extract triples from patents; the main steps are listed below. NLP-based SAO structure extraction keeps improving, but it still cannot accurately extract all effective SAO structures and inevitably produces some noise, so cleaning the extracted SAO structures is a necessary step. Figure 4 illustrates the SAO extraction process.

1. Segment the text into independent sentences.
2. Perform dependency parsing on the sentences.
3. Clean up the SAO structures and remove the meaningless ones.
The whole text of the patent is divided into sentences and the SAO structures are extracted from each sentence. The spaCy library offers advantages in execution speed and accuracy, so spaCy is used for part-of-speech tagging and dependency parsing of the text to extract the subject, predicate, and object, some of which may contain multiple keywords. The text content of patent US20180318860A1 was subjected to SAO structure extraction, and some of the SAO structures are shown in Table 3. The number of SAO structures extracted from each patent is shown in Table 4.
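The extraction step can be illustrated with a small sketch that walks a dependency parse and collects triples. To stay self-contained it does not call spaCy; the parse below is annotated by hand with spaCy-style labels (`nsubj`, `dobj`, `conj`, `ROOT`), and the `Token` structure and `extract_sao` function are hypothetical names, not the code used in this paper.

```python
from collections import namedtuple

# Minimal stand-in for one token of a dependency parse (hypothetical structure):
# lemma = base form, dep = dependency label, head = index of the governing token.
Token = namedtuple("Token", ["lemma", "dep", "head"])


def extract_sao(tokens):
    """Collect (subject, action, object) triples around the root verb."""
    triples = []
    for verb_idx, verb in enumerate(tokens):
        if verb.dep != "ROOT":
            continue
        subjects = [t.lemma for t in tokens
                    if t.dep == "nsubj" and t.head == verb_idx]
        object_idx = [i for i, t in enumerate(tokens)
                      if t.dep == "dobj" and t.head == verb_idx]
        # Coordinated objects ("a body and a seat") attach to the first object as "conj".
        conj_idx = [i for i, t in enumerate(tokens)
                    if t.dep == "conj" and t.head in object_idx]
        for subject in subjects:
            for i in object_idx + conj_idx:
                triples.append((subject, verb.lemma, tokens[i].lemma))
    return triples


# Hand-annotated parse of "The showerhead includes a body and a seat."
parse = [
    Token("the", "det", 1),
    Token("showerhead", "nsubj", 2),
    Token("include", "ROOT", 2),
    Token("a", "det", 4),
    Token("body", "dobj", 2),
    Token("and", "cc", 4),
    Token("a", "det", 7),
    Token("seat", "conj", 4),
]
sao_triples = extract_sao(parse)
```

A sentence with coordinated objects yields one triple per object, which mirrors the way a single sentence can contribute several SAO structures. Real parser output also contains passives, prepositional objects, and coordinated verbs, which this sketch does not handle.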

SAO Structure Semantic Similarity Calculation
Each patent text is represented as a collection of SAO structures, and each SAO structure consists of a subject, a predicate, and an object. The similarity of SAO structures is obtained from the similarity of internal elements, so the similarity between internal elements, i.e., the similarity between words, is measured first.
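As a rough illustration of how SAO similarity can be built from word similarity, the sketch below averages the cosine similarities of the subject, action, and object pairs. The equal weighting, the function names, and the three-dimensional toy vectors are assumptions for illustration; they are not the aggregation or the vectors used in this paper.

```python
import math


def cosine(u, v):
    """Cosine similarity between two word vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    return dot / (norm_u * norm_v)


def sao_similarity(sao_a, sao_b, vectors):
    """Average the S-S, A-A, and O-O word similarities (equal weights assumed)."""
    sims = [cosine(vectors[x], vectors[y]) for x, y in zip(sao_a, sao_b)]
    return sum(sims) / len(sims)


# Toy word vectors (hypothetical values, not trained embeddings).
toy_vectors = {
    "shower": [0.9, 0.1, 0.0],
    "showerhead": [0.8, 0.2, 0.1],
    "spray": [0.1, 0.9, 0.2],
    "eject": [0.2, 0.8, 0.3],
    "water": [0.0, 0.2, 0.9],
}

identical = sao_similarity(("shower", "spray", "water"),
                           ("shower", "spray", "water"), toy_vectors)
similar = sao_similarity(("shower", "spray", "water"),
                         ("showerhead", "eject", "water"), toy_vectors)
```

Two SAO structures with the same elements score 1, and structures whose elements are near-synonyms score close to 1; other weightings of the three elements are equally possible.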
In this paper, the Word2Vec model is chosen to compute the semantic similarity between words. The Word2Vec model is a language model proposed by Mikolov et al. [16] based on the NNLM model of Bengio et al. [57] and the log-bilinear model of Hinton et al. [58]. After optimization on a given valid corpus, words can be trained quickly and efficiently into vector form, providing an effective tool for the subsequent word similarity calculation. Word2Vec contains two core architectures, the CBOW model and the Skip-gram model, as shown in Figure 5. The CBOW model predicts the probability of occurrence of the current word w(t) from its context, while the Skip-gram model does the opposite, predicting the probability of occurrence of several words before and after the current word w(t). Skip-gram is less efficient but has relatively high accuracy, so this paper uses the Skip-gram model as the training model to prioritize accuracy.

In this paper, the Word2Vec model is chosen to compute the semantic similarity between words. The Word2Vec model is a language model proposed by Mikolov et al. [16], building on the NNLM model of Bengio et al. [57] and the log-bilinear model of Hinton et al. [58]. Trained on a given corpus, it maps each word to a vector quickly and efficiently, providing an effective tool for the subsequent word similarity calculation. Word2Vec contains two core architectures, the CBOW model and the Skip-gram model, as shown in Figure 5. The CBOW model predicts the probability of the current word w(t) from its context, while the Skip-gram model does the opposite, predicting the probability of the words before and after the current word w(t). Skip-gram is less efficient but relatively more accurate, so this paper uses the Skip-gram model as the training model to give priority to accuracy.
The Skip-gram model uses a three-layer network structure to train word vectors, including an input layer, a hidden layer, and an output layer. The input layer is the one-hot encoding of the input word, while the hidden and output layers correspond to the two vector matrices W1 and W2. The central word matrix W1 has the dimension V × N and the surrounding word matrix W2 has the dimension N × V, where V is the size of the lexicon and N is the dimensionality of the constructed word vector. Multiplying the one-hot encoding of the input layer by the matrix W1 yields a reduced-dimension vector of size 1 × N, which can be regarded as the central word vector of the word. This vector is then multiplied by the surrounding word matrix W2, which captures the influence of the surrounding words on the word. The result is a vector of size 1 × V, which is normalized by Softmax to obtain the predicted probability values. The difference between the predicted probabilities and the true values is the loss, and according to this loss the matrices W1 and W2 are adjusted using the backpropagation algorithm to make the prediction more accurate. Following Mikolov et al. [16], the training objective for a sequence of words w(1), ..., w(T) is the average log-probability:
$$\frac{1}{T}\sum_{t=1}^{T}\sum_{-k \le j \le k,\; j \ne 0} \log p\big(w(t+j) \mid w(t)\big)$$
In this formula, k is the window size: the larger the window, the more information is captured and the more accurate the result, but the efficiency decreases. After training, each word has its own vector representation, and the similarity between two words is finally measured by the cosine similarity of their vectors.
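The forward pass and the cosine comparison described above can be illustrated with a minimal sketch. It assumes tiny randomly initialized matrices rather than trained ones, and the function names (`skipgram_forward`, `cosine_similarity`) are hypothetical helpers, not part of any Word2Vec library.

```python
import math
import random


def softmax(scores):
    """Normalize raw scores into probabilities (shifted for numerical stability)."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def skipgram_forward(center_index, w1, w2):
    """One Skip-gram forward pass.

    w1 is the V x N central word matrix, w2 is the N x V surrounding word matrix.
    Multiplying a one-hot vector by w1 simply selects row `center_index`.
    """
    hidden = w1[center_index]                      # 1 x N central word vector
    vocab_size = len(w2[0])
    scores = [
        sum(hidden[n] * w2[n][v] for n in range(len(hidden)))
        for v in range(vocab_size)
    ]                                              # 1 x V raw scores
    return softmax(scores)                         # predicted probabilities


def cosine_similarity(u, v):
    """Cosine of the angle between two word vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


# Toy setup (assumed values): lexicon size V = 5, vector dimension N = 3.
random.seed(0)
V, N = 5, 3
w1 = [[random.uniform(-0.5, 0.5) for _ in range(N)] for _ in range(V)]
w2 = [[random.uniform(-0.5, 0.5) for _ in range(V)] for _ in range(N)]
probs = skipgram_forward(2, w1, w2)
```

After training, the rows of `w1` would serve as the word vectors passed to `cosine_similarity`; the sketch omits the backpropagation step.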
In practice, experiments show that word vectors trained with Word2Vec are not as accurate as those of the NNLM model; given a sufficiently large corpus, however, the word vectors generated by Word2Vec become increasingly accurate. The model can therefore be trained on English Wikipedia to obtain a highly accurate word vector model.
The S and O elements of the extracted SAO structures may appear in singular or plural form. Because the singular and plural forms refer to the same object, it is important to unify the word forms and eliminate this distinction. LemmInflect is a Python module for reducing English words to their base forms. It relies on a dictionary, and its accuracy is reported to be higher than that of NLTK, spaCy, and Stanford CoreNLP. For example, the dictionary maps "pipelines" to "pipeline", "showers" to "shower", and "plays, played, playing" to "play", so restoring a word only requires a dictionary lookup.
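The dictionary lookup described above can be sketched as follows. This is a toy stand-in that only contains the example mappings from the text; it does not call LemmInflect, and `LEMMA_TABLE`, `lemmatize`, and `normalize_sao` are hypothetical names.

```python
# Toy lemma dictionary holding only the examples mentioned in the text.
LEMMA_TABLE = {
    "pipelines": "pipeline",
    "showers": "shower",
    "plays": "play",
    "played": "play",
    "playing": "play",
}


def lemmatize(word, table=LEMMA_TABLE):
    """Return the base form of `word`; unknown words are only lower-cased."""
    key = word.lower()
    return table.get(key, key)


def normalize_sao(sao):
    """Unify the word forms of a (subject, action, object) triple."""
    subject, action, obj = sao
    return (lemmatize(subject), lemmatize(action), lemmatize(obj))


normalized = normalize_sao(("Pipelines", "played", "showers"))
```

A real pipeline would replace the toy table with a full morphological dictionary and would handle multi-word subjects and objects.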
In the SAO structure, S and O are nouns and can be cross-calculated, whereas A is a verb and is calculated separately. The specific computation is shown in Figure 6. The formula for calculating the similarity between two SAO structures is as follows:

Patent Similarity Calculations
After obtaining the similarity of SAO structures, the Hungarian algorithm is used to find the maximum matching between two SAO sets, as shown in Figure 7, where the red lines represent the matching result. The Hungarian algorithm is a combinatorial optimization algorithm that solves task assignment problems in polynomial time and was later applied to matching problems in graph theory. In this paper, we set a threshold P: if the similarity of two SAO structures reaches the threshold, the two structures are considered a possible match. SAO1(i) (1 ≤ i ≤ n) denotes the SAO structures in patent 1, and SAO2(j) (1 ≤ j ≤ m) denotes the SAO structures in patent 2. However, an SAO structure in one patent may match multiple SAO structures in the other patent, so a matching must satisfy two conditions:
(1) The match is a set of edges.
(2) In this set, no two edges share a common vertex.
Therefore, this paper uses the Hungarian algorithm to achieve maximum matching.
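The thresholded maximum matching described above can be sketched with the augmenting-path form of the Hungarian method for bipartite graphs. The similarity matrix and the function name `max_matching` are illustrative assumptions; how the paper turns the matching into a patent-level score is not shown here.

```python
def max_matching(sim, threshold):
    """Maximum bipartite matching between two SAO sets.

    sim[i][j] is the similarity between SAO1(i) and SAO2(j). An edge exists
    only when the similarity reaches `threshold`, and no two matched edges
    may share a vertex. Returns the matched (i, j) index pairs.
    """
    n = len(sim)
    m = len(sim[0]) if n else 0
    match_of_col = [-1] * m          # match_of_col[j] = row currently matched to j

    def try_assign(i, visited):
        """Search for an augmenting path starting from row i."""
        for j in range(m):
            if sim[i][j] >= threshold and not visited[j]:
                visited[j] = True
                # Column j is free, or its current partner can be moved elsewhere.
                if match_of_col[j] == -1 or try_assign(match_of_col[j], visited):
                    match_of_col[j] = i
                    return True
        return False

    for i in range(n):
        try_assign(i, [False] * m)
    return sorted((i, j) for j, i in enumerate(match_of_col) if i != -1)


# Toy similarity matrix (assumed values) with threshold P = 0.8.
toy_sim = [
    [0.90, 0.85, 0.10],
    [0.95, 0.20, 0.30],
    [0.10, 0.10, 0.10],
]
pairs = max_matching(toy_sim, threshold=0.8)
```

In the toy matrix, SAO1(0) could pair with either column 0 or column 1, but SAO1(1) only reaches the threshold with column 0, so the augmenting path moves SAO1(0) to column 1 and both rows end up matched.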

Determining the Optimal Threshold
To discriminate well between the target patent and the related patents, the experiment seeks to keep both the proportion of patents with overly similar similarity values and the proportion of patents with zero similarity as small as possible. When many patents receive nearly identical similarity values, they are difficult to distinguish; fewer repeated values mean that subtle differences between patents are captured. The smaller the proportion of patents with zero similarity, the more detailed the analysis of the textual content. Before calculating the initial level of patent similarity, a threshold (P) for the SAO structure must be set. Thresholds from 0.3 to 1 were searched with a step size of 0.01 to determine the optimal setting. Figure 8 shows the proportion of patents with zero similarity and of patents with too much similarity at different thresholds.
To reduce the proportion of patents with high similarity and those with zero similarity, the experiment initially chose a threshold range between 0.6 and 0.8. To cross-check the results, experts were invited to read the patents manually so that the difference between the measured results and human judgment was minimized. After reviewing all combinations, a threshold of 0.8 was selected.
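The threshold search described above can be sketched as a simple sweep. The scoring function below is a hypothetical placeholder (the share of SAO pair similarities reaching P), and treating "too much similarity" as tied, non-zero scores is one possible reading of the text, not the paper's stated definition.

```python
from collections import Counter


def sweep_thresholds(score_fn, start=0.30, stop=1.00, step=0.01):
    """Evaluate candidate thresholds P from `start` to `stop` inclusive.

    score_fn(p) must return one similarity score per related patent.
    Each row is (p, share of zero scores, share of tied non-zero scores).
    """
    rows = []
    n_steps = int(round((stop - start) / step))
    for k in range(n_steps + 1):
        p = round(start + k * step, 2)
        scores = score_fn(p)
        counts = Counter(round(s, 4) for s in scores)
        zero_share = sum(1 for s in scores if s == 0) / len(scores)
        tied_share = sum(
            c for value, c in counts.items() if c > 1 and value != 0
        ) / len(scores)
        rows.append((p, zero_share, tied_share))
    return rows


# Hypothetical SAO pair similarities for three related patents.
TOY_PAIR_SIMS = [
    [0.90, 0.70, 0.50],
    [0.85, 0.40, 0.35],
    [0.60, 0.60, 0.20],
]


def toy_score(p):
    """Placeholder patent score: fraction of SAO pair similarities >= p."""
    return [sum(1 for s in sims if s >= p) / len(sims) for sims in TOY_PAIR_SIMS]


rows = sweep_thresholds(toy_score)
```

Plotting the second and third columns of `rows` against the first would give a curve of the kind the text attributes to Figure 8.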

Patent Similarity between Target Patents and Related Patents
Using the SAO structure method, the target patent US20180318860A1 was compared with the related patents and the results were ranked by similarity. Table 5 shows the top ten patent serial numbers and their degree of patent similarity.

Weighted SAO Structure
Wang et al. [40] introduced the different weighted SAO (DWSAO) index: they extracted patent SAO structures and calculated their weights to measure patent similarity in robotics. Let N be the number of patents in the patent set and m the number of SAO structures in the target patent, and let $SAO_i^p$ denote the i-th SAO structure of patent P. Equation (8) calculates its feature weight, the DWSAO value, where F is the document frequency of $SAO_i^p$: F is initialized to 1, the N patents are traversed, and F is increased by 1 for each patent that contains an SAO structure similar to $SAO_i^p$. It follows from the formula that the more an SAO structure is shared with other patents, the weaker its ability to represent technical features and the smaller its DWSAO value. The calculation process is shown in Figure 9.
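The counting of the document frequency F described above can be sketched as follows. Only the counting step is shown; the weight formula itself is not reproduced. The similarity test `is_similar` is a hypothetical stand-in (exact equality of triples), not the measure used by Wang et al.

```python
def document_frequency(target_sao, patents, is_similar):
    """Count F for one SAO structure of the target patent.

    F starts at 1 and is increased by 1 for every patent in `patents`
    that contains at least one SAO structure similar to `target_sao`.
    """
    f = 1
    for sao_set in patents:
        if any(is_similar(target_sao, other) for other in sao_set):
            f += 1
    return f


def is_similar(a, b):
    """Placeholder similarity test: identical (S, A, O) triples."""
    return a == b


# Hypothetical patent set with N = 3 patents.
toy_patents = [
    [("nozzle", "spray", "water"), ("valve", "control", "flow")],
    [("pump", "drive", "fluid")],
    [("valve", "control", "flow")],
]
f_common = document_frequency(("valve", "control", "flow"), toy_patents, is_similar)
f_rare = document_frequency(("sensor", "detect", "leak"), toy_patents, is_similar)
```

A structure with a larger F is shared by more patents and, according to the text, would receive a smaller DWSAO value.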

Multimodal Patent Similarity Analysis
The research method in this paper compares the target patent with related patents by fusing image analysis with SAO structures. In the preceding SAO-based method, the similarity between patent texts was obtained solely from the similarity of their SAO structures. The abstract of a patent gives a comprehensive overview of the features of the invention, and the claims give a detailed account of its content; both are rich in content and provide an ample corpus, which makes them suitable for studying patent infringement and patent similarity. In this paper, we combine image information with the SAO structure to improve the accuracy of patent similarity measurement. The specific implementation process is shown in Figure 10.

1. First, the SAO structures in the patent are extracted and preprocessed using standard preprocessing. Second, the patent contour is extracted from the drawings attached to the abstract of the patent to preserve internal information. At the end of this step, each patent corresponds to an SAO set and a processed patent image.

2. Based on semantic information, the similarity between SAO structures is calculated, followed by the similarity between the SAO sets of the related patents and that of the target patent. Each patent corresponds to an SAO set, the similarity between SAO sets indicates the similarity between patents, and the Hungarian algorithm is applied to obtain it.

3. The similarity of image features between the related patents and the target patent is calculated: the contour of the patent image is detected, the contour map is reconstructed using Fourier descriptors, the image within the contour is retained, and image similarity is calculated using the mutual information method. Finally, the image similarity is combined with the weighted patent text similarity to obtain the overall patent similarity.

4. The TF-IDF method, the SAO structure method, the DWSAO method, and the Sentence Bidirectional Encoder Representations from Transformers (SBERT) method are used to calculate the similarity between the target patent and the related patents and to compare the accuracy of the different methods.

Contour Extraction
To extract the patent contour from the image, the image is first blurred using median filtering to reduce noise. Median filtering is a nonlinear smoothing technique that replaces the gray value of each pixel with the median of the gray values of the pixels in its eight-neighborhood. The process is shown in Figure 11a, and the gray values before and after filtering are shown in Figure 11b.
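The median filtering step described above can be sketched on a small grayscale image stored as nested lists. The sketch assumes a 3 × 3 window that includes the center pixel along with its eight neighbors and leaves border pixels unchanged; both choices are assumptions, since the text does not specify them.

```python
from statistics import median


def median_filter_3x3(img):
    """Replace each interior pixel with the median of its 3 x 3 window.

    `img` is a list of rows of integer gray values (0-255).
    Border pixels are copied unchanged.
    """
    height, width = len(img), len(img[0])
    out = [row[:] for row in img]
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            window = [
                img[y + dy][x + dx]
                for dy in (-1, 0, 1)
                for dx in (-1, 0, 1)
            ]
            out[y][x] = int(median(window))
    return out


# A single bright noise pixel surrounded by a uniform background.
noisy = [
    [10, 10, 10],
    [10, 255, 10],
    [10, 10, 10],
]
smoothed = median_filter_3x3(noisy)
```

The isolated value 255 is replaced by the background value 10, which is the salt-and-pepper noise removal that motivates the use of a median rather than a mean.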
Secondly, binarization is performed to facilitate contour extraction. Adaptive thresholding sets the threshold locally: the value of each pixel is compared with the average pixel value of the region in which it lies to determine whether the pixel is assigned 0 or 1.
Finally, for the pre-processed image, Fourier operator eight-neighborhood detection is performed to extract the contour point coordinates, and the contour enclosing the largest area is retained. The contour is then reconstructed using the Fourier descriptor so that valid image information within the contour is preserved. The images before and after processing are shown in Figure 12.
Methods based on mutual information are widely used in the field of image alignment. In this paper, we characterize the similarity between two images by calculating their mutual information. Table 6 shows the top ten patent serial numbers and their patent image similarity.
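The mutual information between two images mentioned above can be sketched from their joint gray-level histogram. The sketch assumes two equally sized grayscale images given as nested lists and quantizes gray values into 16 bins; the bin count and the function name are assumptions, and any normalization the paper applies to turn this value into a similarity score is not shown.

```python
import math
from collections import Counter


def mutual_information(img_a, img_b, bins=16):
    """Mutual information (in bits) between two equally sized grayscale images."""
    a = [pixel * bins // 256 for row in img_a for pixel in row]
    b = [pixel * bins // 256 for row in img_b for pixel in row]
    if len(a) != len(b):
        raise ValueError("images must contain the same number of pixels")
    n = len(a)
    joint = Counter(zip(a, b))       # joint histogram of gray-level pairs
    hist_a = Counter(a)              # marginal histogram of image A
    hist_b = Counter(b)              # marginal histogram of image B
    mi = 0.0
    for (x, y), count in joint.items():
        p_xy = count / n
        p_x = hist_a[x] / n
        p_y = hist_b[y] / n
        mi += p_xy * math.log2(p_xy / (p_x * p_y))
    return mi


pattern = [[0, 255], [0, 255]]
flat = [[0, 0], [0, 0]]
mi_same = mutual_information(pattern, pattern)
mi_flat = mutual_information(pattern, flat)
```

An image compared with itself yields its own entropy (1 bit for the two-level toy pattern), while a constant image shares no information with any other image.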
The patent similarity calculation is based on the fusion of the patent image similarity method and the patent text similarity method, as shown in Equation (9).

Threshold Selected
We compared several combinations of the weights α and β and determined the appropriate combination from them, as shown in Table 7. Patent images are auxiliary to the patent text: they give the text a visual form and make the patent easier for the reader to understand, so the weight of the patent image is lower than the weight of the patent text. All combinations in Table 7 were checked, and combination 4 was the most effective, so α is set to 0.8 and β to 0.2.
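The fusion with the selected weights can be sketched as follows. The sketch assumes that the fusion is a linear weighted sum with α applied to the text (SAO) similarity and β to the image similarity, which is consistent with the text giving images the lower weight; the patent identifiers and scores are hypothetical.

```python
def fused_similarity(text_sim, image_sim, alpha=0.8, beta=0.2):
    """Assumed linear fusion: alpha weights the text, beta weights the image."""
    return alpha * text_sim + beta * image_sim


def rank_patents(scores, alpha=0.8, beta=0.2):
    """Rank patents by fused similarity, highest first.

    `scores` maps a patent identifier to (text_sim, image_sim).
    """
    fused = {
        pid: fused_similarity(text_sim, image_sim, alpha, beta)
        for pid, (text_sim, image_sim) in scores.items()
    }
    return sorted(fused.items(), key=lambda item: item[1], reverse=True)


# Hypothetical related patents with (text similarity, image similarity).
toy_scores = {
    "patent_A": (0.50, 1.00),
    "patent_B": (0.70, 0.10),
    "patent_C": (0.20, 0.90),
}
ranking = rank_patents(toy_scores)
```

With α = 0.8 the ranking is dominated by the text similarity, and the image similarity mainly separates patents whose text scores are close.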

Patent Similarity between Target Patents and Related Patents
In this paper, US20180318860A1 is selected as the target patent and compared with the related patents. The similarity is ranked using the method proposed in this paper, and Table 8 shows the top ten patent serial numbers and their patent similarity.

Analysis and Validation of Results
To further verify the effectiveness of the method, the method proposed in this paper was compared with the TF-IDF method, the SAO method, the DWSAO method, and the SBERT method. The purpose of the experiment was to find patents with a high degree of similarity to the target patent. In cooperation with the patent office, the university invited three experts: a university professor, an industry expert with more than five years of experience, and a patent examiner who had been practicing for five years. They assessed similarity based on the functional features involved in the patent and the technical means used to solve the technical problem. The top ten most similar patents were obtained by ranking the patents from most to least similar, and Table 9 shows the results of this analysis. To show the average variation of each method more graphically, the results are presented in Figure 13.
As shown in the table, the absolute difference between manual reading and TF-IDF is 62, and the mean difference is 6.2. The absolute difference between manual reading and the traditional SAO method is 43, and the mean difference is 4.3. The absolute difference between manual reading and DWSAO is 43, and the mean difference is 4.3. The absolute difference between manual reading and SBERT is 61, and the mean difference is 6.1. The TF-IDF method is less accurate than the SAO method because the SAO structure reflects the structural relationship between engineering components. Park et al. [18] concluded ... such as from design for mass customization to design for mass personalization [61]. To be sure, the accuracy of patent similarity measurement will be improved by the above improved methods.
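The rank comparison against manual reading reported above can be sketched as a sum of absolute rank differences. The sketch assumes that each expert-ranked patent also appears somewhere in the method's full ranking; the patent identifiers are hypothetical and the numbers are not those of Table 9.

```python
def rank_difference(expert_order, method_order):
    """Compare a method's ranking with the expert ranking.

    expert_order: patent identifiers ranked by manual reading (best first).
    method_order: the method's full ranking (best first); it must contain
                  every identifier in expert_order.
    Returns (sum of absolute rank differences, mean difference per patent).
    """
    method_rank = {pid: rank for rank, pid in enumerate(method_order, start=1)}
    diffs = [
        abs(rank - method_rank[pid])
        for rank, pid in enumerate(expert_order, start=1)
    ]
    return sum(diffs), sum(diffs) / len(diffs)


# Hypothetical rankings: three expert-ranked patents, four ranked by a method.
expert = ["A", "B", "C"]
method = ["B", "A", "D", "C"]
total, mean = rank_difference(expert, method)
```

With ten expert-ranked patents, a total of 62 would correspond to a mean of 6.2, matching the relationship between the two figures quoted for TF-IDF.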