Design and Testing of Automatic Machine Translation System Based on Chinese-English Phrase Translation

With the development of linguistics and the improvement of computer performance, the quality of machine translation keeps improving, and machine translation is now widely used. The Chinese-English automatic phrase translation method takes phrases as the basic translation unit and makes full use of phrase order. Compared with word-based statistical machine translation methods, its results are greatly improved. This article studies the design of phrase-based automatic machine translation systems by introducing machine translation methods and Chinese-English phrase translation, explores the design and testing of an automatic machine translation system based on Chinese-English phrase translation, and explains the role of automatic machine translation in promoting the development of translation. By combining machine translation experiments with system design methods, the article studies the design and testing of such a system in order to deepen the understanding of language, knowledge, and intelligence, to help solve other language processing problems, and to promote the development of corpus linguistics. The experimental results show that when the Chinese-English phrase translation probability table is reduced from 82% to 51%, the BLEU score of the Chinese-English phrase translation system still improves. Automatic machine translation saves the time and effort of translation work, and it shows its advantages through a short development cycle and easy processing of large-scale corpora.


Introduction
People express their emotions through language, which is an important tool for communication between people. Therefore, overcoming communication barriers between languages is becoming more and more important in the 21st century. Automatic machine translation is a meaningful and complicated research topic, full of challenges and difficulties. Continuous, high-quality automatic machine translation is one of the ultimate goals of computing and language research, and it is the main trend of future development.
Automatic machine translation is becoming more and more important in today's society, and with rapid economic development its potential is huge. Every day, people from all walks of life deal with a large number of documents, and people who speak different languages communicate with each other. Therefore, automatic machine translation has great market demand, and only a system that can handle a very large amount of information can meet the needs of translation. Chinese-English automatic machine translation has become one of the most common applications at present; it greatly facilitates people's lives and also serves as a simple data warehouse.
Sangeetha and Jothilakshmi proposed a speech-to-speech translation (SST) system, which mainly focuses on translation from English to Dravidian languages. The three main technologies involved in the SST system are automatic continuous speech recognition, machine translation, and text-to-speech synthesis. Automatic continuous speech recognition was developed based on autoassociative neural networks (AANN), support vector machines (SVM), and hidden Markov models (HMM). Compared with SVM and AANN, HMM produces better results, but specific data to support this are currently lacking [1]. Shereen and Mohamed believe that deaf-mute people, who use sign language, are an important part of a growing community. However, communication between hearing people and hearing-impaired people is difficult because most hearing people cannot understand the meaning of sign language gestures, while deaf-mute people cannot understand natural spoken language. There are approximately 70 million deaf and hearing-impaired people in the world, including people who use sign language as their mother tongue. The analysis of the existing system provides the necessary information about its work process, success rate, shortcomings, and limitations, but its development is relatively vague [2]. In order to improve the accuracy of automatic machine translation, Sangeetha and Jothilakshmi proposed a study to improve the efficiency of machine translation. For this purpose, based on the adjustment of English context and the mutual information between English words, they proposed an automatic translation system based on semantic relations [1]. The innovation of this article lies in the investigation of the phrase translation probability method for Chinese-English phrase translation and of the automatic machine translation system built on Chinese-English phrase translation. The systematic research and experimentation on automatic translation systems are of great significance. To a certain extent, they can promote the rapid and in-depth dissemination of international information.

Process of Machine Translation of Phrases.
Machine translation experiments found that when the translation model in the basic IBM machine translation equation was replaced by the reverse translation model, the translation accuracy did not decrease, which cannot be explained by the source-channel theory [3,4]. Therefore, machine translation based on maximum entropy was proposed. This method is more general than the statistical machine translation method based on the source channel [5,6]. In the maximum entropy framework, the language model and the translation model are treated as features and added to the model; the main advantage is that knowledge sources can be integrated easily and weighted automatically. Most current statistical machine translation methods use the maximum entropy modeling framework [7,8]. The automatic machine translation modeling of phrases is shown in Figure 1.

Method and Process of Machine Translation Based on Phrase Structure

Corpus Preprocessing.
The quality of corpus processing directly affects the translation results. Statistical machine translation usually uses a bilingual corpus and prepares the Chinese and English corpora separately [9,10]. The results of corpus preprocessing are shown in Table 1.

Implementing the Title Translation System in the Aviation Field.
On the basis of researching related machine translation theory, using some existing resources and tools, we complete the phrase translation model module, realize the phrase-based statistical machine translation system, and introduce the basic working principle of the system, system implementation, and system operating environment settings and parameters [11,12].

Automatic Evaluation Technology of Machine Translation.
Based on the research of automatic machine translation technology, the results of Chinese-English automatic machine translation are evaluated automatically. In the field of statistical machine learning, there are already some methods for solving domain adaptation problems [13,14], but most of them are only used for simple learning problems such as classification or regression. For structured learning problems such as machine translation, different domain adaptation methods are applied separately within the machine learning framework [15]. The application of machine translation automatic evaluation technology is shown in Figure 2.

Stack Search Translation Method.
Stack search is a heuristic, exploratory search method. The search is organized into n stacks, where n is the number of words in the source language sentence, and each hypothesis is stored in the stack that corresponds to the number of source words it has translated [16]. "I" is translated as "she," and "flower" is derived from the word "flowers." Both hypotheses are in the first stack of the stack search translation method [7]. In the second stack, each hypothesis adds information about the translation of two source language words. Among the hypotheses in which the source language words have been translated, the one with the lowest cost is determined to be the best translation [8]. The stack search conversion is shown in Figure 3.

Step    Complaint text
1       Introduce custom dictionary
2       Segment words and mark part of speech
3       Remove stop words
4       Keep important part-of-speech words
5       Keep processing results
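The stack organization described above can be illustrated with a minimal sketch. It assumes a monotone, word-by-word decoder with no reordering and no language model, which is much simpler than a full phrase-based decoder. The names `Hypothesis`, `stack_decode`, and `TOY_OPTIONS`, the toy lexicon, and its costs are hypothetical and are not taken from the system in this article. Stack 0 holds the empty hypothesis, and stack i holds hypotheses that have translated i source words.

```python
import heapq
from dataclasses import dataclass, field


@dataclass(order=True)
class Hypothesis:
    cost: float                                        # accumulated cost, lower is better
    output: tuple = field(default=(), compare=False)   # target words produced so far


# Hypothetical toy lexicon: source word -> list of (target word, cost).
TOY_OPTIONS = {
    "我": [("I", 0.2), ("me", 1.0)],
    "喜欢": [("like", 0.3), ("love", 0.6)],
    "花": [("flowers", 0.4), ("flower", 0.7)],
}


def stack_decode(source, options, beam_size=3, unknown_cost=10.0):
    """Monotone stack search: stack i stores hypotheses covering i source words."""
    n = len(source)
    stacks = [[] for _ in range(n + 1)]
    stacks[0].append(Hypothesis(0.0, ()))
    for i in range(n):
        # Prune the current stack to the cheapest hypotheses before expanding it.
        survivors = heapq.nsmallest(beam_size, stacks[i])
        # Unknown source words are copied to the output with a fixed penalty.
        choices = options.get(source[i], [(source[i], unknown_cost)])
        for hyp in survivors:
            for target, cost in choices:
                stacks[i + 1].append(
                    Hypothesis(hyp.cost + cost, hyp.output + (target,))
                )
    # The cheapest hypothesis in the last stack covers the whole sentence.
    best = min(stacks[n])
    return list(best.output), best.cost
```

Calling `stack_decode(["我", "喜欢", "花"], TOY_OPTIONS)` expands three stacks in turn and returns the cheapest complete hypothesis. A realistic decoder would also track which source positions are covered so that phrases can be reordered.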

Phrase-Based Statistical Machine Translation.
The basic idea is to use phrases as the basic unit of translation. In the translation process, a phrase is not translated the same way everywhere, and several interpretations may exist at the same time. Even when phrases are not required to be continuous in the grammatical sense, the overall coherence of the full text still has to be handled. Translating at the phrase level makes it easy to solve local, context-dependent problems, and phrases in any language can be handled this way while preserving the original state of the language to the greatest extent. Generally speaking, a phrase here is not a syntactic constituent; it can be any continuous word sequence. Therefore, the Chinese-English word alignment must be carried out first so that bilingual phrases can be extracted, and the process of rule-based machine translation is shown in Table 2. In the output file of the phrase output module, each line contains a Chinese phrase, an English phrase, and a translation probability value.
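As an illustration of such an output file, the sketch below reads lines that each hold a Chinese phrase, an English phrase, and a translation probability. The `|||` field separator is an assumption borrowed from common phrase table conventions; the actual file layout of the system is not specified in this article. The names `PhraseEntry`, `parse_phrase_line`, and `load_phrase_table` are hypothetical.

```python
from typing import NamedTuple


class PhraseEntry(NamedTuple):
    chinese: str
    english: str
    probability: float


def parse_phrase_line(line, sep="|||"):
    """Parse one line of the form 'chinese ||| english ||| probability'.

    The separator is an assumed convention, not the article's documented format.
    """
    fields = [part.strip() for part in line.split(sep)]
    if len(fields) != 3:
        raise ValueError(f"expected 3 fields, got {len(fields)}: {line!r}")
    chinese, english, prob_text = fields
    probability = float(prob_text)
    if not 0.0 <= probability <= 1.0:
        raise ValueError(f"probability out of range: {probability}")
    return PhraseEntry(chinese, english, probability)


def load_phrase_table(lines):
    """Group entries by Chinese phrase, most probable translation first."""
    table = {}
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        entry = parse_phrase_line(line)
        table.setdefault(entry.chinese, []).append(entry)
    for entries in table.values():
        entries.sort(key=lambda e: e.probability, reverse=True)
    return table
```

Grouping by the Chinese phrase keeps all candidate translations of one source phrase together, which is the lookup a decoder needs.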

Defining the Format of Phrase Translation Probability.
(1) Lexicalized translation probability: The BLEU evaluation tool is currently the most widely used metric in international machine translation evaluation. It compares the system translation with the reference translation, calculates the precision of each system translation, and finally scores the translation as a whole. It is calculated as follows:

Vector Machine Algorithm.
Where P is the penalty length factor, B is the shortest length of the reference translation of the tested sentence, and R is the translation length of the tested sentence, that is, the number of words contained in the entire output translation.
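The quantities P, B, and R defined above can be made concrete with a short sketch of the standard BLEU computation. This is the widely used textbook form, assumed here because the equation itself is not reproduced in the text: the score is P multiplied by the geometric mean of the clipped n-gram precisions, with P = 1 when R > B and P = exp(1 - B/R) otherwise. The function names, the uniform weights, and the choice of 4-grams are assumptions.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Count the n-grams of a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def bleu(candidate, references, max_n=4):
    """Sentence-level BLEU with uniform weights and no smoothing."""
    R = len(candidate)                         # length of the output translation
    B = min(len(ref) for ref in references)    # shortest reference length
    if R == 0:
        return 0.0
    log_precision_sum = 0.0
    for n in range(1, max_n + 1):
        cand_counts = ngrams(candidate, n)
        # For each n-gram, the largest count seen in any single reference.
        max_ref = Counter()
        for ref in references:
            for gram, count in ngrams(ref, n).items():
                max_ref[gram] = max(max_ref[gram], count)
        clipped = sum(min(c, max_ref[g]) for g, c in cand_counts.items())
        if clipped == 0:
            return 0.0  # without smoothing, one zero precision gives a zero score
        total = sum(cand_counts.values())
        log_precision_sum += math.log(clipped / total) / max_n
    # P is the length penalty factor.
    P = 1.0 if R > B else math.exp(1.0 - B / R)
    return P * math.exp(log_precision_sum)
```

A candidate that is shorter than the reference is penalized through P even if every one of its n-grams matches, which is why precision alone is not used as the score.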
In current statistical evaluation methods, the co-occurrence of single words indicates the fidelity of the translation, meaning that a word of the original text has been translated, while the co-occurrence of n-grams of order two or higher indicates the fluency of the target language. It is calculated as follows: This is the minimum error rate during editing. The score ranges from 0 to 1, and different edits produce different scores. The edit distance is the minimum cost of the insertion, deletion, and replacement operations needed to convert the system output into the reference translation.
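The edit distance defined above can be computed with the usual dynamic programming recurrence. The sketch below is an illustration, not the evaluation script used in the experiments. It assumes unit cost for each insertion, deletion, and replacement, and, as a further assumption, normalizes by the longer of the two sequences so that the resulting rate always lies between 0 and 1.

```python
def edit_distance(hypothesis, reference):
    """Minimum number of insertions, deletions, and replacements that
    turn `hypothesis` into `reference` (both are sequences of tokens)."""
    previous = list(range(len(reference) + 1))
    for i, hyp_token in enumerate(hypothesis, start=1):
        current = [i]
        for j, ref_token in enumerate(reference, start=1):
            substitution = 0 if hyp_token == ref_token else 1
            current.append(min(
                previous[j] + 1,                 # delete hyp_token
                current[j - 1] + 1,              # insert ref_token
                previous[j - 1] + substitution,  # keep or replace
            ))
        previous = current
    return previous[-1]


def edit_error_rate(hypothesis, reference):
    """Edit distance scaled into [0, 1] by the longer sequence length."""
    longest = max(len(hypothesis), len(reference))
    if longest == 0:
        return 0.0
    return edit_distance(hypothesis, reference) / longest
```

Only two rows of the table are kept in memory at a time, so the memory cost grows with the reference length rather than with the product of both lengths.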

Automatic Evaluation Model of Machine Translation.
The log-linear model is introduced into statistical translation. It allows any number of features to be added to the translation process, and the contribution of each feature to the translation result is determined by weighting these features. Therefore, a phrase-based translation system built this way performs far better than a word-based translation system. For formal syntax model rules, the formula is as follows: According to the CKY algorithm, we can construct a hypergraph from sentences of the source language. When we calculate the k-best derivations of a node, the ranking of rules along each dimension no longer depends only on the score of the syntactic rule features. We use a heuristic function H(R) to sort the rules.
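The weighting of features in a log-linear model can be shown with a small sketch: each candidate translation carries feature values, and its score is the weighted sum of those values. The feature names `tm` and `lm`, the candidate strings, and all numbers are hypothetical and chosen only to show how the weights change the ranking.

```python
import math


def loglinear_score(features, weights):
    """Weighted sum of feature values; features without a weight count as 0."""
    return sum(weights.get(name, 0.0) * value for name, value in features.items())


def best_candidate(candidates, weights):
    """Return the candidate with the highest log-linear score."""
    return max(candidates, key=lambda c: loglinear_score(c["features"], weights))


# Hypothetical candidates with log-probability features.
CANDIDATES = [
    {"text": "machine translation system",
     "features": {"tm": math.log(0.6), "lm": math.log(0.3)}},
    {"text": "machine translate system",
     "features": {"tm": math.log(0.7), "lm": math.log(0.05)}},
]
```

With equal weights on `tm` and `lm`, the first candidate wins because its language model feature is much higher. With the `lm` weight set to 0, the second candidate wins on its translation model feature alone, which is the sense in which the weights determine each feature's contribution.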

Phrase-Based Statistical Machine Translation.
The basic idea is to use phrases as the basic unit of machine translation. In the translation process, each translated word must be combined with its context, which constrains the translation. Generally speaking, the phrases are not required to be grammatical units. The bilingual phrases are extracted from the word-aligned bilingual corpus. Given a source language sentence, the translation process works as follows: the source sentence is divided into phrases, each phrase is translated, and the order of the translated phrases is adjusted according to the target language model. Phrases are used as the basic unit of translation. The Chinese sentence processing system divides each sentence into so-called "phrases" and then translates them into English. The generated phrases and output are shown in Table 3.

Translation Process.
The translation process mainly includes the following parts: the phrase translation model, translation model training, language model training, and decoding to produce and test the results. These parts are laid out in the form of a flowchart. From the perspective of the translation model, Chinese phrases and their English translations are learned from the bilingual sentences and arranged in sequence, as shown in the flowchart in Figure 4.
Traditional heuristic phrase extraction methods based on word alignment suffer from word alignment errors and word-to-null alignment problems, which lead to the loss of many bisyntactic phrases. The bilingual phrases extracted in this paper, on the other hand, are of better quality. Therefore, we consider adding the extracted bisyntactic phrases to the phrase table to make up for the bisyntactic phrases lost by the heuristic phrase extraction method. The experiment uses the provided training set, development set, and test set. The source language of the corpus is Chinese, and the target language is English. The scale of the experimental data is shown in Table 4.
It can be seen from the table that English sentences are on average longer than Chinese sentences, and that both Chinese and English sentences are long, especially when the average length of English sentences reaches one word, which makes syntactic analysis difficult. We analyze the syntax of the source language and the target language and extract bilingual phrases using an iterative phrase extraction algorithm. The Chinese-English phrase translation training set contains 120,000 bilingually aligned Chinese-English sentences, and the test corpus contains 141 sentences. In the experiment, this paper uses the C value and the degree of adhesion to reduce the source language phrases. These measures are added to the translation model as a feature, and the translation results are compared with the benchmark system. First, no matter how long the sentence is, the translation probability is lower than the C value of the source side. The BLEU score can be improved by at most 0.02 compared with the benchmark system, while the phrase translation probability table is only 78% of the original. When the phrase translation probability table is reduced to 51% of the original, the BLEU score is still slightly higher than that of the benchmark system. The experimental results are shown in Figure 5.
The input of the bilingual phrase extraction algorithm is an aligned Chinese-English bilingual tree, so syntactic analysis must be performed separately on the source language side and the target language side of the training corpus. The bilingual phrase extraction algorithm extracts bilingual phrases based on word alignment. The training corpus has already been word aligned, so word alignment does not need to be applied to it again. We run the bisyntactic phrase extraction algorithm and temporarily store the extracted bisyntactic phrases. This experiment needs to run four different machine translation systems. These systems are statistical machine translation systems based on bisyntactic phrases, generated after the extracted bisyntactic phrases are applied to the system.
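Extraction of bilingual phrases from a word alignment can be sketched with the standard alignment-consistency check. This is a simplified illustration of the general idea and not the bisyntactic, tree-based algorithm used in this paper: it ignores syntax trees and does not extend phrases over unaligned words. The function name `extract_phrases` and the `max_len` limit are assumptions.

```python
def extract_phrases(source, target, alignment, max_len=4):
    """Extract phrase pairs consistent with a word alignment.

    `alignment` is a set of (i, j) pairs linking source[i] to target[j].
    A pair of spans is consistent when no word inside the target span is
    aligned to a source word outside the source span.
    """
    pairs = set()
    for start in range(len(source)):
        for end in range(start, min(start + max_len, len(source))):
            # Target positions linked to the source span [start, end].
            linked = [j for (i, j) in alignment if start <= i <= end]
            if not linked:
                continue  # a span with no alignment points yields no phrase
            t_start, t_end = min(linked), max(linked)
            if t_end - t_start + 1 > max_len:
                continue
            # Reject the span if the target side reaches outside the source span.
            if any(t_start <= j <= t_end and not (start <= i <= end)
                   for (i, j) in alignment):
                continue
            pairs.add((" ".join(source[start:end + 1]),
                       " ".join(target[t_start:t_end + 1])))
    return pairs
```

For a three-word sentence whose second and third words swap places in the target, the span made of the first two source words is rejected, because its target side would have to include a word aligned to the third source word.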

Chinese-English Translation Corpus.
The Chinese-English translation corpus is used. This corpus contains more than 10,000 words and 10,000 pairs of sentences. This article finds that the best translation effect can be achieved by using it for word alignment extraction in Chinese-English translation. The evaluation score is calculated with the standard script.
This article uses the ten thousand sentence pairs as the training corpus, and the number of phrase pairs extracted by the method is used as the parameter in the linear reordering model, as shown in Figure 6. As shown in Figure 6, the phrase fragment probabilities are used in the Chinese-English translation, thereby improving performance. Through the data test of an example, after the translation process is completed, machine translation is introduced into the system, and a partition system is established.
The model and module are given, and the existing local resources and document resources are used.

Model Training and Parameter Setting.
The evaluation of machine translation mainly includes manual evaluation and automatic evaluation. The advantage of manual evaluation is high accuracy, but the disadvantage is that the labor cost and time cost are too high. The advantages of automatic evaluation are low cost, fast speed, and the ability to be used repeatedly. The disadvantage is low accuracy. At present, the focus of machine translation evaluation research is how to improve the accuracy of automatic evaluation. The test set of CSTAR 2003 is the development set of the experiment. Some features of the corpus are shown in Table 5. The phrases are extracted from the training set, and the English part of the training set is used to train the language model with a language model tool. The phrase model and the formal syntax model both use a 3-gram language model as a feature. In order to speed up minimum error rate training and save memory space, the development set and test set are used to filter these models.
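Filtering a model with the development and test sets can be sketched as follows: only phrase entries whose source side actually occurs in the sentences to be translated are kept, which shrinks the table that has to be held in memory during tuning and decoding. The sketch assumes whitespace-tokenized text and phrase entries stored as (source phrase, target phrase, probability) tuples; the function names and the `max_len` limit are hypothetical.

```python
def source_ngrams(sentences, max_len):
    """Collect every word sequence of up to `max_len` tokens in the sentences."""
    grams = set()
    for sentence in sentences:
        tokens = sentence.split()
        for n in range(1, max_len + 1):
            for i in range(len(tokens) - n + 1):
                grams.add(" ".join(tokens[i:i + n]))
    return grams


def filter_phrase_table(entries, sentences, max_len=7):
    """Keep entries whose source phrase occurs in the given sentences.

    `entries` is an iterable of (source_phrase, target_phrase, probability)
    tuples with single-space-separated tokens (an assumed representation).
    """
    needed = source_ngrams(sentences, max_len)
    return [entry for entry in entries if entry[0] in needed]
```

Because the filtered table depends on the sentences supplied, it has to be rebuilt whenever the development or test set changes.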
The characteristics of the model are shown in Table 6. The evaluation of machine translation plays an important role in the research of machine translation technology and in market promotion. Manual evaluation refers to human evaluation of the candidate translations produced by a machine translation system according to certain standards and norms. Automatic evaluation uses machines to complete the scoring process, but it requires the scoring results to be as consistent as possible with human scores. The training of machine translation is shown in Figure 7.
Machine translation evaluation, in short, is the evaluation of all aspects of machine translation in order to reflect the achievements and functions of machine translation correctly and objectively. Its significance is to find the problems in the research and development of machine translation systems by evaluating their performance and development level, to define goals, to find solutions, to provide direction for improving existing machine translation systems, and to keep improving their translation quality. The paradigm of machine translation is shown in Table 7.
Manual evaluation is a reliable way to evaluate the performance of a translation system. However, it usually takes time and effort to organize a manual evaluation. The use of automatic evaluation tools can greatly reduce the cost of evaluation, allow system performance to be analyzed in time, support targeted improvement of the system, and shorten the product development cycle. The neural machine translation system is shown in Figure 8.

Conclusions
This article extends the discussion from the perspective of the design of automatic machine translation systems. The machine translation system is a large-scale system composed of several modules, which together complete the translation work. This article makes full use of the existing resources and tools in the literature, briefly describes phrases and phrase translation probability, and integrates these tools and modules. We believe that building a machine translation system based on statistical results is an attempt at something that cannot be accomplished by training translators alone. Automatic machine translation is a complete process that integrates the development of concepts, makes use of existing resources, and adds modules such as repositories and dictionaries. The decision is based on the results of statistical machine translation methods, which can achieve better translation results.

Data Availability
No data were used to support this study.

Conflicts of Interest
The authors declare that they have no conflicts of interest.