A Study of Grammar Analysis in English Teaching with Deep Learning Algorithm

Abstract: In English teaching, grammar is a very important part. Based on the seq2seq model, a grammar analysis method combining the attention mechanism, word embedding and CNN-seq2seq was designed using a deep learning algorithm. The model was trained on NUCLE and tested on CoNLL-2014. The experimental results showed that 5


Introduction
In English teaching, grammar analysis is a very important part. By studying grammar, students can master the structure and rules of English, use the language better, and lay a solid foundation for improving their reading, speaking and writing. This also helps cultivate students' ability to apply English in their future work and study. Therefore, the study of grammar is of great significance in English teaching. However, correcting students' grammar work usually takes teachers a great deal of time and effort. With the development of computer technology, the application of artificial intelligence in education has become more and more popular [1], and methods such as machine learning [2] and deep learning have been widely used in language processing [3], which provides a new possibility for the automatic analysis of English grammar.
Tang et al. [4] designed an online patient question answering system based on deep learning. With that system, patients can find semantically similar questions in existing Q&A records and get answers without waiting for a doctor's reply. The method was based on a deep neural network that predicted the similarity between questions, and its effectiveness was verified through experiments. Chen et al. [5] analyzed the performance of deep learning in classifying radiology text reports. They compared a convolutional neural network (CNN) model with PeFinder on an internal validation data set and found that the CNN reached an accuracy of 99% and an F1 value of 0.938, whereas PeFinder reached 0.867, which verified the reliability of the CNN model. Zhang et al. [6] analyzed event recognition. After sentence segmentation, the words were divided into different categories and vectorized, and a deep belief network (DBN) was then used for training. The test found that the maximum F value of the method was 85.17% and the F value of the dynamic supervision DBN was 88.11%, showing better recognition performance. Yousfi et al. [7] recognized text in Arabic videos with a recurrent neural network (RNN), combined the RNN model with long short-term memory, and introduced hyperparameters to improve the recognition effect. They found that the recognition rate increased by 16% and the response time was relatively reasonable.
This study improved the seq2seq model, trained and tested it, and analyzed its performance in grammar analysis. The work contributes to improving the efficiency of English teaching, reducing the burden on teachers and improving the learning outcomes of students.

Deep Learning Related Concepts

Deep learning
Deep learning is a kind of machine learning and a part of artificial intelligence. It developed from the artificial neural network (ANN). Through learning, it enables machines to acquire certain functions and complete some tasks in place of human beings. It has a wide range of applications in many fields, such as image processing [8], automatic classification [9] and disease prediction [10].

RNN
When processing sequential information, an RNN [11] takes both the current input and the previous inputs into account, which makes it well suited to language processing. Its structure mainly includes (1) an input layer $x$, (2) a hidden layer $h$ and (3) an output layer $y$; the weights of the layers are $w_{xh}$, $w_{hh}$ and $w_{hy}$ respectively. The calculation formulas are: $h_n = f(w_{xh} x_n + w_{hh} h_{n-1})$ and $y_n = w_{hy} h_n$, where $f$ refers to the activation function. It is seen from the formulas that the output of the RNN at time $n$ is related not only to the current input $x_n$, but also to all the previous inputs.
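The recurrence described above can be illustrated with a minimal sketch in plain Python. The toy weights, the choice of tanh as the activation function, the zero initial hidden state and the absence of bias terms are all assumptions made for illustration; they are not taken from the model used in this study.

```python
import math

def matvec(W, v):
    """Multiply matrix W (a list of rows) by vector v."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]

def rnn_forward(xs, W_xh, W_hh, W_hy, f=math.tanh):
    """Run a simple RNN over the sequence xs; return outputs and hidden states."""
    h = [0.0] * len(W_hh)  # assumption: the initial hidden state is the zero vector
    hs, ys = [], []
    for x in xs:
        pre = [a + b for a, b in zip(matvec(W_xh, x), matvec(W_hh, h))]
        h = [f(p) for p in pre]      # hidden state from current input and previous state
        hs.append(h)
        ys.append(matvec(W_hy, h))   # output computed from the hidden state
    return ys, hs

# Hypothetical toy weights: 2-dimensional inputs, 2 hidden units, 1 output.
W_xh = [[0.5, -0.2], [0.1, 0.3]]
W_hh = [[0.4, 0.0], [0.0, 0.4]]
W_hy = [[1.0, -1.0]]
xs = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
ys, hs = rnn_forward(xs, W_xh, W_hh, W_hy)
```

Because each hidden state is fed back into the next step, changing only the first input also changes the last output, which is the dependence on all previous inputs noted above.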

Seq2seq model
The seq2seq model is a kind of deep learning model that maps one sequence to another [12]. Grammar analysis in English teaching can also be seen as a mapping from a text sequence with grammatical errors to the correct text sequence; therefore, the seq2seq model can be used. Seq2seq has an Encoder-Decoder structure, generally built from RNNs, with a terminator <EOS> at the end of each sequence. Taking the translation from the sentence "ABC" to "WXYZ" as an example, the seq2seq structure is shown in Fig. 1.
Let the input sequence be $(x_1, \dots, x_n)$ and the output sequence be $(y_1, \dots, y_{n'})$, where $n$ and $n'$ may not be equal. The RNN is used to estimate the conditional probability $p(y_1, \dots, y_{n'} \mid x_1, \dots, x_n)$. In grammar analysis, the input of seq2seq is the sentence with grammatical errors, and the output is the correct sentence.

Attention mechanism
In seq2seq, if the input sentence is long, detailed information may be lost. To solve this problem, the attention mechanism [13] can be introduced to enable the decoder to automatically focus on different parts of the sentence and improve the decoding ability. It is assumed that the Decoder has a state of $s_n$ at the $n$-th time, the output is $y_n$, and the semantic information is $c_n$, then: $s_n = f(s_{n-1}, y_{n-1}, c_n)$, where $c_n = \sum_j a_{nj} h_j$, $h_j$ is the $j$-th state of the Encoder and $a_{nj}$ is the weight assigned to it at time $n$. In grammatical analysis, the detection and correction of grammatical errors are generally determined by local words. For example, in the selection of articles, the choice between "a" and "an" depends on the following noun. Therefore, after the introduction of the attention mechanism, the seq2seq model can perform better in grammatical analysis.
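The way attention weights turn the Encoder states into one semantic vector can be sketched as follows. This is a minimal illustration, not the implementation used in this study: the dot-product scoring function, the softmax normalization and the toy vectors are assumptions, since the scoring function is not given here.

```python
import math

def softmax(scores):
    """Normalize raw scores into weights that sum to 1."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention_context(s_prev, encoder_states):
    """Return (weights, context) for the previous Decoder state s_prev."""
    # Assumption: the score is a plain dot product between the Decoder state
    # and each Encoder state.
    scores = [sum(a * b for a, b in zip(s_prev, h)) for h in encoder_states]
    weights = softmax(scores)
    dim = len(encoder_states[0])
    # The context is the weighted sum of the Encoder states.
    context = [sum(w * h[i] for w, h in zip(weights, encoder_states))
               for i in range(dim)]
    return weights, context

# Hypothetical toy values: three Encoder states of dimension 2.
encoder_states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
s_prev = [2.0, 0.0]
weights, context = attention_context(s_prev, encoder_states)
```

In this toy case the first and third Encoder states receive equal and larger weights than the second, because they align better with the Decoder state.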

Word embedding
For the computer to process natural language, the text must first be vectorized, i.e., turned into numbers. The GloVe method [14] was used in this study. For a corpus, the co-occurrence matrix is represented by $M$, and the number of times word $j$ appears in the context of word $i$ is represented by $M_{ij}$; its co-occurrence probability is then $P_{ij} = M_{ij} / M_i$, where $M_i = \sum_k M_{ik}$.
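As a small illustration of the co-occurrence statistics that GloVe starts from, the sketch below counts context words in a symmetric window and divides a pair count by the row total. The window size, the unweighted counts and the example sentence are assumptions made for illustration; the full GloVe training procedure is not shown.

```python
from collections import defaultdict

def cooccurrence(tokens, window=2):
    """Count how often each word appears within `window` positions of another."""
    M = defaultdict(lambda: defaultdict(int))
    for pos, word in enumerate(tokens):
        lo = max(0, pos - window)
        hi = min(len(tokens), pos + window + 1)
        for ctx in range(lo, hi):
            if ctx != pos:
                M[word][tokens[ctx]] += 1
    return M

def cooccurrence_probability(M, i, j):
    """Count of word j in the context of word i, divided by all context counts of i."""
    M_i = sum(M[i].values())
    return M[i][j] / M_i if M_i else 0.0

# Hypothetical example sentence.
tokens = "she has an apple and he has a pear".split()
M = cooccurrence(tokens, window=1)
p = cooccurrence_probability(M, "has", "an")
```

Here "has" occurs twice and has four context words in total ("she", "an", "he", "a"), so the probability of "an" given "has" is one quarter.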

CNN-seq2seq
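Compared with RNN, CNN has higher efficiency in capturing local information and shorter training time. In grammatical analysis, grammatical errors are generally local. Therefore, CNN can be used instead of RNN, and CNN-seq2seq is obtained.
In the CNN-seq2seq model, both the Decoder and the Encoder use CNNs with a gated linear unit (GLU) as the nonlinear part, and the attention mechanism is added at the same time. It is assumed that the word vector of the input sentence is $e = (e_1, \dots, e_m)$, the position vector is $p = (p_1, \dots, p_m)$, and the word vector of the output sentence is $g = (g_1, \dots, g_n)$. The output of the $l$-th layer of the Encoder is $z^l = (z^l_1, \dots, z^l_m)$. The output of the $l$-th layer of the Decoder is $q^l = (q^l_1, \dots, q^l_n)$. The convolution kernel is $W \in \mathbb{R}^{2d \times kd}$, so each convolution over a window of $k$ words of dimension $d$ produces a vector $[A\ B] \in \mathbb{R}^{2d}$. The formula of the GLU can be expressed as: $v([A\ B]) = A \otimes \sigma(B)$, where $\otimes$ stands for the element-wise product and $\sigma$ is the sigmoid function. The attention of the $l$-th Decoder layer is computed as $a^l_{ij} = \exp(d^l_i \cdot z_j) / \sum_{t=1}^{m} \exp(d^l_i \cdot z_t)$ with $d^l_i = W^l_d q^l_i + b^l_d + g_i$, and the resulting context is $c^l_i = \sum_{j=1}^{m} a^l_{ij} (z_j + e_j)$, where $g_i$ stands for the word vector information before $q^l_i$, $e_j$ stands for the word vector information of the input word, $z_j$ is the output of the last Encoder layer at position $j$, and $a^l_{ij}$ stands for the weight. CNN-seq2seq can fully find the hidden information in sentences and has a high computing speed. In addition, it combines the GLU and the attention mechanism, showing a better performance in grammatical analysis.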
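The gating step of the GLU can be sketched in a few lines. The sketch assumes the common formulation in which a convolution output of size 2d is split into two halves A and B and A is multiplied element-wise by the sigmoid of B; the function names and the toy input are hypothetical.

```python
import math

def sigmoid(x):
    """Logistic function; maps any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def glu(y):
    """Split y (length 2d) into halves A and B and return A * sigmoid(B) element-wise."""
    if len(y) % 2 != 0:
        raise ValueError("GLU input must have an even length (2d)")
    d = len(y) // 2
    A, B = y[:d], y[d:]
    # sigmoid(B) acts as a gate: values near 1 keep A, values near 0 suppress it.
    return [a * sigmoid(b) for a, b in zip(A, B)]

# Hypothetical convolution output with d = 2: A = [2.0, -1.0], B = [0.0, 100.0].
out = glu([2.0, -1.0, 0.0, 100.0])
```

With a gate input of 0 the first component is halved, while a very large gate input lets the second component pass almost unchanged.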

Experimental Results

Experimental data set
Training set: The NUCLE data set [15] of the National University of Singapore was used. It consists of essays written by non-native English-speaking students on given topics, such as environmental pollution and medical health. There are 1397 articles in total in the data set, which were marked and corrected by professional English teachers. Some errors are shown in Table 1.
Test set: CoNLL-2014 was used as the test set [16], in which there are 1312 sentences. CoNLL-2014 covers 28 kinds of grammatical errors, and the errors have been marked and corrected by two native English speakers. Some errors are shown in Table 2.

Evaluation criteria
The evaluation of CoNLL-2014 takes $F_{0.5}$ as the standard, and its calculation formula is as follows: $P = \frac{A}{A + C}$, $R = \frac{A}{A + B}$, $F_{0.5} = \frac{(1 + 0.5^2) \times P \times R}{0.5^2 \times P + R}$, where $A$ stands for the correctly marked sentence, $B$ stands for the wrong sentence that is not marked out, and $C$ stands for the correct sentence which is marked as wrong.
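The metric can be computed directly from the three counts. The sketch below assumes the standard definitions of precision, recall and the F-beta score with beta = 0.5; the counts in the example are hypothetical and are not results from this study.

```python
def f_beta(a, b, c, beta=0.5):
    """Return (precision, recall, F_beta).

    a: correctly marked sentences
    b: wrong sentences that were not marked
    c: correct sentences that were marked as wrong
    """
    precision = a / (a + c) if a + c else 0.0
    recall = a / (a + b) if a + b else 0.0
    if precision == 0.0 and recall == 0.0:
        return precision, recall, 0.0
    beta2 = beta ** 2
    f = (1 + beta2) * precision * recall / (beta2 * precision + recall)
    return precision, recall, f

# Hypothetical counts for illustration only.
p, r, f = f_beta(a=60, b=40, c=20)
```

With beta = 0.5 the score weights precision more heavily than recall, so a system that marks fewer correct sentences as wrong is rewarded more than one that simply finds more errors.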

Experimental results
This study analyzed the performance of the grammar analysis method that combined the attention mechanism, word embedding and CNN-seq2seq, and compared four methods on CoNLL-2014: the basic seq2seq, seq2seq + attention, CAMB [17] and the method designed in this study. The results are shown in Fig. 2.

Fig. 2. Comparison of analysis results between different methods
It was seen from Fig. 2 that the basic seq2seq performed relatively poorly in grammar analysis: its P and R values were low, and its $F_{0.5}$ was only 21.27%, which indicated that the method frequently marked correct sentences as errors or failed to mark wrong sentences. After the attention mechanism was added, the effect of grammar analysis improved significantly: the P value increased by 36.91%, the R value increased by 29.28%, and $F_{0.5}$ reached 28.38%, a relative increase of 33.43% over the basic seq2seq, which verified that the attention mechanism improved the model. Compared with seq2seq + attention, the R value of CAMB improved significantly, to about twice that of the former, indicating that CAMB performed better in identifying sentence errors, although its precision was slightly lower. The P, R and $F_{0.5}$ values of the method proposed in this study were all large; compared with CAMB, the P value improved by 59.33%, the R value by 8.9%, and $F_{0.5}$ by 42.91%, which verified the performance of the proposed method in grammar analysis.
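The improvement figures above appear to be relative changes rather than differences in percentage points. The short check below reproduces the 33.43% figure from the two reported $F_{0.5}$ values; reading the other reported improvements in the same way is an assumption, since the underlying P and R values are only shown in Fig. 2.

```python
def relative_improvement(baseline, improved):
    """Relative change from baseline to improved, in percent."""
    return (improved - baseline) / baseline * 100.0

# F0.5 of the basic seq2seq (21.27%) and of seq2seq + attention (28.38%).
gain = relative_improvement(21.27, 28.38)
```

Rounded to two decimal places, the result matches the reported 33.43%.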
The method proposed in this study was then used to correct students' actual grammar homework, and 100 sentences were collected. Some results of the grammar analysis are shown in Table 3. In (1), the method corrected a modal verb error; in (2), it corrected a subject-predicate agreement error; in (3), it corrected a verb tense error; in (4), it added the correct article "a"; and in (5), it corrected a singular/plural noun error. In (6), however, the article "a" was redundant and should have been deleted, but the method output the sentence as correct. Among these 100 sentences, 71 were found and corrected correctly. Although there were cases of misjudgment and missed judgment, the overall performance was good.

Discussion
With the progress of computers and the development of artificial intelligence technology, computers have been more and more widely used in the processing of natural language. A well-designed computer program can analyze and process human language, for example classifying texts, translating between languages [18], analyzing the emotions contained in language [19], mining opinions [20] and reading text automatically. In English teaching, grammar is a difficult but very important part, which directly affects listening, speaking, writing, etc. [21]. Students' homework often contains errors such as tense errors and misuse of words, and teachers inevitably make mistakes or omissions when correcting it. If teachers can use computer technology to detect and correct the grammatical errors in students' work, it will be of great positive significance for both teachers and students.
Based on the deep learning method, this study designed a grammar analysis method that combined the attention mechanism, word embedding and the CNN-seq2seq model. It was trained on NUCLE and tested on CoNLL-2014. The comparison with other models showed that the proposed method performed well in grammar analysis. Firstly, after the attention mechanism was added, the $F_{0.5}$ of the basic seq2seq improved by 33.43%, which verified the effectiveness of the attention mechanism. Secondly, the comparison with CAMB showed that the P value and R value of the proposed method increased by 59.33% and 8.9% respectively, and $F_{0.5}$ improved by 42.91%, which showed that the method performed well in marking and correcting grammatical errors and could achieve accurate grammatical analysis. Finally, in the analysis of students' actual grammar homework, the proposed method accurately found and corrected most grammatical errors in the sentences, although there were some wrong and missed corrections.
Although this study has made some achievements in deep learning-based grammar analysis for English teaching, there are still some shortcomings that need to be solved in future work, such as: 1. Further improving the recall rate of the grammatical analysis method. 2. Expanding the scale of the data set to train the model better. 3. Considering the performance of other deep learning methods in grammar analysis.

Conclusion
Based on the seq2seq model in deep learning, this study designed a grammar analysis method that combined the attention mechanism, word embedding and CNN-seq2seq, and trained and tested it. The results showed that: 1. The addition of the attention mechanism significantly improved the accuracy of the model. 2. Compared with CAMB, the P value and R value of the proposed method were 59.33% and 8.9% larger respectively, and $F_{0.5}$ was 42.91% larger. 3. The proposed method also performed well in the analysis of students' actual grammar homework.

In the seq2seq model, $Z$ stands for the training set, $I$ is the input sequence, and $O$ is the output sequence, i.e., the translated sentence. The translation result $\hat{O}$ can be obtained through the decoder RNN: $\hat{O} = \arg\max_{O} p(O \mid I)$.
In the attention mechanism, $a_{nj}$ is the semantic vector weight at time $n$ and position $j$; it describes the relationship between the state of the Decoder at time $n-1$ and the $j$-th state of the Encoder.
In the GloVe co-occurrence probability, $M_i$ stands for the number of times any word appears in the context of word $i$.
In CNN-seq2seq, $k$ stands for the size of the convolutional window and $d$ stands for the dimension. Each convolution generates a vector $[A\ B]$, and $\sigma(B)$ stands for the extent to which parts of $A$ need to be retained. The attention weight of the model is determined by the current output $q^l_i$ of the Decoder and all the outputs of the Encoder. This weight is used to weight the outputs of the Encoder to obtain the input sentence vector.
iJET, Vol. 15, No. 18, 2020

Table 3. The analysis results of actual grammar homework of students