Multi-label learning for improving discretely-modulated continuous-variable quantum key distribution

We propose a novel scheme for discretely-modulated continuous-variable quantum key distribution (CVQKD) using machine learning technologies, called multi-label learning-based CVQKD (ML-CVQKD). In particular, the proposed scheme divides the whole quantum system into a state learning process and a state prediction process. The former is used for training and evaluating the classifier, and the latter is used for generating the final secret key. A multi-label classification algorithm (MLCA) is also designed as an embedded classifier for distinguishing coherent states. Feature extraction for coherent states and related machine learning-based metrics for the quantum classifier are then suggested. Security analysis based on the linear bosonic channel assumption shows that MLCA-embedded ML-CVQKD outperforms other existing discretely-modulated CVQKD protocols, such as the four-state protocol and the eight-state protocol, as well as the original Gaussian-modulated CVQKD protocol, and that its performance is further enhanced as the modulation variance increases.


Introduction
For decades, continuous-variable quantum key distribution (CVQKD) [1,2] has been a hotspot in quantum communication and quantum cryptography. It allows two distant legitimate partners, Alice and Bob, to share a random secure key over insecure quantum and classical channels. One advantage of CVQKD is that it is compatible with most state-of-the-art telecommunication technologies, so that a CVQKD system can be applied to practical communication networks already in use. CVQKD protocols have been shown to be secure against arbitrary collective attacks, which are optimal in both the asymptotic limit [3][4][5][6] and the finite-size regime [7,8]. Recently, CVQKD has further been proved secure against collective attacks in the composable security framework [9].
In general, there are two modulation approaches in CVQKD, i.e., Gaussian modulation [10,11] and discrete modulation [12,13]. In the first approach, Alice usually encodes key bits in the quadratures ($p$ and $q$) of the optical field [14], while Bob restores the secret key bits through high-speed and high-efficiency coherent detection. This strategy usually has a repetition rate higher than that of single-photon detection, so that Gaussian-modulated CVQKD could potentially achieve a higher secret key rate. However, it is limited to much shorter distances because the reconciliation efficiency is quite low in long-distance transmission. The second approach generates several nonorthogonal coherent states and exploits the sign of the measured quadrature of each state to encode information. This modulation strategy has a discrete rather than continuous set of choices for the $p$ and $q$ quadratures, which allows most efficient error-correcting codes to be used even at low signal-to-noise ratio (SNR). However, a very small modulation variance is needed to keep the protocol secure.
State preparation. Alice first prepares a modulated coherent state and sends it to Bob through an untrusted quantum channel.
Measurement. Bob then measures the incoming state with coherent detection, so that Alice and Bob share two correlated sets of variables, i.e., the raw key.
Reconciliation. If Gaussian modulation is used, the raw key first needs to be discretized; a linear error-correcting code, usually a low-density parity-check (LDPC) code [38], is then exploited to reconcile the data between Alice and Bob.
Parameter estimation. Bob sends part of his data to Alice, which allows her to infer the characteristics of the quantum channel and compute the covariance matrix of the quantum system.
Privacy amplification. Alice and Bob apply a random hash function to their respective strings so that they obtain two identical strings, i.e., the secret key.
The above process is based on information theory: one needs to correct the errors and evaluate the quality of the quantum channel, which makes reconciliation and parameter estimation the most crucial steps. Therefore, part of the resources, such as the generated raw key and the storage and computing capacity of the devices, inevitably has to be sacrificed for these two steps. Moreover, a small modulation variance is needed to keep discretely-modulated CVQKD secure; otherwise an eavesdropper may launch the intercept-resend attack without being detected [6].

Process of ML-CVQKD
Multi-label learning involves three modules, i.e., training, testing and prediction, which are responsible for modeling, evaluation and classification, respectively. Specifically, to construct the classifier, a training set is first used for learning the classification rules. Subsequently, another set of data, called the testing set, is exploited to evaluate the classifier's performance. Finally, the trained classifier can be used for predicting unknown data if the evaluation is passed. Inspired by this process, we propose ML-CVQKD, which includes two parts: state learning and state prediction. As shown in figure 2, each step is explained as follows.

State learning
Step 1. Alice first prepares modulated coherent states and sends them to Bob through a noisy and lossy quantum channel. The label information of each state is also sent to Bob via an auxiliary classical channel.
Step 2. Bob measures the incoming states with a coherent detector and thereby obtains the measurement results. Note that these results are similar, but not identical, to the modulated information sent by Alice, because the transmitted signals are inevitably distorted by negative effects such as channel noise and loss.
Step 3. Bob then extracts features from the obtained labeled coherent states; these features are used for training and testing the classifier.
Step 4. After collecting sufficient featured data, Bob divides them into two datasets, i.e., a training set and a testing set. The former is used for training the classifier, and the latter is used to evaluate the classifier's performance. Finally, a well-behaved classifier for ML-CVQKD is obtained if the testing is passed.
Once the classifier has been trained successfully, state learning does not need to be performed repeatedly. The system is then ready for generating the secret key.

State prediction
Step 1. Alice prepares modulated coherent states and sends them to Bob through an untrusted quantum channel.
Step 2. Bob obtains the (unlabeled) measurement results by measuring the received states with a coherent detector.
Step 3. Bob first extracts features from the obtained unknown data; these features are subsequently used as input data for the prepared classifier.
Step 4. Bob classifies the input data using the well-behaved classifier, so that he can predict the state that Alice sent to him. After many rounds of prediction, Alice and Bob share a key string.
Step 5. A linear error-correcting code, usually an LDPC code, is applied to correct the data between Alice and Bob.
Step 6. Bob sends part of his data to Alice, which allows her to infer the characteristics of the quantum channel and compute the covariance matrix of the quantum system.
Step 7. To further enhance the security, Alice and Bob apply a random hash function to their respective strings. Finally, they obtain two identical strings, i.e., the secret key.
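As an illustration of the hashing in Step 7, the following minimal sketch compresses a reconciled bit string with a random binary Toeplitz matrix. The choice of Toeplitz hashing is an assumption made here for illustration; the text only states that a random hash function is applied. The function name `toeplitz_hash` and the `seed` argument (standing in for publicly shared randomness) are hypothetical, and Python's `random` module is not cryptographically secure, so this is not a production implementation.

```python
import random

def toeplitz_hash(bits, out_len, seed):
    """Compress a bit list with a random binary Toeplitz matrix (arithmetic mod 2)."""
    n = len(bits)
    rng = random.Random(seed)  # illustrative only, not a secure randomness source
    # A Toeplitz matrix is fixed by its first row and first column,
    # i.e. by n + out_len - 1 random bits in total.
    diagonals = [rng.getrandbits(1) for _ in range(n + out_len - 1)]
    # Entry (i, j) depends only on i - j, which is what makes the matrix Toeplitz.
    return [
        sum(diagonals[i - j + n - 1] & bits[j] for j in range(n)) % 2
        for i in range(out_len)
    ]

# Alice and Bob hold identical reconciled strings and use the same seed,
# so they end up with identical shorter strings.
reconciled = [1, 0, 1, 1, 0, 0, 1, 0, 1, 1, 1, 0]
alice_key = toeplitz_hash(reconciled, out_len=4, seed=2024)
bob_key = toeplitz_hash(list(reconciled), out_len=4, seed=2024)
print(alice_key == bob_key)
```

The output length `out_len` would in practice be chosen from the estimated secret key rate; here it is an arbitrary value.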
Similar to the basic assumption of QKD that neither Alice's side nor Bob's side can be compromised (device-independent QKD excluded), the steps of state learning have to be carried out without eavesdropping. That is to say, state learning is assumed to be trusted; otherwise Eve may have the opportunity to learn about the classifier, which results in information leakage. This assumption can be implemented through security monitoring of the communication system when the proposed scheme is initially deployed. We do not detail it here, as the monitoring approach is not the key point of this paper. Even so, we point out that an efficient way of monitoring state learning is necessary, and we leave this issue to future study. On the other hand, the steps of state prediction look similar to the conventional CVQKD process, but they are quite different. First, the steps of feature extraction and classification are added to the process; these two steps are the core of the ML-CVQKD scheme. Second, the data format is different: a coherent state is represented by several robust features rather than by the quadratures themselves, and these features help to improve the performance of the quantum classifier.

Multi-label classification algorithm
MLCA is derived from a traditional lazy learning approach called k-nearest neighbor (kNN) [39]. It selects the k nearest neighbors (NNs) of each unknown data point. Based on the number of neighboring data points belonging to each possible class, the maximum a posteriori principle can be exploited to allocate a label to the unknown data point. Figure 3 depicts an example of the kNN approach in feature space. The green circle denotes an unknown data point, while triangles and rectangles represent labeled data points that belong to the red class and the yellow class, respectively. The green circle is assigned to the red class for k = 3, since two of the three nearest labeled data points belong to the red class while only one belongs to the yellow class. Similarly, the green circle is labeled as yellow class for k = 7, since the majority of the seven nearest labeled data points then belong to the yellow class.
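The dependence of the kNN decision on k can be sketched with a plain majority vote. The data points below are hypothetical and are not read off figure 3; they are only arranged so that the vote flips between k = 3 and k = 7, and the function name `knn_predict` is likewise an illustrative choice.

```python
import math
from collections import Counter

def knn_predict(query, labeled_points, k):
    """Return the majority class among the k labeled points nearest to query."""
    # Rank all labeled points by Euclidean distance to the query point.
    ranked = sorted(labeled_points, key=lambda item: math.dist(query, item[0]))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]

# Hypothetical toy data: two "red" points and one "yellow" point lie close to
# the query, four more "yellow" points lie further out.
points = [
    ((0.5, 0.0), "red"), ((0.0, 0.6), "red"), ((-0.7, 0.0), "yellow"),
    ((2.0, 0.0), "yellow"), ((0.0, 2.1), "yellow"),
    ((-2.2, 0.0), "yellow"), ((0.0, -2.3), "yellow"),
    ((5.0, 5.0), "red"),
]
query = (0.0, 0.0)
print(knn_predict(query, points, k=3))  # red
print(knn_predict(query, points, k=7))  # yellow
```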
kNN classifies unknown data in feature space, which suggests that coherent states can be classified in phase space. As shown in figure 4(a), the phase space is divided into several regions, which are labeled as $L_i$ ($i = 1, 2, 3, 4$) according to the quadrant in which they are located. Each QPSK-modulated coherent state belongs to a single label, which can be regarded as the quantum single-label learning problem depicted in figure 4(c). However, single-label learning is not suitable for higher-order modulation. As an example, figure 4(b) shows the phase space representation of coherent states with 8PSK modulation. Among them, some of the 8PSK-modulated coherent states, such as $|\alpha_2\rangle$, $|\alpha_4\rangle$, $|\alpha_6\rangle$ and $|\alpha_8\rangle$, simultaneously belong to multiple labels, which can be generalized to the multi-label learning problem depicted in figure 4(d).
In fact, single-label learning is a special case of multi-label learning, so that both can be described by the multi-label learning model. In what follows, we detail the proposed MLCA for addressing the generalized multi-label CVQKD model. Without loss of generality, we consider the algorithm for eight-state CVQKD, since it is the simplest multi-label modulation scheme. We note that the proposed MLCA can also be extended to other, more complicated modulation schemes.
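The quadrant-based labeling can be illustrated with a short sketch. It assumes that the eight states sit at phases $k\pi/4$ ($k = 1, \ldots, 8$) with $p$ as the horizontal axis, and that each label region is a closed quadrant, so that a state lying on an axis carries the labels of both adjacent quadrants. This layout and the helper names are assumptions made for illustration, not taken from figure 4.

```python
import math

def quadrant_labels(p, q, tol=1e-9):
    """Return the labels (1..4) of all closed quadrants that contain (p, q)."""
    labels = set()
    if p >= -tol and q >= -tol:
        labels.add(1)
    if p <= tol and q >= -tol:
        labels.add(2)
    if p <= tol and q <= tol:
        labels.add(3)
    if p >= -tol and q <= tol:
        labels.add(4)
    return labels

def psk8_states(amplitude=1.0):
    """Assumed layout: state k (k = 1..8) sits at phase k*pi/4."""
    return {
        k: (amplitude * math.cos(k * math.pi / 4),
            amplitude * math.sin(k * math.pi / 4))
        for k in range(1, 9)
    }

for k, (p, q) in psk8_states().items():
    print(f"alpha_{k}: labels {sorted(quadrant_labels(p, q))}")
```

Under this assumed layout the odd-indexed states carry one label and the even-indexed states carry two, which is the situation that motivates the multi-label model.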

Feature extraction for coherent state
As is well known, feature extraction is an important data-preprocessing step in machine learning, since a set of suitable features can significantly enhance classification performance. The more features are extracted, the more details about the object can be obtained. However, there are few apparent features to describe a modulated coherent state, except for a few attributes such as the p-quadrature, the q-quadrature and the modulation variance $V_m$.
To solve the above-mentioned problem, we construct a set of distance features for each coherent state. As shown in figure 5, Alice sends a modulated coherent state through an untrusted quantum channel (usually a single-mode fiber, SMF), and the transmitted coherent state is then received by Bob. Note that the transmitted state is no longer identical to its initial modulated state due to the phase drift ($\theta' \neq \theta$) and energy attenuation ($p'^2 + q'^2 < p^2 + q^2$) caused by channel noise and loss. Subsequently, a number of virtual states (which we call reference states) are set for calculating the similarity between the transmitted state and the reference states. In particular, the similarity can be measured by the Euclidean metric, which is the straight-line distance between two points in Euclidean space [40]. In Cartesian coordinates, let $y = (y_1, y_2, \ldots, y_n)$ and $z = (z_1, z_2, \ldots, z_n)$ be two points in Euclidean n-dimensional space; the distance $d$ between $y$ and $z$ is given by

$$d(y, z) = \sqrt{\sum_{i=1}^{n} (y_i - z_i)^2}. \tag{1}$$

Specifically, in the two-dimensional phase space we have

$$d_r = \sqrt{(p' - p_r)^2 + (q' - q_r)^2}, \quad r = 1, 2, \ldots, w,$$

where $w$ is the number of reference states, and $t = (p', q')$ and $r = (p_r, q_r)$ are the respective Cartesian points of the transmitted state and the $r$th reference state. After that, we can extract a feature vector $d = (d_1, d_2, \ldots, d_w)$ that better describes each transmitted state.
As mentioned above, reference states are a set of virtual states that do not really exist, and hence one does not need to prepare them at Bob's side. In general, reference states are set to be identical to the initial modulated states, which helps us to investigate the influence of the imperfect channel on the transmitted state.
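The distance features can be sketched as follows: a received phase-space point is mapped to the vector of its Euclidean distances to the $w$ reference points. The evenly spaced circular layout of the reference states, the amplitude value and the function names are assumptions chosen for illustration.

```python
import math

def reference_states(w=8, amplitude=1.0):
    """Assumed layout: w reference points evenly spaced on a circle in phase space."""
    return [
        (amplitude * math.cos(2 * math.pi * r / w),
         amplitude * math.sin(2 * math.pi * r / w))
        for r in range(1, w + 1)
    ]

def distance_features(received, references):
    """Map a received point (p', q') to the feature vector (d_1, ..., d_w)."""
    p_rx, q_rx = received
    return [math.hypot(p_rx - p_r, q_rx - q_r) for p_r, q_r in references]

refs = reference_states(w=8, amplitude=2.0)
features = distance_features((1.2, 1.1), refs)
print([round(d, 3) for d in features])
```

The smallest entry of the feature vector points to the reference state closest to the received point, while the remaining entries describe how far the point has drifted relative to the whole constellation.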

Multi-label classifier for ML-CVQKD
After extracting robust features, these features are used as the input data of the classifier for state learning. Let $\mathcal{X} = \mathbb{R}^d$ be the $d$-dimensional data space and $\mathcal{Y} = \{y_1, y_2, \ldots, y_l\}$ be the label space containing $l$ categories. A training set is given by $\mathcal{D} = \{(x_i, Y_i) \mid 1 \le i \le n\}$, where $n$ is the number of training samples, $x_i = (x_{i1}, x_{i2}, \ldots, x_{id})^T \in \mathcal{X}$ is a feature vector and $Y_i \subseteq \mathcal{Y}$ is the set of labels to which $x_i$ belongs. The task of the learning system is to find a multi-label classifier $h(\cdot): \mathcal{X} \to 2^{\mathcal{Y}}$, where $2^{\mathcal{Y}}$ is the power set of $\mathcal{Y}$. Namely, for a given threshold function $t: \mathcal{X} \to \mathbb{R}$ and a real-valued scoring function $f(\cdot, \cdot)$,

$$h(x) = \{y_j \mid f(x, y_j) > t(x),\ y_j \in \mathcal{Y}\}.$$

Let $|x\rangle$ be an unlabeled coherent state, and let $\mathcal{N}(|x\rangle)$ denote the set of the $k$ nearest coherent states of $|x\rangle$ in the training set. The following statistic is calculated:

$$C_j = \sum_{(|x^*\rangle, Y^*) \in \mathcal{N}(|x\rangle)} [[y_j \in Y^*]],$$

where $(|x^*\rangle, Y^*)$ denotes a labeled coherent state in the training set that belongs to $\mathcal{N}(|x\rangle)$, and $[[\cdot]]$ equals 1 if the enclosed statement holds and 0 otherwise, so that $C_j$ counts the number of neighbors of $|x\rangle$ belonging to the $j$th category $y_j$ ($1 \le j \le l$).

Let $H_j$ represent the event that coherent state $|x\rangle$ has label $y_j$. Then $P(H_j \mid C_j)$ denotes the posterior probability that $H_j$ is true under the condition that $C_j$ labeled coherent states in $\mathcal{N}(|x\rangle)$ have label $y_j$. Accordingly, $P(\bar{H}_j \mid C_j)$ denotes the posterior probability that $H_j$ is false under the same condition. Letting $f(|x\rangle, y_j) = P(H_j \mid C_j)/P(\bar{H}_j \mid C_j)$, the quantum multi-label classifier can be expressed by

$$h(|x\rangle) = \{y_j \mid f(|x\rangle, y_j) > t(|x\rangle),\ 1 \le j \le l\}.$$

In other words, the unlabeled coherent state $|x\rangle$ is assigned to category $y_j$ when the posterior probability $P(H_j \mid C_j)$ is greater than $t(|x\rangle) \cdot P(\bar{H}_j \mid C_j)$. Specifically, based on Bayes' theorem [34], the function $f(|x\rangle, y_j)$ can be rewritten as

$$f(|x\rangle, y_j) = \frac{P(H_j)\, P(C_j \mid H_j)}{P(\bar{H}_j)\, P(C_j \mid \bar{H}_j)}, \tag{5}$$

where $P(H_j)$ and $P(\bar{H}_j)$ respectively represent the prior probability that event $H_j$ is true or false, and $P(C_j \mid H_j)$ and $P(C_j \mid \bar{H}_j)$ respectively represent the conditional probability that $C_j$ labeled coherent states in $\mathcal{N}(|x\rangle)$ have label $y_j$ under the condition that event $H_j$ is true or false.

The probabilities in equation (5) can be estimated by frequency counting in the training set. In particular, the prior probabilities can be calculated by

$$P(H_j) = \frac{s + \sum_{i=1}^{n} [[y_j \in Y_i]]}{2s + n}$$

and

$$P(\bar{H}_j) = 1 - P(H_j),$$

where $s$ is a smoothing parameter controlling the weight of the uniform prior distribution in the probability estimates; it is usually set to 1 for Laplace smoothing. Unlike the prior probabilities, the estimation of the conditional probabilities in equation (5) is more involved. For the $j$th category $y_j$ ($1 \le j \le l$), we calculate two arrays $\varsigma_j$ and $\tilde{\varsigma}_j$, each of which contains $k + 1$ elements given by

$$\varsigma_j[r] = \sum_{i=1}^{n} [[y_j \in Y_i]] \cdot [[\psi_j(|x_i\rangle) = r]], \quad 0 \le r \le k,$$

and

$$\tilde{\varsigma}_j[r] = \sum_{i=1}^{n} [[y_j \notin Y_i]] \cdot [[\psi_j(|x_i\rangle) = r]], \quad 0 \le r \le k,$$

where $\psi_j(|x_i\rangle)$ counts the number of neighbors that belong to category $y_j$ among the $k$ NNs of the $i$th coherent state. Correspondingly, $\varsigma_j[r]$ counts the number of coherent states that belong to category $y_j$ themselves and have exactly $r$ neighbors belonging to category $y_j$ among their $k$ neighbors, while $\tilde{\varsigma}_j[r]$ counts the number of coherent states that do not belong to category $y_j$ and have exactly $r$ neighbors belonging to category $y_j$ among their $k$ neighbors. Consequently, the conditional probabilities in equation (5) can be calculated by

$$P(C_j \mid H_j) = \frac{s + \varsigma_j[C_j]}{s(k + 1) + \sum_{r=0}^{k} \varsigma_j[r]}$$

and

$$P(C_j \mid \bar{H}_j) = \frac{s + \tilde{\varsigma}_j[C_j]}{s(k + 1) + \sum_{r=0}^{k} \tilde{\varsigma}_j[r]},$$

where $1 \le j \le l$ and $0 \le C_j \le k$. Finally, a well-behaved multi-label classifier $h(|x\rangle)$ for ML-CVQKD is obtained by successful state learning.

Compared with the state-discrimination detector reported in our previous work [34], the proposed MLCA has several advantages. The most obvious merit is that MLCA is able to address the multi-label learning model, so that it is suitable for high-dimensional modulation strategies. In essence, MLCA is part of the data processing in ML-CVQKD, so that it can be run without any extra device or component. Moreover, MLCA-embedded ML-CVQKD can further improve the performance of the quantum communication system; the detailed analysis is given in the next section.
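The counting rules described above can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation: labels are indexed from 0, the threshold is a constant (1 by default), each training point is excluded from its own neighborhood during training, and the toy training set (noisy points around eight states at phases $k\pi/4$, labeled by closed quadrant) is hypothetical. The function names are illustrative.

```python
import math
import random

def neighbour_counts(x, data, k, n_labels, skip=None):
    """C_j for each label j: how many of the k nearest training points carry label j."""
    order = sorted((i for i in range(len(data)) if i != skip),
                   key=lambda i: math.dist(x, data[i][0]))
    counts = [0] * n_labels
    for i in order[:k]:
        for j in data[i][1]:
            counts[j] += 1
    return counts

def train_mlca(data, k, n_labels, s=1.0):
    """Estimate the smoothed priors and the two count arrays by frequency counting."""
    n = len(data)
    prior = [(s + sum(j in labels for _, labels in data)) / (2 * s + n)
             for j in range(n_labels)]
    # has[j][r] / lacks[j][r]: training points with / without label j that have
    # exactly r neighbours carrying label j.
    has = [[0] * (k + 1) for _ in range(n_labels)]
    lacks = [[0] * (k + 1) for _ in range(n_labels)]
    for i, (x, labels) in enumerate(data):
        counts = neighbour_counts(x, data, k, n_labels, skip=i)
        for j in range(n_labels):
            (has if j in labels else lacks)[j][counts[j]] += 1
    return {"data": data, "k": k, "s": s, "prior": prior, "has": has, "lacks": lacks}

def predict_mlca(model, x, threshold=1.0):
    """Return the set of labels j whose likelihood ratio f(x, y_j) exceeds the threshold."""
    k, s = model["k"], model["s"]
    counts = neighbour_counts(x, model["data"], k, len(model["prior"]))
    predicted = set()
    for j, c in enumerate(counts):
        p_true = (s + model["has"][j][c]) / (s * (k + 1) + sum(model["has"][j]))
        p_false = (s + model["lacks"][j][c]) / (s * (k + 1) + sum(model["lacks"][j]))
        ratio = model["prior"][j] * p_true / ((1 - model["prior"][j]) * p_false)
        if ratio > threshold:
            predicted.add(j)
    return predicted

# Hypothetical toy training set: labels 0..3 stand for the four quadrants,
# and states on an axis carry the labels of both adjacent quadrants.
rng = random.Random(0)
label_sets = {1: {0}, 2: {0, 1}, 3: {1}, 4: {1, 2}, 5: {2}, 6: {2, 3}, 7: {3}, 8: {3, 0}}
train = []
for state, labels in label_sets.items():
    for _ in range(30):
        p = 3 * math.cos(state * math.pi / 4) + rng.gauss(0, 0.2)
        q = 3 * math.sin(state * math.pi / 4) + rng.gauss(0, 0.2)
        train.append(((p, q), labels))

model = train_mlca(train, k=7, n_labels=4)
print(predict_mlca(model, (0.1, 2.9)))  # a point near the state at phase pi/2
```

For brevity the sketch classifies raw phase-space points; the same functions accept distance feature vectors of any dimension, since `math.dist` works on points of equal length.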

Analysis and discussion
In this section, we elaborate on the performance and security of the proposed MLCA-embedded ML-CVQKD system. We first interpret the prepared data after feature extraction, and then present the performance analysis of MLCA with several machine learning-based metrics. Security and comparison are subsequently presented.

Data preprocessing
For simplicity, we limit the analysis to a linear quantum channel, so that the quantum channel of fiber-based one-way quantum key distribution can be regarded as a mapping function, which can be described as [41]

$$q' = \sqrt{T}\,(q \cos\phi_0 + p \sin\phi_0) + \varepsilon,$$

where $\phi_0 = |\theta - \theta'|$ is the phase drift during transmission and $\varepsilon$ follows a Gaussian distribution $\mathcal{N}(0, N_0 + T\xi)$. Figure 6 shows $10^4$ data points of 8PSK-modulated coherent states in phase space after passing through a 20 km fiber-based quantum channel. Due to channel loss and noise, the transmitted states are scattered in phase space according to a certain probability distribution. As can be seen, these scattered points are hard to distinguish in their initial format and therefore cannot be directly used as input data for MLCA. After feature extraction, figure 7 shows that each coherent state is mapped to an eight-dimensional vector by calculating the Euclidean distance between the data point and each reference state. We observe that most distance values of the feature vectors range from 0 to 9.5, while a few feature vectors contain higher distance values. These high-value feature vectors correspond to the edge outliers in figure 6, which degrade performance. Therefore, a threshold function can be used to filter them: feature vectors with a feature value beyond the black line in figure 7 should be discarded to improve performance.
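A minimal simulation of this preprocessing might look as follows. Several ingredients are assumptions rather than statements from the text: the expression for $p'$ is taken as the counterpart of the same rotation that gives $q'$, the fiber transmittance uses the common 0.2 dB/km loss model, and the numerical values of $N_0$ and $\xi$ as well as the filter threshold are placeholders.

```python
import math
import random

def fibre_transmittance(length_km, loss_db_per_km=0.2):
    """Assumed standard fibre loss model: T = 10^(-loss * L / 10)."""
    return 10 ** (-loss_db_per_km * length_km / 10)

def transmit(p, q, T, phase_drift, noise_var, rng):
    """Attenuate, rotate and add Gaussian noise to one phase-space point.

    The q' line follows the mapping in the text; the p' line is the assumed
    counterpart of the same rotation.
    """
    sigma = math.sqrt(noise_var)
    q_out = math.sqrt(T) * (q * math.cos(phase_drift) + p * math.sin(phase_drift)) + rng.gauss(0, sigma)
    p_out = math.sqrt(T) * (p * math.cos(phase_drift) - q * math.sin(phase_drift)) + rng.gauss(0, sigma)
    return p_out, q_out

def keep_feature_vector(features, threshold):
    """Threshold filter: keep a feature vector only if no distance exceeds the threshold."""
    return max(features) <= threshold

rng = random.Random(1)
T = fibre_transmittance(20)
noise_var = 1.0 + T * 0.01  # N0 = 1 (shot-noise units) and xi = 0.01 are assumed values
print(transmit(2.0, 2.0, T, phase_drift=0.05, noise_var=noise_var, rng=rng))
print(keep_feature_vector([1.2, 3.4, 9.9], threshold=9.5))
```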

Performance on machine learning-based metrics
ML-CVQKD takes advantage of multi-label learning to predict unknown signal states, so that the traditional information theory-based metrics used in GG02 are not enough to comprehensively evaluate the performance of our scheme. Hence, several machine learning-based metrics need to be introduced. Consider three datasets: dataset A denotes the samples that are predicted as positive, dataset B denotes all positive samples, and dataset C denotes all samples. Figure 8 shows the relationship between these datasets and their corresponding metrics, which are listed below.
Precision (Prec), recall (Rec) and false positive rate (FPR).
Besides the above-listed metrics, there are other metrics used in machine learning to evaluate specific systems. We select these three because the primary concern of our scheme is the correctness of the coherent state classification: we need to know how accurate MLCA can be and how many misclassifications occur. In addition, since MLCA is designed for solving the multi-label classification problem, we deploy another metric called average precision (AP), which evaluates the average fraction of labels ranked above a particular label $y \in Y_i$ that are themselves in $Y_i$. It can be expressed as

$$\mathrm{AP} = \frac{1}{g} \sum_{i=1}^{g} \frac{1}{|Y_i|} \sum_{y \in Y_i} \frac{\left|\{y' \in Y_i \mid \mathrm{rank}_f(x_i, y') \le \mathrm{rank}_f(x_i, y)\}\right|}{\mathrm{rank}_f(x_i, y)},$$

where $g$ is the number of data in the testing set and $\mathrm{rank}_f(\cdot, \cdot)$ is the ranking function related to the labels [39]. The performance improves as AP increases, and the maximum value is AP = 1.
Figure 9 shows the performance of the MLCA-embedded ML-CVQKD system in terms of precision (a), recall (b), false positive rate (c) and average precision (d). According to plots (a), (b) and (d), Prec, Rec and AP show a similar trend: they increase with the modulation variance and decrease with the channel loss. More specifically, the optimal performance in both Prec and Rec is achieved when $V_m \geq 40$ within a certain range of channel loss. This illustrates that MLCA-embedded ML-CVQKD is able to accurately predict unlabeled positive signal states. Meanwhile, plot (d) shows that AP has a larger area of perfect performance, which illustrates that MLCA is well qualified for handling the multi-label classification problem of coherent states. On the other hand, plot (c) shows a low FPR, which illustrates that the level of misclassification is acceptable.
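The metrics of this subsection can be sketched in terms of the datasets A, B and C. The set-based formulas for precision, recall and false positive rate below are the standard textbook definitions and are assumed to match those of figure 8; the average precision follows the ranking-based description above. All function names and the toy data are hypothetical.

```python
def precision(A, B):
    """Fraction of predicted-positive samples (A) that are truly positive (B)."""
    return len(A & B) / len(A)

def recall(A, B):
    """Fraction of truly positive samples (B) that are predicted positive (A)."""
    return len(A & B) / len(B)

def false_positive_rate(A, B, C):
    """Fraction of negative samples (C minus B) that are wrongly predicted positive."""
    return len(A - B) / len(C - B)

def average_precision(score_rows, true_label_rows):
    """Ranking-based average precision over a test set of multi-label samples."""
    total = 0.0
    for scores, true_labels in zip(score_rows, true_label_rows):
        ranking = sorted(scores, key=scores.get, reverse=True)  # best label first
        rank = {label: pos + 1 for pos, label in enumerate(ranking)}
        total += sum(
            sum(1 for other in true_labels if rank[other] <= rank[y]) / rank[y]
            for y in true_labels
        ) / len(true_labels)
    return total / len(score_rows)

# Hypothetical sample identifiers.
A = {1, 2, 3, 4}            # predicted positive
B = {2, 3, 4, 5, 6}         # actually positive
C = set(range(1, 11))       # all samples
print(precision(A, B), recall(A, B), false_positive_rate(A, B, C))  # 0.75 0.6 0.2

# Hypothetical classifier scores for two test states.
scores = [
    {"L1": 0.9, "L2": 0.8, "L3": 0.1, "L4": 0.05},
    {"L1": 0.9, "L2": 0.2, "L3": 0.6, "L4": 0.1},
]
truth = [{"L1", "L2"}, {"L2"}]
print(round(average_precision(scores, truth), 4))  # 0.6667
```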
Although Prec, Rec, FPR and AP show the performance from different aspects, it is still desirable to use a single metric to check the overall quality of the embedded MLCA. Therefore, the receiver operating characteristic (ROC) curve [42], which describes the true positive rate of a classifier as a function of its FPR, is introduced. With the ROC curve, one can explicitly tell the quality of the classifier: the closer the curve is to the point (0, 1), the better the performance. Figure 10 shows the ROC curves of MLCA. The gray line is the result of random guessing, which illustrates that there is no performance improvement without any classifier. For each label, however, the ROC curve with embedded MLCA is close to the point (0, 1), which indicates that the proposed classifier can dramatically improve the prediction performance of ML-CVQKD. We further calculate the area under the curve (AUC) for each label and thus obtain the AUC value, which is a probability value ranging from 0 to 1. As a numerical value, AUC can be directly used for evaluating the classifier's quality. Therefore, the efficiency of the quantum classifier Λ can be described by its AUC value, namely Λ is the average AUC in our case. Moreover, a threshold ranging from 0.5 to 1 can be set to monitor the effectiveness of the classifier. State learning must be interrupted and restarted if the AUC value of the currently trained classifier is less than this threshold.
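A simple way to obtain the ROC curve, the AUC and the efficiency Λ from classifier scores is sketched below. It assumes distinct scores (ties are not treated specially), uses the trapezoidal rule for the area, and takes Λ as the plain mean of the per-label AUC values; the function names, scores and threshold value are hypothetical.

```python
def roc_points(scores, is_positive):
    """Sweep the decision threshold from high to low and collect (FPR, TPR) points."""
    pairs = sorted(zip(scores, is_positive), key=lambda pair: -pair[0])
    n_pos = sum(is_positive)
    n_neg = len(is_positive) - n_pos
    tp = fp = 0
    points = [(0.0, 0.0)]
    for _, positive in pairs:
        if positive:
            tp += 1
        else:
            fp += 1
        points.append((fp / n_neg, tp / n_pos))
    return points

def auc(points):
    """Area under a piecewise-linear curve by the trapezoidal rule."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))

def classifier_efficiency(per_label_auc):
    """Efficiency of the classifier, taken here as the mean AUC over all labels."""
    return sum(per_label_auc) / len(per_label_auc)

def needs_relearning(efficiency, threshold):
    """State learning is restarted when the efficiency falls below the threshold."""
    return efficiency < threshold

# Hypothetical scores for one label: 1 marks states that truly carry the label.
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
truth = [1, 1, 0, 1, 0, 0]
label_auc = auc(roc_points(scores, truth))
print(round(label_auc, 4))  # 0.8889
print(needs_relearning(classifier_efficiency([label_auc, 0.95, 0.97, 0.92]), threshold=0.9))
```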

Security and comparison
So far, we have demonstrated the performance of the MLCA-embedded ML-CVQKD system in terms of machine learning-based metrics. However, we also want to present a performance comparison for ML-CVQKD in the traditional way, so that researchers who are not familiar with machine learning can immediately evaluate how much improvement the proposed scheme can achieve. To this end, we first present the theoretical security analysis of ML-CVQKD as follows.
To avoid redundant description, we focus here only on the analysis with reverse reconciliation; the direct reconciliation version can be obtained by a similar approach. In the asymptotic limit, the secret key rate of conventional discretely-modulated CVQKD with reverse reconciliation is given by

$$K = \beta I(A:B) - \chi_{BE}, \tag{16}$$

where $\beta$ is the reconciliation efficiency and $I(A:B)$ is the Shannon mutual information between Alice and Bob. For heterodyne detection, we have

$$I(A:B) = \log_2 \frac{V + \chi_{tot}}{1 + \chi_{tot}},$$

where $V = V_m + 1$, the total noise referred to the channel input is $\chi_{tot} = \xi - 1 + 2(1 + v_{el})/(\eta T)$, and $\eta$ and $v_{el}$ are the practical detector's efficiency and the noise due to detector electronics, respectively. The term $\chi_{BE}$ represents the Holevo bound [43] on the mutual information between Eve and Bob and needs to be calculated in parameter estimation. In ML-CVQKD, however, since the data processing is quite different, equation (16) is rewritten in the following form:

$$K_{ML} = \beta \Lambda I(A:B) - \chi^{ML}_{E}. \tag{18}$$

The difference between equations (16) and (18) lies in two parts. First, the efficiency of the embedded classifier Λ has to be considered, since the classifier is necessary for the proposed ML-CVQKD. Note that MLCA is only one possible embedded classifier; other well-performing classifiers may also fit ML-CVQKD. Second, the term $\chi_{BE}$ in equation (16) is replaced by the term $\chi^{ML}_{E}$, which denotes the Holevo quantity of the useful information Eve acquires by interacting with the quantum states under collective attacks. It reads

$$\chi^{ML}_{E} = S(\rho_E) - \sum_i p(y_i)\, S(\rho_{E|y_i}),$$

where $S(\rho) = -\mathrm{Tr}(\rho \log \rho)$ is the von Neumann entropy, the logarithms are taken in base 2, $y_i$ is the raw key obtained by Bob's measurement with probability $p(y_i)$, $\rho_{E|y_i}$ is the corresponding state of Eve's ancilla, and

$$\rho_E = \sum_i p(y_i)\, \rho_{E|y_i}.$$

In the conventional discretely-modulated CVQKD protocol, $p(y_i) = 1/m$ is the probability of the discrete uniform distribution of variable $Y = y_i$ ($i = 1, 2, \ldots, m$), since $Y$ contains $m$ finite and complete encoding events randomly chosen by Alice.
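The closed-form ingredients of the key rate can be evaluated numerically as sketched below. The total noise follows the expression given in the text. The logarithmic form of the heterodyne mutual information and the way Λ multiplies $\beta I(A:B)$ are assumptions made for this sketch, and the Holevo term is passed in as a number because its calculation is deferred to the appendix. All parameter values are hypothetical.

```python
import math

def total_noise(T, xi, eta, v_el):
    """Total noise referred to the channel input: xi - 1 + 2(1 + v_el) / (eta * T)."""
    return xi - 1 + 2 * (1 + v_el) / (eta * T)

def mutual_information_het(V_m, chi_tot):
    """Assumed heterodyne form: I(A:B) = log2((V + chi_tot) / (1 + chi_tot)), V = V_m + 1."""
    V = V_m + 1
    return math.log2((V + chi_tot) / (1 + chi_tot))

def key_rate(beta, efficiency, i_ab, holevo):
    """Assumed ML-CVQKD form: beta * Lambda * I(A:B) minus the Holevo term."""
    return beta * efficiency * i_ab - holevo

# Hypothetical parameter values, chosen only to exercise the functions.
chi = total_noise(T=0.1, xi=0.01, eta=0.6, v_el=0.05)
i_ab = mutual_information_het(V_m=40.0, chi_tot=chi)
print(round(chi, 3), round(i_ab, 4))
print(key_rate(beta=0.95, efficiency=0.98, i_ab=i_ab, holevo=0.0) > 0)
```

Setting `efficiency=1.0` recovers the conventional expression, which makes it easy to compare the two forms for the same channel parameters.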
Note that the relationship between each discretely-modulated coherent state and its binary representation is fixed and public. For example, the key bits of state $|\alpha_0\rangle$ in the four-state protocol are always (0, 0) (see figure 1 of reference [44]), so that Eve can precisely recover the correct key bits (0, 0) when she successfully intercepts the coherent state $|\alpha_0\rangle$. Since Eve knows all public encoding events of the four-state protocol, the probability is $p(y_i) = 1/4$. Similarly, as Eve knows all encoding events of the eight-state protocol, the probability is $p(y_i) = 1/8$. However, due to the specially designed process of ML-CVQKD, the above relationship is no longer fixed and public. Although the label(s) of each discretely-modulated coherent state are fixed, e.g., the label of $|\alpha_1\rangle$ is $L_1$ and the labels of $|\alpha_2\rangle$ are $L_1$ and $L_2$ as shown in figure 4(b), the relationship between label(s) and binary key bits can be randomly assigned by Alice. Hence, only Alice knows this relationship at the beginning, and Bob learns it from the labeled coherent states by the end of state learning. An eavesdropper who does not participate in the state learning process, in contrast, does not know the correct encoding events shared by Alice and Bob. That is to say, the set of possible encoding events $Y$ is infinite for Eve ($m \to \infty$), so that an intercepted state could denote any bit(s), which leads to $p(y_i) \to 0$. Therefore, Eve can hardly obtain useful information from the intercepted state. The remaining calculations are the same as for the eight-state protocol and can be found in appendix A. Figure 11 shows the performance comparison between MLCA-embedded ML-CVQKD and several existing CVQKD protocols in the asymptotic limit. The results show that MLCA-embedded ML-CVQKD outperforms other conventional CVQKD protocols in terms of the maximum transmission distance. We also find that the performance of our scheme can be further enhanced by increasing the modulation variance.
To investigate the cause, we further plot the asymptotic secret key rates of the above-mentioned protocols at fixed distances of 50 km and 100 km. As shown in figure 12, the curves of both the four-state protocol and the eight-state protocol are arched and are confined to certain ranges of small modulation variance, which shows that a small variance is required for ensuring the security of conventional discretely-modulated CVQKD. Meanwhile, the curves of Gaussian-modulated CVQKD and of our scheme keep rising as the modulation variance increases. This illustrates that a small variance is no longer required for the security of ML-CVQKD, so that the proposed scheme can outperform conventional discretely-modulated CVQKD protocols in terms of both secret key rate and distance by setting a suitably larger modulation variance. However, we point out that although the secret key rate of ML-CVQKD is mathematically an increasing function of the modulation variance, it cannot surpass the Pirandola-Laurenza-Ottaviani-Banchi (PLOB) bound [46]. Besides, it is worth noting that the security proof of our scheme is based on the security analysis of discretely-modulated CVQKD developed in reference [5], which considers the asymptotic security under a linear channel assumption. Very recently, references [44,47] have proven the asymptotic security of discretely-modulated CVQKD without this assumption, so that one may obtain a tighter security bound for ML-CVQKD by taking advantage of their approaches.
Finally, let us consider the practicality of ML-CVQKD. During the initial state learning, Alice and Bob establish a classification model. This process may be somewhat costly, since a large number of coherent states and classical data have to be communicated and processed. Nevertheless, the cost is acceptable because only four labels need to be considered, which is a very small number for a multi-label learning problem [48] (see the coverage metric for multi-label learning). Once this process has been completed successfully, the system enters the actual quantum key distribution process, i.e., the state prediction process. This process is more economical than other existing CVQKD protocols. In addition, ML-CVQKD can be applied to existing optical communication systems without any extra equipment, leading to fast deployment and operation.

Conclusion
In this work, we have proposed a multi-label learning-based scheme for the discretely-modulated CVQKD protocol, called ML-CVQKD. In particular, the proposed scheme divides the whole quantum system into two parts: a trusted state learning process and an untrusted state prediction process. State learning is used for training and evaluating the multi-label classifier, while state prediction is used for generating the final secret key. To this end, feature extraction was suggested to better represent the characteristics of a modulated coherent state. Subsequently, a specialized MLCA was designed as an embedded classifier for distinguishing the incoming signal states. We then introduced a series of related machine learning-based metrics to evaluate the performance of MLCA, and presented the asymptotic security proof of ML-CVQKD under the linear bosonic channel assumption. The practicality of ML-CVQKD was also discussed.
Performance analysis shows that MLCA-embedded ML-CVQKD is feasible and effective for predicting unknown signal states. Numerical simulation shows that the proposed MLCA-embedded ML-CVQKD outperforms other existing CVQKD protocols, especially in maximum transmission distance, and that both the transmission distance and the secret key rate increase further with the modulation variance. Besides, we note that a finite-size composable security proof of discretely-modulated CVQKD under collective Gaussian attacks has been given in reference [49]; investigating the finite-size composable security of ML-CVQKD is therefore the subject of our future work.
In summary, ML-CVQKD is not only a variant of the CVQKD protocol, but also provides a novel way of introducing various machine learning-based methodologies to the CVQKD field.
Figure 13 shows the comparison of different covariance coefficients as a function of the modulation variance $V_m$. We find that $Z_4$ and $Z_8$ are equal to $Z_G$ when the modulation variance is small enough. Hence, for a sufficiently low modulation variance, the bound $\chi_{BE}$ for discrete modulation is almost identical to the one obtained for Gaussian modulation. Figure 14 depicts the optimal modulation variance for the discretely-modulated CVQKD protocol as a function of transmission distance. As can be seen from the figure, the optimal modulation variance decreases as the transmission distance increases, and the minimum optimal modulation variances are 0.3 for four-state CVQKD and 0.35 for eight-state CVQKD. This numerical simulation shows that a small modulation variance is required to guarantee the security of the discretely-modulated CVQKD protocol, as it prevents an eavesdropper from intercepting useful information.