True Random Number Generation from Bioelectrical and Physical Signals

It is possible to generate personally identifiable random numbers for use in particular applications, such as authentication and key generation. This study presents true random number generation from bioelectrical signals such as EEG, EMG, and EOG and from physical signals such as blood volume pulse, GSR (Galvanic Skin Response), and respiration. The signals used in the random number generation were taken from the BNCIHORIZON2020 databases. Random numbers were generated from fifteen different signals (four each from the EEG, EMG, and EOG datasets and one each from the respiration, GSR, and blood volume pulse datasets). For this purpose, each signal was first normalized and then sampled. The sampling was achieved by using a nonperiodic and chaotic logistic map. Then, XOR postprocessing was applied to improve the statistical properties of the sampled numbers. NIST SP 800-22 was used to observe the statistical properties of the numbers obtained, the scale index was used to determine the degree of nonperiodicity, and the autocorrelation test was used to monitor the 0-1 variation of the numbers. The numbers produced from bioelectrical and physical signals were successful in all tests. As a result, it has been shown that it is possible to generate personally identifiable real random numbers from both bioelectrical and physical signals.


Introduction
Random numbers are needed in several areas of computer science, such as authentication, secret key generation, game theory, and simulations. In these applications, the numbers should in particular have good statistical properties and be unpredictable and nonreproducible. Number generation in the literature is performed in two different ways, deterministic and nondeterministic [1,2]. PRNGs (Pseudo Random Number Generators), which are deterministic random number generators, generate numbers with fast, easy, inexpensive, and hardware-independent solutions. The statistical quality of the numbers produced is close to the ideal. PRNGs must meet the requirements specified in Table 1 to be used, especially for authentication and key generation [3][4][5]. Therefore, nondeterministic functions are added to the output functions of PRNGs to guarantee these requirements.
TRNGs (True Random Number Generators), which are nondeterministic random number generators, present slower, more expensive, and hardware-dependent solutions compared to PRNGs. Contrary to PRNGs, there is no need to include extra components in TRNG system designs for the R2, R3, and R4 requirements. Because random numbers generated from noise sources with high entropy are unpredictable, it is assumed that the R2 requirement is met in TRNGs. If the R2 requirement is satisfied, then it is assumed that the R3 and R4 requirements are also satisfied. To meet the R1 requirement in TRNGs, postprocessing techniques are applied to the random numbers obtained by sampling from noise sources. This eliminates the statistical weaknesses of the random numbers at the output of the TRNG. In addition, postprocessing techniques eliminate potential weaknesses and make TRNG designs strong and flexible [6,7].
Computational and Mathematical Methods in Medicine

Table 1: Requirements for random numbers.

R1: RNGs must generate random numbers having good statistical properties at the output to be used in cryptographic applications.
R2: If an attacker knows subsequences of the random numbers, it must not be possible to calculate or predict preceding and subsequent random numbers with high accuracy.
R3: It must not be possible to predict or calculate previously generated random numbers with high accuracy by considering the known current internal state value of an RNG or without requiring its internal state information.
R4: It must not be possible to predict or calculate subsequent random numbers with high accuracy by considering the known current internal state value of an RNG or without requiring its internal state information.

Recently, there have been studies on random number generation from human-based noise sources [8][9][10][11][12]. Elham et al. showed that two different people would produce different random numbers and that these numbers could be used as biometric signatures [8]. Xingyuan et al. proposed a TRNG structure using a one-dimensional chaotic map based on mouse movements. The proposed structure passed the NIST tests and was shown to be usable on personal PCs [9]. Hu et al. performed real random number generation by observing the mouse movements of computer users. The statistical properties of the binary sequences generated from the mouse movements of three different users were examined with the NIST test suite. Three chaos-based approaches were proposed to eliminate similar motions generated by the same user, and successful results were also achieved with these approaches [10]. Rahimi et al. used two different ECG signals for cryptographic key generation and suggested two different approaches. The keys obtained by both approaches were evaluated for distinctiveness, randomness, and temporal variance and with the NIST tests, and successful results were obtained [11]. In the study performed by Chen et al. [12], random numbers were generated from ECG signals and analyzed with the NIST test suite. The authors reported that the PRNG-based generated numbers had more successful results in classical PRNG structures. Dang et al. showed the possibility of random number generation from EEG signals. Four different EEG datasets were used to illustrate the use of the obtained numbers in cryptographic applications, and their statistical properties were analyzed with the NIST test suite. In this PRNG-based approach, the samples of the EEG signals were transformed into sequences of 0s and 1s, and mathematical definitions of the structure, which uses modular arithmetic for this transformation, were given. The NIST test suite was used to show that EEG signals could be used for random number generation, and a success rate higher than 99% was achieved [13]. In a study carried out by Chen et al. [14], the authors showed that EEG signals agreed with a Gaussian distribution and examined whether random numbers could be generated from these signals. They used EEG signals obtained from both healthy and sick people for PRNG number generation. They used the NIST test suite for statistical analysis, and some tests failed. The study concluded that the generated numbers could be used as a PRN. In a study by Chen et al. [15], random number generation was performed by using white noise signals taken from MPEG-1, WEBCAM, and IPCAM video files, and it was emphasized that successful results were obtained from statistical tests. Buhanuponp et al. proposed a new encoding method for random number generation using EEG signals. This number generator, which can be used in low-cost and real applications, is based on TRNG. A success rate of 99.47% was obtained from statistical tests. It was revealed that simple and fast bit generation was possible with the encoding method [16]. The summary of the literature methods used for random number generation is shown in Table 2.
In this article, it is shown that real random numbers can be generated from personally identifiable bioelectrical signals (EEG, EMG, and EOG) and physical signals (blood volume pulse, GSR (Galvanic Skin Response), and respiration). The statistical quality of the random numbers obtained was analyzed by the NIST SP 800-22, scale index, and autocorrelation tests that are commonly used in the literature, and the results are given in tables. The contributions made to the literature in this article can be summarized as follows:
(1) It was shown that it was possible to generate personally identifiable random numbers.
(2) Random numbers were generated with the TRNG structure.
(3) It was revealed that random numbers can be generated by not only bioelectrical signals but also physical signals.
(4) The analyses of statistical properties were performed and successful results were obtained. Analyses were also performed by scale index and autocorrelation tests in addition to the NIST test.
The article is organized as follows. In the second section, the structures and properties of PRNG and TRNG are briefly explained, and the two structures are compared in tabular form. In the third section, bioelectrical and physical signals are briefly described and the properties of the signals and the dataset used in the study are given. In the fourth section, the proposed TRNG structure, the normalization for number generation, and the sampling and postprocessing operations are presented. The tests used for the statistical analysis of the numbers and the results obtained are tabulated in Section 5. In the last section of the article, the results are discussed and suggestions are made for future work.

Random Number Generation Methods
Random numbers are widely used in areas where key generation is important, such as cryptography, data transmission, and secure communication, as well as in games of chance, simulation, and game programming. Random number generators can be divided into two classes: TRNG (True Random Number Generator) and PRNG (Pseudo Random Number Generator). Random numbers can be generated in hardware or in software. The random numbers generated by software can be defined by a specific mathematical model. On the other hand, numbers can be generated in hardware with the help of a noise source whose behavior cannot be predicted. Figure 1 shows the classification of random number generation.

PRNG.
The general design architecture of a PRNG is shown in Figure 2. Here, $r_1, r_2, \ldots \in R$ represent the generated random numbers, $s_n \in S$ indicates the internal states of a pure PRNG, and $p_S$ is defined as the probability distribution of the random seed. The PRNG generates the random number $r_n$ from the current state $s_n$ through the output function $\Psi : S \to R$, so that $r_n = \Psi(s_n)$. After that, using the transition function $\Phi$, the state is updated as $s_{n+1} = \Phi(s_n)$. The seed value corresponds to the state $s_0$, and the first internal state is obtained as $s_1 = \Phi(s_0)$ [18]. In short, these generators need starting parameters, also known as the seed.
Random numbers with good statistical quality are generated by expanding these parameters in deterministic ways [19].
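The state-machine view of a pure PRNG described above can be illustrated with a minimal Python sketch. The names `phi` and `psi`, the multiplicative congruential constants, and the 8-bit output are assumptions chosen only for illustration; they are not taken from this article or from [18].

```python
MODULUS = 2**31 - 1   # assumed modulus, for illustration only
MULTIPLIER = 48271    # assumed multiplier, for illustration only


def phi(state: int) -> int:
    """Transition function: maps the state s_n to s_{n+1}."""
    return (MULTIPLIER * state) % MODULUS


def psi(state: int) -> int:
    """Output function: maps the state s_n to the output r_n (low 8 bits)."""
    return state & 0xFF


def prng(seed: int, count: int) -> list[int]:
    """Generate `count` outputs r_1, r_2, ... from the seed s_0."""
    state = phi(seed)              # s_1 = phi(s_0)
    outputs = []
    for _ in range(count):
        outputs.append(psi(state))  # r_n = psi(s_n)
        state = phi(state)          # s_{n+1} = phi(s_n)
    return outputs


if __name__ == "__main__":
    print(prng(seed=42, count=5))
```

Because every step is deterministic, the same seed always reproduces the same sequence, which is why the seed must stay secret and unpredictable in cryptographic use.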

TRNG.
The general design architecture of a TRNG is shown in Figure 3. The values obtained by sampling noise sources are called digitized analog signals (DAS). DAS random numbers correspond to a particular case of pure random numbers, and they are subjected to algorithmic postprocessing to reduce their potential weaknesses. During this process, however, the output bit rate is reduced and the operating speed decreases. The structural comparison of PRNGs and TRNGs is shown in Table 3. According to Table 3, PRNGs generate numbers that are fast to produce, easy to design, and periodic. On the other hand, TRNGs generate unpredictable, entropy-dependent, and nonperiodic numbers. Despite these advantages, they are disadvantageous compared to PRNGs because they are hardware dependent and operate slowly.

Bioelectrical and Physical Signals
Bioelectrical signals are low-amplitude signals between 100 µV and 1 mV and are taken from the body through electrodes. The frequency spectra of such signals are in the low-frequency range of 0.1 Hz to 2000 Hz. The amplitude and frequency characteristics of different bioelectrical signals taken from the body are shown in Table 4. During brain activity, continuous rhythmic electrical potentials are produced, and electrical signals are also generated due to receptor activity. The recording of these electrical signals with electrodes placed on the scalp is called electroencephalography (EEG). The amplitudes of EEG waves range from 5 to 400 µV and their frequencies range between 0.5 and 100 Hz. EEG signals are recorded according to the Extended International 10-20 system. Electromyography (EMG) is a neurological examination method based on examining the electrical potentials of nerves and muscles. EMG is performed in two ways, using surface electrodes or needle electrodes. In tests using surface electrodes, the electrodes are attached to the skin surface.
Among physical signals, the blood volume pulse (BVP) is used to measure heart rate. BVP is measured using a photoplethysmography (PPG) sensor. This sensor measures changes in blood volume in the arteries and capillaries that correspond to changes in heart rate and blood flow. The GSR signal is one of the most sensitive indicators of emotional stimulation and shows whether individuals are under stress. It gives information about the conductivity of the skin. Another physical signal, respiration, arises from the difference between inhaled and exhaled air. With a temperature transducer, the heat exchange during respiration is converted into electrical activity. Figure 4 shows sample bioelectrical and physical signals from the BNCIHORIZON2020 database. In this figure, the six rows of signals from top to bottom are sample EEG, EOG, EMG, GSR, blood volume pulse, and respiration signals from datasets A, B, C, D, E, and F, respectively. Table 5 shows the mean and standard deviation of the bioelectrical and physical signals.

Personally Identifiable Number Generation from Bioelectrical and Physical Signals
In this study, bioelectrical and physical signals obtained from the BNCIHORIZON2020 database were used to generate TRNG-based random numbers. Fifteen datasets were used in total: four EEG, four EMG, four EOG, one blood volume pulse, one GSR (Galvanic Skin Response), and one respiration. The EEG signals in the database were recorded using thirty-two Ag/AgCl active electrodes with a sampling frequency of 512 Hz according to the Extended International 10-20 system. EOG and EMG signals were recorded over the left and right flexor digitorum profundus. GSR, blood volume pulse, and respiration data were recorded simultaneously with the bioelectrical signals. Real random number generation from these data includes three steps, as shown in Figure 5: normalization, digitization (sampler), and postprocessing.
In normalization, all data obtained from the noise sources (raw signals) are first transformed into the binary number system. To achieve this, the operations given below with their mathematical explanations are applied. Let each signal obtained from an individual be $x = (x_1, x_2, \ldots, x_n)^T$. Each sample is expressed as 5 bits by using modular arithmetic, as shown in (1). To produce 0s and 1s from the 5-bit values $y_i$, the sequence $z_i$ is obtained according to (2). Each element of $z_i$ is in the range [0, 5].
Lastly, the random binary sequence $b_i$ is obtained from $z_i$ with the help of (3). The normalization procedure is given in Algorithm 1.
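One possible reading of the normalization steps can be sketched in Python. The concrete operations below are assumptions consistent with the description (a 5-bit value from modular arithmetic, an intermediate value between 0 and 5, then a single bit): rounding and taking the absolute value before the modulo, counting the set bits of $y_i$ to obtain $z_i$, and taking the parity of $z_i$ to obtain $b_i$. They are not confirmed as the exact forms of equations (1) to (3), and the function name `normalize_to_bits` is hypothetical.

```python
def normalize_to_bits(signal, n_bits=5):
    """Map raw samples x_i to (y_i, z_i, b_i) under the stated assumptions."""
    modulus = 1 << n_bits                     # 2**5 = 32 for 5-bit values
    y, z, b = [], [], []
    for sample in signal:
        # Assumption: quantize to an integer, then reduce modulo 2**n_bits.
        y_i = abs(round(sample)) % modulus
        # Assumption: z_i is the number of set bits in y_i, so 0 <= z_i <= n_bits.
        z_i = bin(y_i).count("1")
        # Assumption: b_i is the parity of z_i.
        b_i = z_i % 2
        y.append(y_i)
        z.append(z_i)
        b.append(b_i)
    return y, z, b


if __name__ == "__main__":
    y, z, b = normalize_to_bits([3.2, 31.0, -64.7, 7.9])
    print(y, z, b)
```

Any real signal would first need a scaling that makes the integer part of each sample informative; that choice is outside what this sketch assumes.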
After this step, sampling is applied to the sequence $b_i$ obtained from the bioelectrical and physical signals that are used as the noise source. Periodic and nonperiodic signals are used in the literature for sampling [17,20]. In this article, a logistic map presenting nonperiodic behavior was used for sampling. The logistic map presenting chaotic behavior is defined as
$$a_{i+1} = r \, a_i \, (1 - a_i), \quad i = 0, 1, 2, \ldots \tag{4}$$
where $r$ is the system parameter, $a_0$ is the seed or initial value, and $i$ is the number of iterations. The numbers produced by the logistic map are real and lie in the range [0, 1]. Let these numbers
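The logistic map iteration can be sketched in Python as follows. The iteration itself follows the definition above, but the sampling rule in `chaotic_sample` (keep a bit whenever the corresponding map value exceeds a threshold) is only an assumed illustration of nonperiodic sampling, since the exact rule is not specified here. The parameter values `r=3.99`, `a0=0.37`, and `threshold=0.5` are likewise hypothetical.

```python
def logistic_map(r, a0, count):
    """Return `count` successive values of a_{i+1} = r * a_i * (1 - a_i)."""
    values = []
    a = a0
    for _ in range(count):
        a = r * a * (1.0 - a)
        values.append(a)
    return values


def chaotic_sample(bits, r=3.99, a0=0.37, threshold=0.5):
    """Assumed sampling rule: keep bits[i] when the i-th map value > threshold."""
    values = logistic_map(r, a0, len(bits))
    return [bit for bit, value in zip(bits, values) if value > threshold]


if __name__ == "__main__":
    source_bits = [1, 0, 1, 1, 0, 0, 1, 0, 1, 1, 0, 1]
    print(chaotic_sample(source_bits))
```

Because the sampling instants depend on a chaotic trajectory rather than a fixed clock, the positions of the kept bits are irregular, which is the property the nonperiodic sampler is meant to provide.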
Postprocessing is applied to improve the statistical properties of the sampled random number sequences. XOR, Von Neumann, LFSR, and hash function structures are commonly used in the literature for postprocessing [21,28]. In this study, XOR postprocessing was used. With XOR, the sampled sequence is processed in non-overlapping pairs of consecutive bits: when a pair is 0,1 or 1,0, the output bit is 1; otherwise it is 0. When XOR postprocessing is applied to the sampled sequence 1110110010 obtained in Table 5, the resulting sequence is 01001. The algorithm for postprocessing is given in Algorithm 3.
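The XOR postprocessing of non-overlapping bit pairs can be sketched in a few lines of Python. The function name `xor_postprocess` is hypothetical, and dropping a trailing unpaired bit is an assumption; the pairing rule itself is the one implied by the worked example in the text.

```python
def xor_postprocess(bits):
    """XOR each non-overlapping pair of consecutive bits.

    A trailing unpaired bit, if any, is dropped (assumption).
    """
    return [bits[i] ^ bits[i + 1] for i in range(0, len(bits) - 1, 2)]


if __name__ == "__main__":
    sampled = [int(ch) for ch in "1110110010"]
    result = xor_postprocess(sampled)
    print("".join(str(bit) for bit in result))  # prints 01001
```

The output is half the length of the input, which reflects the reduction in output bit rate that postprocessing imposes on a TRNG.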

Statistical Analysis of Random Numbers
The NIST SP 800-22, scale index, and autocorrelation tests, which are commonly used in the literature, were used for the statistical analysis of the generated random numbers.

NIST SP 800-22 Statistical Test Suite.
NIST SP 800-22 is a test suite commonly used for statistical analysis of the numbers obtained from TRNG, PRNG, Physical Unclonable Function (PUF), and hybrid generators [21][22][23]. The NIST test suite includes fifteen tests, and the parameters of each test are given in the related study [29]. The value $\alpha$, known as the level of significance, is one of the most important parameters in the test. Selecting $\alpha$ as 0.01 indicates that the randomness of the numbers to be tested has a confidence level of 99%. Another parameter is the p value, which is known as the measure of randomness. If this value is equal to 1, the numbers are said to have perfect randomness. If the p value is 0, the numbers are not random. The $\alpha$ value for personally identifiable random numbers to be used in key and verification applications should be selected appropriately. For each test, if the p value is greater than or equal to $\alpha$, the test is successful. Otherwise the test is unsuccessful; i.e., the numbers generated are not random. Generally, $\alpha$ is selected from the range [0.001, 0.01].
The NIST test suite analysis results for the personally identifiable random numbers obtained from the bioelectrical and physical signals in the dataset are shown in Tables 7 and 8, respectively. As can be seen from the tables, all data passed the NIST tests because their p values were higher than 0.01.
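The pass or fail decision described above can be illustrated with the simplest of the fifteen tests, the frequency (monobit) test. The Python sketch below is not the NIST reference implementation; the function names are hypothetical, and only the comparison of the p value with the significance level follows the text.

```python
import math


def monobit_p_value(bits):
    """p value of the frequency (monobit) test for a sequence of 0/1 bits."""
    n = len(bits)
    # Map 0 -> -1 and 1 -> +1, then sum.
    s_n = sum(1 if bit else -1 for bit in bits)
    s_obs = abs(s_n) / math.sqrt(n)
    return math.erfc(s_obs / math.sqrt(2.0))


def passes_test(bits, alpha=0.01):
    """The test is successful when the p value is at least alpha."""
    return monobit_p_value(bits) >= alpha


if __name__ == "__main__":
    example = [int(ch) for ch in "1011010101"]
    print(round(monobit_p_value(example), 6), passes_test(example))
```

A ten-bit input is far too short for a meaningful verdict; it is used here only to keep the example readable.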

Scale Index Test.
The scale index test was applied for statistical analysis of the numbers. The scale index technique was proposed by Benitez [30]. This technique is used to determine the degree of nonperiodicity of a signal or a generated number series. In the literature, the scale index test has been used to determine the periodicity of TRNGs and PRNGs [24,25]. The scale index is based on the continuous wavelet transform and wavelet multiresolution analysis. The continuous wavelet transform (CWT) of $f$ at time $u$ and scale $s$ and the scalogram are given in equations (5) and (6) [22]. The energy of the continuous wavelet transform of $f$ at scale $s$ is denoted $S(s)$. Equation (7) shows the inner scalogram of $f$ at scale $s$,
where $J(s) = [c(s), d(s)] \subseteq I$ is the maximal subinterval in $I$ for which the support of $\psi_{u,s}$ is included in $I$ for all $u \in J(s)$.
Considering that the length of $J(s)$ depends on the scale $s$, the values of the inner scalogram at different scales cannot be compared. The normalized inner scalogram is therefore defined as shown in (8). The degree of nonperiodicity of the bioelectrical and physical signals was determined by the scale index test, whose details are explained above. Table 9 shows the scale index values obtained.
The scale index value $i_{scale}$ lies in the range $0 \le i_{scale} \le 1$. If the scale index obtained from the generated system is 0 or near 0, the system is defined as periodic, and if it is 1 or near 1, it is defined as nonperiodic. According to Table 9, the results obtained from both bioelectrical and physical signals were successful in the scale index test, with values close to 1.
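For reference, the quantities named in this section are commonly written as follows in the wavelet literature on the scale index. The notation is assumed here (mother wavelet $\psi$, scale interval $[s_0, s_1]$, and the endpoints $c(s)$ and $d(s)$ of $J(s)$) and may differ in detail from the numbered equations of this article.

```latex
% Continuous wavelet transform of f at time u and scale s
Wf(u,s) = \langle f, \psi_{u,s} \rangle
        = \int_{-\infty}^{+\infty} f(t)\, \psi^{*}_{u,s}(t)\, dt,
\qquad
\psi_{u,s}(t) = \frac{1}{\sqrt{s}}\, \psi\!\left(\frac{t-u}{s}\right)

% Scalogram of f at scale s
S(s) = \lVert Wf(u,s) \rVert
     = \left( \int_{-\infty}^{+\infty} \lvert Wf(u,s) \rvert^{2}\, du \right)^{1/2}

% Inner scalogram on J(s) = [c(s), d(s)] and its normalized form
S^{\mathrm{inner}}(s) = \left( \int_{c(s)}^{d(s)} \lvert Wf(u,s) \rvert^{2}\, du \right)^{1/2},
\qquad
\bar{S}^{\mathrm{inner}}(s) = \frac{S^{\mathrm{inner}}(s)}{\left( d(s) - c(s) \right)^{1/2}}

% Scale index on the scale interval [s_0, s_1]
i_{\mathrm{scale}} = \frac{S(s_{\min})}{S(s_{\max})}
```

Here $s_{\max}$ is the smallest scale at which the scalogram attains its maximum on $[s_0, s_1]$, and $s_{\min}$ is the smallest scale at which it attains its minimum on $[s_{\max}, s_1]$. Under these assumed definitions the ratio lies between 0 and 1, in line with the range stated in the text.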

Autocorrelation Test.
Finally, the autocorrelation test was used to observe the variation of 0s and 1s in the generated random numbers [26,27]. Equation (10) shows the mathematical definition of the test [31],
where $\oplus$ is the XOR operator, $n$ is the length of the generated number sequence, and $b_i$ represents the number sequence. The value $d$ is a constant integer in the interval $[1, n/2]$. Equation (11) shows the relationship between 0s and 1s.
For $\alpha = 0.05$, if $|X_5| < 1.6449$, the test is successful. The autocorrelation test was used to determine the 0-1 change in the random numbers obtained from both bioelectrical and physical signals. Table 10 shows the autocorrelation test results. As seen in Table 10, the $|X_5|$ value is in the specified interval for each dataset. Thus, the random numbers obtained from both bioelectrical and physical signals were successful in the autocorrelation test.

Algorithm 1: Normalization procedure.
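The autocorrelation test can be sketched in Python under the assumption that equations (10) and (11) take the usual textbook form, $A(d) = \sum_{i=0}^{n-d-1} b_i \oplus b_{i+d}$ and $X_5 = 2\,(A(d) - (n-d)/2) / \sqrt{n-d}$. The function names are hypothetical, and the threshold 1.6449 is the value quoted in the text.

```python
import math


def autocorrelation_statistic(bits, d):
    """Return (A(d), X5) for a 0/1 sequence and a shift d in [1, n // 2]."""
    n = len(bits)
    if not 1 <= d <= n // 2:
        raise ValueError("d must satisfy 1 <= d <= n // 2")
    # A(d): number of positions where the sequence differs from its d-shift.
    a_d = sum(bits[i] ^ bits[i + d] for i in range(n - d))
    x5 = 2.0 * (a_d - (n - d) / 2.0) / math.sqrt(n - d)
    return a_d, x5


def passes_autocorrelation(bits, d, threshold=1.6449):
    """The test is successful when |X5| is below the threshold."""
    _, x5 = autocorrelation_statistic(bits, d)
    return abs(x5) < threshold


if __name__ == "__main__":
    alternating = [0, 1] * 8   # strongly periodic, expected to fail
    print(autocorrelation_statistic(alternating, 1))
    print(passes_autocorrelation(alternating, 1))
```

A perfectly alternating sequence differs from its one-step shift at every position, so $A(1)$ is far above its expected value and the statistic falls outside the accepted interval.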

Discussion and Conclusion
Random numbers have been generated in the literature from bioelectrical signals such as EEG. These generators have the PRNG structure. However, the statistical properties of the numbers generated from EEG signals are not good. In this article, a TRNG structure using bioelectrical and physical signals as a source of randomness was proposed. Although the recorded signals are periodic, the noise that any source superimposes on a signal makes it nonperiodic. In this case, the random numbers to be generated are unpredictable. Personally identifiable random numbers were generated from the obtained raw signals using normalization (see Algorithm 1), sampling, and postprocessing operations. The process is faster than the TRNG structures in the literature because of the simple structure of the normalization, sampling, and postprocessing steps. The NIST SP 800-22 test was used to show that the statistical properties of the generated numbers were improved, the scale index test was used to reveal the level of nonperiodicity, and the autocorrelation test was used to observe the 0-1 change of the numbers. All test results were presented in tabular form and all were found to be successful. The results indicate that TRN generation from bioelectrical and physical signals obtained from the human body is possible.
Thus, the obtained random numbers are suitable for use in different areas, such as key generation, authentication, games of chance, simulation, and game programming. It is possible to carry out various studies on random number generation using these bioelectrical and physical signals as well as different types of signals like EGG and ECG. The present study will be the basis for random number generation using such signals.

Data Availability
The data used to support the findings of this study are available from the corresponding author upon request.