Non-Uniform Random Number Generation from Arbitrary Bivariate Distribution in Polygonal Area

Bivariate non-uniform random numbers are usually generated in a rectangular area. However, this is generally not useful in practice because the arbitrary area in real-life is not always a rectangular area. Therefore, the arbitrary area in real-life can be defined as a polygonal approach. Non-uniform random numbers are generated from an arbitrary bivariate distribution within a polygonal area by using the rejection and the inversion methods. Three examples are given for non-uniform random number generation from an arbitrary bivariate distribution function in polygonal areas. In these examples, the non-uniform random number generation is discussed in the triangular area, the Korea mainland and the Australia mainland. The non-uniform random numbers are generated in these areas from the arbitrary probability density function. The observed frequency values are calculated with using both methods in the simulation study and the generated random numbers are tested with the chi-square goodness of fit test to determine whether or not they come from the given distribution. Also, both methods are compared each other with a simulation study.


Introduction
Random number simulations have long been a popular topic in the world of science.Random number simulations have been studied particularly closely within the areas of mathematics and statistics.Random numbers are used in fields such as communication [1], information and control systems [2], signal processing [3], machine learning [4], biostatistics [5], econometrics [6], mathematical finance [7], biology [8], insurance [9], cryptography [10], simulation [11], computer games [12] and statistical sampling and applications [13].Therefore, random number generation is quite important.
In the literature, the rejection method or the inversion method from univariate and bivariate probability density functions are the most commonly used non-uniform random number generation algorithms [11,14,15,16].Particularly, while generating non-uniform random numbers from bivariate probability density function, the limit of the probability density functions is defined as rectangular [17,18,19,20].If the non-uniform random numbers are generated from a bivariate probability density function in an arbitrary area (polygon), then the distribution bounded within the area will be uniform.In this case, the non-uniform random numbers are generated by looking into whether it falls into the area [14,15,21,22,23].Furthermore, random numbers are generated over a unit hyperball, hypersphere and hyperellipsoid.But similar to the above, they have a uniform distribution [15].Contrary to the literature, the non-uniform random numbers are generated from an arbitrary bivariate distribution function in an arbitrary polygonal area.
The process of generating random numbers from the given limits of the bivariate probability density function, which is defined as a rectangular range, can be easily adopted to work with both of the nonuniform random number generation methods.However, in many simulations, it is necessary to use a bivariate probability density function to model the problem within two-dimensional spaces.The definition limits of this probability density function can be an arbitrary non-uniform area rather than a rectangular area [24,25].In this case, it is not possible, through conventional methods, to perform conversions required for the non-uniform random number generation.The motivation of this study comes from the fact that the generation of the nonuniform random numbers from an arbitrary bivariate distribution within a polygonal area by using the rejection and the inversion methods is not improved in the literature.Because real-life problems are not always limited to uniform rectangular areas.
The bivariate probability density function, which is bounded by arbitrary non-uniform limits, has many fields of application in real-life problems.These problems include the amount pollution or the crime rate in a city, the distribution of earthquake frequency within a country, the dispersion density of insects in a field, the traffic congestion in a specific region, the documented locations and the frequency of an epidemic disease in a country.Also, a region of interest may have many partial density functions in the rectangular area.Many physically or politically divided cities may be given as examples for this situation (e.g.Belfast, Beirut, Jerusalem, Mostar, and Nicosia).However, the data in the politically divided cities start to change over time.In this case, it may not be possible to evaluate the entire city as a region.In each of these examples, an arbitrary polygonal area may be needed in order to generate non-uniform random numbers from the probability density function.
In this study, the generation of non-uniform random numbers is performed from the bivariate probability density function with the rejection and the inversion methods.The inversion and the rejection methods for univariate distribution functions are reminded in Section 2.1.The non-uniform bivariate random number generation in a rectangular area and its algorithms are given in Section 2.2.1.In Section 2.2.2, analytical and numerical approaches are presented for the non-uniform random number generation in a triangular area which describes 3-dimensional structures well.The algorithm using the triangular approach mentioned in Section 2.2.2 for the nonuniform random number generation in a polygonal area is given in Section 2.2.3.The performance of these algorithms is tested by simulating for real and artificial data sets in Section 3.

Generating univariate non-uniform random numbers
In all programming languages, random number generator functions have a deterministic algorithm which generates random numbers from uniform distribution in the range [0, 1).Random numbers are generated with a deterministic algorithm since such random numbers are called pseudo random numbers [11,26].The rejection and the inversion methods are the most common two approaches which are used to generate non-uniform random numbers.

Rejection method
The rejection method is useful in generating a nonuniform random number in the range [, ] from an arbitrary distribution.In order to perform this method, the   value must first be determined as the largest value of the distribution of probability density function () in the range [, ].The algorithm of the rejection method is given by Rubinstein [15].Algorithm 1.The univariate non-uniform random number generation algorithm according to the rejection method.
Step 1. Determine the probability density function f(x), to produce random number.Step 2. Generate two random numbers from the uniform distribution  1 ~ 0,1 and  2 ~ 0,1 .

Inversion method
The inversion method is the most frequently used method to generate a non-uniform random number as it is an analytically approach [26].The application of this method is only valid if the inverse of distribution function is available.In cases when the inverse of the distribution function cannot be found, numerical methods [27,28] are used.() is the desired distribution function used in generating random numbers and Equation ( 1) is used to generate random number by employing the inversion method.
~(0,1) indicates a random number is generated from the uniform distribution on the interval [0,1).If the inverse of the distribution function is solved in Equation (1).
Equation ( 2) is obtained [15].The variable  shows a random number from the distribution ().In this way, all random numbers which are generated from the uniform distribution can be converted to the desired distribution.

Generating bivariate non-uniform random numbers
Bivariate random numbers are shown in the form of (, ).Generally, these are referred to as twodimensional random points rather than bivariate random numbers.When generating univariate random numbers within a limited range, only two limit values are used.However, generating random numbers becomes more complicated in a limited area due to an infinite number of limitations of different areas.A general definition is required to overcome this complexity.This definition is the main subject of this study.First, generating a bivariate random number in simple geometric areas will be discussed.
In the subsequent sections, the method of generating bivariate random numbers in complex geometric areas will be described.Thus, the bivariate random number is generated in the form of ,  .

Generating non-uniform random numbers in a triangular area
Generating a random number in the triangular area by using the rejection method will be explained comprehensively in Section 2.2.3.This section will be focused on generating a random number with the inversion method in the triangular area.
Although the calculation of the integration calculus from the bivariate probability density function according a fixed limit on the rectangular area is a relatively simple process, the integration calculus in the triangular area is more complex.To eliminate this complexity, the median of the components   ,   ,   of x is found from the corner coordinates of the triangle area of Ω in Figure 1 The marginal probability density function of the triangle Ω  is defined in Equation ( 4).
shows that the equation of the line segment between A and B points. shows the equation of the line segment between A and C points.The distribution function of the obtained marginal probability density function is calculated in Equation (5).
The volume of left area (probability value) is obtained in Equation (6).
Likewise, the marginal probability density function of the triangle Ω  is defined in Equation (7).
The conditional probability density function of Y (given  =  in the Ω L area) is defined in Equation (9).
If  <   , Equation ( 9) is valid, otherwise the conditional probability density function of Y (given  =  in the Ω  area) is defined in Equation (10).
The conditional distribution of  (given  =  in the Ω  area) is defined Equation (11).
The conditional distribution function of  (given  =  in the Ω  area) is defined in Equation (12).
The analytical equations found here are calculated according the joint distribution function in Algorithm (4a).
Step  The inversion method is performed by calculating the distribution function from a given probability density function and the inverse distribution function from the distribution function.Therefore, the cumulative distribution function of the marginal probability function and the conditional probability density function and their inverse distribution function must be calculated in Section 2.2.2.The distribution function of the selected any probability density function cannot always be calculated.Even when the distribution function is calculated, it is often impossible to have an analytic function of the inverse distribution function.In this case, the rejection method or a numerical approach is required for solving the problem [28].A numerical approach is chosen in this section.The bivariate probability density function in the selected triangle must be transformed into a linear function for the numerical approach.First of all, the probability density value at the corner coordinates of the triangle is determined by the following equation.
=     ,   ,  = , , Thus, the plane equation passing through the points  1   ,   ,   ,  2 (  ,   ,   ) and  3 (  ,   ,   ) can be found by the following determinant.When the determinant is calculated, the following plane equation is obtained.
The marginal probability density function, the conditional probability function, their distribution and the inverse distribution function always can be found from the obtained plane joint probability density function.Thus, the analytical approach given in Section 2.2.2.can be used easily.A certain amount of error is made with this approach by converting an arbitrary joint probability density function into a planar function.This error can be reduced by dividing the selected triangle into the smaller subtriangles.

Generating non-uniform random numbers in a polygonal area
Generally, bivariate statistical calculations are performed in a rectangular area.In reality, however, the probability of statistical data being found within a rectangular area is quite low.Thus, statistical calculations must have the potential to be conducted in areas not of a rectangular shape.In this section, the boundaries of Korea are given Figure 3(a).In order to calculate the area of Korea, said area must be defined as a polygonal area.Dominant points of the boundaries are determined in order to change arbitrary area into a polygonal area.These dominant points can be determined manually or automatically by employing using polygonal approximation algorithms [29] (Figure 3(b)).
Whether a point is within the polygon or not must be investigated while generating random numbers with both the rejection and inversion methods.This problem is known as the point problem in the polygons [30,31].A perpendicular straight line is drawn by connecting from a randomly selected (, ) point to line  =   according the proposed odd-even method for solving this problem.Also, how many points of constituent all lines of the polygon are intercepted by the perpendicular straight line must be determined.If the number of the crossing points is odd, the selected point is determined to be inside of polygon.If the number of crossing points is even, the selected point is determined to be outside of the polygon.
The rejection method is the first method which explains the generation of a non-uniform random number from the arbitrary joint density function in a polygonal area.Algorithm 5 is given to generate a random number from a bivariate distribution function by employing the rejection method within a polygonal area.Firstly, the polygonal area needs to be divided into triangles to generate non-uniform random numbers using the inversion method.The Delaunay algorithm is used in this task [32,33].After polygonal area is divided into several triangles, the probability value (the probability value in the triangle,   ) of each triangle (Ω  ) is defined.This is demonstrated in Equation ( 19) [34].
In order to calculate the integral in a triangular region, as shown in Figure 1(b), the triangle is divided into two parts.The sum of the integral value is calculated separately in Equation (20).
These integrals' boundaries may be replaced.Therefore, the absolute value of these integrals must be calculated.The sum of the probability values of all the triangles must be equal to 1 because of the required probability axioms, such as the one seen in Equation (21).
A random triangle is selected according these probability values.The Algorithm 4a-b is used to generate random numbers in the selected triangle.Algorithm 6 is given in order to generate a random number using the inversion method in a polygonal area.
Algorithm 6. Non-uniform random number generation algorithm in the polygonal area by using the inversion method.
Step 1. Divide the polygonal area into triangles with dominant points and selected grid points (Figure 4

Results
Three separate samples are determined for generating non-uniform random numbers in the various geometric areas with the help of algorithms developed in Section 2.2.2. and Section 2.2.3.In one of these examples, the non-uniform random number generation in a triangular area is discussed.The another example, the non-uniform random number generation in the shape of the mainland of Korea (surrounding islands being excluded) is discussed.
The last example, the non-uniform random number generation is performed with the help of the probability density values which are obtained by using the frequencies of the Australian rainforest data.Also, latitude-longitude is used in all samples.The latitudes and longitudes used in calculations are assumed to be in cartesian coordinates.
In three example, the non-uniform random number generation is performed by a simulation.The aims of these simulations are to find which random numbers generation methods run faster than the other methods and whether or not the generated random numbers, those generated by using the rejection and the inversion methods, come from the given joint probability density function.The simulation is made on a computer that has Intel® Core (TM) i5-4440 CPU, 8 GB of RAM.Also, Matlab® R15b software is used to make the calculations.

Non-uniform random number generation in a triangular area
In the first example, a probability density functions which is bounded by a triangular area, is given in Equation ( 22).The values of the probability density function are arbitrarily determined to provide that the total value of the probability density function calculated in the triangular area equals to 1.
,  = The inverse marginal distribution function and the inverse conditional distribution function of the joint probability density function given in Equation ( 22) is required for generating the non-uniform random numbers.However, since this process cannot be found analytically, the numerical method given in Section 2.2.2 is preferred.According to this, firstly the triangular area is taken as a single triangle (Figure 6 Random numbers are generated from the given probability density function according the two different generated non-uniform random number methods.These random numbers are generated in two different sizes:  = 200 and  = 1000.This process is repeated 10,000 times and then the simulation is performed.The generated random numbers by the inversion method are illustrated in Figure 7.

𝝌 𝟐 goodness of fit test for the triangle example
The  2 goodness of fit test is used to determine whether or not the generated random numbers come from the given joint probability density function in a triangular area [25].Firstly, the triangular area is divided into classes by employing the  2 goodness of fit test in a triangular area.Generally, a rectangular area is used for the classification for bivariate examples.However, due to the difficulty of rectangular classification in a triangular region, the area is divided into 25 triangular classes (Figure 8(a)).Also, the expected frequency in the same small triangles can be ascertained by calculating the integral over the piece triangle mentioned in the previous section (Figure 8 The observed frequencies are found by dividing the triangular area into small classes in order to test the random numbers which are generated from each method (Figure 9).
The Chi-square test is performed 10,000 times for each number of triangles by using observed and expected frequencies.A summary of the obtained results from the experiments are shown in Table 1.The observed frequencies of the random number generated from the inversion method.
The random numbers which are generated by utilizing the inversion method is tested at the 95% confidence level for  = 200.The  0 hypothesis is accepted in approximately 94.26% of the 10,000 trials and the average computational time (ACT) is 0.0039 second when one triangle is used.Furthermore, the  0 hypothesis is accepted in approximately 97.52% of the 10,000 trials and the average computational time (ACT) is 0.0137 second when four triangles are used.On the other hand, the random numbers which are generated by utilizing the inversion method is tested at the 95% confidence level for  = 1000.The  0 hypothesis is accepted in approximately 86.56% of the 10,000 trials and the average computational time (ACT) is 0.0063 second when one triangle is used.Furthermore, the  0 hypothesis is accepted in approximately 95.83% of the 10,000 trials and the average computational time (ACT) is 0.0167 second when four triangles are used.

Using pdf from theoretical function
The Korea mainland is selected as the polygonal area in the second example.All areas of the Korea mainland are assumed to be surface areas for generating random numbers.This region is shown in Figure 10(a) and the probability density function in the same region can be seen in the Figure 10(b).Also, the piecewise probability density function in this region is given in Equation (23).Equation ( 23 It is aimed to generate the non-uniform random numbers according to the rejection and the inversion methods using JPDF given in Equation ( 23).However, the numerical approach is preferred to generate random numbers by the inversion method since the inverse marginal distribution function of the given piecewise JPDF functions and the inverse conditional distribution function cannot be found analytically.
Before generating a random number, the polygonal area must be divided into triangles (Figure 10).

𝝌 𝟐 goodness of fit test for the Korea example
Firstly, the polygonal area is divided into classes by employing the  2 goodness of fit test in polygonal area (Figure 12(a)).Also, the expected frequency in the small triangles can be ascertained by calculating the integral over the piece triangle mentioned in the previous sections (Figure 12 The chi-square test is performed 10,000 times for each method by using observed and expected frequencies.A summary of the obtained results from the experiments are shown in Table 2.  0 : The generated non-uniform random numbers come from the given joint probability density function in the Equation ( 23). 1 : The generated non-uniform random numbers do not come from the given joint probability density function in the Equation ( 23).
The random numbers which are generated by utilizing the rejection and the inversion methods are tested at the 95% confidence level.In these two methods, the  0 hypothesis is accepted in approximately 95% of the 10,000 trials for both sample sizes.The average computational time (ACT) is 0.3475 second for  = 1000 and the ACT is 0.7348 for  = 2000 when the rejection method is utilized.Also, the ACT is 0.2841 second for  = 1000 and the ACT is 0.3047 for  = 2000 when the inversion method is utilized.

Using pdf from rainfall data
It may not always be possible to obtain a probability density function from real-life data.In this case, a probability density function approach can be obtained by taking the frequencies of the data.However, since the JPDF is not determined analytically, the non-uniform random number generation can be performed by using the numeric approach in Section 2.2.2.A model is obtained by using the average annual rainfall data of Australia for this example.The modeling is performed by collecting the real data in ranging (0.05) from latitude-longitude coordinates [36].The modeled probability density function is shown in Figure 14.

𝝌 𝟐 goodness of fit test for the Australia example
The polygonal area is divided into classes by employing the  2 goodness of fit test in polygonal area.Also, the probability density function bounded in the Australia mainland is determined from rainfall data (Figure 15(a)).The observed frequencies are found by dividing the polygonal area into small classes in order to test the random numbers which are generated from each method (Figure 15 (b)).
The chi-square test is performed 10,000 times for each method by using observed and expected frequencies.A summary of the obtained results from the experiments are shown in Table 3.The expected frequency in the small triangles can be ascertained by calculating the integral over the piece triangle mentioned in the previous sections.The expected frequencies are calculated by using the inversion and the rejection methods (Figure 16).The random numbers which are generated by utilizing the rejection and the inversion methods are tested at the 95% confidence level.In these two methods, the  0 hypothesis is accepted in approximately 95% of the 10,000 trials for both sample sizes.The average computational time (ACT) is 2.9106 second for  = 1000 and the ACT is 5.6783 for  = 2000 when the rejection method is utilized.Also, the ACT is 1.2611 second for  = 1000 and the ACT is 1.3725 for  = 2000 when the inversion method is utilized.

Discussion and Conclusion
The non-uniform random numbers are generated from an arbitrary bivariate distribution within a polygonal area by using the rejection and the inversion methods.Three separate samples are determined to generate non-uniform random numbers in the various polygonal areas with the help of developed algorithms from bivariate distribution functions.
In the first example, the generated random numbers come from the given joint probability density function in approximately 95% of the 10,000 trials when one triangle and four triangles are used for both sample sizes,  = 200 and  = 1000.The high percentage of acceptance demonstrates that the random number generation algorithms run correctly for each approximation in the triangular area.That said, the first approximation is faster than the other approximation (for four triangle) in terms of the average computational time.In the second example, the random numbers which are generated with the rejection and the inversion methods are tested at 95% confidence level.In these two methods, the generated random numbers come from the given joint probability density function in approximately 95% of the 10,000 trials for both sample sizes,  = 1,000 and  = 2,000.The high percentage of acceptance shows that the non-uniform random number generation algorithms run correctly for each method in the polygonal area.Also, the inversion method is faster than the rejection method in terms of the average computational time.In last example, the random numbers which are generated by utilizing the rejection and the inversion methods are tested at the 95% confidence level.In these two methods, the generated random numbers come from the given joint probability density function in approximately 95% of the 10,000 trials for both sample sizes,  = 1,000 and  = 2,000, Since the level of significance is taken as α = 0.05, this high percentage of acceptance demonstrates that the nonuniform random number generation algorithms run correctly for each method in the polygonal area.That said, the inversion method is faster than the rejection method in terms of the average computational time.
In conclusion, the non-uniform pseudo random numbers are generated simply, consistently and accurately from a bivariate probability density function whose definition range is bounded by an arbitrary polygonal area by employing the rejection and the inversion methods.
(a).Drawing a parallel line from the y-axis to the median value divides the triangle into two parts.This division makes the calculation of the integral in the triangle area of Ω in (Figure1(b)) less complicated.The Ω region is divided into two triangles (  ,   ) and the sum of the total integral value can be found separately in Equation (3).

Figure 1 .
Figure 1.The representation of the triangular region; (a) Ω region; (b) The determination of the triangular region of Ω  and Ω  . shows the equation of the line segment between B and C points. shows the equation of the line segment between A and C points.The distribution function from marginal distribution function of the second triangle is obtained in Equation (8).

Algorithm 4a .
The non-uniform random number generation algorithm in the triangular area according to the inversion method (Analytical Part).Step 1. Determine the bivariate probability density function  ,  to generate a random number and the boundaries of the triangular region Ω .Step 2. Divide the region Ω into two triangles according the median apse (  ).Step 3. Compute the marginal probability density functions of each triangle like in Equation (4) and Equation (7).Step 4. Calculate the (  ) value which is the integral of the marginal probability density function from the left triangle in the range   ,   like in Equation (6).Step 5. Calculate the distribution functions of the marginal probability density functions in Step 3 like in Equation (5) and Equation (8).Step 6. Calculate the distribution functions of the conditional probability function of region Ω  and Ω  (when  =  given), by utilizing Equation (11) and Equation (12).Determine the inverse functions of these distribution functions.In Algorithm 4b, the random number generation in the triangular area is performed by utilizing the calculations made analytically in Algorithm 4a.Algorithm 4b.The non-uniform random number generation algorithm in the triangular area according to the inversion method (Simulation Part).Step 1. Generate two random numbers from the uniform distribution   ~ 0,1 and   ~ 0,1 .Step 2. Substitute   value in the following inverse of the marginal distribution function and generate the component of  of the bivariate random number (Figure 2(a)).

Figure 2 .
Figure 2. The non-uniform random number generation in the triangular area; (a) the generation of the component  of bivariate random number; (b) the generation of the component  of the bivariate random number.

a b Figure 3 .
Definition of the polygonal area; (a) An arbitrary piecewise area; (b) The polygonal area obtained by determining the dominant points.

Figure 4 .
Figure 4.The implementation of the Delaunay triangulation method in the polygonal area; (a) The selection of grid points (b) The division of the polygonal area into triangles.

Figure 5 .Figure 6 .Figure 7 .
Figure 5. Definition of the first example, (a) The triangular area; (b) The probability density function in the triangular area.

Figure 8 .
Figure 8. Calculation of the expected frequency in the triangular region for 1000 samples; (a) Dividing the triangular region into small triangles; (b) The expected frequencies of each small triangle.

𝐻 0 :Figure 9 .
Figure 9.The observed frequencies according each method for 1000 samples; (a) The observed frequencies of the random number generated from the rejection method; (b) The observed frequencies of the random number generated from the inversion method.

Figure 10 .
Figure 10.Definition of Example 2; (a) The probability density function in the polygonal area with colored surface on the Korea mainland; (b) The probability density function in the polygonal area with mesh surface on the Korea mainland.

Figure 11 .
Figure 11.The non-uniform random number generation; (a) Random numbers generated by using the inversion method; (b) Random numbers generated by using the rejection method.

Figure 12 .
Figure 12.Calculation of the expected frequency in the polygonal region; (a) Dividing the polygonal area into small triangles; (b) The expected frequencies of each small triangle.

Figure 13 .
Figure 13.The observed frequencies according each method; (a) The observed frequencies of the non-uniform random numbers generated by utilizing the rejection method; (b) The observed frequencies of the non-uniform random numbers generated by utilizing the inversion method.

Figure 14 .
Figure 14.Definition of Example 3 (Australia mainland); (a) The colored contour representation of the observed frequencies; (b) The colored contour representation of the probability density function.

𝐻 0 :Figure 15 .
Figure 15.The polygonal area (Australia mainland); (a) The probability density function bounded in the Australia mainland; (b) The observed frequencies on the Australia mainland.

Figure 16 .
Figure 16.The representation of the expected frequencies; (a) The expected frequencies calculated by the inversion method; (b) The expected frequencies calculated by the rejection method.

Algorithm 5 .
The non-uniform random number generation algorithm in the polygonal area according to the rejection method.Step 1. Determine bivariate probability density function   ,  to generate a nonuniform random number and the coordinates of the nodal points of the polygonal area   ,   .Step 2. Generate two random numbers from the uniform distribution ~(  ,   ) and ~   ,   .These upper and lower boundaries are determined by choosing points which create the polygon nodal points.  = min     = max     = min     = max Step 3. Examine the generated random number (, ) to determine whether is in polygon or not.If the generated number is outside of the polygon, generate a new random number (, ) by returning to Step 2. Step 4. Generate a random number from the uniform distribution in the shape of ~(0,   ).Also,   value is the largest value of   ,  function in the region   ,  ∈ Ω. Step 5.If z>   ,  , reject the generated random number ,  and continue processing and go to Step 2, otherwise accept the generated random number ,  .

Table 1 .
The results obtained from 10,000 trials for the inversion method

Table 2 .
The results obtained from N=10,000 trials *(POA:The percentage of accepted hypothesis; ACT: Average computational time)

Table 3 .
The results obtained from N=10,000 trials