Article

Critical Analysis of Beta Random Variable Generation Methods

by Elena Almaraz Luengo 1,* and Carlos Gragera 2
1 Department of Statistics and Operational Research, Faculty of Mathematical Science, Complutense University of Madrid, 28040 Madrid, Spain
2 Faculty of Mathematical Science, Complutense University of Madrid, 28040 Madrid, Spain
* Author to whom correspondence should be addressed.
Mathematics 2023, 11(24), 4893; https://doi.org/10.3390/math11244893
Submission received: 28 October 2023 / Revised: 18 November 2023 / Accepted: 20 November 2023 / Published: 6 December 2023
(This article belongs to the Section Probability and Statistics)

Abstract:
The fast generation of values of the beta random variable is a subject of great interest with multiple applications, ranging from purely mathematical and statistical ones to applications in management and production, among others. There are several methods for generating these values, and one of the essential points in their design is the selection of random seeds. Two interesting aspects converge here: the use of sequences as inputs (and the need for them to verify properties such as randomness and uniformity, which are checked through statistical test suites) and the design of the algorithm for the generation of the variable. In this paper, we analyse, in detail, the algorithms that have been developed in the literature, both from a mathematical/statistical and a computational point of view. We also provide an empirical study using R software, which is currently in high demand; this is one of the main novelties with respect to previous comparisons, which were carried out in FORTRAN. We establish which algorithms are more efficient and in which contexts, depending on the different values of the parameters, allowing the user to determine the best method given the experimental conditions.

1. Introduction

The study of different algorithms for generating random variables is currently a topic of great interest (see, for example,  [1,2,3,4,5], among others).
The rapid generation of beta random variables is essential for simulating real models in a diverse group of disciplines. It is possible to define random variable (r.v.) generation as the process of obtaining a realisation of a r.v. from a target distribution.
The generation of random numbers, or numbers that behave as such, is crucial for obtaining sequences of values that come from other distributions, such as the normal, exponential, or beta distributions. To this end, the simulation process begins with the generation of values that can be considered sequences of independent numbers drawn from a uniform variable on the interval (0,1) (or, if working at the bit level, 0–1 independent uniformly distributed values), which constitute the starting point for the algorithmic generation of values of the desired probability distribution. It is therefore essential to start from quality values (i.e., values that really verify the desired properties of randomness and uniformity). At this point, we will take a closer look at some of the most important concepts related to random and pseudo-random sequences and the tests that must be carried out to check the quality of the sequences. In general, it is possible to speak of true random numbers and pseudo-random numbers (see, for example, [6,7,8,9,10,11], among others). The former are based on physical sources and do not need an initial input to be generated. They are not expected to show any cyclical patterns or correlations between the generated data. The latter are sets of values that, although generated through algorithms and an initial seed, resemble values from independent realisations of a uniform r.v. on the interval (0,1) (denoted $U(0,1)$). As they are generated via deterministic algorithms, they are reproducible. Generators that produce the first type of numbers are called true random number generators (TRNGs), while the second type is produced by pseudo-random number generators (PRNGs). Given the high material and time cost of using TRNGs, which require an entropy source and a processing function, it is more common to work with PRNGs, since they are much faster and computationally more efficient. One line of research related to this generation is the creation of more powerful and secure algorithms that produce such sequences while also providing a certain degree of security, preventing the output from revealing the underlying algorithm (in this sense, the study of so-called cryptographically secure pseudo-random number generators is of interest). For more details on the generation of pseudo-random numbers, see [12].
Once the sequence of pseudo-random numbers has been obtained, and before it is used for other purposes such as the simulation of a system or the generation of values of other random variables, it is necessary to check whether the properties of uniformity and randomness (and, when working in cryptographic applications, unpredictability) are verified. For this purpose, different hypothesis tests are used to examine them from different perspectives. The aim is to measure whether the results obtained with the samples under study are compatible with those that would be obtained when working with truly random sequences. By carrying out numerous tests on different sequences of a generator, it is possible to judge the goodness of the generator and, therefore, to decide whether it is appropriate to use the sequences in question for subsequent analyses. It is important to emphasise this aspect because various algorithms for generating beta random variables will be discussed below, starting from inputs that are assumed to follow a certain prior distribution; if this assumption is not verified, the results of such algorithms will not be as expected.
In order to check sequences effectively, different sets of tests known as test suites or test batteries are used. There are many suites in the literature, such as NIST STS 800-22 [13], Diehard [14], Dieharder [15], ENT [16], FIPS [17,18,19], and TestU01 [20]. The most widely used suites at present are NIST STS 800-22, Dieharder, and TestU01. Once the sequences have been checked and found to be of quality, they can then be used in the corresponding downstream processes.
In this paper, we will focus on analysing the existing algorithms in the literature for the generation of beta random variables. The study will be carried out from a critical point of view and will analyse the mathematical/statistical and computational properties of such algorithms under the prior assumption that their inputs have been previously checked by means of a suite of statistical tests. This is an exhaustive and complete review that covers all the methods developed, classified by typology, and provides an overview of the techniques used in their design. Additionally, we will carry out an empirical study to analyse the performance of the different algorithms using R software, which is one of the most widely used programs at present, both in academia and research, and which also provides free access.
The outline of this work is as follows: in Section 2, the basic concepts about the distribution under study are explained; in Section 3, the main existing methods in scientific research for the generation of the beta r.v. are discussed in detail; in Section 4, a computational study on the efficiency of the previously explained methods is carried out; and finally in Section 5, the main conclusions of this work are given.

2. Preliminaries

A r.v. is a measurable function $f : (\Omega, \mathcal{A}, P) \to (\mathbb{R}, \mathcal{B})$, where $(\Omega, \mathcal{A}, P)$ is a probability space ($\Omega$ is the sample space, $\mathcal{A} \subseteq \mathcal{P}(\Omega)$ is a $\sigma$-algebra, and $P$ is the probability measure defined on $(\Omega, \mathcal{A})$), and $\mathcal{B}$ is Borel's $\sigma$-algebra over $\mathbb{R}$.
A beta r.v. with the parameters $\alpha > 0$ and $\beta > 0$ (denoted by $X \sim beta(\alpha, \beta)$) is a continuous r.v. with a density
$$f(x) = \frac{x^{\alpha-1}(1-x)^{\beta-1}}{B(\alpha,\beta)}, \quad 0 \le x \le 1 \qquad (1)$$
where $B(\alpha,\beta) = \int_0^1 u^{\alpha-1}(1-u)^{\beta-1}\,du = \frac{\Gamma(\alpha)\Gamma(\beta)}{\Gamma(\alpha+\beta)}$.
The density of this distribution can take a number of interesting different forms depending on its parameters. This versatility in the forms it can take is what makes the beta r.v. so important in modelling real-world phenomena (often after re-scaling the defining interval to an interval $[a, b]$ via the transformation $a + (b-a)X$). In addition, special cases include the uniform r.v. when $\alpha = \beta = 1$; the gamma r.v. on the boundary when $\alpha, \beta \to \infty$ and $\alpha/\beta$ remains constant; and the normal r.v. when $\alpha, \beta \to \infty$ and $\alpha = \beta$, among the most relevant ones. Also, small transformations of the beta distribution provide many of the families of the Pearson distribution system [21]. By providing a relationship between these important distributions, the beta r.v. can be a suitable model where the more commonly used distributions fail. This versatility is important in simulation modelling, as the beta distribution is often used when closed-form results are more difficult to obtain, such as when component lifetimes do not follow a normal, exponential or Erlang distribution [21]. This is why the beta r.v. is widely used in Bayesian statistics [22], simulation [23,24], management and production models [25,26,27], control systems [28], the study of genome structures [29] and machine learning [30], among many others.

3. Methods for Generating Beta Random Variables

In this section, we will describe the main methods for generating beta random variables, from the most general methods (i.e., methods not specifically designed to generate values of beta random variables, which can be difficult to adapt and are, in general, less computationally efficient as a result) to more specific and concrete methods.

3.1. Inverse Transform Method

Let X be a continuous r.v. with a cumulative distribution function F which is continuous and strictly increasing on $(0,1)$. Let $F^{-1}$ be the inverse of F. The method starts by generating $U \sim U(0,1)$ and afterward takes $X = F^{-1}(U)$. To check the consistency of the algorithm, note the following for $x \in \mathbb{R}$: $P(X \le x) = P(F^{-1}(U) \le x) = P(U \le F(x)) = F(x)$. For more details about this method and improvements, see, for example, [31].
One of the main characteristics of this method is its simplicity, as it allows the generation of random variables from only one uniform r.v. and the calculation of the inverse distribution function $F^{-1}$. Moreover, it preserves the structural properties of the underlying uniform pseudo-random number generator. This makes it possible to generate correlated random variables, generate order statistics efficiently, and obtain samples from truncated or marginal distributions in a simple way [32]. But this method also has disadvantages, since it is only computationally efficient and accurate if $F^{-1}$ is known. For a beta r.v., this only happens when either of its two parameters is equal to one (see Algorithm 1). Regrettably, it is not always possible to obtain $F^{-1}$ (see [33,34]), and thus other approaches are employed, such as numerical methods (even if they are approximate). Among the most well-known methods of numerical inversion are the "root-finding algorithms", such as Newton's method, the false position method [35] and the bisection method [32], but these algorithms are generally time-consuming. Other methods use the interpolation of tabulated values of F, as in [33]. Finally, in [31], another interesting algorithm was designed (NINIGL) that allows continuous random variables such as beta ones to be generated by numerical inversion.
Algorithm 1 Inverse transform method for generating beta(α, 1)
Require: α
Output: X ∼ beta(α, 1)
1: Generate U ∼ U(0,1)
2: X ← U^(1/α)
3: return X
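As a quick illustration, the following is a minimal R sketch of Algorithm 1 (the function name rbeta_inverse is ours, purely illustrative): since $F(x) = x^\alpha$ for a beta(α, 1) r.v., the inverse is $F^{-1}(u) = u^{1/\alpha}$.

# Inverse transform method for beta(alpha, 1): F(x) = x^alpha on [0,1],
# so F^{-1}(u) = u^(1/alpha); vectorised over n draws.
rbeta_inverse <- function(n, alpha) {
  u <- runif(n)     # U ~ U(0,1)
  u^(1 / alpha)     # X = F^{-1}(U) ~ beta(alpha, 1)
}

A call such as rbeta_inverse(10000, 2.5) can be checked against R's built-in rbeta(10000, 2.5, 1).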
In [31], a new method of inversion is proposed. It is the first algorithm of its kind based on error control, and it can be applied to all smooth and bounded densities. The authors use polynomial interpolation techniques for the inverse CDF and Gauss–Lobatto integration in their development. They apply their algorithm not only to beta variable generation, but also to the normal, gamma, and t-distributions. This is a very fast algorithm with accuracy close to machine precision. This research line is still being pursued, and we can highlight very recently published works based on the previous one, for example [36], where the author presents an inversion method designed for the varying-parameter case, provided a suitable density transformation can be found, so that the setup does not have to be rerun for each parameter.

3.2. Composition Method

This method is applied when the distribution function $F(x)$ of the variable to be generated can be expressed as a combination of other distributions $F_1, F_2, \ldots$ that are easier to sample from than the original one. That is, $F(x) = \sum_{j=1}^{\infty} p_j F_j(x)$ with $p_j \ge 0$, $\sum_{j=1}^{\infty} p_j = 1$ and $F_j(\cdot)$ a cumulative distribution function of a r.v. $\forall j$. To verify the consistency of this method, note that $P(X \le x) = \sum_{j=1}^{\infty} P(J = j)\, P(X \le x \mid J = j) = \sum_{j=1}^{\infty} p_j F_j(x) = F(x)$.
Algorithm 2 shows this method; it is suitable when the distribution we want to generate is itself a mixture. For other types of distributions, the application of this method can be complex. Therefore, it is often used in conjunction with other principles.
Algorithm 2 Composition method for generating a beta(α, β) r.v. [37]
Require: Decomposition of the distribution function $F(x) = \sum_{j=1}^{\infty} p_j F_j(x)$
Output: Random sample x from the beta(α, β) r.v.
1: Generate a random index J such that P(J = j) = p_j for j = 1, 2, …
2: Generate x with distribution F_J(·)
3: return x
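Since a single beta density does not naturally split into an easily sampled mixture, a hedged illustration of Algorithm 2 on a genuine mixture may be clearer. The sketch below is our own construction (not taken from [37]): it samples a 50/50 mixture of beta(α, 1) and beta(1, β), both of which admit closed-form inverses.

# Composition method for F(x) = p_1 F_1(x) + p_2 F_2(x):
# F_1 is beta(alpha, 1) and F_2 is beta(1, beta), both invertible in closed form.
rmix_beta <- function(n, alpha, beta, p = c(0.5, 0.5)) {
  j <- sample(1:2, n, replace = TRUE, prob = p)  # random index J with P(J = j) = p_j
  u <- runif(n)
  ifelse(j == 1,
         u^(1 / alpha),      # F_1^{-1}(u) for beta(alpha, 1)
         1 - u^(1 / beta))   # F_2^{-1}(u) for beta(1, beta)
}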

3.3. Acceptance-Rejection (A–R) Method

This method was proposed by John von Neumann in 1951 [38]. It is a flexible and efficient method for generating continuous random variables. The concept of acceptance–rejection consists of generating random samples from a given distribution and discarding some of them so that the remaining samples follow the desired distribution. This idea, together with the results described hereafter (whose proofs can be seen in [39]), will enable the method to be established and its consistency to be justified.
Theorem 1. 
Let $f(x)$ be a density function and c be a nonnegative constant. If the random pair $(X, Y)$ is uniformly distributed in $G_{cf} = \{(x,y) : 0 < y \le cf(x)\}$, then X is a r.v. with a density $f(x)$.
Theorem 2. 
Let $(X_1, Y_1), (X_2, Y_2), (X_3, Y_3), \ldots$ be a sequence of independent and identically distributed (i.i.d.) random pairs distributed uniformly in a set A. Then, for a (measurable) subset $B \subseteq A$ with $P((X,Y) \in B) > 0$, the subsequence of all pairs $(X_i, Y_i) \in B$ is a sequence of i.i.d. pairs uniformly distributed in B.
Theorem 2 establishes that all accepted pairs are uniformly distributed over the region $G_f$ between the density function f and the x axis. Under Theorem 1, one can then ensure that the x coordinate of these accepted pairs follows a distribution with a density f. Furthermore, the following theorem shows the reciprocal idea to that of Theorem 1:
Theorem 3. 
Let X be a r.v. with a density $g(x)$ and Y be a uniform r.v. in the interval $(0, cg(X))$. Then, for some constant $c > 0$, $(X, Y)$ is uniformly distributed in $G_{cg} = \{(x,y) : 0 < y \le cg(x)\}$.
For the proof, see [32]. This method uses a density function $f(x)$ and a dominating function $t(x) \ge f(x)$. The original description of the method used the maximum of the density as the dominating function, $t(x) = \max_x f(x)$; for $\alpha, \beta > 1$, it is possible to take $t(x) = f\!\left(\frac{\alpha-1}{\alpha+\beta-2}\right)$, i.e., the density evaluated at its mode. This function $t(x)$ is not really a density function, since $c = \int t(x)\,dx \ge \int f(x)\,dx = 1$, but the function $g(x) = t(x)/c$ is. Therefore, a common way of constructing a dominating function $t(\cdot)$ for a density $f(\cdot)$ is to choose another density function $g(\cdot)$ from which samples can be easily generated and find a constant $c > 0$ such that $cg(x) \ge f(x)$ for all x in the domain of $f(\cdot)$. Such a constant can be found by taking $c = \max_x \frac{f(x)}{g(x)}$. By setting $t(x) = cg(x)$, it is possible to obtain Algorithm 3.
Algorithm 3 Acceptance–rejection method for generating a beta(α, β) r.v.
Require: Density f(x) of a beta r.v., constant c and density g(x) such that cg(x) ≥ f(x)
Output: Random sample x with distribution beta(α, β)
1: repeat
2:   Generate x with density g(·)
3:   Generate u ∼ U(0,1)
4: until u ≤ f(x)/(cg(x))
5: return x
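A minimal R sketch of Algorithm 3 follows, assuming α, β > 1 so that the beta density is bounded and a U(0,1) proposal (g ≡ 1) can be used; the constant c is then the density evaluated at the mode, and R's dbeta plays the role of f.

# Acceptance-rejection with a uniform proposal: c*g(x) >= f(x) with g == 1
# and c = f(mode), valid for alpha, beta > 1.
rbeta_ar <- function(n, alpha, beta) {
  m <- (alpha - 1) / (alpha + beta - 2)    # mode of the beta(alpha, beta) density
  cmax <- dbeta(m, alpha, beta)            # c = max_x f(x)
  out <- numeric(n)
  for (i in seq_len(n)) {
    repeat {
      x <- runif(1)                        # candidate from g
      u <- runif(1)
      if (u <= dbeta(x, alpha, beta) / cmax) break  # accept with prob f(x)/(c g(x))
    }
    out[i] <- x
  }
  out
}

The expected number of iterations per draw is c, so this naive sketch degrades quickly for peaked densities; the specialised algorithms below exist precisely to avoid that.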
The execution time of Algorithm 3 depends on (1) the time taken to generate x, (2) the time taken for the comparison $u \le \frac{f(x)}{cg(x)}$, and (3) the expected number of iterations c until the required number of values of X is achieved. The loop condition requires repeatedly evaluating the function $f(x)$, which is not always easy, especially in the case of a beta r.v. By adding an intermediate step to Algorithm 3, it is possible to accelerate the process. If there is a minorising function $s(x)$ for the density $f(x)$ that is simpler to evaluate (i.e., $s(x) \le f(x)\ \forall x \in [0,1]$), then it is not necessary to evaluate the density f to accept the sample: if the pair $(x, u\,c\,g(x))$ lies below the curve $s(x)$, then x is accepted directly. This modification is called the squeeze method [40], which is illustrated in Algorithm 4.
Algorithm 4 The squeeze method
Require: Density f(x) of a beta r.v., constant c and density g(x) such that cg(x) ≥ f(x), minorising function s(x)
Output: Random sample x coming from a beta(α, β) r.v.
1: repeat
2:   Generate x with density g
3:   Generate u ∼ U(0,1)
4:   if u ≤ s(x)/(cg(x)) then go to 6
5: until u ≤ f(x)/(cg(x))
6: return x

3.3.1. Jöhnk’s Algorithm

This method shows that, starting from two independent $U(0,1)$ random variables Y and Z, if $Y^{1/\alpha} + Z^{1/\beta} \le 1$, then $X = \frac{Y^{1/\alpha}}{Y^{1/\alpha} + Z^{1/\beta}} \sim beta(\alpha, \beta)$ (see [41]). The method is valid for all beta distributions with parameters $\alpha > 0$ and $\beta > 0$, but it requires on average $\frac{1}{P(Y^{1/\alpha} + Z^{1/\beta} \le 1)} = \frac{\Gamma(\alpha+\beta+1)}{\Gamma(\alpha+1)\Gamma(\beta+1)}$ iterations (see [32]), and this number grows rapidly with α and β; that is, when α and β are large, the computational cost is high, and thus it is recommended to restrict its use to $beta(\alpha, \beta)$ with $\alpha, \beta < 1$. This algorithm is rarely used nowadays because it is not as efficient as other methods. The pseudo-code is given in Algorithm 5.
Algorithm 5 Jöhnk's algorithm [41]
Require: Parameters α and β
Output: Random sample x coming from a beta(α, β) r.v.
1: repeat
2:   Generate u, v ∼ U(0,1)
3:   y ← u^(1/α)
4:   z ← v^(1/β)
5: until y + z ≤ 1
6: return x = y/(y + z)
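A direct R transcription of Algorithm 5 might look as follows (rbeta_johnk is an illustrative name):

# Johnk's algorithm: accept (y, z) when y + z <= 1, then y/(y+z) ~ beta(alpha, beta).
rbeta_johnk <- function(n, alpha, beta) {
  out <- numeric(n)
  for (i in seq_len(n)) {
    repeat {
      y <- runif(1)^(1 / alpha)
      z <- runif(1)^(1 / beta)
      if (y + z <= 1) break    # acceptance region; increasingly rare as alpha, beta grow
    }
    out[i] <- y / (y + z)
  }
  out
}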

3.3.2. Forsythe’s Method

The basic method is described in [42], and some extensions and applications can be seen in [43,44]. It consists of two parts: a first step where an interval is selected to generate the X r.v. and a second step where the exact value within the interval is determined by the A–R technique. This method allows the generation of random variables with a density $f(x) = Ce^{-B(x)}$, where $B(x)$ is an increasing function of x in $(0, \infty)$ and C is a real constant. The successive intervals of x have limits $q_0, q_1, \ldots$, verifying $B(q_k) - B(q_{k-1}) \le 1$. For each interval of x, two constants are calculated: $r_k = \int_0^{q_k} f(x)\,dx$ and $d_k = q_k - q_{k-1}$ ($k = 1, \ldots, K$). The number of intervals K is chosen so that $r_K$ exceeds the largest representable number less than one. In addition, to sample within the interval, it is appropriate to define the following function: $G_k(x) = B(q_{k-1} + x) - B(q_{k-1})$.
Algorithm 6 is divided into two parts: a first loop that allows us to choose the interval $[q_{k-1}, q_k)$ to which x will belong and a second one that determines the value of x in that interval.
Algorithm 6 Forsythe's algorithm [42]
Require: Functions B(x), G_k(x), constants q_k, r_k, d_k
Output: Random sample x with distribution beta(α, β)
  Selection of the interval
1: k ← 1
2: Generate u ∼ U(0,1)
3: if u ≤ r_k then go to 5
4: else k ← k + 1; go to 3
  Generation of x
5: Generate u ∼ U(0,1)
6: w ← u·d_k
7: t ← G_k(w)
8: Generate v ∼ U(0,1)
9: if v ≥ t then
10:   return x ← q_{k−1} + w
11: else
12:   Generate u ∼ U(0,1)
13:   if u < v then t ← u; go to 8
14:   else go to 5
For a beta r.v. with parameters α and β, it follows that
$$B(x) = -(\alpha - 1)\log(x) - (\beta - 1)\log(1 - x)$$
$$G_k(x) = -(\alpha - 1)\log\left(1 + \frac{x}{q_{k-1}}\right) - (\beta - 1)\log\left(1 - \frac{x}{1 - q_{k-1}}\right)$$
As for the number of intervals, in [45], it is proposed to control their width by successively solving $G_k(x) = g_{max}$, finding it satisfactory to take a value of 0.2 for $g_{max}$. The constants $r_k$ are obtained directly by integrating the density.
The main disadvantage of this method is that it requires the continual calculation of the constants $q_k$, $d_k$ and $r_k$, which reduces its efficiency.
In [45], a comparative analysis (in μs) of the performance of Jöhnk's and Forsythe's algorithms for the case of α = β < 1 was performed by programming them in FORTRAN on two computers: a Cyber 73-14 and an IBM 360/65. In this case, Forsythe's algorithm turned out to be more efficient than Jöhnk's algorithm for α = β < 1 on both the Cyber 73-14 and the IBM 360/65, except for the case where α = 0.1. As the value of α increased, the execution time of Forsythe's algorithm remained approximately the same, whereas Jöhnk's algorithm tended to run at little more than half Forsythe's speed. In [45], the case α = β > 1 was also analysed, and Forsythe's method was compared to other methods which will be described later.

3.3.3. Ahrens and Dieter’s Methods

These methods are particular cases of the A–R method based on the density of a normal r.v. The beta distribution has as its domain the interval [0,1], and when its parameters take values greater than one, its density function is bounded. This is why the density function of a beta distribution can be dominated by that of a normal r.v. without any problem in the tails. In [46], this idea was used to modify the A–R method so that samples from a beta distribution with parameters α, β > 1 could be generated by using a majorising function proportional to the density of a normal r.v. To develop this method, the beta density function is taken, without loss of generality, as $f(x) = \frac{x^A (1-x)^B\, C^C}{A^A B^B}$, where $A = \alpha - 1$, $B = \beta - 1$ and $C = A + B = \alpha + \beta - 2$. As proven in [46], $f(x) \le \exp\left(-2C\left(x - \frac{A}{C}\right)^2\right)$, $x \in [0,1]$. The left-hand side of the inequality corresponds to the (scaled) density of a beta r.v., while the right-hand side is proportional to the density of a normal r.v. with mean $\mu = \frac{A}{C}$ and standard deviation $\sigma = \frac{1}{2\sqrt{C}}$.
In Algorithm 7, the pseudo-code of this variant is presented.
Algorithm 7 BN algorithm ([46], α, β > 1)
Require: Parameters α > 1 and β > 1
Output: Random sample x with distribution beta(α, β)
1: A ← α − 1, B ← β − 1, C ← A + B, L ← C log C, μ ← A/C, σ ← 0.5·C^(−1/2)
2: Generate z ∼ N(0,1)
3: x ← zσ + μ
4: if x < 0 or x > 1 then go to 2
5: Generate u ∼ U(0,1)
6: if log u > A log(x/A) + B log((1 − x)/B) + L + 0.5z² then go to 2
7: else
8:   return x
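A compact R transcription of Algorithm 7 might read as follows (the restart for x outside [0,1] and the logarithmic acceptance test mirror steps 2–6):

# BN algorithm: normal envelope for beta(alpha, beta) with alpha, beta > 1.
rbeta_bn <- function(n, alpha, beta) {
  A <- alpha - 1; B <- beta - 1; C <- A + B
  L <- C * log(C); mu <- A / C; sigma <- 0.5 / sqrt(C)
  out <- numeric(n)
  for (i in seq_len(n)) {
    repeat {
      z <- rnorm(1)
      x <- z * sigma + mu
      if (x < 0 || x > 1) next            # outside [0,1]: draw a new normal
      u <- runif(1)
      if (log(u) <= A * log(x / A) + B * log((1 - x) / B) + L + 0.5 * z^2) break
    }
    out[i] <- x
  }
  out
}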
Generating samples of a uniform r.v. and a normal r.v., plus three logarithmic evaluations, are the least efficient computations of this algorithm. The average number of iterations is $\left(\frac{\pi}{2C}\right)^{1/2} \frac{A^A B^B\, \Gamma(\alpha+\beta)}{C^C\, \Gamma(\alpha)\Gamma(\beta)} \approx \frac{1}{2}(A+B)(AB)^{-1/2}$. This approximation is based on Stirling's formula [46] and is valid except for small values of α and β. This method reaches its maximum efficiency in the symmetric case α = β. For this case, the version of Algorithm 7 developed thus far can be improved. If α = β, then A = B and C = 2A, and the acceptance test reduces to checking whether $u \le (4x(1-x))^A e^{z^2/2}$ is true. For $x = z\sigma + \mu$, this condition can be written as $u \le q(z)$ with $q(z) = \left(1 - \frac{z^2}{2A}\right)^A e^{z^2/2}$. In addition, this function $q(\cdot)$ can be bounded to speed up the algorithm such that $1 - \frac{z^4}{8\alpha - 12} \le q(z) \le 1 - \frac{z^4}{8\alpha - 8} + \frac{1}{2}\left(\frac{z^4}{8\alpha - 8}\right)^2$. The pseudo-code can be seen in Algorithm 8.
Algorithm 8 BS algorithm ([46], α = β > 1.5)
Require: Parameter α > 1.5
Output: Random sample x with beta(α, α) distribution
1: A ← α − 1
2: t ← (A + A)^(−1/2)
3: Generate z ∼ N(0,1)
4: x ← (1/2)(1 + zt)
5: if x < 0 or x > 1 then go to 3
6: Generate u ∼ U(0,1)
7: if u ≤ 1 − z⁴/(8α − 12) then
8:   return x
9: else if u ≥ 1 − z⁴/(8α − 8) + (1/2)(z⁴/(8α − 8))² then go to 3
10: else if log u > A log(4x(1 − x)) + z²/2 then go to 3
11: else
12:   return x
In [46], the authors made a comparison in μs of the BN and BS algorithms programmed in FORTRAN and found that the computational time stabilised quickly at around 150 μs in the symmetric case. In [45], a comparison of the performance (in μs) of the BN algorithm, the general rejection method, Forsythe's method and a method based on order statistics was carried out in FORTRAN on two computers: a Cyber 73-14 and an IBM 360/65. For the case where α = 2, the general rejection method was faster than the BN algorithm but slower than Forsythe's algorithm. For the rest of the α values analysed, it was the slowest and least efficient on the Cyber 73-14 and one of the slowest on the IBM 360/65. For values of α > 3, the BN algorithm was the second fastest, with Forsythe's method being the fastest for all integer values of α except α = 2.

3.3.4. Switching Algorithms

In [47], a new procedure was developed to generate beta random variables with both parameters, or at least one of them, being less than one, for densities of the form
$$f(x) = c\, f_1(x)\, f_2(x) \qquad (4)$$
The procedure is based on the A–R method, but it also combines the composition and inversion methods.
Let $g(x) = k_1 f_1(x)$ be a density function with an associated distribution function G. If X is a r.v. with density $g(x)$ and $U \sim U(0,1)$, then we can deduce using the A–R method that X is accepted if $U \le \frac{f_2(X)}{\sup f_2}$. The average number of iterations required is $\frac{c \sup f_2}{k_1}$. This method is particularly useful for unbounded densities. In [47], two cases were developed: Case 1, with both parameters being less than one, and Case 2, with one parameter being less than one and the other being greater than one.
In Case 1, the density is not bounded when x = 0 or x = 1. To overcome the difficulties this implies for the rejection algorithm, a composition is employed so that the range [0,1] to which x belongs is divided into two parts: [0,t] and (t,1]. When x takes values greater than t, the roles of $f_1$ and $f_2$ in the density function of Equation (4) are switched. For this reason, this algorithm is named the "switching algorithm". Starting from the density in Equation (4), the following is taken:
$$f_1(x) = \alpha x^{\alpha - 1} \qquad f_2(x) = \beta (1 - x)^{\beta - 1} \qquad (5)$$
To generate a sample x from a beta r.v., we have that $x \in [0,t]$ with probability p, while $x \in (t,1]$ with probability $1-p$. If $0 \le x \le t$, then x is generated from the density $\frac{\alpha x^{\alpha-1}}{t^\alpha}$, $0 \le x \le t$, through the inversion method, while if $t < x \le 1$, then x is generated from $\frac{\beta (1-z)^{\beta-1}}{(1-t)^\beta}$, $t < z \le 1$. From the calculations described in detail in [47], the probability p and the optimal value of t are obtained as a function of the parameters of the $beta(\alpha, \beta)$ distribution to be generated:
$$t = \frac{\sqrt{\alpha(1-\alpha)}}{\sqrt{\beta(1-\beta)} + \sqrt{\alpha(1-\alpha)}}$$
$$p = \frac{\beta t}{\alpha(1-t) + \beta t}$$
At first sight, this new algorithm requires generating an exponential r.v. with parameter 1, $E \sim \exp(1)$, and two uniform random variables on [0,1]. The first uniform r.v. U is used in the test U < p that dictates whether to generate $x \in [0,t]$ or $x \in (t,1]$, while the second uniform V would be needed to generate x itself through the inversion method. However, by the properties of the uniform distribution, if U < p, then U/p also follows a uniform distribution. Therefore, it is not necessary to generate V, and it is sufficient to generate x from U/p or (1−U)/(1−p), depending on whether the test U < p is verified or not. In Algorithm 9, a pseudo-code for this method is described.
Algorithm 9 requires, on average, 1/e iterations, with
$$e = \frac{\Gamma(\alpha+1)\,\Gamma(\beta+1)}{\Gamma(\alpha+\beta)} \cdot \frac{t^{1-\alpha}(1-t)^{1-\beta}}{\beta t + \alpha(1-t)}$$
being the efficiency [47].
Case 2 does not differ much from the previous one as far as the procedure is concerned. Here, $f_1$ and $f_2$ are taken as in Equation (5), and x is generated from the same densities. However, the calculation of p and the optimal value of t varies. Now, $p = \frac{\beta t}{\beta t + \alpha(1-t)^\beta}$, and t must satisfy $h(t) = \beta t + (\alpha - 1)(1-t)^\beta - \beta t (1-t)^{\beta - 1} = 0$. Except for particular cases, this equation cannot be solved analytically, but it is easily solved numerically. By solving it in this way, Atkinson and Whittaker [47] concluded that the value closest to the optimal one is obtained when $t = \frac{1-\alpha}{\beta + 1 - \alpha}$.
Algorithm 9 Switching algorithm [47,48] (SW2 (α, β < 1))
Require: Parameters α < 1 and β < 1
Output: Random sample x with distribution beta(α, β)
1: t ← √(α(1 − α)) / (√(β(1 − β)) + √(α(1 − α)))
2: p ← βt/(α(1 − t) + βt)
3: Generate U ∼ U(0,1)
4: Generate E ∼ exp(1)
5: if U > p then
6:   X = 1 − (1 − t)((1 − U)/(1 − p))^(1/β)
7:   if (1 − α) log(X/t) ≤ E then
8:     return X
9:   else go to 3
10: else
11:   X = t(U/p)^(1/α)
12:   if (1 − β) log((1 − X)/(1 − t)) ≤ E then
13:     return X
14:   else go to 3
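The following R sketch transcribes Algorithm 9; the variable names follow the pseudo-code:

# Switching algorithm SW2 (alpha, beta < 1).
rbeta_sw2 <- function(n, alpha, beta) {
  t <- sqrt(alpha * (1 - alpha)) /
       (sqrt(beta * (1 - beta)) + sqrt(alpha * (1 - alpha)))
  p <- beta * t / (alpha * (1 - t) + beta * t)
  out <- numeric(n)
  for (i in seq_len(n)) {
    repeat {
      u <- runif(1); e <- rexp(1)           # U ~ U(0,1), E ~ exp(1)
      if (u > p) {                          # right piece (t, 1]
        x <- 1 - (1 - t) * ((1 - u) / (1 - p))^(1 / beta)
        if ((1 - alpha) * log(x / t) <= e) break
      } else {                              # left piece [0, t]
        x <- t * (u / p)^(1 / alpha)
        if ((1 - beta) * log((1 - x) / (1 - t)) <= e) break
      }
    }
    out[i] <- x
  }
  out
}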
By applying this modification in Algorithm 9, Algorithm 10 is obtained. The efficiency of Algorithm 10 is
$$e = \frac{\Gamma(\alpha+1)\,\Gamma(\beta+1)}{\Gamma(\alpha+\beta)} \cdot \frac{1}{\beta t^\alpha + \alpha(1-t)^\beta t^{\alpha-1}}$$
See [47] for more information.
Algorithm 10 Switching algorithm [47,48] (SW1 (α < 1, β > 1))
Require: Parameters α < 1 and β > 1
Output: Random sample x with distribution beta(α, β)
1: t ← (1 − α)/(β + 1 − α)
2: p ← βt/(α(1 − t) + βt)
3: Generate U ∼ U(0,1)
4: Generate E ∼ exp(1)
5: if U > p then
6:   X = 1 − (1 − t)((1 − U)/(1 − p))^(1/β)
7:   if (1 − α) log(X/t) ≤ E then
8:     return X
9:   else go to 3
10: else
11:   X = t(U/p)^(1/α)
12:   if (1 − β) log(1 − X) ≤ E then
13:     return X
14:   else go to 3
A comparative analysis of the performance (in μs) of the SW1 and SW2 algorithms, programmed in FORTRAN on a Cyber 73-14, is presented in [47]. In particular, in Case 1, Jöhnk's and the switching algorithms were compared. If α + β < 1, then Jöhnk's algorithm is preferable, while if α + β > 1, then the switching algorithm is faster. In Case 2, Whittaker suggested an A–R algorithm without decomposition, whose pseudo-code can be seen in Algorithm 11. This method was compared to the switching algorithm, with the latter being faster for all studied parameter combinations.
Algorithm 11 Whittaker's algorithm [47] (α < 1, β > 1)
Require: Parameters α < 1 and β > 1
Output: Random sample x with distribution beta(α, β)
1: Generate U ∼ U(0,1) and E ∼ exp(1)
2: Set X = U^(1/α)
3: if (1 − β) log(1 − X) ≤ E then accept X; otherwise go to 1
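In R, Whittaker's variant takes only a few lines (again a sketch with an illustrative name):

# Whittaker's A-R variant (alpha < 1 < beta): propose X = U^(1/alpha) ~ beta(alpha, 1)
# and accept with probability (1 - X)^(beta - 1), encoded via an exp(1) draw.
rbeta_whittaker <- function(n, alpha, beta) {
  out <- numeric(n)
  for (i in seq_len(n)) {
    repeat {
      x <- runif(1)^(1 / alpha)
      if ((1 - beta) * log(1 - x) <= rexp(1)) break
    }
    out[i] <- x
  }
  out
}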

3.3.5. Cheng’s Methods

The method presented in [49] is based on the A–R method, applying it to second-type beta variables $beta2(\alpha, \beta)$. (The density of a $beta2(\alpha, \beta)$ r.v. is $f(x) = \frac{x^{\alpha-1}}{B(\alpha,\beta)(1+x)^{\alpha+\beta}}$, $x > 0$, where $B(\alpha,\beta)$ is the beta function.) If $Y \sim beta2(\alpha, \beta)$, then $Z = \frac{Y}{1+Y} \sim beta(\alpha, \beta)$ [49].
Based on Algorithm 3, for generating values of a $beta2(\alpha, \beta)$ r.v., we must select $f(x)$ as the density of a $beta2(\alpha, \beta)$ r.v. and $g(x) = \frac{\lambda\mu x^{\lambda-1}}{(\mu + x^\lambda)^2}$, where μ and λ are parameters set to obtain a small value for c. Cheng suggested [49] $\mu = (\alpha/\beta)^\lambda$ and
$$\lambda = \begin{cases} \min(\alpha, \beta), & \text{if } \min(\alpha, \beta) \le 1 \\ \sqrt{\dfrac{2\alpha\beta - (\alpha+\beta)}{\alpha+\beta-2}}, & \text{if } \min(\alpha, \beta) > 1 \end{cases}$$
Additionally, $c = \max(f/g)$ is taken as a function of the given parameters: $c = \frac{4\alpha^\alpha \beta^\beta}{\lambda\, B(\alpha,\beta)\, (\alpha+\beta)^{\alpha+\beta}}$.
In Algorithm 12 the pseudo-code of this method is presented.
Algorithm 12 BA algorithm [49]
Require: Parameters α and β
Output: Random sample x with distribution beta(α, β)
  Initialisation
1: a ← α + β
2: if min(α, β) ≤ 1 then b ← max(α⁻¹, β⁻¹)
3: else b ← √((a − 2)/(2αβ − a))
4: g ← α + b⁻¹
  Generation
5: Generate u₁, u₂ ∼ U(0,1)
6: v ← b log(u₁/(1 − u₁))
7: w ← α·e^v
8: if a log(a/(β + w)) + gv − 1.3862944 < log(u₁²u₂) then go to 5                 ▹ Constant is log 4
9: return x = w/(β + w)
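A direct R rendering of Algorithm 12 could be as follows (rbeta_ba is an illustrative name; log(4) replaces the hard-coded constant):

# Cheng's BA algorithm, valid for any alpha, beta > 0.
rbeta_ba <- function(n, alpha, beta) {
  a <- alpha + beta
  b <- if (min(alpha, beta) <= 1) max(1 / alpha, 1 / beta)
       else sqrt((a - 2) / (2 * alpha * beta - a))
  g <- alpha + 1 / b
  out <- numeric(n)
  for (i in seq_len(n)) {
    repeat {
      u1 <- runif(1); u2 <- runif(1)
      v <- b * log(u1 / (1 - u1))
      w <- alpha * exp(v)
      if (a * log(a / (beta + w)) + g * v - log(4) >= log(u1^2 * u2)) break
    }
    out[i] <- w / (beta + w)
  }
  out
}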
The time required by this algorithm depends mainly on the expected number of iterations c needed to obtain a sample. It is interesting to distinguish the behaviour of this quantity in two different cases: (1) when min(α, β) > 1, c does not rise above approximately 1.47, and (2) when min(α, β) ≤ 1, c can reach 4. This dual behaviour suggests modifying Algorithm 12 with the goal of increasing its speed in each of the cases. In this way, Cheng [49] developed (1) the BB algorithm for min(α, β) > 1 and (2) the BC algorithm for min(α, β) ≤ 1.
Algorithm 12 requires four logarithmic evaluations between steps 6 and 8. This can be computationally demanding. When the rejection rate is low (min(α, β) > 1), the logarithmic evaluation of step 6 cannot be avoided. However, a preliminary test can be performed to avoid the evaluations of step 8. The idea behind this test resides in the property that, since log z is a concave function, its tangent always lies above the curve [49]. Therefore, $\log z \le \theta z - \log\theta - 1$, $\forall z, \theta > 0$. Let $z = \beta + W$. If the test
$$a \log a - a\left(\theta(\beta + W) - \log\theta - 1\right) + gV - \log 4 \ \ge\ \log(U_1^2 U_2) \qquad (9)$$
is satisfied, then the value $X = \frac{W}{\beta + W}$ will be accepted.
Regarding the choice of θ, in [49], it was concluded that taking $\theta = 1/(\alpha + \beta)$ is the best option to maximise the conditional probability of satisfying the test in Equation (9) given that the sample is accepted. This value of θ has a drawback: it must be ensured that α < β. However, since if $X \sim beta(\alpha, \beta)$, then $1 - X \sim beta(\beta, \alpha)$ [37], it is possible to take, without loss of generality, α and β as the minimum and maximum, respectively, of the original parameters α₀ and β₀; if an exchange occurs between the original parameters and those taken by the algorithm, then $X = \frac{\beta}{\beta + W}$ is returned instead of $X = \frac{W}{\beta + W}$ [49].
It is also possible to avoid evaluating $\log(U_1^2 U_2)$ in the preliminary test (Equation (9)), as discussed in [49]. The exact value of θ is not critical for this test, and θ = 5 can be taken independently of the values of α and β. These modifications are given in Algorithm 13.
For the case where min(α, β) ≤ 1, rejections should be detected beforehand by performing preliminary tests between steps 6 and 8 of Algorithm 12. For this purpose, the following results are used [49]:
Lemma 1. 
If $U_1 < \frac{1}{2}$ and $\alpha \ge \beta$, then $R(x) = \frac{f(x)}{c\,g(x)}$ verifies $R(U_1) \le k_1\left(\frac{1}{2} - U_1\right)^{-2}$, where $k_1 = \frac{\delta(1 + 3\beta)}{4\left(18(\alpha/\beta) - 14\right)}$.
Lemma 2. 
If $U_1 \ge \frac{1}{2}$ and $\min(\alpha, \beta) = \beta \le 1$, then $\frac{U_1^2}{4} \le R(U_1) \le k_2 U_1^2$, where $k_2 = \frac{1 + (2 + \delta^{-1})\beta}{4}$ with $\delta = 1 + \alpha - \beta$.
Algorithm 13 BB algorithm [49] (min(α₀, β₀) > 1)
Require: Parameters α₀ and β₀
Output: Random sample x with distribution beta(α, β)
  Initialisation
1: α ← min(α₀, β₀)
2: β ← max(α₀, β₀)
3: a ← α + β
4: b ← √((a − 2)/(2αβ − a))
5: g ← α + b⁻¹
  Generation
6: Generate u₁, u₂ ∼ U(0,1)
7: v ← b log(u₁/(1 − u₁))
8: w ← α·e^v
9: z ← u₁²u₂
10: r ← gv − 1.3862944                            ▹ Constant is log 4
11: s ← α + r − w
12: if s + 2.609438 ≥ 5z then go to 16                   ▹ Constant is 1 + log 5
13: t ← log z
14: if s ≥ t then go to 16
15: if r + a log(a/(β + w)) < t then go to 6
16: if α = α₀ then
17:   return x = w/(β + w)
18: else
19:   return x = β/(β + w)
The choice of k 1 and k 2 is covered in detail in [49].
Lemma 1 allows one to create a preliminary test for rejection, while Lemma 2 can be applied to create preliminary tests for both acceptance and rejection. Algorithm 14 includes these tests.
In [49], a comparison of the proposed methods in different situations was made by programming them in FORTRAN and running them on a CDC 7600 computer. The case where min(α, β) > 1 was analysed first, and the BA, BB and BN algorithms were compared, leaving aside Jöhnk's algorithm since it was substantially slower than the other methods in this case. In this situation, the BB algorithm was uniformly faster than the BA algorithm, which was uniformly faster than the BN algorithm. While Cheng acknowledged that the speed of the BN algorithm could be improved by applying a better normal r.v. generator, he estimated that the BN algorithm's times would only approach the performance of the BA or BB algorithms when α and β were approximately equal and large. Moreover, Cheng pointed out two aspects showing why the BA and BB algorithms were better than the BN algorithm. First, the performance of BA and BB remained approximately constant for any values of α and β, while the performance of the BN algorithm worsened when α and β were close to one or when their values were quite different. Second, in the case where α and β varied and had to be recalculated at each call, the BA and BB algorithms took only 5.6 μs more per variable, while the BN algorithm required an additional 9.9 μs. The case where min(α, β) ≤ 1 was also analysed. Here, the BA, BC, switching (1 and 2) and Jöhnk's algorithms were compared. None of them substantially dominated the others. If α and β were fixed and less than or equal to one, and if α + β ≤ 1, then Jöhnk's method was the best performer, while if α + β > 1, then the BC or the first switching method were dominant. If α and β were fixed and either of them was greater than one, then the second switching method was slightly faster than the BC algorithm if the other parameter was small, while the BC algorithm was slightly faster in the other cases. Here, Jöhnk's method slowed down rapidly as α or β increased. In the case where α and β were not fixed and were updated on each call, the switching methods were not as efficient, and it was Jöhnk's method that provided the best performance when α or β was small or when α + β ≤ 1.5. Outside of this region, the BA algorithm was better if α, β ≤ 1; otherwise, the BC algorithm was better.
Algorithm 14 BC algorithm [49] (min(α₀, β₀) ≤ 1)
Require: Parameters α₀ and β₀
Output: Random sample x with distribution beta(α, β)
  Initialisation
1: α ← max(α₀, β₀)
2: β ← min(α₀, β₀)
3: a ← α + β
4: b ← β⁻¹
5: δ ← 1 + α − β
6: k₁ ← δ(0.25 + 0.75β)/(18αb − 14)
7: k₂ ← 0.25 + (0.5 + 0.25δ⁻¹)β
  Generation
8: Generate u₁, u₂ ∼ U(0,1)
9: if u₁ ≥ 1/2 then go to 14
10: y ← u₁u₂
11: z ← u₁y
12: if 0.25u₂ + z − y ≥ k₁ then go to 8
13: else go to 20
14: z ← u₁²u₂
15: if z ≤ 0.25 then
16:   v ← b log(u₁/(1 − u₁))
17:   w ← α·e^v
18:   go to 23
19: if z ≥ k₂ then go to 8
20: v ← b log(u₁/(1 − u₁))
21: w ← α·e^v
22: if a(log(a/(β + w)) + v) − 1.3862944 < log z then go to 8
23: if α = α₀ then
24:   return x = w/(β + w)
25: else
26:   return x = β/(β + w)

3.3.6. BNM Algorithm

In [50], another technique for generating beta distributions was developed based on the one already developed in [46]. This new technique uses the fact that the inflection points of f(x) fall at points of the form
$$x = \frac{A \pm \sqrt{AB/(C-1)}}{C}$$
if these values are real numbers between 0 and 1. If there are no inflection points, then the density is concave. If there are two inflection points, then the density is concave between these points and convex in the rest of the interval. If there is only one point, then the density is concave in the direction of the mode and convex in the opposite direction. The mode is located at A/C when A, B > 0. For this method, it is essential to define the following points (see [50]):
$$x_1 = x_2 - \frac{x_2(1-x_2)}{A - Cx_2}, \quad x_2 = \begin{cases}\dfrac{A - \sqrt{AB/(C-1)}}{C}, & \text{if it is a real number in } [0,1] \\ 0, & \text{otherwise}\end{cases}, \quad x_3 = \frac{A}{C},$$
$$x_4 = \begin{cases}\dfrac{A + \sqrt{AB/(C-1)}}{C}, & \text{if it is a real number in } [0,1] \\ 1, & \text{otherwise}\end{cases}, \quad x_5 = x_4 - \frac{x_4(1-x_4)}{A - Cx_4} \qquad (11)$$
where $x_2$ and $x_4$ are the inflection points when they exist, $x_3$ is the mode, and $x_1$ and $x_5$ are the points at which the tangents passing through $x_2$ and $x_4$, respectively, intersect the X axis. In addition, this method performs a preliminary test with a minorising function $b_1(x) \le f(x)\ \forall x \in [0,1]$ defined by
$$b_1(x) = \begin{cases} 0, & \text{if } 0 \le x \le x_1 \\ \dfrac{x - x_1}{x_3 - x_1}, & \text{if } x_1 < x \le x_3 \\ \dfrac{x_5 - x}{x_5 - x_3}, & \text{if } x_3 < x \le x_5 \\ 0, & \text{if } x_5 < x \le 1 \end{cases}$$
For more details, see [50]. To write Algorithm 15, it is enough to define the points of Equation (11) in Algorithm 7 and add a step between steps 5 and 6 that performs the following test:
$$\text{if } u \le e^{z^2/2}\, b_1(x), \text{ return } x.$$
Although Algorithm 15 is more extensive and includes numerous conditionals, the existence of the preliminary test reduces the number of logarithmic evaluations required, which should make it more computationally efficient.
Algorithm 15 BNM [50]
Require: Parameters α > 1 and β > 1
Output: Random sample x with distribution beta(α, β)
1: A ← α − 1
2: B ← β − 1
3: C ← A + B
4: L ← C log C
5: μ ← A/C
6: σ ← 0.5·C^(−1/2)
7: x₂ ← (A − √(AB/(C − 1)))/C
8: if x₂ not real or not in [0,1] then x₂ ← 0
9: x₄ ← (A + √(AB/(C − 1)))/C
10: if x₄ not real or not in [0,1] then x₄ ← 1
11: x₁ ← x₂ − x₂(1 − x₂)/(A − Cx₂)
12: x₅ ← x₄ − x₄(1 − x₄)/(A − Cx₄)
13: x₃ ← μ
14: Generate z ∼ N(0,1)
15: x ← zσ + μ
16: if x < 0 or x > 1 then go to 14
17: Generate u ∼ U(0,1)
18: if 0 ≤ x ≤ x₁ or x₅ < x ≤ 1 then s ← 0
19: else if x₁ < x ≤ x₃ then s ← (x − x₁)/(x₃ − x₁)
20: else if x₃ < x ≤ x₅ then s ← (x₅ − x)/(x₅ − x₃)
21: if u ≤ e^(z²/2)·s then
22:   return x
23: if log u > A log(x/A) + B log((1 − x)/B) + L + 0.5z² then go to 14
24: else
25:   return x

3.3.7. B2P and B4P Algorithms

Similar to the A–R methods discussed in Section 3.3.3, by taking the density function of a beta r.v. with parameters α, β > 1 scaled as
$$f(x) = \frac{x^A (1-x)^B\, C^C}{A^A B^B} \qquad (13)$$
where $A = \alpha - 1$, $B = \beta - 1$ and $C = A + B$, Schmeiser and Shalaby [50] developed three methods based on the A–R method for generating samples of a $beta(\alpha, \beta)$ r.v. The first one is the BNM algorithm, and the other two methods are described below. Starting from the points defined in Equation (11), this method uses a majorising piecewise function $t_1(x)$:
$$t_1(x) = \begin{cases} \dfrac{x\, f(x_2)}{x_2}, & \text{if } 0 \le x \le x_2 \\ 1, & \text{if } x_2 < x \le x_4 \\ \dfrac{(1-x)\, f(x_4)}{1 - x_4}, & \text{if } x_4 < x \le 1 \end{cases}$$
This method is called the 2-points technique, since $t_1$ requires evaluating f(x) at two points: $x_2$ and $x_4$. It can easily be proven that $t_1(x) \ge f(x)\ \forall x \in [0,1]$ [50]. For $x \in [0, x_2]$ or $x \in [x_4, 1]$, $t_1$ is a straight chord joining two points of the graph of f(x), which is convex there, while for $x \in [x_2, x_4]$, $t_1(x) = 1 = f(x_3) = \max_x f(x)$, and hence $t_1(x) \ge f(x)$. This method also employs a minorising function:
$$b_2(x) = \begin{cases} 0, & \text{if } 0 \le x \le x_1 \\ \dfrac{(x - x_1)\, f(x_2)}{x_2 - x_1}, & \text{if } x_1 < x \le x_2 \\ f(x_2) + \dfrac{(x - x_2)(1 - f(x_2))}{x_3 - x_2}, & \text{if } x_2 < x \le x_3 \\ f(x_4) + \dfrac{(x_4 - x)(1 - f(x_4))}{x_4 - x_3}, & \text{if } x_3 < x \le x_4 \\ \dfrac{(x_5 - x)\, f(x_4)}{x_5 - x_4}, & \text{if } x_4 < x \le x_5 \\ 0, & \text{if } x_5 < x \le 1 \end{cases} \qquad (15)$$
It can also be proven that $b_2(x) \le f(x)\ \forall x \in [0,1]$ (see [50]). For $x \in [x_2, x_4]$, f(x) is concave, and $b_2(x)$ consists of two straight chords connecting $x_2$, $x_3$ and $x_4$, and thus $b_2(x) \le f(x)$. For $x \in [x_1, x_2]$ or $x \in [x_4, x_5]$, f(x) is convex, and $b_2(x) \le f(x)$ because $b_2(x)$ is tangent to f(x) at $x_2$ and $x_4$. By using $t_1$ and $b_2$, it is possible to develop Algorithm 16.
For the 4-points method, it is necessary to evaluate f(x) at $x_1$, $x_2$, $x_4$ and $x_5$ and consider the majorising function:
$$t_2(x) = \begin{cases} \dfrac{x\, f(x_1)}{x_1}, & \text{if } 0 \le x \le x_1 \\ f(x_1) + \dfrac{(x - x_1)(f(x_2) - f(x_1))}{x_2 - x_1}, & \text{if } x_1 < x \le x_2 \\ 1, & \text{if } x_2 < x \le x_4 \\ f(x_5) + \dfrac{(x_5 - x)(f(x_4) - f(x_5))}{x_5 - x_4}, & \text{if } x_4 < x \le x_5 \\ \dfrac{(1-x)\, f(x_5)}{1 - x_5}, & \text{if } x_5 < x \le 1 \end{cases}$$
The implementation of the 4-points method differs from the 2-points method in that four rectangular regions with a probability of rejection of zero have been created, while the rest is the same. These methods are efficient when both α and β are relatively small. However, when either parameter is large, the majorising functions $t_1(x)$ and $t_2(x)$ do not fit particularly well, causing the B2P and B4P algorithms to be less efficient. In [21], two algorithms were developed by replacing the majorising function in the tails with a better-fitting exponential dominating function.
In [50], a performance comparison of the BNM, B2P, B4P, BN, BB, Jöhnk's and ratio-of-gammas (RG) algorithms was performed on a CYBER 72 computer using a FORTRAN compiler. The study concluded that the BNM algorithm is more efficient than the BN algorithm (by a marginal runtime difference), but for most parameter values, the B2P and B4P algorithms are faster, with the B4P algorithm being less sensitive to large parameter values. For its part, the BB method was faster when the distribution was quite skewed. Jöhnk's method was more inefficient than the others, and the RG algorithm was slow because it requires the generation of two gamma random variables.
Algorithm 16 B2P algorithm [50]
Require: Parameters α, β > 1
Output: Random sample x with distribution beta(α, β)
  Initialisation
1: A ← α − 1, B ← β − 1, C ← A + B, L ← C log C, x₁ ← x₂ ← 0, x₃ ← A/C, x₄ ← x₅ ← 1, f₂ ← f₄ ← 0
2: if C ≤ 1 then go to 10
3: D ← √(AB/(C − 1))/C
4: if D ≥ x₃ then go to 7
5: else
6:   x₂ ← x₃ − D, x₁ ← x₂ − x₂(1 − x₂)/(A − Cx₂), f₂ ← exp(A log(x₂/A) + B log((1 − x₂)/B) + L)
7: if x₃ + D ≥ 1 then go to 10
8: else
9:   x₄ ← x₃ + D, x₅ ← x₄ − x₄(1 − x₄)/(A − Cx₄), f₄ ← exp(A log(x₄/A) + B log((1 − x₄)/B) + L)
10: p₁ ← x₃ − x₂, p₂ ← (x₄ − x₃) + p₁, p₃ ← f₂x₂/2 + p₂, p₄ ← f₄(1 − x₄)/2 + p₃
  Generation
11: Generate u ∼ U(0,1)
12: u ← u·p₄
13: Generate w ∼ U(0,1)
14: if u > p₁ then go to 19
15: else
16:   x ← x₂ + w(x₃ − x₂), v ← u/p₁
17:   if v ≤ f₂ + w(1 − f₂) then go to 41
18:   else go to 37
19: if u > p₂ then go to 24
20: else
21:   x ← x₃ + w(x₄ − x₃), v ← (u − p₁)/(p₂ − p₁)
22:   if v ≤ 1 − (1 − f₄)w then go to 41
23:   else go to 37
24: Generate w₂ ∼ U(0,1)
25: if w₂ > w then w ← w₂
26: if u > p₃ then go to 33
27: else
28:   x ← w·x₂
29:   v ← [(u − p₂)/(p₃ − p₂)]·w·f₂
30:   if x ≤ x₁ then go to 37
31:   else if v ≤ f₂(x − x₁)/(x₂ − x₁) then go to 41
32:   else go to 37
33: x ← 1 − w(1 − x₄)
34: v ← [(u − p₃)/(p₄ − p₃)]·(1 − x)·f₄/(1 − x₄)
35: if x ≥ x₅ then go to 37
36: else if v ≤ f₄(x₅ − x)/(x₅ − x₄) then go to 41
37: P ← log(v)
38: if P > −(x − x₃)²(C + C) then go to 11
39: if P > A log(x/A) + B log((1 − x)/B) + L then go to 11
40: else
41:   return x

3.3.8. B2PE and B4PE Algorithms

In [21,51], two extensions of the B2P and B4P algorithms were proposed—the B2PE and B4PE algorithms—that are valid for α , β > 1 . Both algorithms are similar, with B4PE being more cumbersome but faster.
First, the new functions $t_1(x)$ and $t_2(x)$ are defined by taking the curve of an exponential for the tails [52], and $b_2(x)$ is taken as shown in Equation (15). The generation of each variable X needed for the B2PE algorithm consists of selecting, with probability $p_j$, $j = 1, 2, 3$, one of the three defined regions, generating a point $(x, v)$ uniformly distributed over the selected region, and accepting or rejecting it depending on whether it lies below $b_2(x)$ or above $t_1(x)$ [21]. To generate the points $(x, v)$ in region i, we take
$$x = \begin{cases} x_2 + (x_4 - x_2)v, & \text{if } i = 1 \\ x_2 + \dfrac{\log(u)}{\lambda_2}, & \text{if } i = 2 \\ x_4 - \dfrac{\log(u)}{\lambda_4}, & \text{if } i = 3 \end{cases}$$
where $\lambda_2 = \frac{A}{x_2} - \frac{B}{1-x_2}$, $\lambda_4 = \frac{B}{1-x_4} - \frac{A}{x_4}$ and $u \sim U(0,1)$. By computing log(u), exponential values are generated for regions 2 and 3 [21]. The v coordinate is generated as $U(0, t_1(x))$, but it is not necessary to evaluate $t_1(x)$ explicitly. For i = 1, $t_1(x) = 1$, while for regions 2 and 3, $t_1(x)$ involves an exponential function, but by using Schmeiser's method [52], the exponential operation can be eliminated. For the B4PE algorithm, $t_2(x)$ is taken. The number of regions increases to 10, but the underlying idea is the same. The pseudo-codes of these methods are shown in Algorithms 17 and 18.
Algorithm 17 B2PE algorithm (Schmeiser and Babu [21])
Require: Parameters α, β > 1
Output: Random sample x with distribution beta(α, β)
  Initialisation
1: A ← α − 1, B ← β − 1, C ← A + B, L ← C log C, x₂ ← f₂ ← f₄ ← 0, x₃ ← A/C, x₄ ← 1
2: if C ≤ 1 then go to 10
3: D ← √(AB/(C − 1))/C
4: if D ≥ x₃ then go to 7
5: else
6:   x₂ ← x₃ − D, λ₂ ← A/x₂ − B/(1 − x₂), f₂ ← exp(A log(x₂/A) + B log((1 − x₂)/B) + L)
7: if x₃ + D ≥ 1 then go to 10
8: else
9:   x₄ ← x₃ + D, λ₄ ← B/(1 − x₄) − A/x₄, f₄ ← exp(A log(x₄/A) + B log((1 − x₄)/B) + L)
10: p₁ ← x₄ − x₂
11: p₂ ← f₂/λ₂ + p₁
12: p₃ ← f₄/λ₄ + p₂
  Generation
13: Generate u ∼ U(0,1)
14: u ← u·p₃
15: Generate v ∼ U(0,1)
16: if u > p₁ then go to 22
17: else                                        ▹ Region 1
18:   x ← x₂ + u
19:   if x < x₃ and v < f₂ + (x − x₂)(1 − f₂)/(x₃ − x₂) then go to 40
20:   else if x ≥ x₃ and v < f₄ + (x₄ − x)(1 − f₄)/(x₄ − x₃) then go to 40
21:   else go to 36
22: if u > p₂ then go to 30
23: else                                        ▹ Region 2
24:   u ← (u − p₁)/(p₂ − p₁)
25:   x ← x₂ + log(u)/λ₂
26:   if v < (λ₂(x − x₂) + 1)/u then go to 40
27:   if x ≤ 0 then go to 13
28:   else
29:     v ← v·f₂·u; go to 36
30: u ← (u − p₂)/(p₃ − p₂)                                      ▹ Region 3
31: x ← x₄ − log(u)/λ₄
32: if v < (λ₄(x₄ − x) + 1)/u then go to 40
33: if x ≥ 1 then go to 13
34: else
35:   v ← v·f₄·u
36: P ← log(v)
37: if P > −(x − x₃)²(C + C) then go to 13
38: if P > A log(x/A) + B log((1 − x)/B) + L then go to 13
39: else
40:   return x
Algorithm 18 B4PE algorithm (Schmeiser and Babu [21])
Require: Parameters α, β > 1
Output: Random sample x with distribution beta(α, β)
  Initialisation
1: A ← α − 1, B ← β − 1, C ← A + B, L ← C log C, x₁ ← x₂ ← 0, x₃ ← A/C, x₄ ← x₅ ← 1, f₁ ← f₂ ← f₄ ← f₅ ← 0
2: if C ≤ 1 then go to 18
3: D ← √(AB/(C − 1))/C
4: if D ≥ x₃ then go to 11
5: else
6:   x₂ ← x₃ − D
7:   x₁ ← x₂ − x₂(1 − x₂)/(A − Cx₂)
8:   λ₁ ← A/x₁ − B/(1 − x₁)
9:   f₁ ← exp(A log(x₁/A) + B log((1 − x₁)/B) + L)
10:  f₂ ← exp(A log(x₂/A) + B log((1 − x₂)/B) + L)
11: if x₃ + D ≥ 1 then go to 18
12: else
13:   x₄ ← x₃ + D
14:   x₅ ← x₄ − x₄(1 − x₄)/(A − Cx₄)
15:   λ₅ ← B/(1 − x₅) − A/x₅
16:   f₄ ← exp(A log(x₄/A) + B log((1 − x₄)/B) + L)
17:   f₅ ← exp(A log(x₅/A) + B log((1 − x₅)/B) + L)
18: p₁ ← f₂(x₃ − x₂)
19: p₂ ← f₄(x₄ − x₃) + p₁
20: p₃ ← f₁(x₂ − x₁) + p₂
21: p₄ ← f₅(x₅ − x₄) + p₃
22: p₅ ← (1 − f₂)(x₃ − x₂) + p₄
23: p₆ ← (1 − f₄)(x₄ − x₃) + p₅
24: p₇ ← (f₂ − f₁)(x₂ − x₁)/2 + p₆
25: p₈ ← (f₄ − f₅)(x₅ − x₄)/2 + p₇
26: p₉ ← f₁/λ₁ + p₈
27: p₁₀ ← f₅/λ₅ + p₉
  Generation
28: Generate u ∼ U(0,1)
29: u ← u·p₁₀
30: if u > p₄ then go to 42
31: else
32:   if u > p₁ then go to 35
33:   else                                             ▹ Region 1
34:     return x ← x₂ + u/f₂
35:   if u > p₂ then go to 38
36:   else                                             ▹ Region 2
37:     return x ← x₃ + (u − p₁)/f₄
38:   if u > p₃ then go to 41
39:   else                                             ▹ Region 3
40:     return x ← x₁ + (u − p₂)/f₁
41:   return x ← x₄ + (u − p₃)/f₅                                    ▹ Region 4
42: Generate w ∼ U(0,1)
43: if u > p₅ then go to 49
44: else                                           ▹ Region 5
45:   x ← x₂ + w(x₃ − x₂)
46:   if (u − p₄)/(p₅ − p₄) ≤ w then go to 86
47:   else
48:     v ← f₂ + (u − p₄)/(x₃ − x₂); go to 82
49: if u > p₆ then go to 55
50: else                                           ▹ Region 6
51:   x ← x₃ + w(x₄ − x₃)
52:   if (p₆ − u)/(p₆ − p₅) ≥ w then go to 86
53:   else
54:     v ← f₄ + (u − p₅)/(x₄ − x₃); go to 82
55: if u > p₈ then go to 69
56: else                                           ▹ Region 7
57:   Generate w₂ ∼ U(0,1)
58:   if w₂ > w then w ← w₂
59:   if u > p₇ then go to 65
60:   else
61:     x ← x₁ + w(x₂ − x₁)
62:     v ← f₁ + 2w(u − p₆)/(x₂ − x₁)
63:     if v < f₂·w then go to 86
64:     else go to 82
65: x ← x₅ − w(x₅ − x₄)                                    ▹ Region 8
66: v ← f₅ + 2w(u − p₇)/(x₅ − x₄)
67: if v ≤ f₄·w then go to 86
68: else go to 82
69: if u > p₉ then go to 76                                    ▹ Region 9
70:   u ← (p₉ − u)/(p₉ − p₈)
71:   x ← x₁ + log(u)/λ₁
72:   if x ≤ 0 then go to 28
73:   if w ≤ (λ₁(x − x₁) + 1)/u then go to 86
74:   else
75:     v ← w·f₁·u; go to 82
76: u ← (p₁₀ − u)/(p₁₀ − p₉)                                          ▹ Region 10
77: x ← x₅ − log(u)/λ₅
78: if x ≥ 1 then go to 28
79: if w ≤ (λ₅(x₅ − x) + 1)/u then go to 86
80: else
81:   v ← w·f₅·u
82: P ← log(v)
83: if P > −(x − x₃)²(C + C) then go to 28
84: if P > A log(x/A) + B log((1 − x)/B) + L then go to 28
85: else
86:   return x
In [21], a comparative study of the B2P, B4P, B2PE, B4PE and BB algorithms on a CDC CYBER 72 computer was carried out by programming them in FORTRAN. As a result, the B2PE and B4PE algorithms dominated the B2P and B4P algorithms, respectively. Furthermore, both the B2PE and B4PE algorithms dominated the BB algorithm, with the B4PE algorithm being approximately twice as fast as the BB algorithm.

3.4. Other Methods of Generation

There are other methods for generating beta random variables. Some of them are based on certain statistical properties of random variables, order statistics, stratified rejection methods or stochastic search procedures, among others.
The method based on gamma random variables is one of the most widely used. It is based on the following result (see [32]): if Y and Z are two independent standard gamma random variables with parameters α and β, respectively, then $X = \frac{Y}{Y+Z} \sim beta(\alpha, \beta)$. Given this property, any existing method for simulating gamma random variables can be adapted to generate a beta distribution. Algorithm 19 shows this method.
Algorithm 19 Method for generating a beta r.v. based on a gamma r.v. [37]
Require: Parameters α and β
Output: R.v. X ∼ beta(α, β)
1: Generate Y ∼ Gamma(α)
2: Generate Z ∼ Gamma(β)
3: X ← Y/(Y + Z)
4: return X
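In R, this reduces to a few vectorised lines built on rgamma:

# Beta via the gamma ratio: Y ~ Gamma(alpha), Z ~ Gamma(beta) => Y/(Y+Z) ~ beta(alpha, beta).
rbeta_gamma <- function(n, alpha, beta) {
  y <- rgamma(n, shape = alpha)
  z <- rgamma(n, shape = beta)
  y / (y + z)
}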
Another relational property of the beta r.v. allows generating samples of this distribution from the order statistics of a sample of a $U(0,1)$ r.v., as explained in [53]. If $0 < U_{(1)} < \cdots < U_{(\alpha+\beta-1)} < 1$ is an ordered sample of size $\alpha + \beta - 1$ from a $U(0,1)$ distribution, then $U_{(\alpha)} \sim beta(\alpha, \beta)$. Thus, a sample from a $beta(\alpha, \beta)$ r.v. with $\alpha, \beta \in \mathbb{N}$ can be generated by taking the αth smallest value from a uniform sample of size $\alpha + \beta - 1$.
Algorithm 20 requires generating a whole sample from a uniform distribution to obtain a single value of the desired beta distribution, which is computationally cumbersome, especially if large parameters are considered. In addition, standard sorting of the sample to find the αth smallest value may not be too costly for small parameters, but for large α or β values, the cost rises. As a solution to this problem, faster selection or sorting algorithms such as SELECT [54] or SORT [55] can be used.
Algorithm 20 Method based on order statistics [32]
Require: Parameters α and β
Output: Random sample x of a beta(α, β) r.v.
1: Generate a sample of size (α + β − 1) from a U(0,1) r.v.
2: x ← αth order statistic
3: return x
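An R sketch of Algorithm 20, valid for integer α and β, can use R's partial sorting to extract only the αth order statistic:

# Order-statistics method: the alpha-th smallest of (alpha + beta - 1) uniforms
# is beta(alpha, beta); sort(..., partial = alpha) avoids a full sort.
rbeta_os <- function(n, alpha, beta) {
  m <- alpha + beta - 1
  replicate(n, sort(runif(m), partial = alpha)[alpha])
}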
In relation to stratified rejection methods, the B00, B01 and B11 algorithms developed in [56] stand out, in which envelopes are applied to slices in the centre of the desired beta distribution and exponentials to the tails. Some improvements on the traditional A–R algorithms are the BPRS and BPRB algorithms developed in [57], which improve acceptance and rejection at the centre of the beta r.v. using the idea of patched rejection. When α, β < 1, the B00 algorithm has the fastest generation time among all compared algorithms. When α < 1 < β or β < 1 < α, the B01 algorithm is the fastest. When α, β > 1, if one parameter is close to one and the other is large, then the B4PE algorithm has the shortest generation time; otherwise, the BPRS algorithm does.
Another procedure is the stochastic search method developed in [58], which generates beta random variables asymptotically. This method has some drawbacks and was studied and improved upon in [59] (MK algorithm). In that paper, the authors indicate the parameter values of the beta r.v. for which Kennedy’s algorithm is applicable and perform a study on the optimal parameter selection. The proposed MK algorithm is faster than all the compared algorithms for generating the beta r.v. when α, β < 1 and 1.2 < α + β < 2. In [60], a universal generator is proposed for absolutely continuous and integer-valued random variables; this algorithm builds on the earlier work [61] and is a generalisation of the A–R method. There are other recent techniques, including analysis through neutrosophic statistics (see [62]), such as generating beta distributions using the neutrosophic acceptance–rejection method (see [63]).

4. Computational Development

This section compares the previously described methods to find which one is best for simulating beta random variables. The computational results were derived from implementing the different algorithms in R, generating five beta random samples of size 10,000 for each of them and computing the average execution times over a wide range of parameters on a laptop with an Intel(R) Core(TM) i5-7200U CPU @ 2.50 GHz (2.71 GHz) and 8 GB of RAM. The choices of sample sizes and parameters follow those made by the authors of the algorithms themselves in their comparisons [46,47,49,50]. The uniform pseudo-random numbers were generated by means of the function runif [64]. In turn, the random samples of the normal, gamma and exponential distributions needed in some of the algorithms were also generated by their respective functions already existing in R: rnorm, rgamma and rexp [64]. The tables that store the execution times always take α ≤ β; for the opposite case, it is enough to generate 1 − X instead of X [37], and therefore the times hardly vary.
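As a reference for reproducing this setup, a timing harness along the following lines can be used (this is our sketch, not the authors’ script; the helper name time_ms is illustrative):

    # Average elapsed time (in milliseconds) of five samples of size 10,000
    # produced by a generator function gen, mirroring the setup described above.
    time_ms <- function(gen, n = 10000, reps = 5) {
      t <- replicate(reps, system.time(gen(n))[["elapsed"]])
      mean(t) * 1000
    }
    time_ms(function(n) rbeta(n, shape1 = 2, shape2 = 5))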
To begin with, for the algorithm based on the order statistics [53], we sorted the uniform random sample in two different ways: first with the SELECT algorithm [54] and second with the SORT algorithm [55], which is included in R (the command sort). Table 1 shows that, when using SORT, the algorithm was not very sensitive to the values taken by the parameters; the execution time remained between 400 and 600 ms at all times, unlike the algorithm with SELECT sorting, which was highly volatile. Using the SELECT algorithm [54] to sort the uniform random sample was quite attractive when the parameters α and β were small; its execution times were much better than with SORT as long as neither α nor β exceeded 10. Otherwise, the time needed to simulate a beta r.v. skyrocketed, taking up to 4 s for a beta(100, 100). This is why SELECT sorting is not recommended for large parameters.
In Table 1, we can also observe the execution times of Jöhnk’s algorithm [41] and the algorithm based on the gamma r.v. [32] for parameters α, β ≥ 1, while Table 2 collects the times for these same algorithms with α < 1 and any β value. Table 3 and Table 4 show the average number of iterations required for the simulations in the BN, Cheng’s and Jöhnk’s algorithms and in Cheng’s, Jöhnk’s and Switch algorithms, respectively. The times show that the use of Jöhnk’s algorithm must be restricted to small parameters. For α = β = 5, the algorithm already needed more than 3 s to generate the beta r.v., requiring an average of 252 iterations, as shown in Table 3. Similarly, simulating a beta(50, 100) would require an average number of iterations of the order of 10^40. However, neither this nor the algorithms based on order statistics were comparable to the method based on the gamma r.v. This method was faster for every parameter choice; it took no more than 5 ms to simulate any of the beta random variables studied and was 10 times more efficient than the best case of either of the other two methods. Therefore, if the objective is to simulate a beta r.v. through one of the methods based on relational properties, the algorithm based on the gamma r.v. is recommended: it was the fastest, and it takes advantage of the R function rgamma [64] for generating the random samples. It is also easy to program, as it is only necessary to generate two gamma samples and operate on them.
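For intuition on this degradation, the following sketch of Jöhnk’s rejection scheme makes the mechanism explicit: a candidate pair is accepted only when its components sum to at most one, and the acceptance probability, Γ(α+1)Γ(β+1)/Γ(α+β+1), collapses as the parameters grow, producing the iteration counts of Table 3.

    # Sketch of Joehnk's rejection scheme: accept (U1^(1/alpha), U2^(1/beta))
    # only when the pair sums to at most one; return the normalised first component.
    rbeta_johnk <- function(alpha, beta) {
      repeat {
        y <- runif(1)^(1 / alpha)
        z <- runif(1)^(1 / beta)
        if (y + z <= 1) return(y / (y + z))
      }
    }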
Next, we compare the inversion algorithms: the bisection algorithm [32], based on root finding, and the NINIGL algorithm [31]. The execution times for these methods are collected in Table 5 and Table 6. Looking at the times for the bisection method, it is clear that this algorithm was slow, as we predicted, but it is interesting to note the stability of its times: regardless of the parameters taken, it took around 0.7 or 0.8 s to simulate a beta r.v. On the other hand, we have the NINIGL algorithm. In spite of the speed with which it generated a beta r.v. for most parameters, when α or β was less than 0.5 the algorithm slowed down considerably, and it became invalid when both parameters were less than 0.5. This is because when a beta distribution takes small parameters, its density reaches values close to zero, which causes numerical problems in the algorithm. Nevertheless, this method was clearly faster than the bisection method. In addition, this algorithm is available in the Runuran library, an R adaptation of the C library of Universal Non-Uniform Random Number Generators (UNU.RAN), under the name Polynomial Interpolation of Inverse CDF (PINV), which makes it more attractive for use in simulating beta random variables.
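As a usage sketch (assuming the Runuran package is installed and following its documented udbeta, pinvd.new and ur interface; the parameter values are illustrative):

    # Numerical-inversion sampling via Runuran's PINV method.
    library(Runuran)
    distr <- udbeta(shape1 = 2, shape2 = 5)  # UNU.RAN beta distribution object
    gen <- pinvd.new(distr)                  # polynomial interpolation of the inverse CDF
    x <- ur(gen, 10000)                      # draw 10,000 variates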
Thirdly, we collect the times (in milliseconds) needed to generate beta random variables with what we called specific methods. On the one hand, Table 7 stores the times of the methods applicable when both parameters are greater than one (i.e., the A–R algorithms based on the density of a normal distribution: the BN and symmetric-case BS algorithms [46] and the BNM algorithm [50]; the 2-point B2P algorithm [50], together with its version using the exponential tail (B2PE) and the 4-point extension (B4PE) [21]; and the BA and BB algorithms [49]). On the other hand, the execution times of the algorithms applicable when at least one of the parameters is less than one (i.e., the “switching algorithm” [47] (SW1 and SW2) and Cheng’s BA and BC algorithms [49]) are stored in Table 8. In addition, the runtimes of rbeta, the R function that generates random samples from a beta distribution, are collected in both tables.
If we begin by comparing Cheng’s algorithms, it should be noted that although the BB and BC algorithms are supposed to improve upon the BA algorithm by reducing the number of logarithmic evaluations required, in our code the time spent on these evaluations was less than the time required to run the additional tests of the BB and BC algorithms; therefore, the BA algorithm was uniformly faster than the other two. In addition, since it has no additional tests, it is simpler to code. When contrasting this algorithm with the “switching algorithm” of Atkinson and Whittaker [47], either when both parameters were less than one (SW2) or when only one was (SW1), we see that Atkinson and Whittaker’s algorithm was faster than the BA algorithm for all parameters where both algorithms were valid. Furthermore, although the average number of iterations required when simulating symmetric beta random variables was the same, in general, Atkinson and Whittaker’s algorithm [47] required fewer iterations, as shown in Table 4. It is also interesting to note the behaviour of the SW2 algorithm, which became more efficient as the beta distribution became more asymmetric. The SW1 version, on the other hand, did not follow this pattern. Another algorithm that followed a pattern, albeit the opposite of the SW2 algorithm’s, was the 2-point B2P algorithm [50]. This algorithm became worse as the parameters took on larger values, because the majorising function does not fit well in the tails, and it took almost 7 s to simulate a beta(1, 100) r.v. Table 7 shows that the supposed improvement of the 2-point algorithm proposed by Schmeiser and Babu, the B2PE algorithm [21], which uses exponentials to fit the majorising function better in the tails of the beta distribution, did indeed behave as such. The B2PE method is noteworthy for its constancy; the times required to simulate a beta r.v. were generally between 60 and 70 ms, very different from the behaviour of the B2P algorithm. This constancy was also present in the 4-point version (B4PE) [21]. Between these two algorithms, it is difficult to establish which is best: the B4PE algorithm was faster than the B2PE one, as shown in Table 7, but the difference was minimal, not exceeding 30 ms, and the B4PE algorithm requires more code. Therefore, this decision is left to the user. Regardless of this choice, the adapted BS algorithm of Ahrens and Dieter [46] was faster and simpler than the previous two, but it is only valid for symmetric cases where α = β; the original BN algorithm [46], applicable to any beta r.v. with parameters greater than one, was not as efficient. Moreover, the supposed improvement of the BN algorithm proposed by Schmeiser and Shalaby (BNM) [50] was not an improvement computationally: the computation of the points x_1, …, x_5 and the additional step to evaluate the minorising function were more time-consuming than the evaluations of the original method they were trying to avoid. One remarkable feature of these three algorithms is how inefficient they were for α = β = 1.001; the BNM algorithm could not be applied directly, and the times needed by the BN and BS algorithms were far from those needed for the rest of the parameters. The reason for this behaviour is that, for these parameters, around 28 iterations were needed to generate a sample of the beta r.v., as shown in Table 3.
However, the best option for simulating a beta r.v. among the algorithms present in Table 7 and Table 8 was to use rbeta. When both parameters were greater than one, this function was about 25 times faster than the algorithms considered thus far to be the most efficient, namely B2PE and B4PE (or BS in the symmetric case); similarly, when one of the parameters was less than one, its execution speed was at least 20 times that of the “switching algorithm” (SW1 and SW2). The most interesting thing about rbeta is that, according to the R documentation [64], this function follows the methods of Cheng [49], and yet it is significantly more efficient than the Cheng algorithms we programmed (BA, BB and BC). Taking this into account, and that the algorithms that stood out for their speed in the remaining tables are also implemented in R (NINIGL) or only need small transformations of existing R functions (the algorithms based on gamma random variables), it can be stated that the best options for simulating a beta r.v. are already implemented in R.
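For completeness, the recommended call is simply the built-in generator (the parameter values below are illustrative):

    # R's built-in beta generator, valid for any shape parameters.
    x <- rbeta(10000, shape1 = 0.5, shape2 = 2)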
Finally, we can compare the three most efficient algorithms (gamma, NINIGL and rbeta) to determine the best one. To this end, Table 9 and Table 10 collect their average execution times. On the one hand, when one of the parameters was less than one, rbeta was clearly the best choice; the execution times of the gamma r.v.-based algorithm did not differ much from those of rbeta, but they were slower, and the algorithm requires some programming. The NINIGL algorithm suffered considerably with small parameters and was not a good choice in this situation. On the other hand, when both parameters were greater than one, the three algorithms were more closely matched; while all three were efficient, the NINIGL algorithm stood out, in general, as the fastest. Therefore, for each of the two situations, depending on the values taken by the parameters, the best options are those already mentioned; if we want a single efficient algorithm for any parameter, the best option for simulating beta random variables is rbeta. If, instead, we want to generate beta random samples by implementing an algorithm other than those already present in R, then the tables of average execution times show that the best option is the B4PE or B2PE algorithm if both parameters are greater than one, the SW1 and SW2 algorithms if at least one parameter is less than one, and the BA algorithm if we are looking for one that is valid for any parameter.

5. Conclusions

There are currently many high-speed methods capable of simulating beta random variables, but more are still being developed in order to find the fastest, simplest, and most accurate algorithm.
The main objective of this work was to analyse in detail the statistical/mathematical and computational aspects of the generation of the beta random variable and to perform an empirical analysis covering the different generation scenarios. After studying the generation of beta random variables and their implementation in R, we conclude that the most efficient way to obtain samples of this type is to use algorithms that are already present in R or that only require a simple transformation. Thus, the function rbeta of R is the most appropriate option for any type of parameter, especially if any of them is less than one, ahead of the NINIGL algorithm [31], which stands out especially when both parameters are greater than one. If the reader’s objective is to use an algorithm without resorting to R’s own routines, we recommend implementing Schmeiser and Babu’s B4PE or B2PE algorithms [21] if we only want to simulate beta random variables with both parameters greater than one, the so-called “switching algorithm” (SW1 and SW2) of Atkinson and Whittaker [47] if any of the parameters is less than one, and Cheng’s BA algorithm [49] if we want one that is valid for any type of parameter.
However, this work only covers the analysis in R, and the results reported by the authors of certain algorithms show that they do not behave in the same way in other languages.

Author Contributions

Conceptualization, E.A.L. and C.G.; methodology, E.A.L. and C.G.; software, E.A.L. and C.G.; validation, E.A.L. and C.G.; formal analysis, E.A.L. and C.G.; investigation, E.A.L. and C.G.; resources, E.A.L. and C.G.; data curation, E.A.L. and C.G.; writing—original draft preparation, E.A.L. and C.G.; writing—review and editing, E.A.L. and C.G.; visualization, E.A.L. and C.G.; supervision, E.A.L. and C.G. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

The data presented in this study are available on request from the corresponding author.

Acknowledgments

We thank the Editorial Office for their support in the dissemination of this work. We also thank the reviewers for their recommendations to improve the final version of this paper.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Thomas, D.B.; Luk, W.; Leong, P.H.W.; Villasenor, J.D. Gaussian Random Number Generators. ACM Comput. Surv. 2007, 39, 11-es. [Google Scholar] [CrossRef]
  2. Malik, J.S.; Hemani, A. Gaussian Random Number Generation: A Survey on Hardware Architectures. ACM Comput. Surv. 2016, 49, 1–37. [Google Scholar] [CrossRef]
  3. Maatouk, H.; Bay, X. A new rejection sampling method for truncated multivariate Gaussian random variables restricted to convex sets. In Monte Carlo and Quasi-Monte Carlo Methods; Springer: Berlin/Heidelberg, Germany, 2016; pp. 521–530. [Google Scholar]
  4. Morán-Vásquez, R.; Zarrazola, E.; Nagar, D.K. Some Theoretical and Computational Aspects of the Truncated Multivariate Skew-Normal/Independent Distributions. Mathematics 2023, 11, 3579. [Google Scholar] [CrossRef]
  5. Almaraz Luengo, E. Gamma Pseudo Random Number Generators. ACM Comput. Surv. 2023, 55, 1–33. [Google Scholar] [CrossRef]
  6. Rubinstein, R.Y.; Kroese, D.P. Simulation and the Monte Carlo Method, 3rd ed.; John Wiley & Sons, Inc.: Hoboken, NJ, USA, 2016. [Google Scholar]
  7. Altiok, T.; Melamed, B. Simulation Modeling and Analysis with ARENA; Elsevier Science & Technology: New York, NY, USA, 2007. [Google Scholar]
  8. Zhu, S.; Deng, X.; Zhang, W.; Zhu, C. Construction of a New 2D Hyperchaotic Map with Application in Efficient Pseudo-Random Number Generator Design and Color Image Encryption. Mathematics 2023, 11, 3171. [Google Scholar] [CrossRef]
  9. Bagdasar, O.; Chen, M.; Drăgan, V.; Ivanov, I.G.; Popa, I.L. On Horadam Sequences with Dense Orbits and Pseudo-Random Number Generators. Mathematics 2023, 11, 1244. [Google Scholar] [CrossRef]
  10. Mahalingam, H.; Rethinam, S.; Janakiraman, S.; Rengarajan, A. Non-Identical Inverter Rings as an Entropy Source: NIST-90B-Verified TRNG Architecture on FPGAs for IoT Device Integrity. Mathematics 2023, 11, 1049. [Google Scholar] [CrossRef]
  11. Ridley, D.; Ngnepieba, P. Antithetic Power Transformation in Monte Carlo Simulation: Correcting Hidden Errors in the Response Variable. Mathematics 2023, 11, 2097. [Google Scholar] [CrossRef]
  12. Almaraz Luengo, E. A brief and understandable guide to pseudo-random number generators and specific models for security. Stat. Surv. 2022, 16, 137–181. [Google Scholar] [CrossRef]
  13. Rukhin, A.L.; Soto, J.; Nechvatal, J.R.; Smid, M.E.; Barker, E.; Leigh, S.; Levenson, M.; Vangel, M.; Banks, D.; Heckert, A.; et al. A Statistical Test Suite for Random and Pseudorandom Number Generators for Cryptographic Applications; Technical Report; NIST: Gaithersburg, MD, USA, 2010. [Google Scholar]
  14. Marsaglia, G. The Marsaglia Random Number CDROM including the Diehard Battery of Tests of Randomness. 1995. Available online: https://web.archive.org/web/20160220101002/http://stat.fsu.edu/pub/diehard/ (accessed on 1 January 2022).
  15. Brown, R.G.; Eddelbuettel, D.; Bauer, D. Dieharder: A Random Number Test Suite (Version 3.31.1). 2014. Available online: https://webhome.phy.duke.edu/~rgb/General/dieharder.php (accessed on 1 January 2022).
  16. Walker, J. ENT: A Pseudorandom Number Sequence Test Program. 2008. Available online: https://www.fourmilab.ch/random/ (accessed on 1 January 2022).
  17. Evans, D.L.; Bond, P.; Bement, A. FIPS Pub 140-2: Security Requirements for Cryptographic Modules; Federal Information Processing Standards Publication; NIST: Gaithersburg, MD, USA, 2002; Volume 12. [Google Scholar]
  18. NIST. FIPS 140-2. Security Requirements for Cryptographic Modules; Technical Report; NIST: Gaithersburg, MD, USA, 2001. [Google Scholar]
  19. NIST. FIPS 140-3. Security Requirements for Cryptographic Modules; Technical Report; NIST: Gaithersburg, MD, USA, 2019. [Google Scholar]
  20. L’ecuyer, P.; Simard, R. TestU01: A C library for empirical testing of random number generators. ACM Trans. Math. Softw. (TOMS) 2007, 33, 1–40. [Google Scholar] [CrossRef]
  21. Schmeiser, B.W.; Babu, A.J.G. Beta Variate Generation via Exponential Majorizing Functions. Oper. Res. 1980, 28, 917–926. [Google Scholar] [CrossRef]
  22. Stern, J.M.; de Bragança Pereira, C.A. Special characterizations of standard discrete models. Revstat-Stat. J. 2008, 6, 199–230. [Google Scholar]
  23. Kuhl, M.E.; Ivy, J.S.; Lada, E.K.; Steiger, N.M.; Wagner, M.A.; Wilson, J.R. Univariate input models for stochastic simulation. J. Simul. 2010, 4, 81–97. [Google Scholar] [CrossRef]
  24. Thangjai, W.; Niwitpong, S.A.; Niwitpong, S.; Smithpreecha, N. Confidence Interval Estimation for the Ratio of the Percentiles of Two Delta-Lognormal Distributions with Application to Rainfall Data. Symmetry 2023, 15, 794. [Google Scholar] [CrossRef]
  25. Pfeifer, D.; Ragulina, O. Adaptive Bernstein Copulas and Risk Management. Mathematics 2020, 8, 2221. [Google Scholar] [CrossRef]
  26. Malcolm, D.G.; Roseboom, J.H.; Clark, C.E.; Fazar, W. Application of a Technique for Research and Development Program Evaluation. Oper. Res. 1959, 7, 646–669. [Google Scholar] [CrossRef]
  27. Kelley, J.E. Critical-Path Planning and Scheduling: Mathematical Basis. Oper. Res. 1961, 9, 296–320. [Google Scholar] [CrossRef]
  28. Bȩtkowski, M.; Pownuk, A. Calculating Risk of Cost Using Monte Carlo Simulations with Fuzzy Parameters in Civil Engineering. In Proceedings of the NSF Workshop on Reliable Engineering Computing, Savannah, Georgia, 15–17 September 2004; pp. 179–192. [Google Scholar]
  29. Price, A.L.; Patterson, N.J.; Plenge, R.M.; Weinblatt, M.E.; Shadick, N.A.; Reich, D. Principal components analysis corrects for stratification in genome-wide association studies. Nat. Genet. 2006, 38, 904–909. [Google Scholar] [CrossRef] [PubMed]
  30. Kirichenko, L.; Radivilova, T.; Bulakh, V. Machine Learning in Classification Time Series with Fractal Properties. Data 2019, 4, 5. [Google Scholar] [CrossRef]
  31. Derflinger, G.; Hörmann, W.; Leydold, J. Random Variate Generation by Numerical Inversion When Only the Density is Known. ACM Trans. Model. Comput. Simul. 2010, 20, 1–25. [Google Scholar] [CrossRef]
  32. Devroye, L. Non-Uniform Random Variate Generation; Springer: New York, NY, USA, 1986. [Google Scholar]
  33. Hörmann, W.; Leydold, J. Continuous Random Variate Generation by Fast Numerical Inversion. ACM Trans. Model. Comput. Simul. 2003, 13, 347–362. [Google Scholar] [CrossRef]
  34. Marsaglia, G.; Tsang, W.W. The ziggurat method for generating random variables. J. Stat. Softw. 2000, 5, 1–7. [Google Scholar] [CrossRef]
  35. Givens, G.H.; Hoeting, J.A. Computational Statistics, 2nd ed.; John Wiley & Sons: Hoboken, NJ, USA, 2012. [Google Scholar]
  36. Baumgarten, C. Random variate generation by fast numerical inversion in the varying parameter case. Res. Stat. 2023, 1, 2279060. [Google Scholar] [CrossRef]
  37. Law, A.M. Simulation Modeling and Analysis, 5th ed.; McGraw-Hill Education: New York, NY, USA, 2015. [Google Scholar]
  38. Von Neumann, J. Various Techniques Used in Connection with Random Digits. In Monte Carlo Method; Householder, A.S., Forsythe, G.E., Germond, H.H., Eds.; US Government Printing Office: Washington, DC, USA, 1951; Volume 12, National Bureau of Standards Applied Mathematics Series, Chapter 13; pp. 36–38. [Google Scholar]
  39. Hörmann, W.; Leydold, J.; Derflinger, G. Automatic Nonuniform Random Variate Generation, 1st ed.; Springer: Berlin/Heidelberg, Germany, 2004. [Google Scholar]
  40. Marsaglia, G. The squeeze method for generating gamma variates. Comput. Math. Appl. 1977, 3, 321–325. [Google Scholar] [CrossRef]
  41. Jöhnk, M. Erzeugung von betaverteilten und gammaverteilten Zufallszahlen. Metrika 1964, 8, 5–15. [Google Scholar] [CrossRef]
  42. Forsythe, G.E. Von Neumann’s Comparison Method for Random Sampling from the Normal and Other Distributions. Math. Comput. 1972, 26, 817–826. [Google Scholar]
  43. Ahrens, J.H.; Dieter, U. Extensions of Forsythe’s method for random sampling from the normal distribution. Math. Comput. 1973, 27, 927–937. [Google Scholar]
  44. Brent, R.P. Algorithm 488: A Gaussian pseudo random number generator. Commun. ACM 1974, 17, 704–706. [Google Scholar] [CrossRef]
  45. Atkinson, A.C.; Pearce, M.C. The computer generation of Beta, Gamma and Normal Random Variables. J. R. Stat. Soc. Ser. A 1976, 139, 431–461. [Google Scholar] [CrossRef]
  46. Ahrens, J.H.; Dieter, U. Computer methods for sampling from gamma, beta, poisson and binomial distributions. Computing 1974, 12, 223–246. [Google Scholar] [CrossRef]
  47. Atkinson, A.C.; Whittaker, J. A Switching Algorithm for the Generation of Beta Random Variables with at Least One Parameter Less than 1. J. R. Stat. Soc. Ser. A 1976, 139, 462–467. [Google Scholar] [CrossRef]
  48. Atkinson, A.C.; Whittaker, J. Algorithm AS 134: The Generation of Beta Random Variables with one Parameter Greater than and One Parameter Less than 1. J. R. Stat. Soc. Ser. A 1979, 20, 90–93. [Google Scholar] [CrossRef]
  49. Cheng, R.C.H. Generating Beta Variates with Nonintegral Shape Parameters. Commun. ACM 1978, 21, 317–322. [Google Scholar] [CrossRef]
  50. Schmeiser, B.W.; Shalaby, M.A. Acceptance Rejection Methods for Beta Variate Generation. J. Am. Stat. Assoc. 1980, 75, 673–678. [Google Scholar] [CrossRef]
  51. Schmeiser, B.W.; Babu, A.J.G. Errata. Oper. Res. 1983, 31, 802. [Google Scholar]
  52. Schmeiser, B.W. Generation of Variates from Distribution Tails. Oper. Res. 1980, 28, 1012–1017. [Google Scholar] [CrossRef]
  53. Fox, B.L. Generation of Random Samples from the Beta and F Distributions. Technometrics 1963, 5, 269–270. [Google Scholar] [CrossRef]
  54. Floyd, R.W.; Rivest, R.L. Algorithm 489: The algorithm SELECT-for finding the ith smallest of n elements [M1]. Commun. ACM 1975, 18, 173. [Google Scholar] [CrossRef]
  55. Singleton, R.C. Algorithm 347: An Efficient Algorithm for Sorting with Minimal Storage [M1]. Commun. ACM 1969, 12, 185–186. [Google Scholar] [CrossRef]
  56. Sakasegawa, H. Stratified rejection and squeeze method for generating beta random numbers. Ann. Inst. Stat. Math. Part B 1983, 35, 291–302. [Google Scholar] [CrossRef]
  57. Zechner, H.; Stadlober, E. Generating beta variates via patchwork rejection. Computing 1993, 50, 1–18. [Google Scholar] [CrossRef]
  58. Kennedy, D.P. A note on stochastic search methods for global optimization. Adv. Appl. Probab. 1988, 20, 476–478. [Google Scholar] [CrossRef]
  59. Hung, Y.C.; Balakrishnan, N.; Lin, Y.T. Evaluation of Beta Generation Algorithms. Commun. Stat.-Simul. Comput. 2009, 38, 750–770. [Google Scholar] [CrossRef]
  60. Barabesi, L.; Pratelli, L. Universal methods for generating random variables with a given characteristic function. J. Stat. Comput. Simul. 2014, 85, 1679–1691. [Google Scholar] [CrossRef]
  61. Devroye, L. On the computer generation of random variables with a given characteristic function. Comput. Math. Appl. 1981, 7, 547–552. [Google Scholar] [CrossRef]
  62. Smarandache, F. Introduction to Neutrosophic Statistics; Sitech & Education Publishing: New York, NY, USA, 2014. [Google Scholar] [CrossRef]
  63. Jdid, M.; Nabeeh, N.A. Generating Random Variables that follow the Beta Distribution Using the Neutrosophic Acceptance-Rejection Method. Neutrosophic Sets Syst. 2023, 58, 139–147. [Google Scholar]
  64. R Core Team. R: A Language and Environment for Statistical Computing; R Foundation for Statistical Computing: Vienna, Austria, 2021. [Google Scholar]
Table 1. Execution time (in milliseconds) for relational property-based algorithms (α, β ≥ 1).

| Alpha | Beta | Gamma | SELECT | SORT | Jöhnk |
|---|---|---|---|---|---|
| 1 | 1 | 3.00 | 39.29 | 439.84 | 91.16 |
| 1 | 2 | 3.18 | 103.73 | 433.88 | 135.44 |
| 1 | 5 | 2.80 | 166.32 | 445.73 | 282.73 |
| 1 | 10 | 2.99 | 244.14 | 441.03 | 488.30 |
| 1 | 50 | 2.59 | 646.80 | 476.25 | 2243.62 |
| 1 | 100 | 2.59 | 1257.66 | 512.18 | * |
| 2 | 2 | 3.20 | 158.97 | 431.96 | 272.88 |
| 2 | 5 | 2.59 | 222.41 | 449.40 | 938.62 |
| 2 | 10 | 2.79 | 296.78 | 435.65 | 2761.89 |
| 2 | 50 | 2.59 | 710.20 | 475.56 | * |
| 2 | 100 | 2.19 | 1334.46 | 517.91 | * |
| 5 | 5 | 3.19 | 306.19 | 435.85 | * |
| 5 | 10 | 2.20 | 367.11 | 445.19 | * |
| 5 | 50 | 2.19 | 816.16 | 478.66 | * |
| 5 | 100 | 2.39 | 1513.51 | 525.20 | * |
| 10 | 10 | 2.19 | 475.53 | 466.57 | * |
| 10 | 50 | 2.00 | 958.83 | 478.22 | * |
| 10 | 100 | 2.40 | 1736.97 | 521.37 | * |
| 50 | 50 | 1.78 | 1955.14 | 511.28 | * |
| 50 | 100 | 1.80 | 2842.27 | 553.92 | * |
| 100 | 100 | 2.19 | 4028.12 | 587.14 | * |

* Excessive time.
Table 2. Execution time (in milliseconds) for relational property-based algorithms (α < 1).

| Alpha | Beta | Gamma | Jöhnk |
|---|---|---|---|
| 0.1 | 0.1 | 4.38 | 47.88 |
| 0.1 | 0.3 | 4.80 | 50.87 |
| 0.1 | 0.5 | 3.99 | 52.57 |
| 0.1 | 0.9 | 4.18 | 56.96 |
| 0.1 | 1 | 2.99 | 55.85 |
| 0.1 | 2 | 3.20 | 55.65 |
| 0.1 | 5 | 2.99 | 61.04 |
| 0.1 | 10 | 2.79 | 63.72 |
| 0.1 | 50 | 2.60 | 72.91 |
| 0.1 | 100 | 2.79 | 84.88 |
| 0.3 | 0.3 | 3.80 | 69.78 |
| 0.3 | 0.5 | 4.09 | 56.85 |
| 0.3 | 0.9 | 4.38 | 61.88 |
| 0.3 | 1 | 3.60 | 59.94 |
| 0.3 | 2 | 3.39 | 71.03 |
| 0.3 | 5 | 2.99 | 86.98 |
| 0.3 | 10 | 2.99 | 108.32 |
| 0.3 | 50 | 3.19 | 218.72 |
| 0.3 | 100 | 2.99 | 220.33 |
| 0.5 | 0.5 | 4.19 | 57.45 |
| 0.5 | 0.9 | 4.39 | 71.59 |
| 0.5 | 1 | 4.79 | 76.99 |
| 0.5 | 2 | 3.79 | 97.68 |
| 0.5 | 5 | 3.18 | 119.79 |
| 0.5 | 10 | 3.19 | 165.12 |
| 0.5 | 50 | 3.20 | 365.27 |
| 0.5 | 100 | 3.19 | 510.84 |
| 0.9 | 0.9 | 4.18 | 108.61 |
| 0.9 | 1 | 3.59 | 104.32 |
| 0.9 | 2 | 3.60 | 126.26 |
| 0.9 | 5 | 3.39 | 245.40 |
| 0.9 | 10 | 3.38 | 421.22 |
| 0.9 | 50 | 2.99 | 1578.84 |
| 0.9 | 100 | 3.19 | 2835.19 |
Table 3. Average number of iterations required (α, β ≥ 1).

| Alpha | Beta | BN * | Cheng | Jöhnk |
|---|---|---|---|---|
| 1 | 1 | 28.04 | 1 | 2.00 |
| 1 | 2 | 2.489 | 1.185 | 3.00 |
| 1 | 5 | 3.111 | 1.34 | 6.00 |
| 1 | 10 | 4.148 | 1.402 | 11.00 |
| 1 | 50 | 8.887 | 1.457 | 51.00 |
| 1 | 100 | - | 1.464 | 101.00 |
| 2 | 2 | 1.329 | 1.061 | 6.00 |
| 2 | 5 | 1.377 | 1.13 | 21.00 |
| 2 | 10 | 1.689 | 1.18 | 66.00 |
| 2 | 50 | 3.359 | 1.234 | 1326.00 |
| 2 | 100 | 5.025 | 1.242 | 5151.00 |
| 5 | 5 | 1.09 | 1.101 | 252.00 |
| 5 | 10 | 1.139 | 1.117 | 3003.00 |
| 5 | 50 | 1.889 | 1.155 | 3.5 × 10^6 |
| 5 | 100 | 2.588 | 1.162 | 9.7 × 10^7 |
| 10 | 10 | 1.041 | 1.114 | 1.8 × 10^5 |
| 10 | 50 | 1.391 | 1.134 | 7.5 × 10^10 |
| 10 | 100 | 1.809 | 1.141 | 4.7 × 10^13 |
| 50 | 50 | - | 1.126 | 1.0 × 10^29 |
| 50 | 100 | - | - | 2.0 × 10^40 |

* Take 1.001 instead of 1 for α and β.
Table 4. Average number of iterations required (α < 1).

| Alpha | Beta | Cheng | Jöhnk | Switch * |
|---|---|---|---|---|
| 0.1 | 0.1 | 1.77 | 1.01 | 1.77 |
| 0.1 | 0.3 | 2.49 | 1.04 | 1.53 |
| 0.1 | 0.5 | 2.70 | 1.06 | 1.35 |
| 0.1 | 0.9 | 2.84 | 1.09 | 1.09 |
| 0.1 | 1 | 2.86 | 1.10 | 1.03 |
| 0.1 | 2 | 2.94 | 1.16 | 1.05 |
| 0.1 | 5 | 2.99 | 1.25 | 1.07 |
| 0.1 | 10 | 3.01 | 1.33 | 1.08 |
| 0.1 | 50 | 3.02 | 1.56 | 1.09 |
| 0.1 | 100 | 3.02 | 1.67 | 1.09 |
| 0.3 | 0.3 | 1.46 | 1.11 | 1.46 |
| 0.3 | 0.5 | 1.72 | 1.17 | 1.34 |
| 0.3 | 0.9 | 1.95 | 1.28 | 1.10 |
| 0.3 | 1 | 1.98 | 1.30 | 1.09 |
| 0.3 | 2 | 2.13 | 1.50 | 1.14 |
| 0.3 | 5 | 2.23 | 1.87 | 1.18 |
| 0.3 | 10 | 2.27 | 2.27 | 1.20 |
| 0.3 | 50 | 2.29 | 3.62 | 1.21 |
| 0.3 | 100 | 2.30 | 4.44 | 1.21 |
| 0.5 | 0.5 | 1.27 | 1.27 | 1.27 |
| 0.5 | 0.9 | 1.50 | 1.46 | 1.09 |
| 0.5 | 1 | 1.54 | 1.50 | 1.15 |
| 0.5 | 2 | 1.72 | 1.88 | 1.21 |
| 0.5 | 5 | 1.84 | 2.71 | 1.25 |
| 0.5 | 10 | 1.89 | 3.70 | 1.26 |
| 0.5 | 50 | 1.93 | 8.04 | 1.28 |
| 0.5 | 100 | 1.93 | 11.33 | 1.28 |
| 0.9 | 0.9 | 1.04 | 1.81 | 1.04 |
| 0.9 | 1 | 1.07 | 1.90 | 1.16 |
| 0.9 | 2 | 1.26 | 2.76 | 1.17 |
| 0.9 | 5 | 1.41 | 5.18 | 1.19 |
| 0.9 | 10 | 1.47 | 8.96 | 1.19 |
| 0.9 | 50 | 1.52 | 35.76 | 1.20 |
| 0.9 | 100 | 1.53 | 66.16 | 1.20 |

* Take β = 1.001 instead of β = 1.
Table 5. Execution times (in milliseconds) for inversion algorithms (α, β ≥ 1).

| Alpha | Beta | Bisection | NINIGL |
|---|---|---|---|
| 1 | 1 | 687.40 | 0.60 |
| 1 | 2 | 760.75 | 1.40 |
| 1 | 5 | 761.26 | 1.00 |
| 1 | 10 | 771.29 | 1.20 |
| 1 | 50 | 765.05 | 1.00 |
| 1 | 100 | 738.42 | 1.40 |
| 2 | 2 | 732.41 | 2.59 |
| 2 | 5 | 803.14 | 2.39 |
| 2 | 10 | 821.89 | 2.62 |
| 2 | 50 | 838.48 | 2.39 |
| 2 | 100 | 771.99 | 2.19 |
| 5 | 5 | 835.62 | 1.60 |
| 5 | 10 | 819.31 | 1.20 |
| 5 | 50 | 802.67 | 1.60 |
| 5 | 100 | 795.96 | 1.60 |
| 10 | 10 | 792.23 | 1.00 |
| 10 | 50 | 794.36 | 1.59 |
| 10 | 100 | 810.24 | 1.60 |
| 50 | 50 | 756.47 | 1.20 |
| 50 | 100 | 876.26 | 1.40 |
| 100 | 100 | 876.65 | 1.39 |
Table 6. Execution times (in milliseconds) for inversion algorithms (α < 1).

| Alpha | Beta | Bisection | NINIGL |
|---|---|---|---|
| 0.1 | 0.1 | 743.93 | - |
| 0.1 | 0.3 | 755.05 | - |
| 0.1 | 0.5 | 753.33 | 44.89 |
| 0.1 | 0.9 | 820.23 | 43.78 |
| 0.1 | 1 | 727.21 | 41.89 |
| 0.1 | 2 | 842.61 | 38.78 |
| 0.1 | 5 | 834.20 | 39.70 |
| 0.1 | 10 | 850.57 | 37.70 |
| 0.1 | 50 | 805.91 | 37.10 |
| 0.1 | 100 | 813.52 | 37.10 |
| 0.3 | 0.3 | 759.32 | - |
| 0.3 | 0.5 | 713.15 | 11.57 |
| 0.3 | 0.9 | 714.34 | 14.76 |
| 0.3 | 1 | 848.19 | 13.36 |
| 0.3 | 2 | 783.48 | 12.77 |
| 0.3 | 5 | 858.25 | 11.37 |
| 0.3 | 10 | 848.70 | 10.97 |
| 0.3 | 50 | 832.72 | 12.17 |
| 0.3 | 100 | 798.93 | 11.17 |
| 0.5 | 0.5 | 739.83 | 3.39 |
| 0.5 | 0.9 | 795.65 | 3.99 |
| 0.5 | 1 | 744.47 | 3.06 |
| 0.5 | 2 | 758.80 | 4.39 |
| 0.5 | 5 | 756.77 | 4.59 |
| 0.5 | 10 | 782.66 | 4.79 |
| 0.5 | 50 | 747.51 | 4.79 |
| 0.5 | 100 | 745.92 | 4.29 |
| 0.9 | 0.9 | 738.04 | 3.39 |
| 0.9 | 1 | 704.67 | 2.79 |
| 0.9 | 2 | 790.74 | 3.59 |
| 0.9 | 5 | 817.37 | 3.39 |
| 0.9 | 10 | 827.93 | 3.10 |
| 0.9 | 50 | 807.46 | 3.59 |
| 0.9 | 100 | 813.41 | 2.99 |
Table 7. Execution times (in milliseconds) for specific algorithms (α, β ≥ 1).

| Alpha | Beta | B2P * | B2PE * | B4PE * | BA | BB | rbeta | BN * | BNM * | BS * |
|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 1 | 138.63 | 63.63 | 58.24 | 95.84 | 123.27 | 2.00 | 671.40 | - | 735.03 |
| 1 | 2 | 264.49 | 113.70 | 107.12 | 111.30 | 140.42 | 2.19 | 107.11 | 126.67 | |
| 1 | 5 | 378.99 | 77.59 | 59.83 | 123.48 | 147.61 | 2.59 | 144.81 | 169.95 | |
| 1 | 10 | 721.07 | 69.22 | 59.64 | 140.02 | 168.14 | 2.60 | 185.90 | 227.98 | |
| 1 | 50 | 3392.54 | 64.83 | 56.45 | 132.25 | 157.98 | 2.59 | 458.77 | 499.47 | |
| 1 | 100 | 6830.86 | 64.02 | 59.45 | 133.24 | 156.58 | 2.59 | 555.32 | 700.73 | |
| 2 | 2 | 208.54 | 82.78 | 79.38 | 100.54 | 122.87 | 1.99 | 79.98 | 91.15 | 58.24 |
| 2 | 5 | 187.99 | 75.40 | 60.84 | 105.71 | 125.46 | 2.00 | 79.00 | 106.72 | |
| 2 | 10 | 287.77 | 75.40 | 60.44 | 110.31 | 132.45 | 2.19 | 92.35 | 169.94 | |
| 2 | 50 | 1024.27 | 75.41 | 57.84 | 115.29 | 134.64 | 2.39 | 163.76 | 263.10 | |
| 2 | 100 | 2094.05 | 77.39 | 66.42 | 115.70 | 134.45 | 2.19 | 223.61 | 334.71 | |
| 5 | 5 | 175.33 | 78.99 | 48.08 | 103.92 | 128.05 | 2.19 | 73.00 | 99.13 | 49.67 |
| 5 | 10 | 203.26 | 67.63 | 47.87 | 106.31 | 125.47 | 2.00 | 75.80 | 98.34 | |
| 5 | 50 | 619.75 | 70.21 | 49.87 | 107.31 | 126.26 | 2.19 | 113.50 | 169.15 | |
| 5 | 100 | 1065.74 | 69.22 | 51.46 | 111.90 | 132.45 | 1.99 | 148.80 | 230.98 | |
| 10 | 10 | 210.44 | 68.21 | 48.27 | 105.52 | 130.65 | 2.20 | 70.30 | 76.20 | 45.08 |
| 10 | 50 | 441.21 | 70.81 | 49.07 | 109.30 | 123.87 | 1.99 | 92.55 | 108.71 | |
| 10 | 100 | 745.02 | 73.00 | 49.06 | 106.12 | 126.27 | 2.19 | 125.07 | 147.80 | |
| 50 | 50 | 391.35 | 66.23 | 46.88 | 107.11 | 128.25 | 2.59 | 65.83 | 72.21 | 45.07 |
| 50 | 100 | 492.49 | 67.81 | 45.67 | 106.32 | 128.06 | 2.39 | 72.30 | 81.38 | |
| 100 | 100 | 515.02 | 66.62 | 49.07 | 105.32 | 136.04 | 2.39 | 70.60 | 75.20 | 45.48 |

* Take α = 1.001 instead of α = 1 and β = 1.001 instead of β = 1.
Table 8. Execution times (in milliseconds) for specific algorithms (α < 1).

| Alpha | Beta | BA | BC | rbeta | SW1 * | SW2 |
|---|---|---|---|---|---|---|
| 0.1 | 0.1 | 166.39 | 196.48 | 3.59 | | 112.49 |
| 0.1 | 0.3 | 219.61 | 229.19 | 3.19 | | 97.94 |
| 0.1 | 0.5 | 242.16 | 237.17 | 2.79 | | 76.80 |
| 0.1 | 0.9 | 239.15 | 255.32 | 2.99 | | 68.41 |
| 0.1 | 1 | 257.91 | 246.14 | 2.99 | 63.63 | |
| 0.1 | 2 | 270.68 | 244.35 | 3.19 | 68.42 | |
| 0.1 | 5 | 255.17 | 248.34 | 2.39 | 70.82 | |
| 0.1 | 10 | 254.53 | 246.14 | 2.59 | 66.62 | |
| 0.1 | 50 | 261.70 | 246.14 | 2.59 | 69.41 | |
| 0.1 | 100 | 253.52 | 247.14 | 2.59 | 75.60 | |
| 0.3 | 0.3 | 142.62 | 164.56 | 2.79 | | 89.17 |
| 0.3 | 0.5 | 153.19 | 179.12 | 2.39 | | 85.76 |
| 0.3 | 0.9 | 169.95 | 203.65 | 2.79 | | 68.02 |
| 0.3 | 1 | 173.73 | 191.09 | 2.59 | 63.23 | |
| 0.3 | 2 | 185.50 | 202.46 | 2.79 | 69.01 | |
| 0.3 | 5 | 192.09 | 207.84 | 2.39 | 71.81 | |
| 0.3 | 10 | 192.08 | 206.05 | 2.59 | 73.44 | |
| 0.3 | 50 | 196.27 | 208.44 | 2.59 | 74.99 | |
| 0.3 | 100 | 199.07 | 210.04 | 2.39 | 75.60 | |
| 0.5 | 0.5 | 118.89 | 153.99 | 2.39 | | 76.20 |
| 0.5 | 0.9 | 135.03 | 162.76 | 2.59 | | 64.23 |
| 0.5 | 1 | 139.83 | 167.55 | 2.59 | 70.41 | |
| 0.5 | 2 | 163.56 | 179.12 | 2.60 | 72.40 | |
| 0.5 | 5 | 161.56 | 198.47 | 2.79 | 75.80 | |
| 0.5 | 10 | 166.16 | 188.89 | 2.59 | 77.20 | |
| 0.5 | 50 | 168.56 | 189.49 | 2.58 | 77.39 | |
| 0.5 | 100 | 177.92 | 191.69 | 2.59 | 78.39 | |
| 0.9 | 0.9 | 99.54 | 129.65 | 2.19 | | 67.42 |
| 0.9 | 1 | 101.92 | 135.64 | 2.20 | 73.80 | |
| 0.9 | 2 | 118.48 | 146.21 | 2.39 | 72.21 | |
| 0.9 | 5 | 128.86 | 160.76 | 2.40 | 74.20 | |
| 0.9 | 10 | 132.45 | 165.36 | 2.79 | 73.61 | |
| 0.9 | 50 | 137.43 | 169.75 | 2.40 | 75.79 | |
| 0.9 | 100 | 135.84 | 174.13 | 2.59 | 75.00 | |

* Take α = 1.001 instead of α = 1 and β = 1.001 instead of β = 1.
Table 9. Execution times (in milliseconds) for algorithms implemented in R (α, β ≥ 1).

| Alpha | Beta | Gamma | NINIGL | rbeta |
|---|---|---|---|---|
| 1 | 1 | 3.00 | 0.60 | 2.00 |
| 1 | 2 | 3.18 | 1.40 | 2.19 |
| 1 | 5 | 2.80 | 1.00 | 2.59 |
| 1 | 10 | 2.99 | 1.20 | 2.60 |
| 1 | 50 | 2.59 | 1.00 | 2.59 |
| 1 | 100 | 2.59 | 1.40 | 2.59 |
| 2 | 2 | 3.20 | 2.59 | 1.99 |
| 2 | 5 | 2.59 | 2.39 | 2.00 |
| 2 | 10 | 2.79 | 2.62 | 2.19 |
| 2 | 50 | 2.59 | 2.39 | 2.39 |
| 2 | 100 | 2.19 | 2.19 | 2.19 |
| 5 | 5 | 3.19 | 1.60 | 2.19 |
| 5 | 10 | 2.20 | 1.20 | 2.00 |
| 5 | 50 | 2.19 | 1.60 | 2.19 |
| 5 | 100 | 2.39 | 1.60 | 1.99 |
| 10 | 10 | 2.19 | 1.00 | 2.20 |
| 10 | 50 | 2.00 | 1.59 | 1.99 |
| 10 | 100 | 2.40 | 1.60 | 2.19 |
| 50 | 50 | 1.78 | 1.20 | 2.59 |
| 50 | 100 | 1.80 | 1.40 | 2.39 |
| 100 | 100 | 2.19 | 1.39 | 2.39 |
Table 10. Execution times (in milliseconds) for algorithms implemented in R (α < 1).

| Alpha | Beta | Gamma | NINIGL | rbeta |
|---|---|---|---|---|
| 0.1 | 0.1 | 4.38 | - | 3.59 |
| 0.1 | 0.3 | 4.80 | - | 3.19 |
| 0.1 | 0.5 | 3.99 | 44.89 | 2.79 |
| 0.1 | 0.9 | 4.18 | 43.78 | 2.99 |
| 0.1 | 1 | 2.99 | 41.89 | 2.99 |
| 0.1 | 2 | 3.20 | 38.78 | 3.19 |
| 0.1 | 5 | 2.99 | 39.70 | 2.39 |
| 0.1 | 10 | 2.79 | 37.70 | 2.59 |
| 0.1 | 50 | 2.60 | 37.10 | 2.59 |
| 0.1 | 100 | 2.79 | 37.10 | 2.59 |
| 0.3 | 0.3 | 3.80 | - | 2.79 |
| 0.3 | 0.5 | 4.09 | 11.57 | 2.39 |
| 0.3 | 0.9 | 4.38 | 14.76 | 2.79 |
| 0.3 | 1 | 3.60 | 13.36 | 2.59 |
| 0.3 | 2 | 3.39 | 12.77 | 2.79 |
| 0.3 | 5 | 2.99 | 11.37 | 2.39 |
| 0.3 | 10 | 2.99 | 10.97 | 2.59 |
| 0.3 | 50 | 3.19 | 12.17 | 2.59 |
| 0.3 | 100 | 2.99 | 11.17 | 2.39 |
| 0.5 | 0.5 | 4.19 | 3.39 | 2.39 |
| 0.5 | 0.9 | 4.39 | 3.99 | 2.59 |
| 0.5 | 1 | 4.79 | 3.06 | 2.59 |
| 0.5 | 2 | 3.79 | 4.39 | 2.60 |
| 0.5 | 5 | 3.18 | 4.59 | 2.79 |
| 0.5 | 10 | 3.19 | 4.79 | 2.59 |
| 0.5 | 50 | 3.20 | 4.79 | 2.58 |
| 0.5 | 100 | 3.19 | 4.29 | 2.59 |
| 0.9 | 0.9 | 4.18 | 3.39 | 2.19 |
| 0.9 | 1 | 3.59 | 2.79 | 2.20 |
| 0.9 | 2 | 3.60 | 3.59 | 2.39 |
| 0.9 | 5 | 3.39 | 3.39 | 2.40 |
| 0.9 | 10 | 3.38 | 3.10 | 2.79 |
| 0.9 | 50 | 2.99 | 3.59 | 2.40 |
| 0.9 | 100 | 3.19 | 2.99 | 2.59 |