PaCAL : A Python Package for Arithmetic Computations with Random Variables

In this paper we present PaCAL , a Python package for arithmetical computations on random variables. The package is capable of performing the four arithmetic operations: addition, subtraction, multiplication and division, as well as computing many standard functions of random variables. Summary statistics, random number generation, plots, and histograms of the resulting distributions can easily be obtained and distribution parameter ﬁtting is also available. The operations are performed numerically and their results interpolated allowing for arbitrary arithmetic operations on random variables following practically any probability distribution encountered in practice. The package is easy to use, as operations on random variables are performed just as they are on standard Python variables. Independence of random variables is, by default, assumed on each step but some computations on dependent random variables are also possible. We demonstrate on several examples that the results are very accurate, often close to machine precision. Practical applications include statistics, physical measurements or estimation of error distributions in scientiﬁc computations.


Introduction
Arithmetic operations on random variables have many applications in statistics, error propagation or probabilistic inference.In this paper we present PaCAL, a Python (van Rossum et al. 2011) package which allows for computation with random variables as one does with ordinary variables in a computer program.The resulting densities are computed numerically and approximated using Chebyshev interpolation, allowing for computations involving a very wide class of distributions which are allowed to have infinite supports and singularities in their density functions.Thanks to the very high numerical stability of Chebyshev methods, excellent accuracy is typically achieved, comparable with that of double precision arithmetic.The package focuses mainly on independent random variables, however some functionality for computations on dependent variables is also available.
We have tested the package on hundreds of examples, including most examples and exercises from Springer (1979), as well as numerous relationships between distributions listed in Wikipedia and statistical textbooks, achieving, in almost all cases, excellent accuracy.
In this section we discuss the related work and give a high level overview of the package's structure and capabilities, as well as installation and basic usage instructions.Section 2 gives a detailed overview of the package with numerous examples, Section 3 presents the facilities available for computing with dependent random variables, Section 4 presents selected applications, and finally, Section 5 concludes and discusses future directions.

Basic usage
Let us start with a simple example corresponding to measuring an angle A by measuring the opposite Y and adjacent X whose values are not known exactly but follow uniform distributions on [1,2] and [3,4] respectively.The following code will produce the probability density function of A: The resulting plot is shown in Figure 1.The density is clearly not normal and asymmetric.Of course A can be used in further computations.
In the rest of the paper we omit the import statements and commands related to subplots, legends, etc.

Related approaches
Related approaches fall into three general groups: symbolic, based on closed families of distributions, and sampling based.
The symbolic approach relies on the capabilities of computer algebra systems (CAS) to explicitly solve the integrals involved in performing operations on random variables.Such systems have the advantage of yielding exact results whenever the solver is capable of computing all the necessary integrals.In practice however, the integrals do not usually have closed form representations, and numerical fallbacks are necessary.Such fallbacks, if provided by the CAS, are typically very inefficient and not suitable for practical use.A prominent example of such a system is the APPL package (Glen, Evans, and Leemis 2001) based on Maple (Maplesoft 2008).Basic symbolic functionality for probabilistic arithmetic is also available in Mathematica (Wolfram 2003).
Another class of approaches is based on closed families of distributions.The most important of such approaches is described in Springer (1979) and is based on so called H-functions, which are a generalization of hypergeometric functions.The family of independent random variables whose densities are H-functions is closed under multiplication and division but not under addition and subtraction.The approach presented in Springer (1979) is primarily analytical, but numerical procedures based on moment estimation are also discussed.Lack of closure under addition and reliance on moments limit the usefulness of the approach.Another approach, closer in spirit to PaCAL, is presented in Williamson (1989) and Williamson and Downs (1990).The method is based on Laguerre polynomials and relies on the closure of such polynomials under Fourier and Mellin convolutions.The approach does not allow for division of random variables and does not allow for densities with discontinuities or singularities.
An approach based on polynomial interpolation (used also by PaCAL) has been demonstrated for a very general case of dependent random variables in Shenoy and West (2011).However, that approach uses symbolic integration by Mathematica and relies on closeness of polynomials under integration.As a result, multiplication and division are not possible except for special cases.Moreover, no convenient software package is provided.
The last group of methods are purely statistical approaches based on Monte Carlo (MC) sampling.The MC approach currently offers the greatest flexibility allowing for an arbitrary number of dependent random variables, Bayesian inference, etc. Examples of such systems are BUGS (Lunn, Thomas, Best, and Spiegelhalter 2000;Spiegelhalter, Thomas, Best, and Lunn 2003) and PyMC (Patil, Huard, and Fonnesbeck 2010).The MC approach is also suggested as the method of choice in the official guide on assessing uncertainty in measurements (JCGM 2009).The main weakness of the MC approach is slow convergence; the accuracy grows with square root of the sample size, thus to obtain one more correct decimal place of the result a 100 times larger sample is needed.Typically, sampling based methods only provide 3-4 decimal places of accuracy.The situation gets much worse in the tails of the distributions where few samples are available.
PaCAL can be viewed as a variant of the method based on closed representations, however, the class we use (that of approximable densities) is extremely broad and allows for heavy tails, singularities in densities, and the use of arbitrary arithmetic operations.Our earlier paper (Jaroszewicz and Korzeń 2012) explores the theoretical aspects of the approach, describes the closed family of distributions, and discusses details of interpolation and integration procedures, while this paper concentrates on a detailed description of the package itself.Several new examples are presented, as well as functionality not covered in the earlier paper, such as order statistics or optimized operations on independent identically distributed (i.i.d.) sequences.All parts concerning discrete distributions and dependent random variables are also new.
Finally, we need to mention the Chebfun package (Battels and Trefethen 2004;Trefethen and others 2011) which uses the same methodology to implement computations on arbitrary functions, and was the main inspiration for us.It is however not tailored towards probabilistic computation (in fact the only function useful in the probabilistic context is convolution).We focus solely on probabilistic arithmetic, which allows us to obtain better accuracy for a wider range of probabilistic operations.

Overview of PaCAL
We now give a high level overview of the structure and capabilities of the package.
The base class for all probability distributions is the class 'Distr'.Objects of this class are random variables following some distribution.The result of any operation on objects of this class is again an instance of 'Distr', which can be used in subsequent computations.The main part of the package implements efficient arithmetic on independent random variables with the four arithmetic operations: addition, subtraction, multiplication and division.This operations are implemented for the base class 'Distr' and, thanks to operation overloading, can be performed using standard arithmetic symbols +, -, *, /.Besides basic arithmetic, three additional binary operations are available: min, max, and the power operation which overloads the ** symbol.Moreover, a number of standard functions such as square root, exponential function or logarithm are available.Summary statistics of random variables such as mean, variance, skewness, etc., can be computed through the methods of the 'Distr' class.More advanced functionality like plots, such as histograms, quantiles and random number generation is also implemented.
Several standard continuous distributions are predefined in the package.Others, such as noncentral distributions, are defined, internally, using arithmetic operations on predefined distributions.Besides continuous random variables, discrete distributions are also available, but unlike the continuous case, they are restricted to finite domains.Continuous and discrete random variables can be freely mixed in calculations.Certain operations (e.g., min and max) can lead to mixed continuous-discrete distributions which are also supported.For example the random variable Y = min(X, 0), where X ∼ N (0, 1) is of such a mixed type with P(Y = 0) = 0.5 and the continuous part being the right arm of the normal density.
The stats sub-package contains some useful statistical procedures for i.i.d.variables.It contains two modules: stats.distr_estimplementing distribution parameter fitting and stats.iid_opscontaining optimized routines for probabilistic arithmetic on i.i.d.random variables.These include products, averages (arithmetic and geometric), minimum, maximum and order statistics.
The remaining part of the package implements an early approach to computations with dependent random variables.Currently the functionality is limited to the two-variable case and densities without singularities.The base class for multidimensional distributions is 'NDDistr', representing the joint density of two variables, from which all multidimensional distributions are derived.Currently, bivariate product distribution, bivariate normal distribution and a joint distribution of two ordered statistics are available.More complicated relationships can also be modeled using copulas.After defining the joint distribution of two variables, arithmetic operations can be performed on them.

Installation
This paper is based on PaCAL version 1.1.All examples will also work in the newer versions.
The following package is also recommended: Cython (Behnel, Bradshaw, Citro, Dalcin, Seljebotn, and Smith 2011) needed only for compilation of optimized code sections from source.PaCAL works correctly without the compiled parts, albeit less efficiently.
The latest version of PaCAL can be installed directly from the PyPI repository of Python packages using

easy_install pacal
For Windows users, we provide a Windows installer.To install from source, the tar.gz file needs to be downloaded and unpacked.Afterward, the standard command for installation of Python packages needs to be executed in the package directory: python setup.pyinstall

A remark on semantics
Two cases need to be distinguished: that of identical random variables and that of identically distributed random variables.If X and Y are i.i.d.random variables, then X + Y and X + X = 2X have, in general, different distributions.PaCAL, however, treats both cases in the same way -arguments of all operations are always treated as independent.Consider the following two lines: >>> S1 = UniformDistr() + UniformDistr() >>> U = UniformDistr(); S2 = U + U Warning: arguments treated as independent The distributions of S1 and S2 are the same, but the second expression gives a warning.We inform the user, that the result may be inconsistent with her intentions.If this is the case, one should rather write S2 = 2 * U.
The semantics is different in the dependent variable part of the package, where using the same instance several times always refers to the same random variable.Consequently U + U is equivalent to 2 * U.

Detailed description of PaCAL's capabilities
In this section we present the capabilities of PaCAL in detail together with several examples.We begin by presenting the available distributions, later we discuss the methods of the core 'Distr' class and other facilities, such as functions of random variables, conditioning, plotting and random number generation.

Distributions
Several standard continuous distributions are implemented directly (default parameters are given in the parentheses): Arbitrary finite discrete distributions can be created by instantiating the 'DiscreteDistr' class directly.Remember that PaCAL allows only for finite supports in the discrete case.Truncation can be used to approximate infinite distributions.The strategy is very effective for distributions with exponentially decaying probabilities, such as the Poisson.The approach will, however, be inefficient for slowly decaying distributions, because in this case the number of points with high probability masses may be extremely large.
Direct implementation of discrete random variables with infinite supports using PaCAL's methodology seems to be a challenging task, as it would require efficient methods of approximating infinite, irregularly spaced discrete functions.

Summary statistics
The following descriptive statistics are implemented as methods of the 'Distr' class and are thus available for all random variables: The summary() method prints the most important descriptive statistics.

Numerical accuracy
High numerical accuracy was a key issue for us while designing the package, it was thus necessary to be able to assess the accuracy of the result.Clearly, when the theoretical distribution or its parameters, such as the mean, are known, we may use them to evaluate the accuracy directly.The interpolation and integration procedures also allow for error estimation, and such estimates are available in PaCAL.However, a very useful estimate of the accuracy is the integral of the density over the whole real line (the 'zeroth' moment).Since we take no specific steps to guarantee that the densities integrate to 1, the quantity |1 − ∞ −∞ f | is a useful indicator of accuracy which takes into account the accumulation of error over subsequent operations.We call this quantity the • 1 error; it is available via the int_error() method of class 'Distr'.We have noticed that the • 1 error is a good indicator of the order of magnitude of errors of other statistical indicators such as the mean, variance or quantiles.
We now give two examples demonstrating the accuracy of PaCAL on two numerically difficult cases involving operations on densities with singularities.A related example can be found in Section 2.10.An example involving distributions with heavy tails, another kind of numerical difficulty, can be found in Section 2.8 and in Section 4 (Hill's estimator); Section 2.4 contains examples of extremely heavy tails pushing the package to its limits.
The first example is the sum of four χ 2 distributions with various numbers of degrees of freedom: The χ 2 distribution is a good benchmark because of its known theoretical properties: the sum of independent χ 2 variables also has a χ 2 distribution whose number of degrees of freedom is the sum of degrees of freedom of the summation terms; all moments of the distribution are known and have integer values.Moreover, the distribution is numerically difficult: for one degree of freedom, the density has a singularity at zero which poses a challenge to interpolation and integration.The result S above follows a χ 2 200 distribution, and we can compare its moments with known theoretical values.The • 1 error is 4.66 • 10 −15 .One can also immediately see, that the mean and variance are correct to at least 14 decimal digits.The median (not an integer) is also accurate to 14 decimal places.
A similar example demonstrates the noncentral χ 2 distribution.The density of this distribution does not have a closed form representation and, instead, is computed numerically as X 2 + Y , with X ∼ N (λ, 1) and Y ∼ χ 2 df −1 , where df is the number of degrees of freedom and λ the noncentrality parameter.The density of the first term of the sum has a singularity at zero.Below we compute ten moments (m k ) and central moments (µ k ) of the distribution: >>> X = NoncentralChiSquareDistr(df = 10, lmbda = 3) >>> for k in range(11): ... print k, "m_k=", repr(X.moment(k,0)), ",\tmu_k=", repr(X.moment(k))It is known that, for an integer λ, the moments of the noncentral χ 2 distribution are integers, so one can immediately see that the accuracy of 14-15 digits is achieved even for higher moments.
The noncentral χ 2 distribution can be computed using many equivalent expressions.The one described above was picked because it requires only a single convolution of two densities, only one of which involves a singularity, while the other is given by an explicit formula.As a rule of thumb, when several equivalent expressions exist, it is usually best to pick the one which involves the fewest operations on distributions with singularities and/or heavy tails, since they are more difficult to interpolate and integrate numerically.

Functions of random variables
Functions of random variables are often used in statistics.We implement a number of standard functions: exp, log, sqrt, atan, abs, sign, sin, cos and tan1 .Powers of random variables can be computed using the standard Python ** operator.Moreover, the power operation is implemented also for the case when both arguments are random variables.
In the examples presented so far the accuracy has always been very high.Here we present two examples showing the behavior of the package in very difficult cases which push its capabilities to the limits: >>> N = abs(NormalDistr()) ** NormalDistr() >>> E = exp(CauchyDistr()) + exp(CauchyDistr()) >>> print N.int_error(), N.median() >>> print E.int_error() 2.67842808e-06 1.00000013926 0.013874124313 The first line raises an absolute value of a normal variable to a power whose exponent is also a normal variable, the second computes the sum of exponents of two independent Cauchy random variables.Both densities are shown in Figure 2. In the first case, the • 1 error of the result is 2.67 • 10 −6 .Note the singularities at zero and at one; it is the second singularity which causes the loss of precision 2 .The distribution does not have any moments, but it can be seen that the median is accurate up to 6 digits.The second result has the • 1 error of 1.38 • 10 −2 .This might seem to be quite a weak result, note however that the exponent of a Cauchy random variable decreases only as 1 x log 2 x for x → ∞.The tail is so heavy that 4.48 • 10 −4 of the probability mass lies beyond the range of double precision arithmetic!PaCAL is nevertheless able to add the two random variables and, as the histogram shows, the result is fairly accurate.See Section 2.8 and the Hill's estimator example in Section 4 for more typical heavy tailed distributions which PaCAL integrates with very high precision.
The next example demonstrates the abs and sign functions.It illustrates a property of symmetric distributions: the product of two independent random variables representing their absolute value and sign (a discrete random variable taking values −1 and +1) follows the original distribution.

Random number generation
Random number generation is supported through the rand method of the 'Distr' class: rand(n) returns a random sample of size n taken from the distribution of the random variable on which the method is called.
For standard distributions, the method uses native random generators from the numpy.randompackage; arithmetic operations are implemented by performing those operations on samples returned by the rand method of the arguments.A more difficult case arises for conditional (Section 2.9) or user defined distributions, where such methods are not available.In those cases we resort to sampling based on the inverse of the cumulative distribution function implemented by the method rand_invcdf(n, use_interpolated = True).If use_interpolated is set to True, the inverse CDF is precomputed (using numerical root finding) and stored in an interpolated form before generating random numbers, greatly speeding up the approach at the cost of a slight decrease in accuracy.
For comparison, taking a sample of size 1000000 from the normal distribution using rand takes 0.06 s and is about 40 times faster then using rand_invcdf with the flag use_interpolated = True which takes 2.2 s.Using rand_invcdf, without interpolation is much slower: generating a sample of size 10000 takes about 26 s.

Plots and histograms
Graphical methods available in the 'Distr' class are:

Operating on probability densities and cumulative distributions
The 'Distr' class serves as an external representation of the random variable, but most of the computations are performed internally on probability density functions (PDFs) which are implemented as piecewise interpolated functions.It is possible to directly access the PDF as well as the cumulative distribution function (CDF) of a distribution using the following methods: pdf(x) returns the value of the PDF at x, cdf(x) returns the value of the CDF at x, quantile(y) returns the quantile function of the random variable at y, get_piecewise_pdf() returns an object representing the PDF, get_piecewise_cdf() returns an object representing the CDF, get_piecewise_invcdf() returns an object representing the inverse of the CDF (the quantile function).
The last three methods return objects of the 'PiecewiseFunction' class which represents piecewise continuous functions.Several operations can be performed on objects of this class like plotting or integrating.It is also possible to perform pointwise arithmetic operations such as addition or subtraction (appropriate operators have been overloaded).The 'PiecewiseFunction' class is thus a simple Python version of the 'chebfun' class form the Chebfun library (Battels and Trefethen 2004;Berrut and Trefethen 2004), containing, however, only the few operations needed to implement probabilistic arithmetic.
The following example shows an application of this functionality to compute the difference between the true and empirical CDFs, and the Kolmogorov-Smirnov statistic: K_n is the value of the Kolmogorov-Smirnov statistic and D_n the maximum absolute difference between the cumulative distribution functions.The plot is shown in Figure 3.This is in fact a finite approximation of a Brownian bridge process.

Discrete and mixed distributions
Discrete distributions can be used in the same way as continuous ones.Mixed continuousdiscrete distributions are also possible.Below are some examples.
In Feller (1971, Chapter V, Chapter 4) there is an interesting example showing that the uniform distribution is a convolution of two singular Cantor-type distributions.In PaCAL, due to finite representation, we cannot represent such distributions directly.They can however be approximated: Figure 4 shows the CDFs of V and X (the CDF of U is similar to V).It can be seen that X approximates the uniform distribution quite well; in fact, its CDF is composed of 256 steps.

Conditional distributions
In PaCAL, there are two classes representing conditional (truncated) distributions: where U and L are constants.We have overridden the symbol | (Python's or), to make our syntax closer to standard probabilistic notation.The syntax is: X | Condition where X is a random variable (instance of class 'Distr') and Condition is one of: Gt(L), Lt(U), Between(L, U).For example, let us verify that the exponential distribution is a memoryless distribution: -(CE -7).get_piecewise_pdf() >>> print "absolute_error =", r.max_abs() [1] absolute_error = 4.3520742565306136e-13

Operations on i.i.d. random variables
The module pacal.stats.iid_opscontains optimized operations for sequences of i.i.d.variables.The following functions are provided: iid_sum(X, n), iid_average(X, n): sum and average of n i.i.d.r.v.s with distribution identical to that of X, iid_prod(X, n), iid_average_geom(X, n): product and geometric mean of n i.i.d.r.v.s, iid_order_stat(X, n, k): the distribution of kth order statistic of an i.i.d.sample size of n.
All the above methods take an additional argument all, by default set to False.When set to True, the function returns a vector of results for all i = 1, . . ., n. E.g., iid_sum(X, n, all = True) returns the vector of partial sums: If all were set to False, only a single value, S n , would be returned.
The example below demonstrates the extreme value theorem for the exponential distribution.

Distribution fitting
The stats.distr_est module contains a single class 'LoglikelihoodEstimator' which can be used for simple fitting of the parameters of continuous distributions to data using the maximum likelihood method.Below we use a sample taken form a beta distribution to recover its parameters:

Towards dependent random variables
All variables in the previous sections have been assumed to be independent.Sometimes, however, this assumption does not hold, and we need to compute with dependent random variables.This section describes the facilities for such computations available currently in PaCAL.They are essentially limited to the two-variable case, extension to larger number of variables is a work in progress.Operations on dependent variables are more restricted then in the case of independence.Only continuous variables are implemented, densities are not allowed to have singularities, maximum and minimum operations are currently not available.
The process of computing with dependent random variables requires two steps: specifying the joint distribution of the variables and computing the distribution of the result of arithmetic operations on them.We describe those steps in turn.
The abstract class 'NDDistr' is the base class for multidimensional (currently two-dimensional) distributions.The main method of this class is pdf(*X) which defines the joint probability density function of the two random variables.Its arguments are the values of each dimension of the variable.Several subclasses are available including multidimensional normal distributions, copulas etc.They are described in detail below.
The class 'TwoVarsModel' performs the actual arithmetic on dependent random variables.One needs to supply the joint distribution of the two variables and the function of the random variables to be computed.The class has a method eval which performs the actual computations and returns the distribution of the desired function of the two variables.While functions of only two dependent variables are currently supported, the function itself may be defined using an arbitrary number of intermediate variables.
It is currently assumed that the function can be inverted symbolically.Mathematical details of the computation are described in the Appendix.The class depends on the sympy package for symbolic computation (SymPy Development Team 2008).Symbolic computations are used to find the inverse of the function or the random variables and its Jacobian.Note that this is different from purely symbolic approaches, as the distributions themselves are represented numerically and need not be expressed explicitly.

Multidimensional distributions
We now describe the multidimensional distributions available in PaCAL: 'NDNormalDistr' class implements the multidimensional normal distribution with given mean vector mu and covariance matrix Sigma.
'IJthOrderStatsNDDistr' class represents the joint distribution of the ith and jth order statistics of an i.i.d.random sample.The constructor takes four arguments: X the distribution from which the sample is taken, n the size of the sample, i, j the order statistics to use.
Copulas.The most flexible option available in PaCAL are copula distributions which are described in detail below.The constructors take the argument marginals containing a list of marginal distributions, and possibly other arguments: parameters of the copula.

Copulas
Copulas allow for construction of multidimensional distributions with given marginal distributions.They allow one, for example, to obtain a joint distribution with given marginals and given rank correlation coefficient.Detailed discussion of copulas is beyond the scope of this paper, for more information see e.g., Nelsen (2006); some basic information is also included in the Appendix.
There are four copula types available in PaCAL (we follow the naming used by Nelsen 2006).The simplest is the PiCopula which implements the product distribution corresponding to independent random variables.The three remaining types of copulas are: FrankCopula, GumbelCopula, ClaytonCopula.See the Appendix, the examples below, and Nelsen (2006) for details.We do not aim to provide a full featured copula implementation at this moment, just a usable solution to do useful work with dependent random variables.

Parallel circuit of two resistors
The equivalent resistance of a parallel combination of two resistors is equal to where R 1 and R 2 are the resistances of the two resistors.We would like to apply PaCAL to this formula when the resistances are not known exactly, and instead, their probability distributions are given.Even though R 1 and R 2 are independent, we cannot use the arithmetic of independent random variables because the numerator and the denominator of the fraction are dependent.Note that in this specific case we can rewrite the expression as ( 1 R 1 + 1 R 2 ) −1 for which arithmetic of independent variables can be used, however, the modules of PaCAL which operate on dependent variables can handle this expression directly.Consider for example the case where R 1 = 1 ± 0.5 Ω, R 2 = 2 ± 0.5 Ω: >>> R1 = UniformDistr(0.5,1.5) >>> R2 = UniformDistr(1.5,2.5) >>> pr = PiCopula(marginals = [R1, R2]) >>> M = TwoVarsModel(pr, R1 * R2 / (R1 + R2)) >>> R_equiv = M.eval() >>> R_equiv.plot(linewidth= 2.0, color = 'k') >>> R_equiv.hist()>>> R_equiv.summary() The result is shown in Figure 9.Note that the result is an object of class 'Distr' and supports all its methods, for example using R_equiv.cdf(0.75)we obtain P(R < 0.75 Ω) = 0.7266.

General regression
This example does not directly involve arithmetic on random variables, but it demonstrates other capabilities of PaCAL for computing with dependent variables.Given two dependent random variables X and Y , one can define several types of regressions of Y on X.Typically, one models the expectation of Y conditioned on X, but there are other possibilities, including conditional median, mode or quantiles.Given a joint distribution of X and Y one can use PaCAL to easily obtain such regression curves.For a given value x we build a conditional distribution of Y conditioned on X = x and calculate any parameter of the distribution such as the mean, median, mode, etc. Repeating the procedure for various values of x a regression curve is obtained.Below is a sample piece of code which draws a regression curve for the mean of Y when the joint distribution of X and Y is defined using the Frank copula.

Applications
In this section we present examples of statistical applications of PaCAL.

Sample distribution of the difference of proportions
Some tests for equality of two proportions (Newcombe 1998) rely on the exact distribution of their difference.Suppose we have two Bernoulli distributions with success probabilities p 1 and p 2 , draw independent random samples of size n 1 = 8 and n 2 = 7 according to those distributions, and want to compute the distribution of the difference of proportions of successes in the two samples.Suppose that n 1 = 8, n 2 = 7, p 1 = 1/2, p 2 = 2/7; the desired distribution can easily be computed in PaCAL:

Hill's tail exponent estimator
Suppose that our data comes from a heavy tailed distribution with density decreasing as x −β when x → ∞.We want to estimate β based on an i.i.d.sample of size N .A popular choice is the Hill's estimator given by the following equation (Clauset, Shalizi, and Newman 2009) , where X i are the samples and x min > 0 a threshold; only samples X i greater than or equal to x min are included in the sample.The • 1 error is about 5.32 • 10 −15 .It can also be seen that the bias of the estimator for N = 21 is already quite small.

Weighted sum of t distributions
Some statistical procedures use weighted sums of Student's t distributions.For example one of the solutions to the Behrens-Fisher uses the statistic t 1 sin θ − t 2 cos θ for some constant θ (Springer 1979).Using PaCAL one can easily work with such distributions.For example if t 1 and t 2 are Student's t distributed with 10 and 20 degrees of freedom respectively, and θ = π/4, the distribution can be obtained using the following code >>> theta = pi/4 >>> weightedT = sin(theta) * StudentTDistr(10) -\ ... cos(theta) * StudentTDistr(20) The interval supporting the middle 95% of the distribution is [−2.146, 2.146].The • 1 error of the computed distribution is 6.66 • 10 −16 , so one can expect the results to be very accurate.

Distributions of sample range and interquantile range
Order statistics have been implemented in PaCAL as discussed in Section 2.10.However, different order statistics of the sample are not independent, so arithmetic of independent random variables cannot be used.We have thus also implemented the joint probability distribution of the ith and jth order statistic (see Appendix A.1 for the formulas used).The class constructed by IJthOrderStatsNDDistr(X, n, i, j) implements the two dimensional joint distribution of ith and jth order statistic for a sample of size n taken from a population whose distribution is given by X.
Consider a sample of size 8 taken from a normal population.Suppose we want to obtain the distribution of the difference between the 7th and the 2nd value in the ordered sample.
implements functions D2 and D3 which return the mean and standard deviation of sample range of n independent normal random variables.Setting n = 5, i = 1, and j = 5 in the last example, we obtain R.mean() = 2.3259289472810414 and R.std() = 0.86408194109950442, the same values the SAS/QC package gives (respectively 2.3259289473 and 0.8640819411).

Other examples
We have tested PaCAL on hundreds of examples, including most examples and exercises from Springer (1979), as well as several properties of various distributions given in Wikipedia and statistical textbooks.Many examples are available on the package's website at http: //pacal.sourceforge.net/gallery/.

Conclusions and future research
The PaCAL package provides accurate and efficient arithmetic on independent random variables.As our examples have shown, even complicated computations can be performed with accuracy close to machine precision.Besides arithmetic, the package provides many functions which are useful in statistical calculations, such as means, moments, medians, plots, histograms, etc.
A somewhat limited, but nevertheless useful, module for computation with dependent variables is also available.Extending this module to allow for arithmetic with more than two dependent variables and more complex dependency models will be the main goal of future development.

A.2. Minimum, maximum and order statistics
Let us keep the notation from the previous section.The cumulative distribution functions of the minimum and maximum of X and Y are given by: CDF max(X,Y ) (x) = F (x)G(x), and (after differentiation) their probability density functions are: x −∞ g(x) dx + g(x) Those formulas can be generalized to arbitrary order statistics.Let X i , i = 1, . . ., n be a set of i.i.d.random variables with PDF f and CDF F .Let X (i) be the ith order statistic of X i 's.
Then we have (Hoffmann-Jørgensen 1994b, Chapter 2.24) where the second equation is obtained by differentiating the first, and n i−1,1,n−i is the multinomial coefficient.

A.3. Copulas
Copulas are a way of defining joint distributions of several variables based on provided marginals.The joint CDF of variables X and Y is given by CDF X,Y (x, y) = C(F X (x), F Y (y)), where F X and F Y are CDFs of X and Y respectively, and C is the copula function.A good introduction to copulas can be found in Nelsen (2006).In PaCAL we implement the product (independence) copula and three one-parameter families of Archimedean copulas defined as where ϕ determines the family of the copula and is one of Clayton ϕ(t) = 1 θ (t −θ − 1), θ ∈ (0, ∞),
Further distributions are implemented (internally) by applying arithmetic operations and functions to the above distributions.For convenience we have provided the following important noncentral distributions:NoncentralTDistr(df = 2, mu = 0) NoncentralChiSquareDistr(df, lmbda = 0) NoncentralFDistr(df1 = 1, df2 = 1, lmbda = 0) NoncentralBetaDistr(alpha = 1, beta = 1, lmbda = 0)One can also define new distributions by explicitly providing their density function and using the class generated by FunDistr(f, breakPoints = None); the breakPoints argument is the list of discontinuities, which can include -Inf and +Inf to indicate infinite domain.Another possibility is inheriting from the 'Distr' class and overriding the init_piecewise_pdf and rand_raw methods.Discrete distributions inPaCAL are subclasses of the 'DiscreteDistr' class, whose constructor takes lists of points and their respective probability masses.Currently the following subclasses are available: ConstDistr(c = 1.0) with subclasses ZeroDistr() and OneDistr(): degenerate distributions corresponding to constants, BernoulliDistr(p), BinomialDistr(n, p), PoissonDistr(lmbda, trunk_eps = 1e-15) defines the truncated Poisson distribution.The tail of the distribution is truncated such that the total mass of probability is trunk_eps close to 1.

Figure 2 :
Figure 2: The power of two normal variables and the sum of two exponents of Cauchy random variables.
plot(xmin = None, xmax = None, **kwargs) plots the probability density function or the random variables distribution.For discrete and mixed distributions points of nonzero probability are marked using upward pointing arrows of appropriate height.xmin and xmax allow for specifying the range of the plot (by default chosen automatically) and kwargs contains extra formatting arguments to be passed to matplotlib's plot function, hist(n = 1000000, xmin = None, xmax = None, bins = 50, **kwargs) draws a histogram based on a random sample of size n.When xmin and xmax are given, samples are generated only from the specified interval using rejection sampling 3 .kwargs are extra arguments passed directly to matplotlib's plotting routines.

Figure 3 :
Figure 3: Difference between empirical and true CDFs for a uniform sample of size 50.

Figure 4 :
Figure 4: Approximation of Cantor-type distributions whose sum is the uniform distribution.

Figure 5 :
Figure 5: A quotient of two mixtures of uniform random variables.

Figure 6 :
Figure6: Minimum of discrete and continuous uniform random variables and its sum with a uniform random variable.Note the arrows marking points with nonzero probability mass.

Figure 7 :
Figure 7: Sum of two dependent uniform random variables for various values of the θ parameter of the Frank copula defining their joint distribution.

Figure 9 :
Figure 9: Distribution of the equivalent resistance of a parallel circuit of two resistors.

Figure 11
Figure11shows the density and cumulative distribution function of the result.

Figure 11 :Figure 12 :
Figure 11: Difference between sample proportions in two binomial populations.

Figure 14 :
Figure14: Integration paths used for computation of the four basic arithmetic operations on independent random variables: addition, subtraction, multiplication, and division.The figure illustrates, how the contours are truncated to the Cartesian product of two segments.Thin lines denote full integration paths, thick segments show parts of the path within a specific segment pair.
Joint density functions for Frank copula for different values of the θ parameter.isuniformon[0, 2].If, on the other hand, correlation is maximal negative then the result is a degenerate one point distribution with all mass concentrated at 1.Here we use the Frank copula with parameters θ = {−15, −5, 0.05, 10}, such that the correlation varies from strong negative to strong positive.The correlation coefficients in those four cases are about −0.93, −0.64, 0.008, and 0.86 respectively.
The function below returns the sample distribution of Hill's estimator for a Pareto sample with parameter alpha (tail exponent equal to alpha + 1) and size n.Figure12shows the mean and 95% confidence interval of the estimator for various sample sizes taken from the Pareto distribution with parameter alpha = 0.8 and x min = 1; the true tail exponent equals 1.8.It can be seen that the estimator is biased and that the bias decreases with sample size; the median and mode of the estimator are also shown.Using PaCAL, we can easily obtain any parameters of the estimate, e.g., for N = 21 and significance level of 0.05 we get: