The Normal Distribution Formalization for Investment Economic Project Evaluation Using the Monte Carlo Method

Received May 29, 2019; revised June 25, 2019; accepted September 25, 2019; available online December 15, 2019.
Investment plays a very important role in the economy: it ensures sustainable growth and contributes to improving the living standards of the population. The most common mistake in planning investment projects is insufficient analysis of the risks that may affect project profitability. Such mistakes can lead to incorrect investment decisions and significant losses. The desire to minimize risk requires developing a risk model. One of the risk assessment tools is the Monte Carlo method, which combines and extends sensitivity analysis and scenario analysis. In the Monte Carlo method, risk analysis is performed using models of possible outcomes in which any factor characterized by uncertainty is replaced by a probability distribution. Some types of distributions, such as the normal distribution, are used less frequently because their use requires special knowledge of mathematics. The purpose of the paper is to formalize the normal distribution for investment project evaluation using the Monte Carlo method, presenting it in a form that is understandable to non-specialists in mathematical statistics, so that a user can easily calculate the standard deviation and determine the limits of the confidence interval and the range of deviation from the mean. The object of study is the risk assessment of investment projects. The subject of study is the formalization of the normal distribution for investment project evaluation. As a result, formulas for investment project variables and a form for normal distribution formalization in MS Excel are proposed. The empirical result is an experiment that identifies a sequence of pseudo-random numbers as normally distributed. It facilitates the work of an expert and allows the expert to use the normal distribution variables correctly.
JEL classification:


INTRODUCTION
When assessing the risk of a large investment project, an investment analyst is no longer limited to sensitivity analysis and the scenario method (Dixit and Pindyck, 1994). Thanks to the development of computer technology, an analyst now has the tools to sample possible scenarios directly, generating random numbers with the Monte Carlo method. The advantage of this method over sensitivity analysis and the scenario method is that several conditions can be changed simultaneously, each with its own probability.
Despite the consistency and simplicity of the Monte Carlo method, the issue of selecting the distribution function for an investment project variable remains insufficiently elaborated. The reason is that, traditionally, the distribution function is selected, and further statistical conclusions about the parameters of the general population are drawn, from the results of studying a sample (Reid, 2013). The task of a statistician is to establish links between the distribution of the general population, the distribution of the sample and the distribution of sample averages. While a statistician determines the distribution function by studying the sample, an investment analyst has to predict the sample from an assumed type of distribution function.
Sometimes it is possible to fit a distribution to historical data. When the Monte Carlo method is used to assess the effectiveness of investment projects, historical data cannot be used, because the calculation concerns future cash flows from the project (Reilly and Brown, 2008). Extrapolating previous trends for some key variables of an investment project, such as price or sales volume, does not necessarily give the right picture, and in some cases it is impossible, for example when forecasting sales of innovative products that have not been on the market before (McMillan, 2012). The only way to determine the distribution of the key variables of an investment project for the Monte Carlo method is then expert judgment. The task of an investment analyst is to propose the type of distribution function and generate the population.
This assessment should therefore be carried out by an investment analyst together with an expert in the market segment where the project is to be implemented and a specialist in mathematical statistics. In practice, the joint work of these specialists is complicated by the fact that a statistician has to formalize the knowledge of the expert and pass it on to the investment analyst, which requires a considerable amount of time and additional outside specialists. To use the Monte Carlo method effectively in practice, the distribution functions must be presented in a form that is understandable, first of all, to the user, in our case an investment analyst.

THEORETICAL BACKGROUND
The Monte Carlo simulation method appeared around 1944 (Metropolis). The Monte Carlo method is based on the analogy between probability and volume (Fishman, 2006). The mathematics of measure formalizes the intuitive notion of probability, associating an event with a set of outcomes and defining the probability of the event to be its volume or measure relative to that of a universe of possible outcomes (Glasserman, 2003). The basic notion in probability theory is that of a random experiment: an experiment whose outcome cannot be determined in advance. The probability distribution of a general random variable is completely specified by the cumulative distribution function (Rubinstein and Kroese, 2007).
Since 1944 the Monte Carlo method has gained popularity and found application in many areas of human activity (Shonkwiler and Mendivil, 2009). In finance, the Monte Carlo method is used to model the components of project cash flow that are affected by uncertainty. Dienemann (1966) points out that a single deterministic value of NPV is not a good selection indicator for investment projects. To use the Monte Carlo method, it is first necessary to determine the key variables of the investment project, all possible values that these variables can assume, and the probability of each value occurring, that is, to set the distribution function (Dangl, 1999). Next, in the computer model where the NPV was originally calculated, a random value is drawn for each of the previously selected key variables, and the NPV is determined for this set of random values. Then another set of random values is drawn and the NPV is calculated for the second scenario. This process is repeated many times, and the results of all simulation experiments are combined into a sample and analyzed using statistical methods in order to obtain the probability distribution law of the NPV (Robert and Casella, 2010).
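The simulation loop described above can be sketched in a few lines of Python. This is only an illustrative sketch: the cash-flow model, the discount rate, the 25% margin and the distribution parameters are hypothetical values chosen for the example, not figures from the paper.

```python
import random
import statistics

def npv(rate, cash_flows):
    """Net present value; cash_flows[0] is the outlay at t = 0."""
    return sum(cf / (1.0 + rate) ** t for t, cf in enumerate(cash_flows))

def simulate_npv(n_runs=10_000, seed=42):
    """Draw the key variables at random, compute the NPV of each scenario
    and return the whole sample of NPV values."""
    rng = random.Random(seed)
    results = []
    for _ in range(n_runs):
        price = rng.gauss(100.0, 6.0)               # hypothetical unit price
        volume = rng.triangular(800, 1200, 1000)    # hypothetical yearly sales (low, high, mode)
        yearly = price * volume * 0.25              # hypothetical 25% cash margin
        results.append(npv(0.10, [-60_000.0] + [yearly] * 5))
    return results

sample = simulate_npv()
mean_npv = statistics.mean(sample)
stdev_npv = statistics.stdev(sample)
prob_loss = sum(1 for x in sample if x < 0) / len(sample)
print(f"mean NPV = {mean_npv:.0f}, st. dev. = {stdev_npv:.0f}, P(NPV < 0) = {prob_loss:.3f}")
```

The resulting sample, rather than a single deterministic NPV, is what is then analyzed statistically.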
As stated by Platona and Constantinescua (2014), with the Monte Carlo method the distribution of all possible outcomes of an event is generated by analyzing a model several times, each time using random input values selected from the probability distributions, considered normal, of the components that make up the model. To select a project, these authors used the database of the Managing Authority for the Sectoral Operational Projects (MA SOP Environment) for major projects to be financed by the Structural Funds. At the time of their writing, 23 waste management projects and 40 water and wastewater projects had been identified that were contracted and under implementation. Olaru, Şandru and Pirnea (2014) illustrate some of the results of the "Engineering, Procurement, Construction and Commissioning the gas desulphurization system of a Romanian energy company" investment project. As a result of the Monte Carlo analysis, various reports are obtained that show the estimated temporal and financial effects of different risks according to their probability of impact. Like most papers, these works are devoted to the analysis of the results obtained and not to the assignment of the distribution function (Kozlovsky and Fonitska, 2013).
While there is a rich literature on the results of using the Monte Carlo method, studies on the selection and justification of the type of distribution function are relatively scarce. This lack of information is particularly noticeable given the development of computer technology and the emergence of special Monte Carlo risk assessment programs. While Microsoft Excel and other spreadsheets do not provide built-in Monte Carlo simulation, several software products are available as Excel add-ins that make it possible to perform Monte Carlo simulation (Proctor, 2004). These software packages sample different possible inputs and calculate future outcomes. They offer enough probability distributions to describe any variable (Charnes, 2009). The choice of distribution is entirely the prerogative of the analyst.
Usually, working together, an analyst and an expert use intuitively understandable distributions that do not require in-depth knowledge of mathematical statistics. This significantly reduces the effectiveness of the Monte Carlo method. Observations show that the following distributions are used most often when assessing the effectiveness of investment projects with the Monte Carlo method:
• triangular distribution, used when the most likely value and the limits beyond which the value will not go are known (Evans, Hastings and Peacock, 1993);
• uniform distribution, used when the boundaries of the indicator are known but it is impossible to determine the most likely value (Johnson, Kemp and Kotz, 2005);
• exponential distribution, used to describe the intervals between consecutive random events for which statistics are available (Johnson, Kemp and Balakrishnan, 1994).
As can be seen, the normal distribution is not included in this list, although its area of application is very wide.
For investment projects, the normal distribution is very useful, since it involves setting a most likely value but does not restrict the variable to a range of finite length. For example, for prices, changes in tax rates or demand for products, it is difficult to set an interval that will contain 100% of all probable values, since external factors have a great influence here. Therefore, formalizing this type of distribution function and presenting it in terms that can be used by non-specialists in mathematical statistics for investment project evaluation using the Monte Carlo method is of paramount importance.
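As a rough illustration of how little input these three distributions require, the sketch below draws samples from each using Python's standard `random` module. All numeric parameters (bounds, most likely value, average interval) are hypothetical and serve only to show which quantities the expert has to name.

```python
import random

rng = random.Random(1)
n = 5_000

# Triangular: the expert names a minimum, a maximum and the most likely value.
price = [rng.triangular(80, 130, 100) for _ in range(n)]   # low, high, mode

# Uniform: only the boundaries are known.
cost = [rng.uniform(40, 60) for _ in range(n)]

# Exponential: intervals between random events with a known average (here 10 days).
gap_days = [rng.expovariate(1 / 10) for _ in range(n)]

print(min(price), max(price))
print(min(cost), max(cost))
print(sum(gap_days) / n)
```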

RESEARCH OBJECTIVE AND METHODOLOGY
The methodology used to formalize the normal distribution should answer two main questions: how to determine the limits of the confidence interval, and how to calculate the standard deviation. To answer the first question, the research methodology follows these steps:
• using the three-sigma rule to determine the confidence interval;
• conducting an experiment to define the limits of the confidence interval and the range of changes in the average value.
To formalize the normal distribution for investment project evaluation using the Monte Carlo method, the parameters of the normal distribution must be expressed as a function of the confidence interval. The normal distribution is defined by two parameters:
• the mean µ (the average value of a random variable);
• the standard deviation σ (a measure of the amount of variation in a set of data values).
The mean (µ) is the centre of the distribution: the further X is from the mean, the lower the probability of the event. The standard deviation (σ) determines the measure of accuracy: the smaller σ is, the higher the probability of those X located close to the mean (Mann, 2007). In the empirical sciences, the so-called three-sigma rule of thumb expresses the conventional heuristic that nearly all values lie within three standard deviations of the mean, so it is empirically useful to treat 99.7% probability as near certainty (Kazmier, 2009). Usually a researcher has to construct a confidence interval for given values of the mean and standard deviation, that is, an interval that covers an unknown parameter with a given reliability. The confidence level is usually set at 90%, 95% or 99%. The higher the confidence level, the wider and therefore less informative the interval.
A researcher who assesses the effectiveness of investment projects using the Monte Carlo method has to determine, by expert judgment, both the mean and the standard deviation as a function of the confidence interval (Kozlovskyi, Khadzhynov, Vlasenko, Marynchak, 2017). The mean, in this case, is an expert's assumption about the numerical value that some key variable of the investment project may take in the future. For example, an expert assumes that the price of a product will be 100 dollars (this value will be the mean of the normal distribution). To determine the confidence interval, one can use the fact that a confidence interval can be presented in several different forms. For example, the statement "with 95% certainty it can be argued that, most likely, some indicator will be somewhere between a and b ..." is equivalent to the mathematical formulation "the mean μ of the population is in the range from a to b with a confidence level of 0.95" (Patel, J. K. 1996). For non-specialists in mathematical statistics, the first formulation is the more understandable way of determining the confidence interval.
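The three-sigma rule can be checked numerically. The following minimal sketch uses `statistics.NormalDist` from the Python standard library to compute the share of a normal distribution that lies within one, two and three standard deviations of the mean; the helper name `coverage` is introduced here for illustration only.

```python
from statistics import NormalDist

std_normal = NormalDist()  # standard normal: mean 0, standard deviation 1

def coverage(k):
    """Probability that a normal variable lies within k standard deviations of its mean."""
    return std_normal.cdf(k) - std_normal.cdf(-k)

for k in (1, 2, 3):
    print(f"within ±{k}σ: {coverage(k):.4f}")
```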
An expert will make the following statement: "with 95% certainty, it can be argued that the price of the product to be marketed will be 100 dollars, with a deviation from the average of 10 dollars". In the language of mathematical statistics, this means that the mean μ is in the range from 90 to 110 dollars with a confidence level of 0.95. One of the parameters, the average value of the random variable, is thus already determined. The standard deviation is calculated from the confidence level. Since the mean of a standardized normal distribution is zero and its standard deviation is one, any set of normally distributed values can be converted into a standardized form.
To use the normal distribution further for evaluating investment projects with the Monte Carlo method, it is necessary to define clearly the limits of the confidence interval and the range of changes in the average value (Kozlovskyi, et al., 2019). Obviously, the narrower the confidence interval and the smaller the deviation from the mean, the more likely it is that an expert's judgments are described by the normal distribution. But there are situations when an expert is, for example, only 70% sure of the estimates, or when the range of deviation exceeds 50%. Then the question arises whether using the normal distribution is legitimate (Kozlovskyi, et al., 2018).
To determine the limits of the confidence interval and of the deviation, the following experiment is proposed. Using the random number generation tool built into MS Excel, sequences of pseudo-random numbers that obey the normal distribution are generated for confidence levels from 75% to 99% and deviations from 5% to 25%. Then software designed for risk analysis, for example Oracle Crystal Ball or Palisade @RISK, is used to select the type of distribution function that fits each sequence. With this approach to formalizing the normal distribution, it is possible to obtain normal distribution parameters for further use in the Monte Carlo assessment of investment projects.
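The step from the expert's statement to the standard deviation can be written out as a short derivation. It assumes, for illustration, that the statement is read as a statement about the variable X itself, with d the stated deviation and p the confidence level; the symbols d, p and z are introduced here and do not appear in the original text. The value 1.96 is the standard two-sided quantile for the 95% level.

```latex
P(\mu - d \le X \le \mu + d) = p, \qquad Z = \frac{X - \mu}{\sigma} \sim N(0, 1)
\quad\Longrightarrow\quad
d = z_{(1+p)/2}\,\sigma, \qquad \sigma = \frac{d}{z_{(1+p)/2}}

% Price example from the text: \mu = 100, d = 10, p = 0.95
\sigma = \frac{10}{1.96} \approx 5.10
```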
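The generation step of this experiment can be imitated outside MS Excel. The sketch below is an assumption-laden stand-in: it uses Python's `random.gauss` instead of the Excel random number tool, and it only checks that the share of generated values inside the stated interval matches the confidence level. It does not reproduce the distribution fitting performed by Oracle Crystal Ball or Palisade @RISK. The function names and the mean of 100 are hypothetical.

```python
import random
from statistics import NormalDist, mean

def generate_sequence(mu, deviation_pct, confidence, n=10_000, seed=0):
    """Pseudo-random normal sequence for 'mu ± deviation_pct % with the given confidence'."""
    z = NormalDist().inv_cdf((1 + confidence) / 2)   # two-sided quantile
    sigma = mu * deviation_pct / 100 / z
    rng = random.Random(seed)
    return [rng.gauss(mu, sigma) for _ in range(n)], sigma

def share_inside(values, mu, deviation_pct):
    """Fraction of values that fall inside mu ± deviation_pct %."""
    half_width = mu * deviation_pct / 100
    return sum(abs(v - mu) <= half_width for v in values) / len(values)

for confidence in (0.75, 0.90, 0.99):
    for deviation_pct in (5, 15, 25):
        values, sigma = generate_sequence(100.0, deviation_pct, confidence)
        print(confidence, deviation_pct, round(sigma, 2),
              round(share_inside(values, 100.0, deviation_pct), 3))
```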

RESULTS AND DISCUSSION
The result of the normal distribution formalization is a set of formulas for investment project variables within the standard deviation of the mean and a form for calculation in MS Excel. The experiment to define the limits of the confidence interval and the range of changes in the average value is applied to "historical data", understood here as a previously generated sequence of pseudo-random numbers. The essence of the experiment is to test whether a sequence of pseudo-random numbers is identified as normally distributed for different limits of the confidence interval and different deviations. For example, for the 99% confidence interval and 20% deviation, Palisade @RISK determined that the data obey the normal distribution law (Fig. 1). For the 75% confidence interval and 20% deviation (Fig. 2), the data obey the Weibull distribution. The distribution functions selected for the pseudo-random number sequences at different confidence interval limits and deviations from the mean are presented in Table 1. As can be seen from Table 1, if the confidence level is not below 80% and the deviation stays within 20%, the risk analysis software identifies the sequence of numbers as normally distributed. Therefore, when the normal distribution is used to assess the effectiveness of investment projects with the Monte Carlo method:
• a 90%-99% confidence interval can be used for a range of deviations from the mean not exceeding 25%;
• an 80%-85% confidence interval can be used for a range of deviations from the mean not exceeding 20%;
• a 75% confidence interval can be used for a range of deviations from the mean not exceeding 15%.
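The three admissibility ranges above translate directly into a validation rule. The sketch below is one possible encoding; the discrete set of confidence levels (75, 80, 85, 90, 95, 99) and the names `MAX_DEVIATION_PCT` and `check_statement` are assumptions made for the example.

```python
# Largest admissible deviation from the mean (%) per confidence level (%),
# taken from the three ranges listed above; the discrete levels are an assumption.
MAX_DEVIATION_PCT = {75: 15, 80: 20, 85: 20, 90: 25, 95: 25, 99: 25}

def check_statement(confidence_pct, deviation_pct):
    """Return None if the normal distribution is admissible, otherwise an error message."""
    if confidence_pct not in MAX_DEVIATION_PCT:
        return f"Confidence level {confidence_pct}% is not supported."
    limit = MAX_DEVIATION_PCT[confidence_pct]
    if deviation_pct > limit:
        return (f"Deviation of {deviation_pct}% exceeds the {limit}% limit "
                f"for a {confidence_pct}% confidence level.")
    return None

print(check_statement(99, 20))   # admissible combination
print(check_statement(75, 20))   # deviation too wide for this confidence level
```

The two calls correspond to the 99%/20% and 75%/20% cases discussed in the text.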
Only in this case will the key variables of the investment project, used to model the NPV by the Monte Carlo method, have a normal form. It is known that, for a normal distribution, 95.4% of the values of a random variable lie within ±2σ. With a probability of 90%, the random variable lies within ±1.65σ of the expectation. Similarly, half of the normally distributed values lie in an interval whose length is about 4/3 of the standard deviation, which means that the interquartile range extends from about 2/3 of a standard deviation to the left of the mean to about 2/3 of a standard deviation to the right of the mean.
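The number of standard deviations that corresponds to a given confidence level can be obtained from the inverse of the standard normal distribution function. A minimal sketch, again with `statistics.NormalDist`; the function name and the list of levels printed are chosen for illustration.

```python
from statistics import NormalDist

def sigmas_for_confidence(confidence):
    """Number of standard deviations on each side of the mean that
    contains the given share of a normal distribution (two-sided)."""
    return NormalDist().inv_cdf((1.0 + confidence) / 2.0)

for confidence in (0.50, 0.75, 0.80, 0.85, 0.90, 0.95, 0.99):
    print(f"{confidence:.0%}: ±{sigmas_for_confidence(confidence):.3f}σ")
```

For the 50% level the result is close to the 2/3 of a standard deviation mentioned above, and for the 90% level it is close to 1.65.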
To calculate the standard deviation from such a statement, it is necessary to divide the difference between the upper and lower limits of the interval by the number of standard deviations it spans. For example, if an expert states with 90% confidence that the changes will fluctuate within ±15% of the average value of 20 units, the interval runs from 17 to 23 units and spans 2 × 1.645 standard deviations, so the standard deviation will be $\sigma = \frac{23 - 17}{2 \times 1.645} = \frac{3}{1.645} \approx 1.82$. The corresponding formulas for each sigma limit are presented in Table 2. The next step is to create a form for calculation in MS Excel.
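The calculation in this example can be reproduced with a small helper. This is a sketch rather than the content of Table 2: the function name and its parameters are hypothetical, and the quantile is computed with `statistics.NormalDist` instead of being read from a table.

```python
from statistics import NormalDist

def sigma_from_statement(mean, deviation_pct, confidence):
    """Standard deviation implied by 'mean ± deviation_pct % with the given confidence'."""
    half_width = mean * deviation_pct / 100.0            # e.g. 15% of 20 units = 3 units
    z = NormalDist().inv_cdf((1.0 + confidence) / 2.0)   # about 1.645 for 90%
    return half_width / z

sigma = sigma_from_statement(mean=20.0, deviation_pct=15.0, confidence=0.90)
print(f"sigma = {sigma:.2f}")
```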
First, it is necessary to set the confidence interval statement (Fig. 3). To display the list of confidence interval values that can be chosen, it is convenient to use a list box. To change the range of deviation from the mean quickly, a scroll bar or spin button may be used (Fig. 4). A researcher or an investment analyst enters the mean value (cell D5). Two formulas and one error message should be entered into the MS Excel calculation form. The first formula calculates the bounds of the deviation from the mean (cell K3 for the lower bound and cell M3 for the upper bound). The second formula determines the standard deviation (cell D6). These formulas are presented in Table 2. To enter all the formulas, it is convenient to use logical functions that select the standard deviation depending on the confidence interval. The error message informs the user that the deviation from the mean has exceeded the valid range for the chosen confidence interval. In view of all the above, the form for normal distribution formalization in MS Excel looks like the one given in Fig. 5. The form facilitates the calculation of the deviation and can be used as a supplement to predictive modeling, forecasting and simulation by the Monte Carlo method.
The proposed approach to the normal distribution formalization has proved effective in practice. One of the industries where risk assessment is not only desirable but mandatory is the gas industry. This is because the construction of oil and gas facilities, which usually includes the drilling of wells and the construction of oil and gas treatment facilities, a complex of pipelines, power lines, highways, social infrastructure, etc., lasts more than a few years. While efficiency at the project initialization stage is determined under certain conditions, by the commissioning stage the conditions may have changed significantly. For example, the oil and gas price often differs several-fold from the one assumed in the project.
According to legislation (Law of Ukraine «About investment activity» and Law of Ukraine «About protection of foreign investments in Ukraine»), at the project initialization stage it is enough to assess investment opportunities and the achievement of the planned technical and economic indicators. However, realizing the importance of risk factors and uncertainties in the gas industry, investors do not limit themselves to the calculation of NPV, IRR and sensitivity analysis. Even at the conceptual stage, the Monte Carlo method is used for investment project evaluation. As an example, consider the evaluation of a gas field construction investment project (Bilenko, 2018). A feature of the project is its multivariance. Approximately 100-150 complex options are analyzed and compared with one another. The complex options consist of 5-6 options for production profiles, 4-6 methods of gas preparation and 5-6 options for natural gas transport. The main criteria for comparison are the NPV (net present value) and the list of risks. The list of risks contains more than 50 items. It includes risks for which it is worth specifying the normal distribution in the Monte Carlo method:
• delay in commissioning due to disagreement over the pipeline route passing near the nature reserve;
• excess of actual cost over estimated cost due to the need to finance force majeure circumstances not covered by the project;
• late receipt of permission to cross engineering communications due to increased workload.
As can be seen, the risks listed above describe events that have never happened before. That is why the characteristics of the risk factors can be determined only from expert opinions. The expert committee works in the form of a meeting of investors, a project group of engineers and economists, and gas operating specialists. The result of the meeting should be objective and reliable information sufficient to form the list of risks, with both the probability of occurrence and the damage. For these risks, the probability of occurrence is described by the normal distribution.
For this purpose, the meeting participants should determine the value that the variable may take in the future (the mean) and how confident they are in the statement (the confidence interval). For example, let the experts give the following estimates:
• the experts are 90% confident that the delay in commissioning due to disagreement over the pipeline route passing near the nature reserve is about 15 days ± 3 days;
• the experts are 99% confident that the excess of actual cost over estimated cost due to the need to finance force majeure circumstances not covered by the project is about 120 000 $ ± 30 000 $;
• the experts are 85% confident that the late receipt of permission to cross engineering communications due to increased workload is about 4 weeks ± 1 week.
Using the formulas for each sigma limit presented in Table 2, it is easy to calculate the deviation for the investment project in the gas industry. In order to further use the Monte Carlo method to model the influence of investment project risk factors, some rules are built directly into the MS Excel model. According to these rules, a clear correlation is established between risk factors. For example, a delay in commissioning of 3 additional days shifts the construction completion date and, as a result, leads to fines.
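For the three expert estimates given above, the standard deviations can be computed as in the following sketch. The quantiles are computed with `statistics.NormalDist` and may differ slightly from rounded table values. The link between delay and fines is shown only as an illustration of a rule between risk factors: the daily fine rate and the helper names are hypothetical and are not taken from the project.

```python
import random
from statistics import NormalDist

def sigma_from_half_width(half_width, confidence):
    """Standard deviation implied by '± half_width with the given confidence'."""
    return half_width / NormalDist().inv_cdf((1.0 + confidence) / 2.0)

# (mean, ± deviation, confidence) as stated by the experts in the text
estimates = {
    "commissioning delay, days": (15.0, 3.0, 0.90),
    "cost overrun, $": (120_000.0, 30_000.0, 0.99),
    "permit delay, weeks": (4.0, 1.0, 0.85),
}
sigmas = {name: sigma_from_half_width(dev, conf)
          for name, (mean, dev, conf) in estimates.items()}
for name, s in sigmas.items():
    print(f"{name}: sigma = {s:.3f}")

DAILY_FINE = 2_000.0  # hypothetical fine per day of delay beyond the expected 15 days

def draw_scenario(rng):
    """One Monte Carlo draw in which the fine depends on the sampled delay."""
    delay = max(0.0, rng.gauss(15.0, sigmas["commissioning delay, days"]))
    overrun = max(0.0, rng.gauss(120_000.0, sigmas["cost overrun, $"]))
    fine = DAILY_FINE * max(0.0, delay - 15.0)
    return delay, overrun, fine

print(draw_scenario(random.Random(0)))
```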

CONCLUSION
In this paper, different forms of presenting normal distribution confidence intervals are formulated in the languages of both a statistician and an investment expert. This helps non-specialists in mathematical statistics understand the essence of the standard deviation. The results that support this are listed below.
An experiment that identifies a sequence of pseudo-random numbers as normally distributed has been carried out. The distribution functions selected for the sequences of pseudo-random numbers are presented in the form of a table. It allows an expert to use the range of deviation from the mean correctly.
Formulas for investment project variables within the standard deviation of the mean are proposed.
A form for normal distribution formalization in MS Excel is designed. The developed form greatly simplifies the work of an investment analyst and does not require knowledge of mathematical statistics.
Using the example of the evaluation of a gas field construction investment project, the approach to the normal distribution formalization was implemented in practice.
The proposed approach to formalizing the normal distribution for investment project evaluation using the Monte Carlo method will make it possible to use the normal distribution more often, which will significantly improve the simulation results and the efficiency of the Monte Carlo method.