Monte Carlo simulation and importance sampling applied to sensory analysis validation of specialty coffees 1

- Coffee sensory analysis is usually made by a sensory panel, which is formed by trained tasters, following the recommendations of the Specialty Coffee Association of America. However, the preference for a coffee is commonly determined by experimentation with consumers, who typically have no special skills in terms of sensory characteristics. Therefore, this study aimed at applying an intensive computational method to study sensory notes given by an untrained sensory panel, considering the probability distributions of the class of extreme values. Four types of specialty coffees produced under different processes and in varied altitudes in the mountainous region of Mantiqueira, Minas Gerais, were considered. We concluded that the generalized Pareto distribution can be applied to sensory analysis to discriminate types of specialty coffees. Furthermore, the method of importance sampling by Monte Carlo simulation showed greater variability considering a probabilistic model adjusted to identify specialty coffees.


INTRODUCTION
According to Ramos et al. ( 2 016), a coffee is considered specialty when it has superior quality compared to its competitors in terms of origin, absence of defects, processing, and/or sensory characteristics, such as aroma and flavor, which are exclusively or essentially influenced by geography, as well as natural and chemical factors (CHAGAS et al., 2013;MALTA;CHAGAS, 2009). Therefore, given the number of variation sources, computer simulation techniques involving methods of statistical data analysis have been widely used in the field of sensory quality.
Regarding the sensory quality profile of specialty coffees, which is associated with genetic and environmental factors (BORÉM et al., 2016;RIBEIRO et al., 2016). Ramos et al. (2016), used the chi-squared automatic interaction detection (CHAID) method to construct decision trees based on classifiers that related the sensory attributes to the environmental characteristics of the Serra da Mantiqueira region. The authors observed that CHAID method provided promising results regarding accuracy and hit rates by discriminating samples of Coffea arabica coffees, whose sensory evaluations, with scores ≥88 points, were characterized by production at altitudes ≥1,200 m, with body intensity discriminated into high and low.
To identify similarities among four specialty coffees, Ossani et al. ( 2 017), used multiple factor analysis for contingency tables with categorized data obtained from a sensory experiment conducted with different consumer groups. Despite their heterogeneity, the consumers involved in the analysis were successful at discriminating specialty coffees produced at different altitudes and under different processing methods.
Following this argument, the authors analyzed a sensory experiment with four specialty coffees produced in the Serra da Mantiqueira region of Minas Gerais, Brazil. Statistical modeling based on the distribution of extreme values was used, considering the highest sensory scores as random phenomena, since there may be variations in the consumers' judgment due to external factors, such as fatigue, sensory ability, among others.
To better understand these scores, the theory of extreme values plays a relevant role in the study of atypical, rare, low-frequency events or events that are occasionally discriminated as outliers. This theory consists of two methodologies: one in which extreme events occur in blocks, which are modeled by the generalized extreme value (GVE) distribution; and another one in which extreme events are defined as those that surpass a threshold, also known as peaks over threshold (POT) (THOMAS et al., 2016). However, given the complexity of integral resolution, analytical results are obtained using a statistical model.
Plausible approximations have been obtained by using computationally intensive methods, such as importance sampling. In this method, a new distribution function is introduced and the values are corrected by weights, thus preventing changes to the expected results.
Given the above, this study used the importance sampling technique along with Monte Carlo simulation as a tool to evaluate probabilistic models based on the distribution of extreme values to model the sensory quality of four specialty coffees produced in the Serra da Mantiqueira region.

MATERIAL AND METHODS
Data on the scores assigned to each coffee type in the sensory experiment were obtained from tests performed at the Federal University of Lavras (Universidade Federal de Lavras -UFLA). In compliance with the decision awarded by the Ethics Committee, as registered in the Certificate of Presentation for Ethical Consideration (CAAE): 14959413.1.0000.5148, the samples of coffee arabica were prepared by removing defective grains and roasted, respecting the minimum period of 24 hours before tasting, according to the protocols of the Specialty Coffee Association of America. Monte Carlo simulation and importance sampling applied to sensory analysis validation of specialty coffees standard color wheels, following Ferreira et al. (2016). The beverage was prepared at a concentration of 7% w/v using filtered water ready for consumption, free of contaminants and without added sugar. With these specifications, four specialty coffee types, whose samples were coded A, B, C and D, were prepared and are described in Table 1.
The four coffee types were evaluated as to their sensory characteristics: taste, acidity, body and note. In different sessions, the volunteering consumers were grouped into two classes: (a) individuals who are used to consuming coffee, but lack basic knowledge on specialty coffees and (b) individuals who are used to consuming coffee and have been provided with basic information on specialty coffees.
Using the POT method, the notes above the preestablished threshold were considered. Considering that a generalized distribution can also be defined using the surpassing values method, we assume that when X is a random variable corresponding to the extreme value probability function, and given a normalized threshold u, the variable X -u , which represents the extreme values, follows a generalized Pareto distribution (GPD).
Thus, to analyze the chosen threshold with the aid of the POT package (RIBATET, 2007) of software R (R DEVELOPMENT CORE TEAM, 2014), mean residual life plots and plots of the parameters estimated as a function of the threshold were used, and using the MASS and EVD packages of R, the distributions for the sensory notes were fitted using the maximum likelihood method (COLES, 2001).
(3) By the relationship between, the inverse transformation is given by: Therefore, integral in (3) can be written as: (4) Nevertheless, the integral in (4) is in the scale of Uniform distribution, hence, to return to GDP distribution, which is given by: To performing importance sampling in this research study, we considered candidate distribution h(x) given by N(u;σ=1), fixed, with parameters u and σ of GDP(u;σ;ξ). The h(x) must be chosen so that probability will not be equal to zero, otherwise it will result in infinite weights. Thus, Normal distribution focused on u and σ = 1 as h(x) becomes a convenient choice because it finds support in (-∞, + ∞). Therefore, I in (4) was solved following (5).

(6)
in which n = 1000 performing Monte Carlo simulations and x i * the realization of the random variable X i * .
The fit adequacy of each distribution was validated using the Kolmogorov-Smirnov (KS) goodness-of-fit test, which verifies the fit of a probability distribution to the original data. Further information on this test can be found in Thomas et al. (2016).
The Ljung-Box test was used to verify the assumption of independence of the observations. More details can be found in Ljung and Box (1978). In both tests, a level of significance of 1% (p<0.01) was adopted.

RESULT AND DISCUSSION
Importance sampling was used to provide an approximation for E [X], with X, the random variable, associated with the maximum note provided by a taster. As previously noted, X follows a GDP.
The results were organized into two situations: the first refers to importance sampling obtained from the original sample; and the second considers the fact that X follows a GPD. In the second case, GPD parameter estimations fitted to the original data and sampled values of this distribution were employed to the importance sampling. Table 3 presents the results of fitting the GPD to the different specialty coffee types for the following parameters: m corresponding to mean, ŝ to scale and x to shape. Coffee D had the highest expected value for the maximum notes, and on average, the maximum note attributed to this coffee was 9.7 points. In contrast, coffee C had the lowest expected value for the maximum note.
Another expressive result is verified regarding the Ljung-Box and Kolmogorov Smirnov tests, which allow to interpret that the maximum sensory notes given by each volunteer can be considered independent and satisfactorily fit by the GPD at the 1% significance level.
Briefly, importance sampling was employed to estimate E [X], and the results obtained are reported in Table 4. For that purpose, the relationship between the uniform distribution and the GPD was used to obtain f(x), and the choice of the candidate distribution for h(x) was made such that h(x) did not provide probabilities equal to zero. According to this justification, the normal distribution centered on m and with variance 1 was used as the candidate distribution.
The means obtained by importance sampling for the first situation are quite close to the theoretical values, except for the mean obtained for coffee B, whose estimate was 9.0 points, whereas the one obtained by importance sampling was 7.5 points.
Considering the sampling performed in the second situation, very reasonable approximations were also observed for the theoretical mean. For example, for coffee D, the theoretical mean was 9.7 points, and the approximation via importance sampling was 9.8 points. For coffee B, the approximation for E[X] did not show good accuracy because the average Monte Carlo for E [X] was discrepant in relation to the theoretical value. Important additional information is the precision of the importance sampling, given by the Monte Carlo standard deviation of the simulation, which indicates the variability of the samples generated by f (x). Moreover, coffee B had the lowest precision among the specialty coffees, indicating that for this coffee, the notes presented greater variability in relation to the mean of the GPD, which is confirmed by the estimated reported in Table 4. Table 3, importance sampling was successfully employed for coffees A, C and D, and considering the samples from the GPD, it was possible to establish a precision measure for the expected value of the maximum notes for these coffees.    The importance sampling method allows some flexibility in the choice of the candidate distribution for h(x), and suggests that the one that behaves similarly to the desired distribution, in this case the GPD. For example, the Gumbel and Weibull distributions were used, but none achieved results better than those reported in Table 4. In this sense, further studies can be performed to improve the results reported in Table 4, especially those for coffee B.

CONCLUSIONS
1. The GP distribution can be applied to the sensory analysis of specialty coffees made by a heterogenous sensory panel of consumers; 2. The importance sampling method was successfully used for the specialty coffees of the Coffee arabica species genotypes Ye llow Bourbon and Acaiá. Coffee type D had the highest Monte Carlo mean sensory note and high Monte Carlo precision. Coffee type B had the highest variability among the analyzed coffees.