
Identifying the Big Shots—A Quantile-Matching Way in the Big Data Context

Published: 10 March 2022


Abstract

The prevalence of big data has raised significant epistemological concerns in information systems research. This study addresses two of them—the deflated p-value problem and the role of explanation and prediction. To address the deflated p-value problem, we propose a multivariate effect size method that uses the log-likelihood ratio test. This method measures the joint effect of all variables used to operationalize one factor, thus overcoming the drawback of the traditional effect size method (θ), which can only be applied at the single-variable level. However, because factors can be operationalized as different numbers of variables, direct comparison of multivariate effect sizes is not possible. A quantile-matching method is proposed to address this issue. This method provides comparison results consistent with the classic quantile method but is more flexible and can be applied to scenarios where the quantile method fails. Furthermore, an absolute multivariate effect size statistic is developed to facilitate drawing conclusions without comparison. We have tested our method using three different datasets and have found that it can effectively differentiate factors with various effect sizes. We have also compared it with prediction analysis and found consistent results: explanatorily influential factors are usually also predictively influential in a large sample scenario.


1 INTRODUCTION

Increasingly large volumes of data can now be stored and processed. These large, granular datasets enable service innovation [38], create opportunities for managers to realize strategic business value [12, 24, 34, 42] and for information systems (IS) researchers to investigate emergent phenomena or revisit established phenomena more broadly and deeply [2], and significantly influence how people derive knowledge and make decisions [1]. However, they also raise epistemological concerns [1]; it is necessary to investigate how traditional research methods should be adapted for big data environments, particularly to address the "deflated p-value" problem [39] and the role of prediction versus explanation. These are the issues we explore in this study.

The deflated p-value problem refers to the phenomenon in which the p-value quickly approaches zero as the sample size increases, so that even a tiny deviation from a given value can be detected [39]. Consequently, virtually all variables become significant when the data sample is massive, which leaves p-values with little practical utility [39]. A group of over 800 scientists [6] has thus called for abandoning statistical significance as a standard (the 0.05 "dichotomania"). Similarly, the American Statistical Association commented in an editorial [54, 55] that the categorical drawing of conclusions based on p-values should be avoided.

We advocate focusing on the effect size rather than the p-value of a focal factor. The effect size measures the magnitude of an effect [44] and provides a more generally interpretable, quantitative description of the size of an observed effect, independent of the possibly misleading influence of sample size [18]. Although researchers have begun to report one type of effect size—the regression coefficient \( \theta \), which measures the sensitivity of a dependent variable to changes in an independent variable—along with the p-value, this univariate effect size method has several limitations. First, it cannot be generalized to a multiple-variable scenario. For example, socioeconomic status is determined by the three variables of income, education, and occupation; a person's personality is measured along five dimensions by many items; and categorical factors are usually operationalized as a set of dummy variables. In these situations, the effect size at the factor level is of more interest, but it cannot be obtained from the univariate effect sizes in a summative way (e.g., \( {\theta _1} + {\theta _2} + {\theta _3} \)), as the individual effects usually correlate with each other and cannot be treated independently [47]. Second, because the regression coefficient is unstandardized, its use hinders comparison across different variables and different studies, especially when various scales are used.

To deal with this problem, we make use of the likelihood ratio test (LRT) statistic to measure the joint effect size at the factor level, which we name the multivariate effect size. We have chosen the LRT statistic as our basic measure because of its close connection to the regression coefficient (\( \theta \)) and its ease of calculation and interpretation. The LRT statistic asymptotically follows the \( {\chi ^2} \) distribution, but one factor can be operationalized as several variables, and more independent variables (IVs) provide more freedom in estimating the variability of the dependent variable [20, 44], thus creating bias in the estimation of multivariate effect sizes and making it challenging to compare across factors. Previously developed methods to solve this issue include the cumulative probability method and methods that match representative characteristics of the distribution, e.g., location, spread, skewness, and peakedness. The cumulative probability method transforms the test statistic to the corresponding cumulative probability (ranging from 0 to 1) based on the cumulative distribution function. The method allows us to compare different types of studies and provides a concise and universal summary of different statistical tests [37]. But its applicability is significantly constrained—the corresponding cumulative probability cannot be explicitly calculated for test statistics obtained in a large sample context. For example, in RStudio (version 1.1.383), pchisq()—the function that calculates the cumulative probability based on the value of the \( {\chi ^2} \) test statistic—produces a probability value of 1 for a \( {\chi ^2} \) test statistic larger than 1,590 with fewer than 20 degrees of freedom, although the function can display probabilities up to a maximum of \( 1 - {10^{ - 320}} \), which is smaller than but close to one. That is, the cumulative probability method cannot differentiate the effect sizes of factors with \( {\chi ^2} \) test statistics larger than 1,590, a situation frequently encountered in large sample research. In the example that we will discuss in Section 4 of this article, the LRT statistics for all factors exceeded 1,590 with a 25% (n = 1,565,720) sample. Thus, in handling very large data samples, the cumulative probability method is no longer an effective comparison tool. The other method, which matches representative characteristics of the distribution—e.g., mean and SD—does not produce results consistent with the cumulative probability method and can lead to wrong conclusions. For example, assume that two factors, \( {F_1} \) and \( {F_2} \), are operationalized as 3 and 5 variables, with LRT statistics \( LR{T_1} = 21.11 \) and \( LR{T_2} = 25.75 \) following the \( {\chi ^2} \) distribution. The transformed statistics \( {Z_1} \) and \( {Z_2} \) based on the mean and SD would be \( \frac{{21.11 - 3}}{{\sqrt {2 \times 3} }} = 7.39 \) and \( \frac{{25.75 - 5}}{{\sqrt {2 \times 5} }} = 6.56 \), respectively. Therefore, we would claim that the first factor \( {F_1} \) has a larger effect size. However, the corresponding cumulative probabilities for \( LR{T_1} \) and \( LR{T_2} \) are both 0.9999, indicating that these two factors have the same effect size.
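To make the failure modes concrete, the following R snippet (a minimal sketch using the numbers from the example above; R's pchisq() computes the \( {\chi ^2} \) cumulative probability) reproduces both problems:

```r
# Cumulative probabilities of the two example factors: both round to ~0.9999,
# so the cumulative probability method sees F1 and F2 as equally influential.
pchisq(21.11, df = 3)        # LRT_1 ~ chi^2(3)
pchisq(25.75, df = 5)        # LRT_2 ~ chi^2(5)

# The mean/SD transformation ranks them differently (7.39 vs. 6.56).
(21.11 - 3) / sqrt(2 * 3)
(25.75 - 5) / sqrt(2 * 5)

# Saturation in the large-sample regime: the cumulative probability is
# computed as exactly 1, so such factors can no longer be differentiated.
pchisq(1591, df = 19)
```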

A quantile-matching-based transformation method is proposed to address this problem. This method minimizes the distance between different distributions across cumulative probabilities to remove the impact of differing degrees of freedom. The method has three advantages: (1) Feasibility—it has a one-to-one corresponding relationship with cumulative probability under perfect matching, and consequently inherits the merits of cumulative probability in comparisons, but it can also be applied to scenarios where the cumulative probability method fails because the sample is extremely large. (2) Flexibility—the quantile-matching transformation can be done on the full distribution or on a specific interval, providing tremendous flexibility in real large sample applications. Most cumulative probabilities approach 1 in the large sample context, so matching on the right tail of the distribution is more helpful there. (3) Fast calculation—it is easy to implement and calculate.

To demonstrate the flexibility of the quantile-matching method, we have applied it to three different intervals of the \( {\chi ^2} \) distributions—(1) the full distribution, (2) the [0.99, 0.9999999] interval, and (3) the extreme-scenario interval \( [1 - {10^{ - 320}}, 1 - {10^{ - 322}}] \). Because the corresponding \( {\chi ^2} \) statistics are not calculable in \( [1 - {10^{ - 320}}, 1 - {10^{ - 322}}] \) in RStudio, we have used estimates obtained from quantile mechanics (QM) and natural spline interpolation for matching [45]. Furthermore, we have developed an absolute multivariate effect size method to help researchers decide between small, medium, and large effect sizes without needing to compare with other factors, and to facilitate the comparison of the same factor across different studies.

In addition, we have measured the focal factor's effect size in terms of prediction. We have compared both sets of results and found that explanatorily influential factors derived from the LRT analysis in a large sample context are predictively influential as well. This helps answer our second question concerning the role of prediction versus explanation in large sample research.

This study makes three main contributions to the methodology literature of IS research: (1) We developed a multivariate effect size method by making use of the LRT statistic to address the deflated p-value problem. This method enables the measurement of a joint effect size across the variables used to operationalize the focal factor. (2) We have introduced the quantile-matching method to deal with the impact of different numbers of operationalized variables on the multivariate effect size estimation of the focal factors. This method produces results consistent with those of the cumulative probability method under perfect matching and can effectively handle large samples. (3) In large sample applications, we found that explanatorily influential factors are usually also predictively influential.

The remainder of the article is organized as follows. First, we review the history of the p-value and the concerns raised about its misuse and misinterpretation, and discuss existing methods to deal with the deflated p-value problem. Next, we examine the relative and absolute multivariate effect sizes for different factors using the LRT statistic and adjust the bias caused by different numbers of operationalized variables across factors using the quantile-matching method, demonstrating its unique advantages. Then, we present our empirical analysis using an example taken from an IS application on e-mail marketing (n = 6,230,253) and compare the results from our method with the ROC analysis results. We have also replicated our analysis on the US accident dataset and the Airbnb listings dataset. Finally, we discuss our results and their implications.


2 THEORETICAL BACKGROUND

2.1 The P-value and Big Data

The p-value was first introduced by the UK statistician Ronald Fisher [17] in the 1920s to determine whether the observed data complied with the proposed hypotheses by calculating the difference between the predicted and the observed data series. Fisher regarded the p-value as an informal way of checking whether the evidence of a treatment effect was worth a second look. However, in the late 1920s, during the movement to make evidence-based decision-making more rigorous and objective, many non-statistician authors combined Fisher's easy-to-calculate p-value with Neyman and Pearson's reassuringly rigorous rule-based approach to create a hybrid system, and thus a p-value of 0.05 became enshrined as “statistically significant” [43].

The abuse and misinterpretation of the p-value were first noted in the late 20th century. Researchers were mainly concerned about two aspects. First, the p-value does not establish the probability that the investigator's hypothesis is correct; it only represents the false-positive rate given the observations [15, 21, 22]. Second, using the p-value to divide results into "significant" and "insignificant" categories is arbitrary [16]. More extensive concerns about the p-value have recently been voiced and are presented in Table 1. Among these concerns, the deflated p-value problem has received considerable attention. This problem is not a new phenomenon. In 1993, Kass and Raftery [30] pointed out that "frequently tests tend to reject null hypotheses almost systematically in very large samples", an observation that has been further elaborated by Lin et al. [39], Kim and Ji [33], and Kim et al. [32]. Because the p-value measures the distance between the data and the null hypothesis in units of standard errors, and the standard error approaches 0 as n approaches infinity, even a tiny deviation from a given value can be detected.

Paper | Journal | Concerns of p-value | Proposed Solution
Lin et al. [2013] | ISR | Deflated p-value problem in the large sample context. | Report effect size and confidence interval.
Halsey et al. [2015] | Nature Methods | P-value varies highly across samples. | Increasing statistical power (sample size) could mitigate the variation across samples. Report effect size estimates and their precision (95% confidence intervals).
Kim and Ji [2015] | Journal of Empirical Finance | Deflated p-value problem in the large sample context. | Select a different level of significance by taking account of sample size.
Lazzeroni et al. [2016] | Nature Methods | P-value varies highly across samples. | Propose the p-value prediction interval and the p-value confidence intervals to capture the uncertainty in a sole p-value.
Goodman [2016] | Science | The p-value neither measures nor is part of a formula that provides the credibility of the conclusions. | P-values are unlikely to disappear, and the ASA did not recommend their elimination, but rather a change in how they are interpreted and used. The way statistical inference is taught to scientists should contain a variety of named, competing approaches, each with strengths and weaknesses.
Kyriacou [2016] | JAMA | The concept of the p-value is frequently misunderstood and misused. | The automatic application of dichotomized hypothesis testing based on prearranged levels of statistical significance (0.05 p-value) should be replaced by a more complex process using effect estimates, confidence intervals, and even p-values, thereby permitting scientists, statisticians, and clinicians to use their own inferential capabilities to assign scientific significance.
Altman and Krzywinski [2017] | Nature Methods | A p-value is a probability statement about the observed sample in the context of a hypothesis, not about the hypotheses being tested. | Three main ideas for using, interpreting, and reporting p-values have emerged: (1) the use of more stringent p-value cutoffs supported by Bayesian analysis, (2) the use of the p-value to estimate the false discovery rate (FDR), and (3) the combination of p-values and effect sizes to create more informative confidence intervals.
Altman and Krzywinski [2016] | Nature Methods | Even when the null hypothesis is true, if we have done many tests, we will have a high chance of obtaining a significant p-value, and the confidence interval does not mitigate the problem either. | N/A
Kim et al. [2018] | ABACUS | Deflated p-value problem in the large sample context. | Report effect size and confidence interval.
Benjamin et al. [2018] | Nature Human Behavior | The corresponding Bayes Factor of p = 0.05 is only 2.5 to 3.4, providing 'weak' or 'very weak' support for the alternative hypothesis. | Change the threshold of significance to 0.005, as the corresponding Bayes Factor would be 14 to 26, indicating 'substantial' or 'very strong' support for the alternative hypothesis, and the minimum false positive rate will decrease to 5%.

Table 1. Summary of Related Research on P-value

The influence of the deflated p-value problem has grown as large datasets have become extensively used in research. Lin et al. [39] surveyed articles in MIS Quarterly and Information Systems Research (ISR) from 2004 to 2010, along with abstracts from the Workshop on IS and Economics and symposia on Statistical Challenges in Electronic Commerce Research. They found that over 50% of the papers used extremely large samples (>10,000) and relied solely on low p-values and the sign of the coefficient. Kim and Ji [33] also reported that 42% of the articles published during 2012 in four finance journals exploited extremely large samples and used only the p-value null hypothesis testing method. Similarly, Kim et al. [32] found that 39% of the studies published during 2014 in eight accounting journals used extremely large samples; each study used the p-value, and only one reported a confidence interval. Additionally, we have reviewed empirical papers published in MIS Quarterly and ISR from 2014 to 2018 and found that IS researchers were more cautious with large samples: among papers exploiting extremely large samples, 58% also reported the effect size (\( \theta \)) or confidence interval along with the p-value, and a few reported predictive power [3, 49]. Furthermore, although the percentage of papers dealing with extremely large samples decreased slightly (to 48%), the average sample size increased to 4.8 million. These findings highlight the urgent need for another test statistic to overcome the deflated p-value problem in the large sample context.

2.2 Effect Size

Several methods have been proposed to deal with the deflated p-value problem, such as changing the significance threshold [9, 33], reporting the confidence interval of the estimated coefficient [4, 32, 36, 39], and reporting the effect size [32, 39]. Lowering the p-value threshold is easy to implement, but it cannot fully address the deflated p-value concern because the sample size can easily be increased further, particularly if p-hacking behavior is considered [14]. A confidence interval describes the range into which the true value of the unknown parameter would fall with a certain probability, and the width of the interval decreases with the sample size; however, it can only be applied at the variable level. Effect size focuses on the size of an observed effect rather than whether the effect exists, providing a more quantitative description of the observed effect and a different mode of statistical inference. Because in very large data samples variables will be significant unless their effects are strictly equal to zero, it is natural to use the effect size to draw statistical inferences in large sample research.

The existing effect size measures can be divided into two types: those specific to comparing two conditions (Cohen's d, Hedges' g, Glass's Δ, etc.) and those describing the proportion of variability explained (\( {\eta ^2} \), \( \eta _p^2 \), \( {R^2} \), adjusted \( {R^2} \), etc.). The most frequently reported effect size measure is the regression coefficient (\( \theta \))—among the 58% of IS papers published between 2014 and 2018 that reported effect size, nearly all used the regression coefficient (\( \theta \)). The regression coefficient (\( \theta \)) describes the sensitivity of the dependent variable to a unit change in the independent variable; in that sense, it belongs to the first type of effect size measure. Although the use of the regression coefficient (\( \theta \)) alleviates the deflated p-value concern to some extent, this method has limitations. First, the regression coefficient (\( \theta \)) is an unstandardized effect size measure whose value is influenced by the choice of scale. This hinders the comparison of effect size across different variables and different studies (a standardized coefficient can help with this issue, but it still faces the second limitation noted below). Second, the regression coefficient (\( \theta \)) method can only be applied at the variable level, not the factor level. In many cases, we are interested in the implications of the underlying factors rather than the variables. For instance, if we want to study the relationship between people's personalities and their job performance, where personality is measured along five dimensions with many items, we are interested in the effect size of the Big Five personality factors in explaining people's job performance, rather than the effect size of each item. In this case, the traditional effect size (\( \theta \)) cannot provide much insight. Similarly, in our e-mail marketing case in Section 4, the effect size for the Seasonal/Festive variable indicates that it increases the e-mail opening odds by 27%, but effect sizes for Membership Tier and Message Type cannot be obtained because they are categorical factors operationalized as several variables.

In the current study, we have used the LRT statistic to enable the application of effect size in multivariate situations. This test statistic directly measures the joint effect size of all the variables corresponding to the focal factor by calculating the unique amount of explanatory power provided by the focal factor to the dependent variable. Furthermore, we have developed a quantile matching method and an absolute multivariate effect size test statistic to enable the comparison of effect size across different factors and different studies to overcome the scaling effect.


3 MULTIVARIATE EFFECT SIZE

Our multivariate effect size measure is developed based on the LRT statistic due to the following: (1) the LRT statistic has a close connection to the regression coefficient (\( \theta \)), (2) it is easy to calculate, and (3) it is easy to interpret.

3.1 Log-likelihood Ratio Test Statistic and Multivariate Effect Size

When comparing the relative tenability of two competing nested models, the most common method is the log likelihood ratio test, where \( \begin{equation*} {\rm{\Lambda }} = \frac{{{\rm{max}}[{L_0}(Null{\rm{\ }}Model|Data)]}}{{{\rm{max}}[{L_1}(Alternative{\rm{\ }}Model|Data)]}}. \end{equation*} \)Then \( - 2log{\rm{\Lambda }} \) has an asymptotic \( {\chi ^2} \) distribution with q degrees of freedom, where q is the difference in the number of free parameters between the general and restricted hypotheses [29]. By testing whether \( -2log{\rm{\Lambda }} \) is significantly different from 0, we can tell whether the alternative model is preferred or not. Similarly, we apply this method to identify the important factors by constructing the LRT statistic \( LR{T_i} \): (1) \( \begin{equation} LR{T_i} = - 2( {{\cal L}( {{\boldsymbol{\tilde{\theta }}}}) - {\cal L}( {{\boldsymbol{\hat{\theta }}}})}), \end{equation} \)where \( {\cal L}( {\boldsymbol{\theta }} ) \) is the log likelihood of \( {\boldsymbol{\theta }} \), \( {\boldsymbol{\hat{\theta }}} = ( {{{\hat{\theta }}_1},{{\hat{\theta }}_2}, \ldots {{\hat{\theta }}_v},{{\hat{\theta }}_{v + 1}}, \ldots {{\hat{\theta }}_w}} ) \) is the maximum likelihood (ML) estimator for \( {\boldsymbol{\theta }} \) over the full parameter set, and \( {\boldsymbol{\tilde{\theta }}} = ( {{{\tilde{\theta }}_1},{{\tilde{\theta }}_2}, \ldots {{\tilde{\theta }}_v},0, \ldots 0} ) \) is the ML estimator under the null hypothesis \( {H_0} \): the exclusion of \( {X_i} \) has no impact on the model's goodness of fit, i.e., \( \ {\theta _{v + 1}} = \cdots = {\theta _w} = 0 \). Then \( LR{T_i} \) measures the impact of the exclusion of \( {X_i} \) from the model on the model's goodness of fit, which can be used as an indicator of the effect size of \( {X_i} \).

Based on Equation (1), the LRT statistic can be rewritten as: \( \begin{equation*} LR{T_i} = - 2( {{\cal L}( {{\boldsymbol{\tilde{\theta }}}} ) - {\cal L}( {{\boldsymbol{\hat{\theta }}}})}) \approx - \mathop \sum \limits_{i = v + 1}^w \frac{{{\partial ^2}{\cal L}}}{{\partial {{\hat{\theta }}_i}\partial {{\hat{\theta }}_i}}}\hat{\theta }_i^2 = - \mathop \sum \limits_{i = v + 1}^w {\lambda _i}p_{ii}^2\hat{\theta }_i^2. \end{equation*} \)(The proof is presented in Appendix A.) Therefore, the LRT statistic measures the overall effect size of the focal factor by summing the square of the coefficient \( \theta \) for each variable, weighted by the second derivative of the log likelihood with respect to \( \theta \); this weight can be interpreted as the amount of information the variable carries about \( \theta \), or the credibility of \( \theta \). Note that the second derivative also equals \( {\lambda _i}p_{ii}^2 \), in which \( {\lambda _i} \) is the eigenvalue and \( {p_{ii}} \) is the corresponding entry of the eigenvector matrix of the Hessian matrix. Because the eigenvectors give the directions in which \( \theta \) increases or decreases the most and the eigenvalues give the magnitudes of those changes, the LRT statistic captures a joint effect built from the individual \( \hat{\theta }_i^2 \). Through these eigenvalue–eigenvector pairs, the LRT statistic addresses the issue of aggregating multiple univariate effect sizes, and we have therefore used it as the basis for our test statistic. When \( w - v = 1 \) (the factor is operationalized as one variable), the LRT statistic gives the univariate effect size by calculating the square of the corresponding coefficient weighted by its credibility. That is, the LRT statistic is closely related to the traditional effect size (\( \theta \)) in the univariate setting, but it can easily be extended to a multivariate setting. We also provide numerical evidence for this close relationship in the univariate setting in Table 9, Section 4—the correlation between the LRT statistic and the square of the coefficient (\( \theta \)) is extremely close to 1. By applying the LRT method in our e-mail marketing case, we can easily obtain the multivariate effect sizes for Membership Tier and Message Type—44,733.9 and 39,527.7, respectively. Each can be regarded as a joint effect from all the variables corresponding to the factor.
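In practice, the factor-level LRT statistic of Equation (1) can be obtained by fitting two nested models. The R sketch below is our illustration, not the authors' code; it assumes a hypothetical data frame d with a binary outcome open, a multi-level focal factor tier, and controls msg_type and prev_open:

```r
# Full model vs. the restricted model that excludes the focal factor.
full    <- glm(open ~ tier + msg_type + prev_open, data = d, family = binomial)
reduced <- glm(open ~        msg_type + prev_open, data = d, family = binomial)

# LRT_i = -2 * (logLik(restricted) - logLik(full)); its degrees of freedom
# equal the number of variables (here, dummies for `tier`) that were dropped.
lrt_i <- as.numeric(-2 * (logLik(reduced) - logLik(full)))
df_i  <- attr(logLik(full), "df") - attr(logLik(reduced), "df")
c(LRT = lrt_i, df = df_i)
```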

Although the LRT statistic can quantify the joint effect of multiple variables, its expected value and uncertainty are determined by its degrees of freedom, so two factors cannot be compared directly unless they have the same degrees of freedom. In the next section, we propose a quantile-matching method to facilitate the comparison of factors with different degrees of freedom and provide theoretical justification for quantile matching.

3.2 Quantile-Matching Transformation Method for Multivariate Effect Size

As discussed, we cannot compare the effect sizes of factors directly from the LRT statistic because more variables in a factor lead to a higher expected explanatory power and greater variance. To resolve this problem, we must adjust for the effect of the number of variables underlying the factor of interest. Alternative methods to achieve this have been proposed: (1) using cumulative probability, which provides a concise and universal summary of different statistical tests that no other statistic can achieve [37]; and (2) matching representative characteristics of the distribution. Four characteristics are typically used to describe a distribution: location, spread, skewness, and peakedness. The location can be measured by the mean or median, and the SD or interquartile range can be used to describe the spread. The classic method of matching both the mean and SD (location and spread) of the distributions of the LRT statistic is: \( \begin{equation*} {Z_i} = \frac{{{{\widehat {LRT}}_i} - E\left( {{{\widehat {LRT}}_i}} \right)}}{{\sqrt {Var\left( {{{\widehat {LRT}}_i}} \right)} }} = \frac{{{{\widehat {LRT}}_i} - d}}{{\sqrt {2d} }}, \end{equation*} \)where d is the degrees of freedom of \( {\widehat {LRT}_i} \), so that \( {Z_i} \) has mean zero and variance 1.

However, both of the above methods have major limitations, particularly in the large sample context. Although cumulative probability is a powerful tool for comparisons across contexts, the prevalence of large samples significantly limits its applicability. The pchisq() function in RStudio (1.1.383) can display a maximum cumulative probability of \( 1 - {10^{ - 323}} \); however, the cumulative probability for a \( {\chi ^2} \) test statistic larger than 1,590 with fewer than 20 degrees of freedom is computed as exactly 1. Such a large test statistic is quite common with a large sample: in the example in Section 4, with n = 1,565,720, the LRT statistics for all factors exceeded 1,590. The method of matching several representative characteristics of the distribution uses only aggregated information about the distribution (e.g., location and spread), leaving other information unused. This leads to inconsistency with the cumulative probability method. In the example presented in Section 1, for the two test statistics \( LR{T_1} = 21.11\sim{\chi ^2}( 3 ) \) and \( LR{T_2} = 25.75\sim{\chi ^2}( 5 ) \), the corresponding cumulative probabilities for \( LR{T_1} \) and \( LR{T_2} \) are both 0.9999, whereas the transformed mean/SD statistics \( {Z_1} \) and \( {Z_2} \) are 7.39 and 6.56, respectively, leading us to wrongly claim that the first factor \( {F_1} \) has a larger effect size.

In this study, we will propose a method of adjusting the impact of the number of variables underlying the factor of interest based on matching across quantiles of distributions. This quantile matching transformation method can achieve the three Fs: (1) Feasibility—it has a one-to-one corresponding relationship with cumulative probabilities under a perfect matching situation, but it can also be applied to scenarios where the cumulative probability method fails due to the use of extremely large samples; (2) Flexibility—the quantile-matching transformation may be done on the full distribution or a specific interval, providing tremendous flexibility in real applications; and (3) Fast calculation—it is easy to implement and calculate.

The quantile-matching method is as follows: let M be a random variable and \( {\boldsymbol{N}} = ( {{N_1}, \ldots ,{N_p}} ) \) be a collection of p random variables. The goal of quantile matching is to find a linear combination \( {\boldsymbol{\beta 'N}} \) where \( {\boldsymbol{\beta }} \) minimizes the integrated squared difference between the quantile functions of M and \( {\boldsymbol{\beta 'N}} \) across [0, 1] [50]. The quantile function is the inverse of the cumulative distribution function: if \( {Q_\xi }( \alpha ) \) denotes the \( \alpha \)th quantile of the random variable \( \xi \), then \( {\rm{P}}\{ {\xi \le {Q_\xi }( \alpha )} \} = \alpha ,{\rm{\ for\ }}\alpha \in [ {0,1} ] \). In addition, if only a certain quantile interval \( [ {{\alpha _1},{\alpha _2}} ] \) is of interest, the integration interval can be changed from [0, 1] to \( [ {{\alpha _1},{\alpha _2}} ] \).
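In R, for instance, the \( {\chi ^2} \) quantile function is qchisq(), the inverse of pchisq():

```r
q <- qchisq(0.975, df = 3)   # Q_{chi^2_(3)}(0.975)
pchisq(q, df = 3)            # inverts back to 0.975
```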

Note that the two LRT statistics \( {\widehat {LRT}_1} \) and \( {\widehat {LRT}_2} \) can be rewritten as the quantile functions \( {Q_{{k_1}}}( {{q_1}} ) \) and \( {Q_{{k_2}}}( {{q_2}} ) \), where \( {Q_k}( \alpha ) \) denotes the \( \alpha \)th quantile of the random variable, \( {q_1} \) and \( {q_2} \) are the corresponding cumulative probabilities for \( {\widehat {LRT}_1} \) and \( {\widehat{LRT}_2} \), and \( {k_1} \) and \( {k_2} \) are the corresponding degrees of freedom of \( {\widehat {LRT}_1} \) and \( {\widehat {LRT}_2} \). In this article, we propose the transformed LRT statistics, \( Z_1^q \) and \( Z_2^q \), based on the linear quantile-matching transformation to adjust the effect of the number of variables or the effect of the degrees of freedom in different factors. We define \( Z_1^q \) and \( Z_2^q \) as the functions of the cumulative probability, q, as \( \frac{{{Q_{{k_1}}}( q ) - {a_{{k_1}}}}}{{{b_{{k_1}}}}} \) and \( \frac{{{Q_{{k_2}}}( q ) - {a_{{k_2}}}}}{{{b_{{k_2}}}}} \), where in practice, \( {a_{{k_1}}} \), \( {b_{{k_1}}} \), \( {a_{{k_2}}} \), \( {b_{{k_2}}} \) are the estimated values obtained by minimizing the integrated squared difference between the quantile functions. In an ideal situation where we can conduct perfect matching at \( {q_1} \) and \( {q_2} \) for two LRT statistics—\( {\widehat {LRT}_1} \) and \( {\widehat {LRT}_2} \)—with degrees of freedom \( {k_1} \) and \( {k_2} \), we have an exact match in \( Z_1^q \) and \( Z_2^q \) at the two cumulative probabilities \( {q_1} \) and \( {q_2} \): (2) \( \begin{equation} \frac{{{Q_{{k_1}}}\left( {{q_1}} \right) - {a_{{k_1}}}}}{{{b_{{k_1}}}}} = \frac{{{Q_{{k_2}}}\left( {{q_1}} \right) - {a_{{k_2}}}}}{{{b_{{k_2}}}}}, \end{equation} \)and: (3) \( \begin{equation} \frac{{{Q_{{k_1}}}\left( {{q_2}} \right) - {a_{{k_1}}}}}{{{b_{{k_1}}}}} = \frac{{{Q_{{k_2}}}\left( {{q_2}} \right) - {a_{{k_2}}}}}{{{b_{{k_2}}}}}. \end{equation} \)

We will demonstrate in the following that under an ideal situation that leads to Equations (2) and (3), the transformed test statistic \( Z_i^q \) will have a one-to-one corresponding relationship with the cumulative probability. Specifically, the following relationship will hold: \( \begin{equation*} If{\rm{\ }}{q_1} > {q_2},{\rm{\ }}then{\rm{\ }}Z_1^q > Z_2^q,{\rm{\ }}and{\rm{\ }}vice{\rm{\ }}versa. \end{equation*} \)

If \( {q_1} > {q_2} \), then \( {Q_{{k_1}}}( {{q_1}} ) > {Q_{{k_1}}}( {{q_2}} ) \), and according to Equation (3) we have: \( \begin{equation*} \frac{{{Q_{{k_1}}}\left( {{q_1}} \right) - {a_{{k_1}}}}}{{{b_{{k_1}}}}} > \ \ \frac{{{Q_{{k_1}}}\left( {{q_2}} \right) - {a_{{k_1}}}}}{{{b_{{k_1}}}}} = \frac{{{Q_{{k_2}}}\left( {{q_2}} \right) - {a_{{k_2}}}}}{{{b_{{k_2}}}}} \end{equation*} \)that is, \( Z_1^q > Z_2^q \).

If \( Z_1^q > Z_2^q \), then: \( \begin{equation*} \frac{{{Q_{{k_1}}}\left( {{q_1}} \right) - {a_{{k_1}}}}}{{{b_{{k_1}}}}} > \ \ \frac{{{Q_{{k_2}}}\left( {{q_2}} \right) - {a_{{k_2}}}}}{{{b_{{k_2}}}}} \end{equation*} \)and according to Equation (3), we have: \( \begin{equation*} \frac{{{Q_{{k_1}}}\left( {{q_1}} \right) - {a_{{k_1}}}}}{{{b_{{k_1}}}}} > \ \ \frac{{{Q_{{k_1}}}\left( {{q_2}} \right) - {a_{{k_1}}}}}{{{b_{{k_1}}}}} \end{equation*} \)therefore, \( {q_1} > {q_2} \). Therefore, the comparison based on \( Z_1^q \) and \( Z_2^q \) will be identical to the comparison based on \( {q_1} \) and \( {q_2} \). We recognize that in a real application, Equations (2) and (3) will not always hold perfectly since (1) we have matched the quantiles across distributions numerically (shown in Section 3.3), so the quantile matching results may be less accurate when a large step size is used; and (2) the quantile cannot be calculated explicitly when the sample size is very large, and therefore an explicit match on those points cannot be achieved. Although the one-to-one corresponding relationship between the cumulative probability and the quantile-matching-based transformed statistic can be violated to a small extent, the relationship roughly holds.

3.3 The Implementation of Quantile-Matching Transformation for LRT Statistics

The above quantile-matching method can be applied to any type of distribution. In our study, we are specifically interested in \( {\chi ^2} \) distributions because our LRT statistic \( {\widehat {LRT}_i} \) follows an asymptotic \( {\chi ^2} \) distribution. We seek a linear relationship between the \( {\chi ^2} \) distributions with degrees of freedom from 2 to 20 and \( \chi _{( 1 )}^2 \), according to the following matching formula: (4) \( \begin{equation} \chi _{\left( n \right)}^2 \approx a + b\chi _{\left( 1 \right)}^2, \end{equation} \)where a and b are obtained by minimizing the following integrated squared difference of the two quantile functions: \( \begin{equation*} \mathop \int \limits_0^1 {\left\{ {{Q_{\chi _{\left( n \right)}^2}}\left( \alpha \right) - {Q_{a + b\chi _{\left( 1 \right)}^2}}\left( \alpha \right)} \right\}^2}d\alpha . \end{equation*} \)

We have chosen a linear model specification for its simplicity, and our numerical estimation has shown that it also achieves a reasonable model fit (the adjusted \( {R^2} \) is above 0.8 for all analyses). We have used the numerical method proposed by Sgouropoulos et al. [50] to estimate \( a \) and \( b \), which provides a better fit at the tails of the distributions and works as follows: \( \begin{equation*} ( {\hat{a},\hat{b}} ) = \arg \mathop {\min }\limits_{a,b} \mathop \sum \limits_{j = v}^{mv} {\left\{ {{Q_{\chi _{\left( n \right)}^2}}\left( j \right) - {Q_{a + b\chi _{\left( 1 \right)}^2}}\left( j \right)} \right\}^2}, \end{equation*} \)where \( v \) denotes the stepwise increment used to divide the quantile interval, so the sum runs over the grid of quantile levels \( j = v, 2v, \ldots, mv \).

In our study, we have chosen a quantile grid over [0.000001, 0.999999] with a step size of 1E-6. We have excluded 0 and 1 from the range because \( {Q_{\chi _{( n )}^2}}( 0 ) = 0 \) and \( {Q_{\chi _{( n )}^2}}( 1 ) = \infty \) for all n, and the inclusion of infinity would cause problems in the integration.
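Because the quantile function of \( a + b\chi _{( 1 )}^2 \) is \( a + b{Q_{\chi _{( 1 )}^2}}( \alpha ) \) for \( b > 0 \), minimizing the summed squared quantile differences over the grid is a linear least-squares problem. The R sketch below is our plain least-squares illustration of the procedure (it may differ slightly from the estimator of Sgouropoulos et al. [50], which refines the fit at the tails) and should recover values close to Table 2:

```r
# Estimate (a_n, b_n) of Equation (4) by matching quantiles on a grid.
alpha <- seq(1e-6, 1 - 1e-6, by = 1e-6)   # 0 and 1 excluded, as discussed above
qm_coefs <- function(n) {
  q1 <- qchisq(alpha, df = 1)             # quantile function of chi^2_(1)
  qn <- qchisq(alpha, df = n)             # quantile function of chi^2_(n)
  fit <- lm(qn ~ q1)                      # least squares over the grid
  c(a = unname(coef(fit)[1]), b = unname(coef(fit)[2]))
}
qm_coefs(5)   # should be close to the (a, b) reported for n = 5 in Table 2
```

For interval matching, the same code applies with alpha restricted to the interval of interest, e.g., seq(0.99, 0.9999999, by = 1e-7).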

Then, if \( n \ge 2 \), \( {\widehat {LRT}_i} \) can be transformed in the following way: (5) \( \begin{equation} Z_i^q = \frac{{{{\widehat {LRT}}_i} - {{\hat{a}}_n}}}{{{{\hat{b}}_n}}} \end{equation} \)

The estimated \( {\hat{a}_n} \) and \( {\hat{b}_n} \) for each n are summarized in Table 2:

Degrees of Freedom, n | \( {\hat{a}_n} \) (Full Distribution) | \( {\hat{b}_n} \) (Full Distribution)
2 | 0.602056 | 1.397977
3 | 1.312148 | 1.687911
4 | 2.073774 | 1.926307
5 | 2.86678 | 2.13332
6 | 3.681391 | 2.318726
7 | 4.512021 | 2.488112
8 | 5.355134 | 2.645015
9 | 6.208324 | 2.791839
10 | 7.069872 | 2.930306
11 | 7.939758 | 3.060265
12 | 8.814545 | 3.185479
13 | 9.694645 | 3.305381
14 | 10.579434 | 3.420594
15 | 11.468401 | 3.531628
16 | 12.361122 | 3.638908
17 | 13.257240 | 3.742791
18 | 14.156453 | 3.843580
19 | 15.058498 | 3.941536
20 | 15.963149 | 4.036886

Table 2. Corresponding a and b for Full Distribution Matching

We have compared the transformed test statistics from the mean/SD method and the QM transformation method at the 0.95 quantile of the \( {\chi ^2} \) distribution with degrees of freedom from 1 to 10. The results in Table 3 show that the QM transformation method produces results more consistent with the cumulative probability method than the mean/SD method does: for the QM transformation, the difference from \( Z_1^q \) remains below 0.5%, whereas for the mean/SD method the difference from \( Z_1^{mean/SD} \) quickly surpasses 7%.

Degrees of Freedom, n | \( LR{T_n} \) | \( Z_n^{M/SD} \) | Difference with \( Z_1^{M/SD} \) (%) | \( Z_n^Q \) | Difference with \( Z_1^Q \) (%)
1 | 3.841 | 2.009 | – | 3.841 | –
2 | 5.991 | 1.996 | 0.7% | 3.855 | −0.4%
3 | 7.815 | 1.966 | 2.1% | 3.853 | −0.3%
4 | 9.488 | 1.940 | 3.4% | 3.850 | −0.2%
5 | 11.071 | 1.920 | 4.4% | 3.847 | −0.2%
6 | 12.592 | 1.903 | 5.3% | 3.844 | −0.1%
7 | 14.067 | 1.889 | 6.0% | 3.841 | 0.0%
8 | 15.507 | 1.877 | 6.6% | 3.839 | 0.0%
9 | 16.919 | 1.867 | 7.1% | 3.838 | 0.1%
10 | 18.307 | 1.858 | 7.5% | 3.836 | 0.1%

  • Notes: \( {\rm{\ }}LR{T_n} \) is the corresponding value of \( \chi _{( n )}^2 \) at the 0.95 quantile; \( Z_n^{M/SD} \) is the mean/SD transformed \( LR{T_n} \); \( Z_n^Q \) is the quantile-matching transformed \( LR{T_n} \); Difference refers to the difference compared to \( {Z_1} \).

Table 3. Mean/SD and QM Transformation (Full Distribution) at q = 0.95

To address the concern that the p-value deflates quickly to zero in the large sample context and to achieve high matching accuracy, we have repeated our analysis in the following two intervals according to Equation (4): [0.99, 0.9999999] with step size 1E-7 and the extreme scenario \( [1 - {10^{ - 320}}, 1 - {10^{ - 322}}] \) with step size \( {10^{ - 320}} \). The estimated \( {\hat{a}_n} \) and \( {\hat{b}_n} \) for each n based on [0.99, 0.9999999] are summarized in Table 4. For the 0.9999 quantile of the \( {\chi ^2} \) distribution with degrees of freedom from 1 to 10, we have compared the transformed test statistics from the mean/SD method, the QM transformation method (full distribution), and the QM transformation method ([0.99, 0.9999999]). Results are presented in Table 5. The QM transformation method using the [0.99, 0.9999999] interval produces the results most consistent with the cumulative probability method. The reason is that the \( {\chi ^2} \) test statistic changes much more quickly at the right tail for the same unit change in the cumulative probability; full distribution matching underestimates this change and achieves lower accuracy because other parts of the distribution are also taken into consideration, whereas interval matching does not have this problem. This demonstrates an advantage of quantile-matching-based transformation—\( {\hat{a}_n} \) and \( {\hat{b}_n} \) can be calculated for different integration intervals, which provides more flexibility for different scenarios.

Degrees of Freedom, n | \( {\hat{a}_n} \) ([0.99, 0.9999999]) | \( {\hat{b}_n} \) ([0.99, 0.9999999])
2 | 2.024 | 1.087
3 | 3.728 | 1.155
4 | 5.291 | 1.213
5 | 6.771565 | 1.264396
6 | 8.196447 | 1.311452
7 | 9.581034 | 1.355043
8 | 10.93 | 1.396
9 | 12.26 | 1.434
10 | 13.57 | 1.471
11 | 14.86 | 1.506
12 | 16.14 | 1.539
13 | 17.41 | 1.572
14 | 18.66 | 1.603
15 | 19.90 | 1.633
16 | 21.14 | 1.662
17 | 22.37 | 1.690
18 | 23.59 | 1.718
19 | 24.80 | 1.744
20 | 26.01 | 1.770

Table 4. Corresponding a and b for [0.99, 0.9999999] Distribution Matching

Degrees of Freedom, n | \( LR{T_n} \) | \( Z_n^{M/SD} \) | Difference (%) | \( Z_n^{Q\_Full} \) | Difference (%) | \( Z_n^{Q\_[ {0.99,0.9999999} ]} \) | Difference (%)
1 | 15.137 | 10.00 | – | 15.14 | – | 15.14 | –
2 | 18.421 | 8.21 | 17.9% | 12.75 | 15.8% | 15.08 | 0.4%
3 | 21.108 | 7.39 | 26.0% | 11.73 | 22.5% | 15.05 | 0.6%
4 | 23.513 | 6.90 | 31.0% | 11.13 | 26.4% | 15.02 | 0.8%
5 | 25.745 | 6.56 | 34.4% | 10.72 | 29.1% | 15.01 | 0.9%
6 | 27.856 | 6.31 | 36.9% | 10.43 | 31.1% | 14.99 | 1.0%
7 | 29.878 | 6.11 | 38.8% | 10.20 | 32.6% | 14.98 | 1.1%
8 | 31.828 | 5.96 | 40.4% | 10.01 | 33.9% | 14.97 | 1.1%
9 | 33.720 | 5.83 | 41.7% | 9.85 | 34.9% | 14.97 | 1.2%
10 | 35.564 | 5.72 | 42.8% | 9.72 | 35.7% | 14.95 | 1.2%

  • Notes: \( LR{T_n} \) is the corresponding value of \( \chi _{( n )}^2 \) at the 0.9999 quantile; \( Z_n^{M/SD} \) is the mean/SD transformed \( LR{T_n} \); \( Z_n^{Q\_Full} \) is the quantile-matching transformed \( LR{T_n} \) based on the full distribution; \( Z_n^{Q\_[ {0.99,0.9999999} ]} \) is the quantile-matching transformed \( LR{T_n} \) based on [0.99, 0.9999999]; Difference refers to the difference compared to \( {Z_1} \).

Table 5. Mean/SD and QM Transformation (Full Distribution and [0.99, 0.9999999]) at q = 0.9999

Then, we have repeated our analysis according to Equation (4) in the extreme interval \( [1 - {10^{ - 320}}, 1 - {10^{ - 322}}] \). However, as the corresponding \( {\chi ^2} \) statistics are not calculable for quantiles larger than \( 1 - {10^{ - 320}} \) in RStudio, direct matching is not possible. To address this issue, we have used QM and natural spline interpolation to obtain estimates of the \( {\chi ^2} \) statistics at those quantiles through the following steps [45]; details are presented in Appendix B:

  • We transform the probability density function (PDF) of the \( {\chi ^2} \) distribution into a second-order ordinary differential equation (ODE) using the quantile mechanics approach.

  • A power series approach is used to estimate the first few terms of the ODE solution, which serve as the initial approximations.

  • Natural spline interpolation is used to force the initial approximations to converge to a solution close to the values computed by R.

  • The obtained spline function is then used to estimate the \( {\chi ^2} \) statistics in the extreme interval (a minimal sketch of this interpolation step follows this list).
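The sketch below illustrates only the interpolation step, under stated assumptions: it parameterizes quantiles by \( s = - {\log _{10}} \)(upper-tail probability), seeds a grid with qchisq(..., lower.tail = FALSE) where R can still compute it, and interpolates with a natural spline. The paper instead seeds the grid with the power-series approximates from the QM approach (Appendix B):

```r
# Estimate extreme chi^2_(11) quantiles at upper-tail probabilities 10^(-s).
s_grid <- seq(290, 300, by = 0.5)
q_grid <- qchisq(10^(-s_grid), df = 11, lower.tail = FALSE)
q_spline <- splinefun(s_grid, q_grid, method = "natural")   # natural spline
q_spline(295.25)   # estimated chi^2_(11) quantile at 1 - 10^(-295.25)
```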

Then, the estimated \( {\chi ^2} \) statistics are used for quantile matching via Equation (4), and the resulting \( {\hat{a}_n} \) and \( {\hat{b}_n} \) are presented in Table 6. In practice, we suggest that IS researchers use the following criteria to decide how the transformation should be done (a small helper encoding these criteria follows the list):

  • If the LRT statistics exceed 1,500, Table 6 should be used.

  • If the corresponding quantile of the LRT statistic is larger than 0.99, Table 4 should be used; otherwise, Table 2 is sufficient.
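A small helper encoding these criteria might look as follows (the 1,500 cutoff and table choices are taken directly from the list above; the function only indicates which table's constants to use):

```r
which_table <- function(lrt, df) {
  if (lrt > 1500) {
    "Table 6 (extreme scenario)"
  } else if (pchisq(lrt, df = df) > 0.99) {
    "Table 4 ([0.99, 0.9999999] interval)"
  } else {
    "Table 2 (full distribution)"
  }
}
which_table(24522.13, df = 11)   # "Table 6 (extreme scenario)"
```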

Degrees of Freedom, n | \( {\hat{a}_n} \) (Extreme Scenario) | \( {\hat{b}_n} \) (Extreme Scenario)
2 | 9.2342 | 0.9999
3 | 15.5549 | 1.0004
4 | 19.4130 | 1.0021
5 | 25.4821 | 1.0021
6 | 29.8352 | 1.0031
7 | 35.7912 | 1.0028
8 | 40.2895 | 1.0035
9 | 42.9536 | 1.0053
10 | 47.2076 | 1.0060
11 | 51.6892 | 1.0064
12 | 56.7695 | 1.0064
13 | 61.0236 | 1.0068
14 | 64.9559 | 1.0074
15 | 69.4390 | 1.0076
16 | 72.3630 | 1.0089
17 | 76.5250 | 1.0092
18 | 80.0481 | 1.0099
19 | 84.3088 | 1.0101
20 | 87.9563 | 1.0107

Table 6. Corresponding a and b for Extreme Scenario Matching (\( [1 - 10^{-320}, 1 - 10^{-322}] \))

This transformation step is essential: it ensures comparison accuracy and flexibility for multivariate effect sizes across factors with different numbers of operationalized variables. As shown in Table 18, in the context of the factors that influence accident severity, the LRT statistics for Location and Time are 24,522.13 and 8,699.50. Because they are operationalized as 11 variables and one variable, respectively, their LRT statistics theoretically cannot be compared directly, and their cumulative probabilities are computed as 1 in RStudio. If we use the mean/SD-standardized LRT statistics for comparison, the transformed statistic for Location is \( \frac{{24,522.13 - 11}}{{\sqrt {22} }} = 5,\!225.79 \), while the transformed statistic for Time is \( \frac{{8,699.50 - 1}}{{\sqrt 2 }} = 6,\!150.77 \), and we would conclude that Time is more influential than Location in explaining accident severity. However, if we apply the quantile-matching transformation method, the transformed statistic for Location is \( \frac{{24,522.13 - 51.6892}}{{1.0064}} = 24,\!314.83 \), which is much larger than 8,699.50. That is, Location is in fact more influential than Time in explaining accident severity.
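The transformation itself is a one-line lookup; below is a sketch of the Location/Time comparison above, with the constants for df = 11 taken from Table 6:

```r
qm_transform <- function(lrt, a, b) (lrt - a) / b              # Equation (5)
z_location <- qm_transform(24522.13, a = 51.6892, b = 1.0064)  # df = 11
z_time     <- 8699.50                          # df = 1: no adjustment needed
c(Location = z_location, Time = z_time)        # 24,314.83 vs. 8,699.50
```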

3.4 The Development of Absolute Multivariate Effect Size

The method outlined above enables us to identify the influential factors within one model. However, it is sometimes more meaningful to conclude whether a factor is influential based on an absolute threshold. To do so, we use the proportion of explanatory power provided by the focal factor, similar to \( {R^2} \): (6) \( \begin{equation} {p_i} = \frac{{LR{T_i}}}{{LR{T_{\boldsymbol{X}}}}}, \end{equation} \)where \( LR{T_i} \) is the explanatory power provided by \( {X_i} \), and \( LR{T_{\boldsymbol{X}}} \) is the total explanatory power provided by all Xs.

However, because one factor can be operationalized as several variables and a greater number of variables provides more freedom for the model to estimate the variance in the dependent variable [20, 44], a factor operationalized as more variables tends to achieve a higher absolute multivariate effect size under Equation (6). To adjust for the bias caused by the number of operationalized variables, we use the quantile-matching transformed statistic to calculate the absolute multivariate effect size, in a manner similar to the adjusted \( {R^2} \):

(7) \( \begin{equation} Adjusted\ {p_i} = \frac{{Z_i^q}}{{Z_{\boldsymbol{X}}^q}}, \end{equation} \)where \( Z_i^q \) is the quantile-matching transformed multivariate effect size of \( {X_i} \), \( Z_{\boldsymbol{X}}^q \) is the quantile-matching transformed multivariate effect size of all Xs, and the transformation can be based on the full distribution, the [0.99, 0.9999999] interval, or the extreme scenario. We will demonstrate the use of this method in Section 4.

Moreover, we can provide thresholds for researchers to refer to by building a connection with Cohen's \( {f^2} \), defined as: \( \begin{equation*} {f^2} = \frac{{{R^2}}}{{1 - {R^2}}}, \end{equation*} \)the ratio between the variance explained by the IVs and the variance explained by the residuals. According to Cohen's [13] guidelines, \( {f^2} \ge 0.02,\ {f^2} \ge 0.15 \), and \( {f^2} \ge 0.35 \) represent small, medium, and large effect sizes, respectively, with corresponding \( {R^2} \) values of 0.196, 0.36, and 0.51. Because \( {R^2} \) is analogous to \( {p_i} \) (one measures the proportion of variance explained, the other the proportion of explanatory power provided by IVs), we have adopted the same thresholds for \( {p_i} \). Furthermore, \( Adjusted\ {p_i} \) can be rewritten as \( \begin{equation*} Adjusted\ {p_i} = \frac{{Z_i^q}}{{Z_{\boldsymbol{X}}^q}} = \left\{ \begin{array}{@{}*{1}{c}@{}} {{{\hat{b}}_N} \cdot \frac{{{{\widehat {LRT}}_i}}}{{{{\widehat {LRT}}_{\boldsymbol{X}}} - {{\hat{a}}_N}}},\ \ n = 1}\\[6pt] {{{\hat{b}}_N} \cdot \frac{{\frac{{{{\widehat {LRT}}_i} - {{\hat{a}}_n}}}{{{{\hat{b}}_n}}}}}{{{{\widehat {LRT}}_{\boldsymbol{X}}} - {{\hat{a}}_N}}},\ \ n > 1} \end{array}\right., \end{equation*} \)where n is the number of variables by which the focal factor \( {X_i} \) is operationalized and N is the number of IVs in the full model. Because \( {\hat{a}_N} \) is small relative to \( {\widehat {LRT}_{\boldsymbol{X}}} \), \( Adjusted\ {p_i} \) is approximately equal to \( {\hat{b}_N} \cdot {p_i} \) when \( n = 1 \), and to \( {\hat{b}_N} \cdot {\tilde{p}_i} \) when \( n > 1 \), where \( {\tilde{p}_i} = \) \( \frac{{\frac{{{{\widehat {LRT}}_i} - {{\hat{a}}_n}}}{{{{\hat{b}}_n}}}}}{{{{\widehat {LRT}}_{\boldsymbol{X}}}}} \) is the proportion of explanatory power provided by the focal factor after adjusting for its number of operationalized variables. The thresholds for \( Adjusted\ {p_i} \) are then obtained by multiplying the \( {p_i} \) thresholds above by the corresponding \( {\hat{b}_N} \) from Table 2, 4, or 6, with degrees of freedom equal to the number of IVs in the full model.
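As a worked sketch of Equations (6) and (7), the following computes the absolute multivariate effect size of Membership Tier in the 100% sample, with the LRT values taken from Table 8 and the extreme-scenario constants from Table 6 (6 variables for the factor, 13 IVs in the full model):

```r
lrt_tier <- 44733.9; lrt_full <- 1070726   # Table 8, 100% sample
a_6  <- 29.8352; b_6  <- 1.0031            # Table 6, df = 6
a_13 <- 61.0236; b_13 <- 1.0068            # Table 6, df = 13

p_i <- lrt_tier / lrt_full                                             # Equation (6)
adjusted_p_i <- ((lrt_tier - a_6) / b_6) / ((lrt_full - a_13) / b_13)  # Equation (7)
c(p = p_i, adjusted = adjusted_p_i)        # adjusted ~ 0.042, matching Table 11
```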


4 EXAMPLE: E-MAIL COMMUNICATION EFFECTIVENESS

To demonstrate the use of the transformed statistic, we have used a large dataset from the e-mail archive of an international coffee house operating in China. The dataset includes 6,230,253 e-mails sent to 2,206,652 members covering three membership tiers: Bottom Tier, Middle Tier, and Top Tier. E-mails are classified into three categories: New Product, Promotion/Discount, and Upgrade Incentive/Reminder. We have also obtained information about consumers' previous e-mail opening decisions, their membership duration, whether the e-mail is seasonal, and whether the company name is mentioned in the title. E-mail marketing has had a positive impact on sales [58]; e-mail communications not only serve as price discrimination tools but are also a form of "advertising" for the firm's products [48]. Therefore, we have investigated what factors influence consumers' e-mail opening decisions by building the following logit model: \( \begin{equation*} {P_{Open\ an\ email}} = logit({\theta _0} + {\theta _1}\left( {Membership\ Tier} \right) + {\theta _2}\left( {Message\ Type} \right) + {\theta _3}\left( {Membership\ Tier \times Message\ Type} \right) + {\theta _4}\left( {Control\ Variables} \right) + \varepsilon ) \end{equation*} \)

We chose Membership Tier and Message Type as our main focus because prior purchase is commonly used as an indicator of familiarity with the firm [52], and familiarity is associated with increased levels of trust [19]. Following this line of reasoning, a higher membership tier indicates a higher level of trust, which could (1) reduce suspicion that the received e-mail is spam and (2) improve receptiveness toward the firm's communications. In addition, the design of the message has been shown to significantly influence the effectiveness of communication [7, 57], and different message types are associated with different benefits—customers may associate new product introduction e-mails with hedonic benefits, since these e-mails fulfill members' desire to learn about new products; Promotion/Discount e-mails communicate potential monetary gains and provide utilitarian benefits; and Upgrade Incentive/Reminder e-mails offer symbolic benefits. Hence, it is reasonable to believe that Message Type will play an important role in consumers' e-mail opening decisions.

The interactions between membership tiers and incentive e-mails are dropped due to high correlations with other variables (>0.85). Table 7 presents the results. Clear evidence of the deflated p-value problem is observed: the p-values for most variables are \( < 2E{-}16 \), lower than the minimum value that can be shown by the logit package of RStudio 1.1.383. The traditional effect size method (\( \theta \)) suggests that Previous E-mail Opened has the largest effect on consumers' current opening decisions.

Variable | 5% | 10% | 15% | 20% | 25% | 50% | 100%
Intercept | −1.667*** | −1.666*** | −1.677*** | −1.673*** | −1.689*** | −1.676*** | −1.675***
New Product | −0.426*** | −0.456*** | −0.411*** | −0.407*** | −0.381*** | −0.414*** | −0.411***
Upgrade Incentive/Reminder | 0.505*** | 0.512*** | 0.530*** | 0.538*** | 0.521*** | 0.523*** | 0.530***
Other | −0.479*** | −0.473*** | −0.474*** | −0.475*** | −0.468*** | −0.475*** | −0.475***
Previous E-mail Opened | 1.918*** | 1.921*** | 1.906*** | 1.914*** | 1.912*** | 1.918*** | 1.916***
Middle Tier | 0.064** | 0.061*** | 0.073*** | 0.072*** | 0.088*** | 0.075*** | 0.073***
Top Tier | 0.284*** | 0.268*** | 0.286*** | 0.273*** | 0.293*** | 0.294*** | 0.285***
Membership Duration | −0.115*** | −0.106*** | −0.112*** | −0.109*** | −0.109*** | −0.110*** | −0.110***
Company Name | 0.326*** | 0.334*** | 0.311*** | 0.314*** | 0.301*** | 0.306*** | 0.308***
Seasonal/Festive | 0.239*** | 0.240*** | 0.240*** | 0.239*** | 0.243*** | 0.242*** | 0.239***
New Product X Middle Tier | −0.059* | −0.010 | −0.039* | −0.065*** | −0.078*** | −0.044*** | −0.045***
New Product X Top Tier | 0.290*** | 0.342*** | 0.311*** | 0.284*** | 0.288*** | 0.308*** | 0.302***
Other X Middle Tier | 0.389*** | 0.390*** | 0.394*** | 0.395*** | 0.374*** | 0.385*** | 0.391***
Other X Top Tier | 0.321*** | 0.324*** | 0.333*** | 0.343*** | 0.322*** | 0.318*** | 0.330***

Table 7. Coefficient Estimates for Each Subsample Using the Simple Logit Model

However, if we attempt to compare the overall effects of Membership Tier and Message Type, we encounter a problem, particularly if interaction terms are also considered. Based on the coefficients alone, we might assume Message Type is more influential because the coefficients of its variables are larger.

First, we have applied the LRT method to obtain the multivariate effect size at the factor level; the results are presented in Table 8. Then, we have calculated the correlation between the obtained LRT statistics and the squared coefficients \( \theta \) for the univariate factors (Membership Duration, Previous E-mail Opened, Seasonal/Festive, and Company Name) to demonstrate the close relationship between the LRT method and the traditional effect size method (\( \theta \)). All the correlations are very close to 1 (Table 9), indicating that the LRT method inherits the merits of the traditional effect size method.

Factor | 5% | 10% | 15% | 20% | 25% | 50% | 100% | Df | Rank
Membership Duration | 462.34 | 954.99 | 1399.80 | 1880.77 | 2330.16 | 4712.92 | 9371.38 | 1 | 5
Membership Tier | 2219.15 | 4421.59 | 6740.06 | 8937.09 | 11101.8 | 22344.7 | 44733.9 | 6 | 2
Message Type | 1940.02 | 4005.05 | 5993.20 | 7922.20 | 9941.82 | 19707.2 | 39527.7 | 7 | 3
Previous E-mail Opened | 43650 | 87032 | 131354 | 174273 | 218218 | 435832 | 871345 | 1 | 1
Seasonal/Festive | 330.73 | 646.89 | 977.96 | 1271.32 | 1594.58 | 3254.57 | 6453.83 | 1 | 6
Company Name | 475.11 | 975.02 | 1489.92 | 1955.73 | 2473.37 | 4916.54 | 9751.75 | 1 | 4
Full Model | 53364.2 | 107006.6 | 161096.8 | 214501.2 | 268587 | 536716 | 1070726 | 13 |

Table 8. Original Multivariate Effect Size

Pearson Correlation | 5% | 10% | 15% | 20% | 25% | 50% | 100%
LRT Statistic and Square of Coefficient | 0.9998 | 0.9997 | 0.9998 | 0.9998 | 0.9998 | 0.9998 | 0.9998

Table 9. Pearson Correlation between LRT Statistic and Square of Coefficient in Univariate Situation

After that, we have applied the quantile-matching transformation to the LRT statistics in Table 8 using the constants in Table 6. The most influential factor in explaining consumers' decisions to open e-mails is their past decision, followed by Membership Tier and Message Type (Table 10). We have also calculated the absolute multivariate effect size for each factor according to Equation (7) and present the results in Table 11. The thresholds for small, medium, and large effect sizes are 0.197, 0.362, and 0.513. The absolute multivariate effect size of Previous E-mail Opened is around 0.82. Adopting statistical reasoning similar to that of the adjusted \( {R^2} \), we can interpret this as Previous E-mail Opened providing around 82% of the explanatory power of the whole model. Compared with the effect size thresholds, Previous E-mail Opened has a large absolute effect size, whereas the other factors fall below the small-effect threshold.

Factor | 5% | 10% | 15% | 20% | 25% | 50% | 100% | Df | Rank
Membership Duration | 462.34 | 954.99 | 1399.80 | 1880.77 | 2330.16 | 4712.92 | 9371.38 | 1 | 5
Membership Tier | 2182.55 | 4378.18 | 6689.49 | 8879.73 | 11037.75 | 22245.90 | 44565.91 | 6 | 2
Message Type | 1898.91 | 3958.18 | 5940.77 | 7864.39 | 9878.37 | 19616.48 | 39381.64 | 7 | 3
Previous E-mail Opened | 43651 | 87032 | 131354 | 174273 | 218219 | 435833 | 871345 | 1 | 1
Seasonal/Festive | 330.73 | 646.89 | 977.96 | 1271.32 | 1594.58 | 3254.57 | 6453.83 | 1 | 6
Company Name | 475.11 | 975.02 | 1489.92 | 1955.73 | 2473.37 | 4916.54 | 9751.75 | 1 | 4
Full Model | 52943.16 | 106223.26 | 159948.13 | 212991.83 | 266712.33 | 533030.37 | 1063433.63 | 13

Table 10. Relative Multivariate Effect Size (Quantile-Matching Transformed LRT Statistic)

Factor | 5% | 10% | 15% | 20% | 25% | 50% | 100% | Df
Membership Duration | 0.009 | 0.009 | 0.009 | 0.009 | 0.009 | 0.009 | 0.009 | 1
Membership Tier | 0.041 | 0.041 | 0.042 | 0.042 | 0.041 | 0.042 | 0.042 | 6
Message Type | 0.036 | 0.037 | 0.037 | 0.037 | 0.037 | 0.037 | 0.037 | 7
Previous E-mail Opened | 0.824 | 0.819 | 0.821 | 0.818 | 0.818 | 0.818 | 0.819 | 1
Seasonal/Festive | 0.006 | 0.006 | 0.006 | 0.006 | 0.006 | 0.006 | 0.006 | 1
Company Name | 0.009 | 0.009 | 0.009 | 0.009 | 0.009 | 0.009 | 0.009 | 1

Table 11. Absolute Multivariate Effect Size

  • Notes: Factors with large effect sizes are highlighted in bold.

4.1 Comparison with Predictively Influential Factors

Although explanation is central to theory development, methods for assessing the predictive claims of a model have received increasing attention in recent years. As a result, we have also compared the influential factors identified by our method with those identified by the predictive measurement of the area under the ROC curve (AUROC).

First, we will demonstrate below how we have identified influential factors based on their predictive accuracy by applying ROC analysis. After fitting a logistic regression model: (8) \( \begin{equation} \log \left( {\frac{p}{{1 - p}}} \right) = {\theta _0} + {\theta _1}{X_1} + {\theta _2}{X_2} + \cdots + {\theta _k}{X_k}, \end{equation} \)where p = Pr(Y = 1), we can estimate the value of p, denoted by \( \hat{p} \), representing a score for prediction. Predicted values of Y can be derived by setting a “threshold value” \( v \). An observation is predicted as “Y = 1” if \( \hat{p} \ge v \), where \( \hat{p} \) is calculated by substituting \( {\theta _i} \) with the ML estimate \( {\hat{\theta }_i} \).

A ROC curve is commonly used to assess the predictive performance of a binary regression; it is a plot of sensitivity (the true positive rate) against 1 − specificity (the false positive rate). By varying the threshold value \( v \) and calculating the AUROC, we can measure a model's predictive performance and the probability of making a correct binary classification [8, 26]. In practice, the calculation of AUROC is based on sample misclassification errors. To obtain a reliable estimate of AUROC, we have used N-fold cross-validation [27]. For example, with N = 10, we have randomly separated the data set into 10 non-overlapping blocks. To predict the category for observations in block j, j = 1, …, 10, we have obtained \( {\hat{\theta }_i} \) to determine the formula for the score \( \hat{p} \) using the other nine blocks and computed AUROC(j). The AUROC is then estimated from the sample mean \( \mathop \sum \nolimits_{j = 1}^{10} AUROC( j )/10 \).

Using cross-validation, we have obtained \( AURO{C_{full}} \) for the full model. To assess the effect size of \( {X_i} \), we have refitted the logistic regression model in (8) without \( {X_i} \) in each block j and recalculated \( AURO{C_i} \). The level of influence is the difference in AUROC: \( \begin{equation*} DAURO{C_i} = AURO{C_{full}} - AURO{C_i}. \end{equation*} \)
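A compact sketch of this cross-validated DAUROC procedure is given below, assuming scikit-learn and a numeric design matrix X whose columns for the focal factor are known; the names and defaults are illustrative.

```python
# Sketch: 10-fold cross-validated AUROC, and DAUROC as the drop in AUROC
# when one factor's columns are removed (column indices are illustrative).
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import KFold

def cv_auroc(X, y, n_folds=10):
    folds = KFold(n_splits=n_folds, shuffle=True, random_state=0)
    aucs = []
    for train, test in folds.split(X):
        model = LogisticRegression(max_iter=1000).fit(X[train], y[train])
        aucs.append(roc_auc_score(y[test], model.predict_proba(X[test])[:, 1]))
    return np.mean(aucs)

def dauroc(X, y, factor_cols):
    keep = [c for c in range(X.shape[1]) if c not in factor_cols]
    return cv_auroc(X, y) - cv_auroc(X[:, keep], y)
```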

In our example, we have found Previous E-mail Opened is the most influential factor—even from the perspective of prediction—followed by Message Type and Membership Tier. Seasonal/Festive and Company Name contributed less predictive power. We have also replicated the ROC analysis with various sample sizes and have reported the results in Tables 12 and 13; the main results remain unchanged. However, we have noted that the influence ranking of factors is not stable when the sample size is too small (below 10%).

Factor | 5% | 10% | 15% | 20% | 25% | 50% | 100% | Df
Membership Duration | 0.00306 | 0.00311 | 0.00300 | 0.00308 | 0.00304 | 0.00309 | 0.00304 | 1
Membership Tier | 0.00699 | 0.00680 | 0.00687 | 0.00688 | 0.00688 | 0.00688 | 0.00695 | 6
Message Type | 0.00831 | 0.00870 | 0.00853 | 0.00848 | 0.00852 | 0.00850 | 0.00853 | 7
Previous E-mail Opened | 0.14508 | 0.14513 | 0.14539 | 0.14494 | 0.14525 | 0.14497 | 0.14496 | 1
Seasonal/Festive | 0.00093 | 0.00084 | 0.00089 | 0.00090 | 0.00085 | 0.00088 | 0.00088 | 1
Company Name | 0.00089 | 0.00094 | 0.00095 | 0.00090 | 0.00097 | 0.00093 | 0.00092 | 1

Table 12. Average DAUROC Results for Each Subsample

Factor | 5% | 10% | 15% | 20% | 25% | 50% | 100%
Membership Duration | 4 | 4 | 4 | 4 | 4 | 4 | 4
Membership Tier | 3 | 3 | 3 | 3 | 3 | 3 | 3
Message Type | 2 | 2 | 2 | 2 | 2 | 2 | 2
Previous E-mail Opened | 1 | 1 | 1 | 1 | 1 | 1 | 1
Seasonal/Festive | 5 | 6 | 6 | 6 | 6 | 6 | 6
Company Name | 6 | 5 | 5 | 5 | 5 | 5 | 5

Table 13. Average DAUROC Rank Results for Each Subsample

Next, we compare the results obtained from the LRT and the AUROC analyses; both sets of results are highly correlated. First, we have calculated the Pearson correlation of the test statistic values produced by these two methods (presented in Table 14); it is above 0.999 across all sample sizes. We have then calculated the Spearman correlation of the two sets of rankings. This correlation is lower, as the AUROC analysis identifies Message Type as the second most influential factor, whereas the LRT analysis indicates that Membership Tier is the second most influential. However, the correlation is still above 0.8 once the AUROC analysis results become stable (as shown in Table 15).

Pearson Correlation | 5% | 10% | 15% | 20% | 25% | 50% | 100%
DAUROC/QM (Full Distribution) | 0.9991 | 0.9991 | 0.9991 | 0.9991 | 0.9991 | 0.9991 | 0.9991
DAUROC/QM (0.95 to 0.99999) | 0.9997 | 0.9996 | 0.9997 | 0.9997 | 0.9997 | 0.9996 | 0.9996

Table 14. Pearson Correlation Results

Spearman Correlation | 5% | 10% | 15% | 20% | 25% | 50% | 100%
DAUROC/QM (Full Distribution) | 0.77143 | 0.88571 | 0.88571 | 0.88571 | 0.88571 | 0.88571 | 0.88571
DAUROC/QM (0.95 to 0.99999) | 0.77143 | 0.88571 | 0.88571 | 0.88571 | 0.88571 | 0.88571 | 0.88571

Table 15. Spearman Correlation Results

Initially, our finding may seem surprising, as explaining and predicting are different tasks: the type of uncertainty associated with explanation is of a different nature than that associated with prediction [28]. According to Shmueli [51], measurable data are not accurate representations of their underlying constructs; the operationalization of theories and constructs into statistical models and measurable data therefore creates a disparity between the ability to explain phenomena at the conceptual level and the ability to generate predictions at the measurable level. That is, the results of explanation usually differ from those of prediction. However, Konishi and Kitagawa [35] have pointed out that there may be no significant difference between inferring the true structure and making a prediction if an infinitely large quantity of data is available or if the data are noiseless. Our results provide empirical support for this view, and the demonstration is particularly relevant to big data research, where predictive performance is the main objective when building statistical models. Furthermore, LRT analysis substantially outperforms ROC analysis in terms of computation time (shown in Table 16): on average, ROC analysis is around 100 times more time-consuming than our method.

Analysis | 5% | 10% | 15% | 20% | 25% | 50% | 100%
LRT Analysis | 2.7 | 5.0 | 8.5 | 11.9 | 14.1 | 29.7 | 62.0
ROC Analysis | 281.5 | 602.9 | 995.0 | 1339.3 | 1798.8 | 3661.9 | 7240.1

Table 16. Average Computation Time (s) of Standardized LRT Analysis and ROC Analysis

4.2 Replication on the US Accidents Dataset

We have replicated our analysis on a public dataset—the US accidents dataset. This is a countrywide car accident dataset that covers 49 states of the USA and was collected by Moosavi et al. [41]. The accident data were collected from February 2016 to December 2020 and include 47 variables. After excluding the records that contain missing values, 1,464,180 records remain.

Traffic accidents are a major public health problem, and predicting their causes and occurrences has been a topic of interest in the field of machine learning. Several studies have used Moosavi's accident dataset to study this issue. Moosavi et al. [40] used a deep neural network model to predict real-time traffic accidents based on traffic events, weather data, time, and points of interest; they evaluated their model with the F1-score, defined as \( \frac{{2{\rm{*}}precision{\rm{*}}recall}}{{precision + recall}} \), and demonstrated a significant improvement for the accident class. Kebede [31] used the image data in this dataset to predict locations where car accidents are likely to happen, applying transfer learning with a CNN. Parra et al. [46] evaluated different explainable machine learning models (e.g., random forests and decision trees) on this dataset in predicting road traffic crashes. Brodeur et al. [10] examined the impact of COVID-19 safer-at-home policies on car crashes with this dataset and found a 20% reduction in vehicular collisions.

In our example, we have tried to understand how the severity of an accident is influenced by three factors: weather, location, and time. The original accident severity in the dataset has four levels. For demonstration purposes, we have transformed it into a binary variable by considering levels 1 and 2 as less severe and levels 3 and 4 as very severe, so that we could apply a logit model. Weather is described by temperature, wind_chill, humidity, pressure, visibility, wind_speed, and precipitation; Location is described by bump, crossing, give_way, junction, no_exit, railway, roundabout, station, stop, traffic_calming, traffic_signal, and turning_loop; Time is indicated by the variable sunrise_sunset (whether it is day or night). A simple logit model is built, and the results are presented in Table 17. Again, clear evidence of the deflated p-value problem is observed, as the p-values for most variables are \( < 2E - 16 \).
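As a rough illustration (not the authors' exact preprocessing), the binarization and logit fit might look as follows, assuming a cleaned DataFrame `acc` whose columns follow the dataset's published names and are already numeric or boolean.

```python
# Sketch: binarize the 4-level severity and fit the logit model of Table 17.
# `acc` and its column names are assumptions based on the public dataset.
import statsmodels.formula.api as smf

acc["severe"] = (acc["Severity"] >= 3).astype(int)  # levels 3-4 = very severe
formula = (
    "severe ~ Temperature + Wind_Chill + Humidity + Pressure + Visibility"
    " + Wind_Speed + Precipitation + Bump + Crossing + Give_Way + Junction"
    " + No_Exit + Railway + Roundabout + Station + Stop + Traffic_Calming"
    " + Traffic_Signal + Sunrise_Sunset"
)
model = smf.logit(formula, data=acc).fit(disp=0)
print(model.summary())
```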

Variable | Estimate | Std. Error | z value | Pr(>|z|)
Intercept | 0.5806 | 0.0554 | 10.4705 | <2.22e-16***
Temperature | 0.0830 | 0.0012 | 69.9780 | <2.22e-16***
Wind_Chill | −0.0676 | 0.0010 | −64.6209 | <2.22e-16***
Humidity | 0.0069 | 0.0001 | 59.6018 | <2.22e-16***
Pressure | −0.1147 | 0.0019 | −60.5012 | <2.22e-16***
Visibility | −0.0012 | 0.0008 | −1.4735 | 0.14062299
Wind_Speed | 0.0139 | 0.0004 | 31.0866 | <2.22e-16***
Precipitation | 0.8354 | 0.0484 | 17.2714 | <2.22e-16***
Bump | −0.9891 | 0.1762 | −5.6147 | 1.97E-08***
Crossing | −0.6695 | 0.0127 | −52.9124 | <2.22e-16***
Give_Way | 0.2831 | 0.0456 | 6.2021 | 5.57E-10***
Junction | 0.2278 | 0.0071 | 31.8630 | <2.22e-16***
No_Exit | 0.2542 | 0.0614 | 4.1410 | 3.46E-05***
Railway | 0.3592 | 0.0265 | 13.5350 | <2.22e-16***
Roundabout | −2.5100 | 0.7162 | −3.5044 | 0.00045771***
Station | −0.4722 | 0.0196 | −24.1444 | <2.22e-16***
Stop | −1.0027 | 0.0250 | −40.0614 | <2.22e-16***
Traffic_Calming | 1.3308 | 0.1209 | 11.0087 | <2.22e-16***
Traffic_Signal | −0.6657 | 0.0081 | −82.0339 | <2.22e-16***
Sunrise_Sunset | −0.4671 | 0.0051 | −92.1781 | <2.22e-16***

Table 17. Coefficient Estimates for U.S. Accidents Using the Simple Logit Model

First, we have used the LRT method to obtain the raw multivariate effect size at the factor level and then applied the quantile-matching transformation using Table 6; the results are presented in Table 18. We have found that the most influential factor in explaining accident severity is Location, followed by Weather. We have also calculated the absolute multivariate effect size for each factor according to Equation (7) and have presented the results in Table 18: Location provides about 47% of the whole model's explanatory power, and Weather provides about 31%. As the thresholds for small, medium, and large effect sizes are 0.198, 0.364, and 0.515, we can conclude Location has a medium absolute effect size, Weather has a small effect size, and Time has a lower than small effect size.

Factor | LRT Statistic | Standardized LRT Statistic | Relative Multivariate Effect Size | Absolute Multivariate Effect Size | Average DAUROC | Df
Weather | 16085.99 | 4297.29 | 16005.39 | 0.3095 | 0.0323 | 7
Location | 24522.13 | 5225.79 | 24314.83 | 0.4702 | 0.0385 | 11
Time | 8699.50 | 6150.77 | 8699.50 | 0.17 | 0.0105 | 1

Table 18. Multivariate Effect Size and DAUROC Results of U.S. Accident Dataset

Furthermore, we have performed the ROC analysis with 10-fold cross-validation and calculated the DAUROC for each factor—the results are presented in Table 18. We have found Weather and Location are more influential in predicting accident severity, which is consistent with the results of the multivariate effect size method; the Pearson correlation is 0.9401, and the Spearman correlation is 1.

Notice that if we used the standardized LRT statistics to compare the influence of factors, we would conclude that Time is the most influential factor in explaining accident severity, followed by Location and Weather. However, the AUROC method suggests Location is the most influential factor in predicting accident severity, followed by Weather and Time, which is consistent with the results generated by the transformed multivariate effect size method. This highlights the importance of the quantile-matching transformation technique in the large sample context. In addition, we have observed that the direct use of raw LRT statistics can generate results similar to those of the quantile-matching transformation; however, it lacks theoretical grounding. We therefore recommend the quantile-matching transformation method in general for easy and fast calculation.

4.3 Replication on the Airbnb Listing Dataset

We have used the logit regression model in the previous two examples. In the following example, we have replicated our analysis on the Airbnb listing dataset1 with a linear regression model. This dataset contains 250,000+ listings in ten major cities, including information about hosts, pricing, location, room type, and review scores.

Price determinants of Airbnb listings have always been of interest to researchers, given the uniqueness of Airbnb's listings and the heterogeneity of its landlords [56]. Wang and Nicolau [53] explored the impacts of five factors—host attributes, site and property attributes, amenities and services, rental rules, and online review ratings—on prices using 180,533 accommodation rental offers in 33 cities. Cai et al. [11] examined the impacts of five groups of explanatory variables on Airbnb prices in Hong Kong: listing attributes, host attributes, rental policies, listing reputation, and listing location. Wu and Qiu [56] focused on the influence of nine factors (external factors, landlord characteristics, location characteristics, listing characteristics, room facilities, rental rules, trust, sociality, and tenant characteristics) on listing prices using 51,874 listings in 36 cities in China. Similarly, in this example, we have investigated the impact of host attributes, property attributes, location, and online review ratings on listing prices (the prices are converted to Euros). Host attributes include the duration the host has been on Airbnb, whether the host's identity is verified, whether the host is a “Superhost”, and the total number of listings the host has on Airbnb. The property attribute describes the property as an entire place, a private room, or a shared room. Location refers to the city the property is in. Online review ratings include the overall rating and the respective ratings of cleanliness, check-in experience, communication experience with the host, location within the city, and the listing's value relative to its price. There are 161,447 listings remaining after cleaning the data. A simple linear model is built, and the results are presented in Table 19. The p-values for quite a few variables are <2e-16, again indicating the deflated p-value problem.
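Under the same caveats as before (illustrative column names, not the paper's code), the linear model and a factor-level LRT for, say, Online Review Ratings could be sketched as follows.

```python
# Sketch: OLS price model and factor-level LRT for Online Review Ratings
# (its five rating variables). `listings` and the names are assumptions.
import statsmodels.formula.api as smf

full = smf.ols(
    "price_eur ~ host_duration + verified + superhost + n_host_listings"
    " + C(room_type) + C(city) + rating_location + rating_cleanliness"
    " + rating_checkin + rating_communication + rating_overall",
    data=listings,
).fit()
reduced = smf.ols(
    "price_eur ~ host_duration + verified + superhost + n_host_listings"
    " + C(room_type) + C(city)",
    data=listings,
).fit()
lrt_reviews = 2 * (full.llf - reduced.llf)  # joint effect of the five ratings
```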

Variable | Estimate | Std. Error | z value | Pr(>|z|)
Intercept | 53.2693 | 8.5367 | 6.24 | 4.38E-10***
Host Duration | 0.1416 | 0.0213 | 6.644 | 3.07E-11***
Verified | 7.6594 | 1.4648 | 5.229 | 1.71E-07***
Superhost | −3.5095 | 1.4316 | −2.451 | 1.42E-02*
No. of Total Lists | 0.0771 | 0.0102 | 7.583 | 3.40E-14***
Private room | −68.3483 | 5.2341 | −13.058 | <2e-16***
Shared room | 56.9390 | 3.3996 | 16.749 | <2e-16***
Cape Town | 53.0135 | 5.5498 | 9.552 | <2e-16***
Hong Kong | −1.0756 | 3.5631 | −0.302 | 0.762742
Istanbul | −2.6542 | 3.3221 | −0.799 | 0.424317
Mexico City | 72.8116 | 3.0563 | 23.823 | <2e-16***
New York | 52.7662 | 2.9360 | 17.972 | <2e-16***
Paris | 18.6970 | 3.2837 | 5.694 | 1.24E-08***
Rio de Janeiro | 38.8591 | 3.1323 | 12.406 | <2e-16***
Rome | 75.6454 | 3.1264 | 24.196 | <2e-16***
Sydney | 2.7485 | 0.7903 | 3.478 | 5.06E-04***
Review_Score_Location | −3.2523 | 1.0918 | −2.979 | 0.002895**
Review_Score_Cleanliness | 6.3694 | 0.9057 | 7.033 | 2.04E-12***
Review_Score_Checkin | 3.7115 | 1.1128 | 3.335 | 8.53E-04***
Review_Score_Communication | 53.2693 | 8.5367 | 6.24 | 4.38E-10***
Overall Rating | 0.1416 | 0.0213 | 6.644 | 3.07E-11***

Table 19. Coefficient Estimates for Airbnb Listing Prices Using the Linear Regression Model

The raw multivariate effect size of each factor is first obtained through the LRT method; the quantile-matching transformation is then applied using Tables 4 and 6 (Table 4 is used for Host Attributes and Online Review Ratings, and Table 6 is used for Property Attributes and Location), and the results are presented in Table 20. We find that the most influential factors in explaining Airbnb listing prices are Property Attributes and Location. The absolute multivariate effect size for each factor is also calculated according to Equation (7) and is presented in Table 20. Property Attributes and Location have medium absolute effect sizes (each provides about 42% of the whole model's explanatory power), while the effect sizes of Host Attributes and Online Review Ratings are lower than small.

Factor | LRT Statistic | Standardized LRT Statistic | Relative Multivariate Effect Size | Absolute Multivariate Effect Size | Average DRMSE | Df
Host Attributes | 150.89 | 51.93 | 120.03 | 0.0256 | 0.1008 | 4
Property Attributes | 1981.99 | 89.95 | 1972.86 | 0.4206 | 1.5396 | 2
Location | 2040.34 | 78.78 | 1986.82 | 0.4236 | 1.5484 | 9
Online Review Ratings | 158.19 | 48.44 | 119.76 | 0.0255 | 0.1116 | 5

Table 20. Multivariate Effect Size and DRMSE Results of Airbnb Listing Dataset

Furthermore, we have performed the root-mean-square error (RMSE) analysis with 10-fold cross-validation and calculated the difference in RMSE (DRMSE) for each factor; the results are presented in Table 20. RMSE measures the differences between the values predicted by a model and the observed values and is commonly used to assess the predictive performance of linear regression models. It is calculated as follows: \( \begin{equation*} RMSE = \sqrt {\mathop \sum \limits_{i = 1}^n \frac{{{{\left( {{{\hat{y}}_i} - {y_i}} \right)}^2}}}{n}} . \end{equation*} \)
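The DRMSE computation mirrors the DAUROC sketch above: refit the linear model without the factor's columns and take the increase in cross-validated RMSE. A minimal sketch, under the same illustrative assumptions:

```python
# Sketch: 10-fold cross-validated RMSE, and DRMSE as the increase in error
# when one factor's columns are dropped (column indices are illustrative).
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import KFold

def cv_rmse(X, y, n_folds=10):
    folds = KFold(n_splits=n_folds, shuffle=True, random_state=0)
    errs = []
    for train, test in folds.split(X):
        pred = LinearRegression().fit(X[train], y[train]).predict(X[test])
        errs.append(np.sqrt(np.mean((pred - y[test]) ** 2)))
    return np.mean(errs)

def drmse(X, y, factor_cols):
    keep = [c for c in range(X.shape[1]) if c not in factor_cols]
    return cv_rmse(X[:, keep], y) - cv_rmse(X, y)
```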

We have found Property Attributes and Location are more influential in predicting listing prices. This is consistent with the results of the multivariate effect size method, and the Pearson correlation is 0.9999.

Again, if we used the standardized LRT statistics to compare the influence of factors, we would conclude Property Attributes is the most influential factor in explaining listing prices. However, as shown by the RMSE method, Property Attributes and Location are similarly influential in predicting listing prices, which is consistent with the results generated by the transformed multivariate effect size method. This underlines the usefulness of the quantile-matching transformation technique in the large sample context.


5 CONCLUSIONS

The deflated p-value problem caused by large samples can lead to severe issues in IS research. In this study, we have proposed a multivariate effect size method to address this problem by making use of the LRT statistic. The multivariate effect size measures the joint effect of the variables by which the focal factor is operationalized and is shown to be closely related to the traditional effect size (\( \theta \)); yet it extends naturally to multivariate situations, thus overcoming the limitation that the traditional effect size (\( \theta \)) can only be applied at the variable level. This statistic asymptotically follows the \( {\chi ^2} \) distribution. However, because one factor can be operationalized as several variables, comparisons among factors with different numbers of variables may be troublesome. Previous transformation methods face limitations in the large sample context; for example, the mean/SD transformation can give results inconsistent with cumulative probabilities when the difference in cumulative probabilities is too small to be recorded. For this reason, we have applied the quantile-matching transformation method. This method achieves the three Fs: (1) Feasibility—it is feasible in scenarios where the cumulative probability method fails due to the use of extremely large samples, and it maintains a one-to-one correspondence with cumulative probability under a perfect matching situation; (2) Flexibility—the matching can be done on the full distribution or on a specific interval, providing tremendous flexibility in real applications; and (3) Fast calculation—it is easy to implement and calculate. We have also proposed using the adjusted proportion of explanatory power provided by the focal factor as the absolute effect size measure and have built connections with the classic Cohen's \( {f^2} \) to determine the thresholds for small, medium, and large effect sizes. We have applied the methods to three datasets through the following steps:

  • Obtain the estimated multivariate effect size (LRT statistic) for the factors of interest.

  • Transform the multivariate effect size using Tables 2, 4, or 6:

    • If the multivariate effect size exceeds 1,500, Table 6 should be used.

    • If the corresponding quantile of the multivariate effect size is larger than 0.99, Table 4 should be used; otherwise, Table 2 is sufficient.

  • Compare the transformed multivariate effect size or calculate the absolute effect size to identify the influential factors.

As demonstrated, our method can be used in the large sample context to identify influential factors with greater accuracy, and we have shown that explanatorily influential factors are often also predictively influential in large sample scenarios.

Our study has made the following contributions. First, we have used the LRT statistic to measure multivariate effect size at the factor level, thus addressing the deflated p-value issue. Second, we have introduced the quantile-matching transformation method to deal with the impact of the differing degrees of freedom of the \( {\chi ^2} \) distributions. This transformation provides great accuracy and flexibility, can be applied to any distribution or part of a distribution, and enables comparison regardless of the context. Third, we have shown explanatorily influential factors are also predictively influential in the large sample context, thus providing insights into the roles of explanation and prediction in the large sample context.

We must also acknowledge the limitations of our study. First, our study does not address the concern that the p-value cannot directly measure the probability of the null hypothesis given the observed data. Second, our method only applies to scenarios where the models are nested—we cannot compare the influence of factors from non-nested models. Further studies can extend the analysis to a more general context.

APPENDICES

A APPENDIX

Denote the LRT statistic of factor \( {X_i} \) by \( LR{T_i} \), which can be written as: \( \begin{equation*} LR{T_i} = - 2( {{\cal L}( {{\boldsymbol{\tilde{\theta }}}}) - {\cal L}( {{\boldsymbol{\hat{\theta }}}})}), \end{equation*} \)where \( {\cal L}( {\boldsymbol{\theta }} ) \) is the log-likelihood of \( {\boldsymbol{\theta }} \), \( {\boldsymbol{\hat{\theta }}} = ( {{{\hat{\theta }}_1},{{\hat{\theta }}_2}, \ldots {{\hat{\theta }}_v},{{\hat{\theta }}_{v + 1}}, \ldots {{\hat{\theta }}_w}} ) \) is the ML estimator for \( {\boldsymbol{\theta }} \) over the full parameter set, and \( {\boldsymbol{\tilde{\theta }}} = ( {{{\tilde{\theta }}_1},{{\tilde{\theta }}_2}, \ldots {{\tilde{\theta }}_v},0, \ldots 0} ) \) is the ML estimator under the null hypothesis \( {H_0} \): the exclusion of \( {X_i} \) has no impact on the model's goodness of fit. \( LR{T_i} \) thus measures the impact of excluding \( {X_i} \) on the model's goodness of fit and can be used as an indicator of the effect size of \( {X_i} \).

Note that if we do a Taylor expansion around \( {\boldsymbol{\hat{\theta }}} \) for \( {\cal L}( {{\boldsymbol{\tilde{\theta }}}} ) \), we will obtain: \( \begin{equation*} {\cal L}( {{\boldsymbol{\tilde{\theta }}}}) = {\cal L}( {{\boldsymbol{\hat{\theta }}}}) + \mathop \sum \limits_{i = 1}^w \left( {{{\hat{\theta }}_i} - {{\tilde{\theta }}_i}} \right)\left(\frac{{\partial {\cal L}}}{{\partial {{\hat{\theta }}_i}}}\Big|{\boldsymbol{\hat{\theta }}}\right) + \frac{1}{2}\mathop \sum \limits_{i = 1}^w \mathop \sum \limits_{j = 1}^w \left( {{{\hat{\theta }}_i} - {{\tilde{\theta }}_i}} \right)\left(\frac{{{\partial ^2}{\cal L}}}{{\partial {{\hat{\theta }}_i}\partial {{\hat{\theta }}_j}}}\Big|{\boldsymbol{\hat{\theta }}}\right)\left( {{{\hat{\theta }}_j} - {{\tilde{\theta }}_j}} \right) \end{equation*} \)The second term is zero because \( \frac{{\partial {\cal L}}}{{\partial {{\hat{\theta }}_i}}} = 0 \) at the ML estimate; therefore, the expansion can be rewritten as: \( \begin{equation*} {\cal L}( {{\boldsymbol{\tilde{\theta }}}}) = {\cal L}( {{\boldsymbol{\hat{\theta }}}}) + \frac{1}{2}( {{\boldsymbol{\hat{\theta }}} - {\boldsymbol{\tilde{\theta }}}})H{( {{\boldsymbol{\hat{\theta }}} - {\boldsymbol{\tilde{\theta }}}})^T}, \end{equation*} \)where H is the Hessian matrix and is equal to \( PD{P^T} \), D is the diagonal matrix containing the eigenvalues \( {\lambda _i} \) of H, and P is the eigenvector matrix of H. Then we have: \( \begin{equation*} - 2( {{\cal L}( {{\boldsymbol{\tilde{\theta }}}}) - {\cal L}( {{\boldsymbol{\hat{\theta }}}} )}) = - \mathop \sum \limits_{i = 1}^w {\lambda _i}p_{ii}^2{( {{{\hat{\theta }}_i} - {{\tilde{\theta }}_i}} )^2} = - \mathop \sum \limits_{i = 1}^v {\lambda _i}p_{ii}^2{( {{{\hat{\theta }}_i} - {{\tilde{\theta }}_i}})^2} - \mathop \sum \limits_{i = v + 1}^w {\lambda _i}p_{ii}^2\hat{\theta }_i^2, \end{equation*} \)where \( {p_{ii}} \) is the value on the diagonal of the eigenvector matrix P, so that \( {\lambda _i}p_{ii}^2 \) corresponds to the diagonal value of the Hessian matrix \( \frac{{{\partial ^2}{\cal L}}}{{\partial {{\hat{\theta }}_i}\partial {{\hat{\theta }}_i}}} \). Note that the first summation vanishes asymptotically: because \( {\hat{\theta }_i} \) and \( {\tilde{\theta }_i} \) are strongly correlated, their difference is \( O( {1/n} ) \) and its square is \( O( {1/{n^2}} ) \); since \( \frac{{{\partial ^2}{\cal L}}}{{\partial {{\hat{\theta }}_i}\partial {{\hat{\theta }}_i}}} \) is only \( O( n ) \), each such term is \( O( {1/n} ) \) and drops out of the summation. Therefore, the LRT statistic for \( {X_i} \) can be rewritten as: \( \begin{equation*} LR{T_i} = - 2( {{\cal L}( {{\boldsymbol{\tilde{\theta }}}}) - {\cal L}( {{\boldsymbol{\hat{\theta }}}})}) \approx - \mathop \sum \limits_{i = v + 1}^w \frac{{{\partial ^2}{\cal L}}}{{\partial {{\hat{\theta }}_i}\partial {{\hat{\theta }}_i}}}\hat{\theta }_i^2 = - \mathop \sum \limits_{i = v + 1}^w {\lambda _i}p_{ii}^2\hat{\theta }_i^2. \end{equation*} \)

B APPENDIX

The PDF and CDF of the \( {\chi ^2} \) distribution are given by: \( \begin{equation*} f\left( {x,m} \right) = \frac{1}{{{2^{\frac{m}{2}}}{\rm{\Gamma }}\left( {m/2} \right)}}{x^{\frac{m}{2} - 1}}{e^{ - \frac{x}{2}}},{\rm{\ }}m > 0,x \in \left[ {0, + \infty } \right) \end{equation*} \) \( \begin{equation*} F\left( {x,m} \right) = \frac{{\gamma \left( {\frac{m}{2},\frac{x}{2}} \right)}}{{{\rm{\Gamma }}\left( {\frac{m}{2}} \right)}} = P\left( {\frac{m}{2},\frac{x}{2}} \right), \end{equation*} \)where \( \gamma ( {.,.} ) \) is the lower incomplete gamma function and \( P( {.,.} ) \) is the regularized incomplete gamma function.

We have used the quantile mechanics (QM) approach to obtain a second-order nonlinear differential equation for the quantile function of the \( {\chi ^2} \) distribution. Suppose: (1) \( \begin{equation} Q\left( p \right) = {F^{ - 1}}\left( p \right),\ \end{equation} \)where the function \( {F^{ - 1}}( p ) \) is the compositional inverse of the CDF; then the first-order quantile equation can be obtained by differentiating Equation (1): (2) \( \begin{equation} Q'\left( p \right) = \frac{1}{{f\left( {Q\left( p \right)} \right)}} = {2^{\frac{m}{2}}}\left( {{\rm{\Gamma }}\left( {m/2} \right)} \right)Q{\left( p \right)^{1 - \frac{m}{2}}}{e^{\frac{{Q\left( p \right)}}{2}}}{\rm{\ \ }} \end{equation} \)Differentiating Equation (2) gives: \( \begin{equation*} Q''( p ) = {2^{\frac{m}{2}}}\left( {{\rm{\Gamma }}\left( {\frac{m}{2}} \right)} \right)\left[ {Q{{( p )}^{1 - \frac{m}{2}}}{e^{\frac{{Q( p )}}{2}}}\frac{1}{2}Q'( p ) + \left( {1 - \frac{m}{2}} \right)Q{{( p )}^{ - \frac{m}{2}}}{e^{\frac{{Q( p )}}{2}}}Q'( p )} \right] \end{equation*} \)

After applying factorization, we can obtain: (3) \( \begin{equation} Q''\left( p \right) = \frac{1}{2}{\left( {Q'\left( p \right)} \right)^2} + \frac{{2 - m}}{{2Q\left( p \right)}}{\left( {Q'\left( p \right)} \right)^2}{\rm{\ \ \ }} \end{equation} \)with the boundary conditions: \( Q( 0 ) = 0,\ Q'( 0 ) = 1. \)

We apply the power series approach to solve Equation (3), and the solution will be: (4) \( \begin{equation} Q\left( p \right) = {d_0} + {d_1}p + {d_2}{p^2} + {d_3}{p^3} + {d_4}{p^4} + {d_5}{p^5} + \cdots = \mathop \sum \limits_{n = 0}^\infty {d_n}{p^n}{\rm{\ \ \ \ }} \end{equation} \)The coefficients are \( {d_0},\ {d_1},{d_2},{d_3},{d_4}, \ldots ,{d_n}. \) Differentiate Equation (4): (5) \( \begin{equation} Q'\left( p \right) = {d_1} + 2{d_2}p + 3{d_3}{p^2} + 4{d_4}{p^3} + 5{d_5}{p^4} + 6{d_6}{p^5} + \cdots = \mathop \sum \limits_{n = 1}^\infty n{d_n}{p^{n - 1}}{\rm{\ \ \ \ \ }} \end{equation} \)Differentiate Equation (5): (6) \( \begin{equation} Q''\left( p \right) = 2{d_2} + 6{d_3}p + 12{d_4}{p^2} + 20{d_5}{p^3} + 30{d_6}{p^4} + 42{d_7}{p^5} + \cdots = \mathop \sum \limits_{n = 2}^\infty n\left( {n - 1} \right){d_n}{p^{n - 2}}{\rm{\ \ }} \end{equation} \)

Substituting Equations (4), (5), and (6) into Equation (3) and collecting like terms (constant, \( p \), \( {p^2} \), and \( {p^3} \)), we obtain:

When m = 1, \( Q( p ) = p + {p^2} + \frac{5}{6}{p^3} \),

When m \( \ne 1 \), \( Q( p ) = p + \frac{1}{{4( {m - 1} )}}{p^2} + \frac{1}{{6m( {m - 1} )}}{p^3} \).

We then adopt natural spline interpolation to reduce the errors between the RStudio software values and the initial approximation from the QM method.

As we focus on extreme intervals, e.g., \( [1 - {10^{ - 318}}, 1 - {10^{ - 320}}] \), the resulting \( Q( p ) \) values are very similar to each other, which creates difficulty in obtaining meaningful interpolation results. We therefore transform \( Q( p ) \) by subtracting a constant and taking the base-10 logarithm before interpolation: \( \begin{equation*} Q{( p )_{transformed}} = {\log _{10}}( {Q( p ) - c} ) + 320, \end{equation*} \)where c is a constant obtained by rounding up \( Q( {1 - {{10}^{ - 320}}} ) \).

The final closed-form expressions for the quantile functions of the \( {\chi ^2} \) distribution at degrees of freedom from 1 to 20 are:

When m = 1, \( Q{( p )_{final}} = 1465.9113 - 7.2213Q( p ) - 8.1206Q{( p )^2} - 7.5275Q{( p )^3} - 12.9783Q{( p )^4}\\ - 6.4765Q{( p )^5} \)

When m = 2, \( Q{( p )_{final}} = 1473.6545 - 7.2263Q( p ) - 8.1261Q{( p )^2} - 7.5326Q{( p )^3} - 12.9872Q{( p )^4}\\ - 6.4809Q{( p )^5} \)

When m = 3, \( Q{( p )_{final}} = 1480.5043 - 7.2310Q( p ) - 8.1315Q{( p )^2} - 7.5376Q{( p )^3} - 12.9958Q{( p )^4}\\ - 6.4852Q{( p )^5} \)

When m = 4, \( Q{( p )_{final}} = 1486.8796 - 7.2359Q( p ) - 8.1369Q{( p )^2} - 7.5426Q{( p )^3} - 13.0045Q{( p )^4}\\ - 6.4895Q{( p )^5} \)

When m = 5, \( Q{( p )_{final}} = 1492.9350 - 7.2407Q( p ) - 8.1423Q{( p )^2} - 7.5476Q{( p )^3} - 13.0132Q{( p )^4}\\ - 6.4938Q{( p )^5} \)

When m = 6, \( Q{( p )_{final}} = 1498.7505 - 7.2457Q( p ) - 8.1479Q{( p )^2} - 7.5528Q{( p )^3} - 13.0219Q{( p )^4}\\ - 6.4983Q{( p )^5} \)

When m = 7, \( Q{( p )_{final}} = 1504.3741 - 7.2504Q( p ) - 8.1532Q{( p )^2} - 7.5577Q{( p )^3} - 13.0306Q{( p )^4}\\ - 6.5025Q{( p )^5} \)

When m = 8, \( Q{( p )_{final}} = 1509.8384 - 7.2550Q( p ) - 8.1584Q{( p )^2} - 7.5625Q{( p )^3} - 13.0389Q{( p )^4}\\ - 6.5067Q{( p )^5} \)

When m = 9, \( Q{( p )_{final}} = 1515.1670 - 7.2597Q( p ) - 8.1637Q{( p )^2} - 7.5674Q{( p )^3} - 13.0474Q{( p )^4}\\ - 6.5109Q{( p )^5} \)

When m = 10, \( Q{( p )_{final}}\! = \!1520.3773 - 7.2643Q( p ) - 8.1690Q{( p )^2} - 7.5723Q{( p )^3} - 13.0557Q{( p )^4}\\ - 6.5151Q{( p )^5} \)

When m = 11, \( Q{( p )_{final}}\! = \!1525.4831 - 7.2694Q( p ) - 8.1746Q{( p )^2} - 7.5776Q{( p )^3} - 13.0645Q{( p )^4}\\ - 6.5197Q{( p )^5} \)

When m = 12, \( Q{( p )_{final}}\! = \!1530.4944 - 7.2737Q( p ) - 8.1795Q{( p )^2} - 7.5821Q{( p )^3} - 13.0725Q{( p )^4}\\ - 6.5235Q{( p )^5} \)

When m = 13, \( Q{( p )_{final}}\! = \!1535.4212 - 7.2784Q( p ) - 8.1847Q{( p )^2} - 7.5870Q{( p )^3} - 13.0810Q{( p )^4}\\ - 6.5277Q{( p )^5} \)

When m = 14, \( Q{( p )_{final}}\! = \!1540.2706 - 7.2833Q( p ) - 8.1902Q{( p )^2} - 7.5921Q{( p )^3} - 13.0894Q{( p )^4}\\ - 6.5321Q{( p )^5} \)

When m = 15, \( Q{( p )_{final}}\! = \!1545.0482 - 7.2877Q( p ) - 8.1951Q{( p )^2} - 7.5966Q{( p )^3} - 13.0974Q{( p )^4}\\ - 6.5361Q{( p )^5} \)

When m = 16, \( Q{( p )_{final}}\! = \!1549.7600 - 7.2920Q( p ) - 8.2001Q{( p )^2} - 7.6012Q{( p )^3} - 13.1055Q{( p )^4}\\ - 6.5399Q{( p )^5} \)

When m = 17, \( Q{( p )_{final}}\! = \!1554.4109 - 7.2968Q( p ) - 8.2054Q{( p )^2} - 7.6062Q{( p )^3} - 13.1139Q{( p )^4}\\ - 6.5442Q{( p )^5} \)

When m = 18, \( Q{( p )_{final}}\! = \!1559.0044 - 7.3013Q( p ) - 8.2105Q{( p )^2} - 7.6108Q{( p )^3} - 13.1221Q{( p )^4}\\ - 6.5482Q{( p )^5} \)

When m = 19, \( Q{( p )_{final}}\! = \!1563.5443 - 7.3059Q( p ) - 8.2156Q{( p )^2} - 7.6156Q{( p )^3} - 13.1301Q{( p )^4}\\ - 6.5524Q{( p )^5} \)

When m = 20, \( Q{( p )_{final}}\! = \!1568.0337 - 7.3103Q( p ) - 8.2206Q{( p )^2} - 7.6202Q{( p )^3} - 13.1383Q{( p )^4}\\ - 6.5563Q{( p )^5} \)

The closed-form expressions can provide excellent approximations of the RStudio software values within the interval \( [1 - {10^{ - 318}}, 1 - {10^{ - 320}}] \), and we have presented the plots for \( {\chi ^2}( 1 ) \), \( {\chi ^2}( 5 ) \), \( {\chi ^2}( {10} ) \), and \( {\chi ^2}( {20} ) \) below.


REFERENCES

[1] Abbasi A., Sarker S., and Chiang R. H. 2016. Big data research in information systems: Toward an inclusive research agenda. Journal of the Association for Information Systems 17, 2 (2016), 132.
[2] Agarwal R. and Dhar V. 2014. Big data, data science, and analytics: The opportunity and challenge for IS research. Information Systems Research 25, 3 (2014), 443–448.
[3] Agarwal A., Hosanagar K., and Smith M. D. 2015. Do organic results help or hurt sponsored search performance? Information Systems Research 26, 4 (2015), 695–713.
[4] Altman N. and Krzywinski M. 2016. P values and the search for significance. Nature Methods 14, 1 (2016), 3–4.
[5] Altman N. and Krzywinski M. 2017. Interpreting p values. Nature Methods 14, 3 (2017), 213–215.
[6] Amrhein V., Greenland S., and McShane B. 2019. Scientists rise up against statistical significance. Nature 567, 7748 (2019), 305–307.
[7] Ansari A. and Mela C. F. 2003. E-customization. Journal of Marketing Research 40, 2 (2003), 131–145.
[8] Bamber D. 1975. The area above the ordinal dominance graph and the area below the receiver operating characteristic graph. Journal of Mathematical Psychology 12, 4 (1975), 387–415.
[9] Benjamin D. J., Berger J. O., Johannesson M., Nosek B. A., Wagenmakers E., Berk R., and Camerer C. 2018. Redefine statistical significance. Nature Human Behaviour 2, 1 (2018), 6.
[10] Brodeur A., Cook N., and Wright T. 2021. On the effects of COVID-19 safer-at-home policies on social distancing, car crashes and pollution. Journal of Environmental Economics and Management 106 (2021), 102427.
[11] Cai Y., Zhou Y., and Scott N. 2019. Price determinants of Airbnb listings: Evidence from Hong Kong. Tourism Analysis 24, 2 (2019), 227–242.
[12] Chen D. Q., Preston D. S., and Swink M. 2015. How the use of big data analytics affects value creation in supply chain management. Journal of Management Information Systems 32, 4 (2015), 4–39.
[13] Cohen J. E. 1988. Statistical Power Analysis for the Behavioral Sciences. Lawrence Erlbaum Associates, Hillsdale, NJ.
[14] Crane H. 2017. Why ‘redefining statistical significance’ will not improve reproducibility and could make the replication crisis worse. SSRN.
[15] Diamond G. A. and Forrester J. S. 1983. Metadiagnosis: An epistemologic model of clinical judgment. The American Journal of Medicine 75, 1 (1983), 129–137.
[16] Feinstein A. R. 1977. Clinical biostatistics. Clinical Pharmacology & Therapeutics 22, 4 (1977), 485–498.
[17] Fisher R. A. 1925. Statistical Methods for Research Workers. Genesis Publishing Pvt Ltd.
[18] Fritz C. O., Morris P. E., and Richler J. J. 2012. Effect size estimates: Current use, calculations, and interpretation. Journal of Experimental Psychology: General 141, 1 (2012), 2.
[19] Gefen D. 2000. E-commerce: The role of familiarity and trust. Omega 28, 6 (2000), 725–737.
[20] Good I. J. 1967. Contributions to the discussion of a paper by F. J. Anscombe. Journal of the Royal Statistical Society, Series B 29, 1 (1967), 39–42.
[21] Goodman S. N. 1999a. Toward evidence-based medical statistics. 1: The P value fallacy. Annals of Internal Medicine 130, 12 (1999), 995–1004.
[22] Goodman S. N. 1999b. Toward evidence-based medical statistics. 2: The Bayes factor. Annals of Internal Medicine 130, 12 (1999), 1005–1013.
[23] Goodman S. N. 2016. Aligning statistical and scientific reasoning. Science 352, 6290 (2016), 1180–1181.
[24] Grover V., Chiang R. H., Liang T. P., and Zhang D. 2018. Creating strategic business value from big data analytics: A research framework. Journal of Management Information Systems 35, 2 (2018), 388–423.
[25] Halsey L. G., Curran-Everett D., Vowler S. L., and Drummond G. B. 2015. The fickle P value generates irreproducible results. Nature Methods 12, 3 (2015), 179.
[26] Hanley J. A. and McNeil B. J. 1982. The meaning and use of the area under a receiver operating characteristic (ROC) curve. Radiology 143, 1 (1982), 29–36.
[27] Hastie T., Tibshirani R., and Friedman J. 2009. The Elements of Statistical Learning: Data Mining, Inference, and Prediction. Springer Science & Business Media.
[28] Helmer O. and Rescher N. 1959. On the epistemology of the inexact sciences. Management Science 6, 1 (1959), 25–52.
[29] Huelsenbeck J. P. and Crandall K. A. 1997. Phylogeny estimation and hypothesis testing using maximum likelihood. Annual Review of Ecology and Systematics 28, 1 (1997), 437–466.
[30] Kass R. E. and Raftery A. E. 1993. Bayes Factors and Model Uncertainty. Technical Report TR-254. University of Washington, Seattle, WA.
[31] Kebede Y. A. 2020. A mixed-method proposal for traffic hotspots mapping in African cities using raw satellite imagery. International Journal of Engineering Research & Technology 9, 10 (2020), 806–811.
[32] Kim J. H., Ahmed K., and Ji P. I. 2018. Significance testing in accounting research: A critical evaluation based on evidence. Abacus 54, 4 (2018), 524–546.
[33] Kim J. H. and Ji P. I. 2015. Significance testing in empirical finance: A critical review and assessment. Journal of Empirical Finance 34, December (2015), 1–14.
[34] Kitchens B., Dobolyi D., Li J., and Abbasi A. 2018. Advanced customer analytics: Strategic value through integration of relationship-oriented big data. Journal of Management Information Systems 35, 2 (2018), 540–574.
[35] Konishi S. and Kitagawa G. 2008. Information Criteria and Statistical Modelling. Springer Science & Business Media.
[36] Kyriacou D. N. 2016. The enduring evolution of the P value. JAMA 315, 11 (2016), 1113–1115.
[37] Lazzeroni L. C., Lu Y., and Belitskaya-Lévy I. 2016. Solutions for quantifying P-value uncertainty and replication power. Nature Methods 13, 2 (2016), 107.
[38] Lehrer C., Wieneke A., Vom Brocke J., Jung R., and Seidel S. 2018. How big data analytics enables service innovation: Materiality, affordance, and the individualization of service. Journal of Management Information Systems 35, 2 (2018), 424–460.
[39] Lin M., Lucas H. C., and Shmueli G. 2013. Research commentary—too big to fail: Large samples and the p-value problem. Information Systems Research 24, 4 (2013), 906–917.
[40] Moosavi S., Samavatian M. H., Parthasarathy S., Teodorescu R., and Ramnath R. 2019. Accident risk prediction based on heterogeneous sparse data: New dataset and insights. In Proceedings of the 27th ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems, 33–42.
[41] Moosavi S., Samavatian M. H., Parthasarathy S., and Ramnath R. 2019. A countrywide traffic accident dataset. arXiv:1906.05409. Retrieved from https://arxiv.org/abs/1906.05409.
[42] Müller O., Fay M., and vom Brocke J. 2018. The effect of big data and analytics on firm performance: An econometric analysis considering industry characteristics. Journal of Management Information Systems 35, 2 (2018), 488–509.
[43] Nuzzo R. 2014. Scientific method: Statistical errors. Nature News 506, 7487 (2014), 150.
[44] Nakagawa S. and Cuthill I. C. 2007. Effect size, confidence interval and statistical significance: A practical guide for biologists. Biological Reviews 82, 4 (2007), 591–605.
[45] Okagbue H. I., Adamu M. O., and Anake T. A. 2020. Closed-form expressions for the quantile function of the chi square distribution using the hybrid of quantile mechanics and spline interpolation. Wireless Personal Communications 115, 3 (2020), 2093–2112.
[46] Parra C., Ponce C., and Rodrigo S. F. 2020. Evaluating the performance of explainable machine learning models in traffic accidents prediction in California. In Proceedings of the 2020 39th International Conference of the Chilean Computer Science Society. IEEE, 1–8.
[47] Raudenbush S. W., Becker B. J., and Kalaian H. 1988. Modeling multivariate effect sizes. Psychological Bulletin 103, 1 (1988), 111.
[48] Sahni N. S., Zou D., and Chintagunta P. K. 2016. Do targeted discount offers serve as advertising? Evidence from 70 field experiments. Management Science 63, 8 (2016), 2688–2705.
[49] Singh P. V., Sahoo N., and Mukhopadhyay T. 2014. How to attract and retain readers in enterprise blogging? Information Systems Research 25, 1 (2014), 35–52.
[50] Sgouropoulos N., Yao Q., and Yastremiz C. 2015. Matching a distribution by matching quantiles estimation. Journal of the American Statistical Association 110, 510 (2015), 742–759.
[51] Shmueli G. 2010. To explain or to predict? Statistical Science 25, 3 (2010), 289–310.
[52] Söderlund M. 2002. Customer familiarity and its effects on satisfaction and behavioral intentions. Psychology & Marketing 19, 10 (2002), 861–879.
[53] Wang D. and Nicolau J. L. 2017. Price determinants of sharing economy-based accommodation rental: A study of listings from 33 cities on Airbnb.com. International Journal of Hospitality Management 62, April (2017), 120–131.
[54] Wasserstein R. L. and Lazar N. A. 2016. The ASA's statement on p-values: Context, process, and purpose. The American Statistician 70, 2 (2016), 129–133.
[55] Wasserstein R. L., Schirm A. L., and Lazar N. A. 2019. Moving to a world beyond ‘p < 0.05’. The American Statistician 73, sup1 (2019), 1–19.
[56] Wu X. and Qiu J. 2019. A study of Airbnb listing price determinants: Based on data from 36 cities in China. Tourism Tribune 34, 4 (2019), 13–28.
[57] Xiao B. and Benbasat I. 2015. Designing warning messages for detecting biased online product recommendations: An empirical investigation. Information Systems Research 26, 4 (2015), 793–811.
[58] Zantedeschi D., Feit E. M., and Bradlow E. T. 2017. Measuring multichannel advertising response. Management Science 63, 8 (2017), 2706–2728.
