Assessment of Factors Causing Bias in Marketing-Related Publications

The present paper aims to identify and rank the factors that most frequently cause bias in marketing-related publications. To rank these factors, the authors employed the Analytic Hierarchy Process method with three different scales representing all scale groups. The data for the study were obtained through an expert survey involving nine experts from academia and the scientific publishing community. The findings of the study confirm that the factors that most frequently cause bias in marketing-related publications are sampling and sample frame errors, failure to specify the inclusion and exclusion criteria for researched subjects, and non-responsiveness.


Introduction
Bias can be defined as any systematic error in the design, conduct or analysis of a study. In research, bias occurs as a "systematic error introduced into sampling or testing by selecting or encouraging one outcome or answer over others". Bias can arise at any phase of research, including study design or data collection, as well as during data analysis and publication. Bias is not a dichotomous variable [1,2].
Most fields of science, including the social sciences, are currently facing a deep 'reproducibility crisis' [3][4][5]. Research is relevant in all fields of science, whether conducted in the form of experiments, quantitative or qualitative studies, or multidimensional research, so bias can manifest itself at every stage of the research process. Bias patterns and risk factors can thus be assessed across multiple topics within a discipline, across disciplines or larger scientific domains (social, biological and physical sciences), and across all of science [6][7][8][9]. Many of the described biases are acknowledged in the literature, although their extent still remains unspecified. It is likely that different biases pose different threats to different disciplines. The interests of the authors of the present paper relate precisely to biases in marketing research and to the interpretation and study of their characteristics. Different scholars [6,10,11] have confirmed that bias is uneven. Bias in research can manifest itself in a number of ways, as documented in the literature [5,[12][13][14]]. Publication or literature biases can be considered among the most frequently mentioned and discussed. Bias is also relevant in the life sciences [15][16][17][18][19][20], psychology [21,22], education [23,24] and economics [25,26].
An important bias of studies is manifested in their lower precision [27]. This could be related to the fact that smaller studies might report effects of larger magnitude. This problem could stem from genuine heterogeneity in study design [6]; it is relevant in different areas of research and is related to measurement bias, which occurs during the research process and reflects a discrepancy between the information collected and the information the researcher seeks to obtain. Based on our observations from the literature review, the authors of the present paper distinguished 10 factors that are responsible for the biggest part of research bias in marketing-related publications.
(1) Failure to examine and critically assess prior literature: Most studies start with an idea, question or topic that, in many cases, is not new and has already been studied. Often such bias emerges due to a failure to evaluate the issue in previous studies or literature. Researchers and statisticians [14,50] have documented publication bias across a variety of academic disciplines, including the behavioral sciences [51], education [52,53], special education [54,55], ecology [56], medicine [28,[57][58][59][60][61], psychology [13,50,51,[62][63][64][65][66] and theatre and performance [5]. As mentioned in [67], when using databases such as Emerald, EBSCO, Jstor and ScienceDirect, it is necessary to read all of the more detailed articles on the chosen topic. However, it should be noted that unpublished scientific results may differ systematically from published results [68][69][70]. Such bias is referred to as publication bias, and it can affect the literature analysis. According to [54], there are two possible approaches to correct literature analysis: search and inclusion procedures, and a formal statistical approach. The former involves "conducting searches (such as electronic database search, hand searching of journals, contacting experts) to identify all relevant studies, including gray literature (i.e., unpublished studies)". Greenhalgh and Peacock [71] confirmed that relying only on electronic database searches can result in a failure to identify relevant studies. An important aspect of bias in literature and publications is gray literature bias [6,10,71,72]. Polanin, Tanner-Smith and Hennessey defined gray literature as follows: "literature can be broadly thought of as anything not published in a professional journal, including dissertations, policy reports, conference proceedings, book chapters, or otherwise unpublished studies" [72].
(2) Failure to specify the inclusion and exclusion criteria for study subjects: Researchers often omit specifying the research subjects by means of inclusion and exclusion criteria. If the criteria were named, other researchers would understand why the current results may differ from other published studies. To be eligible, research papers needed to be based on and refer to empirical research, rather than on commentaries, letters, editorials or reviews; this process involved identification, screening, and the application of inclusion and exclusion criteria [8].
(3) Failure to determine and report errors in measurement methods: Measurement bias occurs during the research process and reflects a discrepancy between the information collected and the information the researcher seeks to obtain. Reference [73] confirmed that response rates to surveys have fallen, particularly in developed countries, which highlights a real problem regarding measurement methods in research. Some authors [74][75][76] state that efforts to reduce the non-response rate result in bringing into the study a sample of reluctant people, who may provide data filled with measurement errors. Two questions arise from this hypothesized relationship between a low propensity to respond and measurement error. The first has to do with the quality of the statistics (e.g., means, correlation coefficients) computed on the basis of a survey. That is, does the mean square error of a statistic increase when sample persons who are less likely to be contacted or to cooperate are incorporated into the respondent pool? Level-of-effort analyses examine the change in statistics over increased levels of effort, taking a change in the statistics to indicate a risk of nonresponse bias, and no change to indicate the absence of risk. However, if measurement error is correlated with the level of effort (or response propensity), then an observed change or lack of change in the statistic may be due to measurement error and not to nonresponse bias [29].
(4) Failure to specify the exact statistical assumptions made in the analysis and failure to perform sample size analysis before starting the study: When an omitted variable (i.e., an unmeasured variable not included in a model) creates a correlation between the error terms of the two stages of a model, traditional techniques, such as ordinary least squares (OLS) regression, may report biased coefficient estimates [77]. Since most studies include statistical analysis of the data, specifying the acceptable level of significance (the alpha level) and the exact statistical tests used should be standard practice. Most trials that claim two methods are equivalent (or non-superior) are underpowered, which means they have too few subjects. The sample size must be adequate in order to obtain statistically reliable results. As stated in [78], effect sizes from prior studies can be used as an estimate, but "although this strategy is intuitively appealing, effect-size estimates, taken at face value, are typically not accurate estimates of the population effect size because of publication bias and uncertainty. It is shown that the use of this approach often results in underpowered studies, sometimes to an alarming degree" (p. 1547).
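As an illustration of the kind of a priori sample size analysis referred to above, the sketch below computes the approximate number of subjects per group for a two-sided comparison of two means using the normal approximation. The effect size, alpha and power values are hypothetical placeholders rather than figures taken from any study cited here.

```python
# Minimal a priori sample size sketch (normal approximation, two-sample
# comparison of means). Effect size, alpha and power are illustrative only.
from math import ceil
from statistics import NormalDist

def sample_size_per_group(effect_size: float, alpha: float = 0.05, power: float = 0.80) -> int:
    """Approximate n per group for a two-sided two-sample test of means."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for the two-sided alpha level
    z_beta = NormalDist().inv_cdf(power)           # quantile corresponding to the desired power
    n = 2 * ((z_alpha + z_beta) / effect_size) ** 2
    return ceil(n)

if __name__ == "__main__":
    # Hypothetical standardized effect size d = 0.4
    print(sample_size_per_group(0.4))  # about 99 subjects per group
```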
(5) Improper Specification of the Population: A study of a biased population loses validity in proportion to the degree of the bias. The authors of [7,79], in their research, compared the precision and bias of projections of the total population with the precision and bias of projections by different dimensions and at the country level. Population specification bias can occur when a researcher does not understand what the object of the study is. Many researchers have used time series models to construct population forecasts and prediction intervals at the national level, but few have evaluated the accuracy of their forecasts or the out-of-sample validity of their prediction intervals [79]. Researchers studying bias in population specification have focused on patterns of overall population growth [80,81], while others have examined individual components, such as mortality, fertility and migration [41,82]; these studies are all linked by key issues, such as uncertainty in population forecasts and the development of models that provide specific measures of uncertainty.

(6) Sampling and Sample Frame Errors:
Survey sampling and sample frame errors occur when the wrong subpopulation is used to select a sample, or when, because of variation in the number or representativeness of the respondents, the resulting sample is not representative of the population concerned. In some cases, sample selection bias can lead researchers to find significant relationships that do not exist; in other cases, it can lead researchers to fail to find significant relationships that do exist [77]. Bias in sampling and the sample frame can also occur when objects with or without certain characteristics are inappropriately included [83].
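A small Monte Carlo sketch, built entirely on hypothetical numbers, can make the mechanism concrete: when one consumer segment is missing from the sample frame, every sample drawn from that frame systematically overstates the population mean.

```python
# Hypothetical illustration of sample frame bias: the frame omits offline-only
# consumers, so estimates of mean willingness to pay are systematically too high.
import random

random.seed(42)

# Assumed population: 70% online shoppers (mean WTP ~50), 30% offline-only (mean WTP ~30).
population = ([random.gauss(50, 10) for _ in range(7000)] +
              [random.gauss(30, 10) for _ in range(3000)])
frame = population[:7000]  # flawed frame: only online shoppers are reachable

def mean(values):
    return sum(values) / len(values)

true_mean = mean(population)
frame_estimates = [mean(random.sample(frame, 200)) for _ in range(1000)]

print(f"true population mean:            {true_mean:.1f}")
print(f"average estimate (flawed frame): {mean(frame_estimates):.1f}")  # biased upward
```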

(7) Selection Errors:
The first stage is determining whether or not an observation in the overall population appears in the final representative sample; the second is modeling the relation between the hypothesized dependent and independent variables in that final sample [77]. This bias is related to sampling error, where the sample is selected by a non-probability method. It can happen when respondents choose to self-select into a study and only those interested respond; selection error can then arise because there may already be an inherent bias. This can also occur when respondents who are not relevant to the study participate, or when there is a bias in the way participants are assigned to groups.

(8) Non-responsiveness:
As confirmed in [73], best practices argue that researchers should attempt to maximize response rates in order to minimize the risk of nonresponse errors [84]. However, research [85][86][87] has called the traditional view into question by showing no strong relationship between nonresponse rates and nonresponse bias [29]. Nonresponse may occur because either the potential respondent was not contacted or the respondent refused to respond. The key factor is the absence of data, rather than inaccurate data. An increase in mean square error could occur because (a) incorporating the difficult-to-contact or reluctant respondents results in no nonresponse bias in the final estimate, but measurement error does exist, or (b) nonresponse bias exists, but the measurement error in these reluctant or difficult-to-contact respondents' reports exceeds the nonresponse bias [73]. The second question has to do with methodological approaches for detecting nonresponse bias. Although many types of nonresponse bias analyses can be conducted, four predominant approaches have been used: (1) comparing characteristics of the achieved sample, usually demographic characteristics, with a benchmark survey [88]; (2) comparing frame information for respondents and nonrespondents [89]; (3) simulating statistics based on a restricted version of the observed protocol [85], often called a "level of effort" analysis; and (4) mounting experiments that attempt to produce variation in response rates across groups known to vary on a survey outcome of interest [90]. Findings from these studies show that nonresponse bias varies across individual statistics within a survey.
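To illustrate the "level of effort" approach mentioned in point (3) above, the sketch below uses entirely hypothetical data: a survey estimate is recomputed as respondents who required more contact attempts are added, and a drifting estimate signals a risk of nonresponse bias (subject to the measurement error caveat discussed earlier).

```python
# "Level of effort" sketch with hypothetical data: recompute a survey mean as
# harder-to-reach respondents are added; a drifting estimate flags possible
# nonresponse bias.
respondents = [
    # (contact attempts needed, reported monthly spend)
    (1, 120), (1, 135), (1, 110), (2, 150), (2, 160),
    (3, 170), (3, 185), (4, 200), (5, 210), (5, 230),
]

for max_attempts in range(1, 6):
    subset = [spend for attempts, spend in respondents if attempts <= max_attempts]
    estimate = sum(subset) / len(subset)
    print(f"effort <= {max_attempts} attempts: n={len(subset):2d}, mean spend = {estimate:6.1f}")
# The mean rises with effort here, suggesting early respondents differ from reluctant ones.
```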
(9) Missing data, dropped subjects and use of an intention-to-treat analysis: It must be acknowledged that the handling and publication of incomplete or missing data can vary materially. This may in particular relate to the presentation of specific results when an analysis was performed but not properly described. It may be due not only to a lack of understanding on the investigator's part, but also to coincidence. There is also a lack of understanding in reporting and information, as it may simply be assumed that the missing data or part of the results are not relevant.
More ad hoc research methods can be used to supplement missing data [10,91]. It can always be assumed that the lack of data is accidental, but the reason for it may in fact be bias. Negative research findings, which are likely to outnumber positive findings, continue to be sidelined and left unpublished in file drawers [9,12,50,51,92]. Fanelli [6] examined the situation of negative and insignificant results in a study. The empirical studies in [6] have shown that negative or insignificant results are fairly common, in the social sciences in particular.
Negative data are lost even though results that do not meet expectations and/or contradict the hypothesis are necessary for scientific progress. Negative findings are important to consider because they encourage researchers to think critically, to reassess the problem or view it from a different angle, to correct, and perhaps to confirm, their current beliefs and move forward [60]. It is therefore essential that all findings, positive, negative and nonsignificant, are made available to researchers in order to ensure a fair and comprehensive summary of research to inform policy, practice or research.
(10) Problems in pointing out the weaknesses of one's own study: The authors of [24] empirically identified the following weaknesses in studies: (1) lack of an underlying theory of action, (2) disproportionate reliance on descriptive data, (3) conflation of correlation with causation, (4) problems in measurement and statistical analyses, (5) absence of study replication, (6) weak designs without comparability between library and non-library groups and (7) evidence of publication bias focusing on positive results. Outcome reporting biases could be the most problematic, as every researcher tends to believe that his/her study is sound and without weaknesses. Research by psychologists has shown that at least 63% of researchers have not published full research results [93], thus not acknowledging any weaknesses in their work. This can be treated as ignoring the identification of study weaknesses and biasing the results. Withholding negative, inconclusive or nonsignificant findings distorts the understanding of research within a domain and causes the potential benefits of an intervention to be overestimated [57].

Analytic Hierarchy Process Method
The Analytic Hierarchy Process (AHP) method found application in marketing science in the first year after its invention [92]. It is used in strategic marketing planning [94], analyzing the marketing mix [95], revealing consumer intentions [96], assessing determinants of purchase decisions [97], evaluating marketing personnel [98] and in comprehensive market evaluation [99]. AHP application is also common in publishing research [100][101][102]. It is also a common tool for ranking independent factors that have an impact on a complex phenomenon [103][104][105]. In order to ensure the robustness of the results, we chose an AHP method with three different scales representing the three main scale groups: from the first category we chose the inverse linear scale [106], from the second the logarithmic scale [107], and from the third a power scale [108]. Once the eigenvectors using all three scales were computed, the next step was the normalization of the obtained results. We chose AHP as a tool for our research because it is a suitable technique for evaluating phenomena that cannot be assessed using purely quantitative methods [109]. The number of factors potentially causing bias in marketing-related publications was limited to 10 positions (the maximum number of alternatives that the AHP method is capable of processing adequately).
The factors used in the research were the following: failure to examine and critically assess the background research literature; failure to specify the inclusion and exclusion criteria for researched subjects; failure to determine and report the error of measurement methods; failure to specify the exact statistical assumptions made in the analysis and failure to perform sample size analysis before the study begins; improper population specification; sampling and sample frame errors; selection errors; non-responsiveness; missing data, dropped subjects and use of an intention-to-treat analysis; and problems in pointing out the weaknesses of one's own study. The inquiry method used for the purpose of the study was an interview involving nine experts. Six of the experts were professors at business schools and/or marketing/management departments at universities in Lithuania, Poland and the Czech Republic. Three experts work as editors-in-chief or managing editors of Scopus-indexed business- and economics-related journals. The number of experts exceeds the required validity threshold [110].
Research papers in the area recognize 11 different AHP measurement scales, organized into three different categories, that are suitable for research [111]. It is considered that there are no significant differences in research outcomes due to the different measurement scales, although in order to ensure the robustness of the results, it is recommended to use a combination of different scales. We chose three different measurement scales, representing all three categories. The mathematical expressions of the selected scales are presented below: inverse linear scale: c = 9/(10 − x); logarithmic scale: c = log_a(x + a − 1); and power scale: c = x^a, where x is the value on the integer judging scale for pairwise comparisons from 1 to 9, and c is a ratio used as an entry into the decision matrix [112] (p. 3).
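The sketch below implements the three scale transformations exactly as written above, mapping the integer judgments 1-9 into decision-matrix entries. The parameter a of the logarithmic and power scales is left as an argument, since the paper does not state the value used; a = 2 in the example is only an illustrative assumption.

```python
# The three AHP judgment-scale transformations named in the paper.
# x is the integer judgment (1..9); a is a scale parameter (value assumed here).
from math import log

def inverse_linear(x: float) -> float:
    return 9 / (10 - x)

def logarithmic(x: float, a: float = 2.0) -> float:
    return log(x + a - 1, a)

def power(x: float, a: float = 2.0) -> float:
    return x ** a

for x in range(1, 10):
    print(x, round(inverse_linear(x), 3), round(logarithmic(x), 3), round(power(x), 3))
```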
A typical data processing workflow in AHP is as follows. At first, experts are presented with pair-wise comparison matrices. After all of the experts have evaluated the factors causing bias in marketing-related publications using an ex ante prepared pair-wise questionnaire form, each completed questionnaire is checked for consistency. The matrix is considered consistent if p_ik = p_ij · p_jk for all i, j, k, and a priority vector w = (ω_1, ..., ω_n) exists such that p_ij = ω_i/ω_j for all i, j. For the calculation of the Consistency Index of the experts, λ_max is calculated for every matrix, where λ_max is the largest eigenvalue of each standardized matrix, n is the number of independent rows in the matrix, and ν_j is the eigenvector of the matrix.
A completed expert pair-wise comparison matrix A is considered fully consistent when λ_max = n, although in a real-life situation this happens quite infrequently. If only marginal changes of p_ij are present, matrix A satisfies the preselected compatibility threshold (0.2 was selected) and λ_max remains close to n. After calculating the eigenvalue λ_max, the Consistency Index CI is calculated as CI = (λ_max − n)/(n − 1), where CI is the Consistency Index and n is the number of possible alternatives.
The Consistency Index is then used to calculate the overall Consistency Ratio, CR = CI/RI, where CR is the Consistency Ratio and RI is the Random Index.
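A minimal sketch of this consistency check is given below: it extracts the principal eigenvalue and normalized priority vector of a pairwise comparison matrix and computes CI and CR. The matrix is hypothetical, and the random index values are Saaty's commonly tabulated ones rather than values reported in the paper.

```python
# Sketch of the AHP consistency check: lambda_max, CI and CR for one pairwise matrix.
import numpy as np

# Saaty's commonly used random index values by matrix size (assumed, not from the paper).
RI = {3: 0.58, 4: 0.90, 5: 1.12, 6: 1.24, 7: 1.32, 8: 1.41, 9: 1.45, 10: 1.49}

def consistency(matrix: np.ndarray):
    n = matrix.shape[0]
    eigvals, eigvecs = np.linalg.eig(matrix)
    k = int(np.argmax(eigvals.real))
    lam_max = float(eigvals.real[k])
    weights = np.abs(eigvecs[:, k].real)
    weights = weights / weights.sum()          # normalized priority vector w
    ci = (lam_max - n) / (n - 1)               # Consistency Index
    cr = ci / RI[n]                            # Consistency Ratio
    return weights, lam_max, ci, cr

# Hypothetical 3x3 judgment matrix for illustration only.
A = np.array([[1.0, 3.0, 5.0],
              [1/3, 1.0, 2.0],
              [1/5, 1/2, 1.0]])
w, lam, ci, cr = consistency(A)
print(w.round(3), round(lam, 3), round(ci, 3), round(cr, 3))
```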
If the matrices satisfy the consistency threshold (CR < 0.2), the aggregated expert evaluation indices are calculated using the geometric mean formula p^A_ij = (p_ij^(1) · p_ij^(2) · ... · p_ij^(n))^(1/n), where p^A_ij is the aggregated evaluation of the element belonging to row i and column j, p_ij^(k) is the corresponding evaluation of expert k, and n is the number of pair-wise comparison matrices (one per expert).
When the new aggregated matrices have been calculated, the consistency check procedure has to be performed again. If a matrix is found to be consistent, the preference ranks of the alternatives are calculated from the priority weights ω_j (the components of the normalized principal eigenvector), where ω_j is the weight of alternative j. If the matrices are consistent but the expert evaluations are significantly dispersed, the index of expert mutual agreement (S*) is calculated on the basis of Shannon entropy [113], where H_α is the Shannon alpha diversity, H_β the Shannon beta diversity and H_γ the Shannon gamma diversity.
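The aggregation step can be sketched as follows: the experts' matrices are combined element-wise with the geometric mean, and the group priority vector is then derived as the normalized principal eigenvector. Both expert matrices in the example are hypothetical.

```python
# Sketch of aggregating experts' pairwise matrices by the element-wise geometric
# mean and deriving the group priority weights (illustrative matrices only).
import numpy as np

def aggregate(matrices):
    """Element-wise geometric mean over the experts' comparison matrices."""
    stacked = np.stack(matrices)               # shape: (number of experts, n, n)
    return np.exp(np.log(stacked).mean(axis=0))

def priorities(matrix):
    """Priority weights as the normalized principal eigenvector."""
    eigvals, eigvecs = np.linalg.eig(matrix)
    v = np.abs(eigvecs[:, int(np.argmax(eigvals.real))].real)
    return v / v.sum()

expert_1 = np.array([[1.0, 2.0, 4.0], [1/2, 1.0, 3.0], [1/4, 1/3, 1.0]])
expert_2 = np.array([[1.0, 3.0, 5.0], [1/3, 1.0, 2.0], [1/5, 1/2, 1.0]])

group = aggregate([expert_1, expert_2])
print(priorities(group).round(3))              # group ranking of the three alternatives
```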

Results and Discussion
The results of the calculations are presented in Tables 1-3. Table 1 lists the ranks of the factors obtained with each scale; on all scales, missing data, dropped subjects and use of an intention-to-treat analysis ranked ninth, and problems in pointing out the weaknesses of one's own study ranked tenth. Although the differences in reliability indicators obtained using the different scales are truly marginal, the highest level of the consensus index (83.5%) was derived using the logarithmic scale. This is an incidental result rather than a rule and should be attributed to the characteristics of the data researched, as there is no undisputed proof of the superiority of one scale over the others in terms of the consensus index, and the employment of a combination of scales is preferred in order to achieve robustness of results [112].
The values of computed eigenvectors are presented in Table 2.
Analyzing the eigenvectors of the factors inducing research bias, we grouped them into two distinct groups. The important factors are the items whose eigenvectors are above 0.1: sampling and sample frame errors, failure to specify the inclusion and exclusion criteria for researched subjects, non-responsiveness, failure to examine and critically assess the prior literature, and selection errors. The rest are less important. However, differences still exist within this group, as the last-ranked factor, problems in pointing out the weaknesses of one's own study, has an eigenvector value half that of improper population specification, which indicates its lesser influence on the occurrence of bias in marketing-related publications.
The results obtained confirm the necessity of using a combination of different measurement scales, as the results obtained with the inverse and the logarithmic/power scales differ not only in eigenvector values, but also in the ranks of the studied factors. Although the differences between the results derived with the logarithmic and power scales are not significant and appear only in eigenvector values, the results of the inverse scale also differ in ranks. In order to offset these differences, the results were normalized. The process was followed by the computation of the final rank of the researched factors that induce bias in marketing-related publications (Table 3).
In different studies, publication biases are indicated and analyzed through one or a few of the most important or dominant factors; this is especially the case in marketing publications.
An analysis of the most frequent causes of bias in marketing-related publications led the authors of the paper to the conclusion that the most important factors causing bias are sampling and sample frame errors and failure to specify the inclusion and exclusion criteria for the study subjects (see Table 3).
As evidenced by the data in Table 1, these results are confirmed by the normalized eigenvector of the power scale being the largest, as well as by the logarithmic scale. This ranking shows the importance of sampling and sample frame errors in marketing publications. In different sciences, this type of bias bears different importance. For example, a study in medical science by Lin [114] shows that "sampling error did not cause noticeable bias but the standardized mean difference, odds ratio, risk ratio, and risk difference suffered from this bias to extents"; the results reported in [114] also show the importance of sampling and sample frame errors from a different angle: how to decrease the sample size by moving to a stratified sample design while achieving equivalent precision. The factor ranked second among the most frequent causes of bias in marketing-related publications is failure to specify the inclusion and exclusion criteria for study subjects. Researchers use inclusion and exclusion criteria to determine the characteristics of the subjects or elements in a study. Typical inclusion criteria might be demographic, geographic and occupational groups [115]. Exclusion criteria are not the opposite of inclusion criteria: they identify attributes that prevent a person from being included in the study [116]. The fundamental problem arises when researchers do not define inclusion and exclusion criteria clearly. Simply indicating that subjects in the study met inclusion criteria is insufficient and does not allow readers to judge the validity of the decision. Selecting inclusion criteria that are not related to the research object and do not describe the variables in sufficient detail is another potential research pitfall.
Ranking third among the most frequent causes of bias in marketing-related publications is non-responsiveness bias. Reference [117] confirmed a general decline in survey response rates. The new wave of online polls could serve as an alternative; nevertheless, even that does not ensure that non-responsiveness bias can be avoided. Non-responsiveness bias in marketing-related publications is common across different types of surveys. Probability-based surveys still display less bias than non-probability surveys [118]. One way to address non-responsiveness bias is therefore to choose the correct type of survey.
One more frequent reason for bias in marketing-related publications is literature bias (see Table 2). As was mentioned in the literature review part of the present paper, this type of bias is relevant in many fields of science; however, it is less common in the social sciences [119]. In marketing-related publications, publication and literature bias are directly related. The findings of the studies conducted by [119] show that "there is a strong relationship between the results of a study and whether it was published, a pattern indicative of publication bias". Selection bias was ranked fifth among the most frequent causes of bias in marketing-related publications (see Table 2). This bias is related to sampling error. Selection bias is present and relevant in different fields of science, and in medicine in particular. Apparently, selection bias is also important in marketing publications. Selection bias should be addressed before starting the study [120], as it is a hidden problem [121]. One of the possible ways to avoid selection bias is to contact someone who is knowledgeable about causal inference methods [121].
Two further factors that frequently cause bias in marketing-related publications, i.e., improper population specification and failure to determine and report the error of measurement methods, show eigenvector values above 0.1 on the logarithmic scale. These two factors are important, although to a lesser extent than the first five covered earlier.
The last three factors are of lesser importance among the causes of bias in marketing-related publications: failure to specify the exact statistical assumptions and failure to perform sample size analysis; missing data, dropped subjects and use of an intention-to-treat analysis; and problems in pointing out the weaknesses of one's own study. The latter can be seen as ignoring the identification of study weaknesses and thereby biasing the results. These factors are relevant to bias in marketing publications as well but, as concluded in the present study, they are not as important as the first seven factors.

Conclusions
Research literature in the area suggests a fairly broad range of factors that in one way or another cause bias in marketing-related publications. For the purpose of the study covered by the present paper, the authors ranked these factors to determine which of them most frequently cause bias in marketing-related publications. The study concluded that sampling and sample frame errors are the factors that most frequently cause bias in marketing-related publications. This may be attributed to the fact that marketing is about revealing people's preferences, and improper selection of the sampling frame may discredit the main research question, not merely raise doubts about the robustness of the results. It should be noted that this finding is specific to the marketing area, and other disciplines in the social sciences may show different rankings of the factors. Failure to specify the inclusion and exclusion criteria for researched subjects, which was ranked second, is very important in a much broader context [122]. The third factor, non-responsiveness, is once again related to a possible improper mirroring of the researched population, which is very important in marketing research [123].
The least important factors were missing data, dropped subjects and use of an intention-to-treat analysis, and problems in pointing out the weaknesses of one's own study. These bias-creating factors should be attributed not to problems in research design or to some methodological weakness, but rather to the ethics of the researcher. In general, research ethics is improving as novel instruments for assuring it are being implemented [124,125], so this research problem is of diminishing importance.
The findings of this study should be considered preliminary and treated as a trigger to start a wider scientific discussion on bias-inducing factors in research publications. It would be scientifically sound to conduct similar research in other fields of the social sciences in order to reveal common factors causing bias in research publications. The findings of the study are instrumental in creating universal recommendations that help to eradicate or mitigate the effect of at least some of the factors creating bias in the scientific literature.

Funding: This research received external funding.

Conflicts of Interest:
The authors declare no conflict of interest.