Perceptions of emerging biotechnologies

Research on public views of biotechnology has centered on genetically modified (GM) foods. However, as the breadth of biotechnology applications grows, a better understanding of public concerns about non-agricultural biotechnology products is needed in order to develop proactive strategies to address these concerns. Here, we explore the perceived benefits and risks associated with five biotechnology products and how those perceptions translate into public opinion about the use and regulation of biotechnology in the United States. While we found greater support for non-agricultural biotechnology products, 70% of individuals surveyed showed no or little variation in their support across the products, indicating opinions about early GM products may be influencing the acceptance of emerging biotechnologies. We identified five common patterns of opinions about biotechnology and used machine learning models to integrate a wide range of factors and predict a respondent’s opinion group. While the model was particularly good at identifying individuals supportive of biotechnology, differentiating between individuals from the non- and conditionally-supportive opinion groups was more challenging, emphasizing the complexity of public opinions of emerging biotechnology products.


Introduction
The largest opinion difference between the public and the scientific community is not about climate change, evolution, or vaccines, but about biotechnology. In 2015, 88% of American Association for the Advancement of Science members surveyed responded that genetically modified (GM) foods were safe to eat, while only 37% of the general public agreed [1]. While such findings support the idea that scientific knowledge could lessen public opposition to biotechnology, research in this area has shown mixed results [2-4]. Therefore, while technological advances have improved the ease and precision with which organisms can be engineered, the scientific community continues to grapple with how biotechnology should be applied, discussed, and regulated [5,6].
There is a growing consensus that a process in which scientists engage with interested and affected parties is the best way to develop proactive strategies that attend to public concerns [7]. In such processes, understanding pre-existing public perceptions and concerns is a useful starting point [8]. However, while most research on public opinion of biotechnology focuses on GM foods [6,9,10], biotechnology products are emerging in medicine, energy, and environmental science [11]. For example, scientists are engineering mosquitoes to combat insect-borne diseases [12], algae that produce fossil fuel alternatives [13], and viruses that specifically target and kill tumor cells [14]. To address possible public concerns about these emerging applications of biotechnology, a better understanding is needed of how biotechnology products from these diverse sectors will be received.
We surveyed a representative sample of US adults (n=521) to explore the perceived benefits and risks associated with five biotechnology products and how those perceptions translated into opinions regarding the use and regulation of each product. Respondents were provided with short descriptions of each biotechnology product including examples of benefits and risks associated with each product (table S1 is available online at stacks.iop.org/ERL/14/114018/mmedia). After reading about each biotechnology product they were asked to rank their degree of familiarity with and level of support for the product and then asked about their beliefs regarding the benefits, risks, and need to regulate the product. A range of additional information, including personal values, perceived knowledge about biotechnology, feelings of pro-environmentalism, ability to think scientifically, trust in scientific institutions, education level, political affiliation, and religious beliefs, was also gathered as potential predictive variables (table S2).

Survey design
An online survey was distributed by Qualtrics in April 2017 to a representative sample of US adults (n=521), with the intent to obtain a sample that was roughly 68% White, 15% Hispanic, 12% African American, and 5% Asian. The survey established basic demographics about respondents including age, gender, ethnicity, level of education, political ideology, and income. Previously established groups of questions were included to measure pro-environmentalism (New Environmental Paradigm: NEP) [15], personal values [16], and scientific reasoning skills [4]. In addition to these established scales, questions based on the willingness to sacrifice framework [17], about levels of trust in academic, government, and industry scientists, and about religious beliefs were also used to generate scales. The consistency of each scale was determined using Cronbach's α. While originally planned as a scale, two questions regarding religion were used as stand-alone variables in order to distinguish between religious fundamentalism (Q: I believe the bible is the literal word of God) and general religiosity (Q: Religion is important in my life).
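As an illustration of the scale consistency check mentioned above, the following is a minimal sketch of Cronbach's α using only the Python standard library. It is not the code used in the study (the statistical analyses were run in R); the function name `cronbach_alpha` and the example responses are hypothetical.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for one scale.

    responses: list of rows, one per respondent; each row holds that
    respondent's scores on the k items that make up the scale.
    """
    k = len(responses[0])  # number of items in the scale
    if k < 2:
        raise ValueError("a scale needs at least two items")
    # Sample variance of each item across respondents
    item_vars = [variance([row[i] for row in responses]) for i in range(k)]
    # Sample variance of each respondent's summed scale score
    total_var = variance([sum(row) for row in responses])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


# Hypothetical 5-point Likert responses to a three-item scale
example = [
    [4, 5, 4],
    [2, 2, 3],
    [5, 4, 5],
    [3, 3, 3],
    [1, 2, 1],
]
alpha = cronbach_alpha(example)
print(f"Cronbach's alpha: {alpha:.3f}")
```

Values closer to 1 indicate that the items in a scale vary together across respondents, which is the property the consistency check is meant to capture.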
To measure opinions on biotechnology, respondents were presented with five different blocks each containing a different biotechnology product presented in randomized order (table S1) and asked to score their agreement with the following statements in respect to each technology on a Likert Scale (1: Strongly disagree to 5: Strongly agree, 3: Neither agree nor disagree):
1. I am familiar with this technology.
2. I support the development of this technology.
3. I believe this technology is beneficial.
4. I believe this technology is risky.
5. I believe this technology should be regulated.

Statistical analysis
A series of statistical analyses were conducted in this study. First, in section 3.1, multiple linear regression models were used to explain the relationship between independent variables (e.g. demographics, personal values) and general opinions toward biotechnology (i.e. the dependent variables). To measure general opinions, scales were generated for each of the above questions using responses to all five biotechnology products (see table S2). Next, in section 3.2, we used analysis of variance (ANOVA) tests to determine if there was an effect of biotechnology product (i.e. independent variable) on responses to each of the five questions (i.e. dependent variables). For questions with significant differences between products (ANOVA; p<0.05), a post-hoc Tukey's Honest Significant Difference (HSD) test was used to determine which biotechnology products were different. Additionally, we used the standard deviation to determine the extent to which an individual's responses to the five questions differed across the five biotechnology products. All statistical analyses were conducted in R 3.3.3.
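The per-respondent standard deviation described above can be sketched as follows. This is an illustrative Python version rather than the R code used in the study. It assumes the sample standard deviation (the default of R's `sd`), uses the cut-offs of sd=0 and sd<1 that are reported in the results, and the category names and example scores are hypothetical.

```python
from statistics import stdev


def support_variation(scores):
    """Classify how much one respondent's answers to a single question
    vary across the five biotechnology products.

    scores: the respondent's five Likert scores (1-5), one per product.
    Uses the sample standard deviation (an assumption, matching R's sd()).
    """
    sd = stdev(scores)
    if sd == 0:
        return "none"    # identical score for every product
    if sd < 1:
        return "minor"   # small differences between products
    return "larger"      # opinions clearly differ by product


# Hypothetical respondents' support scores for the five products
respondents = {
    "r1": [4, 4, 4, 4, 4],
    "r2": [4, 4, 5, 4, 4],
    "r3": [1, 5, 2, 5, 3],
}
categories = {rid: support_variation(s) for rid, s in respondents.items()}
print(categories)
```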

Opinion group clustering
To identify common biotechnology opinion patterns [18], respondents were clustered based on their support for and their beliefs about benefits, risks, and regulation of all five biotechnology products. Two clustering algorithms (k-means and Partitioning Around Medoids (PAM)) were tested with the clustering number (k) ranging from 3 to 10. The best algorithm (k-means) and k (k=5) were selected using the 'clValid' package in R using silhouette width as a validation metric (figure S1(A)). Because k-means clustering is not deterministic, clustering was performed once, and those cluster assignments (figure S1(B)) were used for the remainder of the study.
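To make the validation metric concrete, the sketch below computes the mean silhouette width for a given set of cluster assignments. It is a plain Python illustration of the metric, not the 'clValid' implementation used in the study; Euclidean distance, the function name `mean_silhouette`, and the toy points are assumptions.

```python
from math import dist


def mean_silhouette(points, labels):
    """Mean silhouette width of a clustering (Euclidean distance assumed).

    points: list of equal-length numeric tuples, one per respondent.
    labels: cluster label for each point, in the same order.
    """
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    if len(clusters) < 2:
        raise ValueError("silhouette width needs at least two clusters")

    widths = []
    for p, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) == 1:
            widths.append(0.0)  # usual convention for singleton clusters
            continue
        # a: mean distance to the other members of the point's own cluster
        # (the point's zero distance to itself does not change the sum)
        a = sum(dist(p, q) for q in own) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster
        b = min(
            sum(dist(p, q) for q in other) / len(other)
            for other_lab, other in clusters.items()
            if other_lab != lab
        )
        denom = max(a, b)
        widths.append((b - a) / denom if denom > 0 else 0.0)
    return sum(widths) / len(widths)


# Toy data: two well-separated pairs of points
points = [(1, 1), (1, 2), (5, 5), (5, 4)]
good = mean_silhouette(points, [0, 0, 1, 1])  # sensible grouping
bad = mean_silhouette(points, [0, 1, 0, 1])   # mixed-up grouping
print(good, bad)
```

A higher mean silhouette width indicates tighter, better-separated clusters, which is why it can be used to compare algorithms and values of k.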

Multiclass machine learning classification
To determine how well independent features could classify individuals into their biotechnology opinion group we trained multiclass machine learning models using each feature individually and with all features combined [19]. We randomly generated 10 datasets that were down-sampled to include equal numbers of individuals from each opinion group (i.e. balanced datasets). Both the parameter sweep (used to select the best combination of parameter values, see table S3) and training were done within a 5-fold cross-validation scheme to avoid overfitting. Briefly, for 5-fold cross-validation, individuals are divided into five groups, then groups 1-4 are used for the parameter sweep and training and group 5 for testing, next groups 1-3 and 5 are used for the parameter sweep and training and group 4 for testing, etc. Model performance was measured using the F-measure (F1), the harmonic mean of precision (true positives (TP)/(TP + false positives (FP))) and recall (TP/(TP + false negatives (FN))), where an F1 is generated for each opinion group and the macro F1 represents the mean F1 from all five opinion groups. Because individuals could be classified into 1 of 5 opinion groups, a machine learning model randomly guessing would have an F1=0.2, while a perfect model would have an F1=1. To ensure model performance scores were not artificially inflated [19], only the performance on the testing data, which is withheld from the model during training, was reported.
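The fold rotation and the macro F1 described above can be sketched in a few lines of standard-library Python. This is an illustration under stated assumptions, not the study's pipeline (ML_classification.py): the function names are hypothetical, individuals are assumed to be already shuffled before folds are assigned, and precision and recall are computed from counts of true positives, false positives, and false negatives.

```python
def k_fold_splits(n, k=5):
    """Yield (train_indices, test_indices) for k-fold cross-validation.

    Assumes the n individuals are already in random order.
    """
    folds = [list(range(i, n, k)) for i in range(k)]
    for held_out in range(k):
        test = folds[held_out]
        train = [i for j, fold in enumerate(folds) if j != held_out for i in fold]
        yield train, test


def macro_f1(true_labels, predicted_labels, classes):
    """Return (macro F1, per-class F1) for a multiclass prediction."""
    pairs = list(zip(true_labels, predicted_labels))
    f1_by_class = {}
    for c in classes:
        tp = sum(1 for t, p in pairs if t == c and p == c)
        fp = sum(1 for t, p in pairs if t != c and p == c)
        fn = sum(1 for t, p in pairs if t == c and p != c)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        if precision + recall:
            f1_by_class[c] = 2 * precision * recall / (precision + recall)
        else:
            f1_by_class[c] = 0.0
    macro = sum(f1_by_class.values()) / len(f1_by_class)
    return macro, f1_by_class


# Hypothetical held-out predictions for three opinion groups
true = ["A", "A", "B", "B", "C"]
pred = ["A", "A", "B", "C", "C"]
macro, per_group = macro_f1(true, pred, classes=["A", "B", "C"])
splits = list(k_fold_splits(10, k=5))
print(macro, per_group)
```

Reporting the per-group F1 alongside the macro F1 makes it possible to see which opinion groups a model distinguishes well, even when the overall score is modest.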
All single feature models were built using the Logistic Regression algorithm. Note that binary features (e.g. Hispanic, lived on a farm) are likely to perform below random expectation when used individually to classify individuals as belonging to one of five opinion groups. For the combined model, four additional algorithms were tested: Support Vector Machine (SVM), SVM with a polynomial kernel (SVMpoly), Random Forest, and Gradient Boosting. All algorithms were implemented in Scikit-Learn using Python3.

Code and data availability
The survey questions and code used to process, perform statistical analyses, and cluster the survey results can be downloaded from GitHub at https://github.com/azodichr/Biotech_Survey. The Python code (ML_classification.py) used to train and test machine learning models is also available on GitHub at https://github.com/ShiuLab/ML-Pipeline. Any data that support the findings of this study are included within the article. See independent (Data S1) and dependent variables (Data S2).
Across the five biotechnology products, support was positively correlated with beliefs about the benefits of each technology (Pearson's Correlation Coefficient (PCC): 0.57 to 0.86) and negatively correlated with beliefs about the risks of each technology, although to a lesser degree (PCC: −0.14 to −0.27) (table S5). We used multiple linear regression to assess the association between independent variables and beliefs and opinions about biotechnology in general by averaging responses over the five biotechnology products. Overall support for biotechnology ('I support the development of this technology') was most strongly associated with self-reported familiarity with biotechnology (p<0.001), familiarity with agriculture (p=0.005), and with gender (p<0.001), with men being more supportive than women [20] (table 1). Because research on risk perception has shown what is referred to as the 'white-male' effect, where white men tend to judge risk as lower [21], we tested for an interaction between the male and white variables. We found a small white-male effect (effect size=0.36; p=0.04) on the belief that the biotechnology products are beneficial ('I believe this technology is beneficial'), where white men tended to believe more strongly in the benefits. However, the white-male effect was not present for beliefs about risks or the need for regulation. Taken together, this suggests the 'white-male' effect was minimal in our survey. Two personal values were also significantly associated with beliefs about the benefits of biotechnology, with openness being associated with the belief that biotechnology is beneficial (p=0.02) and biospheric altruism (i.e. altruism toward other species or the environment in general) being associated with the belief that biotechnology is not beneficial (p=0.02) (table 1). It has been demonstrated that extreme opposition to GM foods is associated with low objective knowledge about science and genetics [3].
In contrast, we found that perceived riskiness and the perceived need to regulate biotechnology were most strongly associated with the ability to think scientifically (p=0.002 and 0.001, respectively; table 1), with those with a greater ability to think scientifically being more likely to believe biotechnology was risky and needed to be regulated (figure 1(A)). This suggests that individuals with generally negative opinions about biotechnology tend to have stronger scientific reasoning skills. Furthermore, self-reported trust in academic, government, and industry scientists was also associated with the perception of high risks for biotechnology products (p=0.003), which counterintuitively suggests that individuals who trust scientists are more likely to believe that these products, developed by scientists, are risky. This is at odds with previous work that found trust in scientists lowered perceived risks of gene technologies [22] and with work that found trust in industry scientists was more strongly associated with support for biotechnology than knowledge about genetics [23]. Finally, and in line with previous findings [24,25], we found that religious fundamentalism was strongly associated with beliefs in the risks of and need to regulate biotechnology.

Comparison of opinions and beliefs about specific biotechnology products
Next, we assessed the differences in beliefs and opinions about each of the five biotechnology products using ANOVA followed by Tukey's HSD tests when the ANOVA was significant (p<0.05). We found that products related to agriculture were believed to be the least beneficial and had the least support, with support for hornless cattle significantly lower than the other products (table S6; figure 1(B)), while biopharmaceuticals were seen as the most beneficial and had the most support (HSD: p<0.05). While these differences were significant, we found that 27% of respondents reported the same degree of support across all products (standard deviation (sd)=0), and an additional 43% had minor differences in support between products (sd<1) (figure 1(C)). Furthermore, the perceived riskiness of biotechnology varied only moderately by product, with gene drives perceived as riskier and biopharmaceuticals perceived as less risky (p=0.047; figure 1(B); table S6), and there was no difference in opinions about the need for regulation across products (p=0.23). Surprisingly, 50% (n=261) of individuals agreed or strongly agreed that all five biotechnology products should be regulated, with an additional 14% (n=71) believing four of the five products should be regulated; this response was not associated with political affiliation (p=0.25; table 1). Taken together, this suggests most US adults support regulating biotechnology and have similar beliefs and opinions about all biotechnology products regardless of their unique benefits and risks.
To better understand common patterns of opinions and beliefs about the different biotechnology products, we used k-means clustering to sort respondents into five opinion groups (see Methods). Three of these opinion groups (i.e. A, D, E) had similar opinions about biotechnology across the five products (figure 1(D)). For example, individuals in opinion group A (n=43) had strong negative opinions toward all biotechnology products; specifically, they did not support any product or believe any product was beneficial (see dark green). On the other hand, individuals in opinion group E (n=119) strongly supported all biotechnology products and viewed them as beneficial (see dark purple). Interestingly, individuals in both of these opinion groups believed all biotechnology products were at least moderately risky and needed to be regulated. Individuals in two opinion groups, B (n=68) and C (n=131), had different opinions about different biotechnology products. While moderately supportive of some products, group B respondents were opposed to biofortified crops and hornless cattle, believing those products were risky and not beneficial. Of note, unknown long-term human health effects were listed as possible risks in the description of those two products, suggesting group B individuals were driven by their fear of negative impacts on human health (see table S1). Conversely, group C respondents believed the risks were similar across all products, but only believed the products that explicitly addressed human health (i.e. biopharmaceuticals, biofortified crops, and gene drives) were beneficial. This suggests group C individuals were driven by their hope for products that improve human welfare.

The social basis for opinions about biotechnology
No single demographic factor clearly distinguished individuals from the five opinion groups (table S4). Therefore, to better understand the social basis for these opinion groups, we evaluated 29 independent features derived from the survey data (see table S7) for their ability to predict an individual's opinion group using logistic regression. The logistic regression models were trained, and their performance was tested using a balanced, replicated, 5-fold cross-validation scheme (see Methods 2.4). We used the F-measure (F1), or the harmonic mean of precision and recall, as a metric to measure predictive performance, where F1=1 would indicate the model correctly predicted the opinion group of every respondent and an F1=0.2 would indicate the model performed no better than random guessing (i.e. if the model randomly assigned each individual to one of the five opinion groups, the model would be correct 20% of the time). While seven features were able to distinguish between opinion groups better than random expectation (F1>0.2) (figure 2(A)), the features differed in which opinion groups they were able to distinguish (figure 2(B)). For example, across all five opinion groups familiarity with biotechnology had a performance (referred to as F1 macro) of F1 macro=0.27 (±0.04 sd), making it the best individual predictor of opinion groups. While it performed exceptionally well at identifying individuals from opinion group E (F1 E=0.53±0.04), with 73% of opinion group E individuals correctly predicted as belonging to group E (figure 2(B)), it was less useful for distinguishing between individuals belonging to the less supportive opinion groups (A-D). Biospheric altruism, on the other hand, performed just above random expectation overall (F1 macro=0.224±0.022), but performed well at classifying individuals from opinion group D (F1 D=0.34±0.10), with 51% of opinion group D individuals correctly predicted as belonging to group D.
Such differences indicated a need to jointly consider all features in one model. To do so we used a machine learning approach (see Methods), which resulted in an integrative model that was more accurate (F1 macro=0.37±0.02) than any individual feature model (gray bar; figure 2(A); table S7). As expected from the overall better performance of individual feature models at classifying the more supportive opinion groups, the integrative model performed best at classifying individuals from opinion groups D and E (F1 D=0.43±0.06; F1 E=0.54±0.04; figure 2(B), S1). However, the model was less able to distinguish between non-supportive (opinion group A) and conditionally supportive (opinion groups B and C) respondents than individuals from groups D and E (figure S2). While disappointing, this difficulty in itself suggests there is not a single social basis for different patterns of negative opinions about biotechnology, but rather that individuals with a broad range of personal values, religious and political beliefs, and educational backgrounds can have negative opinions about all or some biotechnology products.
Finally, to determine the degree and direction of the relationship between each factor and the opinion groups, we compared the distribution of responses for each factor across all five opinion groups. While differences were observed for a number of factors (figure S2), we chose to focus on those that were most predictive in the individual models. While individuals from the non-supportive (A) and conditionally supportive (B and C) opinion groups did not tend to differ for these factors, we found notable differences between the moderately (D) and strongly (E) supportive opinion groups (figure 2(C)). For example, opinion group E individuals had the highest median self-reported familiarity with the biotechnology products and were more willing to sacrifice for biospheric and animal welfare causes. In contrast, individuals from the moderately supportive group (D) were less willing to sacrifice for these causes and were less likely to believe in the New Ecological Paradigm (NEP) [15], a scale used to measure general pro-environmental beliefs. This highlighted the large gap in the social basis associated with moderate compared to strong support for biotechnology, where strong supporters tend to support environmental causes, while moderate supporters tend to have neutral opinions regarding environmental issues.

Discussion
Here we explored support for and perceived benefits and risks of established and emerging biotechnology products from the agricultural, environmental, and medical sectors. We found that 70% of individuals surveyed showed no (27%) or little (43%) variation in their responses across the biotechnology products included; however, responses varied greatly between individuals. Clustering individuals into opinion groups, we found that some respondents were strongly opposed to all of the biotechnology products (A; n=43), while others strongly supported all products (E; n=119). Only two clusters of respondents varied in their opinions across the technologies, with one group (B; n=68) being opposed to products with unknown long-term human health effects listed as risks, while the other group (C; n=131) was only supportive of products that would explicitly benefit human health. The final opinion group (D; n=160) consisted of individuals who tended to have neutral opinions and beliefs across all the products. Overall, both conservative and liberal respondents believed that all five biotechnology products should be regulated. This supports a growing body of literature [26,27] suggesting the general public, regardless of political ideology, would not support deregulation of biotechnology products.
Research on the relationship between knowledge and opinions about biotechnology, typically GM foods, has shown mixed results. For example, education has been positively associated with acceptance of the benefits of GM foods and negatively associated with levels of perceived risks [28]. However, educational interventions designed to increase perceptions of the benefits of biotechnology were shown to have no or even negative impacts on biotechnology acceptance [29]. We found that level of education was not associated with general beliefs or opinions about biotechnology and was not predictive of opinion groups. Furthermore, studies focusing on the effect of subjective (i.e. self-reported) versus objective (i.e. test-based) knowledge on support for biotechnology have found subjective knowledge to be more related to acceptance than objective knowledge [30]. Similarly, extreme opposition to GM foods has been associated with low objective knowledge about science and genetics, and the gap between subjective and objective knowledge increases as the extremity of opposition increases [3]. Our study included both a measure of subjective knowledge about each biotechnology product and a measure of objective ability to think scientifically. Ability to think scientifically, rather than testing for fact-based knowledge, tests for general scientific reasoning skills [4]. Interestingly, we found that individuals who perceived biotechnology to be risky and believed it needed to be regulated had strong scientific reasoning skills, a subtle yet important distinction from previous findings.
A limitation of this study was the difficulty with distinguishing between individuals belonging to the non-supportive (A) and conditionally supportive (B and C) opinion groups. There are a number of possible methodological reasons for this. First, our survey relied heavily on self-reported data, such as subjective familiarity with biotechnology, trust in scientists, and willingness to sacrifice for different causes, which may be less predictive of biotechnology acceptance than related, but objective, measures [30]. Second, some variables, such as general risk aversion [31] or whether the respondent had or knew someone who had benefited from biopharmaceuticals, may have been predictive, but were not included in our survey. Finally, it is possible that our predictive variables were sufficient for predicting opinion groups, but that the difference in the social basis between these opinion groups was too subtle for our model to learn with the sample size available. Beyond these methodological limitations, it is also possible that our difficulty in distinguishing between these individuals was because there was no single or obvious basis for different patterns of negative opinions about biotechnology. Rather, individuals with different life experiences, values, religious beliefs, political orientation, etc. may object to some or all biotechnology products for diverse reasons. This highlights the complexity of public opinions and beliefs about biotechnology and suggests that more work is needed to understand what drives differences in opinions about emerging biotechnology products in different sectors.